Many organizations already leverage syslog for data collection. It’s easy to get up and running and start logging data to a file. However, when it comes to bringing this data into Splunk, there are a few things that can help with your long-term success. In this three-part series, we’ll explore why you should use syslog-ng (syslog next generation) to collect your syslog data, how to configure syslog-ng, and how to troubleshoot and handle advanced syslog topics.
Why are you suggesting syslog-ng? What about Kiwi Syslog?
Simply put, if you want to syslog effectively, you need to do it on Linux. Windows doesn’t handle large volumes of constant data as well as Linux systems do. When you’re establishing a logging architecture, you want to ensure you receive all the data, and that the data is correct and easy to work with. Particularly when you start to scale beyond a handful of systems, or beyond a few tens of gigabytes of data per day, you really need a Linux syslog server to accomplish your goals.
What about rsyslog? Rsyslog is supported out of the box on Red Hat and works well. Syslog-ng is intended to do more than rsyslog and is becoming the de facto syslog receiver due to its strong feature set. If you’re already comfortable using rsyslog, there’s no reason to change, provided you’re getting the performance and data usability you’re looking for. But if you’re starting fresh, I suggest using syslog-ng.
Why not syslog directly to Splunk?
For organizations that aren’t currently using syslog, the first question is generally, “Why not send syslog data straight to Splunk?” After all, Splunk does support receiving syslog data, but there are some limitations to using Splunk as a syslog receiver.
The default protocol for syslog is UDP, which sends data and does not care whether that data is ever received. Think of it as a radio station broadcasting a song (UDP) – if your radio is on and tuned to that station, you can hear the song, but if your radio is turned off, the station is still broadcasting the same song. This can be problematic if your receiving server is unavailable. Splunk servers periodically restart due to things like application installs and can take some time to shut down and start back up, causing the receiving end to miss messages, which results in data loss. A syslog receiver only needs to be restarted for OS patching, syslog configuration updates, or unexpected outages, and the syslog process can generally stop/start almost instantaneously, resulting in very little downtime.
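The fire-and-forget nature of UDP is easy to demonstrate: a sender can push a syslog-formatted datagram at a port with no listener behind it and get no error back. A minimal sketch (the port and message below are arbitrary placeholders):

```python
import socket

# A syslog-style message: <PRI>timestamp host app: text
msg = b"<134>Oct 11 22:14:15 myhost myapp: test message"

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
# sendto() returns the number of bytes handed to the network stack.
# It succeeds whether or not anything is listening on the destination
# port -- the sender gets no delivery confirmation, so a receiver that
# is down (e.g., a restarting Splunk server) silently loses the message.
sent = sock.sendto(msg, ("127.0.0.1", 5514))
sock.close()
print(sent == len(msg))  # True even with no listener on 5514
```

This is exactly why a restarting receiver means lost data: from the sender’s perspective, nothing went wrong.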
You cannot load balance syslog data across multiple Splunk servers. However, collecting the data with a universal forwarder (UF) from a syslog server allows you to send it to multiple Splunk indexers in a load-balanced way.
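As a sketch of that pattern, a universal forwarder reading the syslog server’s output files can auto-load-balance across indexers via outputs.conf (the hostnames below are hypothetical):

```
# outputs.conf on the universal forwarder (hostnames are placeholders)
[tcpout]
defaultGroup = primary_indexers

[tcpout:primary_indexers]
server = idx1.example.com:9997, idx2.example.com:9997, idx3.example.com:9997
# The UF rotates among these servers automatically, so syslog data
# collected on disk is spread across all indexers.
```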
Syslog generally listens on port 514, and some devices don’t allow you to configure a different output port. To get Splunk (or any service) to open a port below 1024, you must run the service as root or implement firewall rules to reroute ports. Running Splunk as root would increase its attack surface, and modifying firewall rules to reroute data adds extra complexity.
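If you do need a non-root collector to receive traffic arriving on port 514, the usual workaround is a local redirect rule. For example, with iptables (the unprivileged port 5514 is illustrative):

```
# Redirect inbound syslog traffic on 514 to an unprivileged port
# (e.g., 5514) that the collector can bind without root.
iptables -t nat -A PREROUTING -p udp --dport 514 -j REDIRECT --to-ports 5514
iptables -t nat -A PREROUTING -p tcp --dport 514 -j REDIRECT --to-ports 5514
```

These rules must be persisted across reboots and maintained alongside the rest of your firewall policy, which is part of the added complexity mentioned above.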
Disaster recovery – Setting up a new syslog server is much faster than setting up a new Splunk server. Once you have your syslog config, it can be easily backed up and repurposed for recovery or establishing a new syslog receiver on a different network segment.
Using Splunk as a syslog receiver doesn’t allow you to filter out events until after they’ve been processed. Syslog allows you to sort and filter events before they hit Splunk, which is more efficient.
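As a sketch of that pre-Splunk filtering, syslog-ng can drop noise before it’s ever written to disk. This hypothetical fragment discards debug-level messages from one chatty device while writing everything else out per host (names and paths are illustrative):

```
# syslog-ng configuration sketch (names and paths are placeholders)
source s_net {
    network(transport("udp") port(514));
};

filter f_drop_debug {
    not (host("chatty-device") and level(debug));
};

destination d_file {
    file("/var/log/remote/${HOST}/messages");
};

log { source(s_net); filter(f_drop_debug); destination(d_file); };
```

A universal forwarder then monitors /var/log/remote, so the filtered-out events never consume Splunk processing or license.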
Splunk as a syslog receiver uses a 1:1 mapping between the receiving port and the index/sourcetype the data is saved with. So, all data sent to a given port gets the same sourcetype and the same index.
It’s possible to add multiple inbound ports to make mapping to index and sourcetype easier, but this requires more firewall rules to allow the traffic. Adding a new port for each index/sourcetype combination may not be feasible.
Some devices don’t allow you to define a syslog port, so you’re stuck with whatever the device supports.
It’s possible to use Splunk transforms to reroute the index or to change the sourcetype, but this is a complex task that adds work to the Splunk system and requires a new transform for each new sourcetype or index combination you need to add. Further, the transforms would need to run against all data sent to that given port, which impacts ingestion performance.
You can use host restrictions to assist in routing data to a specific index/sourcetype. However, this also becomes cumbersome, as you must update the inputs each time you onboard a new host to Splunk.
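For contrast, this is what the 1:1 mapping looks like when Splunk itself listens: each port carries exactly one index/sourcetype pair, so every new combination means a new port and a new firewall rule (the values below are examples):

```
# inputs.conf on a Splunk instance acting as the syslog receiver
[udp://514]
index = network
sourcetype = syslog

# A second index/sourcetype combination requires a second port --
# and another firewall rule to allow the traffic.
[udp://515]
index = firewall
sourcetype = cisco:asa
```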
Are there any prerequisites or external dependencies for syslog success?
There are two primary prerequisites: a good network topology and good DNS records for all devices that send syslog data.
All devices must be able to communicate directly with the syslog server. Events should not pass through routers or firewalls, since devices that NAT or proxy the traffic can rewrite the source IP address and break host identification. This may mean using VLANs or multiple NICs to ensure all devices can communicate directly with the syslog server.
DNS is an integral part of syslog success. Whether you want your devices recognized by hostname or IP address, you’ll need good DNS information. This includes both an A record (name to IP) and a PTR record (IP to name) to allow for forward and reverse lookup. Why?
PTR Records
Allow for filtering data by hostname when the syslog message itself only presents an IP address.
Allow for output to a file name with the hostname even if the message doesn’t contain the hostname.
A Records
Allow for filtering data by IP address when the syslog message itself only presents a hostname.
Allow for output to a file name with the IP address even if the message doesn’t contain the IP address.
A & PTR Records
Allow for filtering data by hostname and/or IP address regardless of which value was in the original message.
Allow for output to a file name with the IP address or hostname regardless of which value is in the original message.
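Before pointing a device at the syslog server, it’s worth confirming that both directions resolve. With standard DNS tools (the hostname and IP below are placeholders):

```
# Forward (A record): name -> IP
dig +short fw01.example.com

# Reverse (PTR record): IP -> name
dig +short -x 192.0.2.10

# Or check both with host:
host fw01.example.com
host 192.0.2.10
```

If either lookup fails or the two directions disagree, fix DNS before onboarding the device; syslog-ng relies on these lookups for the hostname/IP behavior described above.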
Some common issues that can arise when you don’t leverage both A records and PTR records for the devices you’re logging with syslog:
Multiple devices may have the same hostname (e.g., localhost). This means you can only filter by, and write to a file by, IP address – otherwise you end up with a ‘localhost’ file that contains data from multiple devices. This is common with ESX hosts, F5 load balancers, Check Point devices, and other appliances, which sometimes log their internal events differently than their other syslog events.
Output hostnames may be inconsistent. For example, some devices may report their hostname as the IP address while others use a short hostname or an FQDN.