By: Dan Potter, Security Engineer
Many organizations already leverage syslog for data collection. It’s easy to get up and running and get data logged to a file. However, when it comes to bringing this data into Splunk, there are a few things that can help with your long-term success. In this three-part series we will explore why you should use Syslog-ng to collect your syslog data, How to Configure Syslog-ng, and Troubleshooting and Advanced Syslog Topics.
Why are you suggesting syslog-ng? What about Kiwi Syslog?
The truth is, if you want to do syslog, and do it right, you need to do it on Linux. Simply put, Windows doesn’t handle large volumes of constant data as well as Linux systems do. When you are establishing a logging architecture, you want to ensure that you receive all the data, that the data is correct, and that it is easy to work with. Particularly when you start to scale beyond a handful of systems, or a few tens of GB/day of data, you really need a Linux syslog server to accomplish those goals.
What about rsyslog? Rsyslog is supported out of the box on Red Hat and works well. Syslog-ng (Syslog Next Generation) is intended to do more than rsyslog and is becoming a de facto standard syslog receiver due to its feature set. If you are already using rsyslog and are comfortable with it, there is no real reason to change, provided you are getting the performance and data usability you are looking for. But if you are starting fresh, I suggest using syslog-ng.
Why not syslog directly to Splunk?
For organizations that are not currently using syslog, the first question is generally: why not send syslog data straight to Splunk? After all, Splunk does support receiving syslog data. However, there are some limitations to using Splunk as a syslog receiver:
- The default protocol for syslog is UDP, which sends data without checking whether it is ever received. Think of it as a radio station broadcasting a song: if your radio is on you can hear it, but if the radio is turned off the station keeps broadcasting anyway. As you may imagine, this causes problems if your receiving server is not available. Splunk servers periodically restart for things like application installs and can take some time to shut down and start back up, so the receiving end misses messages and you lose data. A syslog receiver only needs to be restarted for OS patching, syslog configuration updates, or unexpected outages, and the syslog process can generally stop and start almost instantaneously, resulting in very little downtime.
- You cannot load balance syslog data across multiple Splunk servers. However, collecting data from a syslog server with a Universal Forwarder (UF) lets you send the data to multiple Splunk indexers in a load-balanced way.
- Syslog generally listens on port 514, and some devices do not allow you to configure a different output port. To get Splunk (or any service) to listen on a port below 1024, you must run the service as the root user or implement firewall rules to redirect the port. Running Splunk as root increases the attack surface of Splunk, and modifying firewall rules to redirect data adds extra complexity.
- Disaster Recovery – Setting up a new syslog server is much faster than setting up a new Splunk server. Once you have your syslog config it can be easily backed up and repurposed for recovery or establishing a new syslog receiver on a different network segment.
- Using Splunk as a syslog receiver does not allow you to filter out events until after they have been processed. A syslog server lets you sort and filter events before they reach Splunk, which is more efficient.
- Splunk as a syslog receiver maps each receiving port 1:1 to an index and sourcetype, so all data sent to a given port gets the same sourcetype and the same index.
  - It’s possible to add multiple inbound ports to make mapping to index and sourcetype easier, but this requires more firewall rules to allow the traffic. Adding a new port for each index/sourcetype combination may not be feasible.
  - Some devices do not allow you to define a syslog port, so you are stuck with whatever the device supports.
  - It is possible to use Splunk transforms to re-route the index or to change the sourcetype, but this can be a complex task that adds work to the Splunk system and requires a new transform for each new sourcetype or index combination you need to add. Further, the transforms would need to run against all data sent to that port, which impacts ingestion performance.
  - You can use host restrictions to help route data to a specific index/sourcetype. However, adding new hosts then becomes cumbersome, because you need to update the Splunk inputs each time you onboard a new host.
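The fire-and-forget behavior of UDP described in the first bullet above can be seen in a few lines of Python. This is only a minimal sketch: the helper names `send_udp_syslog` and `find_unused_udp_port` and the sample message are made up for illustration, and it sends to a loopback port where nothing is listening rather than to a real Splunk or syslog server.

```python
import socket


def send_udp_syslog(message: str, host: str, port: int) -> int:
    """Send one syslog-style datagram over UDP.

    Returns the number of bytes handed to the OS. That number says nothing
    about whether any receiver actually got the message.
    """
    payload = message.encode("utf-8")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.sendto(payload, (host, port))


def find_unused_udp_port() -> int:
    """Ask the OS for a free UDP port, then release it so nothing listens there."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as probe:
        probe.bind(("127.0.0.1", 0))
        return probe.getsockname()[1]


if __name__ == "__main__":
    port = find_unused_udp_port()
    # <134> is the syslog priority for facility local0 (16 * 8) + severity info (6).
    sent = send_udp_syslog("<134>demo-host app: test event", "127.0.0.1", port)
    print(f"sent {sent} bytes to port {port}; nobody was listening")
```

The sender gets no error and has no way to tell the event was dropped, which is why the receiver's uptime matters so much.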
Are there any prerequisites or external dependencies for Syslog success?
There are two primary prerequisites: a good network topology and good DNS records for all devices that send syslog data.
All devices must be able to communicate directly with the syslog server. Events should not pass through routers or firewalls. This may mean using VLANs or multiple NICs to ensure all devices can communicate directly with the syslog server.
DNS is an integral part of syslog success. Whether you want your devices recognized by Hostname or IP address, you will need good DNS information. This includes both an A record (name to IP) and a PTR record (IP to name), to allow for forward and reverse lookup. Why?
- A Records
  - Allow for filtering data by IP address when the syslog message itself only presents a Hostname.
  - Allow for output to a file name with the IP address even if the message doesn’t contain the IP address.
- PTR Records
  - Allow for filtering data by Hostname when the syslog message itself only presents an IP address.
  - Allow for output to a file name with the Hostname even if the message doesn’t contain the Hostname.
- A & PTR Records
  - Allow for filtering data by Hostname and/or IP address regardless of which value was in the original message.
  - Allow for output to a file name with the IP address or Hostname regardless of which value is in the original message.
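One way to confirm that a device has both records before onboarding it is to do a forward lookup and then a reverse lookup and compare the results. The sketch below assumes Python's standard `socket` resolver calls; the `check_dns` function, the `DnsCheck` type, and the example hostname are hypothetical names used only for illustration.

```python
import socket
from typing import Callable, NamedTuple, Optional


class DnsCheck(NamedTuple):
    hostname: str
    ip: Optional[str]            # result of the forward (A record) lookup
    reverse_name: Optional[str]  # result of the reverse (PTR record) lookup
    consistent: bool             # True only if the PTR name matches the hostname


def _reverse_lookup(ip: str) -> str:
    # gethostbyaddr returns (primary_name, alias_list, ip_list)
    return socket.gethostbyaddr(ip)[0]


def check_dns(
    hostname: str,
    forward: Callable[[str], str] = socket.gethostbyname,
    reverse: Callable[[str], str] = _reverse_lookup,
) -> DnsCheck:
    """Resolve name -> IP, then IP -> name, and report whether they agree."""
    try:
        ip = forward(hostname)
    except OSError:  # socket.gaierror and socket.herror are OSError subclasses
        return DnsCheck(hostname, None, None, False)
    try:
        name = reverse(ip)
    except OSError:
        return DnsCheck(hostname, ip, None, False)
    consistent = name.rstrip(".").lower() == hostname.rstrip(".").lower()
    return DnsCheck(hostname, ip, name, consistent)


if __name__ == "__main__":
    # Hypothetical device name; replace with a host from your own environment.
    print(check_dns("fw01.example.com"))
```

The resolver functions are passed in as parameters so the check can be exercised against a fixed table of names without touching real DNS.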
Some common issues that can arise when you do not leverage both A records and PTR records for the devices you are logging with syslog:
- Multiple devices may have the same hostname (e.g. ‘localhost’). This means you can only filter and write to a file by IP address, or you end up with a ‘localhost’ file that contains data from multiple devices. This is common with ESX hosts, F5 load balancers, Check Point, and appliances, which sometimes log their internal events differently than their other syslog events.
- Output hostnames may be inconsistent. For example, some devices may report their hostname as the IP address, others use a short hostname, and others use an FQDN.
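To make the two issues above concrete, here is a rough sketch of how a receiver could pick one consistent name per sender: prefer the PTR name of the source IP, fall back to the reported hostname only if it is usable, and otherwise use the source IP. The `canonical_host` function, the `AMBIGUOUS_NAMES` set, and the sample addresses are illustrative assumptions, not part of syslog-ng or Splunk.

```python
import ipaddress
from typing import Callable, Optional

# Names that many devices report for themselves and that cannot identify a sender.
AMBIGUOUS_NAMES = {"localhost", "localhost.localdomain"}


def is_ip(value: str) -> bool:
    """Return True if value parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


def canonical_host(
    reported: str,
    source_ip: str,
    reverse: Callable[[str], Optional[str]],
) -> str:
    """Choose a single name for a sender.

    `reverse` is any function mapping an IP to its PTR name (or None).
    """
    ptr_name = reverse(source_ip)
    if ptr_name:
        return ptr_name.rstrip(".").lower()
    name = reported.strip().lower()
    if name and name not in AMBIGUOUS_NAMES and not is_ip(name):
        return name
    return source_ip


if __name__ == "__main__":
    # Hypothetical PTR table standing in for real reverse DNS.
    ptr_table = {"10.0.0.5": "esx01.example.com."}
    print(canonical_host("localhost", "10.0.0.5", ptr_table.get))
    print(canonical_host("localhost", "10.0.0.9", ptr_table.get))
```

With a PTR record in place, two devices that both call themselves ‘localhost’ end up under different names; without one, the best available key is the source IP.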