Splunk data onboarding: Success with Syslog-ng and Splunk – part 3 troubleshooting
NuHarbor Security
If you’re just getting started, be sure to check out Parts 1 and 2 of our series:
Data Onboarding Success Part 1 – Success with Syslog-NG and Splunk, The Install and Setup.
Data Onboarding Success Part 2 – Success with Syslog-NG and Splunk. The Basics.
Troubleshooting Syslog-ng and Splunk
Below is an assortment of common troubleshooting situations you’ll face with syslog-ng and Splunk.
Service Issues
1. Syslog-ng Service Won’t Start
Begin troubleshooting syslog-ng service issues by checking the configuration for syntax errors:
syslog-ng --syntax-only
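One way to put this check to work is to gate restarts on it, so a broken configuration never takes the listener down. The sketch below assumes syslog-ng is managed by systemd under the unit name syslog-ng. The function name and the SYSLOG_NG_BIN and RESTART_CMD variables are made up for illustration.

```shell
# Hypothetical wrapper: restart syslog-ng only if the configuration parses.
# SYSLOG_NG_BIN and RESTART_CMD are illustrative overrides, not syslog-ng settings.
check_and_restart() {
    if "${SYSLOG_NG_BIN:-syslog-ng}" --syntax-only; then
        echo "Configuration OK, restarting syslog-ng"
        # Assumption: the systemd unit is named syslog-ng.
        ${RESTART_CMD:-systemctl restart syslog-ng}
    else
        echo "Configuration has errors, not restarting" >&2
        return 1
    fi
}
```

Call check_and_restart after each edit to syslog-ng.conf. Any syntax errors are printed by syslog-ng itself before the function returns.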
Organizing Issues
2. Keeping Track of Events From Multiple Syslog Servers
We can add new metadata to all events that come from your syslog server to keep track of which host sent the data to Splunk. To do this, we need to update our inputs.conf and add a fields.conf file.
Add the following to your inputs.conf:
[default]
_meta = splunk_syslog_server::SERVERNAME
Create a fields.conf file with the following:
[splunk_syslog_server]
INDEXED = true
3. Centralized Syslog Server: Consolidating Multiple Departments/Networks and Displaying Original Syslog Server Events
Syslog will always use the “last hop” value for the hostname or IP address. So, if data passes through an intermediate syslog server, you’ll need to modify the options on that intermediate forwarder to chain the hostnames.
# If the log message is forwarded to the logserver via a relay, and the
# chain_hostnames() option is 'yes', the relay adds its own hostname to
# the hostname of the client, separated with a / character.
chain_hostnames(yes);
Unless you need to see the relay path in the hostname, you’ll generally want to leave this set to chain_hostnames(no).
Resourcing Issues
4. Event Efficiency: Streamlining Syslog Server's Event Forwarding to Splunk
You’ll need to increase the output limits so your universal forwarder (UF) can send data as fast as it receives it. To do this, modify limits.conf on the forwarder.
# By default a universal or light forwarder is limited to 256kB/s
# Either set a different limit in kB/s, or set the value to zero to
# have no limit.
# Note that a full speed UF can overwhelm a single indexer.
[thruput]
#maxKBps = 0 is unlimited throughput. Which is what we want
#for our syslog server to ensure we send syslog as fast as
#we receive it
maxKBps = 0
5. Monitoring for Dropped Packets
netstat -su | grep "receive errors"
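The counter reported by netstat is cumulative, so a single reading does not tell you whether packets are being dropped right now. A minimal sketch for comparing two samples follows. It assumes the Linux net-tools output format, where the line reads "packet receive errors", and the helper name is hypothetical.

```shell
# Hypothetical helper: extract the cumulative UDP receive-error counter
# from `netstat -su` output supplied on stdin. Prints 0 if no such line exists.
udp_receive_errors() {
    awk '/packet receive errors/ { print $1; found = 1; exit }
         END { if (!found) print 0 }'
}

# Example use (not run here): compare two samples taken 60 seconds apart.
#   before=$(netstat -su | udp_receive_errors)
#   sleep 60
#   after=$(netstat -su | udp_receive_errors)
#   echo "UDP receive errors in the last 60s: $((after - before))"
```

A difference that keeps growing between samples suggests the receive buffer is too small for the current message rate.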
If you’re dropping UDP packets, please see below:
1. Adjust so-rcvbuf:
When receiving messages using the UDP protocol, increase the size of the UDP receive buffer on the syslog receiver host. On certain platforms, even low message load (~200 messages per second) can result in message loss, unless the so-rcvbuf() option of the source is increased.
In such cases, you’ll need to increase the net.core.rmem_max parameter of the host but do not modify the net.core.rmem_default parameter. As a general rule, increase the so-rcvbuf() so that the buffer size in kilobytes is higher than the rate of incoming messages per second.
For example, to receive 2000 messages per second, set the so-rcvbuf() at least to 2,097,152 bytes.
e.g.:
source s_udp_network { udp(ip(0.0.0.0) port(514) so-rcvbuf(2097152)); };
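The sizing rule above can be sketched as a small calculation. Rounding up to a power of two is an assumption made here for tidy values, not something syslog-ng requires, and the function name is hypothetical.

```shell
# Hypothetical helper: suggest a so-rcvbuf() size in bytes for a given
# incoming message rate, following the rule that the buffer size in
# kilobytes should be higher than the messages-per-second rate.
suggest_rcvbuf_bytes() {
    rate=$1   # expected incoming messages per second
    kib=1
    # Assumption: round up to the next power of two (in KiB) above the rate.
    while [ "$kib" -le "$rate" ]; do
        kib=$((kib * 2))
    done
    echo $((kib * 1024))
}
# suggest_rcvbuf_bytes 2000   # prints 2097152
```

For 2000 messages per second this reproduces the 2,097,152-byte value used in the source definition.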
2. Adjust net.core.rmem_max parameter (use large buffers in kernel):
If syslog-ng cannot read the messages fast enough from the UDP socket, the kernel receive buffers will start to fill and after the configured limit has been reached, the kernel will start discarding messages. In this case, it’s necessary to adjust the buffer size accordingly. To raise the size of the kernel receive buffers, use the sysctl command to tune the net.core.rmem_max parameter. Next, raise the size of the so-rcvbuf option of the syslog-ng source definition as well, so that syslog-ng can utilize the larger kernel receive buffers.
In a high traffic environment, up to 256MB might be necessary:
Enter the value in bytes:
sysctl -w net.core.rmem_max=268435456
In the example above, 256*1024*1024=268435456 bytes.
As a rule of thumb, this buffer size should be enough to accommodate incoming peak message rate for at least one second.
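Note that sysctl -w only changes the running kernel, so the value is lost on reboot unless it is also written to /etc/sysctl.conf or a file under /etc/sysctl.d/. The sketch below converts a size in MiB to bytes and prints the setting in sysctl.conf syntax. Both function names are hypothetical.

```shell
# Hypothetical helper: convert a buffer size in MiB to the byte value sysctl expects.
mib_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

# Print the setting in sysctl.conf syntax so it can be reviewed before use.
rmem_max_setting() {
    echo "net.core.rmem_max = $(mib_to_bytes "$1")"
}
# rmem_max_setting 256   # prints: net.core.rmem_max = 268435456
```

One possible use is to redirect the output of rmem_max_setting 256 into a file such as /etc/sysctl.d/90-syslog-udp.conf (an illustrative file name) and then reload the settings with sysctl --system.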
3. Adjust log_fifo_size()
To be able to handle message bursts, increase the value of the log_fifo_size() option to match the value configured for net.core.rmem_max in the previous step.
Configuration Pointers
6. TLS syslog Input
You’ll need to add a new source line, and update your log line:
source s_tls_remotes { network(ip(0.0.0.0) port(10514) transport("tls") tls(
key-file("/opt/syslog-ng/key.d/privkey.pem")
cert-file("/opt/syslog-ng/cert.d/cacert.pem")
peer-verify(optional-untrusted)
)); };
### Logs
log {
source(s_tls_remotes);
filter(f_TLS_SOURCETYPE);
destination(d_TLS_SOURCETYPE);
};
In the example above, we enable a new listener on port 10514, specify the transport as TLS, and point to a private key and certificate file on the syslog server. The peer-verify(optional-untrusted) setting tells syslog-ng not to require a certificate from the sending device, and to accept one even if it cannot be validated. If you need to validate the certificate, comment out the peer-verify line or remove it.
If you need further assistance with TLS inputs, refer to the Administration Guide at https://support.oneidentity.com/technical-documents/syslog-ng-premium-edition/7.0.9/administration-guide/52#TOPIC-955503
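Once the listener is running, a quick way to confirm that it answers with the certificate you expect is to connect with the openssl command-line tool. This is a sketch that assumes openssl is installed on the machine you test from. The function name and the example hostname are placeholders.

```shell
# Hypothetical helper: connect to a TLS syslog listener and print the
# subject, issuer, and validity dates of the certificate it presents.
check_tls_listener() {
    host=$1
    port=${2:-10514}   # 10514 matches the listener in the example above
    echo | openssl s_client -connect "${host}:${port}" 2>/dev/null \
        | openssl x509 -noout -subject -issuer -dates
}
# Example (hostname is a placeholder): check_tls_listener syslog.example.com
```

If nothing is printed, the handshake did not complete. Check that the port is reachable and that the key and certificate paths in the source definition are correct.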
7. Sorting Out ‘Info’ and ‘Debug’ Messages to Only Show Warnings and Errors
You can get granular with the logging by log level or select ranges of log levels. You can then tell Splunk to only onboard the messages you care about while retaining or discarding the rest.
## Filters
## Useful filters
## level(emerg, alert, crit, err, warning, notice, info, debug)
## The level() filter selects messages corresponding to a single importance level, or a level-range.
## To select messages of a specific level, use the name of the level as a filter parameter,
## for example use the following to select warning messages: level(warning)
## To select a range of levels, include the beginning and the ending level in the filter,
## separated with two dots (..). For example, to select every message of error or higher level, use the following filter:
## level(err..emerg)
filter f_alert { level(alert); };
filter f_crit { level(crit); };
filter f_debug { level(debug); };
filter f_emerg { level(emerg); };
filter f_err { level(err); };
filter f_info { level(info); };
filter f_notice { level(notice); };
filter f_warn { level(warning); };
#destination d_cisco_asa_info { file("/opt/syslog/logs/cisco/asa/$HOST/$YEAR-$MONTH-$DAY-cisco-asa-INFO.log" template(t_default_format) ); };
#destination d_cisco_asa_notice { file("/opt/syslog/logs/cisco/asa/$HOST/$YEAR-$MONTH-$DAY-cisco-asa-NOTICE.log" template(t_default_format) ); };
#destination d_cisco_asa_warn { file("/opt/syslog/logs/cisco/asa/$HOST/$YEAR-$MONTH-$DAY-cisco-asa-WARN.log" template(t_default_format) ); };
#destination d_cisco_asa_crit { file("/opt/syslog/logs/cisco/asa/$HOST/$YEAR-$MONTH-$DAY-cisco-asa-CRIT.log" template(t_default_format) ); };
#destination d_cisco_asa_err { file("/opt/syslog/logs/cisco/asa/$HOST/$YEAR-$MONTH-$DAY-cisco-asa-ERR.log" template(t_default_format) ); };
#destination d_cisco_asa_emerg { file("/opt/syslog/logs/cisco/asa/$HOST/$YEAR-$MONTH-$DAY-cisco-asa-EMERG.log" template(t_default_format) ); };
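The level names in these filters correspond to the numeric severities defined in RFC 5424, from 0 (emerg) to 7 (debug), so a range such as level(err..emerg) covers severities 3 through 0. As a rough aid for checking which filter a raw message would match, the sketch below decodes the severity from the PRI value at the start of a syslog message. The function name is hypothetical.

```shell
# Map a syslog PRI value (facility * 8 + severity, per RFC 5424) to the
# level name used by syslog-ng's level() filter.
severity_name() {
    case $(( $1 % 8 )) in
        0) echo emerg ;;
        1) echo alert ;;
        2) echo crit ;;
        3) echo err ;;
        4) echo warning ;;
        5) echo notice ;;
        6) echo info ;;
        7) echo debug ;;
    esac
}
# severity_name 165   # prints: notice  (facility 20 is local4; 20*8 + 5 = 165)
```

A message that starts with <165> would therefore match f_notice and would fall outside a warnings-and-errors range such as level(warning..emerg).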
8. Devices Sending Syslog From Multiple Time Zones
You’ll probably want to modify your t_default_format template to use ${ISODATE} rather than ${DATE}. This ensures that the time zone is included in all logged events. We can then tell Splunk to use that time zone when parsing the timestamps.
Help, I need a different template for one of my log sources. What other options are there?
Here are some additional fields that may be useful in formatting your syslog messages:
- An S_ prefix on a time macro means the value is taken from the source of the event. R_ means the time the syslog server received the message. C_ means the current time.
- S_ is useful for grabbing the timestamp from the sending server rather than the receiving time of the syslog server, e.g. ${S_DATE}
- To use the S_ macros, the keep-timestamp() option must be enabled (this is the default behavior of syslog-ng OSE).
- ${DATE} = Jun 13 15:58:00
- ${ISODATE} = 2006-06-13T15:58:00+01:00
- ${HOST} = server that generated the message
- ${LOGHOST} = server that received the message (if you use DNS lookup the host must be in DNS or it will say localhost.localdomain)
- ${MSGHDR} = The ${PROGRAM} and ${PID} fields (the program name and process ID of the syslog message) in PROGRAM[PID]: format. Includes a trailing whitespace after the :
- ${MSG} = Alias for the ${MESSAGE} macro. Text contents of the log message without the program name and PID. Those are available together in the ${MSGHDR} macro, and separately in the ${PROGRAM} and ${PID} macros.
- ${SOURCEIP} = Alternative to using ${HOST}, useful if you don’t have DNS for a host
Additional macro info here: https://syslog-ng.com/documents/html/syslog-ng-ose-latest-guides/en/syslog-ng-ose-guide-admin/html/reference-macros.html.
If you have issues with escape characters, see this guide: https://syslog-ng.com/documents/html/syslog-ng-ose-latest-guides/en/syslog-ng-ose-guide-admin/html/configuring-macros.html.
9. ${S_DATE} Used in Template; Now Getting Events With Incorrect Timestamps (e.g., Dec 31, 1900)
When we tell syslog to accept whatever timestamp is in the original event, it’s possible we end up with bad time data. It’s crucial that you have NTP enabled on all devices that send syslog data before configuring ${S_DATE}. Additionally, some devices, like routers, may start up with a bad date until they connect to an NTP server and update their time.
10. ${DATE} Used in Template; Now Some Events Have a Current Timestamp Rather Than the Original Event Timestamp
This is expected behavior and can occur if a device does not have connectivity for a while, then reconnects. To account for issues like this, you may want to include two timestamps in your template, ${S_DATE} and ${DATE}. This is also useful in determining lag between source and syslog receiver.
11. Filters Aren’t Working; All the Data in catch_all Directory Says it’s Coming From the Same Device Name/IP
This can happen if you’re trying to syslog across subnets and passing the data through a router. Ensure that your device can directly communicate with the syslog server without passing through routers or firewalls.
Still want to know more?
More than you ever wanted to know about syslog: https://tools.ietf.org/html/rfc5424.
Here is a link to the current version of the Syslog-NG Open Source Edition Administrators Guide: https://www.syslog-ng.com/documents/html/syslog-ng-ose-latest-guides/en/syslog-ng-ose-guide-admin/html/