overriding default syslog host extraction

I had a customer recently ask how to change the host that was applied to a particular set of incoming events. Normally this wouldn’t be a big deal: just specify the new name in inputs.conf. But this data is from syslog. When you set one of the syslog sourcetypes, there is some extra processing to extract the hostname from the event itself, and that overrides other settings. In this case the hostname in the event is wrong.

So to get the right one, I set up a transform that forces the host to a specified value while still keeping my correct syslog sourcetype.
My inputs.conf is tailing an entire directory, which for the sake of demonstration I’m going to pretend is all syslog.

$ more inputs.conf
host = support09.splunk.com
[tail:///var/log]
disabled = false
host = support09.splunk.com
sourcetype = syslog

props.conf is specifying a transform only for the source of interest:

$ more props.conf
[source::/var/log/system.log]
# note: overriding default syslog transform!
TRANSFORMS = feorlenhost

and transforms.conf is defining what to do to it. I have to specify a REGEX, but I’m not actually using it so I’ll just say ‘.’ to match everything. The FORMAT line is what is going to set my host:
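The stanza itself isn't shown here, so this is a minimal sketch of what it would look like, assuming the standard host-override settings (DEST_KEY = MetaData:Host with a host:: prefix in FORMAT). The hostname is a placeholder, not the value actually used:

```ini
# transforms.conf (sketch; the hostname below is a made-up placeholder)
[feorlenhost]
# REGEX is required; '.' matches any event with at least one character
REGEX = .
# write the result into the event's host metadata
DEST_KEY = MetaData:Host
# literal value to assign; nothing captured by REGEX is used
FORMAT = host::myrealhost.example.com
```

Since FORMAT contains no capture-group references like $1, every event from the matching source gets the same fixed host regardless of what the syslog line says.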

Digging into metrics.log

Occasionally people ask for help in identifying a rogue data input that is suddenly spewing events. If it’s hidden in a ton of similar data it can be difficult to sort out which one is actually the problem. One place to look is the Splunk internal metrics.log. You can find it by searching the internal index (add “index=_internal” to your search) or just look in the file itself (located in $SPLUNK_HOME/var/log/splunk).

Before I get into what can be found there, I need to explain what metrics.log is not. It is a sampling over 30-second intervals, so it will not give you an exact accounting of all your inputs. For each type of item reported, you get the top ten hot sources over the interval, based on the size of the event (_raw). These numbers are different from the ones reported by LicenseManager, which include the indexed fields. Also, the default configuration only keeps the metrics data in the internal index for a few days, but by going to the files you can see trends over a period of months if your rolled files go that far back.

A typical metrics.log has stuff like this:

conf files, part 2

Here are a couple more of my conf files explained. First the simple one:

server.conf

[sslConfig]
enableSplunkSearchSSL = true

All this says is that I’m using SSL on the front end. I clicky clicky the nice UI control and it magically happens. There could be a pile of other stuff in here, like specifying real paid-money-for certs if I were using any. But I’m not. Self-signed works for me, even if it means my users get whiny messages from their browsers. Whatever.

access_controls.conf

[roles]
apache2 = source::/var/log/apache2

[groups]
hosted_user = apache2

[users]
user1 = hosted_user

I added some access controls to help out one of my novice users, somebody who maintains the content on several sites but isn’t a big sysadmin. I set up a role that only allows access to the apache logs and assigned it to the group hosted_user, which is then specified for user1. I thought about giving her access to just the files she needs, but that would mean specifying each one individually, either in multiple roles or in a single role with a bunch of OR terms.
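For comparison, the single-role-with-OR-terms option I decided against might look roughly like this. It's only a sketch: the role name and file names are made up, and I'm assuming the role filter accepts OR between source:: terms the same way a search does.

```ini
# access_controls.conf (sketch; role name and file paths are hypothetical)
[roles]
# one role listing each permitted file explicitly, joined with OR
sitelogs = source::/var/log/apache2/site1_access_log OR source::/var/log/apache2/site2_access_log

[groups]
hosted_user = sitelogs
```

Every new site means another OR term, which is why pointing the role at the whole apache2 directory was less work.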

conf file 101, part 1

I’m going over some stuff for the new support engineers, so I thought it would be useful to put it in a blog post. As an example of what you can do with conf files, I’ve got the changes I make to my own configuration and why. This is more focused on 3.1.x than on preview, but I’m basically using the same configuration in both so far. For public consumption, I’ve changed some names but otherwise these are the contents of my conf files.

This first post is about inputs.conf, props.conf and transforms.conf, the basics of event handling.

inputs.conf

host = myhost

[tail:///Library/Logs/CrashReporter]
disabled = false
sourcetype = crashreporter

[tail:///Library/Logs/MySQL.log]
disabled = false

[tail:///Library/Logs/Software Update.log]
disabled = false

[tail:///Library/Logs/DirectoryService]
disabled = false

[tail:///var/log]
disabled = false

I added the tail on /var/log from the UI but the rest of this I did by hand. That wasn’t strictly necessary, but it was easier for me to add a couple of stanzas at once that way. “host = myhost” sets the name of my machine so everything has the correct hostname even if something in the actual event might make it get set to something else. (Syslog-type events are the usual offender for me, even if I’m not actually getting syslog from another host. Some tend to show up as “www” if I’m not paying attention.) CrashReporter, MySQL.log, Software Update.log and DirectoryService are things specifically in /Library/Logs that I wanted. I needed to set the sourcetype manually for crashreporter, so I just listed the others while I was at it.

getting my existing index into preview

Preview is out the door, woohoo! Up here in support I’ve been busy with the existing versions, so I hadn’t checked out many of the new features. I wanted to mess with real data I care about, so I figured I’d copy my existing index and drop it into my splunkpreview directory. I host a handful of domains at home (on Leopard Server) and I’m using Splunk to watch various things I want to know, like who’s commenting on my blog and how many dictionary attacks I’ve had today. I thought it would be nifty to look at the same data in both 3.1.3 (my current production version) and preview.

The first time I tried it, I thought I’d be clever and set it all up before first startup with my whole index, users, saved searches and basically everything. Because, well, I clone this stuff all the time between 3.1.x versions when I’m setting up repro environments for customer issues. Wrong! Not sure what I forgot, but for my efforts I got a nice big segfault. Well, nothing a little rm won’t fix.