
Splunk Dev: Archive for the 'tech' Tab

May 12th, 2008

Did you know that your Active Directory is just a glorified LDAP?

Microsoft Tube Surfers,

Wanted to take a minute to talk about authenticating Splunk against Active Directory. In case you didn’t know, Active Directory runs on top of LDAP. While the guys up in Redmond do their best to make sure that you have no need to know LDAP, they give you the ability to interface with it over LDAP if you know what you’re doing. Let’s take this time to go over what you need to do.

If you are comfortable with the command line, you can run the ldifde command. ldifde is the Windows equivalent of ldapsearch and should allow you to get an LDIF entry for yourself and a group. With those two entries we should be able to come up with an authentication.conf that will allow Splunk to authenticate users.
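Once you have an LDIF entry in hand, the values you care about (your DN, your login attribute, the groups you belong to) can be pulled out with a few lines of script. The following is a minimal Python sketch, not anything that ships with Splunk; the sample entry, the names in it, and the helper parse_ldif_entry are made up for illustration.

```python
def parse_ldif_entry(text):
    """Parse one LDIF entry into a dict of attribute -> list of values."""
    # LDIF wraps long values onto continuation lines that start with one space.
    unfolded = []
    for line in text.splitlines():
        if line.startswith(" ") and unfolded:
            unfolded[-1] += line[1:]
        elif line.strip() and not line.startswith("#"):
            unfolded.append(line)

    entry = {}
    for line in unfolded:
        name, sep, value = line.partition(":")
        if not sep:
            continue
        # "attr:: value" marks base64 data; it is kept undecoded in this sketch.
        entry.setdefault(name.strip(), []).append(value.lstrip(":").strip())
    return entry


# Hypothetical entry, shaped like what an LDIF export of a user looks like.
SAMPLE = """\
dn: CN=Jane Doe,OU=Staff,DC=example,DC=com
changetype: add
objectClass: user
sAMAccountName: jdoe
memberOf: CN=Splunk Admins,OU=Groups,DC=example,
 DC=com
"""

entry = parse_ldif_entry(SAMPLE)
print("user DN:    ", entry["dn"][0])
print("login name: ", entry["sAMAccountName"][0])
print("groups:     ", entry.get("memberOf", []))
```

The user DN tells you where users live in the tree, and the memberOf value tells you where groups live; those are the pieces the documentation asks you to fill in.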

For those of you who are more comfortable with a GUI, the Sysinternals team offers a nice utility called Active Directory Explorer. This gives you a tree view of your Active Directory/LDAP structure.

The information provided by these utilities is pretty much everything you need to know in order to follow along with the documentation. If you are still struggling to get it working, send an email to support@splunk.com with the output from the ldifde command and your authentication.conf, and someone from the team will help square you away.

Read More...

April 30th, 2008

Help Me Help You

Peoples of the Interweb,

As one of the Splunk Support Monkeys I am going to try to start a semi-regular series of posts on a topic that is near and dear to me — getting the Splunk community to be able to troubleshoot their issues without the need to reach out to the Support Team.

The most important piece of any troubleshooting exercise is getting a solid understanding of the problem. The common statement “Shit is broke”, while ‘summarizing’ the problem, doesn’t do much in the way of isolating the specific problem. Taking a minute or two to think about the problem at hand and documenting the sequence of events leading up to it goes a long way toward getting outsiders up to speed on the issue.
Here are a few things to keep in mind when working with support:

I don’t work in the next cube over.

This means I don’t have insight into all of the other moving parts of your network. Try to avoid acronyms that are specific to your organization. I don’t know the naming convention that you use for machine names, so if one box is in LA and the other is in New York, tell me; don’t expect me to know that foo.company.com is sitting in the LA data center.

Less is not more.

You can never give a support engineer too much data. Oftentimes folks think that they have identified the offending error message in the logs and provide that one line in their support ticket. The problem with this is that the support engineer does not get the benefit of context. Most errors are the result of a series of events leading up to the final failure. Being able to see what was going on leading up to the problem oftentimes is what allows

Read More...

April 28th, 2008

Splunk Windows Registry Monitor

Hey everyone, just wanted to let you know that a preview release of Splunk just left the docks.

http://www.splunk.com/index.php/preview

I want to introduce to you one of the latest features for Windows Splunk: the monitoring of the Windows registry in real time for activity/events, and the indexing and searching of these events with Splunk.

While working on this we had a few challenges:

First, there aren’t any published Win32 APIs that do this in user mode. The best that you can do with the Win32 API is to poll the registry for certain registry keys/hives, and you’ll be notified if the key or a subkey of the hive has been changed. Even when you get a notification for a change, you will not be told exactly which key has changed; you’ll have to figure that out yourself.

Second, scalability. You can’t possibly poll all of the registry in user mode for changes. There are simply too many keys to query.

The solution is to write a device driver that hooks into the kernel and intercepts all registry events. The driver bubbles the events up to user mode for filtering and tagging, and finally pipes them to Splunk for indexing. Obviously, this driver needs to be very stable and reliable, and it needs to scale to the point where, if you want to monitor all of the events in the registry, it can handle the load.
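To make the user-mode half of that pipeline concrete, here is a rough Python sketch of the pattern: one process emits events line by line on standard output, and a parent reads them through a pipe and keeps only the ones it cares about. This is not the actual Splunk or driver code. The child process, the "action|key" event layout, and the action names are all invented for the example.

```python
import subprocess
import sys

# Stand-in for the event producer: a child process printing one event per line.
# The "action|key path" layout is an assumption made for this sketch.
CHILD_CODE = r'''
events = [
    "CreateKey|HKLM\\Software\\Example",
    "QueryValue|HKLM\\Software\\Example\\Name",
    "SetValue|HKLM\\Software\\Example\\Name",
    "DeleteKey|HKLM\\Software\\Example",
]
for e in events:
    print(e, flush=True)
'''

# Hypothetical set of actions that count as "changes".
CHANGE_ACTIONS = {"CreateKey", "DeleteKey", "SetValue"}


def read_change_events(cmd):
    """Run cmd, read its stdout line by line, and keep only change-type events."""
    kept = []
    with subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True) as proc:
        for line in proc.stdout:
            action, _, key = line.rstrip("\n").partition("|")
            if action in CHANGE_ACTIONS:
                kept.append((action, key))
    return kept


events = read_change_events([sys.executable, "-c", CHILD_CODE])
for action, key in events:
    print(action, key)
```

Filtering on the reader side keeps the producer simple, at the cost of shipping events over the pipe that are then thrown away.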

With this preview release we launched the first version of the splunk-regmon tool. The tool writes events to standard output, and using Splunk’s ExecProcessor (popen), Splunk is able to get these events and send them through the indexing pipeline. Basic filtering is in place, hard-coded for now to only monitor registry events related to changes, i.e. Create, Delete, Set, etc. Create type events

Read More...

April 24th, 2008

On the off chance you need help with Windows

Hello Internets,

As one of the splunkers responsible for answering the phone, I’m going to use this space to talk about something near and dear to my heart: empowering my customers so they are able to figure out their own problems, thereby allowing me to read FARK all day long.

Since we recently released our Windows version, a bunch of the folks in the office have been trying to figure out how to do the things they do in a UNIX environment (like wget a file) in Windows. I’ve been sharing some of my favorite Windows resources here at the office and figured the rest of you would probably like to know about them as well.

Google
Everyone seems to start here when they are looking for something. Most, however, don’t know that http://www.google.com/microsoft will restrict your search to Windows sites. They also have these search sites for Linux, BSD, and the Mac.

SysInternals
Mark and Bryce have created the ultimate collection of free Windows utilities: simple executables that give you many of the diagnostic/monitoring tools that a UNIX admin takes for granted. Some of my favorites (and especially useful in working with Splunk), in no particular order:

  • AccessEnum
    Lets you see who has access to what. This is really helpful when trying to figure out why Splunk isn’t indexing one of your files.
  • Process Monitor
    Watch the registry, running processes/threads/DLLs, and file system usage in real time.
  • PS Tools
    A bunch of command-line utilities for listing the processes running, working with the event log, rebooting the machine, etc.
  • Active Directory Explorer
    Advanced viewer/editor for Active Directory. This will be a godsend if you are trying to configure Splunk to authenticate against your domain controller.
  • WhoIS
    Doesn’t do much in the way of troubleshooting Splunk, but who doesn’t want to be able to see if ultramegaextrmeme.com is available and if not

Read More...

April 16th, 2008

overriding default syslog host extraction

I had a customer recently ask how to change the host that was applied to a particular set of incoming events. Normally this wouldn’t be a big deal, just specify the new name in inputs.conf. But this is from syslog. When you set one of the syslog sourcetypes there is some extra processing to extract the correct hostname which overrides other settings. And the hostname in the event is wrong.

So to get the right one, I set up this transform to force it to a specified value while still giving it my correct syslog sourcetype.
My inputs.conf is tailing an entire directory, which for the sake of demonstration I’m going to pretend is all syslog.

$ more inputs.conf
host = support09.splunk.com
[tail:///var/log]
disabled = false
host = support09.splunk.com
sourcetype = syslog

props.conf is specifying a transform only for the source of interest:

$ more props.conf
[source::/var/log/system.log]
# note: overriding default syslog transform!
TRANSFORMS = feorlenhost

and transforms.conf is defining what to do to it. I have to specify a REGEX, but I’m not actually using it so I’ll just say ‘.’ to match everything. The FORMAT line is what is going to set my host:

$ more transforms.conf
[feorlenhost]
DEST_KEY = MetaData:Host
REGEX = .
FORMAT = host::feorlenhost.splunk.com

So whatever syslog put in there for host, ignore and use my static value instead.
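If it helps to see the logic spelled out, here is a toy Python model of what that transform does: when REGEX matches the event, the FORMAT string is written to the DEST_KEY. This is only a sketch of the behavior described above, not how Splunk implements it; apply_transform, the sample event, and the "wronghost" value are all made up.

```python
import re


def apply_transform(raw, metadata, regex, dest_key, fmt):
    """Toy model of a transform: if regex matches raw, write fmt to dest_key."""
    match = re.search(regex, raw)
    if match is None:
        return metadata  # no match, so the transform does not fire

    # Substitute $0, $1, ... with capture groups (the post's FORMAT uses none).
    value = re.sub(r"\$(\d+)", lambda m: match.group(int(m.group(1))) or "", fmt)
    updated = dict(metadata)
    updated[dest_key] = value
    return updated


# Hypothetical syslog event carrying a hostname we do not want.
event = "Apr 16 10:02:11 wronghost sshd[311]: session opened"
meta = {"MetaData:Host": "host::wronghost"}

meta = apply_transform(
    event,
    meta,
    regex=".",  # matches any non-empty event, as in transforms.conf above
    dest_key="MetaData:Host",
    fmt="host::feorlenhost.splunk.com",
)
print(meta["MetaData:Host"])
```

Because the regex matches everything and the format contains no capture references, the result is the same static host for every event from that source.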

Read More...

March 27th, 2008

Splunk for Virtualization

I’m looking for some help.
I’ve built a VMWare app for Splunk and am in the process of doing the same for Xen. These apps use the VMWare and Xensource APIs to index everything about the VM environment. When combined with Splunk instances running within the guest OS, you get a very comprehensive historical picture. I’m curious: are there any Splunk customers out there using VMWare or Xen? I’m looking for use cases so that I better understand how to configure the apps. I’d be curious to know what types of information would be useful to capture and what types of searches one would want to perform. Both Xen and VMWare have so much data available that configuration could be complicated. I’m trying to narrow it down to several useful out-of-the-box configurations. If you have any thoughts, comment here or email me at erik at splunk dot com.

Thanks
e.

Read More...

March 13th, 2008

Digging into metrics.log

Occasionally people ask for help in identifying a rogue data input that is suddenly spewing events. If it’s hidden in a ton of similar data it can be difficult to sort out which one is actually the problem. One place to look is the Splunk internal metrics.log. You can find it by searching the internal index (add “index=_internal” to your search) or just look in the file itself (located in $SPLUNK_HOME/var/log/splunk.)

Before I get into what can be found there, I need to explain what metrics.log is not. It is a sampling over 30-second intervals, so it will not give you an exact accounting of all your inputs. For each type of item reported, you get the top ten hot sources over the interval, based on the size of the event (_raw). It is different from the numbers reported by LicenseManager, which include the indexed fields. Also, the default configuration only keeps the metrics data in the internal index for a few days, but by going to the files you can see trends over a period of months if your rolled files go that far back.

A typical metrics.log has stuff like this:

03-13-2008 10:48:55.620 INFO Metrics - group=pipeline, name=tail, processor=tail, cpu_seconds=0.000000, executes=31, cumulative_hits=73399
03-13-2008 10:48:55.620 INFO Metrics - group=pipeline, name=typing, processor=annotator, cpu_seconds=0.000000, executes=63, cumulative_hits=134912
03-13-2008 10:48:55.620 INFO Metrics - group=pipeline, name=typing, processor=clusterer, cpu_seconds=0.000000, executes=63, cumulative_hits=134912
03-13-2008 10:48:55.620 INFO Metrics - group=pipeline, name=typing, processor=readerin, cpu_seconds=0.000000, executes=63, cumulative_hits=134912
03-13-2008 10:48:55.620 INFO Metrics - group=pipeline, name=typing, processor=sendout, cpu_seconds=0.000000, executes=63, cumulative_hits=134912
03-13-2008 10:48:55.620 INFO Metrics - group=thruput, name=index_thruput, instantaneous_kbps=0.302766, instantaneous_eps=2.129032, average_kbps=0.000000, total_k_processed=19757, load_average=0.124023
03-13-2008 10:48:55.620 INFO Metrics - group=per_host_thruput, series="fthost", kbps=0.019563, eps=0.096774, kb=0.606445
03-13-2008 10:48:55.620 INFO Metrics - group=per_host_thruput, series="grumpy", kbps=0.283203, eps=2.032258, kb=8.779297
03-13-2008 10:48:55.620 INFO Metrics - group=per_index_thruput, series="_internal", kbps=0.275328, eps=1.903226, kb=8.535156
03-13-2008 10:48:55.620 INFO Metrics - group=per_index_thruput, series="_thefishbucket", kbps=0.019563, eps=0.096774, kb=0.606445
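
If you are going to the files rather than searching, lines like the per_host_thruput ones above are easy to total up per series. Here is a small Python sketch, assuming the key=value layout shown in the sample; the helper names are made up, and the three input lines are copied from the sample above.

```python
import re
from collections import defaultdict

# key=value pairs, where the value is either quoted or runs up to the next comma
PAIR = re.compile(r'(\w+)=("[^"]*"|[^,\s]+)')


def parse_metrics_line(line):
    """Return the key=value pairs after 'Metrics - ' as a dict of strings."""
    _, _, payload = line.partition("Metrics - ")
    # Tolerate curly quotes that sometimes sneak in through copy and paste.
    payload = payload.replace("\u201c", '"').replace("\u201d", '"')
    return {key: value.strip('"') for key, value in PAIR.findall(payload)}


def kb_by_series(lines, group="per_host_thruput"):
    """Sum the kb field per series for one metrics group, largest first."""
    totals = defaultdict(float)
    for line in lines:
        fields = parse_metrics_line(line)
        if fields.get("group") == group:
            totals[fields["series"]] += float(fields["kb"])
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


SAMPLE = [
    '03-13-2008 10:48:55.620 INFO Metrics - group=per_host_thruput, series="fthost", kbps=0.019563, eps=0.096774, kb=0.606445',
    '03-13-2008 10:48:55.620 INFO Metrics - group=per_host_thruput, series="grumpy", kbps=0.283203, eps=2.032258, kb=8.779297',
    '03-13-2008 10:48:55.620 INFO Metrics - group=per_index_thruput, series="_internal", kbps=0.275328, eps=1.903226, kb=8.535156',
]

for series, kb in kb_by_series(SAMPLE):
    print(f"{series:12s} {kb:10.3f} kb")
```

Keep in mind that these totals inherit the sampling caveat: they rank the hot series, they do not account for every byte.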

Read More...

February 22nd, 2008

Splunk2LCD : Display your Alerts on an LCD

This morning I got a nice little LCD from Crystalfontz that I can connect to via the open source project LCDproc. After a bit of compiling and installing, LCDproc (which runs natively on Linux, Darwin (OS X) and most other Unix distros) connects to any serial, parallel or USB LCD device. In this case, the Crystalfontz LCD is a 4-line by 20-character display.

Splunk2LCD

Once configured and connected, you start the server and accept connections.

I then grabbed the IO-LCDproc perl module and modified it to display to the LCDproc server. You can get the IO-LCDproc through CPAN.
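
Whatever client you use, the alert text has to fit the display. As a rough illustration (in Python rather than the Perl module mentioned above, and not part of the original script), here is one way to squeeze a message into a 4 line by 20 character frame. The function name and the sample alert are made up.

```python
import textwrap


def fit_to_lcd(message, rows=4, cols=20):
    """Wrap message to the display size; extra lines are dropped, short ones padded."""
    lines = textwrap.wrap(message, width=cols)[:rows]
    lines += [""] * (rows - len(lines))
    return [line.ljust(cols) for line in lines]


# Hypothetical alert text.
frame = fit_to_lcd("ALERT: disk usage above 90% on support09")
for row in frame:
    print("|" + row + "|")
```

Padding every row to the full width means a shorter alert overwrites whatever the previous one left on the screen.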

Read More...

February 4th, 2008

The SSL Performance Odyssey

When you come to dev.splunk.com, you see pictures of beer pong, full bars, stuffed ponies with fart machines taped to their ass, etc. Basically, engineers gone wild. Somewhere between all of this insaneness, we actually find the time to write code and solve problems like this one. This post is all about a crazy-weird performance issue that we were experiencing, how it manifested itself and ultimately how it was fixed.

I suspect others may be having this problem, as the problem lives in some very popular open source code as far as I can tell. With that, I’ll begin telling you about my journey into hell.

Splunk has a home-grown embedded HTTP(S) server that serves up all external interfaces to the ‘splunkd’ daemon. We use it as the core engine for our REST and XML/RPC-like APIs. The GUI and the CLI both end up talking to the daemon via this server.

When I wrote the core of it a few months ago, I ran some rudimentary performance tests on several platforms and it seemed decent enough for our use, but a week ago, the manager of the Search and Indexing team (Stephen) said that he was seeing abysmal performance using SSL. He said that the GUI performance was being impacted. I didn’t believe him and insisted that it was something else and that he was high.

So, to prove to him that it wasn’t my server or my problem, like all engineers do, I gave him a small Python script that hits the server in a tight loop and we checked the performance. It sucked. Continuing with the theme of “this isn’t my problem”, I told him it was probably the handler of the request that was doing something that made the server seem slow. This is when he laughed

Read More...

January 30th, 2008

Your most important IT data: funny quotes

bash.org is a natural dataset for splunking. It’s a huge blob of loosely structured text data, and it’s made of win.

To play with a live instance, go to bash.splunklabs.com, login: guest, password: guest.

Of course, Splunk duplicates the functionality of the site itself. We can find, for example, the top 100 IRC quotes:

Splunk lets us do considerably more, though. What are the top one-liners?

How many more quotes mention “girlfriend” than “boyfriend”, i.e. exactly how bad is this sausage party?

Are there any commonly quoted individuals?

Are there any interesting trends in quote scores over time? Take a look at high quote scores vs. quote ID:

It seems likely that older quotes, especially good ones, benefit from a disproportionately greater number of views (the rich getting richer, so to speak); this might explain why the peaks in the low-quote-ID ranges are higher than the peaks for more recent quotes. Or maybe the internet just doesn’t produce the same quality of LOLs that it once did.

To try this yourself, add the following to props.conf:

[sourcetype::bash]
BREAK_ONLY_BEFORE = (#[0-9]* \+)|([0-9]+-[0-9]+-[0-9]+-[0-9]+-[0-9]+-[0-9]+)
REPORT-bash = bash

and the following to transforms.conf:

[bash]
REGEX = #([0-9]+) \+\((-?[0-9]+)\)- \[X\]
FORMAT = $0 bash_quote_id::$1 bash_quote_score::$2
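
If you want to sanity-check that REGEX before indexing anything, you can try it outside Splunk. Here is a quick Python sketch using the same pattern; the sample header lines are made up in the shape the pattern expects, and the function name is just for illustration.

```python
import re

# Same pattern as the REGEX line in transforms.conf above.
QUOTE_HEADER = re.compile(r"#([0-9]+) \+\((-?[0-9]+)\)- \[X\]")


def extract_quote_fields(line):
    """Return the quote id and score from a quote header line, or None."""
    match = QUOTE_HEADER.search(line)
    if match is None:
        return None
    return {
        "bash_quote_id": int(match.group(1)),
        "bash_quote_score": int(match.group(2)),
    }


# Hypothetical header lines.
for line in ["#1234 +(567)- [X]", "#99 +(-12)- [X]", "<someone> not a header"]:
    print(line, "->", extract_quote_fields(line))
```

The optional minus sign in the second group is what lets negatively scored quotes through.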

Then, get a static copy of bash.org. You can grab the one I’ve created here, or you can generate it yourself:

$ curl -o '#1.html' 'http://bash.org/?browse&p=[001-409]'
$ for cur in * ; do lynx -dump -nonumbers ./$cur >> /tmp/bash.txt ; done

Finally, push the data into Splunk:

$ splunk add tail -source /tmp/bash.txt -sourcetype bash

Read More...

