
brian

July 5th, 2006

Auto host resolving in splunk using python

This only works in 2.0.x
Ok so I’ve had a couple of people ask me how to resolve the ip addresses in their syslog files to their hostnames in splunk.
There’s no way to do this just by tweaking a config variable; we need to dig a little deeper under the surface. It’s actually pretty easy to get splunk to call out to python during event processing, so I’ve used that functionality to solve this problem.

Note that this will negatively impact indexing performance but it should work until we get this behavior baked into splunk.

First up I’ve created a python script that calls socket.gethostbyaddr to resolve the hosts. It also caches the results, including failed lookups, so each address only costs one dns call.
So copy and paste the following into your favorite editor and save it to <SPLUNK_HOME>/lib/python2.4/site-packages/splunk/pyHostNameResolve.py . This is the directory where the dynamically loaded python looks for scripts; the filename will be referenced later in a config change.


#Copyright (C) 2006 Splunk Inc. All Rights Reserved. This work contains trade
#secrets and confidential material of Splunk Inc., and its use or disclosure in
#whole or in part without the express written permission of Splunk Inc. is prohibited.

from pipeline_data import PipelineDataWrapper #This is a virtual module/class that gets inserted into the python namespace at runtime by splunk
import traceback
import socket

#Set global variables
HOST_KEY = "MetaData:Host"

HOST_RESOLVE_MAP = {} #cache so we don't have to call gethostbyaddr ( expensive ) every event

def resolveHost( pdata, confDictString ):
    global HOST_RESOLVE_MAP
    try:

        host = pdata.get(HOST_KEY)

        resolvedHostName = None

        if host.startswith("host::") :
            host = host[6:]

        if host in HOST_RESOLVE_MAP:
            resolvedHostName = HOST_RESOLVE_MAP[ host ]

        if not resolvedHostName:
            try:
                resolved = socket.gethostbyaddr(host)
                resolvedHostName = resolved[0]
                HOST_RESOLVE_MAP[ host ] = resolvedHostName
            except:
                HOST_RESOLVE_MAP[ host ] = host
                print "Could not resolve
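
The caching idea on its own, outside of splunk, can be sketched roughly like this. It's a sketch under assumptions rather than the script above: it is written for modern Python 3 (not the python 2.4 that ships with splunk), it leaves out the pipeline_data hook entirely, and the names resolve_host, lookup, cache and _HOST_CACHE are made up for illustration.

```python
import socket

# Hypothetical module-level cache: address -> resolved name
# (or the address itself when the lookup failed)
_HOST_CACHE = {}

def resolve_host(host, lookup=socket.gethostbyaddr, cache=None):
    """Return the hostname for an address, caching hits and misses alike."""
    if cache is None:
        cache = _HOST_CACHE
    # Strip the "host::" prefix, as the script above does
    if host.startswith("host::"):
        host = host[len("host::"):]
    if host in cache:
        return cache[host]
    try:
        # gethostbyaddr returns (hostname, aliaslist, ipaddrlist)
        name = lookup(host)[0]
    except OSError:
        # socket.herror and socket.gaierror are OSError subclasses in Python 3;
        # fall back to the raw address so the failed lookup is not retried
        name = host
    cache[host] = name
    return name
```

The lookup parameter is only there so a fake resolver can be passed in, which makes it easy to check that a second call for the same address never goes back to dns.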

Read More...

April 27th, 2006

Splunk Cheat Sheet !

I’ve been pretty busy so I haven’t updated for a while, but I thought I should share this:
Corey Shields has made a great splunk cheat sheet! It’s available at: http://staff.osuosl.org/~cshields/?p=140
It’s pretty awesome, and I’m recommending that everyone I know who uses splunk download it.
Until next time,
Brian


March 14th, 2006

Splunking from Python Part I

One of the neat things about splunk is that its search interface is a SOAP call. In this post I’m going to talk about using the python modules that ship with splunk to talk to splunk over this SOAP interface.
First off you will need to set some environment variables so that you are running the version of python that ships with splunk :


export SPLUNK_HOME=<WHERE_YOU_INSTALLED_SPLUNK>
export PATH=$SPLUNK_HOME/bin:$PATH
export LD_LIBRARY_PATH=$SPLUNK_HOME/lib:$LD_LIBRARY_PATH

Ok so now you should be good to go, so fire up python. Your python version should be 2.4.2. If it’s not, run “which python” from the command prompt to make sure you are using the python that shipped with splunk.
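
If you'd rather check from inside the interpreter, something like the following works as a rough sketch. The helper name is_splunk_python is made up for this example, and it only compares directories: it assumes the splunk python binary sits directly in $SPLUNK_HOME/bin, which is what the PATH setting above implies.

```python
import os
import sys

def is_splunk_python(executable, splunk_home):
    """True if the interpreter at `executable` sits directly in <splunk_home>/bin."""
    bin_dir = os.path.join(os.path.abspath(splunk_home), "bin")
    return os.path.dirname(os.path.abspath(executable)) == bin_dir

if __name__ == "__main__":
    home = os.environ.get("SPLUNK_HOME")
    # Print the interpreter version, e.g. to compare against the expected 2.4.2
    print(sys.version.split()[0])
    if home:
        print(is_splunk_python(sys.executable, home))
```

This does not follow symlinks, so a python that is linked into another directory would need a closer look.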
We need to do some setup before any searches can be run :


Python 2.4.2 (#1, Mar 11 2009, 21:45:07)
[GCC 4.0.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.


>>> import splunk.search.splunkTest #initialize the python internals without using twistd
>>> import splunk.search.SearchCore as SearchCore #This is the module we are going to use to issue searches

If you want to run against a remote splunk server or on different ports you can run the following :


>>> SearchCore.SearchService.gSearchService._searchEngineURL = "http://<remote_host>:<searchengine_port>"

The method on the SearchCore module that executes the queries is called runQuery and it takes two arguments.


def runQuery(queryString, userStr )

The userStr can be any string for now; in future releases it will probably be an auth token. It is the user that your searches will appear under in the searchhistory domain.
The queryString is where the magic happens :)
Basically a query string contains three major elements.

QUERY : Terms following this are as you would see in the splunk web ui search box. This pulls the resulting ids into an id space internally in the query.
GET : Terms following this instruct splunk on what to extract from the ids in the id space into the result

Read More...

March 10th, 2006

Slow queries and solutions.

Since the launch of the 1.2 product some people are experiencing really slow query times. This is especially noticeable when you are running a live splunk pretty often, as this tends to fragment the database quite a bit.

Fear not as there is a hidden undocumented call that you can make ! If you run the query “++cmd++::optimize” you will cause a database optimization. This call may take a while to return so use with care. Soon we will have a release with an auto-optimizer but if it’s hampering your splunking right now you can create a live splunk to run every 10-30 mins that runs “++cmd++::optimize”.
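
If you'd rather drive this from a script instead of a live splunk, a periodic loop along these lines is one option. This is a sketch under assumptions, written for modern Python 3: run_query is a stand-in for whatever callable you use to issue a query (it is not a splunk API), and run_optimize_loop, its parameters and the 20 minute default are invented for illustration.

```python
import time

OPTIMIZE_QUERY = "++cmd++::optimize"

def run_optimize_loop(run_query, interval_seconds=20 * 60, iterations=3, sleep=time.sleep):
    """Issue the optimize query `iterations` times, pausing between runs."""
    results = []
    for i in range(iterations):
        # The optimize call may take a while to return; running it inline
        # means two optimizations never overlap
        results.append(run_query(OPTIMIZE_QUERY))
        if i < iterations - 1:
            sleep(interval_seconds)
    return results
```

The sleep parameter is injectable only so the loop can be exercised without actually waiting.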

Laters,

Brian


March 7th, 2006

First Post

First Post !

So this is the start of my splunk blog.

First up I’m splunk employee #1. Way back in Sept. 2004 I joined Erik, Rob and Michael when they were still based down in the VC offices in Palo Alto. I’m responsible for searches and indexing so if you have splunks that are taking WAAAY too long to complete I’m the person that’s probably responsible.

I’ll post more later on what I’m coding, struggling against or just hacking on.

Brian out.


