
Splunk Dev: Archive for the 'splunk' Tab

March 27th, 2008

Splunk for Virtualization

I’m looking for some help.
I’ve built a VMware app for Splunk and am in the process of doing the same for Xen. These apps use the VMware and XenSource APIs to index everything about the VM environment. When combined with Splunk instances running within the guest OS, you get a very comprehensive historical picture. I’m curious: are there any Splunk customers out there using VMware or Xen? I’m looking for use cases so that I better understand how to configure the apps. I’d like to know what types of information would be useful to capture and what types of searches one would want to perform. Both Xen and VMware have so much data available that configuration could be complicated. I’m trying to narrow it down to several useful out-of-the-box configurations. If you have any thoughts, comment here or email me at erik at splunk dot com.

Thanks
e.

Read More...

March 6th, 2008

Splunk Replay: Search results in motion

Inspired by glTail.rb and Digg Lab’s Stack, Splunk Replay is an animated data visualization that “replays” search results as a simulated event stream. The simulation displays events at a rate proportional to the times at which the events originally occurred.
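
To make the timing idea concrete, here is a minimal sketch in Python. It is not Splunk Replay's actual code; the function name `replay_schedule` and the `speedup` parameter are assumptions made for illustration. It maps each event's original timestamp to a playback offset so that the gaps between events keep their original proportions.

```python
from typing import Iterable, List, Tuple


def replay_schedule(timestamps: Iterable[float],
                    speedup: float = 60.0) -> List[Tuple[float, float]]:
    """Return (original_timestamp, playback_offset_seconds) pairs.

    Offsets are measured from the earliest event and divided by `speedup`,
    so relative spacing between events is preserved.
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    ordered = sorted(timestamps)
    if not ordered:
        return []
    start = ordered[0]
    return [(ts, (ts - start) / speedup) for ts in ordered]


# Hypothetical events at epoch seconds 1000, 1060 and 1180,
# replayed 60 times faster than they originally occurred.
events = [1000.0, 1060.0, 1180.0]
schedule = replay_schedule(events, speedup=60.0)
offsets = [offset for _, offset in schedule]
```

A renderer would then emit each event once the playback clock passes its offset, so a two-minute gap in the original data becomes a two-second gap on screen.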

Each event is represented by a single square particle that flows from its place in a legend of values to its corresponding position in a stacked column chart. Upon landing in the column chart, one of the event’s fields is output in a readable format below the chart. Both the legend of values and the stacked column chart retain the order of their values according to a configurable comparator and truncate older values to make space for new ones. Rolling your mouse over any column displays the field values for that column.

Read More...

January 31st, 2008

Standing on Our Own Platform

Splunk is on track to become a billion-dollar company and you, the intrepid sysadmin/developer, are going to help us get there. Now, this is not a statement that I’m making as an analyst who “covers” the enterprise software market, and compiles a list of “top software companies to watch”. I’m writing this as Splunk’s Platform Architect, a techie whose goals are to ensure that what comes out of our development group is compelling and exciting to those that are actually working with the product.

It is this developer-centric ethos that sets us apart from so many of the other enterprise software firms and has already paid dividends on community goodwill. Instead of making prospective buyers jump through registration hoops just to view a guided webcast tour, Splunk provides fully functional software downloads to try out on your own data, inside your own network, free from webinar smoke and mirrors.

We don’t just want you to try out the software, we want you to try doing things that aren’t covered in our brochureware, things that sound ludicrous at first but are doable. In fact, in a perverse way, we hope that you do break our product because it reveals new limitations for us to solve, ultimately leading to a product that lets you do your job the way you want, yet easier and faster.

This is where the Splunk Platform comes into play. We want to increase the ubiquity of Splunk by, 1) exposing major components of Splunk as individual services, and, 2) allowing external developers to build on top of Splunk and leverage our award-winning IT search infrastructure. Starting with version 3.2 (you can download the preview version today), there is a new REST API that provides unprecedented access and consistency to every aspect of the Splunk Server. We are

Read More...

November 18th, 2007

Making reports faster by caching scheduled searches

I find this hard to explain even though it’s an extremely simple concept. It would be nice to get some feedback, since I think we want to productize the idea but we are not clear on what makes sense.

If I have a search/report that I want to run faster, I will save that search and have Splunk run it over a small timeframe (5, 15, 30, or 60 minutes), taking the results of that search/report and feeding them back into an index I create to hold cached results.

For example, suppose I like to run nightly reports where I show “top users by bandwidth”. It’s easy enough to run the report every night, but suppose there are times during the day when I want incrementals, or I want to look at last week, or perhaps get dailies over a month. Every time I run the search/report I need to search and recalculate “top users by bandwidth”, which over billions of events can take time ;-)

Instead, I’ll just save the search/report and have Splunk run it every 15 minutes with the results being sent to a “cache” index. This way, if I ever want to do an ad hoc search on “top users” or if I want to do “weekly reports by day”, all the data is precalculated.

Think of this as creating “logs” that are the output of a search/report and then having Splunk index those “logs”. To get fast results you can then search/report on the summarized cached data.
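
The idea can be sketched outside of Splunk in a few lines of Python. Everything here is an assumption for illustration: the event shape `(timestamp, user, bytes)`, the function names, and the in-memory list standing in for the “cache” index. The point is only that reports read the small summary rows instead of the raw events.

```python
from collections import Counter, defaultdict

BUCKET_SECONDS = 15 * 60  # the scheduled search runs every 15 minutes


def summarize(raw_events):
    """Collapse raw (timestamp, user, bytes) events into one row per bucket and user."""
    totals = defaultdict(int)
    for ts, user, nbytes in raw_events:
        bucket = ts - (ts % BUCKET_SECONDS)  # start of the 15-minute window
        totals[(bucket, user)] += nbytes
    return [(bucket, user, total)
            for (bucket, user), total in sorted(totals.items())]


def top_users(summary_rows, start, end, n=3):
    """Answer "top users by bandwidth" from the summary rows alone."""
    usage = Counter()
    for bucket, user, total in summary_rows:
        if start <= bucket < end:
            usage[user] += total
    return usage.most_common(n)


# Hypothetical raw events: five events spread over two 15-minute windows.
raw = [
    (0, "alice", 500),
    (60, "bob", 200),
    (120, "alice", 300),
    (900, "bob", 1000),
    (960, "alice", 100),
]
cache = summarize(raw)              # what would be written to the cache index
report = top_users(cache, 0, 1800)  # any later report reads only the cache
```

Here five raw events shrink to four summary rows; with real traffic the reduction is far larger, because each window holds at most one row per user no matter how many events that user generated.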

If it’s not obvious why it’s faster, suppose you are indexing 500M events a day and 100M of those have bandwidth data. To report on “top bandwidth by users” I need to run a search to get the 100M events, then run the report across all 100M.
If instead I were in the

Read More...

October 12th, 2007

Being the girl in dev at Splunk

Like a lot of tech companies, Splunk’s development organization isn’t a model of perfect gender balance. For a year and a half now, I’ve been the only woman in the dev organization.

Surprisingly, this is not an uncomfortable place to be. In 11 years in industry I’ve worked in a variety of organizations: the now-bankrupt dot-com best known for putting an ad with a naked guy up during the Super Bowl, 2 major marquee names with vastly differing corporate cultures, a security start-up stocked with emancipated-minor hackers. Aside from that doomed dot-com — which had a surprisingly strong gender balance throughout technical roles and a culture blessedly free of gender-based intimidation at all levels — Splunk may be the most comfortable place I’ve ever worked. There’s no creepy tokenism (unlike stories I’ve heard about certain other bay area employers), That Guy Who’s Never Seen A Girl Before doesn’t work here…and as far as I can tell, no one really gets harassed except Amrit.

Perhaps a better testament for the dev culture than my opinion — because, frankly, I’m pretty weird to start with — is that other women in the company seem to be pretty comfortable visiting the dev area, either on work errands or just to take a break from the sales-focused environment upstairs. Frankly I can’t imagine that happens too often in the bay area…and more’s the pity.

Read More...

October 11th, 2007

Diagraming Splunk’s data-flow (part 2 - performance overlays)

In my previous post “Diagraming Splunk’s data-flow” I wrote a small Python script that parsed Splunk’s runtime environment ($SPLUNK_HOME/var/run/splunk/composite.xml) and generated a file which, when input into graphviz, would generate a nice architectural diagram of how pipelines and processors are wired together.

In this installment, I took it to the next level by using Splunk’s search capability to overlay performance metrics on the diagram. The combination of Splunk logging metrics information for each processor within each pipeline (thanks Brad) and the ability to have Splunk execute a search processor written in Python made this possible. Here is how you use it:

First download graphviz. I particularly like the OSX application that they’ve written because you can see the graph on the screen and as the file changes, those changes are reflected in the graph you are viewing. If you don’t have a Mac, use the command line version to generate different types of output file formats like .jpeg, etc.

Go to SplunkBase to download my python script. Copy the .py file into $SPLUNK_HOME/etc/searchscripts

Start Splunk.

Type the following into the search box:
[Screenshot: index___internal metrics pipeline processor NOT get - over all time - localhost - Splunk 3.2-UNSTABLE-4.jpg]
This will search for the appropriate metrics information and pipe the results through the script.

There are 2 options to perfgraph:

perfgraph [output filename] [cpu, execs, cumhits]

Unfortunately (because I’m lazy) you can’t specify cpu, execs or cumhits without also specifying an output file. The first parameter is the full path and file name of the ‘dot’ file you wish to create. It defaults to /tmp/out.dot.

The second parameter, if specified, tells the script to highlight in red the slowest processor (cpu), the processor with the most hits (execs) or the processor with the most cumulative hits (cumhits). This parameter defaults to ‘none’, or no highlighting.
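
As a rough illustration of how two positional options like these might be handled, here is a small Python sketch. It is not the perfgraph script itself; the function names and the sample metrics dictionary are hypothetical, and only the defaults (/tmp/out.dot and ‘none’) and the three mode names come from the description above.

```python
DEFAULT_OUTPUT = "/tmp/out.dot"
MODES = ("none", "cpu", "execs", "cumhits")


def parse_options(args):
    """Positional options: [output filename] [cpu | execs | cumhits]."""
    output = args[0] if len(args) > 0 else DEFAULT_OUTPUT
    mode = args[1] if len(args) > 1 else "none"
    if mode not in MODES:
        raise ValueError("unknown highlight mode: %r" % mode)
    return output, mode


def pick_highlight(metrics, mode):
    """Return the processor with the largest value for `mode`, or None."""
    if mode == "none" or not metrics:
        return None
    return max(metrics, key=lambda name: metrics[name][mode])


# Hypothetical per-processor metrics, shaped for this example only.
sample = {
    "proc_a": {"cpu": 0.2, "execs": 900, "cumhits": 12000},
    "proc_b": {"cpu": 1.7, "execs": 400, "cumhits": 30000},
    "proc_c": {"cpu": 0.5, "execs": 950, "cumhits": 8000},
}
output, mode = parse_options(["/tmp/perf.dot", "cpu"])
slowest = pick_highlight(sample, mode)
```

Because the options are positional, the mode can only be given after an output file name, which matches the limitation noted above.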

The above search string results in the following graph (portion). Notice the performance information overlaid on the processors:

Read More...

October 10th, 2007

Diagraming Splunk’s data-flow

This blog entry is not about how the framework works. It is about a semi-cool visualization that I created using Python and graphviz. If you watched the video where I presented Splunk’s framework architecture from a high level, you know what pipelines and processors are. If you haven’t, here is a very quick overview.

  • A pipeline is a thread of execution that lives within the splunkd process. Each pipeline executes a series of processors, each of which operates on data. The data is created when the first processor on the pipeline reads it from some input (like tailing a file, or receiving it on a network port). Each processor then does something to the data. Eventually, the data gets indexed and execution returns to the first processor to get more data.
  • Pipelines are connected via queues. A queue output processor (the last processor in a pipeline) puts data onto a queue and blocks if the queue is full. A queue input processor (the first processor at the top of a pipeline) gets the data item from the bottom of the queue and sends it on down the pipeline. If there is no data, it blocks, waiting for some to be put on the queue.
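
The pipeline-and-queue model above can be imitated with Python threads and a bounded queue. This is a toy sketch, not how splunkd is implemented; the function names, the sentinel, and the stand-in processors (`str.strip`, `str.upper`) are all assumptions for illustration.

```python
import queue
import threading

SENTINEL = object()  # marks the end of the stream in this toy example


def input_pipeline(source, processors, out_queue):
    """Read items, run each processor in order, then hand off to the queue."""
    for item in source:
        for proc in processors:
            item = proc(item)
        out_queue.put(item)  # queue output: blocks while the queue is full
    out_queue.put(SENTINEL)


def index_pipeline(in_queue, processors, indexed):
    """Pull items off the queue, run each processor, then 'index' the result."""
    while True:
        item = in_queue.get()  # queue input: blocks while the queue is empty
        if item is SENTINEL:
            break
        for proc in processors:
            item = proc(item)
        indexed.append(item)


q = queue.Queue(maxsize=2)  # a small bound makes the blocking behavior matter
indexed = []
producer = threading.Thread(
    target=input_pipeline, args=(["a ", " b", "c"], [str.strip], q))
consumer = threading.Thread(
    target=index_pipeline, args=(q, [str.upper], indexed))
producer.start()
consumer.start()
producer.join()
consumer.join()
```

Each thread plays the role of one pipeline, and the bounded queue provides the back-pressure: a fast producer stalls on `put` until the consumer catches up.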

Enough already. Go watch the video. So, I decided that I’m tired of drawing these diagrams and wrote some code to produce them for me.

I implemented some Python code that took the composite.xml file, parsed it and produced a .dot file. Composite.xml, for those of you who don’t know, is an amalgamation of all pipelines and processors in the system. It represents the current (or last) runtime environment for Splunk. It lives in $SPLUNK_HOME/var/run/splunk.
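
For readers who have not used graphviz, the following Python sketch shows the general shape of the output step. It does not parse composite.xml (its schema is not described here); instead it starts from a plain dictionary, and the pipeline names, processor names, and the `to_dot` function are hypothetical.

```python
def to_dot(pipelines, queue_links):
    """Build DOT text with one cluster per pipeline and dashed edges for queues.

    pipelines:   {pipeline_name: [processor names, in execution order]}
    queue_links: [(source_node_id, destination_node_id), ...]
    """
    lines = ["digraph splunk {", "  rankdir=LR;"]
    for name, processors in pipelines.items():
        lines.append('  subgraph "cluster_%s" {' % name)
        lines.append('    label="%s";' % name)
        for proc in processors:
            lines.append('    "%s.%s" [label="%s"];' % (name, proc, proc))
        # Chain processors in the order the pipeline executes them.
        for first, second in zip(processors, processors[1:]):
            lines.append('    "%s.%s" -> "%s.%s";' % (name, first, name, second))
        lines.append("  }")
    for src, dst in queue_links:
        lines.append('  "%s" -> "%s" [style=dashed];' % (src, dst))
    lines.append("}")
    return "\n".join(lines)


# Hypothetical two-pipeline layout joined by one queue.
pipelines = {
    "pipelineA": ["input", "transform", "queueout"],
    "pipelineB": ["queuein", "indexer"],
}
dot_text = to_dot(pipelines, [("pipelineA.queueout", "pipelineB.queuein")])
```

Writing `dot_text` to a file such as out.dot and running `dot -Tpng out.dot -o out.png` renders the diagram with the graphviz command line tools.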

I then took the resultant .dot file and ran it through graphviz. After lots of tweaking, here is what I came up

Read More...

September 17th, 2007

The Feature Magpie Phenomenon

Having lived through the software-as-building-architecture argument every few years, I am accustomed to thinking of (refuting) how software and software development are (not) like the traditional field of building design and development. The analogy-driven mind needs something to reference, and I guess we noobs in software are desperate to find something historical to feel validated.
Every discipline needs a role model and building design seems to be our adopted hero.

This post proposes an analogy that is far less intellectual than a typical comparison between Christopher Alexander and the design of an EventLoop Abstract Factory class.
My analogy here is based more on THIS weekend with MY wife.

Our house is full of stuff.
This stuff (chairs, tables, artwork, rugs, ...) we have acquired for good reason - and we could use it. The problem is that despite all good intentions these hand-me-downs, gifts, and rash purchases sometimes just don’t work.

Yes we need a coffee table in the front room - but not THAT one.
Maybe the design is wrong.
Maybe the size is wrong.
Maybe the idea of a coffee table that is also a fireplace seemed like a good idea at the time but WTF.
[Image: coffeetable]

Sometimes you just need to throw stuff away.
In fact, it’s best if you institutionalize the process so that you actually do it.
For us it’s a quarterly Sunday where we drink lots of coffee, send the kids out for the day, get worked up, and throw (give) shit away.

So my analogy here is not about refactoring or deleting obsolete/deprecated code - it’s about the stuff on the surface - Features.

At least at Splunk, features are a bit like the coffee table we bought on sale because we had guests coming and the fact that it was also a fireplace was a double win. Useful features in theory, and perhaps even for competitive reasons we *need*

Read More...

September 14th, 2007

Reliable syslog/tcp input - splunk bundle style

Wanted to drop this someplace for feedback.
Splunk is often hooked up to syslog(ng) or tcp ports.
Customers then shoot data as fast as they can at splunk.

You can have Splunk buffer inputs or have the sender buffer, but in many cases this is less than optimal - it’s usually not a good idea to rely on sender-side buffering.

As an interesting alternative you can use a Splunk bundle to catch data off the network port and spew it to one or more files, and have Splunk tail those files at its leisure. If Splunk can keep up, it will be seconds before you can search the data. If you get a huge burst, no problem: the bundle will just go to disk and Splunk will be right behind. Furthermore, if someone wanted to restart Splunk (or Splunk were to crash - yes, it happens), then again, the data just goes to disk.
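
A minimal Python sketch of the catch-to-disk idea follows. It is not the bundle's code: the `Spool` class, the file naming, the size and count limits, and the single-connection `socketserver` listener are all assumptions chosen to keep the example short.

```python
import os
import socketserver
import tempfile


class Spool:
    """Append data to numbered files, capping both file size and file count."""

    def __init__(self, directory, max_bytes, max_files):
        self.directory = directory
        self.max_bytes = max_bytes
        self.max_files = max_files
        self.index = 0
        self.paths = []
        self.handle = None
        self.size = 0
        os.makedirs(directory, exist_ok=True)
        self._roll()

    def _roll(self):
        """Start a new file and delete the oldest ones beyond max_files."""
        if self.handle:
            self.handle.close()
        path = os.path.join(self.directory, "spool.%06d.log" % self.index)
        self.index += 1
        self.handle = open(path, "ab")
        self.size = 0
        self.paths.append(path)
        while len(self.paths) > self.max_files:
            os.remove(self.paths.pop(0))

    def write(self, data):
        if self.size > 0 and self.size + len(data) > self.max_bytes:
            self._roll()
        self.handle.write(data)
        self.handle.flush()  # keep the file current for whatever tails it
        self.size += len(data)

    def close(self):
        self.handle.close()


def serve(port, spool):
    """Listen on a TCP port and spool every received line (one client at a time)."""
    class Handler(socketserver.StreamRequestHandler):
        def handle(self):
            for line in self.rfile:
                spool.write(line)

    with socketserver.TCPServer(("0.0.0.0", port), Handler) as server:
        server.serve_forever()


# Small demonstration of rotation only; no network traffic is involved.
demo_dir = tempfile.mkdtemp()
demo = Spool(demo_dir, max_bytes=10, max_files=2)
for _ in range(4):
    demo.write(b"event\n")
demo.close()
remaining = sorted(os.listdir(demo_dir))
```

Calling `serve(9999, Spool("/some/spool/dir", max_bytes, max_files))` would start the listener; the directory would then be configured as a tailed input so the indexer can fall behind during a burst without losing data, up to the disk budget implied by the two limits.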

The advantage in making/using these scripted input bundles (same mechanism as monitoring and imap) is that the code is usually in scripts (perl, sh, python, ruby), so on-the-fly mods in the field are easier than filing a support enhancement request with Splunk and waiting for someone to compile it into the product. I often give the (perhaps poor) analogy that our scripted input bundles are to Splunk what cgi was to early webservers. They are a great place to do anything, and when we see enough of them in the field we can better build it into the server.

This bundle will use scripted input to listen on a port and cut files of up to a given size. The bundle is configured to keep up to a certain number of these files before deleting the oldest. For example, you can configure it to listen on port 9999 and make up to 5 files 800M

Read More...

September 9th, 2007

Beer Pong @ Splunk

Come Friday at 5PM - the table came out and it was time for Beer Pong.
Myself, I had not heard of Beer Pong until Nick Mealy (in picture below on right) explained.
He has an annual pilgrimage for a week to play and pointed out that there are actual leagues.

Splunk is all about the proper Beer Pong - with paddles - not Beirut style.
I’m not really up on the details, but it goes something like this: you place cups of beer (see our double-tap keg in the background of the picture, on the far left) on the table and your opponents try to hit the ball into the cup, forcing you to drink. I think these are roughly the rules as played at Splunk.

Unfortunately, I only had an iPhone to take the picture. Next time I’ll get a movie.

** Friday Beer Pong @ Splunk **

None of this could happen without our Beer Man from Mikes Liquors. Must be SF’s best - we call in with an order and hours later our Man (see below) shows up in his Beer Guy jacket to rack the booze.

** Mikes Liquors Beer Guy - the best guy on the planet **

If you’re not the beer pong type and prefer to just sit back and slowly unwind after a solid week of innovation and excitement - thankfully our Beer Guy brings more than beer.

** This is part of our special cabinet **

Hold up before you all go and write your congressman that we are a bunch of boozers. I can safely claim that we actually drink LESS with the booze in the office than not. I don’t know why but it’s true.

If you’re up for a round of Beer Pong or perhaps just one of Rob’s martinis, drop us a line - we’d be happy to oblige.

e.

Read More...

