
david

David Carasso, Chief Mind and Manager of Knowledge Engineering. Responsible for extracting and managing the knowledge needed to make Splunk intelligent. Responsible for innovating and prototyping a class of hard problems at the core of Splunk, including file classification, event aggregation, dynamic tagging of events, and the new 3.0 search language, as well as automatic field extraction and automatic vocabulary building.

January 21st, 2008

O’Rly?

Below are a few easter egg features found inside Splunk.

  • From the commandline: “splunk ftw” produces an ascii-art “O’Rly?“.
  • From the commandline: the “outputrawr” command produces ascii-art fireworks.
  • From the searchbox: piping results to the “marklar” processor (e.g., “*|marklar”) converts all search results into the Marklarian language.
  • From the searchbox: piping results to the “loglady” processor (e.g., “*|loglady”) converts all search results into quotes from Twin Peaks’ Log Lady.

Enjoy them while they last, before they are removed by the Silliness Police, who%$($%%$
^H^H^H^NO CARRIER

Read More...

January 10th, 2008

Bomberman

The world’s most fun video game, keeping us sane — 1993’s Bomberman for NES, played on the Wii.
“Look out, rotsky, you’ve got fast aids!”

Read More...

December 5th, 2007

Simple Transactions

In the Preview release, we have search-time discovery of simple transactions with the new transam search command, named in honor of one of our developers who hasn’t quite moved on from high school. Transam collapses a set of events that belong to a transaction into a single event. You can specify the parameters as arguments to the transam operator right in the search, or you can refer to a named-transaction definition in transactions.conf. A few simple examples will give you an idea of some things you can do.

  • get events with ‘http’, and group any search results into “bursts” of events, grouping any events that occur within two seconds of each other into the same transaction event. [Note: there is an implied “search” command at the head of all searches, so “http” is really “search http”.]
        http | transam maxpause=2s
  • get events with ‘http’, and collapse those that share the same host and cookie value, that occur within 30 seconds:
        http | transam fields=host,cookie maxspan=30s maxpause=30s
  • get events with ‘sendmail’, and collapse those that have the same userid, between a login and a logout, that occur within 10 minutes:
        sendmail | transam fields=uid startswith="eventtype=login" endswith="eventtype=logout" maxspan=10m maxpause=10m
  • get events with ‘http’, and then find transactions as defined by email_transaction found in transactions.conf:
        http | transam email_transaction
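For a feel of what maxpause, maxspan, and fields are doing in the examples above, here is a rough Python sketch. It is not Splunk's implementation: the Event class and the group_transactions function are made up for illustration, times are plain seconds, and startswith/endswith are left out entirely.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    time: float                      # seconds; hypothetical stand-in for an event timestamp
    text: str
    fields: dict = field(default_factory=dict)


def group_transactions(events, key_fields=(), maxpause=None, maxspan=None):
    """Collapse events into transactions (lists of events).

    Events sharing the same values for key_fields land in the same
    transaction until the gap to the previous event exceeds maxpause or
    the total length of the transaction would exceed maxspan.
    """
    open_txns = {}   # key -> transaction currently being built
    closed = []
    for ev in sorted(events, key=lambda e: e.time):
        key = tuple(ev.fields.get(name) for name in key_fields)
        txn = open_txns.get(key)
        if txn is not None:
            pause_exceeded = maxpause is not None and ev.time - txn[-1].time > maxpause
            span_exceeded = maxspan is not None and ev.time - txn[0].time > maxspan
            if pause_exceeded or span_exceeded:
                closed.append(txn)
                txn = None
        if txn is None:
            txn = []
            open_txns[key] = txn
        txn.append(ev)
    closed.extend(open_txns.values())
    return sorted(closed, key=lambda t: t[0].time)
```

Calling group_transactions(events, maxpause=2) plays the role of the first "bursts" example, and passing key_fields=("host", "cookie") with maxspan=30 and maxpause=30 roughly mirrors the second.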

Read More...

October 27th, 2007

Tutorial: Event Types in 3.2

Hi, I’m David Carasso. Perhaps you’ve seen my famous File Classifier Video. It’s the number one video at CurrentTV.

Below is a second screen capture video that I just made to describe Splunk’s new Event Typer. The Event Typer dynamically tags system events in custom yet universal ways. For example, I can say that any event that happens on Sunday, that has “status=Fatal”, and that has “sourcetype=weblogic” should be dynamically tagged as a “weekend_fatal_weblogic” event. Topics covered include: what is an event type; how to search, view, and count event types; creating an event type; creating an event-type template; and discovering event types.
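To make the Sunday example concrete, here is a small Python sketch of the idea of tagging an event when it meets a set of conditions. It says nothing about how the Event Typer is really built: the EVENT_TYPES table, the tag_event function, and the event dictionary layout are all assumptions for illustration.

```python
from datetime import datetime

# Hypothetical event-type table: name -> predicate over an event dict.
EVENT_TYPES = {
    "weekend_fatal_weblogic": lambda ev: (
        ev["time"].weekday() == 6            # datetime counts Monday as 0, so Sunday is 6
        and ev.get("status") == "Fatal"
        and ev.get("sourcetype") == "weblogic"
    ),
}


def tag_event(event, event_types=EVENT_TYPES):
    """Return the names of every event type whose conditions the event meets."""
    return [name for name, matches in event_types.items() if matches(event)]


# October 28th, 2007 was a Sunday.
sunday_event = {
    "time": datetime(2007, 10, 28, 3, 15),
    "status": "Fatal",
    "sourcetype": "weblogic",
}
tags = tag_event(sunday_event)
```

Because the tags are computed from the event at lookup time, adding a new entry to the table tags old and new events alike.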

Yes, production value is what you’ve come to expect from a Carasso Production. That’s right: 15 minutes of unscripted nerd talk. Now with a bonus 45 seconds of video as I type in an off-camera window. But I promise you’ll learn a few useful things you didn’t know.
EventTyperVideo (15 minutes of emacs magic)

Read More...

October 26th, 2007

Tutorial: File Classifier

Hi, I’m David Carasso, and below is a screen capture video I just made to describe Splunk’s File Classifier. The File Classifier takes a file and tells you what type it is. From that sourcetype we determine what to do with the file and how to process it. It’s pretty critical for properly handling a file, including time-stamping events and aggregating multiple lines into single events. There are several methods that the File Classifier uses to classify a file, and we’ll cover each one with real-world examples.

Yes, production value is at a new low here as I cover 18 minutes unscripted, but I promise you’ll learn a few useful things you didn’t know. There’s a free Splunk t-shirt for the commenter who guesses the actual number of times I say “uhhhhh”.

File ClassifierVideo (18 minutes of action packed emacs video)

Read More...

October 12th, 2007

Semi-Automatic Discovery of Extraction Patterns for Log Analysis

Here’s a paper I recently wrote on some of the automatic field extraction we’re doing with Splunk.

Abstract
This paper presents an interactive bootstrapping process used in Splunk that automatically learns to extract fields from log events. End users simply select one or more example values of a field and a learning process discovers additional instances, along with the patterns to extract them. The user is able to correct the instances and save the extraction patterns. Immediately afterward, while searching log events the newly-taught fields will be extracted from the event’s raw text.
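As a toy illustration of learning an extraction pattern from one example value, here is a Python sketch. It is not the bootstrapping process from the paper: it assumes the value sits directly after a key-like token such as "user=", and the learn_pattern and extract functions are hypothetical names.

```python
import re


def learn_pattern(event, example_value):
    """Build a regex from the text immediately preceding an example value.

    Assumption: the value directly follows a key-like token (e.g. "user=").
    """
    start = event.index(example_value)               # ValueError if the value is absent
    prefix = re.search(r"\S*$", event[:start]).group()
    if not prefix:
        raise ValueError("no adjacent context to learn from")
    return re.compile(re.escape(prefix) + r"(\S+)")


def extract(pattern, events):
    """Apply a learned pattern to other events and collect the values it finds."""
    values = []
    for ev in events:
        match = pattern.search(ev)
        if match:
            values.append(match.group(1))
    return values


pattern = learn_pattern("login ok user=alice ip=10.0.0.1", "alice")
found = extract(pattern, ["login failed user=bob ip=10.0.0.2",
                          "disk full",
                          "logout user=carol"])
```

In an interactive loop, the user would look over the values found, correct the misses, and only then save the pattern.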

Click here to read full paper

Feedback appreciated.

Read More...

January 23rd, 2007

Still looking up syslog problems the old way?

Read More...

September 30th, 2005

One Geek’s Reasons for Splunk

I don’t think our website makes it painfully clear why you’d want Splunk.
Here is my view of why you will want Splunk.


What is Splunk?

    Splunk is a search server that indexes all your log files.

    If you need to search and troubleshoot log files, you need Splunk. It
    handles any log format, including syslog, Apache, JBoss, MySQL,
    Oracle, router data, etc. It parses and indexes in real time.

Grep works fine. Why do I need Splunk?

    grep is totally fine for small, simple, local files, but grep doesn’t
    work on 20GB of log files, across a dozen servers; doesn’t group
    multiline log messages together; doesn’t unify timestamps across
    files; doesn’t automatically find related log events; doesn’t show
    histograms of log events; doesn’t search gigabytes in seconds; doesn’t
    have a cool ajax web interface similar to google.

What are multiline log messages?

    As an example, java exceptions look like this:

    [source:java]java.lang.reflect.UndeclaredThrowableException
    at $Proxy231.getAllAttributes(Unknown Source)
    at com.collation.proxy.clientproxy.common.Module.getModelObject(Module.java:326)
    at com.collation.proxy.clientproxy.server.action.ChangeHistoryModule.getDependencies(ChangeHistoryModule.java:402)
    at com.collation.proxy.clientproxy.server.action.ChangeHistoryModule.getIdsWithDependencies(ChangeHistoryModule.java:386)

    [/source]

    You can’t use grep to search for Java proxy exceptions because
    “Exception” and “proxy” don’t occur on the same line! The same
    would apply to SQL, router data, email, or any other multiline event.
    Splunk automatically groups multiline events into single events, so
    the above exception would become one event. Splunk does this with
    advanced heuristics and machine learning algorithms, as well as
    customizable grouping rules.
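To show why grouping matters, here is a minimal Python sketch that folds stack-trace lines into the event that precedes them. It uses a single hard-coded rule (lines beginning with "at " or "Caused by:" are continuations), which is an assumption for illustration and far simpler than the heuristics mentioned above; group_multiline is a made-up name.

```python
import re

# Assumed rule: these line openings continue the previous event.
CONTINUATION = re.compile(r"^\s*(at\s|Caused by:)")


def group_multiline(lines):
    """Merge continuation lines into the event they belong to."""
    events = []
    for line in lines:
        if not line.strip():
            continue                     # skip blank lines
        if events and CONTINUATION.match(line):
            events[-1].append(line)      # part of the previous event
        else:
            events.append([line])        # start of a new event
    return ["\n".join(ev) for ev in events]


log = [
    "java.lang.reflect.UndeclaredThrowableException",
    "    at $Proxy231.getAllAttributes(Unknown Source)",
    "    at com.collation.proxy.clientproxy.common.Module.getModelObject(Module.java:326)",
    "INFO server started",
]
events = group_multiline(log)
hits = [ev for ev in events if "Exception" in ev and "proxy" in ev]
```

Once the exception is a single event, a search for both “Exception” and “proxy” finds it, which a line-by-line grep cannot do.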

What about unifying timestamps?

    Most log files have timestamps embedded in them. Splunk understands
    dozens and dozens of timestamp formats, unifying them across
    time zones. Some log files write events out as GMT (Greenwich Mean
    Time), some as local time such as PST (Pacific Standard Time). Some
    logs can come from servers on the east coast, some from the west
    coast, or beyond. By normalizing all these time zones across dozens
    of timestamp formats, Splunk allows you to ask “What happened at
    11:57pm, world-wide, across all my log files, across all my
    servers?”
    “I got an error
    at 1:15am yesterday. Show me the log events from all my logs just
    before
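The timestamp normalization described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three formats in FORMATS, the fixed offsets in TZ_OFFSETS (no daylight saving), and the to_utc function are all hypothetical, and the time zone is passed in by the caller rather than detected.

```python
from datetime import datetime, timedelta, timezone

# Assumed fixed offsets for a few zone abbreviations (daylight saving ignored).
TZ_OFFSETS = {
    "GMT": timedelta(0),
    "EST": timedelta(hours=-5),
    "PST": timedelta(hours=-8),
}

# A tiny, assumed sample of timestamp layouts.
FORMATS = [
    "%Y-%m-%d %H:%M:%S",
    "%d/%b/%Y:%H:%M:%S",
    "%b %d %Y %H:%M:%S",
]


def to_utc(stamp, tz_name):
    """Parse a timestamp in any known format and convert it to UTC."""
    for fmt in FORMATS:
        try:
            naive = datetime.strptime(stamp, fmt)
        except ValueError:
            continue                     # try the next format
        local = naive.replace(tzinfo=timezone(TZ_OFFSETS[tz_name]))
        return local.astimezone(timezone.utc)
    raise ValueError(f"unrecognized timestamp: {stamp!r}")


west = to_utc("2005-09-30 23:57:00", "PST")
east = to_utc("Oct 01 2005 02:57:00", "EST")
gmt = to_utc("01/Oct/2005:07:57:00", "GMT")
```

All three inputs describe the same instant, so once they are on one clock, a question about a single moment can be answered across every log at once.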

Read More...

