How to work with searches and jobs using the Splunk SDK for Java

Searches run in different modes, determining when and how you can retrieve results:

  • Normal: A normal search runs asynchronously. It returns a search job immediately. Poll the job to determine its status. You can retrieve the results when the search has finished. You can also preview the results if "preview" is enabled. Normal mode works with real-time searches.
  • Blocking: A blocking search runs synchronously. It does not return a search job until the search has finished, so there is no need to poll for status. Blocking mode doesn't work with real-time searches.
  • Oneshot: A oneshot search is a blocking search that is scheduled to run immediately. Instead of returning a search job, this mode returns the results of the search once completed. Because this is a blocking search, the results are not available until the search has finished.
  • Real-time: A real-time search runs in normal mode and searches live events as they stream into Splunk Enterprise for indexing. The events that are returned match your criteria within a specified time range window.
  • Export: An export search runs immediately, does not create a job for the search, and starts streaming results immediately. This search is useful for exporting large amounts of data from Splunk Enterprise.

For those searches that produce search jobs (normal, blocking, and real-time), the search results are saved for a period of time on the server and can be retrieved on request. For those searches that stream the results (oneshot and export), the search results are not retained on the server. If the stream is interrupted for any reason, the results are not recoverable without running the search again.
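
The distinctions above can be summarized in a small lookup. The following is a minimal sketch using only standard Java; the SearchModeSketch enum and its method names are hypothetical and are not part of the SDK. They encode only what this topic states about each mode.

```java
// Illustrative summary of the search modes described above; not an SDK type.
enum SearchModeSketch {
    NORMAL(true),
    BLOCKING(true),
    ONESHOT(false),
    REAL_TIME(true),
    EXPORT(false);

    private final boolean createsJob;

    SearchModeSketch(boolean createsJob) {
        this.createsJob = createsJob;
    }

    /** True if running a search in this mode produces a search job. */
    boolean createsJob() {
        return createsJob;
    }

    /** Results stay on the server only for searches that produce a job. */
    boolean resultsRetainedOnServer() {
        return createsJob;
    }

    public static void main(String[] args) {
        for (SearchModeSketch mode : values()) {
            System.out.println(mode + ": creates job=" + mode.createsJob()
                    + ", results retained=" + mode.resultsRetainedOnServer());
        }
    }
}
```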

The job APIs

The classes for working with jobs are:

  • The JobCollection class for the collection of search jobs.
  • The Job class for an individual search job.
  • The JobArgs class with arguments for creating jobs.
  • The JobEventsArgs class with arguments for retrieving events from a job.
  • The JobExportArgs class with arguments for creating an export search.
  • The JobResultsArgs class with arguments for retrieving results from a job.
  • The JobResultsPreviewArgs class with arguments for retrieving preview results from a job.
  • The JobSummaryArgs class with arguments for retrieving a summary of a job's results.

Access these classes through an instance of the Service class. Retrieve a collection, and from there you can access individual items in the collection and create new ones. For example, here's a simplified program for getting a collection of jobs and creating a new one:

// Connect to Splunk Enterprise
Service service = Service.connect(connectArgs);

// Retrieves the collection of search jobs
JobCollection jobs = service.getJobs();

// Creates a search job
Job job = jobs.create(query);

// Another way to create a search job, without keeping a reference to the collection
Job anotherJob = service.getJobs().create(query);

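The methods for running oneshot and export searches are available from the Service class (rather than the Job class) because these searches don't create search jobs: use the Service.oneshotSearch method for a oneshot search and the Service.export method for an export search.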

Code examples

This section provides examples of how to use the job APIs. The examples assume that you have already connected to a Splunk Enterprise instance and have a Service instance named service.

Note: This topic touches on displaying search results, but for more in-depth information, see How to display search results.
 

To list search jobs for the current user

This example gets the collection of jobs available to the current user:

// Retrieve the collection
JobCollection jobs = service.getJobs();
System.out.println("There are " + jobs.size() + " jobs available to 'admin'\n");

// List the job SIDs
for (Job job : jobs.values()) {
    System.out.println(job.getName());
}

To run a normal search and poll for completion

Running a normal search creates a search job and immediately returns the search ID, so you need to poll the job to find out when the search has finished.

When you create a search job, you need to set the parameters of the job as an argument map of key-value pairs. For a list of all the possible parameters, see Search job parameters.

This example runs a normal search, waits for the job to finish, and then displays the results along with some statistics:

// Additional imports
import java.io.InputStream;
import java.util.HashMap;

...

// Run a normal search
String searchQuery_normal = "search * | head 100";
JobArgs jobargs = new JobArgs();
jobargs.setExecutionMode(JobArgs.ExecutionMode.NORMAL);
Job job = service.getJobs().create(searchQuery_normal, jobargs);

// Wait for the search to finish
while (!job.isDone()) {
    try {
        Thread.sleep(500);
    } catch (InterruptedException e) {
        // Restore the interrupt flag and stop polling
        Thread.currentThread().interrupt();
        break;
    }
}

// Get the search results and use the built-in XML parser to display them
InputStream resultsNormalSearch = job.getResults();

try {
    ResultsReaderXml resultsReaderNormalSearch = new ResultsReaderXml(resultsNormalSearch);
    HashMap<String, String> event;
    while ((event = resultsReaderNormalSearch.getNextEvent()) != null) {
        System.out.println("\n****************EVENT****************\n");
        for (String key: event.keySet())
            System.out.println("   " + key + ":  " + event.get(key));
    }
    // Close the reader when done, as in the oneshot example
    resultsReaderNormalSearch.close();
} catch (Exception e) {
    e.printStackTrace();
}

// Get properties of the completed job
System.out.println("\nSearch job properties\n---------------------");
System.out.println("Search job ID:         " + job.getSid());
System.out.println("The number of events:  " + job.getEventCount());
System.out.println("The number of results: " + job.getResultCount());
System.out.println("Search duration:       " + job.getRunDuration() + " seconds");
System.out.println("This job expires in:   " + job.getTtl() + " seconds");
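
The polling loop above waits indefinitely. The following is a minimal sketch of a bounded wait that uses only standard Java. The PollingSketch class, the waitUntilDone method, and its parameters are hypothetical names for illustration. The condition is passed in as a BooleanSupplier so that the sketch runs without a Splunk Enterprise instance.

```java
import java.util.function.BooleanSupplier;

final class PollingSketch {
    /**
     * Polls the condition until it is true or the timeout elapses.
     * Returns true if the condition was met, false on timeout or interrupt.
     */
    static boolean waitUntilDone(BooleanSupplier isDone, long intervalMillis, long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (!isDone.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // Timed out while the condition was still false
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // Preserve the interrupt status
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Stand-in for a job that reports completion on the third check
        int[] checks = {0};
        boolean finished = waitUntilDone(() -> ++checks[0] >= 3, 10, 1_000);
        System.out.println("finished=" + finished + " after " + checks[0] + " checks");
    }
}
```

With the SDK, the supplier would presumably be a method reference to the job's isDone method, for example waitUntilDone(job::isDone, 500, 60000). A caller that gets false back can decide whether to keep waiting or to stop the job.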

To run a blocking search and display properties of the job

Running a blocking search creates a search job and runs the search synchronously. The job is returned after the search has finished and all the results are in.

When you create a search job, you need to set the parameters of the job as an argument map of key-value pairs. For a list of all the possible parameters, see Search job parameters.

This example runs a blocking search, waits for the job to finish, and then displays some statistics:

// Run a blocking search
String searchQuery_blocking = "search * | head 100"; // Return the first 100 events
JobArgs jobargs = new JobArgs();
jobargs.setExecutionMode(JobArgs.ExecutionMode.BLOCKING);

// A blocking search returns the job when the search is done
System.out.println("Wait for the search to finish...");
Job job = service.getJobs().create(searchQuery_blocking, jobargs);
System.out.println("...done!\n");

// Get properties of the job
System.out.println("Search job properties:\n---------------------");
System.out.println("Search job ID:         " + job.getSid());
System.out.println("The number of events:  " + job.getEventCount());
System.out.println("The number of results: " + job.getResultCount());
System.out.println("Search duration:       " + job.getRunDuration() + " seconds");
System.out.println("This job expires in:   " + job.getTtl() + " seconds");

To run a basic oneshot search and display results

Unlike other searches, the oneshot search does not create a search job, so you can't access it using the Job and JobCollection classes. Instead, use the Service.oneshotSearch method. To set properties for the search (for example, to specify a time range to search), you'll need to create an argument map with the parameter key-value pairs. Some common parameters are:

  • output_mode: Specifies the output format of the results (XML, JSON, JSON_COLS, JSON_ROWS, CSV, ATOM, or RAW).
  • earliest_time: Specifies the earliest time in the time range to search. The time string can be a UTC time (with fractional seconds), a relative time specifier (to now), or a formatted time string.
  • latest_time: Specifies the latest time in the time range to search. The time string can be a UTC time (with fractional seconds), a relative time specifier (to now), or a formatted time string.
  • rf: Specifies one or more fields to add to the search.

For a full list of possible properties, see the list of Search job parameters, although most of these parameters don't apply to a oneshot search.

This example runs a oneshot search within a specified time range and displays the results in XML.

Note: If you don't see any search results with this example, you might not have anything in the specified time range. Just modify the date and time as needed for your data set.

// Set the parameters for the search:
Args oneshotSearchArgs = new Args(); 
oneshotSearchArgs.put("earliest_time", "2012-06-19T12:00:00.000-07:00");
oneshotSearchArgs.put("latest_time",   "2012-06-20T12:00:00.000-07:00");
String oneshotSearchQuery = "search * | head 10";

// The search results are returned directly
InputStream results_oneshot = service.oneshotSearch(oneshotSearchQuery, oneshotSearchArgs);

// Get the search results and use the built-in XML parser to display them
try {
    ResultsReaderXml resultsReader = new ResultsReaderXml(results_oneshot);
    System.out.println("Searching everything in a 24-hour time range starting June 19, 12:00pm and displaying 10 results in XML:\n");
    HashMap<String, String> event;
    while ((event = resultsReader.getNextEvent()) != null) {
        System.out.println("\n********EVENT********");
        for (String key: event.keySet())
            System.out.println("   " + key + ":  " + event.get(key));
    }
    resultsReader.close();
} catch (Exception e) {
    e.printStackTrace();
}
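
The example above hard-codes its time range. As a minimal sketch, time strings of the same shape can be generated with the standard java.time classes. The TimeRangeSketch class and its method names are hypothetical. The sketch builds a plain Map so that it runs without the SDK.

```java
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.LinkedHashMap;
import java.util.Map;

final class TimeRangeSketch {
    // Same shape as the timestamps in the example above:
    // date, time with milliseconds, and a UTC offset
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("uuuu-MM-dd'T'HH:mm:ss.SSSXXX");

    static String toTimeString(OffsetDateTime time) {
        return time.format(FORMAT);
    }

    /** Builds earliest_time and latest_time for a window of the given length. */
    static Map<String, Object> timeRange(OffsetDateTime start, int hours) {
        Map<String, Object> args = new LinkedHashMap<>();
        args.put("earliest_time", toTimeString(start));
        args.put("latest_time", toTimeString(start.plusHours(hours)));
        return args;
    }

    public static void main(String[] args) {
        OffsetDateTime start =
                OffsetDateTime.of(2012, 6, 19, 12, 0, 0, 0, ZoneOffset.ofHours(-7));
        System.out.println(timeRange(start, 24));
    }
}
```

Because the example above fills its Args object with put calls, the entries of this map could presumably be copied into it in the same way.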

To run a real-time search

Real-time searches return live events as they are indexed, and this type of search continues to run as events continue to arrive. So, to view results from a real-time search, you must view the preview results. You can think of the previews as a snapshot of the search results at that moment in time. A few search job parameters are required to run a real-time search, which you can set by using the methods of the JobArgs class:

  • Set the execution mode ("exec_mode") to "normal".
  • Set the search mode ("search_mode") to "realtime", which also enables previews.
  • Set the earliest and latest times to search ("earliest_time" and "latest_time") to "rt". (Setting these to "rt" also sets the search mode to "realtime".)
    If you want to specify a sliding window for your search (let's say a one-minute window), you can use relative time modifiers (for example, set "earliest_time" to "rt-1m", "latest_time" to "rt"), and the time range is continuously updated based on the current time.

The real-time search continues to run until you either cancel or finalize it. If you cancel the search, the search job is deleted. If you finalize it, the search is stopped and the search job is completed.

Displaying the real-time search results requires displaying preview results (this is also described in To display preview results):

  • By default, the most recent 100 previews are retrieved. To change this number, set a value for "count". Use the "offset" value to page through large sets of previews.
  • Only previews from the time range to search are retrieved.
  • Depending on the time range to search, the number of events that are arriving to be indexed, and the count of previews to retrieve, the previews from one set to the next might include duplicates or be incomplete.

The following example shows a real-time search of your internal index. Results are displayed in XML using a standard Java stream reader. For more about displaying results, output formats, and results readers, see How to display search results.

// Create an argument map for the job arguments:
// a normal real-time search, a window of 1 minute, with timeline data
JobArgs jobArgs = new JobArgs();
jobArgs.setExecutionMode(JobArgs.ExecutionMode.NORMAL);
jobArgs.setSearchMode(JobArgs.SearchMode.REALTIME);
jobArgs.setEarliestTime("rt-1m");
jobArgs.setLatestTime("rt");
jobArgs.setStatusBuckets(300);

// Create the job
String mySearch = "search index=_internal";
Job job = service.search(mySearch, jobArgs);

// Wait for the job to be ready
while (!job.isReady()) {
    Thread.sleep(500);
}

// View the results--a stream of previews--using standard Java classes
JobResultsPreviewArgs previewArgs = new JobResultsPreviewArgs();
previewArgs.setCount(300);     // Retrieve 300 previews at a time

// Use a continual loop 
while (true) {
    InputStream stream = job.getResultsPreview(previewArgs);
    String line = null;
    BufferedReader reader = new BufferedReader(new InputStreamReader(
        stream, "UTF-8"));
    while ((line = reader.readLine()) != null) {
        System.out.println(line);
    }
    reader.close();
    stream.close();
    Thread.sleep(500);
}
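
Because successive previews can overlap, a client that processes each preview in turn might handle the same event more than once. The following is a minimal sketch of one way to filter out repeats, using only standard Java. The PreviewDedupSketch class is hypothetical. Treating the combination of the _time and _raw fields as an event's identity is an assumption made for illustration: two identical events with the same timestamp would be collapsed into one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class PreviewDedupSketch {
    // Keys of every event seen so far; grows for as long as the search runs
    private final Set<String> seen = new HashSet<>();

    // Assumed identity of an event: its _time and _raw fields combined
    static String keyOf(Map<String, String> event) {
        return event.get("_time") + "|" + event.get("_raw");
    }

    /** Returns only the events that did not appear in an earlier preview. */
    List<Map<String, String>> newEvents(List<Map<String, String>> preview) {
        List<Map<String, String>> fresh = new ArrayList<>();
        for (Map<String, String> event : preview) {
            if (seen.add(keyOf(event))) {
                fresh.add(event);
            }
        }
        return fresh;
    }

    // Helper that builds a stand-in event for the demonstration
    static Map<String, String> event(String time, String raw) {
        Map<String, String> e = new LinkedHashMap<>();
        e.put("_time", time);
        e.put("_raw", raw);
        return e;
    }

    public static void main(String[] args) {
        PreviewDedupSketch dedup = new PreviewDedupSketch();
        List<Map<String, String>> first = Arrays.asList(event("1", "a"), event("2", "b"));
        List<Map<String, String>> second = Arrays.asList(event("2", "b"), event("3", "c"));
        System.out.println("New in first preview:  " + dedup.newEvents(first).size());
        System.out.println("New in second preview: " + dedup.newEvents(second).size());
    }
}
```

For a long-running search, the set of seen keys would need to be trimmed (for example, by dropping keys older than the search window) to keep memory use bounded.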

To run an export search

An export search is the most reliable way to return a large set of results because exporting returns results in a stream, rather than as a search job that is saved on the server. So the server-side limitations to the number of results that can be returned don't apply to export searches.

You can run an export search in normal and real-time modes, and running the search is similar to running a regular search. However, displaying the results of an export search is a little tricky because the results require different parsing. These issues are covered in To work with results from an export search.

To show you how to run a simple export search, this example runs a normal-mode export search of your internal index over the last hour and uses the SDK's MultiResultsReaderXml class to display the output. For more about displaying results, see How to display search results.

// Create an argument map for the export arguments
JobExportArgs exportArgs = new JobExportArgs();
exportArgs.setEarliestTime("-1h");
exportArgs.setLatestTime("now");
exportArgs.setSearchMode(JobExportArgs.SearchMode.NORMAL); 

// Run the search with a search query and export arguments
String mySearch = "search index=_internal";
InputStream exportSearch = service.export(mySearch, exportArgs);

// Display results using the SDK's multi-results reader for XML 
MultiResultsReaderXml multiResultsReader = new MultiResultsReaderXml(exportSearch);

int counter = 0;  // count the number of events
for (SearchResults searchResults : multiResultsReader)
{
    for (Event event : searchResults) {
        System.out.println("***** Event " + counter++ + " *****");
        for (String key: event.keySet())
            System.out.println("   " + key + ":  " + event.get(key));
    }
}
multiResultsReader.close();
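
Because export results are not retained on the server, it can be useful to write the stream to disk before parsing it. The following is a minimal sketch that uses only standard Java. The ExportToFileSketch class and the saveStream method are hypothetical names, and the in-memory stream in main is a stand-in for the InputStream that service.export returns in the example above.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class ExportToFileSketch {
    /** Copies the whole stream to the target file, closes it, and returns the byte count. */
    static long saveStream(InputStream results, Path target) throws IOException {
        try (InputStream in = results) {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the stream returned by an export search
        byte[] fake = "<results/>".getBytes(StandardCharsets.UTF_8);
        Path target = Files.createTempFile("export", ".xml");
        long bytes = saveStream(new ByteArrayInputStream(fake), target);
        System.out.println("Wrote " + bytes + " bytes to " + target);
        Files.delete(target);
    }
}
```

Files.copy reads the stream in chunks, so the full result set is never held in memory. If the copy fails partway through, the search has to be run again, as noted at the start of this topic.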

Parameters

The following parameters are available for search jobs:

Collection parameters

By default, all entities are returned when you retrieve a collection. Using the parameters below, you can specify the number of entities to return, how to sort them, and so on. Set these parameters using setters from the CollectionArgs class, or create a generic Args map.

  • count: A number that indicates the maximum number of entities to return.
  • offset: A number that specifies the index of the first entity to return.
  • search: A string that specifies a search expression to filter the response with, matching field values against the search expression. For example, "search=foo" matches any object that has "foo" as a substring in a field, and "search=field_name%3Dfield_value" restricts the match to a single field.
  • sort_dir: An enum value that specifies the direction in which to sort entities. Valid values are "asc" (ascending order) and "desc" (descending order).
  • sort_key: A string that specifies the field to sort by.
  • sort_mode: An enum value that specifies how to sort entities. Valid values are "auto", "alpha" (alphabetically), "alpha_case" (alphabetically, case sensitive), or "num" (numerically).
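
To show how count and offset combine for paging, and where the "%3D" in the search example comes from, here is a minimal sketch using only standard Java (URLEncoder.encode with a Charset argument requires Java 10 or later). The CollectionQuerySketch class and its methods are hypothetical. The sketch only illustrates the encoded form of the parameters and does not call the SDK.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

final class CollectionQuerySketch {
    /** Builds count and offset for a zero-based page index. */
    static Map<String, Object> page(int pageIndex, int pageSize) {
        Map<String, Object> args = new LinkedHashMap<>();
        args.put("count", pageSize);
        args.put("offset", pageIndex * pageSize);
        return args;
    }

    /** Renders the arguments as a URL query string, percent-encoding each value. */
    static String toQueryString(Map<String, Object> args) {
        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, Object> entry : args.entrySet()) {
            String value = URLEncoder.encode(
                    String.valueOf(entry.getValue()), StandardCharsets.UTF_8);
            query.add(entry.getKey() + "=" + value);
        }
        return query.toString();
    }

    public static void main(String[] args) {
        Map<String, Object> collectionArgs = page(2, 10); // Third page of 10 entities
        collectionArgs.put("search", "field_name=field_value");
        collectionArgs.put("sort_dir", "desc");
        System.out.println(toQueryString(collectionArgs));
    }
}
```

The "=" inside the search expression is encoded as "%3D", which matches the example in the search parameter description. When you pass the expression through an argument map, the unencoded form is presumably what you supply.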

Search job parameters

Properties to set

The parameters you can use for search jobs correspond to the parameters for the search/jobs endpoint in the REST API.

This list summarizes the properties you can set for a search job (for the properties you can retrieve, see Properties to retrieve). There are different ways to set these properties: you can create an argument map with these parameters as key-value pairs using the generic Args class, or you can use the setters of the specialized argument classes listed in The job APIs (JobArgs, JobEventsArgs, JobExportArgs, JobResultsArgs, JobResultsPreviewArgs, and JobSummaryArgs) with the methods that accept them.

  • search: Required. A string that contains the search query.
  • auto_cancel: The number of seconds of inactivity after which to automatically cancel a job. 0 means never auto-cancel.
  • auto_finalize_ec: The number of events to process after which to auto-finalize the search. 0 means no limit.
  • auto_pause: The number of seconds of inactivity after which to automatically pause a job. 0 means never auto-pause.
  • earliest_time: A time string that specifies the earliest time in the time range to search. The time string can be a UTC time (with fractional seconds), a relative time specifier (to now), or a formatted time string. For a real-time search, specify "rt".
  • enable_lookups: A Boolean that indicates whether to apply lookups to events.
  • exec_mode: An enum value that indicates the execution mode ("blocking", "oneshot", or "normal").
  • force_bundle_replication: A Boolean that indicates whether this search should cause (and wait depending on the value of "sync_bundle_replication") bundle synchronization with all search peers.
  • id: A string that contains a search ID. If unspecified, a random ID is generated.
  • index_earliest: A string that specifies the time for the earliest (inclusive) time bounds for the search, based on the index time bounds. The time string can be a UTC time (with fractional seconds), a relative time specifier (to now), or a formatted time string.
  • index_latest: A string that specifies the time for the latest (inclusive) time bounds for the search, based on the index time bounds. The time string can be a UTC time (with fractional seconds), a relative time specifier (to now), or a formatted time string.
  • latest_time: A time string that specifies the latest time in the time range to search. The time string can be a UTC time (with fractional seconds), a relative time specifier (to now), or a formatted time string. For a real-time search, specify "rt".
  • max_count: The number of events that can be accessible in any given status bucket.
  • max_time: The number of seconds to run this search before finalizing. Specify 0 to never finalize.
  • namespace: A string that contains the application namespace in which to restrict searches.
  • now: A time string that sets the absolute time used for any relative time specifier in the search.
  • reduce_freq: The number of seconds (frequency) to run the MapReduce reduce phase on accumulated map values.
  • reload_macros: A Boolean that indicates whether to reload macro definitions from the macros.conf configuration file.
  • remote_server_list: A string that contains a comma-separated list of (possibly wildcarded) servers from which to pull raw events. This same server list is used in subsearches.
  • rf: A string that adds one or more required fields to the search.
  • rt_blocking: A Boolean that indicates whether the indexer blocks if the queue for this search is full. For real-time searches.
  • rt_indexfilter: A Boolean that indicates whether the indexer pre-filters events. For real-time searches.
  • rt_maxblocksecs: The number of seconds indicating the maximum time to block. 0 means no limit. For real-time searches with "rt_blocking" set to "true".
  • rt_queue_size: The number indicating the queue size (in events) that the indexer should use for this search. For real-time searches.
  • search_listener: A string that registers a search state listener with the search. Use the format: search_state;results_condition;http_method;uri;
  • search_mode: An enum value that indicates the search mode ("normal" or "realtime"). If set to "realtime", searches live data. A real-time search is also specified by setting "earliest_time" and "latest_time" parameters to "rt", even if the search_mode is normal or is not set.
  • spawn_process: A Boolean that indicates whether to run the search in a separate spawned process. Searches against indexes must run in a separate process.
  • status_buckets: The maximum number of status buckets to generate, which corresponds to the size of the data structure used to store timeline information. This value is also used for summaries. A value of 0 means to not generate timeline or summary information.
  • sync_bundle_replication: A Boolean that indicates whether this search should wait for bundle replication to complete.
  • time_format: A string that specifies the format to use to convert a formatted time string from {start,end}_time into UTC seconds.
  • timeout: The number of seconds to keep this search after processing has stopped.
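
Some of these parameters interact. For example, this topic notes that blocking mode doesn't work with real-time searches. The following is a minimal sketch of checking a generic argument map before creating a job, using only standard Java. The JobArgsCheckSketch class and its methods are hypothetical and are not part of the SDK. The real-time test follows the search_mode description above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class JobArgsCheckSketch {
    private static boolean startsWithRt(Object value) {
        return value instanceof String && ((String) value).startsWith("rt");
    }

    /** Real-time if search_mode is "realtime" or both time bounds use "rt". */
    static boolean isRealTime(Map<String, Object> args) {
        return "realtime".equals(args.get("search_mode"))
                || (startsWithRt(args.get("earliest_time"))
                        && startsWithRt(args.get("latest_time")));
    }

    /** Rejects the one combination this topic rules out. */
    static void check(Map<String, Object> args) {
        if (isRealTime(args) && "blocking".equals(args.get("exec_mode"))) {
            throw new IllegalArgumentException(
                    "Blocking mode doesn't work with real-time searches");
        }
    }

    public static void main(String[] args) {
        Map<String, Object> jobArgs = new LinkedHashMap<>();
        jobArgs.put("exec_mode", "normal");
        jobArgs.put("earliest_time", "rt-1m"); // One-minute sliding window
        jobArgs.put("latest_time", "rt");
        jobArgs.put("status_buckets", 300);
        check(jobArgs);
        System.out.println("Real-time: " + isRealTime(jobArgs) + ", arguments accepted");
    }
}
```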

Properties to retrieve

This list summarizes the properties that are available for an existing search job:

  • cursorTime: The earliest time from which no events are later scanned.
  • delegate: For saved searches, specifies jobs that were started by the user.
  • diskUsage: The total amount of disk space used, in bytes.
  • dispatchState: The state of the search. Can be any of QUEUED, PARSING, RUNNING, PAUSED, FINALIZING, FAILED, DONE.
  • doneProgress: A number between 0 and 1.0 that indicates the approximate progress of the search.
  • dropCount: For real-time searches, the number of possible events that were dropped due to the "rt_queue_size".
  • eai:acl: The access control list for this job.
  • eventAvailableCount: The number of events that are available for export.
  • eventCount: The number of events returned by the search.
  • eventFieldCount: The number of fields found in the search results.
  • eventIsStreaming: A Boolean that indicates whether the events of this search are being streamed.
  • eventIsTruncated: A Boolean that indicates whether events of the search have not been stored.
  • eventSearch: Subset of the entire search before any transforming commands.
  • eventSorting: A value that indicates whether the events of this search are sorted, and in which order ("asc" for ascending, "desc" for descending, and "none" for not sorted).
  • isDone: A Boolean that indicates whether the search has finished.
  • isFailed: A Boolean that indicates whether there was a fatal error executing the search (for example, if the search string syntax was invalid).
  • isFinalized: A Boolean that indicates whether the search was finalized (stopped before completion).
  • isPaused: A Boolean that indicates whether the search has been paused.
  • isPreviewEnabled: A Boolean that indicates whether previews are enabled.
  • isRealTimeSearch: A Boolean that indicates whether the search is a real-time search.
  • isRemoteTimeline: A Boolean that indicates whether the remote timeline feature is enabled.
  • isSaved: A Boolean that indicates whether the search is saved indefinitely.
  • isSavedSearch: A Boolean that indicates whether this is a saved search run using the scheduler.
  • isZombie: A Boolean that indicates whether the process running the search is dead, but with the search not finished.
  • keywords: All positive keywords used by this search. A positive keyword is a keyword that is not in a NOT clause.
  • label: A custom name created for this search.
  • messages: Errors and debug messages.
  • numPreviews: Number of previews that have been generated so far for this search job.
  • performance: A representation of the execution costs.
  • priority: An integer between 0 and 10 that indicates the search's priority.
  • remoteSearch: The search string that is sent to every search peer.
  • reportSearch: If reporting commands are used, the reporting search.
  • request: GET arguments that the search sends to splunkd.
  • resultCount: The total number of results returned by the search, after any transforming commands have been applied (such as stats or top).
  • resultIsStreaming: A Boolean that indicates whether the final results of the search are available using streaming (for example, no transforming operations).
  • resultPreviewCount: The number of result rows in the latest preview results.
  • runDuration: A number specifying the time, in seconds, that the search took to complete.
  • scanCount: The number of events that are scanned or read off disk.
  • searchEarliestTime: The earliest time for a search, as specified in the search command rather than the "earliestTime" parameter. It does not snap to the indexed data time bounds for all-time searches (as "earliestTime" and "latestTime" do).
  • searchLatestTime: The latest time for a search, as specified in the search command rather than the "latestTime" parameter. It does not snap to the indexed data time bounds for all-time searches (as "earliestTime" and "latestTime" do).
  • searchProviders: A list of all the search peers that were contacted.
  • sid: The search ID number.
  • ttl: The time to live, or time before the search job expires after it has finished.
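
As a minimal sketch of how the dispatchState and doneProgress properties might be combined into a status line, the following uses only standard Java. The JobStatusSketch class is hypothetical and takes plain values rather than a Job object. Treating FAILED and DONE as the only end states is an assumption made for illustration.

```java
import java.util.Locale;

final class JobStatusSketch {
    // The dispatch states listed above
    enum DispatchState {
        QUEUED, PARSING, RUNNING, PAUSED, FINALIZING, FAILED, DONE;

        // Assumption: a job in one of these states will not change state again
        boolean isTerminal() {
            return this == FAILED || this == DONE;
        }
    }

    /** Formats a state name and a progress value between 0 and 1.0 as a status line. */
    static String describe(String dispatchState, double doneProgress) {
        DispatchState state = DispatchState.valueOf(dispatchState.toUpperCase(Locale.ROOT));
        long percent = Math.round(doneProgress * 100);
        String suffix = state.isTerminal() ? ", finished" : ", in progress";
        return state + " (" + percent + "%" + suffix + ")";
    }

    public static void main(String[] args) {
        System.out.println(describe("RUNNING", 0.5));
        System.out.println(describe("DONE", 1.0));
    }
}
```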