How to run searches and jobs using the Splunk SDK for PHP

The Splunk SDK for PHP is deprecated. For more information, see Deprecation notice.

Searches run in different modes, determining when and how you can retrieve results:

  • Normal: A normal search runs asynchronously. It returns a search job immediately. Poll the job to determine its status. You can retrieve the results when the search has finished. You can also preview the results if "preview" is enabled. Normal mode works with real-time searches.
  • Blocking: A blocking search runs synchronously. It does not return a search job until the search has finished, so there is no need to poll for status. Blocking mode doesn't work with real-time searches.
  • Oneshot: A oneshot search is a blocking search that is scheduled to run immediately. Instead of returning a search job, this mode returns the results of the search once completed. Because this is a blocking search, the results are not available until the search has finished.
  • Export: An export search is another type of search operation that runs immediately, does not create a job for the search, and starts streaming results immediately. At this time, the Splunk® SDK for PHP does not support this type of search.

For those searches that produce search jobs (normal and blocking), the search results are saved for a period of time on the server and can be retrieved on request. For those searches that stream the results (oneshot and export), the search results are not retained on the server. If the stream is interrupted for any reason, the results are not recoverable without running the search again.

 

The job APIs

The classes for working with jobs are:

  • Splunk_Jobs: The collection of search jobs.
  • Splunk_Job: An individual search job.

Access these classes through an instance of the Splunk_Service class. Retrieve a collection, and from there you can access individual items in the collection and create new ones. For example, here's a simplified program for getting a collection of jobs and creating a new one:

<?php

// Connect to Splunk
$service = new Splunk_Service($connectArguments);
$service->login();

// Get the collection of search jobs
$jobs = $service->getJobs(); 

// Create a search job
$job = $jobs->create($query);

?>
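
The snippet above leaves $connectArguments and $query undefined. As a rough sketch, they might be filled in as follows. The host, port, and credentials are placeholder values for a local test instance, and the buildConnectArguments helper and the SPLUNK_* environment variable names are hypothetical, not part of the SDK:

```php
<?php

// Hypothetical helper: collect connection settings for Splunk_Service.
// The environment variable names and the default values are assumptions.
function buildConnectArguments()
{
    return array(
        'host'     => getenv('SPLUNK_HOST') ?: 'localhost',
        'port'     => getenv('SPLUNK_PORT') ?: '8089',
        'username' => getenv('SPLUNK_USERNAME') ?: 'admin',
        'password' => getenv('SPLUNK_PASSWORD') ?: 'changeme',
    );
}

$connectArguments = buildConnectArguments();
$query = 'search * | head 10'; // Any valid search string

// Connect only when the SDK classes have been loaded
if (class_exists('Splunk_Service')) {
    $service = new Splunk_Service($connectArguments);
    $service->login();

    // Create a search job from the collection of jobs
    $job = $service->getJobs()->create($query);
}
```

Reading credentials from the environment keeps them out of source files. Any other configuration source works equally well.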
 

Code examples

This section provides examples of how to use the search APIs, assuming you first connect to a Splunk instance. To view these APIs in action, navigate to the index.php page within the examples directory (for instance, if you're developing and testing on your local machine, the URL might be http://localhost:8888/splunk-sdk-php/examples/index.php), and then click "List Jobs" under "Examples".

For the parameters that are available for search jobs, see Collection parameters and Search job parameters.

 

To list search jobs for the current user

This example shows how to use the Splunk_Jobs class to retrieve the collection of jobs available to the current user:

<?php

// Get the jobs owned by the "admin" user, in all apps
$jobs = $service->getJobs()->items(array(
    'namespace' => Splunk_Namespace::createUser('admin', NULL),
));
?>

<!-- List the name of each job -->
<ul>
  <?php
  foreach ($jobs as $job)
  {
    echo '<li>';
    echo htmlspecialchars($job->getName());
    echo '</li>';
  }
  ?>
</ul>
 

To run a normal search and poll for completion

Running a normal search creates a search job and immediately returns the search ID, so you need to poll the job to find out when the search has finished.

When you create a search job, you need to set the parameters of the job as an argument map of key-value pairs. For a list of all the possible parameters, see Search job parameters.

This example runs a normal search, prints the job's progress while waiting for it to finish, and then displays the results:

<?php
$searchQueryNormal = 'search * | head 100';

// Run a normal search        
$job = $service->getJobs()->create($searchQueryNormal, array(
    'exec_mode' => 'normal',
));

try
  {
    // Print progress of the job as it is running
    echo '<ul>';
    while (!$job->isDone())
    {
      echo '<li>';
      printf("%03.1f%%", $job->getProgress() * 100);
      echo '</li>';
      flush();

      usleep(500000); // Wait half a second before checking again
      $job->refresh();
    }
    echo '<li>Done</li>';
    echo '</ul>';

    // Get job results
    $resultsNormalSearch = $job->getResults();
    $messages = array();
  }
catch (Exception $e)
  {
    // Generate fake result that contains the exception message
    $resultsNormalSearch = array();
    $messages = array();
    $messages[] = new Splunk_ResultsMessage('EXCEPTION', $e->getMessage());
  }	

// Use the built-in XML parser to display the job results
foreach ($resultsNormalSearch as $result)
  {
    if ($result instanceof Splunk_ResultsFieldOrder)
    {
      // Process the field order
      print "FIELDS: " . implode(',', $result->getFieldNames()) . "\r\n";
    }
    else if ($result instanceof Splunk_ResultsMessage)
    {
      // Process a message
      print "[{$result->getType()}] {$result->getText()}\r\n";
    }
    else if (is_array($result))
    {
      // Process a row
      print "{\r\n";
      foreach ($result as $key => $valueOrValues)
        {
         if (is_array($valueOrValues))
          {
            $values = $valueOrValues;
            $valuesString = implode(',', $values);
            print "  {$key} => [{$valuesString}]\r\n";
          }
         else
          {
            $value = $valueOrValues;
            print "  {$key} => {$value}\r\n";
          }
        }
      print "}\r\n";
    }
    else
    {
      // Ignore unknown result type
    }
  }

?>
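
The polling loop above runs until the job reports that it is done, with no upper bound on the wait. The following sketch shows one way to add a timeout. The waitUntilDone helper is hypothetical and not part of the SDK. A counter stands in for the search job so that the code runs without a Splunk instance:

```php
<?php

// Hypothetical helper: call $isDone() until it returns true or until
// $timeoutSeconds have elapsed, calling $refresh() between checks.
// Returns true if the work finished, or false on timeout.
function waitUntilDone($isDone, $refresh, $timeoutSeconds = 60.0, $intervalSeconds = 0.5)
{
    $deadline = microtime(true) + $timeoutSeconds;
    while (!$isDone()) {
        if (microtime(true) >= $deadline) {
            return false;
        }
        usleep((int) round($intervalSeconds * 1000000));
        $refresh();
    }
    return true;
}

// Stand-in for a search job that finishes after three refreshes
$refreshCount = 0;
$finished = waitUntilDone(
    function () use (&$refreshCount) { return $refreshCount >= 3; },
    function () use (&$refreshCount) { $refreshCount++; },
    5.0,
    0.01
);

echo $finished ? "Done\n" : "Timed out\n";
```

With a real job, the two callbacks would wrap the calls used in the example above: function () use ($job) { return $job->isDone(); } and function () use ($job) { $job->refresh(); }.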
 

To run a blocking search and display properties of the job

Running a blocking search creates a search job and runs the search synchronously. The job is returned after the search has finished and all the results are in.

When you create a search job, you need to set the parameters of the job as an argument map of key-value pairs. For a list of all the possible parameters, see Search job parameters.

This example runs a blocking search, waits for the job to finish, and then displays some statistics:

<?php

// Run a blocking search
$searchQueryBlocking = 'search * | head 100'; // Return the first 100 events


// A blocking search returns the job when the search is done
echo '<p>Waiting for the search to finish...</p>';
$job = $service->getJobs()->create($searchQueryBlocking, array(
    'exec_mode' => 'blocking',
));
echo '<p>...done!</p>';

// Display properties of the job
echo '<p>Search job properties:</p><hr/>';
echo '<p>Search job ID: ' . htmlspecialchars($job['sid']);
echo '</p><p>The number of events: ' . htmlspecialchars($job['eventCount']);
echo '</p><p>The number of results: ' . htmlspecialchars($job['resultCount']);
echo '</p><p>Search duration: ' . htmlspecialchars($job['runDuration']);
echo ' seconds';
echo '</p><p>This job expires in: ' . htmlspecialchars($job['ttl']);
echo ' seconds</p>';

?>
 

To run a basic oneshot search and display results

Unlike other searches, the oneshot search does not create a search job, so you can't access it using the Splunk_Job class. Instead, use the Splunk_Service::oneshotSearch method. To set properties for the search (for example, to specify a time range to search), pass an array of key-value pairs. Some common parameters are:

  • output_mode: Specifies the output format of the results (XML, JSON, JSON_COLS, JSON_ROWS, CSV, ATOM, or RAW). You shouldn't need to change this from the XML default unless you intend to parse job results yourself.
  • earliest_time: Specifies the earliest time in the time range to search. The time string can be a UTC time (with fractional seconds), a relative time specifier (to now), or a formatted time string.
  • latest_time: Specifies the latest time in the time range to search. The time string can be a UTC time (with fractional seconds), a relative time specifier (to now), or a formatted time string.
  • rf: Specifies one or more fields to add to the search.

For a full list of possible properties, see the list of Search job parameters. Be aware, however, that most of these parameters don't apply to a oneshot search.

This example runs a oneshot search within a specified time range, parses the XML results with the Splunk_ResultsReader class, and displays them.

Note: If you don't see any search results with this example, you might not have any events in the specified time range. Modify the date and time as needed for your data set.

<?php

// Run a oneshot search
$searchQueryOneshot = 'search * | head 100'; // Return the first 100 events

// Set the search parameters; specify a time range
$searchParams = array(
    'earliest_time' => '2012-06-19T12:00:00.000-07:00',	
    'latest_time' => '2013-12-02T12:00:00.000-07:00'
);

// Run a oneshot search, which returns the results as a stream instead of a job
$resultsStream = $service->oneshotSearch($searchQueryOneshot, $searchParams);
$resultsOneshotSearch = new Splunk_ResultsReader($resultsStream);

// Use the built-in XML parser to display the job results
foreach ($resultsOneshotSearch as $result)
  {
    if ($result instanceof Splunk_ResultsFieldOrder)
    {
      // Process the field order
      print "FIELDS: " . implode(',', $result->getFieldNames()) . "\r\n";
    }
    else if ($result instanceof Splunk_ResultsMessage)
    {
      // Process a message
      print "[{$result->getType()}] {$result->getText()}\r\n";
    }
    else if (is_array($result))
    {
      // Process a row
      print "{\r\n";
      foreach ($result as $key => $valueOrValues)
        {
          if (is_array($valueOrValues))
            {
              $values = $valueOrValues;
              $valuesString = implode(',', $values);
              print "  {$key} => [{$valuesString}]\r\n";
            }
          else
            {
              $value = $valueOrValues;
              print "  {$key} => {$value}\r\n";
            }
        }
      print "}\r\n";
    }
    else
    {
      // Ignore unknown result type
    }
  }

?>
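
The time range in the example above is hard-coded. As a sketch of two alternatives, the following code builds the same kind of formatted time string from PHP date objects, and also shows a range written with relative time specifiers. The splunkTimeString helper is hypothetical and not part of the SDK, and the relative values "-24h" and "now" are illustrative choices:

```php
<?php

// Hypothetical helper: format a date as an ISO-8601 string with
// milliseconds and a UTC offset, the shape used in the example above.
function splunkTimeString(DateTimeInterface $time)
{
    return $time->format('Y-m-d\TH:i:s.vP');
}

$zone     = new DateTimeZone('-07:00');
$earliest = new DateTimeImmutable('2012-06-19 12:00:00', $zone);
$latest   = new DateTimeImmutable('2013-12-02 12:00:00', $zone);

// A fixed time range built from date objects
$searchParams = array(
    'earliest_time' => splunkTimeString($earliest),
    'latest_time'   => splunkTimeString($latest),
);

// The same kind of array using relative time specifiers (the last 24 hours)
$relativeParams = array(
    'earliest_time' => '-24h',
    'latest_time'   => 'now',
);

print $searchParams['earliest_time'] . "\r\n"; // 2012-06-19T12:00:00.000-07:00
print $searchParams['latest_time'] . "\r\n";   // 2013-12-02T12:00:00.000-07:00
```

Either array can be passed as the second argument to oneshotSearch in place of the hard-coded $searchParams.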
 

Collection parameters

By default, all entities are returned when you retrieve a collection. Using the parameters below, you can specify the number of entities to return and how to sort them. These parameters are available whenever you retrieve a collection.

  • count: A number that indicates the maximum number of entities to return.
  • offset: A number that specifies the index of the first entity to return.
  • search: A string that specifies a search expression to filter the response with, matching field values against the search expression. For example, "search=foo" matches any object that has "foo" as a substring in a field, and "search=field_name%3Dfield_value" restricts the match to a single field.
  • sort_dir: An enum value that specifies how to sort entities. Valid values are "asc" (ascending order) and "desc" (descending order).
  • sort_key: A string that specifies the field to sort by.
  • sort_mode: An enum value that specifies how to sort entities. Valid values are "auto", "alpha" (alphabetically), "alpha_case" (alphabetically, case sensitive), or "num" (numerically).
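
As a rough sketch of how these parameters might be combined for paging, the following code builds the argument array for one page of a collection. The pageArguments helper is hypothetical, and the "sid" sort field is an assumption chosen for illustration. The resulting array is passed to items() in the same way as the namespace argument in the job-listing example:

```php
<?php

// Hypothetical helper: build collection parameters for one page of entities.
// Pages are numbered from 1. Sorting is added only when a sort key is given.
function pageArguments($page, $pageSize, $sortKey = null, $sortDir = 'asc')
{
    $args = array(
        'count'  => $pageSize,
        'offset' => ($page - 1) * $pageSize,
    );
    if ($sortKey !== null) {
        $args['sort_key'] = $sortKey;
        $args['sort_dir'] = $sortDir;
    }
    return $args;
}

// Third page of 20 entities, sorted in descending order
$args = pageArguments(3, 20, 'sid', 'desc');
print_r($args);

// With a connected $service, the array is passed when retrieving the collection
if (isset($service)) {
    $jobs = $service->getJobs()->items($args);
}
```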
 

Search job parameters

Properties to set

The parameters you can use for search jobs correspond to the parameters for the search/jobs endpoint in the REST API.

This list summarizes the properties you can set for a search job. For the properties you can retrieve, see Properties to retrieve. For examples of setting these properties, see To run a blocking search and display properties of the job and To run a normal search and poll for completion.

  • search: Required. A string that contains the search query.
  • auto_cancel: The number of seconds of inactivity after which to automatically cancel a job. 0 means never auto-cancel.
  • auto_finalize_ec: The number of events to process after which to auto-finalize the search. 0 means no limit.
  • auto_pause: The number of seconds of inactivity after which to automatically pause a job. 0 means never auto-pause.
  • earliest_time: A time string that specifies the earliest time in the time range to search. The time string can be a UTC time (with fractional seconds), a relative time specifier (to now), or a formatted time string. For a real-time search, specify "rt".
  • enable_lookups: A Boolean that indicates whether to apply lookups to events.
  • exec_mode: An enum value that indicates the search mode ("blocking", "oneshot", or "normal").
  • force_bundle_replication: A Boolean that indicates whether this search should cause (and wait depending on the value of "sync_bundle_replication") bundle synchronization with all search peers.
  • id: A string that contains a search ID. If unspecified, a random ID is generated.
  • index_earliest: A string that specifies the time for the earliest (inclusive) time bounds for the search, based on the index time bounds. The time string can be a UTC time (with fractional seconds), a relative time specifier (to now), or a formatted time string.
  • index_latest: A string that specifies the time for the latest (inclusive) time bounds for the search, based on the index time bounds. The time string can be a UTC time (with fractional seconds), a relative time specifier (to now), or a formatted time string.
  • latest_time: A time string that specifies the latest time in the time range to search. The time string can be a UTC time (with fractional seconds), a relative time specifier (to now), or a formatted time string. For a real-time search, specify "rt".
  • max_count: The number of events that can be accessible in any given status bucket.
  • max_time: The number of seconds to run this search before finalizing. Specify 0 to never finalize.
  • namespace: A string that contains the application namespace in which to restrict searches.
  • now: A time string that sets the absolute time used for any relative time specifier in the search.
  • reduce_freq: The number of seconds (frequency) to run the MapReduce reduce phase on accumulated map values.
  • reload_macros: A Boolean that indicates whether to reload macro definitions from the macros.conf configuration file.
  • remote_server_list: A string that contains a comma-separated list of (possibly wildcarded) servers from which to pull raw events. This same server list is used in subsearches.
  • rf: A string that adds one or more required fields to the search.
  • rt_blocking: A Boolean that indicates whether the indexer blocks if the queue for this search is full. For real-time searches.
  • rt_indexfilter: A Boolean that indicates whether the indexer pre-filters events. For real-time searches.
  • rt_maxblocksecs: The number of seconds indicating the maximum time to block. 0 means no limit. For real-time searches with "rt_blocking" set to "true".
  • rt_queue_size: The number indicating the queue size (in events) that the indexer should use for this search. For real-time searches.
  • search_listener: A string that registers a search state listener with the search. Use the format: search_state;results_condition;http_method;uri;
  • search_mode: An enum value that indicates the search mode ("normal" or "realtime"). If set to "realtime", searches live data. A real-time search is also specified by setting "earliest_time" and "latest_time" parameters to "rt", even if the search_mode is normal or is not set.
  • spawn_process: A Boolean that indicates whether to run the search in a separate spawned process. Searches against indexes must run in a separate process.
  • status_buckets: The maximum number of status buckets to generate, which corresponds to the size of the data structure used to store timeline information. A value of 0 means to not generate timeline information.
  • sync_bundle_replication: A Boolean that indicates whether this search should wait for bundle replication to complete.
  • time_format: A string that specifies the format to use to convert a formatted time string from {start,end}_time into UTC seconds.
  • timeout: The number of seconds to keep this search after processing has stopped.
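
As a sketch of how these parameters might be assembled, the following code builds the argument map for a real-time search and rejects an exec_mode value that is not one of the three listed above. The searchJobArguments helper is hypothetical and not part of the SDK:

```php
<?php

// Hypothetical helper: build the argument map for a search job, rejecting
// an exec_mode that is not one of the documented values.
function searchJobArguments($execMode, array $extra = array())
{
    $validModes = array('blocking', 'oneshot', 'normal');
    if (!in_array($execMode, $validModes, true)) {
        throw new InvalidArgumentException("Unknown exec_mode: {$execMode}");
    }
    return array_merge(array('exec_mode' => $execMode), $extra);
}

// Arguments for a real-time search, which requires normal mode
$realTimeArgs = searchJobArguments('normal', array(
    'search_mode'   => 'realtime',
    'earliest_time' => 'rt',
    'latest_time'   => 'rt',
));

print_r($realTimeArgs);
```

The resulting array is passed as the second argument when creating a job, as in $service->getJobs()->create($query, $realTimeArgs).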
 
Properties to retrieve

This list summarizes the properties that are available for an existing search job:

  • cursorTime: The earliest time from which no events are later scanned.
  • delegate: For saved searches, specifies jobs that were started by the user.
  • diskUsage: The total amount of disk space used, in bytes.
  • dispatchState: The state of the search. Can be any of QUEUED, PARSING, RUNNING, PAUSED, FINALIZING, FAILED, DONE.
  • doneProgress: A number between 0 and 1.0 that indicates the approximate progress of the search.
  • dropCount: For real-time searches, the number of possible events that were dropped due to the "rt_queue_size".
  • eai:acl: The access control list for this job.
  • eventAvailableCount: The number of events that are available for export.
  • eventCount: The number of events returned by the search.
  • eventFieldCount: The number of fields found in the search results.
  • eventIsStreaming: A Boolean that indicates whether the events of this search are being streamed.
  • eventIsTruncated: A Boolean that indicates whether events of the search have not been stored.
  • eventSearch: Subset of the entire search before any transforming commands.
  • eventSorting: A value that indicates whether the events of this search are sorted, and in which order ("asc" for ascending, "desc" for descending, and "none" for not sorted).
  • isDone: A Boolean that indicates whether the search has finished.
  • isFailed: A Boolean that indicates whether there was a fatal error executing the search (for example, if the search string syntax was invalid).
  • isFinalized: A Boolean that indicates whether the search was finalized (stopped before completion).
  • isPaused: A Boolean that indicates whether the search has been paused.
  • isPreviewEnabled: A Boolean that indicates whether previews are enabled.
  • isRealTimeSearch: A Boolean that indicates whether the search is a real-time search.
  • isRemoteTimeline: A Boolean that indicates whether the remote timeline feature is enabled.
  • isSaved: A Boolean that indicates whether the search is saved indefinitely.
  • isSavedSearch: A Boolean that indicates whether this is a saved search run using the scheduler.
  • isZombie: A Boolean that indicates whether the process running the search is dead, but with the search not finished.
  • keywords: All positive keywords used by this search. A positive keyword is a keyword that is not in a NOT clause.
  • label: A custom name created for this search.
  • messages: Errors and debug messages.
  • numPreviews: Number of previews that have been generated so far for this search job.
  • performance: A representation of the execution costs.
  • priority: An integer from 0 to 10 that indicates the search's priority.
  • remoteSearch: The search string that is sent to every search peer.
  • reportSearch: If reporting commands are used, the reporting search.
  • request: GET arguments that the search sends to splunkd.
  • resultCount: The total number of results returned by the search, after any transforming commands have been applied (such as stats or top).
  • resultIsStreaming: A Boolean that indicates whether the final results of the search are available using streaming (for example, no transforming operations).
  • resultPreviewCount: The number of result rows in the latest preview results.
  • runDuration: A number specifying the time, in seconds, that the search took to complete.
  • scanCount: The number of events that are scanned or read off disk.
  • searchEarliestTime: The earliest time for a search, as specified in the search command rather than the "earliestTime" parameter. It does not snap to the indexed data time bounds for all-time searches (as "earliestTime" and "latestTime" do).
  • searchLatestTime: The latest time for a search, as specified in the search command rather than the "latestTime" parameter. It does not snap to the indexed data time bounds for all-time searches (as "earliestTime" and "latestTime" do).
  • searchProviders: A list of all the search peers that were contacted.
  • sid: The search ID number.
  • ttl: The time to live, or time before the search job expires after it has finished.
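
As a sketch of how these properties might be read together, the following code turns a job's state into a one-line summary. The describeJob helper is hypothetical and not part of the SDK. It reads properties with the same array syntax used in the blocking-search example, so a plain array stands in for a job here. It assumes that Boolean properties arrive as values that PHP treats as true or false:

```php
<?php

// Hypothetical helper: summarize a job from its retrievable properties.
// $job can be a Splunk_Job or an array with the same keys.
function describeJob($job)
{
    if ($job['isFailed']) {
        return "Job {$job['sid']} failed";
    }
    if ($job['isDone']) {
        return sprintf(
            'Job %s finished in %.2f seconds with %d results',
            $job['sid'],
            $job['runDuration'],
            $job['resultCount']
        );
    }
    return sprintf(
        'Job %s is %s (%.0f%% done)',
        $job['sid'],
        $job['dispatchState'],
        $job['doneProgress'] * 100
    );
}

// Sample property values standing in for a running job
$runningJob = array(
    'sid'           => '1375992000.42',
    'isFailed'      => false,
    'isDone'        => false,
    'dispatchState' => 'RUNNING',
    'doneProgress'  => 0.5,
);

print describeJob($runningJob) . "\r\n"; // Job 1375992000.42 is RUNNING (50% done)
```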