How to get data into Splunk Enterprise using the Splunk SDK for Java

Getting data into Splunk® Enterprise involves taking data from inputs, and then indexing that data by transforming it into individual events that contain searchable fields. Here's a brief overview of how it all works.

This topic contains the following sections:

  • Data inputs
  • Indexes
  • The input and index APIs
  • Code examples
  • Parameters

 

Data inputs

A data input is a source of incoming event data. Splunk Enterprise can index data from the following types of inputs:

  • Files and directories—the contents of files and directories of files. You can upload a file for one-time indexing (a oneshot input), monitor for new data, or monitor for file system changes (events are generated when the directory undergoes a change). Files and directories can be included using whitelists, and excluded using blacklists.
  • Network events—data that is received over network Transmission Control Protocol (TCP) and User Datagram Protocol (UDP) ports, such as data that is sent from a Splunk forwarder from a remote computer. TCP inputs are separated into raw (unprocessed) and cooked (processed) inputs, with SSL as an option for either type.
  • Windows data—data from Windows computers, which includes:
    • Windows event log data
    • Windows Registry data
    • Windows Management Instrumentation (WMI) data
    • Active Directory data
    • Performance monitoring (perfmon) data
  • Other data sources—data from custom apps, FIFO queues, scripts that get data from APIs, and other remote data interfaces and message queues.

Your data inputs and their configurations are saved in the inputs.conf configuration file.
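For example, here's an illustrative sketch in plain Java (with hypothetical patterns and paths, not Splunk's actual matching engine) of how whitelist and blacklist regular expressions decide whether a file path is indexed: a path qualifies when it matches the whitelist and does not match the blacklist.

```java
import java.util.regex.Pattern;

// Hypothetical whitelist: index only .log files
Pattern whitelist = Pattern.compile("\\.log$");
// Hypothetical blacklist: skip anything under a tmp directory
Pattern blacklist = Pattern.compile("/tmp/");

String path = "/var/log/app/error.log";
boolean indexed = whitelist.matcher(path).find()
        && !blacklist.matcher(path).find();
System.out.println(path + " -> " + (indexed ? "indexed" : "skipped"));
```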

 

Indexes

The index stores compressed, raw event data. When receiving data from your inputs, Splunk Enterprise parses the data into events and then indexes them, as follows:

  • During parsing, Splunk Enterprise extracts default fields, configures character-set encoding, identifies line termination, identifies timestamps (creating them if they aren't there), masks sensitive or private data, and can apply custom metadata. Parsing can be done by heavy forwarders. Universal forwarders do minimal parsing.
  • During indexing, Splunk Enterprise breaks events into segments, builds the index data structures, and writes the raw data and index files to disk.

Splunk Enterprise can usually determine the data type and handle the data accordingly. But when setting up new inputs, you might consider sending data to a test index first to make sure everything is configured the way you want. You can delete the indexed data (clean the index) and start over as needed. Event processing rules are set in the props.conf configuration file, which you'll need to modify directly if you want to reconfigure how events are processed.

Each index is stored as a collection of database directories (also known as buckets) in the file system, located in $SPLUNK_HOME/var/lib/splunk. Buckets are organized by age:

  • Hot buckets are searchable, actively being written to, one per index. Hot buckets roll to warm at a certain size or when splunkd is restarted, then a new hot bucket is created.
  • Warm buckets are searchable. The oldest warm buckets roll to cold when the number of warm buckets reaches a configured limit.
  • Cold buckets are searchable. After a set period of time, cold buckets roll to frozen.
  • Frozen buckets are not searchable. These buckets are archived or deleted.

You can configure aspects such as the path configuration for your buckets. For example, keep the hot and warm buckets on a local computer for quick access, and put the cold and frozen buckets on a separate disk for long-term storage. You can also set the storage size.

By default, data is stored in the main index, but you can add more indexes for different data inputs. You might want multiple indexes to:

  • Control user access. Users can search only in indexes they are allowed to by their assigned role.
  • Accommodate varying retention policies. Set a different archive or retention policy by index.
  • Speed searches in certain situations. Create dedicated indexes for each data source, and search only the index you need.
 

The input and index APIs

The classes for working with data inputs are:

  • The Input class for working with an individual data input.
  • The InputCollection class for working with the collection of data inputs.
  • The InputKind class for specifying the kind of data input.

The classes for working with indexes are:

  • The Index class for working with an individual index.
  • The IndexCollection class for working with the collection of indexes.
  • The IndexCollectionArgs class for retrieving a collection of indexes.

Access these classes through an instance of the Service class. Retrieve a collection, and from there you can access individual items in the collection and create new ones. For example, here's a simplified program for getting a collection of inputs and creating a new one:

// Connect to Splunk Enterprise
Service service = Service.connect(connectArgs);

// Retrieve the collection of data inputs
InputCollection myInputs = service.getInputs();

// Create a TCP raw input (the name of the input is its port number)
String port = "10000";
TcpInput tcpInput = myInputs.create(port, InputKind.Tcp);
 

Code examples

This section provides examples of how to use the index and input APIs, assuming you have first connected to a Splunk Enterprise instance.

 

To list data inputs

This example shows how to retrieve and list the collection of data inputs that have been configured for Splunk Enterprise.

When retrieving a collection, you can set additional parameters to specify how many entities to return, the sorting order, and so on. For more, see Collection parameters.

Note: To list inputs, the user's role must include the appropriate capabilities. For a list of available capabilities, see Capabilities.

 

// Get the collection of data inputs
InputCollection myInputs = service.getInputs();

// Iterate and list the collection of inputs        
System.out.println("There are " + myInputs.size() + " data inputs:\n");
for (Input entity: myInputs.values()) {
    System.out.println("  " + entity.getName() + " (" + entity.getKind() + ")");
}
 

To create a new data input

This example shows how to create a data input. You'll need to provide an input kind and a name. The input kind is a member of the InputKind enumeration. The name you specify depends on the type of input you are creating:

  • Active Directory: The name of the configuration for a specific domain controller.
  • Monitor: The file or directory path to monitor.
  • Script: The name of the script.
  • TCP cooked, TCP raw, UDP: The port number of the input.
  • Windows event log, Windows Perfmon, WMI: The name of the collection.
  • Windows Registry: The name of the configuration stanza.

Note: Oneshot inputs are created differently (see To add data directly to an index).

You can set additional properties for the data input in different ways:

  • Use the setter methods. The setter methods are the easiest way to set and modify properties, but they aren't available until after the data input has been created. See the next section for more about modifying a data input.
  • Create an argument map of key-value pairs. Creating an argument map is the only way to set properties at the same time you create a data input, but it requires a little more work to look up properties and provide values in the correct format. For a list of possible properties, see Input parameters.
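As a sketch of the argument-map approach (the property names come from the Input parameters section below; the values and port number here are hypothetical), the map is ordinary key-value pairs that you pass to the create method along with the input's name and kind:

```java
import java.util.HashMap;
import java.util.Map;

// Build the argument map of key-value pairs (hypothetical values)
Map<String, Object> inputArgs = new HashMap<String, Object>();
inputArgs.put("index", "test_index");
inputArgs.put("sourcetype", "my_sourcetype");

// With the SDK, pass the map when creating the input, for example:
// TcpInput tcpInput = myInputs.create("10000", InputKind.Tcp, inputArgs);
```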

This example shows how to create a monitor data input.

Note: To create and modify inputs, the user's role must include the appropriate capabilities. For a list of available capabilities, see Capabilities.

 

// Get the collection of data inputs
InputCollection myInputs = service.getInputs(); 

// Create a new Monitor data input
String monitor_filepath = "/Applications/splunk/readme-splunk.txt";
MonitorInput monitorInput = myInputs.create(monitor_filepath, InputKind.Monitor);
 

To view and modify the properties of a data input

This example continues from the previous example. It displays the properties of the new monitor input, then modifies a few of them using the setter methods, which set the properties on your local, cached copy of the object. To commit these changes to the server, call the Entity.update method. For more about the properties you can set for different types of data inputs, see Input parameters.

Note: To create and modify inputs, the user's role must include the appropriate capabilities. For a list of available capabilities, see Capabilities.

 

// Retrieve the new input
String testinput = "/Applications/splunk/readme-splunk.txt";
MonitorInput monitorInput = (MonitorInput) service.getInputs().get(testinput);

// Retrieve and display some properties for the new input
System.out.println("Name:      " + monitorInput.getName());
System.out.println("Kind:      " + monitorInput.getKind());
System.out.println("Path:      " + monitorInput.getPath());
System.out.println("Index:     " + monitorInput.getIndex());
System.out.println("Whitelist: " + monitorInput.getWhitelist());

// Modify some properties and update the server
System.out.println("\nSet some properties\n");
monitorInput.setIndex("test_index");
monitorInput.setWhitelist("phonyregex*2");
monitorInput.update();

// Display the changed properties again to show the change
System.out.println("Index:     " + monitorInput.getIndex());
System.out.println("Whitelist: " + monitorInput.getWhitelist());
 

To list indexes

This example shows how to retrieve and list the indexes that have been configured for Splunk Enterprise, along with the number of events contained in each. The collection is also sorted by using some additional parameters.

When retrieving a collection, you can set additional parameters to specify how many entities to return, the sorting order, and so on. For more, see Collection parameters.

// Retrieve the collection of indexes, sorted by number of events
IndexCollectionArgs indexcollArgs = new IndexCollectionArgs();
indexcollArgs.setSortKey("totalEventCount");
indexcollArgs.setSortDirection(IndexCollectionArgs.SortDirection.DESC);
IndexCollection myIndexes = service.getIndexes(indexcollArgs);

// List the indexes and their event counts
System.out.println("There are " + myIndexes.size() + " indexes:\n");
for (Index entity: myIndexes.values()) {
    System.out.println("  " + entity.getName() + " (events: " 
            + entity.getTotalEventCount() + ")");
}
 

To create a new index

When you create an index, all you need to specify is a name. As usual, you can set additional properties when you create the index by also creating an argument map of key-value pairs. Or, you can wait and use the setter methods after the index has been created, as described in the next example.

Note: To create and modify an index, the user's role must include the appropriate capabilities. For a list of available capabilities, see Capabilities.

This example shows how to create a new index.

Note: If you are using a version of Splunk Enterprise earlier than 5.0, you can't delete indexes using the SDK or the REST API—something to be aware of before creating lots of test indexes.

 

//Get the collection of indexes
IndexCollection myIndexes = service.getIndexes();

//Create a new index
Index myIndex = myIndexes.create("test_index");
 

To view and modify the properties of an index

This example shows how to view the properties of the index created in the previous example and modify its properties. For more about the properties available for indexes, see Index parameters.

Note: To create and modify an index, the user's role must include the appropriate capabilities. For a list of available capabilities, see Capabilities.

 

// Retrieve the index that was created earlier
Index myIndex = service.getIndexes().get("test_index");

// Retrieve properties      
System.out.println("Name:                " + myIndex.getName());
System.out.println("Current DB size:     " + myIndex.getCurrentDBSizeMB() + "MB");
System.out.println("Max hot buckets:     " + myIndex.getMaxHotBuckets());
System.out.println("# of hot buckets:    " + myIndex.getNumHotBuckets());
System.out.println("# of warm buckets:   " + myIndex.getNumWarmBuckets());
System.out.println("Max data size:       " + myIndex.getMaxDataSize());
System.out.println("Max total data size: " + myIndex.getMaxTotalDataSizeMB() + "MB");

// Modify a property and update the server
myIndex.setMaxTotalDataSizeMB(myIndex.getMaxTotalDataSizeMB()-1);
myIndex.update();
System.out.println("Max total data size: " + myIndex.getMaxTotalDataSizeMB() + "MB");
 

To clean events from an index

This example shows how to clean an index, which removes the events from it.

// Retrieve the index that was created earlier
Index myIndex = service.getIndexes().get("test_index");

// Clean events from the index, printing the before-and-after size
System.out.println("Current DB size:     " + myIndex.getCurrentDBSizeMB() + "MB");
myIndex.clean(180);
System.out.println("Current DB size:     " + myIndex.getCurrentDBSizeMB() + "MB");
 

To add data directly to an index

There are different ways to add data directly to an index, without configuring a data input. First, retrieve an index using the Index class, and then use one of the following methods:

  • Use the upload method to upload a single file as an event stream for one-time indexing, which corresponds to a oneshot data input. You'll need to specify the path of the file to upload.
  • Use the submit method to send an event over HTTP. You'll need to provide the event as a string, and can specify values to apply to the event (host, source, and sourcetype).
  • Use the attach and attachWith methods to send events over a writeable socket. You can also specify the values to apply to these events (host, source, and sourcetype).

Here is an example of uploading a single file:

// Retrieve the index for the data
Index myIndex = service.getIndexes().get("test_index");

// Specify a file and upload it
String uploadme = "/Applications/splunk/readme-splunk.txt";
myIndex.upload(uploadme);

Here is an example of using the submit method to submit single events directly to the index over HTTP:

// Retrieve the index for the data
Index myIndex = service.getIndexes().get("test_index");

// Specify values to apply to the event
Args eventArgs = new Args();
eventArgs.put("sourcetype", "access_combined.log");
eventArgs.put("host", "local");

// Submit an event over HTTP
myIndex.submit(eventArgs, "This is my HTTP event");

The submit method opens a new connection for each event, so if you have many events to send, this method may not perform well. In this case, use the attach method to get an open socket to Splunk Enterprise that you can write your events to. Here is an example of sending an event (with a timestamp) to a socket opened by the attach method:

// Additional imports
import java.io.*;
import java.net.Socket;
import java.text.SimpleDateFormat;
import java.util.Date;

...

// Retrieve the index for the data
Index myIndex = service.getIndexes().get("test_index");

// Set up a timestamp
SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ssZ");
String date = sdf.format(new Date());

// Open a socket and stream
Socket socket = myIndex.attach();
try {
     OutputStream ostream = socket.getOutputStream();
     Writer out = new OutputStreamWriter(ostream, "UTF8");

     // Send events to the socket then close it
     out.write(date + " Event one!\r\n");
     out.write(date + " Event two!\r\n");
     out.flush();
} finally {
     socket.close();
}

You should always wrap a socket created with attach in a try block and close the socket in a finally clause. Because this is easy to forget, the Splunk SDK for Java provides the attachWith method to handle it for you. The attachWith method takes the body of the try block as an anonymous implementation of the ReceiverBehavior interface, and handles the setup and teardown around it automatically. For example:

// Additional imports
import java.io.*;
import java.net.Socket;
import java.text.SimpleDateFormat;
import java.util.Date;

...

// Retrieve the index
Index myIndex = service.getIndexes().get("test_index");

// Open the socket and stream, set up a timestamp
myIndex.attachWith(new ReceiverBehavior() {
     public void run(OutputStream stream) throws IOException {
         SimpleDateFormat dateFormat = new SimpleDateFormat("yyyy/MM/dd-HH:mm:ss");
         String date = dateFormat.format(new Date());
         String eventText = date + " Boris the mad baboon!\r\n";
         stream.write(eventText.getBytes("UTF8"));
     }
});

Note that any local variables from the outer scope that you use in the block must be declared final.
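To illustrate the rule with plain Java (no Splunk classes involved), here's a minimal sketch: an anonymous inner class, like the ReceiverBehavior implementation above, can only read outer local variables that are declared final.

```java
// Outer locals must be final to be visible inside the anonymous class
final String eventText = "Boris the mad baboon!\r\n";
final StringBuilder sent = new StringBuilder();

Runnable behavior = new Runnable() {
    public void run() {
        // Legal only because eventText and sent are final
        sent.append(eventText);
    }
};
behavior.run();
```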

The attachWith method creates the socket and a stream attached to it, then calls the run method that you implement with that stream. When run finishes, whether from returning or throwing an exception, the socket is closed. Unless you have a good reason to do otherwise, you should use attachWith instead of attach.

 

To add data directly to a TCP input

There are different ways to add data directly to a TCP input. First, retrieve a TCP input using the TCPInput class, and then use one of the following methods to send data:

  • Use the submit method to send a single event over its own connection.
  • Use attachWith to get an open output stream to the TCP input that you can write an arbitrary number of events to.
  • Use attach if you need to get a raw socket to the TCP input without the protections provided by attachWith.

Here is an example of sending a single event to a TCP input with submit:

// Retrieve the input
TcpInput myInput = (TcpInput)service.getInputs().get("10000");

// Send a single event to the input
myInput.submit("This is my event.");

The submit method opens a socket, sends the event, and closes the socket. You can get an open socket directly and write events to it by calling the attach method:

// Additional imports
import java.io.*;
import java.net.Socket;
import java.text.SimpleDateFormat;
import java.util.Date;

...

// Retrieve the input
TcpInput myInput = (TcpInput)service.getInputs().get("10000");

// Open a socket
Socket socket = myInput.attach();

// Wrap the socket in a try block so we can close it in case of an error
try {
     OutputStream ostream = socket.getOutputStream();
     Writer out = new OutputStreamWriter(ostream, "UTF8");

     SimpleDateFormat dateFormat = new SimpleDateFormat("yyyy/MM/dd-HH:mm:ss");
     String date = dateFormat.format(new Date());

     // Send events to the socket then close it
     out.write(date + " Event one!\r\n");
     out.write(date + " Event two!\r\n");
     out.flush();
} finally {
     socket.close();
}

Calling attach directly requires you to manage cleaning up the socket in the presence of errors. Because this is easy to forget, the Splunk SDK for Java provides the attachWith method to handle it for you. The attachWith method takes the body of the try block as an anonymous implementation of the ReceiverBehavior interface, and handles the setup and teardown around it automatically. For example:

// Additional imports
import java.io.*;
import java.text.SimpleDateFormat;
import java.util.Date;

...

// Retrieve the input
TcpInput myInput = (TcpInput)service.getInputs().get("10000");

// Open the socket and stream, set up a timestamp
myInput.attachWith(new ReceiverBehavior() {
     public void run(OutputStream stream) throws IOException {
         SimpleDateFormat dateFormat = new SimpleDateFormat("yyyy/MM/dd-HH:mm:ss");
         String date = dateFormat.format(new Date());
         String eventText = date + " Event sent to the TCP input!\r\n";
         stream.write(eventText.getBytes("UTF8"));
     }
});

Note that any local variables from the outer scope that you use in the block must be declared final.

The attachWith method creates the socket and a stream attached to it, then calls the run method that you implement with that stream. When run finishes, whether from returning or throwing an exception, the socket is closed. Unless you have a good reason to do otherwise, you should use attachWith instead of attach.

 

To add data directly to a UDP input

UDP is an unreliable protocol—you can dispatch data using it, but with no guarantees of arrival. Because there is no notion of a connection, UDP inputs have a single method, submit, for sending events, as shown in the following example:

// Get a UDP input to send to
UdpInput myInput = (UdpInput)service.getInputs().get("9999");

// Send an event via a UDP datagram
myInput.submit("This is my event.");
 

Parameters

Here are the available parameters for inputs and indexes:

 

Collection parameters

By default, all entities are returned when you retrieve a collection. Using the parameters below, you can specify the number of entities to return, how to sort them, and so on. Set these parameters using setters from the CollectionArgs class for inputs, the IndexCollectionArgs class for indexes, or create a generic Args map.

  • count: A number that indicates the maximum number of entities to return.
  • offset: A number that specifies the index of the first entity to return.
  • search: A string that specifies a search expression to filter the response with, matching field values against the search expression. For example, "search=foo" matches any object that has "foo" as a substring in a field, and "search=field_name%3Dfield_value" restricts the match to a single field.
  • sort_dir: An enum value that specifies the sort order. Valid values are "asc" (ascending order) and "desc" (descending order).
  • sort_key: A string that specifies the field to sort by.
  • sort_mode: An enum value that specifies how to sort entities. Valid values are "auto", "alpha" (alphabetically), "alpha_case" (alphabetically, case sensitive), or "num" (numerically).
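As a sketch of how these parameters are passed (the values here are illustrative), you can build them as ordinary key-value pairs; in the SDK, a generic Args map is built the same way and handed to the collection getter, as the earlier sorting example shows with IndexCollectionArgs.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Collection parameters as key-value pairs (illustrative values)
Map<String, Object> listArgs = new LinkedHashMap<String, Object>();
listArgs.put("count", 10);          // return at most 10 entities
listArgs.put("sort_key", "name");   // sort by the name field
listArgs.put("sort_dir", "desc");   // in descending order

// In the SDK, an Args map built this way is passed to a getter such as
// service.getIndexes(...) when retrieving the collection.
```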
 

Input parameters

The properties that are available for each type of data input correspond to the parameters for the data/inputs endpoints in the REST API.

The available parameters for the different types of inputs are summarized below, with the input types to which each parameter applies noted for each:

Parameter

Description

Input type

_rcvbufRead only. A number that specifies the size of the socket receive buffer, in bytes. This parameter is valid for UDP inputs, but is deprecated elsewhere.UPD
_TCP_ROUTINGRead only. A string that contains a list of TCP forwarding groups, as specified in the outputs.conf configuration file.monitor
baselineA Boolean that specifies whether to establish a baseline value for the registry keys (1 means yes, 0 means no).Windows Registry
blacklistA string that specifies a regular expression (regex) for a file path. File paths that match this expression are not indexed.monitor
Bytes IndexedRead only. A number that indicates the total number of bytes that were read and sent to the pipeline for indexing from a oneshot input. This total includes the uncompressed byte count from a source file that is compressed on disk.oneshot
check-indexA Boolean that indicates whether to check the "index" value to ensure that it is the name of a valid index.monitor
check-pathA Boolean that indicates whether to check the "name" value to ensure that it exists.monitor
cipherSuiteRead only. A string that contains a list of acceptable ciphers to use in SSL.TCP ssl
classesA string that contains a valid WMI class name.WMI
connection_hostAn enum that sets the host for the remote server that is sending data. Valid values are "ip" (uses the IP address), "dns" (uses the reverse DNS entry for the IP address), and "none" (uses the host as specified in the inputs.conf configuration file, which is typically the Splunk Enterprise system hostname).TCP cooked, TCP raw, UPD
connectionRead only. A string that contains the IP address and port of the source connecting to this input port.TCP cooked, TCP raw
countersA string that specifies a set of counters to monitor. An asterisk ("*") is equivalent to all counters.Windows perfmon
crc-saltA string that is used to force Splunk Enterprise to index files that have a matching cyclic redundancy check (CRC).monitor
disabledA Boolean that indicates whether a given item (monitoring stanza, input, script, monitor input, or collection) has been disabled.AD, monitor, script, TCP cooked, TCP raw, TCP SSL, Windows Registry, WMI
eai:attributesRead only. A string that contains the metadata for this input.monitor, oneshot, script, TCP cooked, TCP raw, UDP
endtimeRead only. A string that contains the time when the script stopped running.script
fieldsA string that specifies properties (fields) to gather from the given WMI class.WMI
filecountRead only. A number that indicates how many files are being monitored.monitor
followTailA Boolean that indicates whether files that are seen for the first time are read from the end.monitor
groupRead only. A string that contains the OS group of commands. A value of "listenerports" is used for listening ports.script, TCP cooked, TCP raw, UDP
hiveA string that specifies the registry hive under which to monitor for changes.Windows Registry
host_regexA string that specifies a regular expression (regex) to use for extracting a "host" field from the path. If the path matches this regular expression, the captured value is used to populate the "host" field for events from this data input. The regular expression must have one capture group.monitor, oneshot
host_segmentA string that contains the specified slash-separate segment of the file path as the value of the "host" field.monitor, oneshot
hostA string that specifies the host from which the indexer gets event data. This parameter corresponds to the "host" field.oneshot, monitor, script, TCP cooked, TCP raw, UPD
hostsA string that contains a comma-separated list of additional hosts to use for monitoring. The first host should be set with the "lookup_host" parameter, and any additional hosts should be set using this parameter.Windows event log
ignore-older-thanA string that specifies a time value indicating the rolling time window. When the modification time of a file being monitored falls outside of this rolling time window, the file is no longer monitored.monitor
indexA string that specifies the index that stores events from this input.AD, monitor. oneshot, script, TCP raw, UPD, Windows event log, Windows perfmon, Windows Registry, WMI
instancesA string that contains a set of counter instances to monitor. An asterisk ("*") is equivalent to all instances.Windows perfmon, WMI
intervalA string that contains the number of seconds or a cron schedule that specifies the interval at which to run a script, poll performance counters, or query WMI providers.script, Windows perfmon, WMI
logsA string that contains a comma-separated list of event log names from which to gather event data.Windows event log
lookup_hostA string that specifies the host from which to gather event data.Windows event log, WMI
monitorSubnodeA Boolean that indicates whether the Windows Registry input monitors all sub-nodes under a given hive.Windows Registry
monitorSubtreeA Boolean that indicates whether to monitor the subtrees of a given directory tree path (1 means yes, 0 means no).AD
nameA string that specifies the name of the input based on the type:
  • Active Directory: The name of the configuration for a specific domain controller.
  • Monitor: The file or directory path to monitor.
  • Oneshot: The path to the file to index.
  • Script: The name of the script.
  • TCP cooked: The port number of the input.
  • TCP raw: The port number of the input.
  • UDP: The port number of the input.
  • Windows event log: The name of the collection.
  • Windows Perfmon: The name of the collection.
  • Windows Registry: The name of the configuration stanza.
  • WMI: The name of the collection.
AD, monitor, oneshot, script, TCP cooked, TCP raw, TCP SSL, UPD, Windows event log, Windows perfmon, Windows Registry, WMI
no_appending_timestampA Boolean that indicates whether to prevent Splunk Enterprise from prepending a timestamp and hostname to incoming events.UPD
no_priority_strippingA Boolean that indicates whether to prevent Splunk Enterprise from removing the "priority" field from incoming syslog events.UPD
objectA string that specifies a valid performance monitor object (for example, "Process", "Server", or "PhysicalDisk").Windows perfmon
passAuthA string that contains a username specifying the user to run the script under. Splunk Enterprise generates an authorization token for the user and passes it to the script.script
passwordA string that contains the certificate password.TCP SSL
procA string that specifies a regular expression (regex). Changes are only collected for process names that match this expression.Windows Registry
queueAn enum that specifies where to deposit the events that are read by the input processor. Valid values are "parsingQueue" (apply values from the props.conf configuration file and other parsing rules to your data) and "indexQueue" (send your data directly into the index).TCP raw, UPD
rawTcpDoneTimeoutA number that specifies, in seconds, the timeout value for adding a Done key. If a connection over the port specified by name remains idle after receiving data for the specified number of seconds, it adds a Done key, implying the last event has been completely received.TCP raw
recursiveA Boolean that indicates whether to monitor any subdirectories within the data input.monitor
rename-sourceA string that specifies a name for the "source" field for events from this data input. The same source should not be used for multiple data inputs.monitor, oneshot, script
requireClientCertA Boolean that indicates whether a client is required to authenticate.TCP SSL
restrictToHostA string that specifies a host to which incoming connections on this port are restricted.TCP cooked, TCP raw, UPD
rootCAA string that specifies the path to the root certificate authority file.TCP SSL
scriptA string that specifies the path to the script to restart. This path must match an existing scripted input that has already been configured.script
server: A string that specifies a comma-separated list of additional servers to gather data from. Use this parameter when you need to gather data from more than one server. (Applies to: WMI)
serverCert: A string that specifies the full path to the server certificate. (Applies to: TCP SSL)
servername: Read only. A string that specifies the server name of the source connecting to this port. (Applies to: TCP cooked, TCP raw)
Size: Read only. A number that specifies the size of the source file, in bytes. (Applies to: oneshot)
source: A string that specifies the source for this input, which corresponds to the "source" field. The same source should not be used for multiple data inputs. (Applies to: script, TCP raw, UDP, Windows perfmon)
Sources Indexed: Read only. Indicates the number of sources read from a file in a compressed format, such as TAR or ZIP. A value of 0 means the source file was not compressed. (Applies to: oneshot)
sourcetype: A string that specifies the source type for events from this input, which corresponds to the "sourcetype" field. The source type of an event is the format of the data input from which it originates, such as access_combined or cisco_syslog. The source type also determines how Splunk Enterprise formats your data. (Applies to: monitor, oneshot, script, TCP raw, UDP, Windows perfmon)
Spool Time: Read only. A string that specifies the time the request was made to read the source file. (Applies to: oneshot)
SSL: A Boolean that indicates whether SSL is configured. (Applies to: TCP cooked, TCP raw)
startingNode: A string that specifies where in the Active Directory tree to start monitoring. If a value is not specified, Splunk Enterprise attempts to start monitoring at the root of the directory tree. (Applies to: AD)
starttime: Read only. A string that specifies the time when the script was run. (Applies to: script)
targetDc: A string that specifies the fully qualified domain name of a valid, network-accessible domain controller. If a value is not specified, Splunk Enterprise obtains the domain controller from the local computer. (Applies to: AD)
time-before-close: A number that specifies the minimum number of seconds to keep a file open after Splunk Enterprise reaches the end of the file while reading it. After this period has elapsed, the file is checked again for more data. (Applies to: monitor)
type: A string that specifies a regular expression (regex) for the types of registry events to monitor. (Applies to: Windows Registry)
whitelist: A string that specifies a regular expression (regex) for a file path. Only file paths that match this expression are indexed. (Applies to: monitor)
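The "whitelist" parameter above is a regular expression matched against candidate file paths. Here is a minimal, dependency-free Java sketch of that gating logic; the pattern and paths are illustrative, not taken from any real deployment:

```java
import java.util.List;
import java.util.regex.Pattern;

public class WhitelistDemo {
    // A monitor input's "whitelist" parameter is a regex over file paths;
    // only paths that match the expression are indexed.
    static boolean indexed(Pattern whitelist, String path) {
        return whitelist.matcher(path).find();
    }

    public static void main(String[] args) {
        Pattern whitelist = Pattern.compile("\\.log$");  // index only files ending in .log
        for (String p : List.of("/var/log/app.log", "/var/log/app.pid", "/tmp/trace.log")) {
            System.out.println(p + " indexed=" + indexed(whitelist, p));
        }
    }
}
```

If a blacklist regex is also configured (see the head of this topic), it excludes paths in the same way; a path must pass the whitelist and not match the blacklist to be indexed.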
 

Index parameters

The parameters you can use for working with indexes correspond to the parameters for the data/indexes endpoint in the REST API.
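In practice, these parameters are passed as key-value arguments when you create or update an index, whether through the SDK or through a raw POST to the data/indexes endpoint. The following is a minimal, dependency-free sketch of assembling such an argument map; the parameter names are real, but the values chosen are illustrative:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class IndexUpdateArgs {
    // Builds a key-value argument map for an index update. The keys are
    // writable parameters from the table below; the values are examples only.
    static Map<String, Object> buildArgs() {
        Map<String, Object> update = new LinkedHashMap<>();
        update.put("maxTotalDataSizeMB", 500000);        // freeze the oldest data once the index passes ~488 GB
        update.put("maxDataSize", "auto");               // let Splunk Enterprise auto-tune the hot bucket size
        update.put("frozenTimePeriodInSecs", 15552000);  // roll indexed data to frozen after 180 days
        return update;
    }

    public static void main(String[] args) {
        buildArgs().forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

Read-only parameters (those whose descriptions below begin with "Read only" or that report counts and timestamps, such as "totalEventCount") are returned by the endpoint but cannot be included in an update.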

The following parameters are available for indexes:

assureUTF8: A Boolean that indicates whether all data retrieved from the index is in proper UTF-8 encoding. When true, indexing performance is reduced. This setting is global, not per index.
blockSignatureDatabase: A string that specifies the name of the index that stores block signatures of events. This setting is global, not per index.
blockSignSize: A number that indicates how many events make up a block for block signatures. A value of 0 means block signing has been disabled for this index.
bloomfilterTotalSizeKB: A number that indicates the total size of all bloom filter files, in KB.
bucketRebuildMemoryHint: A string that contains a suggestion for the Splunk Enterprise bucket rebuild process for the size of the time-series index (tsidx) file to make.
coldPath: A string that contains the file path to the cold databases for the index.
coldPath_expanded: A string that contains an absolute path to the cold databases for the index.
coldToFrozenDir: A string that contains the destination path for the frozen archive. Use as an alternative to the "coldToFrozenScript" parameter. The "coldToFrozenDir" parameter takes precedence over "coldToFrozenScript" if both are specified.
coldToFrozenScript: A string that contains the destination path to the archiving script. If your script requires a program to run it (for example, python), specify the program followed by the path. The script must be in $SPLUNK_HOME/bin or one of its subdirectories.
compressRawdata: This parameter is ignored.
currentDBSizeMB: A number that indicates the total size of data stored in the index, in MB. This total includes data in the home, cold, and thawed paths.
defaultDatabase: A string that contains the index destination, which is used when index destination information is not available in the input data.
disabled: A Boolean that indicates whether the index has been disabled.
eai:acl: A string that contains the access control list for this index.
eai:attributes: A string that contains the metadata for this index.
enableOnlineBucketRepair: A Boolean that indicates whether to enable asynchronous online fsck bucket repair, which runs in a process concurrent with Splunk Enterprise. When enabled, you do not have to wait until buckets are repaired to start Splunk Enterprise, but you might observe a slight performance degradation.
enableRealtimeSearch: A Boolean that indicates whether real-time search is enabled. This setting is global, not per index.
frozenTimePeriodInSecs: A number that indicates the number of seconds after which indexed data rolls to frozen.
homePath: A string that contains the file path to the hot and warm buckets for the index.
homePath_expanded: A string that contains the absolute file path to the hot and warm buckets for the index.
indexThreads: A number that indicates how many threads are used for indexing. This setting is global, not per index.
isInternal: A Boolean that indicates whether the index is internal.
lastInitTime: A string that contains the last time the index processor was successfully initialized. This setting is global, not per index.
maxBloomBackfillBucketAge: A string that indicates the maximum age of a bucket. If a warm or cold bucket is older than this time, Splunk Enterprise does not create (or re-create) its bloom filter. The valid format is a number followed by a time unit ("s", "m", "h", or "d"), for example "5d".
maxConcurrentOptimizes: A number that indicates how many concurrent optimize processes can run against a hot bucket.
maxDataSize: A string that indicates the maximum size a hot bucket can reach before a roll to warm is triggered. The valid format is a number in MB, "auto" (Splunk Enterprise auto-tunes this value, setting the size to 750 MB), or "auto_high_volume" (for high-volume indexes such as the main index, setting the size to 10 GB on 64-bit systems and 1 GB on 32-bit systems).
maxHotBuckets: A number that indicates the maximum number of hot buckets that can exist per index. When this value is exceeded, Splunk Enterprise rolls the least recently used (LRU) hot bucket to warm. Both normal hot buckets and quarantined hot buckets count towards this total. This setting operates independently of "maxHotIdleSecs", which can also cause hot buckets to roll.
maxHotIdleSecs: A number that indicates the maximum life, in seconds, of a hot bucket. When this value is exceeded, Splunk Enterprise rolls the hot bucket to warm. This setting operates independently of "maxHotBuckets", which can also cause hot buckets to roll. A value of 0 turns off the idle check.
maxHotSpanSecs: A number that indicates the upper bound, in seconds, of the target maximum timespan of hot and warm buckets. If this value is set too small, you can get an explosion of hot and warm buckets in the file system.
maxMemMB: A number that indicates the amount of memory, in MB, that is allocated for indexing.
maxMetaEntries: A number that indicates the maximum number of unique lines in .data files in a bucket, which may help to reduce memory consumption. When set to 0, this parameter is ignored. When this value is exceeded, a hot bucket is rolled to prevent further increase.
maxRunningProcessGroups: A number that indicates the maximum number of processes that the indexer creates at a time. This setting is global, not per index.
maxTime: A string that contains the UNIX timestamp of the newest event time in the index.
maxTimeUnreplicatedNoAcks: A number that specifies the upper limit, in seconds, on how long an event can remain in a raw slice. This value applies only when replication is enabled for this index.
maxTimeUnreplicatedWithAcks: A number that specifies the upper limit, in seconds, on how long events can remain unacknowledged in a raw slice. This value applies only when acknowledgments are enabled on forwarders and replication is enabled (with clustering).
maxTotalDataSizeMB: A number that indicates the maximum size of an index, in MB. If an index grows larger than the maximum size, the oldest data is frozen.
maxWarmDBCount: A number that indicates the maximum number of warm buckets. If this number is exceeded, the warm buckets with the lowest values for their latest times are moved to cold.
memPoolMB: A number that indicates how much memory is given to the indexer memory pool. This setting is global, not per index.
minRawFileSyncSecs: A string that indicates how frequently splunkd forces a file system sync while compressing journal slices. This value can be either an integer or "disable". If set to 0, splunkd forces a file system sync after every slice finishes compressing. If set to "disable", syncing is disabled and uncompressed slices are removed as soon as compression is complete. Otherwise, during the specified interval, uncompressed slices are left on disk even after they are compressed; splunkd then forces a file system sync of the compressed journal and removes the accumulated uncompressed files. Some file systems are very inefficient at performing sync operations, so enable this only if you are sure it is needed.
minTime: A string that contains the UNIX timestamp of the oldest event time in the index.
name: A string that contains the name of the index.
numBloomfilters: A number that indicates how many bloom filters are created for this index.
numHotBuckets: A number that indicates how many hot buckets are created for this index.
numWarmBuckets: A number that indicates how many warm buckets are created for this index.
partialServiceMetaPeriod: A number that indicates how often to sync metadata, in seconds, but only for records where the sync can be done efficiently in place, without requiring a full rewrite of the metadata file. Records that require a full rewrite are synced at the frequency specified by "serviceMetaPeriod". When set to 0 or to a value greater than "serviceMetaPeriod", metadata is not partially synced, but is synced at the frequency specified by "serviceMetaPeriod".
quarantineFutureSecs: A number that indicates a time, in seconds. Events with a timestamp that is more than this many seconds newer than "now" are dropped into a quarantine bucket. This mechanism prevents the main hot buckets from being polluted with fringe events.
quarantinePastSecs: A number that indicates a time, in seconds. Events with a timestamp that is more than this many seconds older than "now" are dropped into a quarantine bucket. This mechanism prevents the main hot buckets from being polluted with fringe events.
rawChunkSizeBytes: A number that indicates the target uncompressed size, in bytes, for an individual raw slice in the raw data journal of the index. If set to 0, "rawChunkSizeBytes" is set to the default value. Note that this value specifies a target chunk size; the actual chunk size may be slightly larger by an amount proportional to an individual event size.
repFactor: A string that contains the replication factor, which is a non-negative number or "auto". This value applies only to Splunk Enterprise cluster slaves.
rotatePeriodInSecs: A number that indicates how frequently, in seconds, to check whether a new hot bucket needs to be created, and how frequently to check whether any warm or cold buckets should be rolled or frozen.
serviceMetaPeriod: A number that indicates how frequently metadata is synced to disk, in seconds.
summarize: A Boolean that indicates whether to omit certain index details to provide a faster response. This parameter is used only when retrieving the index collection.
suppressBannerList: A string that contains a list of indexes for which to suppress "index missing" warning banner messages. This setting is global, not per index.
sync: A number that indicates how many events can trigger the indexer to sync events. This setting is global, not per index.
syncMeta: A Boolean that indicates whether to call a sync operation before the file descriptor is closed on metadata file updates.
thawedPath: A string that contains the file path to the thawed (resurrected) databases for the index.
thawedPath_expanded: A string that contains the absolute file path to the thawed (resurrected) databases for the index.
throttleCheckPeriod: A number that indicates how frequently, in seconds, Splunk Enterprise checks for the index throttling condition.
totalEventCount: A number that indicates the total number of events in the index.
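To make the interaction between "quarantineFutureSecs" and "quarantinePastSecs" concrete, the sketch below reproduces the quarantine decision in plain Java. The defaults shown in the comments (30 days into the future, 900 days into the past) are assumptions based on typical Splunk Enterprise defaults; check the indexes.conf reference for your version:

```java
public class QuarantineDemo {
    // An event goes to a quarantine bucket, rather than a main hot bucket,
    // when its timestamp is more than quarantineFutureSecs newer than "now"
    // or more than quarantinePastSecs older than "now".
    static boolean quarantined(long eventTime, long now, long futureSecs, long pastSecs) {
        return eventTime > now + futureSecs || eventTime < now - pastSecs;
    }

    public static void main(String[] args) {
        long now = 1_700_000_000L;       // illustrative "now" (UNIX seconds)
        long futureSecs = 2_592_000L;    // 30 days (assumed default)
        long pastSecs = 77_760_000L;     // 900 days (assumed default)
        // A timestamp ~35 days in the future is quarantined; yesterday's event is not.
        System.out.println(quarantined(now + 3_000_000L, now, futureSecs, pastSecs));
        System.out.println(quarantined(now - 86_400L, now, futureSecs, pastSecs));
    }
}
```

Quarantined events are still indexed and searchable; keeping them out of the main hot buckets simply prevents a few fringe timestamps from stretching a bucket's timespan (compare "maxHotSpanSecs" above).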