Format events for HTTP Event Collector

HTTP Event Collector (HEC) receives events from clients in a series of HTTP requests. Each request can contain the following:

  • HEC token: A valid HEC token in an authorization header or query string.
  • Channel identifier header: For raw events only, a header field set to a unique channel ID.
  • Event metadata: One or more JSON key-value pairs.
  • Event data: Either a raw string, or contained within the value of the "event" key. Event data can come in batches, one event after another within a single HTTP request, or it can be sent with a single event per request. Single events can't be split and sent across multiple requests—that is, each event must be contained within one request. Non-raw event data—or event data that is contained within the value of the "event" key—can be formatted as a string, a number, another JSON object, and so on.
Note: Support for parsing raw event text is available in Splunk Enterprise 6.4.0 and later, Splunk Light 6.4.0 and later, and in the current releases of Splunk Cloud and Splunk Light Cloud.

HEC token

Before HTTP Event Collector will accept your data for indexing, you must authenticate to the Splunk server on which it's running. You do this using the token you generate when you create a new HEC input. When you use the token management endpoint on the Splunk server to generate a token, it generates the token in the form of a GUID. This guarantees that the token is unique.

You have several ways to authenticate to the server:

HTTP authentication

Place the token in the authorization header of each HTTP request as follows:

"Authorization: Splunk <hec_token>" 

In context:

curl -k -H "Authorization: Splunk 12345678-1234-1234-1234-1234567890AB" https://mysplunkserver.example.com:8088/services/collector/event -d '{"sourcetype": "mysourcetype", "event": "http auth ftw!"}'
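The same header can be set from code. The following is a minimal Python sketch, using only the standard library, that builds (but does not send) an equivalent request. The helper name build_hec_request is hypothetical, and the URL and token are the placeholder values from the cURL example above.

```python
import json
import urllib.request


def build_hec_request(url, token, payload):
    """Build a POST request that authenticates to HEC with the token header."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Authorization": f"Splunk {token}"},
        method="POST",
    )


request = build_hec_request(
    "https://mysplunkserver.example.com:8088/services/collector/event",
    "12345678-1234-1234-1234-1234567890AB",
    {"sourcetype": "mysourcetype", "event": "http auth ftw!"},
)
print(request.get_header("Authorization"))
```

To actually send the request, you would pass it to urllib.request.urlopen. Unlike curl -k, urlopen verifies the server certificate by default.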

Basic authentication

Include a colon-separated user/password pair in the request after -u, inserting the HEC token as the <password>: "<user>:<password>". The <user> can be any string.

For example:

-u "x:<hec_token>"

In context:

curl -k -u "x:12345678-1234-1234-1234-1234567890AB" https://mysplunkserver.example.com:8088/services/collector/event -d '{"sourcetype": "mysourcetype", "event": "basic auth ftw!"}'
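For clients that do not have an equivalent of the -u option, the basic authentication header can be built by hand. The following Python sketch shows the standard HTTP basic authentication encoding (Base64 of "<user>:<password>"); the helper name basic_auth_header is hypothetical, and "x" is the arbitrary user string from the example above.

```python
import base64


def basic_auth_header(token, user="x"):
    """Return an Authorization header value with the HEC token as the password."""
    credentials = f"{user}:{token}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


header = basic_auth_header("12345678-1234-1234-1234-1234567890AB")
print(header)
```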

Query string (Splunk Cloud only)

Specify the HEC token as a query string parameter in the URL of your requests to HEC. For example:

?token=<hec_token>

In context:

curl -k "https://mysplunkserver.example.com:8088/services/collector/event?token=12345678-1234-1234-1234-1234567890AB" -d '{"sourcetype": "mysourcetype", "event": "query string ftw!"}'

You must also enable query string authentication on a per-token basis. To do so, ask Splunk Support to edit the file at $SPLUNK_HOME/etc/apps/splunk_httpinput/local/inputs.conf on your Splunk server. Your tokens are listed by name in this file, in the form [http://<token_name>].

Within the stanza for each token for which you want to enable query string authentication, add the following setting (or change the existing setting, if applicable):

allowQueryStringAuth = true

Save and close the inputs.conf file.

Note: For Splunk Cloud, you must open a Splunk Support ticket to set allowQueryStringAuth to true. Support for a UI toggle for this setting is planned for a future release.

Channel identifier header

If your request includes raw events, you must include an X-Splunk-Request-Channel header field in the request, and it must be set to a unique channel identifier (a GUID). Following is an example of a cURL statement that constitutes a valid request:

curl https://http-inputs-<customer>.splunkcloud.com/services/collector/raw  -H "X-Splunk-Request-Channel: FE0ECFAD-13D5-401B-847D-77833BD77131" -H "Authorization: Splunk BD274822-96AA-4DA6-90EC-18940FB2414C" -d '<raw data string>' -v

Alternatively, you can send the channel identifier as a URL query parameter named channel instead of as a header field, as shown here:

curl "https://http-inputs-<customer>.splunkcloud.com/services/collector/raw?channel=FE0ECFAD-13D5-401B-847D-77833BD77131" -H "Authorization: Splunk BD274822-96AA-4DA6-90EC-18940FB2414C" -d '<raw data string>' -v

Note: If the token with which you are authenticating to HTTP Event Collector has indexer acknowledgement enabled, you must also include the channel identifier with your indexer status query. For more information, see Enable indexer acknowledgement.
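A client needs some way to produce the channel GUID. The following Python sketch generates one with the standard uuid module and pairs it with the authorization header. The helper name raw_request_headers is hypothetical, and the uppercase formatting only mirrors the examples above; this topic does not say whether case matters.

```python
import uuid


def raw_request_headers(token, channel=None):
    """Return headers for a raw-endpoint request, creating a channel ID if none is given."""
    if channel is None:
        # uuid4 produces a random GUID; uppercase matches the style of the cURL examples.
        channel = str(uuid.uuid4()).upper()
    return {
        "Authorization": f"Splunk {token}",
        "X-Splunk-Request-Channel": channel,
    }


headers = raw_request_headers("BD274822-96AA-4DA6-90EC-18940FB2414C")
print(headers["X-Splunk-Request-Channel"])
```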

Event metadata

This section describes the keys that can be included in event metadata. These keys are all optional. Any key-value pairs that are not included in the event will be set to values defined for the token on the Splunk server.

  • "time": The event time. The default time format is epoch time, in the format <sec>.<ms>. For example, 1433188255.500 indicates 1433188255 seconds and 500 milliseconds after epoch, or Monday, June 1, 2015, at 7:50:55 PM GMT.
  • "host": The host value to assign to the event data. This is typically the hostname of the client from which you're sending data.
  • "source": The source value to assign to the event data. For example, if you're sending data from an app you're developing, you could set this key to the name of the app.
  • "sourcetype": The sourcetype value to assign to the event data.
  • "index": The name of the index in which to index the event data. The index you specify here must be within the list of allowed indexes if the token has the indexes parameter set.
  • "fields": (Not applicable to raw data.) Specifies a JSON object that contains explicit custom fields to be defined at index time. Requests containing the "fields" property must be sent to the /collector/event endpoint, or they will not be indexed. For more information, see Indexed field extractions.

With raw events, you can configure metadata at the global level (all tokens), at the token level, and at the request level using the query string. Metadata specified within a request will apply to all events that are extracted from the request.
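For JSON events sent to the /collector/event endpoint, the metadata keys described in this section sit alongside the "event" key in one object. The following Python sketch assembles such an object and converts a datetime to the <sec>.<ms> epoch format. The helper name build_event and its parameters are hypothetical and not part of any Splunk library.

```python
import json
from datetime import datetime, timezone

# The optional metadata keys described in this section (other than "time").
ALLOWED_KEYS = {"host", "source", "sourcetype", "index", "fields"}


def build_event(event, moment=None, **metadata):
    """Return an event object with optional "time" and metadata keys."""
    envelope = {"event": event}
    if moment is not None:
        # Epoch seconds, rounded to millisecond precision.
        envelope["time"] = round(moment.timestamp(), 3)
    for key, value in metadata.items():
        if key not in ALLOWED_KEYS:
            raise ValueError(f"unknown metadata key: {key}")
        envelope[key] = value
    return envelope


envelope = build_event(
    "Something happened",
    moment=datetime(2015, 6, 1, 19, 50, 55, 500000, tzinfo=timezone.utc),
    host="dataserver992.example.com",
    fields={"severity": "INFO"},
)
print(json.dumps(envelope))
```

Any key left out of the object falls back to the value defined for the token, as described above.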


Event data

Event data can be assigned to the "event" key within the JSON object in the HTTP request, or it can be raw text. The "event" key is at the same level within the JSON event packet as the metadata keys. The value of the "event" key can be in whatever format you want: a string, a number, another JSON object, and so on.

You can batch multiple events in one event packet by combining them within the request. By doing this, you are specifying that any event metadata within the request applies to all of the events contained in the request. Batching events can significantly improve performance when you need to index large quantities of data.

Examples

Following is an example of properly formatted event metadata and event data (a string) contained within a JSON object. The "time" value is in epoch time:

{
    "time": 1426279439,
    "host": "localhost",
    "source": "datasource",
    "sourcetype": "txt",
    "index": "main",
    "event": "Hello world!"
}

Here's an example of including a JSON object as the event data within a properly-formatted event:

{
    "time": 1437522387,
    "host": "dataserver992.example.com",
    "source": "testapp",
    "event": { 
        "message": "Something happened",
        "severity": "INFO"
    }
}

Here is an example of batched data. The batch protocol for HTTP Event Collector is simply event objects stacked one after the other as shown here, and not in a JSON array. Note that these events, though they only contain the "event" and "time" keys, are still valid:

{
  "event":"event 1", 
  "time": 1447828325
}

{
  "event":"event 2", 
  "time": 1447828326
}
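The stacked format above is not a single JSON document, so a client cannot produce it by serializing a list. The following Python sketch serializes each event object separately and joins the results with newlines. The helper name build_batch is hypothetical.

```python
import json


def build_batch(events):
    """Serialize event objects one after another, with no enclosing JSON array."""
    return "\n".join(json.dumps(event) for event in events)


payload = build_batch([
    {"event": "event 1", "time": 1447828325},
    {"event": "event 2", "time": 1447828326},
])
print(payload)
```

The resulting string is the body of a single request to the /services/collector/event endpoint.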

The following example is a simple "Hello, World!" cURL statement that includes the auth header, a destination endpoint, and very simple event data. Note that the request is going to the /services/collector/event endpoint, which is where all JSON-formatted event requests must go:

curl -k -H "Authorization: Splunk 12345678-1234-1234-1234-1234567890AB" https://localhost:8088/services/collector/event -d '{"event":"hello world"}'

The following example cURL statement demonstrates sending raw event data. Note the addition of the channel ID, which is required when sending raw event data. Also, the request is going to the /services/collector/raw endpoint, which is where all raw event requests should go:

curl -k http://localhost:8088/services/collector/raw -H 'Authorization: Splunk B5A79AAD-D822-46CC-80D1-819F80D7BFB0' -H 'x-splunk-request-channel: 18654C68-B28B-4450-9CF0-6E7645CA60CA' -d 'hello world'

Alternatively, this example cURL statement passes the channel ID as a URL parameter:

curl -k "http://localhost:8088/services/collector/raw?channel=18654C68-B28B-4450-9CF0-6E7645CA60CA" -H 'Authorization: Splunk B5A79AAD-D822-46CC-80D1-819F80D7BFB0' -d 'hello world'

Event parsing

The HTTP Event Collector endpoint extracts the events from the HTTP request and parses them before sending them to indexers. Because the event data formats described in this topic are predetermined, Splunk Enterprise can parse your data quickly and then send it to be indexed. This results in improved data throughput and reduced event processing time compared to other methods of getting data in.

You can configure extraction rules in the props.conf file. To learn more, see Configure rule-based source type recognition in the Splunk Enterprise Getting Data In manual.

Raw event parsing

Available in Splunk Enterprise 6.4.0 and later, Splunk Light 6.4.0 and later, and the current releases of Splunk Cloud and Splunk Light Cloud.

HTTP Event Collector can parse raw text and extract one or more events. HEC expects that the HTTP request contains one or more events with line-breaking rules in effect. Once HEC accepts the request, it passes its events into the pipeline, which extracts fields such as timestamps. HEC uses a line-breaking strategy that is based on the timestamp, but you can override it by setting a sourcetype in the props.conf file.

Events must be contained within a single HTTP request. They cannot span multiple requests.
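A client that sends many raw events therefore has to split its data on event boundaries, never in the middle of an event. The following Python sketch groups whole events into request bodies under a size budget, assuming each string in the input is one complete event. The helper name chunk_events and the max_bytes limit are hypothetical; this topic does not state a request size limit.

```python
def chunk_events(events, max_bytes):
    """Group whole events into newline-joined request bodies of about max_bytes or less."""
    bodies = []
    current = []
    current_size = 0
    for event in events:
        size = len(event.encode("utf-8")) + 1  # +1 for the newline separator
        if current and current_size + size > max_bytes:
            bodies.append("\n".join(current))
            current, current_size = [], 0
        # An event larger than max_bytes still goes out whole, in its own request.
        current.append(event)
        current_size += size
    if current:
        bodies.append("\n".join(current))
    return bodies


bodies = chunk_events(["aaaa", "bbbb", "cccc"], max_bytes=10)
print(bodies)
```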

To send raw events, use the services/collector/raw endpoint.

This endpoint requires an additional X-Splunk-Request-Channel header field, which you must set to a unique channel identifier (a GUID). You must include a channel identifier with each HTTP request that contains raw events. The following is an example of a cURL statement that constitutes a valid request:

curl https://http-inputs-<customer>.splunkcloud.com/services/collector/raw  -H "X-Splunk-Request-Channel: FE0ECFAD-13D5-401B-847D-77833BD77131" -H "Authorization: Splunk BD274822-96AA-4DA6-90EC-18940FB2414C" -d '<raw data string>' -v

Alternatively, you can send the channel identifier as a URL query parameter named channel instead of as a header field, as shown here:

curl "https://http-inputs-<customer>.splunkcloud.com/services/collector/raw?channel=FE0ECFAD-13D5-401B-847D-77833BD77131" -H "Authorization: Splunk BD274822-96AA-4DA6-90EC-18940FB2414C" -d '<raw data string>' -v

Note: If the token with which you are authenticating to HTTP Event Collector has indexer acknowledgement enabled, you must also include the channel identifier with your indexer status query. For more information, see the following section, "Indexer acknowledgement," or Enable indexer acknowledgement.

With raw events, you can configure metadata at the global level (all tokens), at the token level, and at the request level using the query string. Metadata specified within a request will apply to all events that are extracted from the request.
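Request-level metadata for raw events travels in the query string, next to the channel parameter. The following Python sketch builds such a URL. The helper name raw_endpoint_url is hypothetical, and the parameter names sourcetype and host are an assumption that the query string uses the same names as the metadata keys in "Event metadata"; check them against the REST API reference before relying on them.

```python
from urllib.parse import urlencode


def raw_endpoint_url(base_url, channel, **metadata):
    """Return a raw-endpoint URL with the channel ID and metadata as query parameters."""
    params = {"channel": channel, **metadata}
    return f"{base_url}/services/collector/raw?{urlencode(params)}"


url = raw_endpoint_url(
    "http://localhost:8088",
    "18654C68-B28B-4450-9CF0-6E7645CA60CA",
    sourcetype="mysourcetype",  # assumed parameter name
    host="web01",               # assumed parameter name
)
print(url)
```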

Timestamp extraction rules are enabled at the sourcetype level. Most common timestamp formats are recognized (for example, the "current-time" key), but if no timestamp can be extracted, one is assigned based on the current time. For other metadata, you can configure extraction rules in the props.conf file.

For more examples of cURL requests to services/collector/raw, see Input endpoint examples in the Splunk Enterprise REST API Reference Manual.

For more information about channels, see "About channels and sending data" in the Enable indexer acknowledgement topic.