Working with data: Where it comes from and how we manage it

The focus of this chapter is on how the apps access the data they use, and how we can manage the data.

When working with data in the two apps, our developers make use of their knowledge of Splunk® simple XML, Splunk search processing language (SPL™), and JavaScript. These all help in understanding how to make use of the data that Splunk Enterprise retrieves from the various sources.

Getting data from the Auth0 service into Splunk Enterprise using a modular input

One of the first issues to address for the Auth0 app is how best to get the information collected by the Auth0 service into the Splunk app where it can be visualized and analyzed in real time. Both Auth0 and Splunk Enterprise are hosted services, so to enable viewing of real-time data from Auth0 in a Splunk app we must have some mechanism for transferring event data over the network. The initial approach we explored was to push data whenever anything interesting happened in the Auth0 service to an endpoint in Splunk Enterprise. Splunk Enterprise would then be able to index the incoming data and make it available to a dashboard in the Auth0 app. However, we identified the following potential issues with this solution:

  • It is not robust. If the Splunk instance is not listening, or there are connectivity problems, then event data from Auth0 is lost.
  • It is not efficient. Because of the way that the Auth0 service works internally, there is no opportunity for batching event data to send to the Splunk instance. For every interesting event in Auth0, the Auth0 service must make an HTTP call to the Splunk instance.
  • It is not complete. Again, because of the way the Auth0 service works internally, it is only possible to push some event types to the Splunk instance. For example, the Auth0 service can send event data relating to successful logins, but not failures.
  • It has limited reporting capabilities as you cannot get historical information. You can only pick up the new events that are sent to your Splunk instance.
  • It is awkward to configure. You may need to configure firewall rules to allow the Auth0 service to push data to your Splunk instance.
  • There is no easy deployment model. You need to add code to the Auth0 service to push data into your Splunk service.

However, the Auth0 service generates its own complete log files that contain all the detailed event data that the Splunk app requires. Copying log files on a schedule from the Auth0 environment to the Splunk Enterprise environment would not meet the requirement for real-time information in the Splunk Enterprise dashboard, but if the Splunk instance could request data from the Auth0 service every couple of seconds, the dashboard could display sufficiently up-to-date information about all the events generated in the Auth0 service. This approach would be robust, because the Splunk instance could keep track of the most recent event it received and resubmit a request if it failed to receive the data. It would be more efficient, because the Splunk instance could request data in batches. It would also be easier to configure, because the Splunk instance calls a public endpoint in the Auth0 service to request data, so no inbound firewall rules are needed.

The model we chose also makes it easy for customers to deploy our solution because we can publish it as an app on the Splunkbase site (http://dev.splunk.com/goto/auth0app). To implement this solution, we made changes to the Auth0 service and created a modular input using the Splunk SDK for JavaScript in the Splunk app (for more information, see "How to work with modular inputs in the Splunk SDK for JavaScript"). Splunk Enterprise also supports scripted inputs as an alternative to modular inputs. Scripted inputs are easier to write; however, they are more complex to deploy and manage, especially if you need to support multiple operating systems. Scripted inputs also have limitations compared to modular inputs. For example, scripted inputs:

  • Do not support passing arguments in Splunk Web (which we require to pass in the Auth0 credentials).
  • Do not provide validation feedback when you configure them.
  • Do not support multiple instances (you would need two copies of the script if you had two Auth0 installations).
  • Are less well integrated with Splunk Enterprise's internal logging.
ARCH: Although Auth0 was new to Splunk Enterprise, it took Auth0 just two weeks to get their basic Splunk app up and running along with making the necessary changes to their API.

DEV: Modular input configuration parameters can be managed through the Splunk REST API, which is really useful.

Changes to the Auth0 service

Our first challenge was how to retrieve data from the Auth0 service. We wanted to enable a modular input to continuously poll for data, but the Auth0 service itself did not have a suitable API. We determined that the best option was to create a new REST API in the Auth0 service that enables a client (such as a Splunk instance) to request the event data. This new API takes two parameters: take specifies the maximum number of log entries to return, and from specifies the log entry from which to start reading. This was a simple change to make in the Auth0 service and did not have an impact on any other features.

ARCH: Auth0 uses MongoDB to store log data from the Auth0 service. Because MongoDB lets them assign incrementing IDs to log entries as they are written, it's easy to implement an API that reads a sequential set of log entries starting from a specified log entry.

Creating a modular input

A modular input enables us to add a new type of custom input to our Auth0 Splunk application that behaves like one of the native input types. A user can interactively create and update the custom inputs using Splunk Enterprise, just as they do for native inputs (this would not be possible with a scripted input). The following screenshot shows the new custom Auth0 input type in the list of available input types in the Splunk Enterprise UI. This new Auth0 input type is defined in the server.js script that is discussed later in this section:

The next screenshot shows the Auth0 input type UI requesting details of the Auth0 service to which to connect:

We chose to implement this modular input using the Splunk SDK for JavaScript because the team working on this app are experienced users of Node.js, which is also a cross-platform development tool. It is just as easy to build a modular input using one of the other Splunk SDKs in a language of your choice. When you create a modular input using Node.js, you define the input in a Node module that exports a standard set of functions. The following code snippets come from the server.js script in the bin\app folder in the app (there are also scripts called auth0.cmd and auth0.sh in the bin folder that launch the server.js script at startup in Windows and Linux environments).

ARCH: Modular inputs are an alternative to scripted inputs. Where scripted inputs are quick and easy to implement, they may not be easy for an end user to use. Modular inputs require more upfront work by the developers, but are much easier for end users to use.

To implement a modular input, you must define a Scheme instance, which tells Splunk Enterprise about the arguments that a user configuring this input must provide. You then provide any optional validation logic for those arguments, as well as the logic for streaming events back to Splunk Enterprise. The Auth0 input requires the user to provide credential information; it then connects to the Auth0 service to validate those credentials, and connects again to begin retrieving the data that it streams into Splunk Enterprise. As you can see from the require calls below, the modular input relies on the Splunk SDK for JavaScript for the modular input infrastructure, as well as on the Auth0 SDK for communicating with the Auth0 service:

(function() { 
   var fs              = require('fs'); 
   var path            = require('path'); 
   var splunkjs        = require('splunk-sdk'); 
   var Auth0           = require('auth0');
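   // (Not shown: the rest of the script's setup, which aliases the names used
   // below, such as Scheme, Argument, Event, and Logger from the SDK's
   // ModularInputs module and Async from the SDK itself.)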

The first section of the server.js script defines the Scheme instance that Splunk Enterprise uses to display the configuration UI for this modular input. As you can see, we export a getScheme function. The Scheme instance describes the input and declares its arguments. Notice how we set the property useSingleInstance to false, which causes the UI to display an optional Interval parameter that lets a user specify how frequently the script should run. In this case, the parameter determines the polling interval for checking with the Auth0 service for new log data. For more information about creating modular inputs using JavaScript, see "How to work with modular inputs in the Splunk SDK for JavaScript."

exports.getScheme = function () {
  var scheme = new Scheme('Auth0');

  scheme.description = 'Streams events of logs in the specified Auth0 account.';
  scheme.useExternalValidation = true;
  scheme.useSingleInstance = false; // Set to false so an input can have an optional interval parameter

  scheme.args = [
    new Argument({
      name:             'domain',
      dataType:         Argument.dataTypeString,
      description:      'Auth0 domain (for example contoso.auth0.com)',
      requiredOnCreate: true,
      requiredOnEdit:   false
    }),
    new Argument({
      name:             'clientId',
      dataType:         Argument.dataTypeString,
      description:      'Auth0 Client ID',
      requiredOnCreate: true,
      requiredOnEdit:   false
    }),
    new Argument({
      name:             'clientSecret',
      dataType:         Argument.dataTypeString,
      description:      'Auth0 Client Secret',
      requiredOnCreate: true,
      requiredOnEdit:   false
    })
  ];

  return scheme;
};
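Because useExternalValidation is set to true, the script also exports a validateInput function that Splunk Enterprise calls when a user saves the input's configuration. The published server.js contains its own implementation; the following is only a rough sketch of what such a function can look like, reusing the Auth0 client in the same way as the streaming code shown below:

exports.validateInput = function (definition, done) {
  // Sketch only: try to fetch a single log entry with the supplied credentials
  var params = definition.parameters;
  var auth0 = new Auth0({
    domain:       params.domain,
    clientID:     params.clientId,
    clientSecret: params.clientSecret
  });

  auth0.getLogs({ take: 1, from: '' }, function (err) {
    done(err); // Any error is reported back to the user in Splunk Web
  });
};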

The rest of this section walks through the script's streamEvents function, which we have broken into chunks with commentary throughout so that you can follow along.

The first section reads the checkpoint file if it exists, and creates an empty one if it does not. This file stores the current seek location, that is, the ID of the last log entry retrieved during polling of the Auth0 API. Splunk Enterprise provides a checkpoint folder for each input to store its checkpoint data.

exports.streamEvents = function (name, singleInput, eventWriter, done) {
  // Get the checkpoint directory out of the modular input's metadata
  var checkpointDir = this._inputDefinition.metadata['checkpoint_dir'];
  var checkpointFilePath  = path.join(checkpointDir, singleInput.domain + '-log-checkpoint.txt');

  var logCheckpoint = '';
  try {
    logCheckpoint = utils.readFile('', checkpointFilePath);
  }
  catch (e) {
    // If there's an exception, assume the file doesn't exist. Create the checkpoint file with an empty string
    fs.appendFileSync(checkpointFilePath, '');
  }

Next, the script initializes the Auth0 object provided by the Auth0 module. This object will be used to retrieve log data from the Auth0 service.

// Call Auth0 API
  var auth0 = new Auth0({
    domain:       singleInput.domain,
    clientID:     singleInput.clientId,
    clientSecret: singleInput.clientSecret
  });

The main body of the script uses an asynchronous loop to poll the Auth0 service for new log data. The Async.whilst method is from the Splunk SDK for JavaScript. The loop continues until there are no more logs or an error is encountered.

var working = true;

  Async.whilst(
    function () {
      return working;
    },

In the body of the loop, we first use the Auth0 API to retrieve up to 200 new log entries, starting from the last checkpoint.

function (callback) {
      try {
        auth0.getLogs({
          take: 200, // The maximum value supported by the Auth0 API
          from: logCheckpoint
        },

We check for errors and whether there are any remaining log entries to index. If there are none, working is set to false, which then exits the whilst loop. We also use the Logger class from the SDK to record what happened.

function (err, logs) {
          if (err) {
            Logger.error(name, 'auth0.getLogs: ' + err.message, eventWriter._err);
            return callback(err);
          }

          if (logs.length === 0) {
            working = false;
            Logger.info(name, 'Indexed was finished');
            return callback();
          }

          var errorFound = false;

Next, we loop over the log entries we retrieved from the Auth0 service. The Event and EventWriter classes from the JavaScript SDK are used to send data to Splunk Enterprise. We then record the most recent ID in the logCheckpoint variable.

for (var i = 0; i < logs.length && !errorFound; i++) {

            try {
              var event = new Event({
                stanza:     singleInput.domain,
                sourcetype: 'auth0_logs',
                data:       JSON.stringify(logs[i]), // Have Splunk index our event data as JSON
              });
              
              eventWriter.writeEvent(event);
              logCheckpoint = logs[i]._id;

              Logger.info(name, 'Indexed an Auth0 log with _id: ' + logCheckpoint);
            }

If there are any errors, we log the error and save the most recent log entry id in the checkpoint file.

catch (e) {
              errorFound = true;
              working = false; // Stop streaming if we get an error
              Logger.error(name, e.message, eventWriter._err);
              fs.writeFileSync(checkpointFilePath, logCheckpoint); // Write to the checkpoint file
              
              // We had an error, die
              return done(e);
            }
          }

Finally, if everything worked, we save the id of the last log entry we indexed into the checkpoint file.

// Success path (sketch of the script's closing code): persist the checkpoint,
// then let Async.whilst poll the Auth0 service again
fs.writeFileSync(checkpointFilePath, logCheckpoint); // Write to the checkpoint file
          callback();
        });
      }
      catch (e) {
        callback(e);
      }
    },
    function (err) {
      done(err);
    }
  );
};

We designed this input based on the assumption that Splunk Enterprise runs a single instance of it at a given time and fetches events by continually polling. Each time it runs, it pulls down all available logs, sends the events to the Splunk instance, and then the process exits. Because of the way intervals work in Splunk Enterprise, if the input is still collecting data when the interval timer expires, the Splunk instance does not launch a new instance of the input; the interval applies only after the input finishes its work. You can also configure the location where Splunk Enterprise stores the checkpoint file, but the default location is the %SPLUNK_HOME%/var/lib/splunk/modinputs folder. For more information, see "Data checkpoints."
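When a user saves an Auth0 input in Splunk Web, Splunk Enterprise persists the configuration as a stanza in an inputs.conf file. As an illustration only (the stanza prefix assumes the scheme is registered as auth0, and the input name and values are made up), a configured input might look something like this:

[auth0://my_auth0_tenant]
domain = contoso.auth0.com
clientId = <Auth0 client ID>
clientSecret = <Auth0 client secret>
interval = 60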

DEV: You can create modular inputs using other languages such as Python and C#.


Refreshing an index with checkpoints

During our testing, we needed to be able to delete our indexed data and start over. To do this, we used the following procedure:

  1. Open Splunk Enterprise, click Settings, and then click Data Inputs. Click Auth0 (the modular input we defined) and then select Delete.
  2. Delete the log checkpoint file (that tracks the most recent event that Splunk retrieved from the Auth0 service) from the Splunk folder %SPLUNK_HOME%/var/lib/splunk/modinputs.
  3. Delete the content of the index by running the command bin/splunk clean eventdata -index INDEX_NAME in a shell, or at a command prompt on Windows.

Getting data into Splunk Enterprise for the PAS app using data models and Splunk Common Information Model extensions

The PAS app currently uses log data from three different sources: a database, a document repository, and the file system. Each of these is defined as a separate sourcetype: ri:pas:database, ri:pas:application, and ri:pas:file. These three logs contain different types of event data with different formats from each other. Each of these types has different field names, which on first sight requires separate searches to pull the data. This is not ideal and introduces a potential maintenance issue if we add a new data source in the future (or if the format of one of the log files changes). We would like to make the log data from these three sources (and any data sources we define in the future) available in a normalized format to simplify the design of the searches in the app. We would also like to make it available for other apps to consume in a standardized format.

Fortunately, Splunk Enterprise offers a better solution. We can achieve the first of these goals by using aliases and extracts to translate and map the content of our log files into common field names, and by building data models based on the extracts and aliases. We can achieve the second of these goals by building a special model that maps and translates our log data into the structure defined in a Splunk Common Information Model (CIM).

A data model is a semantic mapping over a set of events that can be used to query Splunk Enterprise. A data model specifies a set of fields with fixed data types and an agreed interpretation of the events that Splunk Enterprise indexes, which Splunk apps can then rely on.

A Splunk CIM defines a core set of fields for a particular type of event that might come from multiple log sources. For example, there is a Change Analysis CIM data model with fields that describe Create, Update, and Delete activities, and there is an Authentication CIM data model with fields that describe login and logout activities. For more information about these and other Splunk CIM data models, see the section "Data Models" on the "Common Information Model Add-on Manual" page. In addition to the documentation, after you install the Splunk Common Information Model Add-on, you can browse the structure of the models from the Pivot page in the Search & Reporting app in Splunk Enterprise.

SHIP: Splunk CIM is shipped as an add-on. Get it from dev.splunk.com/goto/splunkcim.

ARCH: A CIM defines the lowest common denominator of the data associated with an activity such as change analysis, authentication, or intrusion detection. Browsing the model in Splunk Enterprise will give you more insight into its structure.

A CIM focuses on normalizing data and making it interoperable with other apps. However, we also want to create a data model that is specific to our app, and that will define all of the rich data that we need to build our pivot reports. You can define multiple models for your data as CIM Extensions.

We also plan to accelerate our CIM PAS Extension data model to improve query performance; this will enable us to use commands such as tstats on the fields in our data model in our searches.
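For example, once the model is accelerated, a search like the following sketch (not one of the app's actual searches) counts events by user directly from the accelerated summaries:

| tstats count from datamodel=ri_pas_datamodel where nodename=Root_Event by Root_Event.user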

ARCH: Data model acceleration creates summaries for only those specific fields you and your Pivot Editor users are interested in and want to report on. To enable data model acceleration, follow these instructions: dev.splunk.com/goto/enabledatamodelacc. While there, we highly recommend you review the restrictions on the kinds of data model objects that can be accelerated.

DEV: You can also manually generate accelerated namespaces by using the tscollect command, which lets you leverage the power of indexed fields to run statistical queries without building an accelerated data model.

Mapping to a Splunk Common Information Model

For the PAS app, we determined that the Change Analysis CIM data model was the most appropriate. After identifying the model to use, the next step is to map the existing fields in our data sources to the set of standard field names defined in the CIM to create a normalized view of the data. We begin by using a spreadsheet to document the mappings from our three data sources to the CIM and then implement the mappings using a combination of aliases, extracts, and static evals in the props.conf file for each data source. For example, we map the SQLTEXT field in the database log, use the static value "updated" for the document repository log, and map an extracted field in the file log to the CIM field named action. Now a search can refer to the action field, regardless of the particular log file we are searching, and any other app that uses our data sources can expect to find the standard field names from the CIM. If our app needs to support another data source, we can perform a similar mapping operation and use search definitions that are very similar to our existing ones. Furthermore, if the format of a log file changes, we can accommodate those changes in our mappings without the need to modify any searches that depend on specific field names. The following list shows our initial set of mappings for our three data inputs, organized by CIM field:

  • action: SQLTEXT (database log); static value "updated" (document log); extracted field (file log)
  • command: NAME (database log; values in log: Connect, Insert, Update, Select, Delete, Quit, Grant, Revoke); event_name (document log; values in log: login, download, edit, read, create, upload, share, permissions_changed, lock, unlock, delete); extracted field (file log; values in log: getattr, read, open, write)
  • object: DOCUMENT (database log); extracted field
  • object_id: CONNECTION_ID (database log); pid (document log)
  • src: IP (database log); src_ip (document log)
  • user: USER (database log); user_id (document log); extracted field (file log)
  • user_id: USER_ID (database log); empid (document log)
  • object_attrs: event_details (document log)
  • event_id: event_id (document log)
  • event_target: extracted field
  • status: static value "success"
  • change_type: static value "application"

ARCH: For an event to show up in the Change Analysis CIM it must be tagged with the value change as defined in the constraint in the Change Analysis data model. You must define a search that assigns this tag value to events from your data.

Tagging our events

Tagging events lets us associate those events with a data model. This works both with the Splunk Common Information Model and with our custom data model. To tag an event, we first define event types and then associate those event types with tags. The following screenshot shows the event types for our database provider add-on (you can view this page in the Settings section of Splunk Enterprise):

ADMIN: You should make sure that event type names are unique to each app, otherwise the definition in one app will overwrite the definition in another one.

Each event type has a search string that identifies the events and a set of associated tags. Notice how we reference the ri-pas-database event type in the subsequent definitions, and how some event types have more than one associated tag. You can also view the tags in the Settings section of Splunk Enterprise:

Some of these tags (change_permissions, delete, read, and update) are used to associate the events with the Change Analysis CIM, and some of these tags (pas, change, and audit) are used to associate events with our custom data model. The pas tag is intended to be unique to the PAS apps, while the other tags may be used by many other apps to identify events generically. In the Google Drive add-on app, we also define the tag cloudstorage, which could be used in other similar apps, such as add-ons for OneDrive or Dropbox, to indicate a category of data.

The files eventtypes.conf and tags.conf in each of the provider add-ons store these definitions.
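For illustration, the stanzas in those files look something like the following sketch; the ri-pas-database stanza name is mentioned above, but the search strings and the second event type shown here are placeholders rather than the add-on's actual definitions:

# eventtypes.conf
[ri-pas-database]
search = sourcetype=ri:pas:database

[ri-pas-database-delete]
search = eventtype=ri-pas-database NAME=Delete

# tags.conf
[eventtype=ri-pas-database]
pas = enabled
change = enabled
audit = enabled

[eventtype=ri-pas-database-delete]
delete = enabled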

Using tags

Searches in the PAS app can now use the tags instead of specifying an index to search. For example:

<search id="dendrogram_search">
    <query>
          tag=pas tag=change tag=audit customer_name=$customer$
        | stats count by department department_group user
    </query>
</search>
DEV: Not all our searches use tags; in some cases we search for events using more detailed criteria, such as looking at the values in specific fields.

The only place where we mention the pas index is the authorize.conf file in the main app (under Indexes and Access Controls in Settings). In the Access Controls settings, we specify pas as the default index for the users of the app. For more information about authorizations and permissions in the PAS app, see the "Packaging and deployment: reaching our destination" chapter in this guide. If you create another add-on app for the main PAS app and it has an inputs.conf file, that file will also refer to the pas index.
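As a sketch only (the shipped file may differ in detail), the authorize.conf stanzas that make pas the default index for the app's roles look something like this:

[role_pasuser]
srchIndexesAllowed = pas
srchIndexesDefault = pas

[role_pasadmin]
srchIndexesAllowed = pas
srchIndexesDefault = pas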

The following diagram summarizes the role of the Splunk knowledge objects related to tagging in the PAS app:

When we ship the PAS app, it includes sample add-on provider apps that, together with the Eventgen app, generate sample event data that is indexed in the pas index. When a customer deploys the app, they can use their own event data and indexes provided that:

  • The events are tagged with the tags recognized by our data model.
  • The pasuser and pasadmin roles are authorized to use the customer's index.

For more information about the pasuser and pasadmin roles, see the chapter "Packaging and deployment: reaching our destination" in this guide.

Defining a custom data model

In addition to mapping our log data to the Change Analysis CIM, we also defined our own custom data model within the app to support pivot-based searches on the app dashboards. A custom data model defines a set of fields (possibly organized hierarchically) and a constraint that identifies the events that the data model handles. This definition is expressed in JSON, and in our app the file is named ri_pas_datamodel.json. The app also contains a datamodels.conf file that contains metadata about the model such as whether it is accelerated.
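As a sketch (the acceleration summary range shown here is an assumption, not the app's actual setting), a datamodels.conf entry that accelerates the model looks something like this:

[ri_pas_datamodel]
acceleration = true
acceleration.earliest_time = -30d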

ARCH: As a reminder, CIM is the least common denominator and not very rich. It makes sense to use other models or techniques as well. The key is to make sure that CIM is also covered when extracting data, so that the least common denominator can be relied on.

PERF: An accelerated data model is equivalent to an indexed view in a relational database management system (RDBMS). Searches will be faster, at the expense of persisting and maintaining the indexes.

The following screenshot from Splunk Enterprise shows the data model we defined for the PAS app:

Notice how the constraint uses our tags to specify which events are included in the model.

ARCH: Mapping our data to a CIM or to a custom data model are both examples of normalizing multiple data sources to a single model.

Defining our mappings in separate add-on apps

To make it easy to maintain these mappings and keep them all in a fixed location, we package them as separate add-on apps. In the PAS app, we use these separate add-on apps specifically because we want to let customers extend the PAS app by adding their own data sources, which will require their own custom mappings. For information about how the main PAS app recognizes these add-on apps, see the section "Using the Splunk JavaScript SDK to interrogate other apps" in the chapter "Adding code: using JavaScript and Search Processing Language." The following code snippet shows the props.conf file from the RI document TA app:

[ri:pas:application]
MAX_TIMESTAMP_LOOKAHEAD = 150
NO_BINARY_CHECK = 1
pulldown_type = 1

FIELDALIAS-command = event_name AS command
FIELDALIAS-object_attrs = event_details AS object_attrs
FIELDALIAS-event_id = event_id AS event_id
FIELDALIAS-src = src_ip AS src
FIELDALIAS-user = user_id AS user

EXTRACT-file=event_target=(?<object_path>.+)\\(?<object>.*?)\s

EVAL-action="updated"
EVAL-status="success"
EVAL-change-type="application"

The stanza name, ri:pas:application, identifies the sourcetype with which the mappings are associated.

SHIP: You can find out more about technology add-ons in the "Data Source Integration Manual."


Other apps use the custom knowledge objects such as the field aliases and extracts defined in our add-on apps; therefore, we give these objects Global rather than App scope.

SHIP: You can define the scope (individual, app, or global) of knowledge objects in either a local.meta or default.meta file. You should not ship an application that contains a local.meta file, so you should move any scoping definitions to the default.meta file.

A note about the props.conf file

For the PAS app, we are generating our own simulated events using the Eventgen app. Therefore, we are confident that the format of the event data is optimized for consumption by Splunk Enterprise. In practice, with real event data, you may be able to further improve the performance of Splunk Enterprise when it parses the event data by providing additional information in the props.conf file. Typically, you should include the following attributes: TIME_PREFIX, MAX_TIMESTAMP_LOOKAHEAD, TIME_FORMAT, LINE_BREAKER, SHOULD_LINEMERGE, TRUNCATE, KV_MODE. The following snippet shows an example of these attributes in use:

[sourcetypeA]
TIME_PREFIX = ^
MAX_TIMESTAMP_LOOKAHEAD = 25
TIME_FORMAT = %Y-%m-%d %H:%M:%S.%3N %z
LINE_BREAKER = ([\r\n]+)\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}.\d{3}
SHOULD_LINEMERGE = False
TRUNCATE = 5000
KV_MODE = None
ANNOTATE_PUNCT = false

For more information about these attributes, see "props.conf" in the Admin manual.

Rebuilding our index after refactoring

As part of the effort to refactor our data inputs, create the data models, and package them in add-on apps, we renamed our sourcetypes part way through our journey: for example, we renamed the conducive:app sourcetype to ri:pas:application. We also renamed the app that contains our sample data. An unintended consequence of this was that Splunk Enterprise could no longer find the sample data and was no longer indexing our data. To fix this, we had to delete the content of the old index named pas completely by using the following procedure:

  1. Add the admin user to the can_delete role in Access controls in Splunk Enterprise.
  2. Stop Splunk Enterprise.
  3. At an operating system command prompt, run the following command:
    bin/splunk clean eventdata -index pas
  4. Restart Splunk Enterprise.

For more information, see "Remove indexes and indexed data."

Using the data models

Earlier in this chapter, we describe our custom data model and how we map our log data to the Change Analysis CIM. After building our custom data model, we can refactor our existing dashboards to make use of the data model and use pivots in the search criteria. For example, the Summary dashboard includes several pivot searches that use the data model such as this one that is based on the Root_Event in our data model:

<search id="base_search">
    <query>
        | pivot ri_pas_datamodel Root_Event count(Root_Event) AS Count SPLITROW _time AS _time PERIOD auto SPLITROW user AS user 
          SPLITROW command AS command SPLITROW object AS object 
          FILTER command isNotNull $filter$ $exclude$ ROWSUMMARY 0 COLSUMMARY 0 NUMCOLS 0 SHOWOTHER 1
    </query>
</search>

The following example from the Off-hours Document Access dashboard is based on the Invalid_Time_Access (Off-Hours Document Access) child event:

<chart>
    <title>Documents Accessed Outside Working Hours</title>
        <searchString>| pivot ri_pas_datamodel Invalid_Time_Access count(Invalid_Time_Access) AS count 
                        SPLITROW _time AS _time PERIOD auto SORT 0 _time ROWSUMMARY 0 COLSUMMARY 0 
                        NUMCOLS 0 SHOWOTHER 1
        </searchString>
    <option name="charting.chart">line</option>
</chart>

These search definitions now use the fields defined in our custom data model such as _time, user, command, and object.

For more information about the pivot command, see "pivot" in the Search Reference.

The following screenshot shows a search that uses the Change_Analysis CIM data model to show some of the sample data from the PAS add-ons (in this example the Document and File sample providers):

Modifying the data model to support additional queries

The following screenshot shows an example of a pivot based on the Root Event in our original data model that shows counts of the different commands executed by individual users. The existing dashboards in our app all use pivots similar to this one to retrieve the data they display:

We plan to add visualizations to the summary screen that show the overall health of the system we are monitoring. These visualizations will need Key Performance Indicator (KPI) values to determine the overall health, so we need to modify our data model to enable us to query for these KPIs. Our initial set of KPIs is: a count of out-of-hours accesses to the system (Invalid Time Access), a count of accesses by terminated employees (Terminated Access), and a count of policy violations (Policy Violation).

DEV: The Policy Violation object gets removed later in our journey.


The following screenshot shows a pivot based on the Terminated Access event, with the count for each day. We can use the count of Terminated Access events for the last day as part of the calculation of the overall system health status:

We define additional events in the PAS Data Model, such as Terminated Access events, as children of the root event. The following screenshot shows the attributes and constraint the Terminated Access event inherits from the root event along with the additional constraint that identifies the specific event type.

We can now use these additional event definitions in the search managers on our dashboards. For example, to search for Policy Violation events, we use the following search definition:

| pivot ri_pas_datamodel Policy_Violation count(Policy_Violation)
AS count FILTER command isNotNull $filter$ $exclude$ ROWSUMMARY 0 COLSUMMARY 0 NUMCOLS 0 SHOWOTHER 1

Case study: Using data models to handle large volumes of data

One reason to use data models is to optimize the performance of Splunk Enterprise when you have a large number of users who use a dashboard that runs searches across high volumes of data. For example, you have a requirement for a dashboard used by several hundred users that displays information from the last thirty days and you have multiple terabytes of new data to index every day. This is considerably more data than the PAS scenario expects, but an app such as PAS will still see performance benefits from using accelerated data models.

Using simple, inline searches on the dashboard that search for the last thirty days of data will be unusably slow in this scenario. Therefore, the first step might be to replace the in-line searches with saved reports that you have accelerated for the last thirty days. While this will speed up the searches, they will typically stall at the end because Splunk Enterprise only updates the accelerated data every ten minutes. A search over the last thirty days retrieves mostly accelerated data but still has to search the raw data for the last few minutes' worth of nonaccelerated data. To work around this problem, you can modify your dashboards to report on the last thirty days of data using a time range that excludes the last ten minutes to ensure that the searches only retrieve accelerated data.

You can further improve on this approach by using scheduled reports. This lets the searches on the dashboard access cached data on the search head instead of accessing the indexers for the accelerated data. You can manually schedule the searches you need to run as reports every ten minutes, and then the searches on the dashboards can load the results of the scheduled reports from the search head using the loadjob command.
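For example, a dashboard panel can read the cached results of a scheduled report with a search like the following sketch (the owner, app, and report name in the string are placeholders):

| loadjob savedsearch="nobody:pas_ref_app:summary_activity_last_30d"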

You can also accelerate a data model to improve the performance of pivot searches based on the data model. This provides similar performance improvements to accelerated reports, but in addition to enabling pivot searches, accelerated data models:

  • Update every five minutes instead of every ten minutes.
  • Let you manage the amount of disk space required to store the accelerated data because you can choose which columns to add to your data model.
PERF: It's possible to further optimize a high data volume scenario by using a custom solution instead of a data model. For example, you could run a search, with a timespan of one minute, every minute that appends data to an output lookup file, and then on the dashboard use input lookups to read this summary data.
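A minimal sketch of that pattern (the lookup file name and the fields summarized are made up, not part of the app): a search scheduled to run every minute summarizes the previous minute of events and appends the results to a lookup file:

tag=pas tag=change tag=audit earliest=-2m@m latest=-1m@m
| stats count by user, command
| outputlookup append=true pas_activity_summary.csv

A dashboard panel can then read the summary back without touching the indexers:

| inputlookup pas_activity_summary.csv
| stats sum(count) AS count by user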

For more information about data model acceleration, see the "Accelerate data models" section of the Knowledge Manager Manual.

For more information about using reports, see "About reports" in the Reporting Manual.

For more information about accelerated data, see "Manage report acceleration" in the Knowledge Manager Manual and "Accelerate reports" in the Reporting Manual.

For more information about scheduling reports, see "Schedule reports" in the Reporting Manual.

For more information about the loadjob command, see "loadjob" in Search Reference.

Integrating with a third-party system

On the User Activity dashboard we display information about a user that we pull from a third-party system. In the sample PAS app this third-party system is a REST endpoint we implemented using Python that simulates a directory service such as LDAP or Active Directory.

ARCH: Using a mock implementation like this let us develop the functionality without needing access to the real directory service and real user data.

The following screenshot shows how we display this information on the dashboard:

To pull the data from our simulated directory service, we use a custom search command. Splunk Enterprise lets you implement custom search commands to extend SPL. Custom search commands are authored in Python and are easy to build with the Splunk SDK for Python. Our custom search command is named pasgetuserinfo, as shown in the following code snippet from the user_activity.xml file:

<search id="user_info_search">
    <query>
        | pasgetuserinfo user=$user|s$
    </query>
</search>

We implement this custom command in the PAS Get User Information app (you can find this sample in the test repository). The commands.conf file specifies the name of the custom command as shown in the following configuration snippet:

# [commands.conf]($SPLUNK_HOME/etc/system/README/commands.conf.spec)
[defaults]

[pasgetuserinfo]
filename = pasgetuserinfo.py
supports_getinfo = true
supports_rawargs = true
outputheader = true

This configuration file identifies the Python source file, pasgetuserinfo.py, that implements the custom event generating command. The following code sample shows the complete implementation of the pasgetuserinfo command:

import requests
import json
import sys, time
from splunklib.searchcommands import \
    dispatch, GeneratingCommand, Configuration, Option, validators

@Configuration()
class PasGetUserInfoCommand(GeneratingCommand):
    user = Option(require=True)

    def generate(self):
        url = 'http://localhost:5000/user_list/api/v1.0/users/' + self.user
        data = requests.get(url).json()
        if 'user' in data:
            # Known user.
            row = {}
            for k, v in data['user'].iteritems():
                row[str(k)] = str(v)
            yield row
        else:
            # Unknown user. Return no data.
            pass

dispatch(PasGetUserInfoCommand, sys.argv, sys.stdin, sys.stdout, __name__)

Notice how this code imports the GeneratingCommand, Configuration, and Option classes from the splunklib.searchcommands module in the Splunk SDK for Python. We chose a GeneratingCommand because we are manufacturing events. The generate method calls our mock REST API endpoint, passing the value of the user option of the custom command. If the REST API recognizes the user, it returns a JSON string containing the user data. The generate method then yields this data as a dictionary instance. To use a real directory service, we can replace the code in the generate method with code that queries the real service and returns the data in a Python dictionary instance.

The JavaScript code behind the User Activity dashboard formats the data from the custom search command to display in the panel.

For more information about how to implement custom search commands in Python, see "How to create custom search commands."

Using stateful configuration data in the PAS app

In the PAS app, the Suspicious Activity panel and the donut charts in the Policy Violations panel on the Summary dashboard make use of configuration data that the user creates the first time they use the app. The section "Sharing code between dashboards" in the chapter "Adding code: using JavaScript and Search Processing Language" describes how we direct the user to the Setup dashboard the first time they access the PAS app for providing this data. The following screenshot shows the Setup dashboard and the data the user must create:

On this dashboard, the user can select the departments for which they want to see a donut chart, and provide definitions of the policy violations that should appear in the list of suspicious activities. Each policy violation type has a name, a color, and a weight that the calculations behind the visualizations use. The following screenshot from the Summary dashboard shows the donut charts for the Development and Management departments selected on the Setup dashboard, and the Suspicious Activity panel using the violation types defined on the Setup dashboard:

We need a mechanism to persist the configuration data the user enters on the Setup dashboard so that it can be read by the code that renders the donuts on the Summary dashboard. Historically, custom REST endpoints have been the mechanism used to persist data in a scenario such as this one, and a Splunk Enterprise API is available for accessing your custom endpoint. The App KV Store is a new feature in the version of Splunk Enterprise we are using that provides more robust and easier-to-use data storage management than custom REST endpoints. App KV Store can even interface to a database using convenient REST operations, although this is not one of our requirements for the PAS app. Additionally, App KV Store has built-in support for a distributed architecture of search head clusters; a considerable amount of coding would be needed to add this level of functionality to a custom REST endpoint solution. All the functionality we need comes with Splunk Enterprise, so we decided to use App KV Store to persist our configuration data. No additional coding is needed beyond defining your data collection and invoking the App KV Store REST operations.
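For example, you can read a collection's records directly over REST. The following sketch (the host, port, and credentials are placeholders) retrieves the saved records from the ri_setup_coll collection that the following collections.conf snippet defines:

curl -k -u admin:changeme \
    https://localhost:8089/servicesNS/nobody/pas_ref_app/storage/collections/data/ri_setup_coll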

DEV: We use the KV Store to persist global configuration data shared by all users of the PAS app. It is possible to use the KV Store to persist per-user data.

The Setup dashboard uses the KV Store feature in Splunk Enterprise to persist the setup data in two collections, ri_setup_coll and violation_types, which we define in the collections.conf file as shown in the following configuration snippet.

[ri_setup_coll]
enforceTypes = true
field.departments=array

[violation_types]
enforceTypes = true
field.id=string
field.title=string
field.color=string
field.weight=number

We use two different collections, one for departments and one for violation types, to make it easier to access this data in a search. Notice how we use an array to store the list of departments, and a separate record for each policy violation type, to accommodate a variable number of entries in each case. We then use a transforms.conf file to make the setup data in the KV Store available to our searches:

[ri_setup]
external_type = kvstore
collection = ri_setup_coll
fields_list = departments

[violation_types]
external_type = kvstore
collection = violation_types
fields_list = id,title,color,weight

Now we can use the setup data in the searches behind the visualizations on the Summary dashboard. For example, the policy_violations_search search in the summary.xml file, which extracts the data for both the donut visualizations and the Suspicious Activity panel, includes the following lookup clause to use the setup data:

| lookup violation_types id AS ViolationType
OUTPUTNEW title AS ViolationTypeTitle,
color AS ViolationColor,
weight AS ViolationWeight,

The policy_violations_color_summary search that retrieves the data for the donut visualizations uses the following join clause to filter the data based on the departments the user selected on the Setup dashboard:

| join type=inner department [ | inputlookup ri_setup
| fields departments
| mvexpand departments
| rename departments as department ]

The code in the setup.js file shows how we persist the configuration data. First we load the kvstore module (we have placed the library file kvstore.js in our components folder):

require([
    'splunkjs/ready!',
    'splunkjs/mvc/simplexml/ready!',
    'underscore',
    '../app/pas_ref_app/components/kvstore_backbone/kvstore',
    'splunkjs/mvc/multidropdownview'

Then we extend the standard KVStore.Model class (which is a Backbone model) to include the configuration data we define in the collections.conf file:

var SetupModel = KVStore.Model.extend({
    collectionName: 'ri_setup_coll'
});

var ViolationTypeModel = KVStore.Model.extend({
    collectionName: 'violation_types'
});

var ViolationTypeCollection = KVStore.Collection.extend({
    collectionName: 'violation_types',
    model: ViolationTypeModel
});

Finally, we can populate a model instance and persist it using the save function. For example:

var newSetupData = {
    departments: departmentsDropdown.val()
};
...
var newSetupModel = new SetupModel();
...

newSetupModel.save(newSetupData)
DEV: We also added some utility code that replaces a complete collection of data in the KV store. See the function setCollectionData in the setup.js file for more details.

At a later stage in the project we add a new field to the KV Store to let a user toggle the display of the learning tip icons. To make this change, we add a new field named learningTipsEnabled in the collections.conf file, add a new checkbox in the setup.xml file, and add some code in setup.js to initialize and save the field value. To use the learningTipsEnabled configuration setting in the app, we read the value using code in the dashboard.js file and control the visibility of the learning tip icons using CSS. For more information about the dashboard.js and dashboard.css files, see the chapter "UI and visualizations: what the apps look like."
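As a sketch, assuming the new field is added to the ri_setup_coll collection, the collections.conf change looks like this:

[ri_setup_coll]
enforceTypes = true
field.departments=array
field.learningTipsEnabled=bool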

For more information about how we restrict access to the KV Store, see the chapter "Packaging and deployment: reaching our destination" in this guide.

Search managers that don't return results

It's possible in some circumstances that the user_info_search on the User Activity dashboard does not return any results. We noticed that in this case the on("data", ...) callback function is not invoked. We modified the code in the user_info.js file to work around this problem, as shown in the following code sample:

userInfoSearch.data("results", {
    // HACK: By default, no "data" event is fired when no results are
    //       found. Override so that it does fire in this case.
    condition: function(manager, job) {
        return (job.properties() || {}).isDone;
    }
}).on("data", function(resultsModel) {
    var rows = resultsModel.data().rows;
    if (rows.length === 0) {
        view.html("No user information found.");
    } else { ...

What did we learn?

This section summarizes some of the key lessons learned while we were working with the data our apps use.

  • You can use a modular input to pull in events from external systems.
  • You can author modular inputs in several languages including JavaScript / node.js.
  • You can store state in a modular input by writing to a file.
  • If you are using a modular input written in JavaScript, you can instrument your code using methods such as error and info of the ModularInputs.Logger class. You can search for these log messages in the _internal index in Splunk Enterprise.
  • You can use the CIM to provide a standard way to search against disparate sources of data. You can map existing and future sources to the CIM using aliases, extractions, event types, and tags.
  • You can create your own data models to provide a richer mapping for querying your data.
  • You can easily extend a data model to support additional search requirements.
  • You can use the KV store to persist data that can then be referenced in searches.
  • You learned how to delete an index completely and how to delete all the entries in an index. Both are useful when testing an app.
  • You need to know your data to design effective apps. Different users of your app have different data and must be able to configure the app to make it work for them.

More information

To see the Auth0 app on Splunkbase, go to dev.splunk.com/goto/auth0app.

For more information about creating modular inputs using JavaScript, see "How to work with modular inputs in the Splunk SDK for JavaScript" at: dev.splunk.com/goto/modularinputs.

For information about the Logger class see: dev.splunk.com/goto/loggerclass.

For information about the way intervals work in Splunk Enterprise, see "Data checkpoints" at: dev.splunk.com/goto/datacheckpoints.

For information about the Change Analysis CIM data model see: dev.splunk.com/goto/changeanalysiscim.

For more information about other Splunk CIM data models, see "Data Models" at: dev.splunk.com/goto/cimmanual.

Get the Splunk CIM add-on from: dev.splunk.com/goto/splunkcim.

To learn to accelerate data models to improve query performance see: dev.splunk.com/goto/accelerateddatamodels.

To enable data model acceleration, see: dev.splunk.com/goto/enabledatamodelacc.

To learn how to use tscollect see: dev.splunk.com/goto/tscollect.

For more information on improving the performance of Splunk Enterprise using the props.conf file see: dev.splunk.com/goto/propsconf.

For more information on rebuilding an index, see "Remove indexes and indexed data" at: dev.splunk.com/goto/removeindex.

For more information about data model acceleration, see "Accelerate data models" at: dev.splunk.com/goto/accelerateddatamodels.

For more information about using reports, see "About reports" at: dev.splunk.com/goto/aboutreports.

For more information about accelerated data, see "Manage report acceleration" and "Accelerate reports" at: dev.splunk.com/goto/managereportacceleration and dev.splunk.com/goto/acceleratereports.

For more information about scheduling reports, see "Schedule reports" at: dev.splunk.com/goto/schedulereports.

For more information about the loadjob command, see: dev.splunk.com/goto/loadjobref.

To learn how we use a custom search command to pull the data from our simulated directory service see: dev.splunk.com/goto/customsearchcommand.

For more information about how to implement custom search commands in Python, see "How to create custom search commands" at: dev.splunk.com/goto/createcustomsearch.

For more information on using custom REST endpoints to persist data see: dev.splunk.com/goto/customrest.

The App KV Store provides a more robust and easy to use data storage management than custom REST endpoints. To learn more see: dev.splunk.com/goto/appkvstore.