New adventures require new tools: alerting

This chapter shows how using Splunk alerting can help you automatically trigger various actions based on real-time metrics and/or schedules. It will also walk you through building a custom alert action.

While deciding which use cases to focus on, it became clear that a prime candidate was alerting: the ability to trigger notifications or custom actions based on search results. And with fine-tuned control over what happens when alerts are triggered, we discovered that our possibilities for where to go next were almost endless.

The final key element from Chapter 1, "Planning a journey," was:

A workflow solution with capabilities such as triggering incidents for review, categorizing them by type, assigning tasks to the relevant personnel, capturing details relevant to the investigation in an unalterable manner, and escalating incidents. 

Unfortunately, as we also mentioned in Chapter 1, we were not able to implement all of the goals we set forth originally. By implementing alerts in Splunk Enterprise, however, we do have the start of this solution.

Considering alerts

An alert is an action that a search triggers based on the results of the search. When creating an alert, you specify a condition, such as a threshold or trend setting, that triggers the alert and the actions to take when the alert triggers. You can also create scheduled alerts, where you specify a scheduled search that will trigger an alert when it generates results. 

So far so good, but what kind of alert actions do we have to choose from? Historically, Splunk Enterprise alert actions have included listing triggered alerts in a Triggered Alerts list (accessible from the Activity menu), sending e-mails, and running a script. As you know, the PAS app's end users are typically non-technical business users who prefer a friendly UI over entering complex search strings or scripts. Therefore, while an e-mail or an RSS feed would certainly be acceptable to our users, we can't expect them to write a script.

ADM: Scripted alerts require manual setup and offer limited configuration. Previously, the end user would have to directly edit script or .conf files to include connection-level credentials or custom configurations.
 

What's more, we want to build some flexibility into our alerts. Back at the start of the journey, we said:

In addition to the end users, there is a technical role for configuring the app with rules for triggering alerts specific to the organization and its requirements. Some organizations may configure these rules once when they deploy the app, others may have the requirement to be able to update these rules in response to changes in the business environment or to emerging threats. This may require some basic technical knowledge of Splunk Enterprise.

Fair enough, but what if we could create alert actions that admins or non-technical users could configure and change on their own, without needing much technical Splunk Enterprise knowledge?

Enter the custom alert action framework in Splunk Enterprise 6.3, which enables packaged integration with both popular third-party systems and your internal enterprise systems to automate workflows and improve efficiency.

PERF: You can troubleshoot alert action performance or diagnose errors with ease: alert actions are logged and indexed by Splunk Enterprise. Specifically, messages printed to stderr are logged to splunkd.log. You can access alert action logs directly from the Alert Actions management page (at Settings > Alert actions): in the log column for the action you want to diagnose, click View log events. Or just enter the following search query, where <action_name> indicates the name of the alert action: index=_internal sourcetype=splunkd component=sendmodalert action="<action_name>".

Our alerting scenarios from a business perspective

During meetings with SMEs and our business partners, we discussed a number of options for alerts, but for our purposes, we decided to create solutions for the following two user scenarios:

  • Manager/compliance officer: When an excessive usage policy is violated by a user, send an e-mail to the manager or compliance officer, and lock the user's account. (In the case of our reference app, we accomplished this by setting the user's status to locked in the pas_simulated_users_addon.)
  • Network administrator: When a terminated employee modifies a file on the network, automatically respond by creating an Atlassian JIRA issue to investigate the matter, and by locking the terminated user's account. 

BUS: JIRA is an issue tracking and project planning system that many companies use to track work items. We wanted to choose something that was relatively easy to hook into, and that would be familiar to developers.

BUS: It is also possible to use an alert as a means to perform scheduled exports of data from Splunk Enterprise.
 

Implementation-wise, we realized that the first scenario would require two alert actions that are already built into Splunk Enterprise. We'll tackle them first, and then concentrate on the JIRA scenario.

Excessive Usage Policy Violation alert

The Excessive Usage Policy Violation alert is triggered when a user violates the excessive usage policy. Excessive usage of network assets can indicate that the user's account has been compromised or that the user is engaged in illicit behavior. When the Excessive Usage Policy Violation alert is triggered, we want Splunk Enterprise to send an e-mail to the user's manager or to a compliance officer, and we want the user's account to be locked.

Built-in e-mail alert action workflow

When the alert is triggered, the first thing we want Splunk Enterprise to do is to send an e-mail to the user's manager or to a compliance officer. 

BUS: Before you can send e-mail notifications of alerts, you need to configure the e-mail notification settings in Splunk Enterprise. We don't cover that here, so see "Configure email notification settings" for all the details.

E-mail notification alert actions are easy and quick to set up. First, create the search that you plan to use to identify the trigger scenario. There are several possible ways to do this, but let's simplify things and say that we want to be notified when a user logs on more than five times in an hour. Therefore, we'd start with a search that shows successful login attempts, sorted by user. We'll set the trigger conditions a bit later, but here's an example search:

index=_audit action="login attempt" info=succeeded | stats count by user | where count > 5

From the search app, click Save As in the upper-right corner, and then click Alert. Give the alert a title and a description, and then, next to Permissions, choose whether it should be private or shared with other users. Next, choose Real-time as the alert's type, because you want Splunk Enterprise to send the alert as soon as the trigger condition occurs, not on a periodic basis.

Under Trigger Conditions, you choose when you want the alert to trigger. Since we want the e-mail to be sent as soon as a user exceeds five successful login attempts, we choose the second option, Number of Results, and then fill out the rest of the window as shown here:

Save as alert window

A few settings to pay special attention to are the Trigger toggle and the Throttle checkbox. Next to Trigger, choose Once to have a single e-mail sent with all of the users who have met the alert conditions, or choose For each result to receive an e-mail for each user that meets the alert conditions. When you select the Throttle checkbox, you indicate that alert notifications should be suppressed for the indicated period of time and (if you've chosen to trigger for each result) any specified field values. The default time period is 60 seconds, but you can change that to be any length of time you want. Both of these settings will help keep your e-mail inbox from being overwhelmed with alert notification e-mails.
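To make the Throttle setting concrete, here is a minimal Python sketch of suppression by time window and field value. It is an illustrative model only, not how Splunk Enterprise implements throttling; the class name AlertThrottle and its methods are hypothetical.

```python
import time


class AlertThrottle:
    """Simplified model of alert throttling (hypothetical, for illustration)."""

    def __init__(self, period_seconds=60):
        self.period_seconds = period_seconds
        self._last_fired = {}  # suppression key -> time the alert last fired

    def should_fire(self, key=None, now=None):
        """Return True if an alert for this key may fire at time `now`."""
        now = time.time() if now is None else now
        last = self._last_fired.get(key)
        if last is not None and now - last < self.period_seconds:
            return False  # still inside the suppression window
        self._last_fired[key] = now
        return True


throttle = AlertThrottle(period_seconds=60)
print(throttle.should_fire("rblack", now=0))   # first alert for this user fires
print(throttle.should_fire("rblack", now=30))  # suppressed: inside the window
print(throttle.should_fire("nick", now=30))    # a different field value fires
print(throttle.should_fire("rblack", now=61))  # window elapsed, fires again
```

Keying the window on a field value (here, the user name) corresponds to triggering for each result: one noisy user is suppressed without hiding alerts about other users.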

Now it's time to specify the e-mail to send. Under Trigger Actions, we click Add Actions, and then Send email.

Add Actions menu

The fields in this section are self-explanatory, but note that we can use tokens in the e-mail subject and body to add specificity to the alert. For example, the subject and body fields are prefilled with text that uses the $name$ token, which will be replaced by the name of the search when the alert is sent.

You can even include information from the trigger search results themselves by using the $result.fieldname$ token and replacing fieldname with the name of the field to include; for instance, the field that contains the user name of the user who violated the excessive use policy. This way, the information is available at a glance to the e-mail recipient, even if that person is not a Splunk Enterprise user. Keep in mind, though, that the token retrieves information from only the first search result. This can lead to problems if multiple users violate the policy, because only the first user would be reported. Set the Trigger option discussed previously to For each result to ensure you get notifications for all violating users.
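The first-result behavior is easy to see in a small Python sketch of token expansion. This is only a model of the behavior described above, not Splunk's own code. The function expand_tokens is hypothetical, and replacing an unknown field with an empty string is an assumption.

```python
import re

# Matches $name$ and $result.<fieldname>$ tokens.
TOKEN_PATTERN = re.compile(r"\$(name|result\.(\w+))\$")


def expand_tokens(template, search_name, results):
    """Replace tokens in `template`, reading fields from the first result only."""
    first = results[0] if results else {}

    def replace(match):
        if match.group(1) == "name":
            return search_name
        # Assumption: a field missing from the first result becomes "".
        return str(first.get(match.group(2), ""))

    return TOKEN_PATTERN.sub(replace, template)


results = [
    {"user": "rblack", "count": "7"},
    {"user": "nick", "count": "6"},  # never reaches the subject line
]
subject = expand_tokens(
    "Splunk Alert: $name$ ($result.user$)",
    "Excessive Usage Policy Violation",
    results,
)
print(subject)  # Splunk Alert: Excessive Usage Policy Violation (rblack)
```

The second row is silently ignored, which is why triggering once per result matters when several users can match at the same time.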

Here is our complete alert:

Finished e-mail alert

Note that, based on the settings we've chosen next to Include, not only will a link to the alert and to the search results be included in the e-mail, it will also include the search results themselves inline and in an attached .CSV file.

Built-in webhook alert action workflow

After the e-mail is sent, notifying the user's manager or a compliance officer of the violation, the second thing we want to do is automatically lock the user's account. Using built-in support for webhooks in Splunk Enterprise 6.3, we can send an HTTP POST request to a REST endpoint to cause it to lock a given user's account.

The webhook alert action mechanism is simple: When an alert is triggered, the webhook makes an HTTP POST request to the URL you specify. The webhook passes JSON-formatted information about the alert in the body of the POST request.

When should you consider using the Splunk Enterprise built-in webhook alert action? There are a few situations where it's ideal:

  • When the target application is flexible enough to take a defined JSON payload and transform it in a way that's useful to you. We talk more about the format of the JSON payload later in this section. 
  • When you have full control over the target application and can modify it to accept this predefined payload. In our case, since we developed the PAS app and all its supplemental apps, we could modify them as needed to accept the JSON payload and take the appropriate action.
  • When the action does not require any user configurable parameters. At this time, the JSON payload format can't be changed.

SHIP: You can perform a different action upon triggering the alert, such as sending an SMS text message, making an alert message appear in a chat room, or posting a notification to a webpage. For maximum customizability, we recommend you build or reuse a custom alert action packaged as an add-on that is specific to the API of the service on which you want the alert action triggered.

Of course, whether you use a webhook or create a custom alert action is up to you. If you already subscribe to a webhook-enabled service such as Zapier or Twilio, or you're already running an internal webhook-compatible app, it's probably simplest to take advantage of the built-in webhook functionality in Splunk Enterprise rather than creating a custom alert action. Unlike an e-mail alert, before you can use the webhook alert action, you must configure the app or service that will be the recipient of the alert to accept a JSON-formatted data packet. If you're using a service like Zapier or Twilio, you're guided through the process of setting up a webhook input, and then given a URL to enter when you set up your webhook alert action. If you're configuring a service manually, you'll need to know the contents and structure of the JSON sent with each alert.

Webhooks are great as a quick solution for a service you already use, but building your own custom alert action gives you the most customization options and is the more flexible alternative.

For the PAS app, we first added two new endpoints to pas_simulated_users_addon to simulate the account lock and unlock states. To pas_simulated_users_addon/user-api/app.py, we added the following:

# Assumes app.py already imports sys and Flask's abort and jsonify,
# and defines app and the users list.
@app.route('/user_list/api/v1.0/users/lock/<string:user_name>', methods=['POST'])
def lock_user_account(user_name):
    # Look up the user first so an unknown name yields a 404, not an IndexError.
    user = [t for t in users if t['UserName'] == user_name]
    if len(user) == 0:
        abort(404)
    try:
        user[0]["AccountStatus"] = 'Account Locked'
        return jsonify({'user': user[0]["AccountStatus"]})
    except Exception as e:
        sys.stderr.write("ERROR Error sending message: %s\n" % e)
        return jsonify({'Error': "Account lock attempt failed!"})


@app.route('/user_list/api/v1.0/users/unlock/<string:user_name>', methods=['POST'])
def unlock_user_account(user_name):
    # Same lookup-then-404 order as the lock endpoint.
    user = [t for t in users if t['UserName'] == user_name]
    if len(user) == 0:
        abort(404)
    try:
        user[0]["AccountStatus"] = 'Account Unlocked'
        return jsonify({'user': user[0]["AccountStatus"]})
    except Exception as e:
        sys.stderr.write("ERROR Error sending message: %s\n" % e)
        return jsonify({'Error': "Account unlock attempt failed!"})

This Python code defines the following endpoints, where <username> indicates a username in the PAS simulated user database:

  • Locks the user's account: http://localhost:5000/user_list/api/v1.0/users/lock/<username>
  • Unlocks the user's account: http://localhost:5000/user_list/api/v1.0/users/unlock/<username>

If the action is successful, the endpoint returns the following JSON: {"user": "Account Locked"} or {"user": "Account Unlocked"}, respectively. If the specified user does not exist, the following JSON is returned: {"error": "Not found"}.
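For testing the endpoints by hand, a client call could look something like the following Python 3 sketch. It only builds the POST request and sends nothing until you call urlopen. The helper build_lock_request is hypothetical, and the base URL assumes the simulated users add-on is running locally on port 5000, as in the endpoint list above.

```python
import urllib.parse
import urllib.request

# Assumed base URL of the locally running simulated users add-on.
BASE_URL = "http://localhost:5000/user_list/api/v1.0/users"


def build_lock_request(user_name, action="lock"):
    """Build (but do not send) a POST request to the lock or unlock endpoint."""
    if action not in ("lock", "unlock"):
        raise ValueError("action must be 'lock' or 'unlock'")
    # Percent-encode the user name so it is safe as a URL path segment.
    quoted_name = urllib.parse.quote(user_name, safe="")
    url = "%s/%s/%s" % (BASE_URL, action, quoted_name)
    return urllib.request.Request(url, data=b"", method="POST")


request = build_lock_request("rblack")
print(request.get_method(), request.full_url)
# To send it against a running add-on:
#     urllib.request.urlopen(request)
```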

For a webhook, the POST request's JSON data payload includes the search ID (SID) for the search that triggered the alert, the search owner and app, and the first results row from the search that triggered the alert. Here's an example JSON data packet:

{
    "result": {
        "user": "nick",
        "client_ip": "10.4.0.28",
        "status": "failure",
        "reason": "user-initiated"
    },
    "sid": "scheduler__admin__search__W2_at_1427942640_178",
    "results_link": "http://splunk.local:8000/app/search/@go?sid=scheduler__admin__search__W2_at_1427942640_178",
    "search_name": "Failed_Login_Attempts",
    "owner": "admin",
    "app": "search"
}

Be aware that the contents of the "result" key will vary, depending on the search that triggers the alert action. However, the "result" key will always be followed by the "sid", "results_link", "search_name", "owner", and "app" keys, in that order. In the case of the PAS simulated users add-on, the only thing crucial for our purposes is the username included in the endpoint. The content of the JSON packet is ignored.
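If you are writing your own receiver, the first step is usually to parse and sanity-check this payload. The following Python sketch does that with the standard json module. The function parse_webhook_payload is hypothetical, and treating all six keys as required is an assumption based on the example packet above.

```python
import json

# Top-level keys seen in the example packet; assumed to be always present.
REQUIRED_KEYS = ("result", "sid", "results_link", "search_name", "owner", "app")


def parse_webhook_payload(body):
    """Decode a webhook POST body and check that the expected keys exist."""
    payload = json.loads(body)
    missing = [key for key in REQUIRED_KEYS if key not in payload]
    if missing:
        raise ValueError("payload is missing keys: %s" % ", ".join(missing))
    return payload


sample_body = json.dumps({
    "result": {"user": "nick", "client_ip": "10.4.0.28"},
    "sid": "scheduler__admin__search__W2_at_1427942640_178",
    "results_link": "http://splunk.local:8000/app/search/@go?sid=example",
    "search_name": "Failed_Login_Attempts",
    "owner": "admin",
    "app": "search",
})
payload = parse_webhook_payload(sample_body)
# Fields inside "result" depend on the search, so read them defensively.
print(payload["search_name"], payload["result"].get("user"))
```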

Once we've set up the receiver of the alert data, the process for creating a webhook alert action is similar to that for an e-mail alert. In fact, we can simply add the webhook alert action to the e-mail alert action we just created. Go to the search app within Splunk Enterprise, click the Alerts button in the top navigation bar, find the alert we created, and click Edit > Edit Actions. You'll see the existing e-mail alert action. Now, from the Add Actions menu, choose Webhook.

Webhook alert action window

Enter the URL of the web resource that will be receiving the alert data. Recall from the e-mail alert action setup that we used tokens to indicate where to insert incident-specific data. They work here, too, so for the PAS app, we entered the following endpoint URL into this field: http://localhost:5000/user_list/api/v1.0/users/lock/$result.user_id$. When the alert is triggered, the $result.user_id$ token will be replaced with the appropriate username, sending an HTTP POST request to the lock endpoint for that user.

Each time a webhook alert is triggered, Splunk Enterprise makes an HTTP POST request to the URL you entered. The POST request carries the data payload to deliver to the URL. For more information about using webhook alert actions, see "Use a webhook alert action" in the Splunk documentation, "Alerting Manual."

DEV: The webhook functionality is built into Splunk Enterprise as an app, and is located here: $SPLUNK_HOME/etc/apps/alert_webhook. If you are so inclined, you can clone it, and then modify it however you want. For example, you might choose to do this if your application accepts a specific payload that does not match the Splunk Enterprise default.

SEC: When setting up webhook alert actions, keep in mind that the built-in webhook functionality only supports plain, no-auth HTTP communications.
 

Custom alert action authoring workflow

The next logical step in alerting was also introduced in Splunk Enterprise 6.3: custom alert actions, made possible by a new custom alert action framework. Custom alert actions are seamlessly integrated into the alert workflow. When creating an alert, users simply choose a custom alert action from the Add actions menu. As a developer, you specify what input parameters users can configure. You may choose to do this if your application accepts a specific payload that does not match the Splunk default.

Custom alert action

Custom alert actions, like alerts, can be access control list (ACL)-controlled, packaged, and distributed within apps, but they are fully modular. That is, they can be reused by other apps, or even invoked on demand when performing searches. To demonstrate this, the custom alert action we develop in this chapter is completely separate from the PAS app. Once installed, it is available to the PAS app on a user's Splunk Enterprise instance (and, in fact, to all apps to which the user has permissions), but it requires a separate, additional install.

DEV: For an example of the kinds of alert actions that you can create, check out the ones built into Splunk Enterprise. In Splunk Enterprise 6.3 or later, go to Settings > Alert actions.
 

The Terminated Employee Access JIRA alert action is triggered when a terminated employee accesses a file on the network. At that instant, we want to create a new JIRA issue and assign it to a manager or compliance officer, and also lock the terminated user's account.

Note: Since we've already covered how we went about locking the user's account in the previous section, we won't cover it again here.

SHIP: If you want to use the JIRA alert action in your Splunk app, we've made it available independently from the main PAS reference app. Find it on Splunkbase.
 

The Splunk documentation contains detailed instructions for creating a new custom alert action in the "Developing Views and Apps for Splunk Web Manual," starting with the topic "Custom alert actions overview." We won't go into quite as much detail here, but we'll talk about our experience with the process, and cover a few gotchas we discovered along the way.

DEV: We've made it easy to create a new custom alert action by including an alert action template with the PAS app download package. In the spikes folder, open the alertaction_app_template folder and you'll see all of the files we talk about in this section, with the correct file structure already in place. All you have to do is fill in the blanks.

The custom alert actions documentation lists the following basic steps for creating a new custom alert action. As mentioned in the docs, you can follow them in any order, but we proceeded pretty much as listed:

  • Create configuration files. These are the .conf (and .conf.spec) files that define the configuration of the custom alert action and the app that implements it. The .conf files are where attributes and settings are stored. The .conf.spec files are where the settings are documented, and they also serve as a template against which the .conf files are validated upon startup. In addition, the metadata file default.meta defines permissions and scope.
  • Create a script. The .py script executes the custom alert action, which in our case is to connect to JIRA and create a new issue using the settings defined in the .conf files. The script follows a workflow that gets information about the triggered alert and then runs the alert action.
  • Define a user interface. This is an HTML fragment that defines the appearance of the alert action's input controls. The controls are contained in Bootstrap control groups.
  • Add optional components. These can include .spec files that describe custom parameters in your .conf files, an app setup file (setup.xml) that populates global configuration settings,  a .conf file with validation rules (restmap.conf), an endpoint for confidential information storage (storage/passwords), and an icon file. Of these, our JIRA custom alert action only has a setup.xml file. We explain why later, in "Define a user interface" and "Storing user passwords securely."

Create configuration files

Recall that the JIRA alert action's .conf (and .conf.spec) files define the attributes and settings of the alert action. We need to figure out what those fields are, but first let's take a look at the file structure of the "jira_alerts" add-on. As we start talking about different files and their locations, you'll want to refer back to the following to help orient yourself:

[jira_alerts]
├── appserver
│   └── static
│       ├── appIcon.png
│       └── jira_alert_action.png
├── bin
│   ├── generate_jira_dialog.py
│   ├── jira_alerts_install_endpoint.py
│   ├── jira_helpers.py
│   └── jira.py
├── default
│   ├── alert_actions.conf
│   ├── app.conf
│   └── data
│       └── ui
│           └── alerts
│               └── jira.html
├── metadata
│   └── default.meta
└── README
    ├── alert_actions.conf.spec
    └── savedsearches.conf.spec

In addition to the directories listed here, the local directory will be created at the same level as the default directory when users add their first JIRA alert action to an alert.

alert_actions.conf

By inspecting the Atlassian API that lets you remotely trigger issue creation in JIRA, we identified several attributes that we'll need from the user (and some that are optional) before we can automatically create a new JIRA issue. First, we identified the following three JIRA database-specific parameters:

  • The address of the JIRA server.
  • The JIRA username under which to file issues.
  • The JIRA user's password.

These three custom parameters, because they represent global values but can still be changed by the user, are specified in a stanza corresponding to the alert action within the alert_actions.conf.spec file (in the README directory) and created and stored in the alert_actions.conf file (in the local directory) as alert actions are added by users. (The only exception, in our case, was passwords, which our developers stored encrypted in passwords.conf using a custom Python script and the method discussed in "Storing user passwords securely.") Custom parameters are named using the form param.[param_name]. Our JIRA alert action's alert_actions.conf.spec file appears as follows:

[jira]
param.jira_url = <string>
param.jira_username = <string>
param.jira_password = <string>

The parameters are assigned to a data type in the alert_actions.conf.spec file. When Splunk Enterprise starts, it validates local/alert_actions.conf against alert_actions.conf.spec to ensure the correct value types have been specified.

If we'd wanted to, we could have entered values for any of these parameters inside a corresponding stanza in alert_actions.conf to preset the fields for users since these values are typically assigned on a per-database basis. Admins will want to do this if users will always be using the same JIRA URL, username, and password to create new issues using the alert action.

All custom alert actions are required to have several parameters defined. These values are specified and stored in the alert_actions.conf file within the default directory. Our JIRA alert action's alert_actions.conf file looks like this:

[jira]
is_custom = 1
disabled = 0
label = JIRA
description = Opens an Issue in JIRA
icon_path = jira_alert_action.png
payload_format = json

These parameters are all optional, except for the is_custom parameter. However, we recommend you include all of them in the default directory's alert_actions.conf file. Notice that there are no parameters that start with param. because none of them are custom parameters.

savedsearches.conf

In addition to the three custom parameters that are specific to a JIRA database (stored in local/alert_actions.conf) and the parameters common to all alert actions (stored in default/alert_actions.conf), we'll also need the following per-alert action information:

  • The JIRA project key (short name of the JIRA project) under which to create issues.
  • A summary (title) for each new issue created.
  • A description of each new issue created.
  • The JIRA issue type (Task, Bug, Documentation, and so on) to assign to each new issue.
  • An assignee for each new issue.

Because these values are going to be different every time a user creates a new JIRA alert action, they're stored as custom attributes in the savedsearches.conf file in the local directory of the app where you'll be using the alert actions—the search app, more than likely. The savedsearches.conf file is compared to the savedsearches.conf.spec file in the README directory upon startup to check its syntax. Our JIRA alert action's savedsearches.conf.spec file appears as follows. The first setting (action.jira) is a Boolean value that indicates whether the alert action is enabled:

  # JIRA alert settings
  action.jira = [0|1] 
  action.jira.param.project_key = <string>
  action.jira.param.summary = <string>
  action.jira.param.description = <string>
  action.jira.param.issue_type = <string>
  action.jira.param.assignee = <string> 

Note that the custom attributes are named using the form action.[stanza_name].param.[param_name], where [stanza_name] represents the name of the stanza in alert_actions.conf, and [param_name] represents the name of the parameter in alert_actions.conf. This is the standard format for these parameters, as specified in the Splunk documentation.
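As an illustration of this naming form, the following Python sketch pulls one action's custom parameters out of a savedsearches.conf-style stanza. It uses the standard configparser module, which is an assumption of convenience: Splunk .conf files are not guaranteed to be fully INI-compatible. The helper action_params and the sample text are hypothetical.

```python
import configparser

SAMPLE_CONF = """
[Terminated Employee Access]
action.jira = 1
action.jira.param.project_key = SUP
action.jira.param.summary = Terminated Employee Access: $result.user_id$
action.jira.param.issue_type = Incident
"""


def action_params(conf_text, alert_name, action_name):
    """Return {param_name: value} for action.<action_name>.param.* attributes."""
    parser = configparser.ConfigParser(interpolation=None, delimiters=("=",))
    parser.optionxform = str  # keep attribute names exactly as written
    parser.read_string(conf_text)
    prefix = "action.%s.param." % action_name
    stanza = parser[alert_name]
    return {
        key[len(prefix):]: value
        for key, value in stanza.items()
        if key.startswith(prefix)
    }


print(action_params(SAMPLE_CONF, "Terminated Employee Access", "jira"))
```

Note that action.jira itself, the enabled flag, is not returned, because it has no param. segment.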

If you, or an admin, want to set any custom parameters to default values that will be preset every time a user creates a new alert, assign them in the alert_actions.conf file rather than savedsearches.conf, and place the file in the local directory.

When a user uses an alert action for the first time, a local directory is created at the same level as default, and a new copy of savedsearches.conf is generated if they don't already exist. Within the savedsearches.conf file, a new stanza is created using the name the user gave the alert. All of the fields described above, plus several more that are necessary for an alert action, are stored within the stanza. Every time a user creates a new alert using the alert action, its settings are stored in a new stanza.

The following stanza was created within savedsearches.conf when we attached the JIRA alert action to an alert:

[Terminated Employee Access]
action.email.pdf.footer_enabled = 1
action.email.pdf.header_enabled = 1
action.email.pdf.html_image_rendering = 1
action.email.reportServerEnabled = 0
action.email.useNSSubject = 1
action.jira = 1
action.jira.param.description = A terminated employee with username $result.user_id$ accessed internal files $result.count$ times in the last hour.
action.jira.param.issue_type = Incident
action.jira.param.password = dfdskl
action.jira.param.priority = Critical
action.jira.param.project_key = SUP
action.jira.param.summary = Terminated Employee Access: $result.user_id$
action.jira.param.username = user123
alert.suppress = 0
alert.track = 0
counttype = number of events
cron_schedule = 0 6 * * 1
dispatch.earliest_time = -1w
dispatch.latest_time = now
enableSched = 1
quantity = 1
relation = greater than
request.ui_dispatch_app = search
request.ui_dispatch_view = search
search = | datamodel ri_pas_datamodel Terminated_Access search | stats count by user_id

Notice that the search trigger is listed at the very end of the stanza. Be aware that attributes in savedsearches.conf take precedence over global settings in alert_actions.conf on a per instance basis.

app.conf

The app.conf file maintains the state of an app in Splunk Enterprise, plus it enables customization of aspects of an app, such as custom alert actions. When creating a modular alert action like ours—an alert action that is effectively stand-alone, and that exists in its own add-on bucket—the app.conf contents are minimal. Here is the entire contents of the JIRA alert action's app.conf file:

#
# Splunk app configuration file
#
[install]
is_configured = 0
[ui]
is_visible = false
label = JIRA Ticket Creation
[launcher]
author = 
description = 
version = 1.0
[credential::jira_password:]

[package]
id = jira_alerts

SEC: The credential line in the app.conf file was added later in the development process, and has to do with encrypting user credentials (in this case, JIRA passwords). For more information, see "Storing user passwords securely."

The alert_actions.conf and app.conf files are the only ones required when you create a new alert action. The .spec files are not required, but we highly recommend that developers include them. To find out about the other, optional configuration files, see the Create custom alert configuration files topic in Splunk documentation.

default.meta

The default.meta file contains ownership information, read and write controls, and export settings for alert actions. Each app or add-on has its own default.meta file, which is stored in the metadata directory.

The contents of the JIRA alert action's default.meta file are:

[]
access = read : [ * ], write : [ admin ]
 
[alert_actions/jira]
export = system           
 
[alerts]
export = system
 
[restmap]
export = system

The first pair of lines sets the access controls. Setting read to * allows all users to read the alert action's contents. Setting write to admin allows only Splunk Enterprise administrators to share objects into the alert action.

The other pairs of lines define settings for exporting the alert action to other apps and add-ons. Setting export to system for each of the contexts inside the brackets makes them each available in all apps.

For all the details about assembling the default.meta file, see the fifth step in the building apps documentation, "Set permissions."

Create a script

The next step was to create a Python script to execute the custom alert action. We want the script to connect to JIRA and create a new issue using the settings stored in the savedsearches.conf file. Custom alert action scripts follow a workflow that gets information about the triggered alert and then runs the alert action.

Our JIRA alert action script (jira.py in the bin directory; the other scripts in that directory are discussed in the next section) follows the typical script workflow, as described in the "Create a custom alert action script" topic in the Splunk documentation:

  • Check the execution mode, based on command line arguments. Specifically, when the alert action is triggered, it runs the script with the --execute argument, which indicates to our script that it should do its thing.
  • Read configuration payload from stdin. In our case, the payload is a data packet in JSON format with properly formatted attribute-value pairs. Within the payload, the "configuration" attribute is set to a value that contains the appropriate JIRA values. That is, it contains the values that we specified in savedsearches.conf and any the user entered when setting up the alert action, with tokens replaced by actual values. To illustrate what we mean, here's a sample JSON payload. Compare the contents of the "configuration" attribute to the stanza from savedsearches.conf from the previous section.
{
  "app": "pas_ref_app",
  "owner": "admin",
  "results_file": "C:\\Program Files\\SplunkBeta\\var\\run\\splunk\\dispatch\\scheduler__admin_cGFzX3JlZl9hcHA__RMD5de437274897e69c9_at_1436395080_45\\per_result_alert\\tmp_2.csv.gz",
  "results_link": "http:\/\/localhost:8000\/app\/pas_ref_app\/search?q=%7Cloadjob%20scheduler__admin_cGFzX3JlZl9hcHA__RMD5de437274897e69c9_at_1436395080_45%20%7C%20head%203%20%7C%20tail%201&earliest=0&latest=now",
  "server_host": "SPLUNKPC",
  "server_uri": "https:\/\/127.0.0.1:8089",
  "session_key": "Ls2dhEfbOVo3j52MPF4v82bglhpT7QUnFERZhcfB6NHYj6m^4Rzpr6VXln2ZTlnFSXpMxburc_n42TVWxZ5NHvAi3D_q12a_iZbhZNfJmlcK^0x^4qSzfM1nFGcIt07j2y1z4KRRKo",
  "sid": "scheduler__admin_cGFzX3JlZl9hcHA__RMD5de437274897e69c9_at_1436395080_45",
  "search_name": "Terminated Employee Access",
  "configuration": {
    "description": "The user rblack has accessed files 7 times in the past hour.",
    "issue_type": "3",
    "jira_url": "http:\/\/myjiraserver:8080",
    "jira_username": "theuser",
    "project_key": "SIM",
    "summary": "Terminated User Access: rblack"
  },
  "result": {
    "count": "7",
    "user_id": "rblack"
  }
}
  • Run the alert action. Our script then gets the value of that "configuration" attribute, and parses it. First, it gets the URL of the JIRA server ('jira_url'), and then tacks on the appropriate RESTful endpoint (to create a new issue in JIRA, it's "rest/api/latest/issue"). Now it knows where to send the data. Then, the script assembles a JSON data packet in the appropriate format for JIRA. Finally, the script creates an outbound request object and sends it to the JIRA endpoint.
  • Terminate. It's a good idea to account for any anomalies that might occur before terminating. For example, our script accounts for an incorrect command line argument, receiving an HTTP status code of 200, and generating an exception by printing an error message. The script then terminates using exit().

You place the script in your app's bin directory.
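
The workflow above can be sketched roughly as follows. This is a minimal illustration, not the actual jira.py: the function names build_issue_request and main are hypothetical, the sketch uses Python 3 syntax, and the request body assumes the standard JIRA REST "create issue" field layout. Actually sending the request is left out so that the flow can be exercised without a JIRA server.

```python
import json
import sys


def build_issue_request(payload):
    """Return the (url, body) pair for a JIRA "create issue" request."""
    config = payload["configuration"]
    # Tack the issue-creation endpoint onto the configured server URL.
    url = config["jira_url"].rstrip("/") + "/rest/api/latest/issue"
    # Assumed field layout for the JIRA REST "create issue" call.
    body = json.dumps({
        "fields": {
            "project": {"key": config["project_key"]},
            "issuetype": {"id": config["issue_type"]},
            "summary": config["summary"],
            "description": config["description"],
        }
    })
    return url, body


def main(argv, stdin):
    """Run the workflow and return a process exit code."""
    # 1. Check the execution mode.
    if len(argv) < 2 or argv[1] != "--execute":
        print("FATAL Unsupported execution mode (expected --execute)",
              file=sys.stderr)
        return 1
    # 2. Read the configuration payload from stdin.
    try:
        payload = json.load(stdin)
        url, body = build_issue_request(payload)
    except (ValueError, KeyError) as e:
        print("ERROR Invalid payload: %s" % e, file=sys.stderr)
        return 2
    # 3. Run the alert action (the HTTP POST itself is omitted in this sketch).
    print("INFO Would POST %d bytes to %s" % (len(body), url), file=sys.stderr)
    # 4. Terminate with a success code.
    return 0
```

A real script would typically end by calling sys.exit(main(sys.argv, sys.stdin)) so that the exit code reaches splunkd.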

DEV: One of our developers ran into some issues with the Python script during development and wasn't immediately sure how to troubleshoot them. He discovered a solution made possible by the fact that custom alert action executions are logged (at both the splunkd process level and the script level) to the internal index. He first added some exception handling, with print statements that write to stderr, inside a try-except statement such as the following:
# Excerpt from inside a function in the script (Python 2 syntax).
# create outbound request object
try:
    headers = {"Content-Type": "application/json"}
    result = requests.post(url=jira_url, data=body, headers=headers, auth=(username, password))
    # log the numeric status code and the response body separately
    print >>sys.stderr, "INFO Jira server HTTP status= %s" % result.status_code
    print >>sys.stderr, "INFO Jira server response: %s" % result.text
except Exception, e:
    print >>sys.stderr, "ERROR Error sending message: %s" % e
    return False

Here's the query he used to check the status of his previous alert action executions, where <alert_action_name> indicates the name of the custom alert action:

index=_internal sourcetype=splunkd component=sendmodalert action="<alert_action_name>" 

Define a user interface

The final main step is to create a user interface. This is the interface that users will see when they choose the JIRA custom alert action from the available alert actions menu. In our case, it also included a setup interface that users must complete before using the alert action.

This step went through a major iteration before the final version due to the requirement that the interface be written in HTML. At first, our developers had hoped to use the classic web development combination of HTML and JavaScript to create an interface that could dynamically change according to the options users choose. For example, though there are default JIRA issue types, it's most common for JIRA administrators and project managers to create their own issue types. Plus, issue types can vary depending on the project in which a new issue is created. However, the custom alert action UI doesn't support JavaScript, so the interface had to be static HTML.

With JavaScript out of the picture for this phase of the alert action development, our developers came up with a configuration pane that looked like this:

Old configuration pane

Note that, to specify values for project and issue type, users would have to enter the exact values or be faced with error messages and unpredictable behavior. We could have also used pop-up menus for the values, but, again, they would have had to be hard-coded.

That's all fine for values that will be different with each alert, such as the issue summary and description, but we were still not satisfied with the limitation of hard-coded HTML for the UI. That's when our developers realized that, though JavaScript was out of the question, Python scripting was clearly something that is supported in this context.

So, we refactored this portion of the JIRA alert action setup and configuration workflow. We moved the JIRA server, username, and password entry to a new setup page for enabling the JIRA alert action, and then updated the UI portion of the custom alert action workflow to invoke a new Python script that imports values from the JIRA server using the user-supplied credentials.

To be able to choose existing project, issue type, and priority values when creating a new JIRA alert, users must first invoke the setup page. To do this, go to Settings > Alert Actions. Then click Setup JIRA Ticket Creation.

Alert Actions pane

The new setup page is shown here. It's defined in the setup.xml file in the default directory.

JIRA alert action setup page

Notice that the page has three sections:

  • The Server section is where users enter the JIRA server's URL, their username, and their password. The password is stored securely using the method described in the section "Storing user passwords securely," later in this chapter.
  • The Import Projects and Issue Types section is where users can choose whether to have Splunk Enterprise contact the JIRA server and fill in projects and issue types dynamically when setting the alert action. This option is off by default.
  • The Default Settings section is where users can set the default project, issue type, and priority for new JIRA alerts. When creating new alerts, users can always change the values from the defaults.

Once they've filled in their JIRA credentials on the setup page, users have the option to import project names and issue types from the JIRA server. If they choose this option, the Python script generate_jira_dialog.py (along with the script jira_helpers.py) generates a static HTML page using a template and the values it retrieves from the JIRA server using the user-provided credentials.

The following is the new configuration pane for adding trigger actions. In this case, the alert action has already been set up with a valid JIRA server, username, and password, so the Project, Type, and Priority settings are now pop-up menus that are prefilled with values from the server.

Custom alert action

Be aware that the markup you use must be consistent with Bootstrap version 2.3.2 (just like the rest of the Splunk Enterprise UI). You add controls within Bootstrap control groups. Match the name attribute for each <input> tag with the parameters defined in the savedsearches.conf.spec file. The value that the user enters into the text input (or chooses from a pop-up menu) ends up in savedsearches.conf when the user saves the alert action. For example, this control group contains the control where users enter their JIRA username:

<div class="control-group">
    <label class="control-label" for="username">Username</label>
    <div class="controls">
        <input type="text" name="action.jira.param.username" id="username" />
        <span class="help-block">Enter your JIRA username.</span>
    </div>
</div>

DEV: Don't forget to validate user input! Use the validation stanzas in restmap.conf. For example, the following stanza verifies whether a URL is valid, and displays a message if it's not:
[validation:savedsearch]
action.webhook.param.url = validate( match('action.webhook.param.url', "^https?://[^\s]+$"), "Webhook URL is invalid")
For more information, search for "Validation stanzas" on the restmap.conf reference page.

The script builds the HTML file, gives it the same base name as the main script file (jira.py) with an ".html" extension (jira.html), and then places it within the app's directory as follows: /local/data/ui/alert_actions/
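
As a rough illustration of how a script can turn values retrieved from the JIRA server into static markup, here is a minimal sketch that renders a Bootstrap control group containing a pop-up menu. It is not the actual generate_jira_dialog.py: the render_select function, the template text, the parameter name action.jira.param.project_key, and the sample project names are all assumptions made for this example, and it uses Python 3 syntax.

```python
from html import escape
from string import Template

# Template for one Bootstrap control group wrapping a <select> element.
CONTROL_GROUP = Template("""<div class="control-group">
    <label class="control-label" for="$field_id">$label</label>
    <div class="controls">
        <select name="$name" id="$field_id">
$options
        </select>
    </div>
</div>""")


def render_select(name, field_id, label, choices):
    """Render a control group; choices is a list of (value, text) pairs."""
    # Escape every value so that names containing &, <, or quotes stay valid HTML.
    options = "\n".join(
        '            <option value="%s">%s</option>' % (escape(value), escape(text))
        for value, text in choices
    )
    return CONTROL_GROUP.substitute(
        name=escape(name),
        field_id=escape(field_id),
        label=escape(label),
        options=options,
    )


# Hypothetical project list, standing in for values fetched from JIRA.
projects = [("SIM", "Security Incidents"), ("OPS", "Ops & Support")]
print(render_select("action.jira.param.project_key", "project_key",
                    "Project", projects))
```

The name attribute on the generated control must still match a parameter defined in savedsearches.conf.spec, exactly as with a hand-written input.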

In case users opt out of retrieving project, type, and priority values from the JIRA server, we also created an alternate, static HTML page that allows them to enter values manually. The file, just like the one our script creates, is named jira.html, and is located at /default/data/ui/alert_actions/.

Testing

Throughout the alert action development process, we engaged in rigorous unit testing, which we wholeheartedly recommend before releasing even something as seemingly simple as an e-mail, webhook, or custom alert action into the wild. This included:

  • Observe real-time notification: In our case, it's relatively easy to test these alerts. Simply perform the actions that we know should trigger them. For example, perform the action that is intended to invoke the JIRA alert, and then check JIRA to be sure that a new issue has been created in the project you specified. We were able to do this relatively easily since Atlassian provides JIRA on a trial basis. We installed it in our sandbox environment and tested it with relatively little added effort. The drawback is that this approach suits exploratory testing, not automated testing.
  • On-demand invocation: Invoke alerts using the search language. You can do this using the sendalert command. Its syntax is as follows, where <alert_action> indicates the alert action to test, <action_specific_params> indicates any alert action-specific parameters that must be set, and <value> indicates the parameter's value:
    sendalert <alert_action> param.<action_specific_params>="<value>" 
    For example, here's a search query that invokes the JIRA alert action we created previously in this chapter.
    | sendalert jira param.description="TEST RUN" param.issue_type="3" param.project_key="SIM" param.summary="TESTING ONLY"
  • Subcutaneous testing: This type of testing is ideal for automated testing. It indicates a type of testing that doesn't rely on visually verifying in the user interface that things are working correctly. For example, consider the alert actions that involve locking user accounts. Regardless of the type of system that is administering your accounts, there are likely many more ways to verify whether an account has been locked than by logging in using the UI to visually confirm it. For example, can you automate the process of querying an LDAP or Active Directory server? On the Splunk Enterprise side, if the alert action results in data generation or changes the state of a Splunk index or some other internal file, you can execute a Splunk query or call to the REST API to confirm whether something significant comes back. 
  • Manual invocation: Shift test focus to the alert script by taking the alert trigger out of the equation. For example, try creating a fake payload, and then piping it directly to the alert script with a Python command such as the following:
    $ cat fakepayload.json | splunk cmd python myalertscript.py 
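
One way to produce the fake payload for manual invocation is a short helper script like the following sketch. The make_fake_payload function is hypothetical, the field names mirror the sample payload shown earlier in this chapter, and the session key is a placeholder, so a script that calls back into Splunk Enterprise would need a real one. The sketch uses Python 3 syntax.

```python
import json


def make_fake_payload(configuration, result=None, search_name="Test Alert"):
    """Build a minimal payload shaped like the one an alert action receives."""
    return {
        "app": "pas_ref_app",
        "owner": "admin",
        "server_host": "localhost",
        "server_uri": "https://127.0.0.1:8089",
        "session_key": "FAKE_SESSION_KEY",  # placeholder, not a valid key
        "sid": "fake_sid",
        "search_name": search_name,
        "configuration": configuration,
        "result": result or {},
    }


payload = make_fake_payload(
    configuration={
        "description": "TEST RUN",
        "issue_type": "3",
        "jira_url": "http://myjiraserver:8080",
        "jira_username": "theuser",
        "project_key": "SIM",
        "summary": "TESTING ONLY",
    },
    result={"count": "7", "user_id": "rblack"},
)

# Write the file that is piped to the alert script.
with open("fakepayload.json", "w") as f:
    json.dump(payload, f, indent=2)
```

If your script checks its execution mode, as ours does, remember to pass the --execute argument when you pipe the file to it.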

In addition to unit testing, we have enriched our set of acceptance tests to test the alerting scenarios end-to-end using Selenium test automation.

TEST: For more, be sure to check out our JIRA alert action testing README file: jira_alerts/bin/TEST_README.txt.
 

Problems encountered

Our developer encountered two significant problems, both stemming from being limited to static HTML when crafting the user interface. He also had some advice for setting up his testing environment.

No JavaScript support when creating custom alert action UI

When creating JIRA issues, input elements such as "issue type" are meant to be loaded in dynamically because JIRA users are able to create custom issue types at will.

At first, the lack of JavaScript support in the custom alert action UI made us think that we would have to be content with default JIRA issue types, which are built into all JIRA instances. However, doing so would have made the app more brittle: it could easily have broken if Atlassian decided to modify the structure of the default issue types. Developers would have to use the "swivel chair" approach and manually reproduce any custom issue types in the alert action's static HTML.

Our developer then realized that, even though JavaScript wasn't a possibility, Python scripting is clearly something that is supported in this context. So he refactored the UI portion of the custom alert action workflow to include the Python scripts mentioned above.

DEV: You may have to get creative to work around the lack of JavaScript support in the custom alert action UI.

 

Storing user passwords securely

Being limited to static HTML also means that we are limited to using HTTP basic access authentication. Something like OAuth2 would be preferable, but that would require the use of custom JavaScript, which is currently not supported. A consequence of using HTTP basic auth is that we must ask the user for a username and password. The password is masked in the UI, but Splunk Enterprise saves the input value as cleartext in alert_actions.conf. This is not good, for fairly obvious reasons. Thankfully, a workaround was crafted using Python scripting, the setup.xml file, and the Splunk Enterprise password endpoint to handle encryption of the user's password. In the custom alert action Python script, a GET request can be made to the following endpoint to retrieve the unencrypted password:

<SPLUNK_BASE_URL>/servicesNS/nobody/jira_alerts/storage/passwords/%3Ajira_password%3A?output_mode=json 

The "jira_password" parameter name is defined in the app.conf file. Most of the heavy lifting of obscuring user passwords is accomplished in the jira_alerts_install_endpoint.py script. It handles running the user-entered password through the Splunk Enterprise internal encryption mechanism, and then writing the result to the passwords.conf file. When the time comes to retrieve the stored password in clear form, the get_jira_password() method within the main jira.py script takes care of it.

DEV: Be aware that because the encrypted password is stored in app.conf, you can only use this method for global-level settings. You can't securely store credentials on a per-alert basis.
 

The GET call also needs an authorization header with a session key value that can be obtained directly from the JSON payload.
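
The retrieval step can be sketched as follows. This is an illustration rather than the actual get_jira_password() code: the two function names are hypothetical, the "Splunk <session_key>" authorization scheme and the clear_password response field are assumed from the Splunk REST API reference for storage/passwords (verify them against your version), and the sketch uses Python 3 syntax. It only builds the request and parses a response, so it makes no network call.

```python
import json
import urllib.request

# Endpoint path from the text above, relative to the Splunk management URI.
PASSWORD_PATH = ("/servicesNS/nobody/jira_alerts/storage/passwords/"
                 "%3Ajira_password%3A?output_mode=json")


def build_password_request(server_uri, session_key):
    """Build the GET request, authorized with the payload's session key."""
    url = server_uri.rstrip("/") + PASSWORD_PATH
    # Assumed scheme: splunkd accepts "Splunk <session_key>" as the header value.
    headers = {"Authorization": "Splunk " + session_key}
    return urllib.request.Request(url, headers=headers)


def extract_clear_password(response_text):
    """Pull the clear-text password out of the JSON response body."""
    entry = json.loads(response_text)["entry"][0]
    return entry["content"]["clear_password"]
```

In a script, server_uri and session_key would come from the "server_uri" and "session_key" attributes of the JSON payload, and the request would be sent with urllib.request.urlopen or an equivalent HTTP client.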

SEC: Use setup.xml and the storage/passwords endpoint to encrypt any passwords that you'll be requesting from users.
 

Setup advice

Our developer set up his testing environment by installing JIRA on his Windows machine. He had a few caveats to share:

  • Be sure to manually turn on Accept remote API calls in the system config. 
  • Create a project of the type that you want to add issues to. 
  • Create a non-admin test user in User Management. 
  • Assign that user to the project you created. 

To test that the JIRA REST endpoint (which, as we mentioned previously, is http://your_jira_server:8080/rest/api/latest/issue) worked as expected using basic authentication, our developer used Chrome's Postman application. It did!

What did we learn?

Here are the key lessons we have learned while creating alerts and building custom alert actions:

  • Alerts are actions that a search triggers based on the results of the search.
  • You can easily create a new alert just by saving your search as one.
  • You have a number of alert actions available to you, both built-in and custom-made.
  • You can attach e-mail and script actions to your alerts, or cause an alert notification to be added to the Triggered Alerts list in Splunk Enterprise.
  • You can use a webhook alert action to define a custom callback on a particular web resource. 
  • A webhook alert action simply sends an HTTP POST request containing a JSON data packet to an endpoint URL that you specify.
  • You can attach multiple actions to an alert.
  • You can set throttling controls to suppress an alert for a longer time window, which helps avoid alert flooding.
  • You can create your own alert actions using the Custom Alert Action Framework.
  • You create custom alert actions by creating and assembling the appropriate files, defining the right parameters, writing alert logic in Python, and defining a user interface.
  • Unit testing custom alert actions is essential, and is made easier by using the sendalert search command.
  • You can also test custom alert actions by observing a real-time notification that you initiate, through automated subcutaneous testing, or by manually sending a fake payload through your Python script.
  • You can package your custom alert actions as Splunk add-ons and distribute them through Splunkbase to a wider community.