Overview of SimData

SimData is a tool that generates event data from a simulation of a user-defined scenario. Instead of using a sample set of data that is repetitive and unrealistic, SimData allows you to generate a rich and robust set of events from real-world situations by mimicking how multiple systems work together and affect the performance of your system.

Let's say your system includes a web server and a database server. SimData simulations let you emulate possible outcomes when the database server is impacted by situations such as heavy web traffic, so that you can see how overall performance is affected. SimData can correlate the behavior of the web server and the database server so that when the database slows, the web server is also impacted.

Use SimData for:

  • App development. Use SimData to generate data with a specific format to test the different features of your app. SimData allows you to encode scenarios in the data to ensure that those patterns appear in your extractions.
  • Live demonstrations. SimData sends events in real time so that you can demonstrate plausible behavior. You can also change variables at runtime to demonstrate how an app responds to different behaviors. The result is a realistic demo, in real time.

SimData is implemented in Java with the Akka library. SimData is designed for stand-alone instances of Splunk Enterprise, loosely coupled clusters, and remote nodes.

The design goals of SimData include:

  • Using a simple description language with powerful primitives.
  • Allowing for simple and complex interactions between simulated entities.
  • Enabling flexible modeling capabilities.
  • Modeling each entity in a typical simulation as a specific object.
  • Emulating existing formats for log data.
  • Using messages between entities for communication, layered on the Akka message paradigm.

How SimData works

SimData models your environment as follows:

  • Systems are referred to as entities.
  • A bot is a running instance of an entity.
  • Bots use messages to communicate with each other.
  • Actions are triggered for conditions that you specify.
  • Events are emitted from the simulation. An event usually corresponds to a log entry.

For example, for a set of users interacting with a web server, which communicates to a backend database server, the interaction flow might look like this:

Figure: Flow of data between the user, web server, and database

The entities are the user, the database server, and the web server.

The simulation can model the interaction between multiple users, multiple databases, and multiple web servers. In this example, the model would have one entity for users ("User"), one entity for web servers ("WebServer"), and one entity for database servers ("Database"). The simulation would have one bot for each User, each WebServer, and each Database in the modeled network.
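
The distinction between entities and bots can be sketched in a few lines of Python. This is not SimData's description language; the Entity and Bot classes, the spawn method, and the bot counts are hypothetical names chosen only to illustrate that a small number of entity definitions can back many running bots.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Entity:
    """Definition of one kind of system (hypothetical stand-in, not SimData syntax)."""
    name: str
    _ids: count = field(default_factory=lambda: count(1), repr=False)

    def spawn(self) -> "Bot":
        # Each call creates one running instance (a bot) of this entity.
        return Bot(entity=self, bot_id=f"{self.name}-{next(self._ids)}")


@dataclass
class Bot:
    """A running instance of an entity."""
    entity: Entity
    bot_id: str


# Three entity definitions, as in the example above.
user = Entity("User")
web_server = Entity("WebServer")
database = Entity("Database")

# One bot per modeled user, web server, and database (counts are arbitrary).
bots = (
    [user.spawn() for _ in range(5)]
    + [web_server.spawn() for _ in range(2)]
    + [database.spawn() for _ in range(1)]
)

print(len(bots))  # 8 bots from 3 entity definitions
print([b.bot_id for b in bots if b.entity is web_server])
```

Here the model stays at three entities no matter how many users or servers the simulated network contains; only the number of bots changes.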

The messages are the requests and responses between the bots in the simulation.

As the simulation runs, you could simulate a flood of traffic to a web server by increasing the round-trip time for each request. The increased round-trip time can then trigger an action, such as a forced restart of the database. You would model this by updating the state on the bot that represents the overloaded database.
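
As a rough illustration of a condition that triggers an action, the following Python sketch models a single database bot whose round-trip time grows with load. The DatabaseBot class, the latency formula, and the threshold values are all assumptions made for this example, not part of SimData.

```python
from dataclasses import dataclass


@dataclass
class DatabaseBot:
    """State for one simulated database (names and thresholds are hypothetical)."""
    base_latency_ms: float = 20.0
    restart_threshold_ms: float = 200.0
    pending_requests: int = 0
    restarts: int = 0

    def round_trip_ms(self) -> float:
        # Assumed load model: latency grows linearly with requests in flight.
        return self.base_latency_ms * (1 + self.pending_requests)

    def receive_request(self) -> float:
        self.pending_requests += 1
        latency = self.round_trip_ms()
        # Condition -> action: a round trip above the threshold forces a
        # restart, modeled as a state update on this bot.
        if latency > self.restart_threshold_ms:
            self.restarts += 1
            self.pending_requests = 0
        return latency


db = DatabaseBot()
latencies = [db.receive_request() for _ in range(12)]
print(latencies)
print(db.restarts)  # 1
```

In this sketch the tenth request pushes the round trip past the threshold, the bot restarts, and latency for later requests drops back toward the baseline.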

The messages in the diagram are all internal to the simulation and are not visible outside of it. To produce visible output, you can add events to the WebServer and Database entity definitions so that you get logs that reflect the internal states of the bots. These events can be weblog entries from the web server, database logs from the database, and CPU load metrics that reflect the modeled load on the underlying servers.
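
The difference between internal messages and emitted events can be pictured with a short Python sketch. Everything here is hypothetical: the Request and WebServerBot classes, the events list standing in for an output destination, and the log line format are illustrative assumptions rather than SimData features.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    """Internal message between bots; it never leaves the simulation."""
    path: str


@dataclass
class WebServerBot:
    name: str
    events: list = field(default_factory=list)  # stands in for an output sink
    handled: int = 0

    def handle(self, msg: Request, timestamp: str) -> None:
        self.handled += 1
        # Emit an event that reflects internal state, shaped like a weblog
        # line. The format is illustrative only.
        self.events.append(
            f'{timestamp} {self.name} "GET {msg.path}" 200 total={self.handled}'
        )


web = WebServerBot("web-1")
web.handle(Request("/index.html"), "2024-01-01T00:00:00Z")
web.handle(Request("/cart"), "2024-01-01T00:00:01Z")

for line in web.events:
    print(line)
```

Only the strings appended to events would be visible outside the simulation; the Request objects exist purely to drive the bot's state.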