Tempo is the first cybersecurity solution built on a LogLM, or Log Language Model, invented by DeepTempo. These models are similar to their more familiar cousins, LLMs such as Anthropic's Claude and Meta's Llama. Like LLMs, LogLMs are Foundation Models that apply their understanding across very different environments and in response to differing inputs. However, Tempo was pre-trained on enormous quantities of logs, and it focuses on the pattern of events, including relative and absolute time. Tempo has been shown to be extremely accurate, with low false positive and false negative rates.

This guide walks you through setting up and using the TEMPO Native App in your Snowflake environment with the provided sample data (the CIC dataset).

The provided data comes from the Canadian Institute for Cybersecurity. You can see the dataset - and an explanation of the attacks discerned by Tempo - here

What You'll Learn

What You'll Need

What You'll Build

  1. Obtain the TEMPO Native App from the Snowflake Marketplace.
  2. During installation, it is recommended that you shorten the name to just TEMPO.
  3. After Tempo is installed, you will be prompted to select Configure.
  4. When you select Configure, you will be asked to grant the following permissions; please do so:

GRANT CREATE COMPUTE POOL ON ACCOUNT TO APPLICATION TEMPO;
GRANT CREATE WAREHOUSE ON ACCOUNT TO APPLICATION TEMPO;

  5. Continue to click through and launch the app.

At this point, you will see a Worksheet showing SHOW TABLES; you are now ready to use Tempo as explained below.
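Optionally, before proceeding, you can confirm that the permission grants from the installation step took effect (this assumes you kept the application name TEMPO):

SHOW GRANTS TO APPLICATION TEMPO;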

The application comes with its own warehouse (TEMPO_WH) and compute pool (TEMPO_COMPUTE_POOL), with the following specs; these are used for container service runs.

TEMPO_WH

TEMPO_COMPUTE_POOL
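If you want to inspect these resources yourself, both can be examined with standard Snowflake commands:

SHOW WAREHOUSES LIKE 'TEMPO_WH';
SHOW COMPUTE POOLS LIKE 'TEMPO_COMPUTE_POOL';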

Starting on the same worksheet, you can now initialize Tempo:

CALL TEMPO.MANAGER.STARTUP();

After a few minutes, Snowflake will be ready to perform inference. You are creating a Snowflake job service: a container that runs a specific image and terminates as soon as the run is completed.

Once startup completes, we will use the stored procedures in the TEMPO.DETECTION schema to perform inference on sample log data. These stored procedures take a job service name as their only parameter. The demo data covers logs for all workstations and all webservers of a midsized company over several days, and was obtained from the Canadian Institute for Cybersecurity. In a live run, each procedure represents a call to the respective model type, i.e., WORKSTATIONS calls the model specialized for workstations, WEBSERVER the one for webservers, and so on.

When used for inference in your company, you would likely choose to execute each of these models as relevant logs are ingested. Tempo is modular in construction in order to minimize costs and compute time.

Example:

CALL TEMPO.DETECTION.WORKSTATIONS('<job_service_name>');

<job_service_name>: the name of the run you want to perform (e.g., 'tempo_run_one', 'here_we_go')

or

CALL TEMPO.DETECTION.WEBSERVER('<job_service_name>');

After you run inference to find anomalies - or incidents - by looking at the Workstations or the Webserver data, you will see a table with all the sequences the model has created. Unlike many neural-network-based solutions, one strength of Tempo is that it preserves and shares the relevant sequences for further analysis.

If you order the rows by the Anomaly column, you should see 11 anomalies for Workstations and 3,918 anomalies for Webserver.
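For example, a minimal query for this ordering, assuming the inference results landed in a table named RESULTS in the TEMPO.DETECTION schema (the table name here is illustrative; substitute the output table from your run):

-- Illustrative table name; use the output table created by your inference run.
SELECT *
FROM TEMPO.DETECTION.RESULTS
ORDER BY ANOMALY DESC;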

Were this a production use case, you might want to augment these results with IP information services or threat intelligence to look into the external IPs indicated as part of likely security incidents.
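As a sketch of what that enrichment could look like, assuming the results expose an EXTERNAL_IP column and you maintain your own threat-intelligence lookup table keyed by IP address (all table and column names here are hypothetical):

-- Hypothetical table and column names, for illustration only.
SELECT r.*, t.REPUTATION, t.LAST_SEEN
FROM TEMPO.DETECTION.RESULTS r
JOIN MY_DB.SECURITY.THREAT_INTEL t
  ON t.IP_ADDRESS = r.EXTERNAL_IP
WHERE r.ANOMALY = TRUE;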

Some users have asked to see the entities that Tempo can discern. Note that in larger environments, Tempo would typically discern many more types of entities. You can ask Tempo to specifically learn the types of entities present in the provided log data using the following command:

CALL TEMPO.DETECTION.DEVICE_IDENTIFICATION('<job_service_name>');
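When the job finishes, you could summarize the entity types Tempo identified; the following is only a sketch, assuming the output lands in a table with an ENTITY_TYPE column (both the table and column names are hypothetical):

-- Hypothetical output table and column.
SELECT ENTITY_TYPE, COUNT(*) AS ENTITY_COUNT
FROM TEMPO.DETECTION.DEVICE_RESULTS
GROUP BY ENTITY_TYPE
ORDER BY ENTITY_COUNT DESC;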

At this point, you have seen DeepTempo's ability to discern incidents in complex log data that traditional approaches struggle to identify. As you can see, the output from DeepTempo can be used in conjunction with other data sources you possess about your organization.

Monitor Job Services

The TEMPO.DETECTION.WORKSTATIONS and ...WEBSERVER commands should execute in 3-4 minutes.

If you decide to test the model on a larger dataset, or otherwise would like to keep track of the inference run on this sample data, you can check the status of the job services. As a reminder, job_service_name is the same job service name you assigned when you called the TEMPO.DETECTION procedures.

SELECT SYSTEM$GET_SERVICE_STATUS('DETECTION."job_service_name"');

"job_service_name": The name of the job service to check.

Example:

SELECT SYSTEM$GET_SERVICE_STATUS('DETECTION.WORKSTATION_RUN_ONE');
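SYSTEM$GET_SERVICE_STATUS returns a JSON array with one object per container, so if you only want the status field you can extract it with Snowflake's standard JSON functions, for example:

SELECT PARSE_JSON(SYSTEM$GET_SERVICE_STATUS('DETECTION.WORKSTATION_RUN_ONE'))[0]['status'];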

This optional section guides you through setting up Splunk Enterprise to analyze the output from the Snowflake TEMPO project. It is intended for Splunk users who want to visualize the output. For this demo we used a trial account on Splunk and imported the results of Tempo as CSV. In a production use case, you would likely use the Snowflake Splunk connector, DBConnect, as explained in the Snowflake documentation [here](https://community.snowflake.com/s/article/Integrating-Snowflake-and-Splunk-with-DBConnect).
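To produce that CSV, one approach is to unload only the flagged rows from Snowflake to a stage and download the file from there. This is a sketch, assuming a results table named TEMPO.DETECTION.RESULTS and a stage @TEMPO_EXPORT that you have created (both names are hypothetical):

-- Unload only the anomalous sequences to a single CSV file with a header row.
COPY INTO @TEMPO_EXPORT/tempo_results.csv
FROM (SELECT * FROM TEMPO.DETECTION.RESULTS WHERE ANOMALY = TRUE)
FILE_FORMAT = (TYPE = CSV)
HEADER = TRUE
SINGLE = TRUE;

From SnowSQL you can then download the file with a GET command before uploading it to Splunk.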

Prerequisites

Install Splunk Enterprise

  1. Clone the installation repository:
    git clone https://github.com/your-username/splunk-tempo-dashboard-installer.git
    cd splunk-tempo-dashboard-installer
    
  2. Place the Splunk Enterprise tarball in the same directory as the script.
  3. Edit the script to set your desired credentials:
    vi splunk_tempo_install.sh
    
  4. Make the script executable and run it:
    chmod +x splunk_tempo_install.sh
    sudo ./splunk_tempo_install.sh
    

Configure Splunk and Load Data

  1. Access Splunk at http://your_ip:8000 and log in with the credentials you set.
  2. Download the CSV file from the TEMPO Snowflake app output.
  3. In Splunk, go to Settings > Add Data > Upload > select your CSV file.
  4. Follow the prompts to load the CSV, using default options.

Create the Dashboard

  1. After loading the CSV, click "Build Dashboards" > "Create New Dashboard".
  2. Select "Classic Dashboard Builder" and create the dashboard.
  3. In the dashboard editor, switch to "Source" view.
  4. Copy the XML from anomaly_hub.xml and paste it into the Source view.
  5. Save the Dashboard
    • After the dashboard is saved, you will need to create a Tempo Splunk macro
  6. Create a Tempo macro in Splunk
    • In Splunk, create a new macro by going to Settings > Advanced Search > + Add New
    • Keep the Destination app as search
    • Name the macro TempoDataLocation
    • Define the macro as the Splunk path to Tempo's CSV output. It will look something like this:
    source="your-filename.csv" host="Your Name" sourcetype="csv"
    
    • You can leave the rest of the macro creation fields blank.
    • Save the macro

You should now be able to see the incidents - or anomalies - in your new dashboard. This enables Security Operations teams to click through on the context provided by Tempo. For example, you can see all transactions to and from a specific IP address, or across given ports, as part of investigating the incidents that have been identified.

Note that by default, only the incidents are uploaded. Not transferring and loading the entire dataset of logs simplifies the work of the Security Operator and can also translate into significant cost savings, as Splunk and most security operations solutions tend to charge by data ingested.

Important Notes

Troubleshooting

Conclusion

Congratulations, you just ran the world's first purpose-built LogLM available as a Snowflake Native App. In the weeks to come, DeepTempo will launch a range of additional easy-to-use options and extensions as Native Apps, including tooling that simplifies using your own data with Tempo and upgrades to Tempo's power, including scale-out multi-GPU usage.

What You Learned

Resources

Snowflake Native Apps