Getting Started with TEMPO and Snowflake


1. Overview

Tempo is the first cybersecurity solution based on a LogLM, or Log Language Model, invented by DeepTempo. These models are similar to their more familiar cousins, LLMs such as Anthropic's Claude and Meta's Llama. Like LLMs, LogLMs are Foundation Models that apply their understanding across very different environments and in response to differing inputs. Unlike LLMs, however, Tempo was pre-trained on enormous quantities of logs and focuses on the pattern of events, including relative and absolute time. Tempo has been shown to be highly accurate, with low false positive and false negative rates.

This guide will walk you through the process of setting up and using the TEMPO Native App in your Snowflake environment with provided sample data (CIC Dataset).

The provided data comes from the Canadian Institute for Cybersecurity (CIC). You can see the data set, along with an explanation of the attacks discerned by Tempo, here.

What You'll Learn

  • How to run Tempo on sample data (CIC Dataset)
  • How to check whether Tempo accurately flags attacks
  • Optional - How to view the output in Splunk

What You'll Need

  • A Snowflake account with access to the Snowflake Marketplace and privileges to install and run Native Apps
  • Optional: a Splunk Enterprise instance, if you want to visualize the output (see section 6)

What You'll Build

  • A working LogLM alerting dashboard

2. Install the TEMPO Native App

  1. Find The App

In the Snowflake Marketplace you can find the Tempo app, or simply click here.

  2. If you are running on your own data, you will have to select the storage before clicking the Launch App button during the deployment phase. To select your table, click Add next to the Incident Inference Logs section. In the popup that appears, click the +Select Data button and find the table you want to use in the dropdown. Select it and click Save.

Note: If you are running with the demo data, simply skip this step and continue.

  3. Snowflake will require you to grant permissions to run this app. For a smooth experience, make sure you do this during the initial setup through the Snowflake UI.
  4. Go to the Projects > Worksheets console in Snowflake. Here you should see a + sign in the top right corner of the screen. We will use this to create our own worksheet. Go ahead and click it now.
  5. At the top of the worksheet there is a dropdown called Select Databases. This is what you will use to attach the Tempo database to this worksheet. If you are using demo data, select the option whose name begins with TEMPO.
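
If you prefer SQL to the dropdown, you can also set the worksheet's database context with a statement like the one below; the name is only a placeholder, so substitute the actual database in your account whose name begins with TEMPO:

USE DATABASE <your_tempo_database>;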

The default resources created by the Tempo app are as follows.

TEMPO_WH

  • Type: Snowpark Optimized
  • Size: Medium
  • Auto Suspend: 120 seconds
  • Auto Resume: Enabled
  • Initial State: Active

TEMPO_COMPUTE_POOL

  • Node Configuration:
    • Minimum Nodes: 1
    • Maximum Nodes: 1
  • Auto Suspend: 360 seconds
  • Instance Family: GPU_NV_S
  • Auto Resume: Enabled
  • Initial State: Active
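
You can confirm that these resources exist from your worksheet with standard SHOW commands; the object names below are taken from the defaults listed above:

SHOW WAREHOUSES LIKE 'TEMPO_WH';
SHOW COMPUTE POOLS LIKE 'TEMPO_COMPUTE_POOL';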

3. Start the app

In the new worksheet we now need to set up our procedures. We will start by initializing the container resources. Throughout this guide we will provide you with statements to run; please add them to the worksheet. You can run them one by one or add them all to a single worksheet.

  1. Initialize Application Resources
CALL management.create_resources();

Purpose: Initializes the application by loading required model weights and configurations.
Required Permissions: Warehouse, compute pool, and task management access.

It is recommended that you run this command before running the sheet as a whole, as it can take some time for the resources to spin up. If you are the account admin, you can monitor resources using SHOW COMPUTE POOLS IN ACCOUNT;. Once the compute pool is idle, you may continue with the rest of the worksheet.
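
As a minimal sketch of that check, you can filter the SHOW output down to the Tempo pool and its state with a standard RESULT_SCAN query (the pool name comes from the defaults listed earlier):

SHOW COMPUTE POOLS IN ACCOUNT;
-- Keep only the Tempo pool and its current state from the SHOW output
SELECT "name", "state"
FROM TABLE(RESULT_SCAN(LAST_QUERY_ID()))
WHERE "name" = 'TEMPO_COMPUTE_POOL';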

4. Run Static Inference

CALL static_detection.inference('your_service_name');

Parameters:

  • your_service_name: Name of the service to analyze (string). This is set by you and should be unique to each run.

Purpose: Executes inference on the specified service data.

If you are using the demo data, feel free to use something like demorun for your_service_name.
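
With the demo data and the suggested name, the call would look like this:

CALL static_detection.inference('demorun');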

5. Deep Dive Analysis in Snowflake

CALL inspect.deepdive(sequence_id);

Parameters:

  • sequence_id: Identifier of the sequence to analyze (integer). If any anomalies are detected, this ID can be used later to run a deeper investigation of the suspicious interactions.

Purpose: Investigates specific sequences flagged as anomalies.

Note: If running on demo data, let's use 2 as the ID (valid IDs are 1-1200).
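
With the demo data, that call looks like this:

CALL inspect.deepdive(2);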

The results will be collections of related events making up suspicious and anomalous activities. These are the events your security team would want to verify as actual intrusion events.

6. Viewing Results in Splunk

This optional section guides you through setting up Splunk Enterprise to analyze the output from the Snowflake TEMPO project. It is intended for Splunk users who want a visualization of the output. For this demo we used a trial account on Splunk and imported the results of Tempo as a CSV. In a production use case, you will likely use DBConnect, the Splunk connector for Snowflake, as explained in the Snowflake documentation here: https://community.snowflake.com/s/article/Integrating-Snowflake-and-Splunk-with-DBConnect

Prerequisites

  • An Amazon EC2 instance running Amazon Linux or another compatible Linux distribution
  • Root or sudo access
  • Splunk Enterprise installer tarball (.tgz file)
  • anomaly_hub.xml dashboard file

Install Splunk Enterprise

  1. Clone the installation repository:
    git clone https://github.com/your-username/splunk-tempo-dashboard-installer.git
    cd splunk-tempo-dashboard-installer
  2. Place the Splunk Enterprise tarball in the same directory as the script.
  3. Edit the script to set your desired credentials:
    vi splunk_tempo_install.sh
  4. Make the script executable and run it:
    chmod +x splunk_tempo_install.sh
    sudo ./splunk_tempo_install.sh

Configure Splunk and Load Data

  1. Access Splunk at http://your_ip:8000 and log in with the credentials you set.
  2. Download the CSV file from the TEMPO Snowflake app output (or export it with SQL, as sketched after this list).
  3. In Splunk, go to Settings > Add Data > Upload > select your CSV file.
  4. Follow the prompts to load the CSV, using default options.
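
If you would rather export the results with SQL than download the CSV from the app output, a minimal unload sketch is shown below. The table name is a placeholder for whichever results table your Tempo run produced; COPY INTO a user stage and GET are standard Snowflake commands, with GET run from SnowSQL rather than the web UI:

-- Unload the Tempo results to your user stage as a single CSV with a header row
COPY INTO @~/tempo_results.csv
  FROM (SELECT * FROM <your_tempo_results_table>)
  FILE_FORMAT = (TYPE = CSV)
  HEADER = TRUE
  SINGLE = TRUE
  OVERWRITE = TRUE;

-- From SnowSQL, pull the file down locally before uploading it to Splunk
GET @~/tempo_results.csv file:///tmp/;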

Create the Dashboard

  1. After loading the CSV, click "Build Dashboards" > "Create New Dashboard".
  2. Select "Classic Dashboard Builder" and create the dashboard.
  3. In the dashboard editor, switch to "Source" view.
  4. Copy the XML from anomaly_hub.xml and paste it into the Source view.
  5. Save the Dashboard
    • After the dashboard is saved, you will need to create a Tempo Splunk macro
  6. Create a Tempo macro in Splunk
    • In Splunk, create a new macro by going to Settings > Advanced Search > + Add New
    • Keep Destination app as search
    • Name the macro TempoDataLocation
    • Define the macro as the Splunk path to Tempo's CSV output. It will look something like this:
    source="your-filename.csv" host="Your Name" sourcetype="csv"
    • You can leave the rest of the macro creation blank.
    • Save the macro

You should now be able to see the incidents - or anomalies - in your new dashboard. This enables Security Operations teams to click through on the context provided by Tempo. For example, you can see all transactions to and from a specific IP address, or across given ports, as a part of investigating the incidents that have been identified.

Note that, by default, only the incidents are uploaded. Not transferring and loading the entire dataset of logs simplifies the work of the security operator and can also translate into significant cost savings, as Splunk and most security operations solutions tend to charge by data ingested.

Important Notes

  • Set strong passwords for all user accounts.
  • Ensure sufficient disk space (minimum 10GB recommended).
  • Keep your EC2 instance and Splunk Enterprise updated with the latest security patches.

Troubleshooting

  • If the script fails to find the Splunk tarball, confirm that it's in the correct directory.
  • For user creation issues, check Splunk server logs and verify authentication credentials.

7. Conclusion And Resources

Conclusion

Congratulations, you just ran the world's first purpose-built LogLM available as a Snowflake Native App. In the weeks to come, DeepTempo will launch a range of additional easy-to-use options and extensions as Native Apps, including tooling to simplify the process of using your own data with Tempo and upgrades to Tempo's power, including scale-out multi-GPU usage.

What You Learned

  • How to run Tempo on Snowflake as a NativeApp
  • How to monitor Tempo
  • How to see incidents and relevant context identified by Tempo in Splunk

Resources

To try the app, please follow this link: Snowflake Native Apps