Tempo is the first cybersecurity solution based on a LogLM, or Log Language Model, invented by DeepTempo. These models are similar to their more familiar cousins, LLMs such as Anthropic's Claude and Meta's Llama. Like LLMs, LogLMs are foundation models that apply their understanding across very different environments and in response to differing inputs. However, Tempo was pre-trained on enormous quantities of logs, and it focuses on the pattern of events, including relative and absolute time. Tempo has been shown to be extremely accurate, with low false positive and false negative rates.
This guide walks you through setting up and using the TEMPO Native App in your Snowflake environment with the provided sample data (the CIC dataset).
The provided data comes from the Canadian Institute for Cybersecurity. You can see the dataset, and an explanation of the attacks discerned by Tempo, here.
GRANT CREATE COMPUTE POOL ON ACCOUNT TO APPLICATION TEMPO;
GRANT CREATE WAREHOUSE ON ACCOUNT TO APPLICATION TEMPO;
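If you would like to verify that the grants took effect, you can optionally list the privileges held by the application:

SHOW GRANTS TO APPLICATION TEMPO;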
At this point, you will see a Worksheet showing SHOW TABLES; you are now ready to use Tempo as explained below.
The application comes with its own warehouse (TEMPO_WH) and compute pool (TEMPO_COMPUTE_POOL), which are used for container service runs.
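You can confirm that both objects exist in your account with a quick optional check, assuming the default names above and that your role can see them:

SHOW WAREHOUSES LIKE 'TEMPO_WH';
SHOW COMPUTE POOLS LIKE 'TEMPO_COMPUTE_POOL';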
Starting on the same worksheet, you can now initialize Tempo:
CALL TEMPO.MANAGER.STARTUP();
After a few minutes, Snowflake will be ready to perform inference. This call creates a Snowflake job service: a container that runs a specific image and terminates as soon as the run is completed.
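If you want to see the job services that have been created, you can list them with standard Snowflake syntax; whether your role can see the application's services may depend on your privileges:

SHOW JOB SERVICES IN ACCOUNT;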
Once startup completes, we will use the stored procedures in the TEMPO.DETECTION schema to perform inference on sample log data. These stored procedures take a job service name as their only parameter. The demo data covers logs for all workstations and all webservers of a midsized company over several days; it was obtained from the Canadian Institute for Cybersecurity. In a live run, each procedure represents a call to the respective model type, i.e., WORKSTATIONS calls the model specialized for workstations, WEBSERVER the model for webservers, and so on.
When using Tempo for inference in your company, you would likely execute each of these models as the relevant logs are ingested. Tempo is modular in construction in order to minimize cost and compute time.
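In that setting, you might schedule recurring runs with a Snowflake task. The sketch below is illustrative only: the task name, schedule, and job service name are assumptions, and Tempo may require a unique job service name per run.

-- Hypothetical sketch: schedule hourly workstation inference.
-- Task name, schedule, and job service name are assumptions; adjust for your environment.
CREATE TASK RUN_TEMPO_WORKSTATIONS
  WAREHOUSE = TEMPO_WH
  SCHEDULE = '60 MINUTE'
AS
  CALL TEMPO.DETECTION.WORKSTATIONS('workstations_hourly');
-- Tasks are created suspended; resume the task to start the schedule.
ALTER TASK RUN_TEMPO_WORKSTATIONS RESUME;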
Example:
CALL TEMPO.DETECTION.WORKSTATIONS('<job_service_name>');
<job_service_name>: the name of the run you want to perform (e.g., 'tempo_run_one', 'here_we_go')
or
CALL TEMPO.DETECTION.WEBSERVER('<job_service_name>');
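For instance, using one of the example run names above:

CALL TEMPO.DETECTION.WORKSTATIONS('tempo_run_one');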
After you run inference to find anomalies, or incidents, in the Workstations or Webserver logs, you will see a table with all the sequences the model has created. Unlike many neural-network-based solutions, Tempo preserves and shares the relevant sequences for further analysis; this is one of its strengths.
If you order the rows by the anomaly column, you should see 11 anomalies for Workstations and 3,918 for Webserver.
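For example, assuming a hypothetical output table name (substitute the table your run actually produced):

-- WORKSTATION_RESULTS is a placeholder name; use the table created by your run.
SELECT *
FROM TEMPO.DETECTION.WORKSTATION_RESULTS
ORDER BY ANOMALY DESC;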
Were this a production use case, you might want to augment these results with information from IPinfo or threat intelligence feeds, to look into the external IPs that are indicated to be part of likely security incidents.
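As a sketch of that kind of enrichment, assuming hypothetical table and column names throughout:

-- THREAT_INTEL_IPS and all column names below are assumptions for illustration.
SELECT r.*, t.REPUTATION, t.LAST_SEEN
FROM TEMPO.DETECTION.WORKSTATION_RESULTS r
LEFT JOIN THREAT_INTEL_IPS t
  ON r.EXTERNAL_IP = t.IP_ADDRESS
WHERE r.ANOMALY = TRUE;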
Some users have asked to see the entities that Tempo can discern. Note that in larger environments it would be typical for Tempo to discern many more types of entities. You can ask Tempo to learn the types of entities present in the provided log data using the following command:
CALL TEMPO.DETECTION.DEVICE_IDENTIFICATION('<job_service_name>');
At this point you have seen DeepTempo's ability to discern incidents in complex log data that traditional approaches struggle to identify. As you can see, the output from DeepTempo can be used in conjunction with other data sources you possess about your organization.
The TEMPO.DETECTION.WORKSTATIONS and ...WEBSERVER commands should execute in 3-4 minutes.
If you decide to test the model on a larger dataset, or otherwise would like to keep track of the execution of inference on this sample data, you can check the status of job services. As a reminder, job_service_name is the same name you assigned when you ran the TEMPO.DETECTION procedures.
SELECT SYSTEM$GET_SERVICE_STATUS('DETECTION."job_service_name"');
"job_service_name": The name of the job service to check.
Example:
SELECT SYSTEM$GET_SERVICE_STATUS('DETECTION.WORKSTATION_RUN_ONE');
This optional section guides you through setting up Splunk Enterprise to analyze the output of the Snowflake TEMPO project; it is intended for Splunk users who want to visualize the output. For this demo we used a trial account on Splunk and imported the results of Tempo as CSV. In a production use case, you would likely use the Snowflake Splunk connector, DBConnect, as explained in the Snowflake documentation here: https://community.snowflake.com/s/article/Integrating-Snowflake-and-Splunk-with-DBConnect
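One simple way to get the results out as CSV for this demo flow is to unload them from Snowflake to your user stage and download the file from there; this is a minimal sketch, and the results table name is an assumption:

-- WORKSTATION_RESULTS is a hypothetical table name; substitute your run's output table.
-- Unload the flagged incidents to the user stage as a single CSV file with a header row.
COPY INTO @~/tempo_incidents.csv
FROM (SELECT * FROM TEMPO.DETECTION.WORKSTATION_RESULTS WHERE ANOMALY = TRUE)
FILE_FORMAT = (TYPE = CSV)
HEADER = TRUE
SINGLE = TRUE;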
To install Splunk Enterprise and the dashboard:
1. Download Splunk Enterprise (the .tgz file) and the anomaly_hub.xml dashboard file.
2. Clone the installer repository, review the install script, make it executable, and run it:
git clone https://github.com/your-username/splunk-tempo-dashboard-installer.git
cd splunk-tempo-dashboard-installer
vi splunk_tempo_install.sh
chmod +x splunk_tempo_install.sh
sudo ./splunk_tempo_install.sh
3. Navigate to http://your_ip:8000 and log in with the credentials you set.
4. Copy the contents of anomaly_hub.xml and paste it into the Source view.
5. Go to Settings > Advanced Search > + Add New and create a search macro (destination app: search) named TempoDataLocation with the following definition, pointing at the CSV you imported:
source="your-filename.csv" host="Your Name" sourcetype="csv"
You should now be able to see the incidents, or anomalies, in your new dashboard. This enables Security Operations teams to click through on the context provided by Tempo; for example, you can see all transactions to and from a specific IP address, or across given ports, as part of investigating the identified incidents.
Note that by default, only the incidents are uploaded. Not transferring and loading the entire log dataset simplifies the work of the Security Operator and can also translate into significant cost savings, as Splunk and most security operations solutions charge by data ingested.
Congratulations, you just ran the world's first purpose-built LogLM available as a Snowflake Native App. In the weeks to come, DeepTempo will launch a range of additional easy-to-use options and extensions as Native Apps, including tooling that simplifies using your own data with Tempo and upgrades to Tempo's power, including scale-out multi-GPU usage.