This quickstart guide will review the configuration of the Snowflake Data Source and Snowflake Streaming connectors within Tealium. Tealium's Customer Data Hub acts as a high quality data source and a real-time activation engine for Snowflake. Use Tealium to capture and send low latency customer interaction data to Snowflake for further analysis or to inform AI initiatives, and automate real-time data activation using Snowflake data across Tealium's 1300+ integration marketplace.
The Snowflake Streaming connector leverages the powerful Snowflake Snowpipe Streaming feature to enable near real-time importing of event and visitor data directly into Snowflake staging tables, enabling immediate availability for processing and analytics with data latency under 10 seconds. Send either the entire dataset or specific attributes to the staging table, ensuring customizable data integration tailored to business needs.
Using Snowpipe Streaming creates a streamlined pathway to enhance data accessibility and analytical capabilities within Snowflake while decreasing latency and costs. By removing third party integrations and using the Snowpipe Streaming serverless compute model, the Snowflake Streaming connector provides a cost effective, secure connection.
The Snowflake Streaming connector enables you to unlock a number of use cases, including:
The staging table in Snowflake that is intended to receive data from Tealium must contain specific columns depending on the selected connector action.
When using either Send Entire Event Data or Send Entire Visitor Data actions, the staging table must have a VARIANT column to receive the dataset and a column to receive the timestamp. The timestamp column must support the the timestamp data format that is sent.
When using either Send Custom Event Data or Send Custom Visitor Data, the staging table must have a column for each data type. Each data attribute must be assigned to a column. Ensure that each attribute is an accepted Snowflake data type and that the column is formatted properly to receive this data type. To map a timestamp, a separate timestamp column must be added.
For more information, see Snowflake: Supported Java data types.
The following table summarizes the data types supported by the Snowflake Streaming connector.
Snowflake Data Type | Supported |
Numeric data types | Yes |
String and binary data types | Yes |
Logical data types | Yes |
Date and time data types | Yes |
Arrays | Yes |
Object | Yes |
Variant | Yes |
Vector | No |
Geography | No |
Geometry | No |
The connector does not support the following table or column configurations
Snowflake has strict rules about which systems it accepts requests from. You will need to add the Tealium IP addresses to your Snowflake allow list.
NOTE: You must add the us-west-1 along with the server-side profile region to your allowlist. If you do not add these addresses to your allowlist, you will see errors when you try to fetch column data.
We recommend the following Snowflake table configurations for the Snowflake Streaming connector.
NOTE: Concurrent writing of tables by more than one feed or audience may result in performance errors.
Within the Tealium Customer Data Hub, navigate to either the EventStream or AudienceStream Connector Marketplace and add a new connector. For more information and instructions on how to add a connector, see About Connectors.
After adding the connector, configure the following settings:
To complete the connector authentication, use the following steps to alter the username with the public key details in Snowflake.
Within your preferred Command Line Interface (CLI), generate the RSA 2048 key through openssl.
## Generate your private key
openssl genrsa 2048 | openssl pkcs8 -topk8 -inform PEM -out rsa_key.p8 -nocrypt
## Generate your public key
openssl rsa -in rsa_key.p8 -pubout -out rsa_key.pub
Store your private and public key (public is located in rsa_key.pub, private is located in rsa_key.p8). The private key will be used in the Snowflake Streaming connector configuration and the public key will be assigned to the username in Snowflake. To make changes to the USER, you must have ACCOUNTADMIN privileges. Make sure that all line breaks have been removed from the public key. In Snowflake, run the following command to alter the user, making sure to put the appropriate values in for username and public_key.
alter user <username> set rsa_public.key=<public_key>;
The Snowflake Streaming connector within Tealium has two different actions for the event and visitor data sets: Send Entire Dataset or Send Custom Data.
All connector actions require the following parameters:
Parameter | Description |
Database Name | The Snowflake database that contains the required table |
Schema Name | The name of the schema used in the table |
Table Name | The name of the staging table you want to land the data into |
After inputing the above values, a drop-down menu with the available columns will become available to map your data. When sending the entire dataset, you will need to assign which column the data will be recorded under. This column must be a VARIANT type column in Snowflake. A new row will be created in the staging table for each event with the dataset available under the selected column as an object.
In addition to selecting where to record the datasets, the timestamp column needs to be identified. Because staging tables cannot increment, Tealium generates a timestamp for when the data is sent to Snowflake and inputs that value into this timestamp column. If the default Tealium timestamp is not desirable, the optional Timestamp Attribute format selection can be configured to the appropriate format. For more information on supported timestamp types: see Snowflake: Supported Java data types.
If the Snowflake Streaming connector is being setup within Tealium's AudienceStream, an additional configuration feature called Include Current Visit Data is included. When enabled, the dataset will include both the visitor data and current visit data.
Tealium provides the ability to select custom data attributes to send into Snowflake, providing greater control over your transferred data. When these actions are chosen, the configuration changes slightly to include the mapping of data attributes to columns in the Snowflake staging table. As previously mentioned, the staging table will require separate columns to be created for each data attribute, including a timestamp column. Once a data attribute is mapped, another attribute cannot be mapped to that column.
The following code snippet will create a new role that will be used to create and gain access to write to the staging table. It is recommended that a separate User is created with limited access to only the staging table for the Tealium connector. The staging table created has two columns, one to map the timestamp and the other to land the data object.
USE ROLE SECURITYADMIN;
CREATE USER SVC_TEALIUM DEFAULT_ROLE = TEALIUM_RL;
CREATE ROLE TEALIUM_RL;
GRANT ROLE TEALIUM_RL TO USER SVC_TEALIUM;
GRANT ROLE TEALIUM_RL TO ROLE SYSADMIN;
-- Public Key Assignment
-- *Note: Create RSA key-pair first
ALTER USER SVC_TEALIUM SET RSA_PUBLIC_KEY='<publickey>';
-- Database, Schema, and Table Creation
USE ROLE SYSADMIN;
CREATE DATABASE TEALIUM;
GRANT OWNERSHIP ON DATABASE TEALIUM TO ROLE TEALIUM_RL;
USE ROLE TEALIUM_RL;
CREATE SCHEMA EVENT_STREAM;
CREATE SCHEMA AUDIENCE_STREAM;
-- Create raw event tables
CREATE TABLE EVENT_STREAM.RAW_EVENT
(
EVENT_JSON VARIANT,
LOAD_TS TIMESTAMP_TZ
);
CREATE TABLE AUDIENCE_STREAM.AUDIENCE_EVENT
(
VISITOR_JSON VARIANT,
LOAD_TS TIMESTAMP_TZ
);
-- Repeat for any structured table for specific events as needed
After the connector is configured and before data can be sent, the changes must be saved to your Tealium instance. In the upper right-hand corner of your Tealium page, click on the orange Save/Publish button.
Use the Snowflake data source to import bulk data from a Snowflake table or view at near real-time speeds. After you import your data, transform and map it to Tealium attributes and build better audiences for targeted activation.
To get started, create one Snowflake data source per Snowflake table or view and customize the query mode and SQL WHERE clause for your data set. Then map your database columns to Tealium attributes. Each imported row from a Snowflake source is processed as an event in Tealium.
Imports from Snowflake are typically limited to 50 events per second per account, but may vary. Standard attribute size limits still apply. For more information, see About attributes > Size limits
The Snowflake data source fetches data at a maximum of 1,000 rows at a time.
Each Snowflake data source supports importing data from one Snowflake table or view. To import data from multiple Snowflake tables, create a view in Snowflake and select the view in the data source configuration. For more information about views in Snowflake, see Snowflake: Overview of Views.
The Snowflake data source supports all Snowflake data types. To ensure data is imported correctly, map the Snowflake data types according to the following guidelines:
Snowflake | Tealium |
Numeric data types | Number attributes |
String and binary data types | String attributes |
Logical data types | Boolean attributes |
Date and time data types | Date attributes |
Arrays | Array of strings, array of numbers, array of Booelans |
Object, variant, geography, geometry, and vector data types | String attributes |
For more information about Snowflake data types, see Snowflake: Summary of Data Types.
In the default Tealium data collection order of operations, events from a Snowflake data source are processed before the Event received step and do not change the order of operations.
Snowflake data source events are sent to EventStream and AudienceStream in the same way as events from other data sources with the following important exceptions:
The Snowflake data source supports three query modes to let you control how data is imported from your Snowflake table or view:
The Snowflake data source does not support bulk query modes.
The following example shows how batch processing of rows and query modes work together. In the following table, modification_time is the timestamp column and customer_id is the incrementing column.
customer_id | modification_time | customer_segment |
1 | 01Apr 13:00 | A |
2 | 01Apr 13:00 | B |
... | ... | ... |
1000 | 01Apr 13:00 | D |
1001 | 01Apr 13:00 | E |
1002 | 02Apr 14:00 | A |
The Snowflake data source fetches data 1,000 rows at a time and marks the maximum value of the timestamp and/or incrementing column(s) from the batch of data.
modification_time is 01Apr 13:00 and the customer_id is greater than 1000
or
modification_time is greater than 01Apr 13:00
In the Snowflake data source Query Configuration, select the columns you want to import into Tealium. To add additional conditions for processing, use the SQL WHERE clause. This option adds a WHERE statement to your query. WHERE statements support basic SQL syntax.
Note: The WHERE clause does not support subqueries from multiple tables. To import data from multiple Snowflake tables, create a view in Snowflake and select the view in the data source configuration. For more information, see Snowflake: Overview of Views.
The column mapping configuration determines the event attributes that correspond to each column in the Snowflake table.
Column names are often different from the attribute names in the Customer Data Hub, so this mapping ensures that the data is imported properly. For example, a table might have a column named postalCode, but the matching event attribute is named customer_zip, so you need a mapping to associate them.
For information about mapping Snowflake data types to Tealium data types, see the Data Types section.
To ensure your imported data is stitched with other sources, such as web, mobile, or HTTP API, ensure that every row in the Snowflake table has a column with a unique visitor ID. You can then map the visitor ID column and corresponding event attribute to a visitor ID attribute (a unique attribute type for visitor identification in AudienceStream). The value in the mapped event attribute is assigned to the tealium_visitor_id attribute and matched directly to any existing visitor profiles.
For more information about Visitor ID Mapping in AudienceStream, see Visitor Identification using Tealium Data Sources.
If your Snowflake account has strict rules about which systems it accepts requests from, add the Tealium IP addresses to your Snowflake allow list.
Take the following steps to pull up the Snowflake data source and begin the configuration process:
The Snowflake data source lets you create connections to Snowflake that you can reuse in other Snowflake data sources.
Ensure you have the following Snowflake account information before you get started:
In the Query Settings screen, select the appropriate query mode for your Snowflake table or view.
When you select Timestamp or Incrementing, you must list the name of one column to use to detect either new and modified rows or new rows only. If you select Timestamp + Incrementing you must list two columns, a timestamp column and a strictly incrementing column, to detect new and modified rows.
For more information, see About Snowflake data source > Query settings.
When you are done, click Next.
SELECT
ID,
TIMESTAMP,
CUSTOMER_ID,
CAST(DATE_OF_BIRTH AS VARCHAR) AS DATE_OF_BIRTH_STRING,
ADDRESS_1,
ADDRESS_2,
CITY,
STATE,
COUNTRY,
ZIP_CODE
FROM
customers
WHERE
ID >= 4;
Use the column mapping table to map pre-configured column labels to event attributes or manually enter the column labels for mapping. Each row from Snowflake is processed as an event. Columns not mapped to an event attribute are ignored.
For each column label, select the corresponding event attribute from the drop-down list.
When you are done, click Continue.
To use your Snowflake data with AudienceStream, map your data to visitor ID attributes. Select the mapped event attribute that represents a visitor ID and map it to the corresponding visitor ID attribute.
Visitor ID Mapping in AudienceStream is enabled by default. Disabling visitor ID mapping may cause errors in visitor stitching. For more information, see Visitor Identification using Tealium Data Sources.
When you are done, click Next.
In this final step, view the summary, make any needed corrections, and then save and publish your profile. To edit your configuration, click Back to return to the step where you want to make changes.
To check the status of import activity, navigate to the Data Sources dashboard and expand the data source.
To change a Snowflake data source, navigate to the Data Sources dashboard and click the edit icon next to the configuration you want to change. From the Edit Connection screen, you can edit Snowflake connection, query, and mapping configurations.
To view the error logs, complete the following steps in Tealium:
You can export logs for the following time ranges:
To reduce file sizes, exports are limited to a time range of 4 hours or shorter. Complete the following steps to export logs:
There is a short delay while the logs are processed. You can view your exported logs from the past six days by navigating to Error Logs>View Exported Logs. Logs are stored for 30 days.
In this guide, you learned how to successfully configure a Snowflake Data Source and Snowflake Streaming connectors within Tealium, enhancing your ability to leverage real-time customer data. These integrations enables you to efficiently capture, process, and activate low-latency data, optimizing your analysis and AI initiatives while leveraging Tealium's extensive integration marketplace.
Reach out to your account representatives to learn more about unlocking the full potential of your data with Tealium and Snowflake!