Dynamic tables are a new table type offered by Snowflake that allow data teams to use SQL statements to declaratively define the results of data pipelines. Dynamic tables simplify the process of creating and managing data pipelines by streamlining data transformations without having to manage Streams and Tasks.
Dynamic tables materialize the results of a query that you specify. Instead of creating a separate target table and writing code to transform and update the data in that table, you can define the target table as a dynamic table and specify the SQL statement that performs the transformation. An automated process updates the materialized results automatically through regular refreshes. These automatic refreshes occur based on changes to your data, meaning that they only operate on new data changes since the last refresh.
As the only transformation solution uniquely built for Snowflake, Coalesce offers a way to visually build, adjust and deploy dynamic tables in Snowflake orders of magnitude faster without having to code by hand. Coalesce provides many "out-of-the-box" node types that are ready to use immediately upon logging into the platform, in addition to empowering users to create their own nodes known as user-defined nodes, or UDNs. The dynamic table node that you will use in the following exercise is a form of a UDN found in Coalesce.
In this guide, you will build a small directed acyclic graph (DAG) in Coalesce using dynamic table nodes that understands dependency ordering. Snowflake handles the refreshing of the pipeline, while Coalesce guarantees that dynamic tables are created only after all upstream dependencies are satisfied. When building pipelines without dynamic tables, Coalesce uses the dependency graph for deployment and refreshing of tables and views.
ACCOUNT ADMIN
privilegesComplete the steps in this section to prepare your Coalesce environment. Please note that these steps assume you are using trial Snowflake and Coalesce accounts. If you are using pre-existing accounts, you will need to adjust your Storage Locations and Mappings to use the sample Snowflake dataset as shown in this guide.
After registering, you will receive an email from Snowflake with an activation link and URL for accessing your trial account. Finish setting up your account following the instructions in the email.
Once you are logged into your Snowflake account, sign up for a free Coalesce trial account using Snowflake Partner Connect. Check your Snowflake account profile to make sure that it contains your fist and last name.
Open the Build Settings of your workspace, and navigate to Packages. Click the Browse button in the upper right hand corner. This will take you to the Coalesce Marketplace. Find the Dynamic Tables package and click Find out more. Copy the Package ID from the package details page and navigate back to Coalesce.
Click the Install button on the packages page and paste in the Package ID. The most recent version of the package will be automatically selected. Give the package an Alias - this is the name of the package as it will appear in the build interface of Coalesce.
Copy and run this code in your Worksheet. This will create a standalone Development, QA (Testing) and Production environment for you.
SRC
mapping, select your newly created DEV
database (starting with your first initial and last name) and SOURCE DATA
schema. For your WORK
mapping, select your DEV
database and REPORTING
schema as shown below.{
"targetDynamicTableWarehouse": "DEV ENVIRONMENT"
}
SRC
- SNOWFLAKE_SAMPLE_DATA.TPCH_SF1
. Then click Add 8 sources in the bottom right corner.CUSTOMER
, NATION
and REGION
source nodes. Then right click and hover over Join Nodes and select Dynamic Table Stage.Under Dynamic Table Options, you have the option of changing the selected warehouse that you would like to use to run your node. Enter the warehouse dev_wh_xs
that was created when you ran your setup code in Snowflake. In Coalesce, you have the ability to change and set specific warehouses that run each dynamic table node in a given environment (e.g. DEV
, QA
, or PROD
) by using Parameters.
The Downstream toggle controls whether refreshes are determined by subsequent nodes in your pipeline.
Your Lag Specification determines how often your node is refreshed. Set this schedule to 1 minute as shown.
FROM {{ ref('SRC', 'NATION') }} "NATION"
INNER JOIN {{ ref('SRC', 'REGION') }} "REGION"
ON "NATION"."N_REGIONKEY" = "REGION"."R_REGIONKEY"
INNER JOIN {{ ref('SRC', 'CUSTOMER') }} "CUSTOMER"
ON "NATION"."N_NATIONKEY" = "CUSTOMER"."C_NATIONKEY"
N_NATIONKEY
, N_REGIONKEY
, N_COMMENT
, R_REGIONKEY
and R_COMMENT
in your node.ORDERS
and LINEITEM
source nodes. Right click and hover over Join Nodes and then click on your click on your dynamic table node.FROM {{ ref('SRC', 'ORDERS') }} "ORDERS"
INNER JOIN {{ ref('SRC', 'LINEITEM') }} "LINEITEM"
ON "ORDERS"."O_ORDERKEY" = "LINEITEM"."L_ORDERKEY"
O_ORDERSTATUS
, O_TOTALPRICE
, O_ORDERDATE
, O_CLERK
, O_SHIPRIORITY
, L_PARTKEY
and L_SUPPKEY
:FROM {{ ref('SRC', 'DT_ORDERS_LINEITEM') }} "DT_ORDERS_LINEITEM"
INNER JOIN {{ ref('SRC', 'DT_CUSTOMER_NATION_REGION') }} "DT_CUSTOMER_NATION_REGION"
ON "DT_ORDERS_LINEITEM"."O_CUSTKEY" = "DT_CUSTOMER_NATION_REGION"."C_CUSTKEY"
C_ADDRESS
, C_NATIONKEY
, C_PHONE
, C_ACCTBAL
, C_MKTSEGMENT
, and C_COMMENT
columns. Right click and select Delete Columns from the dropdown menu.C_CUSTKEY
column. Press and hold the Shift and Option buttons, then scroll upwards to multi-select every column from C_CUSTKEY
to L_EXTENDEDPRICE
. Right click on the columns and select Delete Columns to bulk delete them. Finally, delete all of the remaining columns but C_NAME
and L_QUANTITY
.C_NAME
column and drag it to the top of your Mapping grid so that it appears in front of the L_QUANTITY
column. Select L_QUANTITY
and add the aggregation below to the Transform field:sum({{SRC}})
GROUP BY ALL
capability.GROUP BY ALL
function that you added via the Config.Congratulations! You have built out a small DAG of dynamic table nodes.
DT_ORDERS_LINEITEM
to open it. In the Config section, select the Downstream option and press the Create button.DT_CUSTOMER_NATION_REGION
node and repeat this action by selecting the Downstream option and creating your node.DT_ORDERS_LINEITEM_CUSTOMER_NATION_REGION
node.DEV
database that you created earlier in this guide (YOURFIRSTINITIALLASTNAME_DEV
). Under Source Data, click on Dynamic Tables and select your DT_ORDERS_LINEITEM_CUSTOMER_NATION_REGION
node.To deploy your DAG to a non-development environment, you will first need to set up version control in Coalesce by connecting to a git repository.
QA
and PROD
databases that we created as part of our setup in Snowflake.Click on Environments and then click on the New Environment button. Name your new Environment QA
to use as a testing environment. You will need to enter your Snowflake account which can be found in the lower left hand corner of your Snowflake account. Click the Save button to save your QA
settings.
SRC
mapping to your QA
database (FIRSTINITIALLASTNAME_DB
) and SOURCE DATA
schema. Set your WORK
mapping to your QA
database and REPORTING
schema. Then click the Save button and close the window.QA
Environment will use the qa_wh_xs.
Then click the Save button to save your QA
settings.{
"targetDynamicTableWarehouse": "qa_wh_xs"
}
QA
Environment and click the New Environment button. Name this Environment PROD.
You will need to enter your Snowflake account name which can be found in the lower left hand corner of your Snowflake account. Click the Save button to save your PROD
settings.QA
Environment. Be sure to list prod_wh_xs
as your warehouse. Then click on Storage Mappings and set your database to your PROD
DB for SRC
and WORK.
Map your SRC
schema to SOURCE_DATA
and your WORK
schema to PUBLIC.
Then click the Save button.PROD
Environment will use the prod_wh_xs.
Then click the Save button.{
"targetDynamicTableWarehouse": "prod_wh_xs"
}
Congratulations! You've set up your QA
and PROD
Environments and are now ready to deploy.
QA
Environment, switch over to the Deploy interface at the top of your screen:QA
Environment to start the deployment process. Then select your most recent commit (Env_Update) and click Next.QA
warehouse should be used. Click the Next button to continue:QA
Environment. You can continue working on your DAG, or repeat the deployment process and deploy to your PROD
environment.Congratulations, you've completed this guide on creating dynamic table nodes in Snowflake using Coalesce. Continue with your free trial by loading your own sample or production data and exploring more of Coalesce's capabilities with our documentation and resources.
Happy transforming!