Through this guide, you will learn how to move beyond slow, nightly batch jobs and stream data in real time from an operational database like SQL Server directly into Snowflake. Using Openflow, a cloud-native data movement platform, you will build a continuous Change Tracking pipeline that unlocks immediate access to your business data for faster, more accurate analytics.

What You Will Learn

By the end of this guide, you will learn how to:

  • Set up a SQL Server source database on AWS RDS with Change Tracking enabled
  • Configure core Snowflake objects, an Openflow deployment, and a runtime
  • Configure and launch the Openflow SQL Server connector
  • Verify real-time replication of changes from SQL Server into Snowflake

What is Snowflake Openflow?

Openflow is a cloud-native data movement platform built on Apache NiFi, designed specifically for scalable, real-time streaming and Change Data Capture (CDC) pipelines. It provides a unified experience for building and monitoring data integration workflows, complete with built-in observability and governance.

Openflow

Openflow is engineered for high-speed, continuous ingestion of all data types—from structured database records to unstructured text, images, and sensor data—making it ideal for feeding near real-time data into modern cloud platforms for AI and analytics.

What is Change Tracking (CT) in SQL Server?

Change Tracking records the fact that rows in a table were changed, but not the data that was changed. Applications can determine which rows have changed and then obtain the latest row data directly from the user tables. As a result, Change Tracking can answer fewer historical questions than Change Data Capture; however, for applications that don't need that history, there is far less storage overhead because the changed data itself is not stored. Changes are tracked by a synchronous mechanism designed to add minimal overhead to DML operations.
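To make this concrete, the sketch below shows how an application typically asks Change Tracking which rows changed since a given sync version and then joins back to the user table for the current values. The table, columns, and version number are illustrative only:

-- Ask Change Tracking which dbo.Orders rows changed after sync version 10.
-- CT returns only keys and change metadata; the current column values
-- come from the user table itself via the join.
SELECT ct.OrderID,
       ct.SYS_CHANGE_OPERATION,   -- I (insert), U (update), or D (delete)
       o.CustomerID,
       o.OrderDate
FROM CHANGETABLE(CHANGES dbo.Orders, 10) AS ct
LEFT JOIN dbo.Orders AS o
    ON o.OrderID = ct.OrderID;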

What You'll Build

By the end of this quickstart guide, you will build:

  • An AWS RDS SQL Server instance loaded with the Northwind sample database and Change Tracking enabled
  • An Openflow deployment and runtime in your Snowflake account
  • A continuous Change Tracking pipeline that streams SQL Server changes into the NORTHWIND_QS database in Snowflake
  • A notebook-based analysis that validates both the initial load and live, real-time changes

Prerequisites

  • A Snowflake account with Snowflake Openflow and Snowpark Container Services (SPCS) access
  • An AWS account in which to create the RDS SQL Server instance
  • SQL Server Management Studio (SSMS) to connect to the instance and run the setup scripts

NOTE

Please note that Openflow on SPCS is not available on Snowflake free trial accounts. To work through this quickstart, add credit card details to your trial account or use an existing Snowflake account.

To set the stage for our real-time data streaming demonstration, we first need a source database. We will use SQL Server as the transactional database for this use case; it will serve as the live OLTP environment from which we stream data changes into Snowflake.

1. Creating the AWS RDS SQL Server Instance

AWS provides a primary, step-by-step guide for creating and connecting to a SQL Server instance on RDS. It covers everything from the initial setup in the AWS console to configuring the security groups.

AWS also publishes a more general guide that provides context on all available settings when creating an RDS instance.

2. Connecting with SQL Server Management Studio (SSMS)

Once your RDS instance is running, the AWS documentation shows exactly how to find your database endpoint and connect to it using the most common tool, SSMS.

3. Getting and Loading the Northwind Database Script

The installation script instnwnd.sql for the Northwind database is provided by Microsoft. It is available from the official Microsoft SQL Server samples repository on GitHub, which is the most reliable place to get the script.

Once you download the instnwnd.sql file from that repository, open it in SSMS (while connected to your RDS instance) and execute it to create and populate all the Northwind tables.

Configure Change Tracking on the Database

To configure change-tracking, execute the console.sql script from this repository against the Northwind database.
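For reference, enabling Change Tracking boils down to two kinds of T-SQL statements: one at the database level and one per table. The statements below are an illustrative sketch (the retention settings and the table shown are assumptions); console.sql in the repository is the authoritative version:

-- Enable Change Tracking at the database level (retention values are illustrative)
ALTER DATABASE Northwind
SET CHANGE_TRACKING = ON
(CHANGE_RETENTION = 2 DAYS, AUTO_CLEANUP = ON);

-- Enable Change Tracking on each table to be replicated, e.g. Orders
ALTER TABLE dbo.Orders
ENABLE CHANGE_TRACKING
WITH (TRACK_COLUMNS_UPDATED = ON);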

In the next section, we will configure the Snowflake Openflow connector and analyze real-time data from SQL Server to generate business insights.

Available Connectors

Openflow supports 19+ connectors, including the SQL Server connector used in this quickstart.

Openflow Configuration

Before creating a deployment, you need to configure core Snowflake components, including the OPENFLOW_ADMIN role and a network rule.

NOTE

For a detailed, step-by-step guide on these prerequisite configurations, please complete Step 2 of the following Snowflake Quickstart guide: Snowflake Configuration for Openflow.
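As a minimal sketch of what that step sets up, the admin role is created and granted to the user who will manage Openflow. The full privilege set and the network rule are covered in the linked guide and are not reproduced here; the username placeholder below is hypothetical:

-- Minimal sketch only; see the linked quickstart for the full privilege set
USE ROLE ACCOUNTADMIN;
CREATE ROLE IF NOT EXISTS OPENFLOW_ADMIN;
GRANT ROLE OPENFLOW_ADMIN TO USER <your_user>;  -- replace <your_user> with your username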

Create Deployment

With the core Snowflake components configured, the next step is to create the Openflow deployment. This deployment provisions the necessary containerized environment within your Snowflake account where the Openflow service will execute.

IMPORTANT: Verify User Role

Before proceeding, ensure your current active role in the Snowsight UI is set to OPENFLOW_ADMIN. You can verify and switch your role using the user context menu located in the top-left corner of the Snowsight interface. Failure to assume the correct role will result in permission errors during the deployment creation process.

First, log in to the Snowflake UI.

  1. On the left pane, navigate to Data → Ingestion → Openflow
  2. Openflow Interface: You'll see three tabs:
    • Overview - List of available connectors and documentation
    • Runtimes - Manage your runtime environments
    • Deployments - Create and manage Openflow deployments
  3. Click on the Deployments tab. Click Create Deployment button
  4. Enter Deployment Location as Snowflake and Name as CDC_QS_DEPLOYMENT
  5. Complete the wizard
  6. Verify the deployment by confirming that its status shows ACTIVE

Create Runtime Role

Create a runtime role that will be used by your Openflow runtime. This role needs access to databases, schemas, and warehouses for data ingestion.

-- Create runtime role
USE ROLE ACCOUNTADMIN;
CREATE ROLE IF NOT EXISTS NORTHWIND_ROLE;

-- Create database for Openflow resources
CREATE DATABASE IF NOT EXISTS NORTHWIND_QS;

-- Create warehouse for data processing
CREATE WAREHOUSE IF NOT EXISTS NORTHWIND_WH
  WAREHOUSE_SIZE = MEDIUM
  AUTO_SUSPEND = 300
  AUTO_RESUME = TRUE;

-- Grant privileges to runtime role
GRANT USAGE ON DATABASE NORTHWIND_QS TO ROLE NORTHWIND_ROLE;
GRANT USAGE ON WAREHOUSE NORTHWIND_WH TO ROLE NORTHWIND_ROLE;

-- Grant runtime role to Openflow admin
GRANT ROLE NORTHWIND_ROLE TO ROLE OPENFLOW_ADMIN;

Create External Access Integration

External Access Integrations allow your runtime to connect to external data sources. This quickstart creates one integration with network rules for SQL Server.

USE COMPANION NOTEBOOKS:

For detailed External Access Integration setup for specific connectors, use the notebooks from the companion repository and look for the SQL Server notebook.
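As a minimal sketch (the object names are assumptions, the RDS hostname is a placeholder, and 1433 is the default SQL Server port), the integration pairs a network rule with an external access integration:

USE ROLE ACCOUNTADMIN;
USE DATABASE NORTHWIND_QS;
USE SCHEMA PUBLIC;

-- Network rule allowing egress traffic to the SQL Server instance
CREATE OR REPLACE NETWORK RULE SQLSERVER_NETWORK_RULE
  MODE = EGRESS
  TYPE = HOST_PORT
  VALUE_LIST = ('<your-rds-endpoint>.rds.amazonaws.com:1433');

-- External access integration that bundles the rule for use by the runtime
CREATE OR REPLACE EXTERNAL ACCESS INTEGRATION SQLSERVER_ACCESS_INTEGRATION
  ALLOWED_NETWORK_RULES = (SQLSERVER_NETWORK_RULE)
  ENABLED = TRUE;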

Create Runtime

The next step is to create a runtime associated with the previously created runtime role. A runtime is the execution environment for your Openflow connectors. Follow these steps to create your runtime:

  • Navigate to Data → Ingestion → Openflow → Runtimes tab

NOTE

Runtime creation typically takes 3-5 minutes. The status will progress from CREATING → ACTIVE. Once your runtime is active, you can access the Openflow canvas to add and configure connectors. We will add and configure connectors in the next section.

Configure and Launch the SQL Server CDC Connector

This section details the final step of launching the Openflow connector.

Openflow Connectors

![SQL Server Process Group](assets/sql server connector.png)

Configure the connector

You can configure the connector to replicate a set of tables in real-time.

NOTE

The connector does not replicate any data until the tables to be replicated are explicitly added to its ingestion configuration.

Run the flow

Exploratory Analysis of Ingested Data

Once the initial data replication from SQL Server to the NORTHWIND_QS database in Snowflake is complete, you can perform an exploratory analysis to validate the ingested data.
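If you want a quick SQL sanity check before opening the notebook, a simple row count against one of the replicated tables is enough. The schema name and identifier casing below are assumptions; adjust them to whatever the connector created inside NORTHWIND_QS:

USE WAREHOUSE NORTHWIND_WH;
USE DATABASE NORTHWIND_QS;

-- Confirm the initial load landed (schema/table names are assumptions)
SELECT COUNT(*) AS order_count
FROM "dbo"."Orders";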

  1. Download Analysis Notebook: Download the northwind.ipynb Jupyter Notebook from this git repository.
  2. Execute in Snowflake UI: Upload and open the northwind.ipynb notebook within the Snowflake UI's notebook environment.
  3. Run Analysis Cells: Execute all cells in the notebook sequentially to perform the data analysis. Do not execute the final cell, as it is reserved for the live data validation step.

Completing this analysis confirms that the initial data load was successful and prepares the environment for the next phase, where we will generate live transactions in SQL Server to verify that the changes are tracked and replicated into Snowflake in real time.

Live Data Simulation and Real-Time Verification

This final procedure validates the end-to-end CDC pipeline by simulating new transactional data in the source SQL Server and observing its immediate replication and availability in Snowflake.

Simulate Live Transactions in SQL Server

To generate changes for the CDC process to capture, you will execute pre-defined SQL scripts from this repository against the source Northwind database.
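As an illustration of the kind of statements those scripts run, the simulation adds a new "waffles" product and new orders. The column values shown here are assumptions; use the repository scripts for the actual simulation:

-- Illustrative only: add a new product and a new order for Change Tracking to capture
INSERT INTO dbo.Products (ProductName, SupplierID, CategoryID, QuantityPerUnit, UnitPrice, UnitsInStock, Discontinued)
VALUES ('waffles', 1, 3, '12 boxes', 4.50, 100, 0);

INSERT INTO dbo.Orders (CustomerID, EmployeeID, OrderDate, ShipCity, ShipCountry)
VALUES ('ALFKI', 1, GETDATE(), 'Berlin', 'Germany');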

Verify Real-Time Replication in Snowflake

After the simulation scripts have been executed in SQL Server, the changes will be captured and streamed by Openflow. You can verify their arrival in Snowflake nearly instantly.

  1. Navigate to the Snowflake UI: Return to the northwind.ipynb notebook that you previously ran within the Snowflake UI.
  2. Execute Final Cell: Locate and run the last cell of the notebook.
  3. Confirm Results: The output of this cell will query the target tables and display the new transaction records (the new orders and the "waffles" product) that were just generated in SQL Server, confirming the successful real-time replication of data.
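If you would rather confirm the change with a standalone query instead of the notebook cell, something like the following works; as before, the schema name and identifier casing are assumptions:

USE WAREHOUSE NORTHWIND_WH;
USE DATABASE NORTHWIND_QS;

-- Look for the newly replicated product
SELECT "ProductID", "ProductName", "UnitPrice"
FROM "dbo"."Products"
WHERE "ProductName" = 'waffles';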

You've successfully built a real-time Change Tracking pipeline to stream data from SQL Server to Snowflake using Openflow. This modern approach to data integration eliminates the delays of traditional batch jobs, enabling immediate data analysis and faster business decisions.

What You Learned

  • How to provision a SQL Server source on AWS RDS and enable Change Tracking on the Northwind database
  • How to configure Snowflake for Openflow and create a deployment, runtime role, and runtime
  • How to configure and launch the Openflow SQL Server connector
  • How to validate the initial load and verify real-time replication of live changes in Snowflake

Related Resources