The Modern Data Stack requires setup and maintenance, and Snowflake goes a long way toward lowering the bar.
Our Powered By Snowflake partner Keboola offers Data Platform as a Service, helping users further accelerate data projects by delivering a ready-to-use managed platform that includes all the technologies a data team needs. This enables everyone to capitalize on Snowflake's capabilities through a modern, no-maintenance, no-code solution.
This Quickstart will guide you through building a complete data pipeline, from ingestion, transformation, and orchestration to presentation in a Streamlit application.
Here we'll prepare everything for Keboola to write into your Snowflake account:
Log in to your Snowflake trial account
At the bottom left of the screen, hover over your account and click "Copy account URL".
Paste the content somewhere safe to reuse later.
Click the blue button to create the new database
PLEASE USE THE NAME "SHOP_DB" FOR LATER INSTRUCTIONS
Select the "SHOP_DB" and create a new schema:
PLEASE USE THE NAME "SHOP_SC" FOR LATER INSTRUCTIONS
Navigate to the "Warehouses" section in Admin, click "+ Warehouse"
Select X-Small size
Enter the name (SHOP_WH) and click "Create Warehouse"
Navigate to the "Users & Roles" section in Admin, select the ... on your user and click "Edit"
Select the warehouse previously created (SHOP_WH)
Click "Save User"
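If you prefer a worksheet to the UI, the same three objects can be created with SQL. Below is a minimal Python sketch using the snowflake-connector-python package; the account, user, and password values are placeholders to substitute with your own.

```python
# Hedged sketch: the same objects the UI steps create, expressed as SQL.
DDL_STATEMENTS = [
    "CREATE DATABASE IF NOT EXISTS SHOP_DB",
    "CREATE SCHEMA IF NOT EXISTS SHOP_DB.SHOP_SC",
    "CREATE WAREHOUSE IF NOT EXISTS SHOP_WH WITH WAREHOUSE_SIZE = 'XSMALL'",
]

def run_setup(account: str, user: str, password: str) -> None:
    """Run the DDL through snowflake-connector-python
    (pip install snowflake-connector-python). Credentials are placeholders."""
    import snowflake.connector  # imported lazily so the sketch loads without it
    conn = snowflake.connector.connect(account=account, user=user,
                                       password=password)
    try:
        cur = conn.cursor()
        for stmt in DDL_STATEMENTS:
            cur.execute(stmt)
    finally:
        conn.close()

# Example call (placeholders):
# run_setup("XXXXXX.eu-central-1", "MY_USER", "MY_PASSWORD")
```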
Let's move to the Keboola platform. After you have created your Keboola trial account, log in to the platform and navigate to the "Use Cases" tab.
In the search bar, type "virtual"
Make sure you select the correct template, "Snowflake Virtual Hands on Lab".
Click the green button "+ USE TEMPLATE"
You'll see more details on the template, click the top right green button "+ USE TEMPLATE"
Enter a name and click " -> NEXT STEP"
For this lab, we'll use dummy Shopify data so we don't need to configure the Shopify Component.
The template configuration consists of setting up the connection information for each component needed to run the flow.
For our lab, we only need to set up Snowflake.
We need to enter Snowflake information in the last step of the flow:
Click on "Edit Configuration"
Add the information we created during step 1
Hostname is the URL you copied in Snowflake WITHOUT "https://"
Port is "443"
Username is your Snowflake user name
Password is your Snowflake password
Database Name is "SHOP_DB"
Schema Name is "SHOP_SC"
Warehouse is "SHOP_WH"
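As a sanity check, the values above can be assembled programmatically. Here is a small Python sketch; the dictionary keys are illustrative labels mirroring the form fields, not Keboola's internal field names.

```python
def keboola_snowflake_config(account_url: str, user: str, password: str) -> dict:
    """Build the Snowflake destination settings from the copied account URL.
    Key names are illustrative; they mirror the fields in the template form."""
    # Hostname is the account URL without the "https://" prefix.
    hostname = account_url.removeprefix("https://").rstrip("/")
    return {
        "hostname": hostname,
        "port": 443,
        "username": user,      # your Snowflake user name
        "password": password,  # your Snowflake password
        "database": "SHOP_DB",
        "schema": "SHOP_SC",
        "warehouse": "SHOP_WH",
    }

cfg = keboola_snowflake_config(
    "https://XXXXXX.eu-central-1.snowflakecomputing.com", "MY_USER", "MY_PASS")
print(cfg["hostname"])  # XXXXXX.eu-central-1.snowflakecomputing.com
```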
Save the template:
The template generated a Keboola Flow. A flow is a sequence of actions.
Navigate to the generated flow:
We are now ready to run it!
The run takes about 10 minutes the first time and about 5 minutes on subsequent runs; you can track the progress:
Success!
Open your Snowflake web tab and check the SHOP_DB content.
You should see the tables created:
From the Streamlit application, we can write data back thanks to the Keboola API.
First, we need to get an API token. Follow the steps below to generate it.
Navigate to token page in Keboola:
Generate the token; do not forget to set "Full Access" and to copy the value!
At the bottom left of the screen, hover over your account and click "Copy account URL".
The URL is:
https://XXXXXX.eu-central-1.snowflakecomputing.com
The pattern is [protocol]://[locator].[region].snowflakecomputing.com
We need only [locator].[region]
Our account ID to connect Streamlit application will be:
XXXXXX.eu-central-1
Change the region accordingly!
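A small Python helper can derive the account identifier from the copied URL. It assumes a URL of the [locator].[region] form shown above; some Snowflake URL formats omit the region segment.

```python
def account_identifier(account_url: str) -> str:
    """Derive the [locator].[region] account identifier from the copied URL,
    e.g. https://XXXXXX.eu-central-1.snowflakecomputing.com
    -> XXXXXX.eu-central-1.
    Assumption: the URL ends with .snowflakecomputing.com."""
    host = account_url.removeprefix("https://").rstrip("/")
    suffix = ".snowflakecomputing.com"
    return host[:-len(suffix)] if host.endswith(suffix) else host

print(account_identifier("https://XXXXXX.eu-central-1.snowflakecomputing.com"))
# XXXXXX.eu-central-1
```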
We have seen the Keboola Shopify template in action. We now have a complete set of tables in Snowflake with Shopify sales data, including an RFM segmentation, ready to use!
RFM stands for Recency, Frequency, and Monetary value, each corresponding to a key customer characteristic. These RFM metrics are important indicators of a customer's behavior: frequency and monetary value affect a customer's lifetime value, while recency affects retention, a measure of engagement.
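To make the metrics concrete, here is a toy RFM computation over a few dummy orders. The data and logic are purely illustrative, not the template's actual transformation.

```python
from datetime import date

# Dummy orders: (customer_id, order_date, amount)
orders = [
    ("alice", date(2023, 6, 1), 120.0),
    ("alice", date(2023, 6, 20), 80.0),
    ("bob",   date(2023, 1, 5), 40.0),
]
today = date(2023, 7, 1)

# Accumulate per-customer (recency in days, frequency, monetary value).
rfm = {}
for cust, d, amount in orders:
    rec, freq, mon = rfm.get(cust, (None, 0, 0.0))
    days_ago = (today - d).days
    rec = days_ago if rec is None else min(rec, days_ago)  # days since last order
    rfm[cust] = (rec, freq + 1, mon + amount)

print(rfm["alice"])  # (11, 2, 200.0)
```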
In this section, we'll leverage this segmentation in a Streamlit application.
This application will:
From the previous step, you now have your Snowflake credentials, account information, and Keboola token.
Navigate to the application: Link
Click "Connect" to validate.
Select the Keboola region (you can check it in your browser's Keboola tab) and enter the Keboola token
You are now connected!
If your Snowflake information is correct, you should see the following after clicking "Connect":
Here you can see the generated segmentation and the number of customers assigned to each segment.
You can select the segment(s) you want to assign a discount to:
You can then adjust the discount level and the expected revenue increase.
This will calculate the impact on the total revenue.
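The arithmetic behind such an estimate might look like the following back-of-the-envelope sketch; the numbers and the formula are illustrative assumptions, not the app's exact logic.

```python
# Illustrative inputs (placeholders, not real lab data):
segment_revenue = 10_000.0   # current revenue of the selected segments
discount = 0.10              # 10% discount offered to those customers
uplift = 0.25                # expected revenue increase from the campaign

# Assumed model: uplift grows revenue, then the discount reduces it.
new_revenue = segment_revenue * (1 + uplift) * (1 - discount)
impact = new_revenue - segment_revenue
print(round(impact, 2))  # 1250.0
```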
Scrolling down the page, you'll find a continuously updated list of customers (based on the selected segments) and the discount level.
This list can be used to trigger a marketing campaign:
Finally press the "UPLOAD" button:
Wait until the upload is finished:
Check the table in Keboola:
YOU'RE DONE WITH THE LAB!! 🍾
Delete the associated Flow
Delete the Components
Delete the Storage
Same player, start again :-)
Navigate to the Components tab, click on the Snowflake Data Destination:
Select "Database Credentials" on the right:
Update and test your credentials:
Navigate to the Components tab, click on the Snowflake Data Destination:
Select "Database Credentials" on the right:
Check the DB, Schema, Warehouse:
Congratulations! You've successfully built a complete data pipeline from ingestion, transformation, orchestration, to presentation in a Streamlit application!