This entry-level lab introduces you to the user interface and basic capabilities of both Snowflake and Sigma, and how to use both solutions to build analytics. It is designed specifically for use with the Snowflake, free 30-day trial at https://trial.snowflake.com and the Sigma, free 14-day trial available within the Snowflake free trial on Snowflake Partner Connect. When done with the lab you should be ready to load your own data into Snowflake, analyze and visualize it with Sigma.

What You'll Learn

The exercises in this lab will walk you through the steps to:

What You'll Need

What You'll Build

By walking through this lab you will:

Footer

Steps to Prepare Your Lab Environment

  1. If not yet done, register for a Snowflake free 30-day trial at https://trial.snowflake.com
    • You will have different Snowflake editions (Standard, Enterprise, e.g.), cloud providers (GCP, AWS, or Azure), and Regions (US Central, Europe West, e.g.) Available to you. We suggest you select the cloud provider and region which is physically closest to you and your data, and select the Enterprise edition so you can leverage advanced capabilities that are not available in the Standard Edition.
    • After registering, you will receive an email with an activation link and your Snowflake account URL. Bookmark this URL for easy, future access. After activation, you will create a username and password. Write down these credentials.
  2. Click here and download the "sigma_vhol.sql" file to your local machine. This file contains pre-written SQL commands and we will use this file later in the lab.

Footer

Logging Into the Snowflake User Interface (UI)

Open a browser window and enter the URL of your Snowflake 30-day trial environment. You should see the login screen below. Enter your unique credentials to log in.

Snowflake Login Screen

Navigating the Snowflake Snowsight UI

First let's get you acquainted with Snowflake! This section covers the basic components of the user interface to help you orient yourself. We will move top to bottom on the left panel of the UI.

Snowsight Navigation

Worksheets Tab

Worksheet Canvas

Import SQL Menu

Footer

Provisioning Sigma via Partner Connect

  1. In the top ribbon of the Snowflake UI, click on the Partner Connect icon. From here you can automatically connect your Snowflake account with our partner applications available for a free trial. Click on the Sigma tile on the first row to start our new Sigma free trial.

Partner Connect Button

Partner Connect Menu

  1. Click "Connect".

Connect To Sigma

  1. Now click "Activate" which will open a new tab.

Partner Account Dialog

  1. A new tab for connecting to Sigma will pop-up in your web browser. Please leave the tab in place as we will come back to it later when we begin working with Sigma.

Sigma Sign-up

tab close-up

Footer

The Lab Story

This Snowflake and Sigma lab will be done as part of a theoretical real-world "story" to help you better understand why we are performing the steps in this lab and in the order they appear.

Throughout this lab we will be working with some common sales data from our fictitious physical retail company ‘Plugs Electronics'. This data includes order and SKU numbers, product name, price, store name and region, and customer data. Some of this data is structured and some is JSON semi-structured.

We will load this data into Snowflake and highlight some Snowflake capabilities. Via Snowflake PartnerConnect, we will then connect Sigma, a business intelligence platform, to Snowflake to then run some analytics on the data and turn it into several charts to form a dashboard.

Create a Database & Table

  1. Go back to the Snowflake UI and click on the Worksheets tab. Per prior step, ensure the SQL text from the "sigma_vhol.sql" file has been loaded into the worksheet.

Worksheet with Code

  1. As you can see from the SQL we loaded into our worksheet, a worksheet can have more than one command in it. SQL commands are delimited by semicolons. In order to run a single command, all you have to do is click anywhere on the line/command you would like to execute, then click on the "Play" button.
  2. At this point, we are ready to click and "Play" on each line in our worksheet in order to get our Snowflake environment setup for our Sigma users. Some of the steps will repeat things we have already executed in this lab, via the UI. For the next few steps click on each line in the worksheet and run the command. Each step in the lab guide has a description of what the command is doing.

Warning In this lab, never check the "All Queries" box at the top of the worksheet. We want to run SQL queries one at a time in a specific order; not all at once.

  1. use role sysadmin; This will set the context of the WORKSHEET, to use the role of SYSADMIN, when we "Play" the command. We do not want to be in the role ACCOUNTADMIN any more.
  2. use warehouse pc_sigma_wh; Sets the PC_SIGMA_WH to be used for commands run in the WORKSHEET. As you can see by the (XS) to the right of the warehouse name, that an extra small warehouse is being used for this lab. An XS translates to a single node cluster for our virtual warehouse. Here is a link to Snowflake's doc covering warehouses in detail.
  3. create or replace database plugs_db; Creates a database named PLUGS_DB in our account. It also automatically sets the context for the worksheet to use .
  4. use database plugs_db; Specifies to use the plugs_db database for the session
  5. use schema public; Sets the context of the WORKSHEET to use the schema.
  6. CREATE or REPLACE STAGE plugs_db.public.sigma_stage URL = ‘s3://sigma-snowflake-vhol/data/'; Creates an external stage in Snowflake that points to an external S3 bucket that has the data files we would like to use for the lab.
  7. ls @sigma_stage; Lists all the contents of the stage that was just created.

Loading Data Into Snowflake

FYI the data we will be using is demo data for a fictitious physical retailer called "Plugs Electronics. This data has been exported and pre-staged for you in an AWS s3 bucket in the us-west-1 (N California) region. There are two files.

The first data file is "Plugs Transactions.csv" includes order and SKU numbers, product name, price, store name and region, and customer data. It is in comma-delimited format with double quote enclosing and a single header line. The data represents 4.7m rows and 1k MB total size. Below is a partial screenshot of the Plugs structured data CSV file:

Result Set View

  1. create or replace table transactions
    (order_number integer,
    date timestamp,
    sku_number string,
    quantity integer,
    cost integer,
    price integer,
    product_type string, product_family string,
    product_name string,
    store_name string,
    store_key integer,
    store_region string,
    store_state string,
    store_city string,
    store_latitude integer,
    store_longitude integer,
    customer_name string,
    cust_key integer);
  2. create file format "PLUGS_DB"."PUBLIC".COMMA_DELIMITED
    TYPE = ‘CSV'
    COMPRESSION = ‘AUTO'
    FIELD_DELIMITER = ‘,'
    RECORD_DELIMITER = ‘\n'
    SKIP_HEADER = 1
    FIELD_OPTIONALLY_ENCLOSED_BY = ‘NONE'
    TRIM_SPACE = FALSE
    ERROR_ON_COLUMN_COUNT_MISMATCH = TRUE
    ESCAPE = ‘NONE'
    ESCAPE_UNENCLOSED_FIELD = ‘\134'
    DATE_FORMAT = ‘AUTO'
    TIMESTAMP_FORMAT = ‘AUTO'
    NULL_IF = ('\N');
    )
    We have data files in our stage as shown in the previous list (ls). These files have certain formats that need to be defined in Snowflake in order for the data to be loaded properly. In this case we are creating a FILE FORMAT named COMMA_DELIMITED that is specifying that the data in the file is delimited by commas, has been compressed, has a record delimiter of newline character ‘\n', has a first row record that has column names that needs to be skipped, etc. More information regarding file formats can be found here.
  3. copy into transactions from @sigma_stage/Plugs_Transactions.csv FILE_FORMAT = ( FORMAT_NAME = ‘COMMA_DELIMITED' );
    This copies the data from our Plugs_Transactions.csv file and loads into our transactions table. A SELECT COUNT(*) from the table will show we loaded 4,709,568 rows into the table.
  4. create or replace table Customer (cust_key integer, cust_json variant);
    This creates our customer table for the second data file, or "Plugs_Customers.csv" which is composed of structured data and semi-structured data. It consists of Plugs Electronics customer information including age group and age, civil status, address, gender,if they are in the loyalty program, and more. It is also staged on AWS where the data represents 5k rows and 1.9 MB total size.

The cust_json column is defined as VARIANT. We use the variant data type to store json, parquet, orc, avro, and xml as they are semi-structured data, you can find more information on this here. We are able to store the data without applying structure, then use SQL with path notation to immediately start querying our semi-structured data asi-is. Sigma's integration with Snowflake generates SQL with path notation so that you do not have to write the SQL yourself. This will be shown later on in the lab.

A partial screenshot of the file is:

JSON Result Set

SEMI-STRUCTURED DATA Snowflake can easily load and query semi-structured data, such as JSON, Parquet, or Avro, without transformation. This is important because an increasing amount of business-relevant data being generated today is semi-structured, and many traditional data warehouses cannot easily load and query this sort of data. With Snowflake it is easy!

  1. copy into customer from @sigma_stage/Plugs_Customers.csv FILE_FORMAT = ( FORMAT_NAME = ‘COMMA_DELIMITED' );
    Loads data from our stage into our customer table. The copy will load 4,972 rows into the table. We recommend doing a select * from customer; to see how the semi-structured data differs from the structured cust_key column. In the diagram below, we clicked in the general area on the second row of the cust_json column in the answer set, highlighted by the red box. The UI then popped open the Details dialog box showing the contents (key value pairs) of the JSON.

JSON Record

  1. grant usage on database PLUGS_DB to role PC_SIGMA_ROLE;
    Snowflake access rights are based upon role based access control (RBAC). We now need to allow the PC_SIGMA_ROLE to use the plugs_db database.
  2. grant usage on schema PLUGS_DB.PUBLIC to role PC_SIGMA_ROLE;
    We now allow the PC_SIGMA_ROLE to use the plugs_db.public schema.
  3. grant SELECT on TABLE PLUGS_DB.PUBLIC.TRANSACTIONS to role PC_SIGMA_ROLE;
    Grant SELECT access on the transactions table to the pc_sigma_role.
  4. grant select on TABLE PLUGS_DB.PUBLIC.CUSTOMER to role PC_SIGMA_ROLE;
    Grant SELECT access on the CUSTOMER table to the pc_sigma_role.
  5. use role PC_SIGMA_ROLE;
    We completed granting access to the tables and data to the PC_SIGMA_ROLE, that will allow a user to now start reporting on the data from Sigma using this role. This command will now set the context of the UI so that we are now using this role so that we can confirm Sigma will be able to access the data.
  6. select * from customer;
    A select count(*) from the customer table should complete successfully. If not, please go back and re-run the prior steps of this module. Note, you will have to switch the context of the UI back to SYSADMIN before re-running the steps.
  7. select count(*) from transactions;
    This will confirm that the PC_SIGMA_ROLE has access to the customer table as well.

Footer

Start Working With Sigma

Your Snowflake account has now been set up with the data that will be used by Sigma to create an workbook. In order to get started with Sigma, please click on the tab that was opened in your browser when we clicked the launch button from Partner Connect.

Company Name

  1. Navigating to the Sigma tab in your browser will bring you to the Sigma dialog box to complete your Sigma Partner Connect setup. Use a company name of your choosing, or something unique like your first initial plus last name for your ‘made up' company url. Click on "Continue".

Company Name

  1. Next enter your name and password and click "Create".

Create Your Profile

  1. If you are utilizing IP white listing in Snowflake (most likely not applicable if you just signed up for a free Snowflake trial) you can read more about that here, these are the URLs you would need to add to the white listing . Click "Get Started Using Sigma".

IP Address To Whitelist

Sigma UI Orientation

  1. You now should see the home page for Sigma.

Sigma Home Screen

  1. At the bottom left of the page are connections to Snowflake databases. There should be two showing. Click on the lower connection called "Snowflake PC_Sigma_WH" which is the virtual warehouse we created in prior steps. Click on the connection.
  2. On the resulting page, click one of the two blue "Add Permission" buttons.

Connection Details

  1. Click into the bar that says "Search for members or teams" and a drop down appears. Select "All Members (Team)" so all users of Sigma have write access to this database. Then click the blue "Save" button.

Connection Permissions

  1. If you expand the Plugs_DB database you will be able to see our two tables that were loaded into Snowflake.

Completed Connection Details

Footer

Create a Sigma Dataset

Sigma's Datasets are a flexible way to build centralized data definitions and guide data exploration. Sigma balances administrative control with the freedom to find, add, and trust new data. Datasets function as sources of data for Workbooks.

Sigma has the ability to join tables, other datasets, csv uploads, or your own SQL inside of datasets. In this section we will go over how to join tables to build a dataset as well as how to work with semi structured json data within the Sigma UI.

  1. At the top left of the Sigma UI click on the small crane icon to go to the home page.

Navigate Home

  1. Let's create a new Dataset and connect it to the "Sigma Sample Dataset". Click on the "Create New" at the top left of the page and then select "Dataset".

Create Dataset

  1. On the next page we need to select a data source for the new Dataset. Under the "Table" option, click the blue "Select" button.

Datasource Options

  1. On the left of the page ensure that "Snowflake PC_Sigma_WH" is selected in the Connection drop-down. Then expand the "Plugs_DB" and select the "Transactions" table. You will get a preview of the table. Next click ‘Get Started' in the upper right hand corner to begin our data modeling.

Connection Navigation

  1. On the top bar first select "Worksheet", then navigate to the database symbol in the top right and select it, finally select the "+" symbol below the database symbol.

Worksheet Tab

  1. On the next page we need to select a data source for us to join the "Transactions" table. Under the "Table" option, click the blue "Select" button.

Datasource Options

  1. Navigate and select the "Customer" table under the "Plugs_DB database". Here you can select which columns you want to bring into the worksheet. We will leave both columns selected. Click Next.

Connection Navigation

  1. Here you now have many options to decide how you would like to bring in the data. Sigma will do its best to infer the correct join columns based on data type as well as the names of the columns. You can select the type of join you wish to perform, the columns you wish to join to and optionally you can use formulas or multiple join keys if necessary. In this case we will select a Left Join and the join key will remain "CUST_KEY" for both tables. Click Done.

Join Window

  1. Sigma has now joined the tables as defined, and pulls in the fields that we selected. You will notice that Sigma can natively identify variant data types based on the icon of the column for Cust JSON. What this allows us to do is quickly pull out the key values pairs in that json very quickly. Select the dropdown from the Cust Json header and select ‘Extract Columns'.

Extract JSON Drop-down

  1. Next we can select the key value pairs we wish to pull out. Select the fields ‘AGE_GROUP', ‘CUST_GENDER', and ‘LOYALTY_PROGRAM' and Click Confirm.

JSON Schema Modal

  1. Now that we have joined the tables and extracted the JSON data we can publish our dataset by clicking the blue "Publish" button in the top right and start our analysis.

Expanded JSON Result

Footer

Sigma Workbook

  1. At the top left of the Sigma UI click on the small crane icon to go to the home page.

Go Home

  1. We are now back on the Sigma home page. Let's now create a new Workbook and connect it to the Dataset that we just created. Click on the "Create Workbook" at the top left of the page.

Create Workbook

  1. On the next page we need to select a data source for the new workbook. On the left hand side first select the "Table" option, then select "Tables and Datasets".

Add Table Element

Select Source

  1. On the resulting page select "My Documents" this is where our Dataset that we just created was saved.

Navigate to Dataset

  1. On the left of the page expand the select the "Transactions" Dataset. You will get a preview of the data. Next click ‘Done' in the upper right hand corner to begin our analysis.

Preview Dataset

  1. Notice how Sigma has a spreadsheet like interface that makes it easy to quickly get a glimpse of what data we have and allows for rapid development of workbooks and datasets. The average business user can use our workbooks to explore data and self-serve in a governed and secure way.

Review Loaded Table

Spreadsheet-like Interface Sigma is unique in empowering users to do analysis in an interface they already know and love: the spreadsheet. It also uses familiar functions found in spreadsheets. No need for SQL knowledge as the Sigma interface automatically, and behind the scenes, converts all user actions into optimized SQL. This interface helps speed up user adoption and success, especially with non-technical business users.

Footer

Sigma Workbook Analysis

Today we will be looking at our fictitious company Plugs Electronics data. It contains the retail transactions from our stores across the country as well as some attributes around our customers making the purchase. We will create some calculations, join tables, parse json, and build some visualizations with the ultimate goal of creating and embedding a dashboard for our Regional Sales Managers to get insights into the performance of their stores.

  1. First, if you select the column header above the "Cost" column and select "Add New Column".

Add Column

  1. This will take us to the formula bar where we can write formulas. Lets type the function "[Quantity] * [Cost]" and click enter to get our Cost of Goods Sold. Double click the column header to rename the column to "COGS". We can also use the quick select $ left of the formula bar to format the column as a currency.

Define Column

  1. This was a simple function, but if you click on the "ƒx" icon to the left of the formula bar, you can see a full list of all the functions Sigma supports. Sigma also provides helpful tooltips and autocomplete within the formula bar to help guide users how to use the functions.

Formula Bar

  1. Every action we take in Sigma produces machine generated ANSI compliant SQL that is pushed down to Snowflake ensuring the data is secure and up to date at all times. Sigma never extracts from, or modifies data in, the cloud data source. You can see the queries we are generating by clicking the dropdown next to the refresh button on the top right and selecting "Query History".

Query History

  1. Next, if you select the column header above the "Price" column and select "Add New Column".

Add Column

  1. This will take us to the function bar where we can write formulas. Lets type the function "[Quantity] * [Price]" and click enter to get our Revenue. Double click the column header to rename the column to "Revenue". We can also use the quick select $ icon at the top to quickly format as a currency.

Define Column

  1. Let's add one more column next to Revenue by selecting the column header and click "Add New Column".

Add Column

  1. This will take us to the function bar where we can write formulas. Lets type the function "[Revenue] - [COGS]" and click enter to get our Profit. Double click the column header to rename the column to "Profit".

Define Column

Note: All workbooks are considered purely exploratory until you, as their creator, actively save their first version. This means you have one central location to start both your ad hoc analysis and reporting. Once you begin exploring your data, you can choose to leave the unsaved workbook behind, or you can save it and continue to build it out as a report.

Create Visualizations

It is often easier to spot trends, outliers, or insights which lead to further questions when viewing data in a visualization. Sigma makes it easy to create visualizations of your data while also enabling you to dig into the data that makes up that visualization.

  1. Start the creation of a visualization by selecting the table that we just built, then click the "Create child element" icon on its top right corner and select "Visualization" to start creating a new chart.

Child Element

  1. This will add a Visualization below our table. In the left-hand bar you will see a dropdown that lists all of the supported visualizations.

Viz Type

  1. Select the bar chart. On the X-Axis click the plus button and add our "Store State" column. Notice you can search here for the column you wish to add.

Select Column

  1. We can also drag values onto the Axes instead of using the add button. If we look at our columns on the bottom, find the "Profit" column and drag it under the Y-Axis. The value will automatically aggregate and become "Sum of Profit".

Y-axis

  1. Under the X-Axis header, click the dropdown next to "Store State" and go to Sort then "Sum of Profit". This will put the x-axis in ascending order. Click it one more time to swap to descending order.

Sort Bar Chart

  1. Now let's look at our sales over time to get an understanding of how we are trending. Another way to create a new chart is by selecting the ‘+' icon on the top left panel next to the ‘Page Elements' title. Click on the ‘+' icon to get a list of elements that we can add to our canvas and choose ‘Viz'.

Add Element

Viz

  1. After selecting the ‘Viz' icon, you will be prompted to select a source to use for that ‘viz'. You can see tabs for selecting:

Select Data Source

  1. Click the visualization dropdown and select "Line". Next, drag the "Date" column into the X-Axis. (Optionally, add it using the ‘+' icon next to the x-axis.)

X-Axis

  1. Notice that Sigma has defaulted to change the aggregation to "Day" as the title now reads "Day of Date". We can change this aggregation at any time using the dropdown next to the field name, and selecting a new aggregation level under the "Truncate date" submenu. Let's change the aggregation level to be "Month".

Set Aggregate

  1. Next we can place our "Revenue" column on the Y-Axis to see our revenue over time. Again, Sigma has automatically summed the revenue to the monthly level.

Y-Axis

  1. We now have a line graph with revenue by month. Let's add some more detail by breaking the series out by store region. To do this add "STORE_REGION" to the color grouping section in the left sidebar.

Line Graph

  1. Let's create one more visualization around our customer, and regions. Again, select the "+" icon on the top left of the screen and then select "Viz".

Add Element

Viz

  1. For the data source, go to the In Use tab and select the Workbook Element "TRANSACTIONS".

Select Data Source

  1. Next drag "Product Family" to the X-axis, and "Cust Key" to the Y axis. Click the dropdown on the "Cust Key" value under the Y-axis, go to "Set aggregate", and change the aggregation to "CountDistinct".

Change Aggregate

  1. Let's break out our customers by region. To do that drag "Store Region" to the color section. Then, unstack the bars by changing the bar chart clustering option right below the chart type drop-down.

Add Color

  1. Finally let's create a pivot table around our customer demographics. Again, select the "+" icon on the top left of the screen and then select "Viz".

Add Element

Pivot Table

  1. For the data source, go to the In Use tab and select the Workbook Element "TRANSACTIONS".

Select Data Source

  1. Next drag "Product Family" to the Pivot Rows, "CUST_Gender" to the Pivot Columns, and "Profit" to the Values section.

Pivot Chart Config

  1. We now have a breakdown of our profit by product family and customer gender. Now that we have built out our workbook let's go ahead and save it. Click the blue "Save As" button in the top right and name your report.

Save Workbook

Footer

Create Filters

  1. Next, let's add a filter to this data. We will do this by adding a control element to our canvas. Controls enable interactions with the chart such as filter the charts when in use. Clicking the "+" icon on the upper left hand pane next to "Page Elements", select "Date Range". This will add a Date Range control to the canvas.

Control Elements

We will also see options for other Control Elements:

  1. After adding the "Date Range" control, let's drag it to the top of the page and update the control_id to say "Date-Range" and update the control label to say "Select a Date Range"

Set Date Range

  1. Next we need to tell the control which elements we want the control to apply to. Clicking on the filter control, we have some options in the left hand pane. Select "Targets". Then select "Add Target". Next click on "Add Target" and select the "Transactions" data source.

Set Target

  1. One additional way to create a page control is by starting with a filter. Sigma can easily start with a filter and convert the filter to a page control. To try this out, right click on the drop down next to the column "Store State" and select the "Filter" option from the menu

Add Filter

  1. A new filter will be added to the top of the page.

Filter Drop-down

  1. Click on the Kebab menu to the right of the "Store State" Filter and select "Convert to Page Control"

Convert Filter

  1. The filter will be added as a Page Control to the canvas. Move it to the same level as the Date Range control element by clicking and holding on the double kebab menu at the top right, and dragging it to the right of the Date Range filter
  2. Since this filter started out with a target, there is no need to add one. However, if there were additional data elements on the page that you want this filter to control, now would be the time to add those elements as targets.

Cleaning Up The Canvas

  1. To start, let's move our base data table to another page. This is done by selecting the options button (the three-dots icon) in the top right of the table and clicking on "Move to page" -> "New page".

  1. Lets also rename page 1 to "Visuals" and page 2 to "Data" by double clicking on their existing names.

  1. Next, let's move our other charts and filters around on the "Visuals" page.
    Hover over the line chart until you see a hand icon appear. We can grab this chart and move it to the right hand side of our bar chart. You can also resize any of the charts to make them fit as you need.

  1. Next, let's give this dashboard some context for our users. Using text elements, we can give the page a title, as well as descriptions to let users know what this dashboard is used for.

Click the "+" icon on the upper left-hand pane below the Sigma logo, then select "Text". This will add a text box to the canvas.

You will also notice other UI elements available to you here:

  1. Drag the text box to the top of the canvas and enter the text "Sales Performance". Set the text style to be "Large Heading." You can also adjust the font, color, and sizing as you see fit. If you'd like to add a description as well, you can enter that under the title

  1. Finally, let's go ahead and click "Publish" to save our work.

Footer

In this lab we went through the process of setting up up a Snowflake free trial, and a Sigma free trial through Snowflake partner connect. We leveraged an external stage in Snowflake to copy data into tables. Finally we did some light modeling, and an analysis resulting in a "Sales Performance" dashboard.

The seamless transition from Snowflake to Sigma enables data professionals to iterate quickly and with ease. This sets Sigma up uniquely to not only provide a robust client facing tool but also a great sand box for data engineers to profile and experiment with freshly landed data in Snowflake.

What we've covered

Helpful Resources

Footer