This lab introduces you to the user interface and embedding capabilities of Sigma Workbooks. This lab does not get into complex data analysis use cases for Sigma, but is designed to showcase the type of capabilities available in an application development use case with Sigma.
- This lab is designed for semi-technical users who will be aiding in the planning or implementation of Sigma. No SQL or technical data skills are required for this lab.
What You Will Learn
- How to ingest data into Snowflake & connect to Sigma
- How to leverage Sigma functions for data prep
- How to build a workbook and visualizations
- How to embed a workbook into your application
- How end users of the application can explore data and generate new insights in a self-serve fashion with Sigma
What You'll Need
- Access to a Snowflake trial account on AWS or a Snowflake instance on AWS in which you have Account Admin permissions.
- Download Portal Template: App_embed.zip
- Download and Install Node.js
- Download SQL Script: Sigma_VHOL.sql
What You'll Build
- In this lab you will build a sales performance portal that live queries against data in Snowflake and provides unique drill down and exploration capabilities to the end users through embedding Sigma.
Prepare Your Snowflake Lab Environment
- If not done yet, register for a Snowflake free 30-day trial at https://trial.snowflake.com
- You will have different Snowflake editions (Standard, Enterprise, e.g.), cloud providers (GCP, AWS, or Azure), and regions (Us Central, Europe West, e.g.) available to you. For this lab, please select AWS as your cloud provider and at minimum enterprise edition.
- After registering, you will receive an email with an activation link and your Snowflake account URL. Bookmark this URL for easy future access.
- Click here and download the "sigma_vhol.sql" file to your local machine. This file contains pre-written SQL commands which will be used later in the lab.
The Snowflake User Interface
Logging into the Snowflake User Interface (UI)
- Open a browser window and enter the URL of your Snowflake 30-day trial environment. You should see the login screen below. Enter your unique credentials to log in.
Navigating the Snowflake Snowsight UI
- From the Worksheets tab, click the "+ Worksheet" button in the top right to open a new worksheet.
- In the left pane you will see the database objects browser, which enables users to explore all databases, schemas, tables, and views accessible by the role selected for a worksheet.
- The bottom pane shows results of queries and operations.
- At the top left of the page click on the downward facing arrow next to the worksheet name, and select "Import SQL from File". Browse to the "sigma_vhol.sql" file you downloaded in the prior module. Click "Open".
- All of the SQL commands you need to run for the remainder of this lab will now appear on the new worksheet. Do not run any of the SQL commands yet.
Provisioning Sigma via Snowflake Partner Connect
- On the left hand side of the Snowflake UI, navigate to Admin, then select partner connect. Click the icon for Sigma.
- You will see a dialog box that will show the objects that will be created in your Snowflake account by Sigma. We will be using the PC_SIGMA_WH warehouse, PC_SIGMA_DB database, and the PC_SIGMA_ROLE for this lab, which are automatically created for you as part of the launch.
- Click "Connect" then "Activate", which will open a new tab. You will be prompted to choose a name for the new Sigma Organization that will be created for you. Once you have chosen a name, click continue. You do not need to worry about the IP whitelisting screen, and can click "Get Started". This will open Sigma in a new tab. Please keep this tab open, as we will return to it later in the lab.
The Lab Story
This Snowflake and Sigma lab will be done as part of a theoretical real world story to help you better understand why we are performing the steps in this lab, and the order they appear.
Throughout the lab, we will be working with some common sales data from our fictitious physical retail company "Plugs Electronics". This data includes order and SKU numbers, product names, prices, store names and regions, as well as customer data. Some of this data is structured and some is JSON (semi-structured).
We will use this data to create a retailer portal where brands who sell their products at Plugs retail locations can log in to see their sales performance across different metrics.
Create a Database and Table
- Go back to the Snowflake UI and click on the worksheets tab. Per the prior steps, ensure the SQL text from the "Sigma_vhol.sql" file has been loaded into the worksheet.
- As you can see from the SQL we loaded into our worksheet, a worksheet can have more than one command in it. SQL commands are delimited by semicolons. In order to run a single command, click anywhere on the line / command you would like to execute, then click the "Run" or "Play" button.
- In this lab, never check the "All Queries" box at the top of the worksheet. We want to run SQL queries one at a time, in a specific order.
- USE ROLE ACCOUNTADMIN; This will set the context of the worksheet to use the role of ACCOUNTADMIN when we run the commands. This role holds the highest level of permission in a given Snowflake account, and can create new objects, grant permissions to other roles, and more.
- USE WAREHOUSE PC_SIGMA_WH; Sets the PC_SIGMA_WH to be used for commands run in the worksheet. As you can see by the (XS) to the right of the warehouse name, an extra small warehouse is being used for this lab. An XS translates to a single node cluster for our virtual warehouse. Here is a link to Snowflake's docs covering warehouses in detail.
- USE DATABASE PC_SIGMA_DB; This command tells Snowflake to operate off the PC_SIGMA_DB database, which was created when your Sigma trial was spun up.
- CREATE SCHEMA IF NOT EXISTS EMBEDDED_LAB; This creates a new schema in our PC_SIGMA_DB database.
- USE SCHEMA EMBEDDED_LAB; This sets the context of the worksheet to use our newly created schema.
- CREATE STAGE IF NOT EXISTS SIGMA_LAB_STAGE URL = ‘s3://sigma-embedded-lab-demo/LabData/'; This creates an external stage in Snowflake that points to an S3 bucket that has the data files we would like to use for the lab.
- LS @SIGMA_LAB_STAGE; This command lists all of the files in the stage we just created.
Loading Data into Snowflake
The data we will be using is demo data for a fictitious retailer called Plugs Electronics. This data has been exported and pre-staged for you in an AWS S3 bucket in the US-East (Northern Virginia) region. The data is in a CSV format, and includes transaction data like order numbers, product names and prices, as well as customer information. This data set is just under 4 million rows.
- CREATE FILE FORMAT IF NOT EXISTS SIGMA_CSV
TYPE = CSV
COMPRESSION = GZIP
FIELD_OPTIONALLY_ENCLOSED_BY = ‘0x27'
NULL_IF= ‘null' ;
- CREATE TABLE IF NOT EXISTS TRANSACTIONS
We have data files in our stage as shown in the previous list (ls) command. These files have certain formats that need to be defined in Snowflake in order for the data to be properly loaded. In this case, we are creating a file format named SIGMA_CSV that is specifying that the data in the files is delimited by commas, has been compressed, and how null values can be determined. We additionally created a table to hold the data we are about to load. More information regarding file formats can be found here.
- COPY INTO TRANSACTIONS FROM @SIGMA_LAB_STAGE FILE_FORMAT = SIGMA_CSV;
- This copies the data from our S3 bucket and loads it into our Transactions table. A SELECT COUNT(*) from the table will show we loaded 3.9 million rows into the table.
- GRANT USAGE ON DATABASE PC_SIGMA_DB TO ROLE PC_SIGMA_ROLE;
- GRANT USAGE ON SCHEMA PC_SIGMA_DB.EMBEDDED_LAB TO ROLE PC_SIGMA_ROLE;
- This allows the PC_SIGMA_ROLE to use the schema we created earlier in the lab.
- GRANT SELECT ON ALL TABLES IN SCHEMA PC_SIGMA_DB.EMBEDDED_LAB TO ROLE PC_SIGMA_ROLE;
- This allows our PC_SIGMA_ROLE to query against the transactions table we created.
- USE ROLE PC_SIGMA_ROLE;
- We completed granting access to the data we ingested to the PC_SIGMA ROLE. This command will now allow the user to start reporting on the data from Sigma using this role.
- SELECT * FROM TRANSACTIONS LIMIT 1000;
- A SELECT * against the transactions table should complete successfully and show the data we have loaded. If not, please go back and re-run the prior steps of this module using the SYSADMIN role to ensure permissions were granted to the new role appropriately.
Connecting Your Workbook to the Dataset
- Navigate to the Sigma tab that was previously opened through partner connect. Select the top left Paper Crane logo to navigate back to the Sigma homepage if you are not there already.
- Let's create a new Workbook and connect it to the data we just ingested into Snowflake. Click on the "+ Create New" button at the top left of the page and select "Create New Workbook".
- We are now inside a Sigma Workbook. Sigma workbooks are a collaborative canvas for data driven decision makers. Each workbook can have one or more pages, and each page has its own canvas. Each canvas supports one or more visual elements (e.g. charts, tables, controls, images, etc).
- We'll first add a new data source to our workbook. Click the "+" icon on the left hand side in the Elements sidebar, then select the "Table" option. For source, select "Tables and Datasets".
- On the resulting page, navigate to "Connections", expand the drop down, and click into "Snowflake PC_SIGMA_WH". Select "PC_SIGMA_DB", then navigate to the EMBEDDED_LAB schema and select the "TRANSACTIONS" table. You will notice Sigma automatically populates a preview of the table. Click "Select" to begin our data modeling.
In this segment we will begin to clean up the data set for our customer portal. We will create calculations, parse JSON, and build visualizations with the ultimate goal of creating and embedding a dashboard for our brand managers to get a sense of how their products are performing in Plugs Electronic's stores.
- First, let's save our workbook as ‘Customer Portal Workbook' by clicking "Save As" at the top of the page. We will then rename our workbook page to "Data" by clicking the down arrow next to "Page 1" in the bottom left of the UI.
- All workbooks are considered purely exploratory until you as their creator actively save the first version. This means you have one central location to start both your ad-hoc analysis and reporting. Once you begin exploring your data, you can choose to leave the unsaved workbook behind, or you can save it and continue to build it out as a report.
- Let's start by formatting our currency columns. Select the columns for Sale Amount, Item Price, and Wholesale Cost, then click the "$" icon by the formula bar. You will see that these columns are now formatted as currency.
- Double click on the Sale Amount column name and rename this column to Revenue.
- Click the drop down to the right of Purchase Date and select "Truncate Date - Day". This will now display the purchase timestamps as the date only. Rename the column to Purchase Date, by double clicking the column name and typing "Purchase Date". Repeat this process for the Cust Since column.
- On the far right side of the table you will see the CUST_JSON column. This column holds JSON data around the customers who made these purchases. Click on the arrow to the right of the column name, and choose "extract columns". This will bring up a window where Sigma has already identified the fields within the JSON object. Select Age Group and Cust_Gender, then click confirm. You will notice that Sigma intuitively parses this data out of the JSON object and adds it to the table as columns.
- To prevent end users from extracting more sensitive customer information, click the arrow next to the Cust_JSON column and select "Hide Column" from the drop down.
- Finally, click the Revenue column once more. For reporting purposes, we are going to remove the decimals so that our total Revenue metrics in the report are more aligned with the industry standard. With the column selected, click the decrease decimal places button in the top tool bar twice to turn this into a whole number.
- Every action we take in Sigma produces machine-generated SQL, optimized for Snowflake, that runs live against the warehouse. This ensures that the data is secure and up to date at all times. You can see the queries we are generating by clicking the dropdown next to the refresh button on the top right and selecting "Query History".
- If we navigate back to our Snowflake environment, we can see the queries being pushed down in our Snowflake query history view as well.
It is often easier to spot trends, outliers, or insights which lead to further questions when viewing data in a visualization. Sigma makes it easy to create visualizations of your data while also enabling you to dig into the data that makes up the visualization.
- Start the creation of a visualization by clicking on the Transaction table that we just built, then clicking the "Create Child Element" icon on the top right corner. Select "Visualization" to start creating a new chart.
- You will see a new visualization element has been created under our table. In the left-hand bar you will see a dropdown that lists all of the visualizations that Sigma currently supports. Select the bar chart.
- On the X-Axis click the plus icon and add our "Store Region" column. Notice that Sigma allows you to search for the column you would like to add. We can also drag values onto the axis instead of using the add button. Find "Revenue" in the list of columns and drag it to the Y-axis. The value will automatically aggregate and become "Sum of Revenue". Double click the header on the bar chart and name it Revenue by Store Region.
- Click the ‘kebab' (3 dots) on the top right hand side of the element. From the drop down, select ‘move to new page'. This will create a new page in our workbook to hold our visualizations. Rename this new page "Customer Portal".
- Now let's look at our sales over time to get an understanding of how we are trending. Another way to create a new chart is by selecting the plus icon on the top of the left hand panel next to our ‘Page Overview' title. Click on this icon to get a list of elements that we can add to our canvas, and choose ‘Viz'.
- After selecting the ‘Viz' icon, you will be prompted to select a source to use for the new visualization. You can see tabs for selecting:
- In Use : sources that are currently being used by other elements in the workbook.
- New : a new source that could be a table, dataset, SQL, or uploaded CSV.
- Page Elements : any data elements already in the workbook, such as the bar chart or table we created.
From the "In Use" tab, select the Workbook Element "Transactions".
- Click on the visualization drop down and select "Line". Next, drag the "Purchase Date" column to the X-Axis. (Optionally add it using the + icon next to the x-axis)
- We previously truncated our purchase date timestamp to the day of date, but can change this aggregation level for the visualization. Using the dropdown next to the field name, select a new aggregation level under the "Truncate Date" submenu. Let's change the aggregation level to be "Month".
- Next we can place our "Revenue" column on the Y-Axis to see our revenue over time. Again, Sigma has automatically summed the revenue to the monthly level.
- We now have a line graph with revenue by month. Let's add some more detail by breaking the series out by customer age group. To do this add "AGE_GROUP" to the color grouping section in the left sidebar.
- You will notice a sharp drop off on the right side of the chart where our data end. Right click on the right most data point, and select "exclude 2022-04". Once this is done, rename the visualization to "Revenue by Month & Age Group".
- Let's create one more visualization around our revenue generated. Again, select the "+" icon on the top left of the screen and select "Viz".
- For the data source, go to the In Use tab and select the Workbook Element "Transactions". For the visualization type, select "Single Value" from the drop down list.
- Next drag Revenue to the value. This will automatically sum the revenue across all transactions. Rename this visualization to Total Revenue by double clicking Sum of Revenue on the left hand side and typing Total Revenue.
- Finally, we want to share some transaction level data with our end users. From our transactions table on the data page, click "Create child element - table". This creates a new table from our Transactions table. Let's sort this table by purchase date descending, so that our most recent transactions are shown first. Finally, move this element to our Customer Portal page.
- Drag and drop the visualizations on the Customer Portal page so that the Total Revenue element is at the top, the line chart and bar graph are side by side below it, and the transactions table is at the bottom.
Let's add a filter to this data. We will do this by adding a control element to our canvas. Controls enable interactions with the charts such as filtering them for specific values. When clicking the ‘+' icon in the upper left hand pane, we will see options for control elements:
- Number Range: Creates a range of values you wish to look at
- List Values: Creates a list of values for users to choose from
- Text Box: Allows users to input free form text
- Switch: Allows users to filter on Boolean (true/false) values
- Drill Down: Allows you to specify specific drill paths
- Date : Allows users to filter for a specific date or date range
- Click the "+" icon on the upper left hand pane, then select "Date". This will add a Date control element to the canvas.
- After adding the "Date" control to our Customer Portal page, let's drag it to the top of the page and update the control_id to say "Date-Range" and update the control label to say "Select a Date Range".
- Next we need to tell the control which elements we want it applied to. Clicking on the filter control, we have some options in the left hand pane. Select "Targets", then choose "Add Target". Select the 3 visualizations we previously created- Revenue by Store Region, Revenue by Month & Customer Age, and Total Revenue.
- On the data page, right click on the drop down next to the column "Product Brand" and select the "Filter" option from the menu. A new filter will be added to the table.
- Click on the kebab menu to the right of the "Product Brand" filter and select "Convert to Page Control".
The filter will be added as a page control to the canvas. This product brand filter is additionally what we will pass into our embed URL to only serve up data related to the brand we are exploring. Since this filter started out with a target, there is no need to add one.
Finishing up the Canvas
In Sigma, you can add a variety of UI elements to your workbook to customize the look and feel. In this section, we will work with text elements to create a dynamic text element as the header for our workbook. When you click the ‘+' in the top left, you will see a variety of elements available to you such as:
- Image: Upload images or links to URLs to show an image on the canvas
- Button: Use buttons to navigate to other workbooks, websites, or download the workbook as a PDF
- Embed: Embed other websites or applications into your workbook
- Spacer: Use to add space between elements on the canvas
- Divider: Use to create hard divisions between sections of the canvas
- To start, navigate to your Customer Portal page, and click "add element". Under UI elements, select "Text".
- We are going to create a dynamic text element as the header for our page. In the text bar type ‘='. This will start the input of a formula. In the formula bar type If(CountDistinct([TRANSACTIONS/Product Brand]) >1, "All Brands", [TRANSACTIONS/Product Brand]). Hit Enter. You should now see a circle stating "All Brands". To the right of this, type "Sales Performance".
- Click your new dynamic text element to open the formatting bar at the top. Select "Large Heading" for the element size, and drag your text element to the top of the page. Finally, click the formatting option to center the element. This dynamic title will adjust based on the user we log into our portal as, and the brand we are exploring.
- On the bottom left, click the down arrow next to your ‘Data' page and select "Hide". This will hide the page with the underlying data set from your end users.
- Click Publish to save these changes.
Building the Application / Portal
We are now going to begin building our portal where we can embed our workbook. This will be a Sales Performance dashboard where the Plugs Electronics family of brands can log in to see how their products are performing in our store.
- First we will need to install Node.js. Node is going to allow us to set up a local server, as well as the front end portal, and securely embed our dashboards with row level security so that brands are not seeing each other's data. Download and install Node.js by going here: https://nodejs.org/
- Note, there are many programming languages and libraries you can use to code a client and server side application, this just happens to be the one we will be using today.
- Once downloaded, double click on the download and go through the installation steps. This should only take a minute.
- Now open the app_embed folder that was provided to you. This holds the shell for the portal we will be building today.
- First, open the two files index.html and server.js in your favorite text editor.
- If we look first at the index. html file, we can see it is just a basic HTML page that calls an API to the backend server. js file. When the server is running, you will be able to access the page by going to the URL http://localhost:3000 in your browser. This is also where you would define the client-facing website if you wanted to customize the look and feel of the portal.
- If we move over to the server.js file, we can start to see what is expected. Sigma requires a variety of parameters to be set when requesting a dashboard. These parameters not only ensure security, but also allow for flexible interaction with filters and parameters within the dashboard.
- <dashboard_embed_path> is the Embed Path that is generated on the dashboard you wish to embed.
- <random_nonce_value> is a random unique string (less than 255 characters). Sigma uses this to prevent application embed URLs from being shared and reused.
- <allow_export_boolean> (optional) is a boolean (true/false) parameter that controls whether the viewer of the dashboard will be able to download data from the dashboard visualizations. If this parameter is not specified, the URL will default to false and viewers will not be able to download data.
- <session_length> is the number of seconds after <unix_timestamp> that the URL will remain valid. After the specified number of seconds, the Application Embed will no longer display new values.
- <unix_timestamp> is the current time as a UNIX timestamp. Sigma uses this in combination with the <session_length> to determine if your link has expired. The URL is valid after the <unix_timestamp> and before the <session_length> expiration.
- <control_id> and <control_value> are the ID and value of a dashboard control you'd wish to pass through to the dashboard. You may pass multiple control IDs and values. This will allow you to customize what your viewers see. Note: All controls must exist on the dashboard. This is to ensure changes to a dashboard do not cause data to be visible unintentionally.
- < mode > Determines the end users permissions. For this lab, it will be set to mode = explore to allow end users to take advantage of Sigma's exploratory capabilities.
We need to install a couple of libraries in order to get this working: express and uuid. These libraries are used to construct a unique signature for your embed URLs when combined with a secret key provided by Sigma. This makes it so that no one is able to ever modify and request the dashboard other than the server.
- Go back to the Finder, right-click on the app_embed folder, and select "New Terminal at Folder".
- Now we can install the needed libraries by issuing the following command: npm install express uuid.
There are two key edits we need to make in order for the server to use our workbook, the Secret Key and the Embed URL of our workbook. We will obtain both of these pieces of information next.
Generating an Application Embedding Secret in Sigma
- If we go back to Sigma, we can generate our secret key that is needed for application embedding. You can do this by clicking on your user icon on the top right of the screen and selecting "Administration".
- On the account screen there is an Embedding section with a subheading labeled "Application Embedding". Click on the "Add" button to generate the key.
- You will now get your secret key that can be used for all embedded workbooks. Please make sure to save this key in a secure place. If you lose it, you will have to re-generate a new key.
- Copy this key and place it in the server.js file where it says "YOUR_SECRET_KEY_HERE".
Generating an Embed URL for your Workbook
- Now, if we go back to your Sigma workbook by clicking on the back button in the top left, we can retrieve the embed URL.
- Find the drop-down icon next to the dashboard name in the top header bar and select "Embedding".
- Next, click on the tab labeled "Application(0)" and use the dropdown to generate an embed path for the entire workbook.
- Now it has generated the embed path for your dashboard. Select COPY to copy this URL, then paste it into the server.js file where it says "YOUR WORKBOOK PATH HERE".
- Save your server.js file.
- Once complete, we are ready to start our server. Back in your terminal, you can run the following command to start the server: node server.js
- Now that the server is running, we can visit our portal by going to http://localhost:3000 in our browser.
- You will notice that your workbook shows All Brands Sales Performance as we have not added any row level security yet. We will see changes in this title following the next section.
- Now we might want to put some row level security on this dashboard, so that brands can only see data related to the sale of their own products. Navigate back to your data page in your Sigma workbook.
- On the data page, find the page control we created previously for Product-Brand. When we select it the left panel will show its properties. Find the Control ID and copy the value. It should be a value similar to "Product-Brand".
- Click Publish to save the changes.
- Navigate to your server.js file and un-comment the field that contains the control_id by deleting the "//" before searchParams at the beginning of the line. Here is where we can place the control_id from our workbook and pass a value to set that control. Today, we will hardcode a value, but these can always be set in a more dynamic fashion based on user properties. For more information on more dynamic security for embedding, please see the Sigma docs here. Update control-id to ‘Product-Brand' (or whatever the control ID was labeled in your workbook) and the controlValue to ‘Samsung' as shown in the photos below.
Before the changes in step 1 your file looks like this:
After changes it should look like this:
- Save your server.js file and navigate back to your terminal. Here we need to stop the server by pressing Control + C. This will exit the running server process. We can then start it again with our new configuration by running the command Node server.js.
- Now if you go back to your browser and reload the web page, you should notice that we only see data for Samsung now. You will additionally notice that the dynamic text we created for the header now reads "Samsung Sales Performance" rather than "All Brands Sales Performance".
For more details on how to set up dynamic row-level security in Sigma, refer here.
- Looking at the customer portal, click maximize element in the top right of the Revenue by Store Region bar chart.
- Lets get some additional insights from this dashboard. Drag Product Type to the color category.
- Collapse the bar chart by clicking minimize element in the same place you clicked to expland it.
- Next, scroll down to the Transactions table and click Maximize element.
- Scroll right to the store region column. Click the drop down arrow and select "Group Column". Then, scroll to the Order Channel column, and again select "Group Column". We can now see that our data has been aggregated at two levels, first the store region, then the order channel.
- Click the drop down next to Order Channel and select "Add new column".
- In the formula bar, type CountDistinct([Order Number]). Rename this column Number of Orders.
- Again, click the drop down next to Order Channel and select "Add Column". This time, type Sum([Revenue]) in the formula bar. What we can now see is Revenue generated for a specific region by the Order Channel the purchase was made from.
- With the Sum of Revenue column selected, click the paintbrush to the right of the table icon in the left hand tool pane.
- Select "Conditional Formatting", then click "Data Bars".
- Click the minus to the left of Order Channel to collapse the view at the aggregate level. You should now see revenue generated by different order channels across regions. Based on this view, we can tell that the South region has the largest amount of revenue tied to in store purchases (Brick & Mortar order channel) and that we might want to focus our in store marketing efforts here.
- Minimize the element using the arrows in the top right to collapse this new visualization back into the larger page.
Thank you for your participation in this hands-on lab. To learn more about how real businesses are leveraging Snowflake & Sigma for embedded use cases, check out our webinar here : How iPipeline Leverages Snowflake and Sigma
What We Have Covered
- Creating objects in Snowflake and ingesting data
- Connecting Sigma to Snowflake
- Creating workbooks, visualizations, and filters
- Embedding Sigma workbooks into an application
- Adding row level security to an embed
- Exploring an embedded Sigma workbook