Sharing information between departments or business units ("domains") of a company is critical for success. Sharing and consuming data assets is more successful if data is shared as a product. A data product is a collection of related data objects plus metadata, such as a business description, ownership and contact information, service level objectives, data dictionary, and more. In a Data Mesh, data products are typically also subject to various data management and organizational principles.
Snowflake Internal Marketplace enables companies to publish documented and governed data products, so they are discoverable and understandable for data consumers. Optionally, data quality metrics and SLOs can be included to make the product more trustworthy. The marketplace also offers rich capabilities to manage access to data products and wrap detailed governance around them to control which consumers can use which data products or which parts of a data product.
The setup instructions for this lab describe all the steps for you to create the 3 accounts, domain profiles, and roles shown in the diagram below.
The internal marketplace exists by default. It does not need to be created. But, you will configure it with provider profiles for the different business units via the organization account. The organization account is a recent Snowflake capability to optionally monitor and manage a set of regular accounts.
The setup follows these steps:
For Steps 2 through 7 you can download scripts here and execute them in different accounts as per the instructions below. In the Snowflake UI you can easily import these scripts like this:
Sign up for a trial account here
admin
sales_admin
that will be used throughout this lab.admin
to your Primary Account from Step 1 and execute the following commands in a worksheet.STEP2_setup_primary_account.sql
-- Run this code in your PRIMARY Account
-- Make sure you update the four variables below (email_var, firstname_var, lastname_var, and pwd_var)
USE ROLE accountadmin;
-- Use the same name and email for all accounts
set email_var = 'FILL_IN_YOUR_EMAIL';
set firstname_var = 'FILL_IN_YOUR_FIRST_NAME';
set lastname_var = 'FILL_IN_YOUR_LAST_NAME';
-- Use the same password for users in all accounts
set pwd_var = 'FILL_IN_YOUR_PASSWORD';
CREATE OR REPLACE WAREHOUSE compute_wh WAREHOUSE_SIZE=xsmall INITIALLY_SUSPENDED=TRUE;
GRANT ALL ON WAREHOUSE compute_wh TO ROLE public;
-- Create a user and role for the sales domain:
USE ROLE accountadmin;
CREATE OR REPLACE ROLE sales_data_scientist_role;
SET my_user_var = CURRENT_USER();
ALTER USER identifier($my_user_var) SET DEFAULT_ROLE = sales_data_scientist_role;
CREATE OR REPLACE USER sales_admin
PASSWORD = $pwd_var
LOGIN_NAME = sales_admin
DISPLAY_NAME = sales_admin
FIRST_NAME = $firstname_var
LAST_NAME = $lastname_var
EMAIL = $email_var
MUST_CHANGE_PASSWORD = FALSE
DEFAULT_WAREHOUSE = compute_wh
DEFAULT_ROLE = sales_data_scientist_role
COMMENT = 'Sales domain admin';
GRANT ROLE sales_data_scientist_role TO USER sales_admin;
GRANT ROLE accountadmin TO USER sales_admin; -- for simplicity in this lab
GRANT CREATE SHARE ON ACCOUNT TO ROLE sales_data_scientist_role;
GRANT CREATE ORGANIZATION LISTING ON ACCOUNT TO ROLE sales_data_scientist_role;
-- Next, create a user and role for the marketing domain:
USE ROLE accountadmin;
CREATE OR REPLACE ROLE marketing_analyst_role;
CREATE OR REPLACE USER marketing_admin
PASSWORD = $pwd_var
LOGIN_NAME = marketing_admin
DISPLAY_NAME = marketing_admin
FIRST_NAME = $firstname_var
LAST_NAME = $lastname_var
EMAIL = $email_var
MUST_CHANGE_PASSWORD = FALSE
DEFAULT_WAREHOUSE = compute_wh
DEFAULT_ROLE = marketing_analyst_role
COMMENT = 'Marketing domain admin';
GRANT ROLE marketing_analyst_role TO USER marketing_admin;
GRANT CREATE SHARE ON ACCOUNT TO ROLE marketing_analyst_role;
GRANT CREATE ORGANIZATION LISTING ON ACCOUNT TO ROLE marketing_analyst_role;
USE ROLE orgadmin;
GRANT MANAGE LISTING AUTO FULFILLMENT ON ACCOUNT TO ROLE sales_data_scientist_role;
GRANT MANAGE LISTING AUTO FULFILLMENT ON ACCOUNT TO ROLE marketing_analyst_role;
While waiting for the email, you can go ahead and run the following parts.
Now, run the following commands to create the next two accounts that you need.
-- Run this code in your PRIMARY account
-- Create a secondary account in the same region (default!):
USE ROLE orgadmin;
CREATE ACCOUNT hol_account2
admin_name = supply_chain_admin
admin_password = $pwd_var
first_name = $firstname_var
last_name = $lastname_var
email = $email_var
must_change_password = false
edition = enterprise;
-- Create an organization account for admin purposes:
CREATE ORGANIZATION ACCOUNT hol_org_account
admin_name = org_admin
admin_password = $pwd_var
first_name = $firstname_var
last_name = $lastname_var
email = $email_var
must_change_password = false
edition = enterprise;
-- Get an overview of all the accounts in the organization.
-- This SHOW command should return 3 rows:
SHOW ACCOUNTS;
HOL_ACCOUNT2
In a separate browser tab, log in to the account you created in step 1 (HOL_ACCOUNT2
) and set up this account.
supply_chain_admin
user to your account HOL_ACCOUNT2
from Step 2 and execute the following commands in a worksheet (Use the code below or download it from the file STEP3(HOL_ACCOUNT2)_setup_hol_account2.sql
)-- Run this in hol_account2, logged in as supply_chain_admin user
-- Make sure you run this as ACCOUNTADMIN
USE ROLE accountadmin;
CREATE OR REPLACE WAREHOUSE compute_wh WAREHOUSE_SIZE=xsmall INITIALLY_SUSPENDED=TRUE;
GRANT ALL ON WAREHOUSE compute_wh TO ROLE public;
CREATE ROLE supply_chain_admin_role;
GRANT ROLE accountadmin TO ROLE supply_chain_admin_role; -- for simplicity in this lab
GRANT ROLE supply_chain_admin_role TO USER supply_chain_admin;
ALTER USER supply_chain_admin
SET DEFAULT_ROLE = supply_chain_admin_role;
USE ROLE supply_chain_admin_role;
CREATE DATABASE supply_chain_db;
Login to the Organization Account HOL_ORG_ACCOUNT
as the org_admin
user and execute the following commands in a worksheet.
You can also download code below from the file STEP4(HOL_ORG_ACCOUNT)_configure_org_account.sql
from the repository:
-- Login to the Organization Account HOL_ORG_ACCOUNT and execute the following commands in a worksheet.
USE ROLE accountadmin;
CREATE OR REPLACE WAREHOUSE compute_wh WAREHOUSE_SIZE=xsmall INITIALLY_SUSPENDED=TRUE;
GRANT ALL ON WAREHOUSE compute_wh TO ROLE public;
-- Rename the Primary Account:
USE ROLE globalorgadmin;
-- execute the following two commands together,
-- no other commands in between:
show accounts;
SET my_curr_account = (SELECT "account_name" FROM TABLE(RESULT_SCAN(LAST_QUERY_ID())) order by "created_on" ASC LIMIT 1);
-- View and rename the account:
SELECT $my_curr_account;
ALTER ACCOUNT identifier($my_curr_account)
RENAME TO hol_account1 SAVE_OLD_URL = true;
-- Enable users with the ACCOUNTADMIN role to set up Cross-Cloud Auto-Fulfillment
SELECT SYSTEM$ENABLE_GLOBAL_DATA_SHARING_FOR_ACCOUNT('hol_account1');
SELECT SYSTEM$ENABLE_GLOBAL_DATA_SHARING_FOR_ACCOUNT('hol_account2');
SHOW ACCOUNTS;
-- You should see 3 rows similar to the image below.
-- Make a note of your account names, URLs, and passwords!
Continue working as the org_admin
user in your Organization Account HOL_ORG_ACCOUNT
to create data provider profiles. You will set up profiles for 3 business domains: Sales, Marketing, and Supply chain.
STEP5(HOL_ORG_ACCOUNT)_create_org_profiles.sql
STEP5(HOL_ORG_ACCOUNT)_create_org_profiles.sql
in a worksheet.STEP6(HOL_ACCOUNT1)_create_lab_database.sql
HOL_ACCOUNT1
as the sales_admin
user and run the downloaded script STEP6(HOL_ACCOUNT1)_create_lab_database.sql
script in a worksheetSTEP7a(HOL_ACCOUNT1)_create_sample_listing.sql
hol_account1
as marketing_admin
and run the script STEP7a.STEP7b(HOL_ACCOUNT2)_create_sample_listing.sql
hol_account2
as supply_chain_admin
and run the script STEP7b.Setup is now complete!
In this section you will work in HOL_ACCOUNT1
to create and publish an organizational listing.
The publishing flow consists of 5 steps:
Login in to HOL_ACCOUNT1
as user sales_admin
.
Now let's select the data objects that we want to share in this data product.
Next you set the access control for the data product. Click on the gray +Access Control button.
For this first data product we keep it simple and stick with the defaults:
As a result, every data consumer will need to request access to obtain approval to use the data product. Click on Set up request approval flow to proceed.
You could configure an external workflow engine for the request approval process. But for this lab we choose to Manage requests in Snowflake. The email address for notifications defaults to the one from the Sales profile but could be changed.
After you confirm the approval flow settings, Snowflake prompts you for one more configuration. Here is why: this listing is configured to be discoverable by the entire organization. What if you add another account to the organization but in a different cloud region? Then Snowflake would transparently perform incremental replication to that region to minimize egress cost. As the data provider you can choose the frequency of this replication.
So lets (1) Review the settings, (2) Change the replication interval to daily (1 Days), and then (3) Save the settings for this listing:
Data products should be understandable and trustworthy for data consumers so let's add additional metadata to describe the product (see screenshot below).
ORGDATACLOUD$SALES$ORDER_INSIGHTS
. -- Title: Explore the Order Summary View:
SELECT *
FROM ORGDATACLOUD$SALES$ORDER_INSIGHTS.SF1.ORDER_SUMMARY
LIMIT 100;
-- Title: Use the UDF to obtain order details for one customer:
SELECT customer_name, country, orderkey, orderdate, AMOUNT
FROM TABLE(ORGDATACLOUD$SALES$ORDER_INSIGHTS.sf1.orders_per_customer(60001));
customer
, orders
, and order_summary
to be featured.Click the blue Publish button in the top right corner.
Your data product is now live! You can see it when you navigate to the Internal Marketplace.
In this section you will request access to the new data product for the Marketing domain and the Supply chain domain.
HOL_ACCOUNT1
a s the marketing_admin
user.marketing_admin
user, Snowflake will now prompt you to do so, and you can follow the dialog to resend the verification email.Now let's also request access for the Supply chain team.
HOL_ACCOUNT2
as the supply_chain_admin
user.Let's switch back to the perspective of the data product owner to review and grant the access requests.
HOL_ACCOUNT1
as the sales_admin
user.Switch from Needs Review to Resolved Requests to see the history of requests.
Now that access has been granted, let's go back to the consumer roles:
HOL_ACCOUNT2
as the supply_chain_admin
user. (Keep this tab alive for the rest of the lab.)HOL_ACCOUNT1
as the marketing_admin
user and perform the same steps.What happens when the data owner decides to update the data product?
sales_admin
user in HOL_ACCOUNT1
use schema tpch.sf1;
use role sales_data_scientist_role;
SELECT customer_name, country, orderkey, orderdate, AMOUNT
FROM TABLE(orders_per_customer(60001));
-- Customer 60001 moves from Kenya to Mozambique !
UPDATE customer SET c_nationkey = 16 WHERE c_custkey = 60001;
HOL_ACCOUNT2
as supply_chain_admin
. In the worksheet "Order Insights - Examples" run the second sample query again:-- Use the UDF to obtain the order details for one customer
SELECT customer_name, country, orderkey, orderdate, AMOUNT
FROM TABLE(ORGDATACLOUD$SALES$ORDER_INSIGHTS.sf1.orders_per_customer(60001));
Let's examine some simple techniques for row- and column-level access control across domains.
sales_admin
user in HOL_ACCOUNT1
use schema tpch.sf1;
use role sales_data_scientist_role;
SELECT *
FROM order_summary
LIMIT 100;
The data steward of the Sales domain has requested the following access restrictions:
Implement the following policy to make your data product compliant:
use schema tpch.sf1;
use role sales_data_scientist_role;
CREATE OR REPLACE ROW ACCESS POLICY country_filter AS (country INTEGER)
RETURNS boolean ->
CASE
WHEN current_account_name() = 'HOL_ACCOUNT1'
AND current_role() = 'SALES_DATA_SCIENTIST_ROLE'
THEN true
WHEN current_account_name() = 'HOL_ACCOUNT1'
AND current_role() = 'MARKETING_ANALYST_ROLE'
AND country = 3 /* Canada */
THEN true
WHEN current_account_name() = 'HOL_ACCOUNT2'
AND country = 24 /* USA */
THEN true
ELSE false
END;
ALTER TABLE nation ADD ROW ACCESS POLICY country_filter ON (n_nationkey);
Before we review the impact of this policy on the data consumers, let's look at a different governance requirement that requires column masking.
The data steward of the Sales domain has requested the following data masking to be enforced:
Implement the following policy to make your data product compliant:
CREATE OR REPLACE MASKING POLICY order_mask AS (value INT, cutoff_date DATE) RETURNS INT ->
CASE
WHEN current_account_name() = 'HOL_ACCOUNT1'
AND current_role() = 'SALES_DATA_SCIENTIST_ROLE'
THEN value
WHEN current_account_name() = 'HOL_ACCOUNT1'
AND current_role() = 'MARKETING_ANALYST_ROLE'
AND cutoff_date >= '1996-01-01'
THEN value
WHEN current_account_name() = 'HOL_ACCOUNT2'
AND cutoff_date >= '1996-01-01'
THEN value
ELSE null
END;
ALTER VIEW order_summary ALTER COLUMN order_amount
SET MASKING POLICY order_mask USING (order_amount, o_orderdate);
ALTER TABLE orders ALTER COLUMN o_totalprice
SET MASKING POLICY order_mask USING (o_totalprice, o_orderdate);
ALTER TABLE lineitem ALTER COLUMN l_extendedprice
SET MASKING POLICY order_mask USING (l_extendedprice, l_commitdate);
Let's see how the Supply Chain and Marketing teams are affected by the new policies.
HOL_ACCOUNT2
as supply_chain_admin
.SELECT *
FROM ORGDATACLOUD$SALES$ORDER_INSIGHTS.SF1.ORDER_SUMMARY
LIMIT 100;
HOL_ACCOUNT1
as the marketing_admin
user and execute the same query.As soon as you remove roles or accounts, the listing is no longer accessible for this role and/or account.
In this section we will review further capabilities for managing and monitoring listings as a data product owner or as an organization data steward:
HOL_ACCOUNT1
as the sales_admin
userAs an organization admin you can query the organization_usage.access_history view to audit the access to all data products on the internal marketplace.
HOL_ORG_ACCOUNT
and run the following query to obtain a list of all queries against the ORDER_INSIGHTS data product including the user, role, timestamp, and SQL text of the access as well as the set of provider policies that have governed the access at that point in time.use role globalorgadmin;
use warehouse compute_wh;
select q.account_name,
q.user_name,
q.role_name,
q.query_text,
q.start_time,
q.end_time,
a.direct_objects_accessed,
a.provider_base_objects_accessed,
a.provider_policies_referenced
from SNOWFLAKE.ORGANIZATION_USAGE.ACCESS_HISTORY a,
SNOWFLAKE.ORGANIZATION_USAGE.QUERY_HISTORY q
where a.query_id = q.query_id
and q.query_text ilike '%ORDER_INSIGHTS%'
order by query_start_time desc;
So far this lab has managed listings mainly through the Snowflake UI. But, data owners and data consumers can also work with listings programmatically through the Listing API.
In this section we point you to some of the most commonly used commands. These are not end-to-end exercises to create or alter listings, but we encourage you to experiment with some of these commands.
SHOW LISTINGS;
to list the listings that you own or have permission to manage.DESCRIBE LISTING listing-name;
to obtain additional details for one specific listing.SHOW LISTINGS;
;DESCRIBE LISTING
scroll to the right to the column MANIFEST_YAML and copy its column value to a worksheets.CREATE ORGANIZATION LISTING
with a YAML file and SHARE name to create a new listing programmatically.ALTER LISTING
to make changes to a listing such as:ALTER LISTING
commandSHOW AVAILABLE LISTINGS IS_ORGANIZATION = TRUE;
to list all the internal marketplace listings that your current role is allowed to discover.DESCRIBE AVAILABLE LISTING listing_global_name
to get more details on one particular listing.SHOW AVAILABLE LISTINGS
command.listing global name
is a different kind of listing identifier than the listing name
.CREATE DATABASE name FROM LISTING listing_global_name;
if you operate as a data consumer in a different account and you want to mount the listing as a local database.Congratulations, you completed this Snowflake Internal Marketplace journey! You have seen how data products can be authored, published, requested, consumed, and governed. These are key capabilities for sharing documented and understandable data products across business units with governance and compliance control controls.