In this guide, we walk through how to use LiveRamp to create an Identity Key on top of Snowflake.
Being able to join, consolidate and measure different data sources that are using different keys (Name, Address, Phone, Email, Cookies, Maids, Connected TVs) by using LiveRamp Identity resolution keys.
Note: To complete this quickstart, you must have access to the full version of LiveRamp Identity Resolution and Transcoding. Please reach out to the LiveRamp team to get access.
Snowflake Account
LiveRamp has a video describing a basic use case as well as a demo of the execution.
For the PII execution mode we will be using the standard process. The Metadata table describes the PII components for use in the process as well as specify the type of identity resolution will be executed. The following is described in more detail for other types here
Keep track of what will be used for the following variables and will be used in the below SQL:
The Completed Metatable will be built using the following SQL. If you do not have a PII component, please remove it from the SQL.
create or replace table identifier($customer_meta_table_name) as
select
    TO_VARCHAR(DECRYPT(ENCRYPT('<client_id>', 'HideFromLogs'), 'HideFromLogs'), 'utf-8') as client_id,
    TO_VARCHAR(DECRYPT(ENCRYPT('<client_secret>', 'HideFromLogs'), 'HideFromLogs'), 'utf-8') as client_secret,
    'resolution' as execution_mode,
    'pii' as execution_type,
    parse_json($$
    {
      "name": ["<up to 4 name column names>"],
      "streetAddress": ["<up to 7 address column names>"],
      "city": "<city column>",
      "state": "<state column>",
      "zipCode": "<zipcode column>",
      "phone": "<phone column>",
      "email": "<email column>"
    }
    $$) as target_columns,
    1 as limit;
To set up the permissions for the tables used for resolution, run the SQL in the Execution Steps worksheet shown below
grant usage on database identifier ($customer_db_name) to application identifier($application_name);
grant usage on schema identifier ($customer_schema_name) to application identifier($application_name);
grant select on table identifier ($customer_input_table_name) to application identifier($application_name);
grant select on table identifier ($customer_meta_table_name) to application identifier($application_name);
use database identifier ($application_name);
use schema lr_app_schema;
More details for this step can be found here
The following SQL will execute the identity resolution process. The stored procedure will. be located in your Snowflake account. It will be listed under the You need to replace the following variable below
call lr_resolution_and_transcoding(
    <customer_input_table_name>,
    <customer_meta_table_name>,
    <output_table_name>
);
When the above process completes you will need to run one more process to get the output table visible in your warehouse. You will use the output_table variable from the initial job invocation
call check_for_output(
	$output_table_name
);
A table should now be visible in your Snowflake warehouse under the LiveRamp native application
Since you have executed a PII Resolution job there are two additional columns that LiveRamp includes in the output table These columns provide the level at which the included PII was resolved against the LiveRamp application. The following columns will be included in an additional columns called __lr_filter_name
The following SQL will use a variable that defines your output from the previous step.
SET final_output = '$output_table_name';
Once you have that table you can check the frequency of each match on your output table.
select __LR_RANK, count(__LR_RANK) as Count1 from identifier($final_output)
group by (__LR_RANK)
order by Count1 desc;
select __LR_FILTER_NAME, count(__LR_FILTER_NAME) as Count1 from identifier($final_output)
group by (__LR_FILTER_NAME)
order by Count1 desc;
--select FILTER_NAME, count(FILTER_NAME) as count1 from identifier($final_output);
--group by FILTER_NAME
--ORDER BY COUNT1 DESC;
select FILTER_NAME, count(FILTER_NAME) as Count1 from  LR_APP_SHARE_RESOLUTION_DEMO.LR_JOB_SCHEMA.DS_TEST_1A_POST_V5
group by (FILTER_NAME)
order by Count1 desc;
It is also recommended to look at the number of maintained and derived identifiers which are based on the first two bytes.
SELECT 
    SUBSTR(RAMPID, 1, 2) AS first_two_chars,
    COUNT(*) AS frequency
FROM 
    identifier($final_output)
GROUP BY 
    SUBSTR(RAMPID, 1, 2)
ORDER BY 
    frequency DESC;
It is always good to validate the output count from your final execution
SELECT COUNT(RAMPID) FROM   identifier($final_output);
More Documentation can be found here
The following SQL will create the correct Translation input table.
CREATE OR REPLACE TABLE <Translation Input Table>
AS SELECT RAMPID , 'ZZZZ' AS TARGET_DOMAIN , 'RampID' AS TARGET_TYPE FROM <Table with a valid RampID Column>;
For the Translation execution mode we will be using the standard process. The Metadata table describes the components for use in the process as well as specify the type of translation that will be executed. The following is described in more detail for other types here
Keep track of what will be used for the following variables and will be used in the below SQL:
The Completed Metatable will be built using the following SQL.
create or replace table identifier($customer_meta_table_name) as
select
    TO_VARCHAR(DECRYPT(ENCRYPT('<client_id>', 'HideFromLogs'), 'HideFromLogs'), 'utf-8') as client_id,
    TO_VARCHAR(DECRYPT(ENCRYPT('<client_secret>', 'HideFromLogs'), 'HideFromLogs'), 'utf-8') as client_secret,
    'transcoding' as execution_mode,
    'transcoding' as execution_type,
    '<column to be translated>' as target_column,
    '<column containing target domain>' as target_domain_column,
    '<column containing target type>' as target_type_column;
To set up the permissions for the tables used for resolution, run the SQL in the Execution Steps worksheet shown below
grant usage on database identifier ($customer_db_name) to application identifier($application_name);
grant usage on schema identifier ($customer_schema_name) to application identifier($application_name);
grant select on table identifier ($customer_input_table_name) to application identifier($application_name);
grant select on table identifier ($customer_meta_table_name) to application identifier($application_name);
use database identifier ($application_name);
use schema lr_app_schema;
More details for this step can be found here
The following SQL will execute the identity resolution process. The stored procedure will. be located in your Snowflake account. It will be listed under the You need to replace the following variable below
call lr_resolution_and_transcoding(
    <customer_input_table_name>,
    <customer_meta_table_name>,
    <output_table_name>
);
When the above process completes you will need to run one more process to get the output table visible in your warehouse. You will use the output_table variable from the initial job invocation
call check_for_output(
	$output_table_name
);
A table should now be visible in your Snowflake warehouse under the LiveRamp native application
This section will describe some sample audits for the Translation process. Each client is encouraged to reach out to their LiveRamp representative for more examples and guidance on audits.
Set Up some variables for use in the SQL below. You will need the following:
-- setting the input table for the rest of the script
set audit_table = 'your Transcoding Output';
set audit_table_output = 'Extr Audit Table';
Once you have set up the variables you can execute the different audits
This first audit is a count where the translation did not work as expected
-- Counts of where the input and output are different.
SELECT COUNT(*) AS count_of_different_lengths
FROM  identifier($audit_table)
WHERE LENGTH(rampid) <> LENGTH(TRANSCODED_IDENTIFIER);
This will create a new table that is the results of several queries
-- create a table of the rows that are different for audit review   
CREATE or REPLACE TABLE identifier($audit_table_output) AS
SELECT *,
       CASE 
           WHEN  TRANSCODED_IDENTIFIER LIKE '%Error, target key missing for transcoding%' THEN 'Target Access Mismatch'
           WHEN  TRANSCODED_IDENTIFIER LIKE '%Error, Invalid RampID Format. Overall.%' THEN 'Data Input Mismatch'
           WHEN LENGTH(rampid) <> LENGTH(TRANSCODED_IDENTIFIER) 
                AND LEFT(rampid, 2) <> LEFT(TRANSCODED_IDENTIFIER, 2) THEN 'Size and Maintained Mismatch'
           WHEN LENGTH(rampid) <> LENGTH(TRANSCODED_IDENTIFIER) THEN 'Size Mismatch'
           WHEN LEFT(rampid, 2) <> LEFT(TRANSCODED_IDENTIFIER, 2) THEN 'Maintained Mismatch'
       END AS mismatch_type
FROM identifier($audit_table)
WHERE LENGTH(rampid) <> LENGTH(TRANSCODED_IDENTIFIER)
   OR LEFT(rampid, 2) <> LEFT(TRANSCODED_IDENTIFIER, 2)
   OR TRANSCODED_IDENTIFIER LIKE '%Error, target key missing for transcoding%'
   OR TRANSCODED_IDENTIFIER LIKE '%Error, Invalid RampID Format. Overall%';
Run a select off the audit table you created
-- quick check of output table
select * from identifier($audit_table_output)
limit 10;
You have now completed the audit of the translation table
You are now ready to create a file for collaboration with your partner. With the ability to generate RampIDs that match your partner's, you now have a shared exchange key. The SQL query below will generate an attribute file by replacing your RampID with the client's RampID. The resulting output table can be shared, exported, or leveraged for further collaborative efforts with your partner.
Making sure you share your identified attributes but using the newly created Translation table you have a key that can be exchanged with your partner
CREATE TABLE NewTable AS
SELECT 
    t1.* EXCLUDE (RampID),
    t2.Transcoded_Identifier
FROM 
    <Your RampID and Attribute Table> t1
JOIN 
    <Your Output from Translation> t2
ON 
    t1.RampID = t2.RampID;
After creating make sure to run a quick select on the table to make sure it has what you expect.
select * from NewTable limit 100;
Congratulations! You have successfully learned how to transform your Data Cloud into an Identity Resolution Engine.
This lab was designed as a hands-on introduction to Snowflake and LiveRamp Identity Resolution and Transcoding to simultaneously teach best practices on how to use them together in an effective manner.
We encourage you to continue with your Identity and Transcoding journey by loading in more of your own sample or production data and by using some of the more advanced capabilities not covered in this lab.
Want to learn more about LiveRamp and Identity? Check out the following resources: