In this Quickstart guide, you will be help the fictitious food truck company, Tasty Bytes, to identify where their customer experience may be falling short at the truck and business level by leveraging Snowflake Cortex. The company gathers customer reviews across multiple sources and languages to assess their food truck operations. This comprehensive feedback helps them identify areas for improvement, ultimately boosting customer satisfaction and loyalty. Leveraging Snowflake Cortex's advanced language AI capabilities, they can automatically process reviews through real-time translation, generate actionable insights through intelligent summarization, and analyze customer sentiment at scale – transforming diverse, unstructured feedback into strategic business decisions that drive their food truck operations forward.
You will need the following things before beginning:
In this quickstart, you will learn:
You will use Snowsight, the Snowflake web interface, to:
USE ROLE sysadmin;
/*--
• database, schema and warehouse creation
--*/
-- create tb_voc database
CREATE OR REPLACE DATABASE tb_voc;
-- create raw_pos schema
CREATE OR REPLACE SCHEMA tb_voc.raw_pos;
-- create raw_customer schema
CREATE OR REPLACE SCHEMA tb_voc.raw_support;
-- create harmonized schema
CREATE OR REPLACE SCHEMA tb_voc.harmonized;
-- create analytics schema
CREATE OR REPLACE SCHEMA tb_voc.analytics;
-- create tasty_ds_wh warehouse
CREATE OR REPLACE WAREHOUSE tasty_ds_wh
WAREHOUSE_SIZE = 'large'
WAREHOUSE_TYPE = 'standard'
AUTO_SUSPEND = 60
AUTO_RESUME = TRUE
INITIALLY_SUSPENDED = TRUE
COMMENT = 'data science warehouse for tasty bytes';
USE WAREHOUSE tasty_ds_wh;
/*--
• file format and stage creation
--*/
CREATE OR REPLACE FILE FORMAT tb_voc.public.csv_ff
type = 'csv';
CREATE OR REPLACE STAGE tb_voc.public.s3load
COMMENT = 'Quickstarts S3 Stage Connection'
url = 's3://sfquickstarts/tastybytes-voc/'
file_format = tb_voc.public.csv_ff;
/*--
raw zone table build
--*/
-- menu table build
CREATE OR REPLACE TABLE tb_voc.raw_pos.menu
(
menu_id NUMBER(19,0),
menu_type_id NUMBER(38,0),
menu_type VARCHAR(16777216),
truck_brand_name VARCHAR(16777216),
menu_item_id NUMBER(38,0),
menu_item_name VARCHAR(16777216),
item_category VARCHAR(16777216),
item_subcategory VARCHAR(16777216),
cost_of_goods_usd NUMBER(38,4),
sale_price_usd NUMBER(38,4),
menu_item_health_metrics_obj VARIANT
);
-- truck table build
CREATE OR REPLACE TABLE tb_voc.raw_pos.truck
(
truck_id NUMBER(38,0),
menu_type_id NUMBER(38,0),
primary_city VARCHAR(16777216),
region VARCHAR(16777216),
iso_region VARCHAR(16777216),
country VARCHAR(16777216),
iso_country_code VARCHAR(16777216),
franchise_flag NUMBER(38,0),
year NUMBER(38,0),
make VARCHAR(16777216),
model VARCHAR(16777216),
ev_flag NUMBER(38,0),
franchise_id NUMBER(38,0),
truck_opening_date DATE
);
-- order_header table build
CREATE OR REPLACE TABLE tb_voc.raw_pos.order_header
(
order_id NUMBER(38,0),
truck_id NUMBER(38,0),
location_id FLOAT,
customer_id NUMBER(38,0),
discount_id VARCHAR(16777216),
shift_id NUMBER(38,0),
shift_start_time TIME(9),
shift_end_time TIME(9),
order_channel VARCHAR(16777216),
order_ts TIMESTAMP_NTZ(9),
served_ts VARCHAR(16777216),
order_currency VARCHAR(3),
order_amount NUMBER(38,4),
order_tax_amount VARCHAR(16777216),
order_discount_amount VARCHAR(16777216),
order_total NUMBER(38,4)
);
-- truck_reviews table build
CREATE OR REPLACE TABLE tb_voc.raw_support.truck_reviews
(
order_id NUMBER(38,0),
language VARCHAR(16777216),
source VARCHAR(16777216),
review VARCHAR(16777216),
review_id NUMBER(18,0)
);
/*--
• harmonized view creation
--*/
-- truck_reviews_v view
CREATE OR REPLACE VIEW tb_voc.harmonized.truck_reviews_v
AS
SELECT DISTINCT
r.review_id,
r.order_id,
oh.truck_id,
r.language,
source,
r.review,
t.primary_city,
oh.customer_id,
TO_DATE(oh.order_ts) AS date,
m.truck_brand_name
FROM tb_voc.raw_support.truck_reviews r
JOIN tb_voc.raw_pos.order_header oh
ON oh.order_id = r.order_id
JOIN tb_voc.raw_pos.truck t
ON t.truck_id = oh.truck_id
JOIN tb_voc.raw_pos.menu m
ON m.menu_type_id = t.menu_type_id;
/*--
• analytics view creation
--*/
-- truck_reviews_v view
CREATE OR REPLACE VIEW tb_voc.analytics.truck_reviews_v
AS
SELECT * FROM harmonized.truck_reviews_v;
/*--
raw zone table load
--*/
-- menu table load
COPY INTO tb_voc.raw_pos.menu
FROM @tb_voc.public.s3load/raw_pos/menu/;
-- truck table load
COPY INTO tb_voc.raw_pos.truck
FROM @tb_voc.public.s3load/raw_pos/truck/;
-- order_header table load
COPY INTO tb_voc.raw_pos.order_header
FROM @tb_voc.public.s3load/raw_pos/order_header/;
-- truck_reviews table load
COPY INTO tb_voc.raw_support.truck_reviews
FROM @tb_voc.public.s3load/raw_support/truck_reviews/;
-- scale wh to medium
ALTER WAREHOUSE tasty_ds_wh SET WAREHOUSE_SIZE = 'Medium';
CREATE OR REPLACE TABLE CONCATENATED_REVIEWS AS
WITH RANKED_REVIEWS AS (
SELECT
TRUCK_BRAND_NAME,
REVIEW,
ROW_NUMBER() OVER (PARTITION BY TRUCK_BRAND_NAME ORDER BY REVIEW) AS ROW_NUM
FROM TRUCK_REVIEWS_V
),
FILTERED_REVIEWS AS (
SELECT *
FROM RANKED_REVIEWS
WHERE ROW_NUM <= 20
),
AGGREGATED_REVIEWS AS (
SELECT
TRUCK_BRAND_NAME,
ARRAY_AGG(REVIEW) AS ALL_REVIEWS
FROM FILTERED_REVIEWS
GROUP BY TRUCK_BRAND_NAME
),
CONCATENATED_REVIEWS AS (
SELECT
TRUCK_BRAND_NAME,
ARRAY_TO_STRING(ALL_REVIEWS, ' ') AS ALL_REVIEWS_TEXT
FROM AGGREGATED_REVIEWS
)
SELECT * FROM CONCATENATED_REVIEWS;
-- setup completion note
SELECT 'Setup is complete' AS note;
You will use Snowsight, the Snowflake web interface, to create Snowflake notebook by importing notebook.
tb_voc
, schema analytics
and warehouse tasty_ds_wh
You will leverage Translate - one of the Snowflake Cortex specialized LLM functions are available in Snowpark ML:
This is done within the notebook using following code snippet in cell CORTEX_TRANSLATE
.
# Conditionally translate reviews that are not english using Cortex Translate
reviews_df = reviews_df.withColumn('TRANSLATED_REVIEW',when(F.col('LANGUAGE') != F.lit("en"), \
cortex.Translate(F.col('REVIEW'), \
F.col('LANGUAGE'), \
"en")) \
.otherwise(F.col('REVIEW')))
reviews_df.filter(F.col('LANGUAGE') != F.lit("en")).select(["REVIEW","LANGUAGE","TRANSLATED_REVIEW"]).show(3)
-- Add the TRANSLATED_REVIEW column with conditional translation
WITH TRANSLATED_REVIEWS AS (
SELECT
REVIEW,
LANGUAGE,
CASE
WHEN LANGUAGE != 'en' THEN SNOWFLAKE.CORTEX.TRANSLATE(REVIEW, LANGUAGE, 'en')
ELSE REVIEW
END AS TRANSLATED_REVIEW
FROM TRUCK_REVIEWS_V
)
-- Filter rows where the LANGUAGE is not English and select the desired columns
SELECT
REVIEW,
LANGUAGE,
TRANSLATED_REVIEW
FROM TRANSLATED_REVIEWS
WHERE LANGUAGE != 'en'
LIMIT 3;
In this section, you will leverage Snowflake Cortex LLM - Summarize to quickly understand what the customers are saying:
summarized_reviews_df = session.table("CONCATENATED_REVIEWS").select(
F.col("TRUCK_BRAND_NAME"),
cortex.Summarize(F.col("ALL_REVIEWS_TEXT")).alias("SUMMARY")
)
summarized_reviews_df.select(["TRUCK_BRAND_NAME", "SUMMARY"]).show(3)
-- Generate summaries for each truck brand
WITH SUMMARIZED_REVIEWS AS (
SELECT
TRUCK_BRAND_NAME,
SNOWFLAKE.CORTEX.SUMMARIZE(ALL_REVIEWS_TEXT) AS SUMMARY
FROM CONCATENATED_REVIEWS
)
SELECT * FROM SUMMARIZED_REVIEWS;
In this section, you will make use of Snowflake Cortex LLM - ClassifyText to categories reviews to understand:
# To understand whether a customer would recommend food truck based on their review
text_description = """
Tell me based on the following food truck customer review, will they recommend the food truck to \
their friends and family? Answer should be only one of the following words - \
"Likely" or "Unlikely" or "Unsure".
"""
reviews_df = reviews_df.withColumn('RECOMMEND', cortex.ClassifyText(F.col('REVIEW'),["Likely","Unlikely","Unsure"], test_description))\
.withColumn('CLEAN_RECOMMEND', when(F.contains(F.col('RECOMMEND'), F.lit('Likely')), \
F.lit('Likely')) \
.when(F.contains(F.col('RECOMMEND'), F.lit('Unlikely' )), \
F.lit('Unlikely')) \
.when(F.contains(F.col('RECOMMEND'), F.lit('Unsure' )), \
F.lit('Unsure')))
reviews_df.select(["REVIEW","CLEAN_RECOMMEND"]).show(3)
WITH CLASSIFIED_REVIEWS AS (
SELECT
REVIEW,
PARSE_JSON(SNOWFLAKE.CORTEX.CLASSIFY_TEXT(
REVIEW,
['Likely', 'Unlikely', 'Unsure'],
OBJECT_CONSTRUCT('task_description',
'Tell me based on the following food truck customer review, will they recommend the food truck to their friends and family?'
)
)):label::TEXT AS RECOMMEND
FROM TRUCK_REVIEWS_V
)
SELECT * From CLASSIFIED_REVIEWS limit 3;
In this section, you will leverage Snowflake Cortex LLM - Complete to get answers to your specific questions:
question = "What is the number one dish positively mentioned in the feedback?"
summarized_reviews_df = session.table("CONCATENATED_REVIEWS").select(
F.col("TRUCK_BRAND_NAME"),
cortex.Complete(
"mistral-large2",
F.concat(
F.lit("Context: "),
F.col("ALL_REVIEWS_TEXT"),
F.lit(f" Question: {question} Answer briefly and concisely and only name the dish:")
)
).alias("NUMBER_ONE_DISH")
)
summarized_reviews_df.show(3)
-- Gain Learnings from a specific question
WITH GAIN_LEARNINGS AS (
SELECT
TRUCK_BRAND_NAME,
SNOWFLAKE.CORTEX.COMPLETE(
'mistral-large2',
'Context:' || ALL_REVIEWS_TEXT || ' Question: What is the number one dish positively mentioned in the feedback? Answer briefly and concisely and only name the dish:'
) AS NUMBER_ONE_DISH
FROM CONCATENATED_REVIEWS
)
SELECT TRUCK_BRAND_NAME, NUMBER_ONE_DISH FROM GAIN_LEARNINGS LIMIT 3;
Next, you will look at another task specific LLM function in Cortex - Sentiment.
CORTEX_SENTIMENT
.# Understand the sentiment of customer review using Cortex Sentiment
reviews_df = reviews_df.withColumn('SENTIMENT', cortex.Sentiment(F.col('REVIEW')))
reviews_df.select(["REVIEW","SENTIMENT"]).show(3)
SELECT
REVIEW,
SNOWFLAKE.CORTEX.SENTIMENT(REVIEW) AS SENTIMENT
FROM TRUCK_REVIEWS_V
LIMIT 3;
Congratulations! You've mastered powerful customer analytics using Snowflake Cortex, processing multilingual reviews and extracting valuable insights – all while maintaining data security within Snowflake's ecosystem. By leveraging these built-in AI capabilities, you've eliminated the complexity of managing external infrastructure while keeping sensitive customer feedback protected within Snowflake's secure environment.
With the completion of this quickstart, you have now:
Want to learn more about the tools and technologies used in this quickstart? Check out the following resources: