Cortex Agents orchestrate across both structured and unstructured data sources to deliver insights. They plan tasks, use tools to execute these tasks, and generate responses. Agents use Cortex Analyst (structured) and Cortex Search (unstructured) as tools, along with LLMs, to analyze data. Cortex Search extracts insights from unstructured sources, while Cortex Analyst generates SQL to process structured data. Comprehensive support for tool identification and tool execution enables the delivery of sophisticated applications grounded in enterprise data.
Amazon Q Business is a generative AI assistant that transforms how work gets done in your organization. With specialized capabilities for software developers, business intelligence analysts, contact center employees, supply chain analysts, and anyone building with AWS, Amazon Q helps every employee get insights on their data and accelerate their tasks. Leveraging Amazon Q's advanced agentic capabilities, companies can streamline processes, get to decisions faster, and help employees be more productive.
Cortex Search: A Snowflake service that combines advanced search capabilities to find relevant information within any text data stored in your organization's Snowflake environment. It takes care of all the complex technical processes automatically, allowing organizations to implement powerful search features without worrying about the underlying technical infrastructure.

Amazon Q Business Plugin: An integration tool that connects Amazon Q Business with other business services and data sources through secure, customizable connections. These plugins enhance Amazon Q's functionality by allowing it to safely access and work with various company systems and services while maintaining security standards.

Amazon Q Business Chat Interface: The main user interface where you can interact with Amazon Q Business through a conversational format to access company information and services. Users can ask questions and make requests using natural language, and the interface provides responses by gathering and synthesizing information from connected business systems.
Users will create an intelligent search system for movie script PDFs and structured movie data using Snowflake Cortex AI and Amazon Q Business. Snowflake Cortex AI will process and index the unstructured PDF movie scripts, making them searchable through advanced text analysis. Amazon Q Business will then provide a natural language interface, allowing users to ask questions about the scripts in conversational language and receive relevant answers. For example, users could ask about specific dialogues, scene descriptions, or character interactions across different movie scripts, and the system will retrieve and present the relevant information. This implementation demonstrates how to transform raw PDF scripts into an interactive, queryable knowledge base that understands and responds to complex questions about movie content.
The end-to-end workflow will look like this:
Ingest data into structured and unstructured data stores, then build an end-to-end copilot workflow on the unstructured data in Snowflake.
In this section, we'll set up Amazon Q Business, a generative AI-powered assistant that enables natural language interaction with business data. We'll create and configure a Q Business application that connects to our Snowflake database, enabling intelligent querying of our movie script data. While we're using movie scripts for this example, these same steps can be applied to analyze any type of business documents in your organization.
Congratulations, you have now created your first Q Business App!
After clicking Create, you should have been automatically routed to your Q Business App home page. Here you can see all of the information regarding your application. Be sure to copy the Deployed URL and store it in a notes page or text file; we will need it later to set up authorization between Snowflake and Q Business.
Before we continue, we need to make sure we have configured a user to access our Q Business application.
Well done! From this section of the lab you have successfully configured your Q Business Application and created a user that has permissions to access the application.
In this section, we'll create the foundation for an AI-powered movie data analysis system. By configuring Snowflake Cortex Search, Cortex Analyst and a Cortex Agent, you'll build a system that can process, store, and intelligently search through movie scripts. Once completed, this setup will allow users to ask natural language questions about movie content and receive relevant answers through Amazon Q Business.
CREATE OR REPLACE DATABASE movielens;
CREATE OR REPLACE SCHEMA movielens.movies;
CREATE OR REPLACE SCHEMA movielens.data;
CREATE OR REPLACE WAREHOUSE workshopwh;
USE DATABASE movielens;
USE SCHEMA data;
CREATE TABLE movies_dashboard (
    movie_id NUMBER,
    movie_title VARCHAR,
    movie_release_year INTEGER,
    genre VARCHAR,
    user_rating FLOAT,
    rating_timestamp TIMESTAMP_NTZ,
    user_id NUMBER,
    user_firstname VARCHAR,
    user_lastname VARCHAR,
    user_city VARCHAR,
    user_state VARCHAR,
    user_country VARCHAR,
    user_email VARCHAR,
    user_phonenumber VARCHAR,
    interaction_timestamp NUMBER,
    interaction_type VARCHAR
);
CREATE OR REPLACE STAGE MOVIEDASHBOARD
URL='s3://hol-qs-bucket/'
FILE_FORMAT = (TYPE = 'csv');
COPY INTO movies_dashboard FROM @MOVIEDASHBOARD/movies_dashboard.csv
FILE_FORMAT=(TYPE = 'csv' FIELD_DELIMITER = ',' SKIP_HEADER = 1);
USE WAREHOUSE workshopwh;
USE DATABASE movielens;
USE SCHEMA data;
CREATE STAGE DOCS
DIRECTORY = ( ENABLE = true )
ENCRYPTION = ( TYPE = 'SNOWFLAKE_SSE' );
Once uploaded, you should see your PDF file listed on the stage page, validating that the movie script was successfully uploaded.
--Create Table for text data
CREATE OR REPLACE TABLE SCRIPT_TABLE AS
SELECT
    'toy-story-script' AS doc,
    SNOWFLAKE.CORTEX.PARSE_DOCUMENT(@MOVIELENS.DATA.DOCS, 'toy-story-1995.pdf', {'mode': 'LAYOUT'}) AS script_text;
-- Create table with chunked text
CREATE OR REPLACE TABLE SCRIPT_TABLE_CHUNK AS
SELECT
    TO_VARCHAR(c.value) AS CHUNK_TEXT,
    DOC
FROM
    SCRIPT_TABLE,
    LATERAL FLATTEN(input => SNOWFLAKE.CORTEX.SPLIT_TEXT_RECURSIVE_CHARACTER(
        TO_VARCHAR(script_text:content),
        'none',
        700,
        100
    )) c;
SELECT * FROM SCRIPT_TABLE_CHUNK;
Note: The code splits the text into 700-character chunks with 100-character overlaps. These numbers can be adjusted later to optimize your search results. For more details about text processing options, see the Snowflake documentation.
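To build intuition for what chunking with overlap produces, here is a simplified, illustrative splitter in plain Python. It only does fixed-size slicing; Snowflake's SPLIT_TEXT_RECURSIVE_CHARACTER additionally tries to split on natural separators (paragraphs, sentences) before falling back to fixed sizes, so this is a sketch of the sizing behavior, not the actual algorithm.

```python
def chunk_text(text: str, chunk_size: int = 700, overlap: int = 100):
    """Slice text into fixed-size chunks where each chunk repeats the
    last `overlap` characters of the previous one."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - overlap, 1), step)]

chunks = chunk_text("x" * 1500)
print(len(chunks))       # 3
print(len(chunks[0]))    # 700
```

The overlap means the end of each chunk is repeated at the start of the next, which helps a search service retrieve passages whose relevant text would otherwise straddle a chunk boundary.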
-- Create Search Service
CREATE OR REPLACE CORTEX SEARCH SERVICE SCRIPT_SEARCH_SRV
    ON CHUNK_TEXT
    ATTRIBUTES DOC
    WAREHOUSE = WORKSHOPWH
    TARGET_LAG = '30 day'
    AS (
        SELECT CHUNK_TEXT, DOC FROM SCRIPT_TABLE_CHUNK
    );
CREATE OR REPLACE STAGE models DIRECTORY = (ENABLE = TRUE);
The service automatically refreshes every 30 days and allows filtering by document name using the DOC attribute.
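A search service like this can also be queried directly over Snowflake's Cortex Search REST API. The sketch below builds a request body for that endpoint, assuming the service created above; the question text is made up for illustration, and the exact path and payload shape should be verified against the current Snowflake documentation.

```python
# Endpoint path for the Cortex Search REST API, relative to the account URL
# (path convention assumed from the Cortex Search REST API; verify in docs).
SERVICE_PATH = ("/api/v2/databases/MOVIELENS/schemas/DATA"
                "/cortex-search-services/SCRIPT_SEARCH_SRV:query")

payload = {
    "query": "Who does Woody talk to first?",          # free-text search query
    "columns": ["CHUNK_TEXT", "DOC"],                  # columns to return
    "filter": {"@eq": {"DOC": "toy-story-script"}},    # filter on the DOC attribute
    "limit": 5,                                        # max results
}
```

The `filter` key shows why DOC was declared as an ATTRIBUTE: attributes are the columns you can filter on at query time.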
Well done! With this upload you have now created a Cortex Analyst service.
In this section, we create a stored procedure that passes a Cortex Agent specification to the Cortex Agents API, using the Search and Analyst services we just created.
CREATE OR REPLACE PROCEDURE CALL_CORTEX_AGENT_PROC(query STRING, limit INT)
RETURNS VARIANT
LANGUAGE PYTHON
RUNTIME_VERSION = '3.9'
PACKAGES = ('snowflake-snowpark-python')
HANDLER = 'call_cortex_agent_proc'
AS
$$
import json
import _snowflake
import re
from snowflake.snowpark.context import get_active_session

def call_cortex_agent_proc(query: str, limit: int = 10):
    session = get_active_session()
    API_ENDPOINT = "/api/v2/cortex/agent:run"
    API_TIMEOUT = 50000  # milliseconds
    CORTEX_SEARCH_SERVICES = "MOVIELENS.DATA.SCRIPT_SEARCH_SRV"
    SEMANTIC_MODELS = "@MOVIELENS.DATA.MODELS/movie_dashboard.yaml"

    query = (
        "You are an assistant tasked with answering questions about the movie Toy Story. "
        "Please summarize and answer this question concisely: " + query
    )

    payload = {
        "model": "claude-3-5-sonnet",
        "messages": [{"role": "user", "content": [{"type": "text", "text": query}]}],
        "tools": [
            {"tool_spec": {"type": "cortex_analyst_text_to_sql", "name": "analyst1"}},
            {"tool_spec": {"type": "cortex_search", "name": "search1"}}
        ],
        "tool_resources": {
            "analyst1": {"semantic_model_file": SEMANTIC_MODELS},
            "search1": {"name": CORTEX_SEARCH_SERVICES, "max_results": limit}
        }
    }

    try:
        resp = _snowflake.send_snow_api_request(
            "POST", API_ENDPOINT, {}, {}, payload, None, API_TIMEOUT
        )
        if resp["status"] != 200:
            return {"error": resp["status"]}
        response_content = json.loads(resp["content"])
        return process_cortex_response(response_content, session)
    except Exception as e:
        return {"error": str(e)}

def clean_text(text):
    """Cleans up unwanted characters and symbols from search results."""
    text = re.sub(r'[\u3010\u3011\u2020\u2021]', '', text)
    text = re.sub(r'^\s*ns\s+\d+\.*', '', text)
    return text.strip()

def process_cortex_response(response, session):
    """Parses the Cortex Agent response and executes SQL if provided."""
    result = {"type": "unknown", "text": None, "sql": None, "query_results": None}
    full_text_response = []

    for event in response:
        if event.get("event") == "message.delta":
            data = event.get("data", {})
            delta = data.get("delta", {})
            for content_item in delta.get("content", []):
                content_type = content_item.get("type")
                if content_type == "tool_results":
                    tool_results = content_item.get("tool_results", {})
                    for result_item in tool_results.get("content", []):
                        if result_item.get("type") == "json":
                            json_data = result_item.get("json", {})
                            if "sql" in json_data:
                                # Cortex Analyst returned generated SQL: run it and capture rows
                                result["type"] = "cortex_analyst"
                                result["sql"] = json_data["sql"]
                                result["text"] = json_data.get("text", "")
                                try:
                                    query_results = session.sql(result["sql"]).collect()
                                    result["query_results"] = [row.as_dict() for row in query_results]
                                except Exception as e:
                                    result["query_results"] = {"error": str(e)}
                            elif "searchResults" in json_data:
                                # Cortex Search returned text chunks, optionally with citations
                                result["type"] = "cortex_search"
                                formatted_results = []
                                for sr in json_data.get("searchResults", []):
                                    search_text = clean_text(sr.get("text", "").strip())
                                    citation = sr.get("citation", "").strip()
                                    if search_text:
                                        if citation:
                                            formatted_results.append(f"- {search_text} (Source: {citation})")
                                        else:
                                            formatted_results.append(f"- {search_text}")
                                if formatted_results:
                                    full_text_response.extend(formatted_results)
                elif content_type == "text":
                    text_piece = clean_text(content_item.get("text", "").strip())
                    if text_piece:
                        full_text_response.append(text_piece)

    result["text"] = "\n".join(full_text_response) if full_text_response else "No relevant search results found."
    return result
$$;
CALL call_cortex_agent_proc('what is the dinosaurs name in toy story?', 5);
Now you can access that stored procedure from external sources like Q Business!
The final step is setting up OAuth authentication. This creates a secure connection between Snowflake and Amazon Q Business and ensures that only authorized requests can access your movie script data.
--create custom oauth
CREATE OR REPLACE SECURITY INTEGRATION Q_AUTH_HOL
TYPE = OAUTH
ENABLED = TRUE
OAUTH_ISSUE_REFRESH_TOKENS = TRUE
OAUTH_REFRESH_TOKEN_VALIDITY = 3600
OAUTH_CLIENT = CUSTOM
OAUTH_CLIENT_TYPE = CONFIDENTIAL
OAUTH_REDIRECT_URI = '<Deployed URL>/oauth/callback';
GRANT USAGE ON DATABASE MOVIELENS TO ROLE PUBLIC;
GRANT USAGE ON SCHEMA MOVIELENS.DATA TO ROLE PUBLIC;
GRANT USAGE ON CORTEX SEARCH SERVICE SCRIPT_SEARCH_SRV TO ROLE PUBLIC;
GRANT READ ON STAGE MODELS TO ROLE PUBLIC;
GRANT USAGE ON PROCEDURE CALL_CORTEX_AGENT_PROC(VARCHAR, NUMBER) TO ROLE PUBLIC;
GRANT USAGE ON WAREHOUSE WORKSHOPWH TO ROLE PUBLIC;
DESC INTEGRATION Q_AUTH_HOL;
SELECT SYSTEM$SHOW_OAUTH_CLIENT_SECRETS('Q_AUTH_HOL');
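SYSTEM$SHOW_OAUTH_CLIENT_SECRETS returns its result as a JSON string. The sketch below shows how a client ID and secret could be pulled out of such a string; the key names follow Snowflake's documented output and the sample values are entirely made up. Keep the real values out of source control, because the Q Business plugin setup will ask you to paste both.

```python
import json

# Hypothetical sample of what SYSTEM$SHOW_OAUTH_CLIENT_SECRETS('Q_AUTH_HOL')
# returns; the key names are from Snowflake docs, the values are fake.
sample = '{"OAUTH_CLIENT_ID": "abc123=", "OAUTH_CLIENT_SECRET": "xyz789="}'

creds = json.loads(sample)
client_id = creds["OAUTH_CLIENT_ID"]          # paste into the Q Business plugin config
client_secret = creds["OAUTH_CLIENT_SECRET"]  # store in AWS Secrets Manager, not in code
```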
Next, we'll create a Q Business Custom Plugin to connect our Q Business Application with the Cortex Search in Snowflake. This integration enables the Q Business chatbot interface to access and analyze the movie script data stored in Snowflake.
Plugin to connect to movie script data from Snowflake.
openapi: 3.0.0
info:
  title: Cortex Agent via Stored Procedure
  version: 1.0.0
servers:
  - url: https://SFSENORTHAMERICA-HOL_MATTMARZILLO.snowflakecomputing.com
paths:
  /api/v2/statements:
    post:
      summary: Call Cortex Agent stored procedure
      description: Calls the stored procedure MOVIELENS.DATA.call_cortex_agent_proc(query, 5) using the SQL API.
      parameters:
        - in: header
          name: X-Snowflake-Authorization-Token-Type
          required: true
          description: Custom Snowflake OAuth header
          schema:
            type: string
            enum: ["OAUTH"]
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/QueryRequest'
      responses:
        '200':
          description: Successful stored procedure execution
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/QueryResponse'
      security:
        - oauth2: []
components:
  schemas:
    QueryRequest:
      type: object
      required:
        - statement
        - warehouse
        - role
      properties:
        statement:
          type: string
          description: The SQL statement to execute
        warehouse:
          type: string
          default: WORKSHOPWH
          example: WORKSHOPWH
        role:
          type: string
          default: PUBLIC
          example: PUBLIC
      example:
        statement: CALL MOVIELENS.DATA.call_cortex_agent_proc('What is Toy Story about?', 5)
        warehouse: WORKSHOPWH
        role: PUBLIC
    QueryResponse:
      type: object
      description: The response returned from the Snowflake SQL API.
      properties:
        data:
          type: array
          description: The result rows returned.
          items:
            type: object
            additionalProperties: true
        request_id:
          type: string
          description: ID of the SQL request.
      required:
        - data
        - request_id
  securitySchemes:
    oauth2:
      type: oauth2
      flows:
        authorizationCode:
          authorizationUrl: https://SFSENORTHAMERICA-HOL_MATTMARZILLO.snowflakecomputing.com/oauth/authorize
          tokenUrl: https://SFSENORTHAMERICA-HOL_MATTMARZILLO.snowflakecomputing.com/oauth/token-request
          scopes:
            session:role:PUBLIC: Use PUBLIC role in Snowflake
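To make the QueryRequest schema concrete, here is a sketch of the JSON body a client would POST to the Snowflake SQL API's /api/v2/statements endpoint. The question string is illustrative; a real plugin request also carries the OAuth bearer token and the X-Snowflake-Authorization-Token-Type: OAUTH header described in the spec.

```python
# Build a request body matching the QueryRequest schema: the user's question
# is interpolated into a CALL of the stored procedure (single quotes, since
# double quotes denote identifiers in Snowflake SQL).
question = "What is Toy Story about?"
body = {
    "statement": f"CALL MOVIELENS.DATA.call_cortex_agent_proc('{question}', 5)",
    "warehouse": "WORKSHOPWH",
    "role": "PUBLIC",
}
```

Note that interpolating user text into SQL like this assumes the question contains no single quotes; a production plugin should escape or bind the value instead.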
Some things to note:
Great job! Your plugin is now ready to use within Q Business to query movie script data from Snowflake.
OPTIONAL (Duration: 20 minutes)
This lab introduces participants to Amazon Q in QuickSight, whose dashboard-authoring capabilities empower business analysts to swiftly build, uncover, and share valuable insights using natural language prompts. It simplifies data understanding for business users through a context-aware Q&A experience, executive summaries, and customizable data stories.
Participants will connect the Snowflake table movies_dashboard to Amazon QuickSight to generate an interactive dashboard. This lab covers both personas – Authors (analysts) and Readers (business users/consumers) – and the following Amazon Q in QuickSight features:
This section is essential for integrating Snowflake data with Amazon QuickSight, enabling users to leverage QuickSight's visualization and analysis capabilities. By configuring a Snowflake data source and using custom SQL to query the movies_dashboard table, users ensure that the relevant data is accessible for creating interactive dashboards and reports.
Create and refine Dashboard as BI Author
Go to Amazon QuickSight on the console.
SELECT * FROM movies.movies_dashboard;
In Amazon QuickSight, SPICE and Direct Query represent different approaches to data access and analysis. SPICE involves importing data into QuickSight's in-memory engine for faster performance, while Direct Query retrieves data directly from the source in real-time. The choice between them depends on factors like data size, freshness requirements, and performance needs. Refer to the blog: Best practices for Amazon QuickSight SPICE and direct query mode for further information.
We shall proceed as a BI Author.
Next, let's create a Data Story for the dashboard. Creating a data story provides stakeholders with insights into how various factors, such as movie ratings, genres, user demographics, and interactions, affect movie performance and user engagement. By typing a descriptive prompt, selecting visuals from the published dashboard, and building the report, users generate a comprehensive narrative that aids in understanding the data and making informed decisions. This step is crucial for creating a meaningful, actionable report that can guide content production, marketing strategies, and user experience improvements, and it allows these insights to be shared with others. Please note that data story drafts are not meant to replace your own ideas or analysis; they are a starting point to customize and expand on as needed.
This report helps stakeholders understand how different factors such as movie ratings, genres, user demographics, and interactions impact overall movie performance and user engagement. It can guide decisions on content production, marketing strategies, and user experience improvements.
Congratulations, you have successfully extracted relevant insights from your movie dataset in Snowflake, enabling you to make informed business decisions based on the generated report with Amazon Q in QuickSight!
This quickstart is just that: a quick way to get started using Amazon Q with Snowflake Cortex. From here, you can extend it in the following ways:
- Scale the workflow to a use case with many documents and a more robust Cortex Search service.
- Scale agents to include more robust Analyst services and multiple Analyst and Search services.
- Use a Cortex Q plugin alongside QuickSight to get next-level answers on the data represented in your dashboards.
- Use multiple plugins to Cortex from Q, along with AWS services, to create a robust web app for getting plain-language answers from your data.
There are some great blogs on Medium about how Snowflake Cortex and Amazon services work together: