In this Quickstart, you'll build a chatbot that allows you to chat with the data shared in a Snowflake Cortex Knowledge Extension.
Snowflake Cortex Knowledge Extensions allow teams to integrate up-to-date, licensed third-party information, such as news, research, and specialized publications, directly into their AI systems. This integration respects publishers' intellectual property while enabling access to relevant and current knowledge.
We'll build the chatbot using a few key Snowflake features. Here's how we'll use each one:
Snowflake Marketplace
We'll obtain the Cortex Knowledge Extension from Snowflake Marketplace and load it into a Snowflake account.
Snowflake Cortex Knowledge Extension
We'll load the Snowflake Cortex Knowledge Extension into our account after locating it in Snowflake Marketplace. The extension contains the data that our chatbot will draw on to answer questions.
Snowflake Cortex AI
We'll use Snowflake Cortex AI's COMPLETE function to generate the chatbot's responses. COMPLETE takes a prompt that we provide and passes it to an LLM hosted natively in Snowflake.
Streamlit in Snowflake
We'll build the application's front-end using Streamlit in Snowflake.
Cortex Knowledge Extensions (CKEs) allow publishers to bring their documents (e.g. news articles, market research reports, and books) to customers in their generative AI applications (e.g. chat interfaces and agentic platforms).
CKEs are shared Cortex Search Services on the Snowflake Marketplace that integrate with Cortex generative AI applications following the retrieval-augmented generation (RAG) pattern.
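To make the RAG pattern concrete, here is a minimal Python sketch of its three steps: retrieve, augment, generate. It is an illustration under stated assumptions, not the Quickstart's application code. The function names, the prompt wording, and the two stand-in callables are all hypothetical. In a real app, `retrieve` would query the CKE and `generate` would call an LLM.

```python
from typing import Callable, Dict, List


def build_prompt(question: str, chunks: List[str]) -> str:
    """Augment the user's question with the retrieved chunks."""
    context = "\n\n".join(chunks)
    return (
        "Answer the question using only the context below.\n"
        f"<context>\n{context}\n</context>\n"
        f"<question>\n{question}\n</question>"
    )


def answer(
    question: str,
    retrieve: Callable[[str], List[Dict[str, str]]],
    generate: Callable[[str], str],
) -> str:
    """Retrieve relevant chunks, augment the prompt, then generate a reply."""
    results = retrieve(question)               # 1. retrieval (the CKE's role)
    chunks = [row["chunk"] for row in results]
    prompt = build_prompt(question, chunks)    # 2. augmentation
    return generate(prompt)                    # 3. generation (the LLM's role)


# Hypothetical stand-ins so the sketch runs without a Snowflake connection.
def fake_retrieve(query: str) -> List[Dict[str, str]]:
    return [{"chunk": "A table stores rows and columns."}]


def fake_generate(prompt: str) -> str:
    return f"(model reply based on {len(prompt)} prompt characters)"


if __name__ == "__main__":
    print(answer("What is a table in Snowflake?", fake_retrieve, fake_generate))
```

Passing the retriever and generator in as callables keeps the flow easy to test and makes it clear which step each Snowflake feature would take over.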
Here is how it works:
To complete this lab, you'll need a Snowflake account. A free Snowflake trial account will work just fine. To open one:
If you choose a different cloud provider, or select a different region within AWS, you will not be able to complete the Quickstart.
In a Cortex Knowledge Extension, a provider securely shares chunked data along with other important attributes, such as the source of the data, which can be used in RAG chatbots. Although a consumer can use the data in the extension, the data is fully secure and remains proprietary to the publisher.
Here's how the chatbot will function:
We're going to build a chatbot that allows you to ask questions of the official Snowflake documentation in natural language. This is possible because Snowflake has published an official Cortex Knowledge Extension that includes chunked data (technical documentation) which is usable in a RAG chatbot. The extension is available on Snowflake Marketplace, so let's start by loading it into our account.
Let's dive in!
Let's start by "loading" the relevant Cortex Knowledge Extension into our account. It turns out that "loading" is really the wrong word here.
We're using Snowflake's unique agentic product sharing capability in Snowflake Marketplace. Because of this, we don't need to write any logic to copy data into our Snowflake account. Instead, we can directly access the extension shared by a trusted provider in Snowflake Marketplace. Let's start.
Let's explore the extension to understand how it works.
USE ROLE accountadmin;
USE DATABASE cortex_knowledge_extension_snowflake_documentation;
USE SCHEMA shared;
USE WAREHOUSE compute_wh;
DESCRIBE CORTEX SEARCH SERVICE cke_snowflake_docs_service;
You should see some metadata on the CKE_SNOWFLAKE_DOCS_SERVICE service in the account. Note in particular the search column, named CHUNK, and the additional attribute columns DOCUMENT_TITLE and SOURCE_URL.
The CHUNK column represents the chunked knowledge that can be used in our RAG chatbot, and the attribute columns allow us to build references to the source knowledge. For example, we can use them to cite the docs page that the end user can use to confirm the correctness of the answer.
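As a rough sketch of how the attribute columns could be turned into citations, the Python helper below builds a de-duplicated reference list from retrieved rows. It assumes each row is a dictionary keyed by lowercase column names (`document_title`, `source_url`). The function name and the output format are hypothetical choices for illustration.

```python
from typing import Dict, List


def format_references(results: List[Dict[str, str]]) -> str:
    """Build a de-duplicated reference list from the attribute columns."""
    seen = set()
    lines = []
    for row in results:
        url = row.get("source_url", "")
        if not url or url in seen:
            continue  # skip rows without a URL and pages already listed
        seen.add(url)
        title = row.get("document_title") or url  # fall back to the URL
        lines.append(f"- {title}: {url}")
    return "\n".join(lines)
```

Several chunks often come from the same docs page, so de-duplicating on `source_url` keeps the reference list short.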
SELECT
  snowflake.cortex.search_preview(
    'cortex_knowledge_extension_snowflake_documentation.shared.cke_snowflake_docs_service',
    '{ "query": "What is a table in Snowflake?", "columns": ["chunk", "document_title", "source_url"] }'
  );
In the console, you'll see a JSON object returned. It contains the chunks of knowledge most relevant to the question posed to the extension, along with the values of the attribute columns for each chunk. Great job! Now let's build the app.
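If you want to work with this call from Python, the sketch below shows one way to build the JSON argument and unpack the returned JSON. It is a minimal illustration, assuming the response is a JSON object with a `results` array whose entries carry the requested columns, and that the query parameters accept a `limit` field. The helper names are hypothetical, and no Snowflake connection is made here.

```python
import json
from typing import Any, Dict, List

# Columns requested from the service, matching the query above.
COLUMNS = ["chunk", "document_title", "source_url"]


def build_search_payload(query: str, limit: int = 3) -> str:
    """Serialize the query parameters passed as the second argument."""
    return json.dumps({"query": query, "columns": COLUMNS, "limit": limit})


def parse_search_response(raw: str) -> List[Dict[str, Any]]:
    """Return the list of result rows, or an empty list if none are present."""
    return json.loads(raw).get("results", [])


if __name__ == "__main__":
    print(build_search_payload("What is a table in Snowflake?"))
    # A hand-written sample response, used only to exercise the parser.
    sample = '{"results": [{"chunk": "...", "document_title": "Tables", "source_url": "https://example.com/tables"}]}'
    for row in parse_search_response(sample):
        print(row["document_title"], row["source_url"])
```

Building the payload with `json.dumps` rather than string concatenation avoids quoting problems when the user's question contains quotes or other special characters.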
Let's now build the chatbot using Streamlit in Snowflake.
USE ROLE accountadmin;
USE WAREHOUSE compute_wh;
CREATE OR REPLACE DATABASE chatbot_db;
CREATE OR REPLACE SCHEMA chatbot_schema;
The application should run and be ready for you to enter a question. Try it out by asking the following questions:
On the left, you can configure more options for the app, like:
Congratulations! You've built a RAG chatbot that uses a Cortex Knowledge Extension.
You built a RAG chatbot that allows you to chat with the Snowflake documentation in natural language. You did this by using a Cortex Knowledge Extension, which contains the knowledge needed to answer questions about the Snowflake docs. You also used Snowflake Cortex AI's COMPLETE function and Streamlit in Snowflake.
For more resources, check out the following:
Additional Cortex Knowledge Extensions