This demo showcases how to transform Google Drive business documents into actionable strategic intelligence using Snowflake's unstructured data processing capabilities.

Festival Operations Demo Dataset

You'll work with a realistic Festival Operations business document collection that includes:

By completing this guide, you will be able to build an end-to-end unstructured data pipeline that ingests documents from Google Drive, processes them through Openflow, and enables intelligent search and analysis using Snowflake Intelligence.

Here is a summary of what you will be able to learn in each step by following this quickstart:

What is Openflow

Openflow is Snowflake's managed service for building and running data pipelines in Snowpark Container Services (SPCS). It provides pre-built connectors and processing capabilities that make it easy to ingest, transform, and analyze data from various sources including unstructured documents.

Openflow SPCS Overview

Key Benefits:

Learn more about Openflow.

What is Snowflake Intelligence

Snowflake Intelligence is an integrated AI capability that enables natural language interactions with your data. It combines large language models with your business context to provide intelligent search, analysis, and insights.

Core Components:

This quickstart will focus on:

What You Will Learn

What You Will Build

Prerequisites

Before starting, ensure you have:

Clone the QuickStart Repository

First, clone the repository to get access to sample documents and SQL scripts:

git clone https://github.com/Snowflake-Labs/sfguide-getting-started-openflow-unstructured-data-pipeline.git
cd sfguide-getting-started-openflow-unstructured-data-pipeline

Repository Contents:

Alternative: Use Snowflake Workspaces with Git Integration

For executing SQL scripts directly in Snowsight, you can import this repository into Snowflake Workspaces:

  1. Sign in to Snowsight and navigate to ProjectsWorkspaces
  2. Select From Git repository
  3. Paste the repository URL: https://github.com/Snowflake-Labs/sfguide-getting-started-openflow-unstructured-data-pipeline
  4. Configure your API integration and credentials
  5. Execute SQL scripts directly in Snowsight without leaving your browser

Learn more about integrating Workspaces with Git.

Create Database and Schema

Log into Snowsight using your credentials to create the necessary database objects.

Open Snowflake Workspaces and run the following SQL commands to create the warehouse, database, schema, and role.

-- Create role and warehouse
USE ROLE ACCOUNTADMIN;
CREATE ROLE IF NOT EXISTS FESTIVAL_DEMO_ROLE;
CREATE WAREHOUSE IF NOT EXISTS FESTIVAL_DEMO_S
  WAREHOUSE_SIZE = SMALL
  AUTO_SUSPEND = 300
  AUTO_RESUME = TRUE;
GRANT USAGE ON WAREHOUSE FESTIVAL_DEMO_S TO ROLE FESTIVAL_DEMO_ROLE;

-- Create database and grant ownership
CREATE DATABASE IF NOT EXISTS OPENFLOW_FESTIVAL_DEMO;
GRANT OWNERSHIP ON DATABASE OPENFLOW_FESTIVAL_DEMO TO ROLE FESTIVAL_DEMO_ROLE;

-- Grant role to current user
SET CURR_USER=(SELECT CURRENT_USER());
GRANT ROLE FESTIVAL_DEMO_ROLE TO ROLE IDENTIFIER($CURR_USER);

-- Switch to demo role and create schema
USE ROLE FESTIVAL_DEMO_ROLE;
USE DATABASE OPENFLOW_FESTIVAL_DEMO;
CREATE SCHEMA IF NOT EXISTS FESTIVAL_OPS;

Enable Required Services (Optional)

Cortex Search and Snowflake Intelligence are available by default in most regions.

For accounts in other regions, you may need to enable cross-region Cortex access:

-- Check current Cortex cross-region setting (requires ORGADMIN role)
SHOW PARAMETERS LIKE 'CORTEX_ENABLED_CROSS_REGION' IN ACCOUNT;

-- Enable cross-region Cortex access if needed (requires ORGADMIN role)
-- This allows your account to use Cortex services from us-west-2
ALTER ACCOUNT SET CORTEX_ENABLED_CROSS_REGION = 'AWS_US';

-- Verify the setting was applied
SHOW PARAMETERS LIKE 'CORTEX_ENABLED_CROSS_REGION' IN ACCOUNT;

Setup External Access Integration

Configure external access for Google APIs to allow Openflow to connect to Google Drive.

First, create a schema for network configuration (or use an existing one):

-- Create schema for network rules
USE ROLE ACCOUNTADMIN;
USE DATABASE OPENFLOW_FESTIVAL_DEMO;
CREATE SCHEMA IF NOT EXISTS NETWORKS;

Now create the network rules and external access integration:

-- Create network rule for Google APIs
CREATE OR REPLACE NETWORK RULE google_network_rule
  MODE = EGRESS
  TYPE = HOST_PORT
  VALUE_LIST = (
    'admin.googleapis.com',
    'oauth2.googleapis.com',
    'www.googleapis.com',
    'google.com'
  );

-- Verify the network rule was created successfully
DESC NETWORK RULE google_network_rule;

Optional: Add network rule for your Google Workspace domain

If you need to access resources from your specific Google Workspace domain, create an additional network rule:

-- Create network rule for your Google Workspace domain
-- Replace 'your-domain.com' with your actual Google Workspace domain
CREATE OR REPLACE NETWORK RULE your_workspace_domain_network_rule
  MODE = EGRESS
  TYPE = HOST_PORT
  VALUE_LIST = ('your-domain.com');

-- Example: For a domain like kamesh.dev
-- CREATE OR REPLACE NETWORK RULE kameshs_dev_network_rule
--   MODE = EGRESS
--   TYPE = HOST_PORT
--   VALUE_LIST = ('kameshs.dev');

-- Verify the domain network rule was created successfully
DESC NETWORK RULE your_workspace_domain_network_rule;

Create the External Access Integration

Now combine the network rules into an external access integration:

-- Create external access integration with Google API access
-- If you created a workspace domain rule, add it to ALLOWED_NETWORK_RULES
CREATE OR REPLACE EXTERNAL ACCESS INTEGRATION festival_ops_access_integration
  ALLOWED_NETWORK_RULES = (
    OPENFLOW_FESTIVAL_DEMO.NETWORKS.google_network_rule
    -- Add your workspace domain rule if created:
    -- , OPENFLOW_FESTIVAL_DEMO.NETWORKS.your_workspace_domain_network_rule
  )
  ENABLED = TRUE
  COMMENT = 'Used for Openflow SPCS runtime to access Google Drive';

-- Verify the external access integration was created with correct settings
DESC EXTERNAL ACCESS INTEGRATION festival_ops_access_integration;

Grant Permissions

Grant necessary permissions to the Openflow admin role:

-- Grant access to the external access integration
GRANT USAGE ON DATABASE OPENFLOW_FESTIVAL_DEMO TO ROLE OPENFLOW_ADMIN;
GRANT USAGE ON SCHEMA OPENFLOW_FESTIVAL_DEMO.NETWORKS TO ROLE OPENFLOW_ADMIN;
GRANT USAGE ON INTEGRATION festival_ops_access_integration TO ROLE OPENFLOW_ADMIN;

-- Verify grants
SHOW GRANTS TO ROLE OPENFLOW_ADMIN;

This section guides you through setting up Openflow SPCS infrastructure and creating a runtime for the Festival Operations document pipeline.

Complete Openflow SPCS Setup

Follow the comprehensive Getting Started with Openflow SPCS quickstart to set up your infrastructure:

Quickstart Guide: Getting Started with Openflow SPCS

This 25-minute setup includes:

Step

Task

Duration

What You'll Create

1

Setup Core Snowflake

10 min

OPENFLOW_ADMIN role, network rules, BCR bundles

2

Create Deployment

5 min

Openflow SPCS deployment with optional event logging

3

Create Runtime Role

5 min

Runtime role with external access integrations

4

Create Runtime

5 min

Active runtime environment ready for connectors

Key Components You'll Set Up:

After Setup: Once you complete the quickstart, you'll have a production-ready Openflow environment. You can then proceed with adding the Google Drive connector for this Festival Operations demo.

Alternative: Use Existing Openflow Infrastructure

If you already have Openflow SPCS set up in your account, you can reuse your existing infrastructure. However, for this quickstart, we recommend using FESTIVAL_DEMO_ROLE to keep naming consistent:

  1. Use existing deployment and runtime - No need to create new ones
  2. Grant FESTIVAL_DEMO_ROLE access - Ensure it has access to festival_ops_access_integration (created in the previous "Setup Environment" section)
  3. Grant integration access - Add the external access integration to FESTIVAL_DEMO_ROLE:
    USE ROLE ACCOUNTADMIN;
    GRANT USAGE ON INTEGRATION festival_ops_access_integration TO ROLE FESTIVAL_DEMO_ROLE;
    

Access Openflow Interface

After completing the Openflow SPCS setup, access the Openflow interface to configure your runtime:

  1. In Snowsight, navigate to Work with data in the left sidebar
  2. Select Ingestion
  3. Click on Openflow (this will open Openflow in a new browser tab)Launch Openflow from Snowsight
  4. Accept the authentication prompt when Openflow opens in the new tab

Configure Runtime for Festival Operations

Now that you have Openflow open, configure your runtime for the Festival Operations pipeline:

Option A: Create New Runtime (Recommended for Demo)

Create a dedicated runtime for this demo:

  1. In the Openflow interface, click on the Runtimes tab at the top
  2. Click Create Runtime button
  3. Configure with these settings:
    • Runtime Name: FESTIVAL_DOC_INTELLIGENCE
    • Runtime Role: FESTIVAL_DEMO_ROLE (created in Setup Environment)
    • External Access Integration: festival_ops_access_integration (created in previous step)
    • Database: OPENFLOW_FESTIVAL_DEMO
    • Schema: FESTIVAL_OPS
    • Warehouse: FESTIVAL_DEMO_S

After creating the runtime, your Openflow interface will look like this:

Openflow SPCS Overview

Option B: Use Existing Runtime

If you already have a runtime from the Openflow SPCS quickstart (e.g., QUICKSTART_RUNTIME):

  1. Grant access to the Festival Operations integration:
    USE ROLE ACCOUNTADMIN;
    GRANT USAGE ON INTEGRATION festival_ops_access_integration TO ROLE YOUR_RUNTIME_ROLE;
    
  2. The runtime will automatically have access to the integration for Google Drive connectivity

Once the runtime is active, add the Google Drive connector (Overview tab in Openflow Home page):

  1. Click Add Connector in your runtime
  2. Search for "Google Drive" in the connector marketplace
  3. Select the Google Drive connectorOpenflow Connectors List
  4. Click Add to RuntimeAdd Connector to Runtime

The connector will be automatically added to your canvas:

Google Drive Connector in Canvas

Prepare Google Drive Location

Before configuring the connector, set up your Google Drive location:

  1. Create Shared Drive: In Google Drive web interface, create a new shared drive. This will be used as your Google Drive ID in the connector configuration.
  2. Create Festival Operations Folder: Inside the shared drive, create a folder named "Festival Operations". This will be used as your Google Folder Name in the connector configuration.

Now configure the Google Drive connector with the following parameters:

Configure Source Parameters

Google Drive Source Parameters

Configure Destination Parameters

Google Drive Destination Parameters

Configure Ingestion Parameters

Navigate to Parameter Contexts from Runtime Canvas:

Navigate to Parameter Contexts

Turn Off Parameter Inheritance (for clarity):

Click the checkbox to disable inherited parameters and show only ingestion-specific settings:

Turn Off Inherited Parameters

Configure the Ingestion Parameters:

Now configure only the ingestion-specific parameters:

Google Drive Ingestion Parameters

Test and Run the Pipeline

After configuring all parameters, you need to enable and start the pipeline by right-clicking on the canvas:

  1. Enable Controller Services: Right-click on the canvas and select Enable all controller servicesEnable All Controller Services
  2. Start the Connector: Right-click on the canvas and select Start to begin the connector and all processorsStart Pipeline

Once started, you should see the connector running with active processors:

Connector Started

What Happens Next

The pipeline will automatically:

Before running the pipeline, you need to prepare the Festival Operations sample documents in your Google Drive.

Verify Document Collection

The repository includes 15 business documents across multiple formats in the sample-data/google-drive-docs/ directory:

sample-data/google-drive-docs/
├── Analysis/
│   └── Post-Event-Analysis-Summer-2024.pptx
├── Compliance/
│   └── Health-Safety-Policy.pdf
├── Executive Meetings/
│   └── Board-Meeting-Minutes-Q4-2024.docx
├── Financial Reports/
│   └── Q3-2024-Financial-Analysis.pdf
├── Operations/
│   ├── Venue-Setup-Operations-Manual-0.jpg
│   ├── Venue-Setup-Operations-Manual-1.jpg
│   ├── Venue-Setup-Operations-Manual-2.jpg
│   └── Venue-Setup-Operations-Manual-3.jpg
├── Projects/
│   └── Sound-System-Modernization-Project-Charter.docx
├── Strategic Planning/
│   ├── 2025-Festival-Expansion-Strategy-0.jpg
│   ├── 2025-Festival-Expansion-Strategy-1.jpg
│   ├── 2025-Festival-Expansion-Strategy-2.jpg
│   ├── 2025-Festival-Expansion-Strategy-3.jpg
│   └── 2025-Festival-Expansion-Strategy-4.jpg
├── Training/
│   └── Customer-Service-Training-Guide.pptx
└── Vendors/
    └── Audio-Equipment-Service-Agreement.pdf

Document Formats: PDF, DOCX, PPTX, JPG - demonstrating true multi-format document intelligence

Google Drive Setup

Complete the document preparation in your Google Drive:

  1. Create Folder Structure: Inside the "Festival Operations" folder you created earlier, create these subfolders matching the sample-data/google-drive-docs/ structure:
    • Strategic Planning/
    • Operations/
    • Compliance/
    • Training/
    • Analysis/
    • Executive Meetings/
    • Financial Reports/
    • Projects/
    • Vendors/
  2. Upload Documents: Drag and drop files from your local sample-data/google-drive-docs/ directory into the corresponding folders

Verify Document Upload

After uploading, verify your Google Drive "Festival Operations" folder contains all 15 documents across multiple formats:

Folder

Document

Format

Strategic Planning

2025-Festival-Expansion-Strategy (5 images)

JPG

Operations

Venue-Setup-Operations-Manual (4 images)

JPG

Projects

Sound-System-Modernization-Project-Charter

DOCX

Financial Reports

Q3-2024-Financial-Analysis

PDF

Compliance

Health-Safety-Policy

PDF

Vendors

Audio-Equipment-Service-Agreement

PDF

Analysis

Post-Event-Analysis-Summer-2024

PPTX

Training

Customer-Service-Training-Guide

PPTX

Format Summary:

Monitor Pipeline Execution

Once the Google Drive Connector starts, you can monitor the pipeline execution directly from the canvas. The processor group displays real-time statistics:

Monitor Pipeline

The animation demonstrates:

  1. Hover over the processor group to see a detailed statistics tooltip with real-time metrics
  2. Click on the "Google Drive (Cortex Connect)" processor group to drill down into the pipeline components
  3. View individual processors including:
    • Google Drive source connector pulling documents
    • Document transformation processors
    • Snowflake destination processors writing to tables
    • Controller services managing the flow
  4. Navigate back to the canvas to see the overall pipeline health
  5. Monitor byte throughput for each stage (In, Read/Write, Out) shown in the statistics panel

Verify Document Ingestion

Check that documents have been successfully ingested using the verification queries:

-- Switch to the correct role and database
USE ROLE FESTIVAL_DEMO_ROLE;
USE WAREHOUSE FESTIVAL_DEMO_S;
USE DATABASE OPENFLOW_FESTIVAL_DEMO;
USE SCHEMA FESTIVAL_OPS;

-- Show all tables created by Openflow connector
SHOW TABLES;

-- Show all stages created by Openflow connector
SHOW STAGES;

Check Auto-Created Tables

The Openflow connector automatically creates several tables for document management:

-- Describe the auto-created tables
DESC TABLE docs_chunks;      -- Document content chunks
DESC TABLE docs_groups;      -- Document groupings
DESC TABLE docs_perms;       -- Document permissions
DESC TABLE doc_group_perms;  -- Group permissions
DESC TABLE file_hashes;      -- File tracking and metadata
DESC TABLE perms_groups;     -- Permission groups

-- View file tracking information
SELECT * FROM file_hashes;

Explore Document Content

Query the document chunks to see ingested content:

-- View document chunks
SELECT * FROM docs_chunks LIMIT 10;

-- Get distinct document IDs and filenames
SELECT DISTINCT 
    METADATA:id::string as id, 
    METADATA:fullName::string as filename 
FROM docs_chunks;

-- Check specific document categories
SELECT COUNT(DOC_ID) 
FROM file_hashes 
WHERE LOWER(DOC_ID) LIKE '%strategy%';

Comprehensive Document Verification

Verify all documents are ingested across demo categories:

-- Comprehensive document verification by demo category
SELECT 
    COUNT(*) as total_docs,

    -- Strategic Planning Documents (Expected: 7)
    COUNT(CASE WHEN 
        LOWER(DOC_ID) LIKE '%strategy%' OR 
        LOWER(DOC_ID) LIKE '%board%meeting%' OR 
        LOWER(DOC_ID) LIKE '%meeting%minutes%' OR
        LOWER(DOC_ID) LIKE '%financial%analysis%' OR
        LOWER(DOC_ID) LIKE '%q3%2024%financial%'
    THEN 1 END) as strategic_docs,

    -- Operations Excellence Documents (Expected: 5)
    COUNT(CASE WHEN 
        (LOWER(DOC_ID) LIKE '%operation%manual%' OR LOWER(DOC_ID) LIKE '%venue%setup%') OR
        (LOWER(DOC_ID) LIKE '%sound%system%' AND LOWER(DOC_ID) LIKE '%project%') OR
        (LOWER(DOC_ID) LIKE '%post%event%analysis%')
    THEN 1 END) as operations_docs,

    -- Compliance & Risk Documents (Expected: 3)
    COUNT(CASE WHEN 
        (LOWER(DOC_ID) LIKE '%health%safety%' OR LOWER(DOC_ID) LIKE '%safety%policy%') OR
        (LOWER(DOC_ID) LIKE '%service%agreement%' OR LOWER(DOC_ID) LIKE '%audio%equipment%') OR
        (LOWER(DOC_ID) LIKE '%post%event%analysis%')
    THEN 1 END) as compliance_docs,

    -- Knowledge Management Documents (Expected: 1)
    COUNT(CASE WHEN 
        LOWER(DOC_ID) LIKE '%training%guide%' OR 
        LOWER(DOC_ID) LIKE '%customer%service%training%'
    THEN 1 END) as training_docs,

    -- Document format breakdown
    COUNT(CASE WHEN LOWER(DOC_ID) LIKE '%.jpg' THEN 1 END) as jpg_files,
    COUNT(CASE WHEN LOWER(DOC_ID) LIKE '%.pdf' THEN 1 END) as pdf_files,
    COUNT(CASE WHEN LOWER(DOC_ID) LIKE '%.docx' THEN 1 END) as docx_files,
    COUNT(CASE WHEN LOWER(DOC_ID) LIKE '%.pptx' THEN 1 END) as pptx_files

FROM file_hashes;

Expected Results:

Metric

Count

Total Documents

15

Strategic Planning Documents

7

Operations Excellence Documents

5

Compliance & Risk Documents

3

Training Documents

1

JPG Files

9

PDF Files

3

DOCX Files

1

PPTX Files

2

Check Document Stage

Verify the documents stage created by the connector:

-- List files in the documents stage
LS @documents;

Expected Document Collection

The pipeline should have ingested the Festival Operations business document collection:

Document Categories:

Document Formats:

Total: 15 business documents demonstrating multi-format document intelligence

Automatic Cortex Search Service Creation

Great news! The Cortex Search service is automatically created by the Openflow Google Drive connector. No manual SQL required!

How It Works

Openflow Data Verification Flow

Automatic Features:

Verify Automatic Service Creation

-- Check for the auto-created service
SHOW CORTEX SEARCH SERVICES;

-- The service will be named: CORTEX_SEARCH_SERVICE (default name)
DESC CORTEX SEARCH SERVICE CORTEX_SEARCH_SERVICE;

Test Semantic Search

Test the automatically created search service with Festival Operations queries:

-- Search for strategic planning documents
SELECT PARSE_JSON(
  SNOWFLAKE.CORTEX.SEARCH_PREVIEW(
      'CORTEX_SEARCH_SERVICE',
      '{"query": "2025 expansion plans target markets strategic planning", "limit": 5}'
  )
)['results'] as strategic_documents;

-- Search for technology modernization projects  
SELECT PARSE_JSON(
  SNOWFLAKE.CORTEX.SEARCH_PREVIEW(
      'CORTEX_SEARCH_SERVICE',
      '{"query": "technology modernization sound system upgrade budget 2.8M", "limit": 5}'
  )
)['results'] as technology_projects;

-- Search for health and safety policies
SELECT PARSE_JSON(
  SNOWFLAKE.CORTEX.SEARCH_PREVIEW(
      'CORTEX_SEARCH_SERVICE',
      '{"query": "health safety policies emergency protocols compliance", "limit": 5}'
  )
)['results'] as safety_policies;

Additional Search Capabilities

The auto-created service includes both content and metadata search capabilities. You can search across:

Sample Questions by Category

Based on the Festival Operations dataset, here are sample questions organized by business function:

Strategic Planning & Executive Intelligence

"What are our 2025 expansion plans and target markets?"
"Show me all financial analysis and revenue projections"
"What decisions were made in the latest board meeting?"
"Find all budget allocations and investment strategies"

Operations Excellence & Technology

"Find all technology modernization projects and their budgets"
"What is our $2.8M sound system upgrade timeline?"
"Show me all equipment management protocols"
"What post-event analysis recommendations exist?"

Compliance & Risk Management

"What health and safety policies are currently in effect?"
"Show me all vendor contracts and service agreements"
"Find emergency response procedures"
"What regulatory compliance requirements exist?"

Knowledge Management & Training

"Find all training materials and staff development programs"
"What customer service standards are documented?"
"Show me onboarding procedures for new staff"
"What training frameworks are currently in use?"

Advanced Cross-Category Queries

"What are our 2025 expansion plans across all document formats - show me visual charts, meeting decisions, and financial projections"
"Find all technology modernization projects with their business cases, budgets, and visual diagrams"
"What health and safety policies are in effect across all formats - show me formal policies, vendor agreements, and visual guides"

Snowflake Intelligence enables you to create AI agents that can query and analyze your unstructured data using natural language. This section shows how to connect Snowflake Intelligence to the Cortex Search service created by your Openflow pipeline.

Prerequisites

Before setting up Snowflake Intelligence, ensure you have:

Initial Setup

Create the required database and schema structure:

-- Use ACCOUNTADMIN role for setup
USE ROLE ACCOUNTADMIN;

-- Create database for Snowflake Intelligence
CREATE DATABASE IF NOT EXISTS snowflake_intelligence;
GRANT USAGE ON DATABASE snowflake_intelligence TO ROLE PUBLIC;

-- Create agents schema
CREATE SCHEMA IF NOT EXISTS snowflake_intelligence.agents;
GRANT USAGE ON SCHEMA snowflake_intelligence.agents TO ROLE PUBLIC;

-- Grant agent creation privileges to your role
GRANT CREATE AGENT ON SCHEMA snowflake_intelligence.agents TO ROLE FESTIVAL_DEMO_ROLE;

Create the Agent

Access Agent Creation Interface

  1. Sign in to Snowsight
  2. Switch to FESTIVAL_DEMO_ROLE using the role selector in the top-right corner
  3. Navigate directly to Agents: Create Snowflake Intelligence Agent
  4. Select "Create agent"

Agent Creation Interface

Platform Integration:

Agent Details:

Configure Agent Basics

After creating the agent, you need to configure its details:

  1. Click on the agent name (FESTIVAL_DOC_INTELLIGENCE) in the agent list to open itAgents List
  2. Click "Edit" button to start editing the agent configuration and detailsAgent Edit Button

Now configure the agent basics in the "About" section:

Agent About Configuration

Example Questions (Add these to help users get started):

"What are our 2025 expansion plans and target markets?"
"Find all technology modernization projects and their budgets"
"What health and safety policies are currently in effect?"
"Find all training materials and staff development programs"
"Which documents have the most collaboration and strategic importance?"

Configure Agent Tools

Add Cortex Search Service

Agent Tools Configuration

  1. Navigate to "Tools" tab
  2. Find "Cortex Search Services" section
  3. Click "+ Add" button

Configure the Search Service:

Configure Orchestration

Agent Orchestration Configuration

  1. Navigate to "Orchestration" tab
  2. Set Model: auto (recommended - lets Snowflake choose the optimal model)

Orchestration Instructions:

Whenever you can answer visually with a chart, always choose to generate a chart even if the user didn't specify to. Respond in the same language as the question wherever possible.

Response Instructions: (Optional)

Always provide specific document references when citing information. 
Focus on actionable insights and business value in your responses.

Set Access Controls

Agent Access Configuration

  1. Navigate to "Access" tab
  2. Click "Add role"
  3. Select appropriate roles for your organization

Example Role Configuration:

Test Your Agent

Getting Started with Queries

  1. Access Snowflake Intelligence: Open Snowflake Intelligence
  2. Select your agent FESTIVAL_DOC_INTELLIGENCE from the dropdownChoose Agent
  3. Choose the Cortex Search service as your data source

Start with the Example Questions you configured - these are specifically tailored to your festival operations data.

Agent Queries

Sample Questions by Category

Strategic Planning:

"What are our 2025 expansion plans and target markets?"
"What strategic initiatives are mentioned in board meeting minutes?"

Operations Excellence:

"Find all technology modernization projects and their budgets"
"What are the key takeaways from the post-event analysis?"

Compliance & Risk:

"What health and safety policies are currently in effect?"
"What are the terms and conditions in our vendor service agreements?"

Knowledge Management:

"Find all training materials and staff development programs"
"What customer service training resources are available?"

Multi-Document Analysis

Use the agent for complex analysis across multiple documents:

"Compare our Q3 2024 financial performance with the strategic goals outlined in our 2025 expansion plan. What gaps exist and what actions are recommended?"

This type of query demonstrates the agent's ability to:

Trend Analysis

Identify patterns and trends across time-based documents:

"What trends do you see in customer complaints and incident reports over the past year? What preventive measures have been implemented?"

Knowledge Discovery

Find hidden insights and connections:

"What vendor performance issues are mentioned across different documents, and how do they relate to our operational challenges?"

Executive Briefing

Generate comprehensive briefings:

"Prepare an executive summary of key issues and decisions from our Q4 2024 board meeting, including action items and their current status based on other documents."

Compliance Monitoring

Automated compliance checking:

"Review all our safety policies and incident reports to identify any compliance gaps or policy updates needed."

Create Custom Views with AI Summarization

Create enhanced views using AI_COMPLETE for intelligent document summaries. These views can power custom Cortex Search services tailored to specific use cases.

Create AI-Enhanced Document View

-- View with AI-generated document summaries
-- Uses AI_COMPLETE to generate concise, intelligent summaries
CREATE OR REPLACE VIEW document_summaries AS
SELECT 
    DOC_ID,
    METADATA:fullName::string as full_name,
    METADATA:webUrl::string as web_url,
    METADATA:lastModifiedDateTime::timestamp as last_modified_date_time,
    chunk as original_chunk,
    -- AI-powered summary generation
    SNOWFLAKE.CORTEX.AI_COMPLETE(
        'mistral-large2',
        CONCAT(
            'Summarize this document content in 2-3 sentences, focusing on key information: ',
            chunk
        )
    ) as ai_summary,
    user_emails,
    user_ids
FROM docs_chunks;

Create Custom Cortex Search Service

Once you have the custom view, create a dedicated Cortex Search service:

-- Create a custom Cortex Search service using the AI-enhanced view
CREATE OR REPLACE CORTEX SEARCH SERVICE festival_ai_summaries_search
ON ai_summary
ATTRIBUTES full_name, web_url, last_modified_date_time, user_emails
WAREHOUSE = FESTIVAL_DEMO_S
TARGET_LAG = '1 day'
AS (
    SELECT 
        DOC_ID,
        ai_summary,
        full_name,
        web_url,
        last_modified_date_time,
        user_emails
    FROM document_summaries
);

-- Verify the new service
SHOW CORTEX SEARCH SERVICES;
DESC CORTEX SEARCH SERVICE festival_ai_summaries_search;

Test the AI-Enhanced Search Service

Query the custom search service for more focused, summarized results:

-- Search using AI-generated summaries
SELECT PARSE_JSON(
  SNOWFLAKE.CORTEX.SEARCH_PREVIEW(
      'festival_ai_summaries_search',
      '{"query": "strategic planning expansion markets", "limit": 5}'
  )
)['results'] as summarized_strategic_docs;

-- Compare with original search service
SELECT PARSE_JSON(
  SNOWFLAKE.CORTEX.SEARCH_PREVIEW(
      'CORTEX_SEARCH_SERVICE',
      '{"query": "strategic planning expansion markets", "limit": 5}'
  )
)['results'] as original_strategic_docs;

Benefits of AI-Enhanced Views:

Building Custom Documents with Taskfile

The repository includes a Taskfile.yml that automates document format conversion, allowing you to create additional documents for testing.

Prerequisites

Install Task and Pandoc:

# macOS
brew install go-task pandoc

# Ubuntu/Debian
sudo snap install task --classic
sudo apt-get install pandoc

# Windows (via Chocolatey)
choco install go-task pandoc

Available Document Conversion Tasks

Convert to PDF (Formal documents):

task convert-to-pdf

Converts markdown files to PDF format for policies, financial reports, and vendor agreements.

Convert to PPTX (Presentations):

task convert-to-pptx

Creates PowerPoint presentations from markdown for training materials and analysis reports.

Convert to DOCX (Word documents):

task convert-to-docx

Generates Word documents from markdown for meeting minutes and project charters.

Convert to JPG (Images):

task convert-to-jpg

Exports presentations to image format for strategic planning documents and operational guides.

Convert All Formats:

task convert-all-docs

Runs all conversion tasks to generate documents in all supported formats (PDF, DOCX, PPTX, JPG).

Creating Custom Documents

  1. Add your markdown file to the appropriate category in sample-data/google-drive-docs/
  2. Edit the Taskfile.yml to include your new document in the conversion tasks
  3. Run the conversion using the appropriate task
  4. Upload to Google Drive and let Openflow process the new documents

When you're finished with the demo, follow these steps to clean up resources.

Stop the Google Drive Connector

  1. Navigate to your Openflow runtime canvas
  2. Right-click on the Google Drive (Cortex Connect) processor group
  3. Select Stop to halt document ingestion
  4. Wait for the connector to fully stop (status indicator turns red)

Drop the Snowflake Intelligence Agent (Optional)

If you created an agent and no longer need it:

-- Switch to the Snowflake Intelligence database
USE DATABASE snowflake_intelligence;
USE SCHEMA agents;

-- Drop the agent
DROP AGENT IF EXISTS FESTIVAL_DOC_INTELLIGENCE;

Drop the Demo Database (Optional)

To completely remove all data and resources:

-- Switch to ACCOUNTADMIN role
USE ROLE ACCOUNTADMIN;

-- Drop the entire demo database (includes all tables, stages, and search services)
DROP DATABASE IF EXISTS OPENFLOW_FESTIVAL_DEMO;

-- Drop the demo warehouse
DROP WAREHOUSE IF EXISTS FESTIVAL_DEMO_S;

-- Drop the demo role
DROP ROLE IF EXISTS FESTIVAL_DEMO_ROLE;

Congratulations! You've successfully built an end-to-end unstructured data pipeline using Openflow and Snowflake Intelligence. You can now:

What You Accomplished

Key Benefits Realized

Next Steps

  1. Expand Document Sources: Add connectors for SharePoint, Slack, or other business systems
  2. Enhance Search: Create specialized search services for different document categories
  3. Advanced Analytics: Build dashboards and reports based on document insights
  4. Integration: Connect with existing business intelligence and workflow tools
  5. Governance: Implement document retention policies and access controls

Related Resources

Quickstarts:

Documentation:

Community and Support

We would love your feedback on this QuickStart Guide! Please submit your feedback using the GitHub issues link at the top of this guide.