For many large enterprises, extracting tangible value from Machine Learning (ML) initiatives remains a significant challenge. Despite substantial investments, the journey from developing models to realizing their benefits often encounters numerous roadblocks. Models can take excessive time and effort to reach production, or worse, they may never get there at all. Even when they do make it to production, the real-life outcomes can fall short of expectations. MLOps is a core function of ML engineering that focuses on streamlining the process of taking machine learning models to production, and then maintaining and monitoring them effectively. Unlike traditional software, machine learning models can change their behavior over time due to various factors, including input drift, outdated assumptions from model training, issues in data pipelines, and standard operational challenges such as changes to hardware/software environments and traffic patterns. These factors can lead to a decline in model performance and unexpected behavior, which need to be monitored closely.
Snowflake ML provides organizations with an integrated set of capabilities for end-to-end machine learning in a single platform on top of governed data. Snowpark ML provides the capability to create and work with machine learning models in Python. It includes the Snowflake ML Modeling API, which uses familiar Python frameworks for building and training your own models; the Snowflake Feature Store, which lets data scientists and ML engineers create, maintain, and use ML features in data science and ML workloads; the Snowflake Model Registry, which lets you securely manage models and their metadata in Snowflake; and Model Explainability and Model Observability.
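As a rough illustration of how the Modeling API and Model Registry fit together in Python, the sketch below trains a classifier on a Snowpark DataFrame and logs it to the registry; the training DataFrame and feature column list are hypothetical, while the model and version names match the ones used later in this guide.

from snowflake.ml.modeling.xgboost import XGBClassifier
from snowflake.ml.registry import Registry

# Train with the Snowflake ML Modeling API directly on a Snowpark DataFrame
clf = XGBClassifier(
    input_cols=feature_cols,             # hypothetical list of feature column names
    label_cols=["EXITED"],
    output_cols=["PREDICTED_CHURN"],
)
clf.fit(train_df)                        # train_df: hypothetical Snowpark DataFrame

# Log the fitted model and its metadata to the Snowflake Model Registry
registry = Registry(session=session)
registry.log_model(
    clf,
    model_name="QS_CustomerChurn_classifier",
    version_name="demo",
)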
In this Quickstart guide we will explore Model Observability in Snowflake, which enables you to detect changes in model behavior over time due to input drift, stale training assumptions, and data pipeline issues, as well as the usual factors, including changes to the underlying hardware and software and the fluid nature of traffic. ML Observability allows you to track the quality of production models that have been deployed via the Snowflake Model Registry across multiple dimensions, such as performance, drift, and volume. For demonstration purposes, let's consider a multinational financial services firm aiming to understand and mitigate customer churn. Losing customers not only impacts revenue but also incurs additional costs to acquire new ones. Therefore, having an accurate model to predict churn, and monitoring it regularly, is crucial. The financial firm leverages Snowflake ML for its end-to-end machine learning pipeline and achieves continuous monitoring for data quality, model quality, bias drift, and feature attribution drift with ML Observability.
This section will walk you through creating the various objects needed for this guide.
We will leverage Snowflake Notebooks to carry out the data loading, feature engineering, model training, model inference, and model monitoring setup.
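For orientation, here is a minimal sketch of what the feature engineering step might look like with the Snowflake Feature Store API; the feature store, entity, and feature view names, as well as the feature DataFrame, are illustrative rather than the exact code from the notebook.

from snowflake.ml.feature_store import FeatureStore, Entity, FeatureView, CreationMode

# Create (or connect to) a feature store in the demo database
fs = FeatureStore(
    session=session,
    database="CUSTOMER_DB",
    name="CHURN_FEATURES",               # hypothetical feature store schema name
    default_warehouse="ML_WH",
    creation_mode=CreationMode.CREATE_IF_NOT_EXIST,
)

# Register the customer entity and a feature view built from engineered features
customer = Entity(name="CUSTOMER", join_keys=["CUSTOMERID"])
fs.register_entity(customer)

fv = FeatureView(
    name="CUSTOMER_CHURN_FEATURES",      # hypothetical feature view name
    entities=[customer],
    feature_df=feature_df,               # feature_df: hypothetical Snowpark DataFrame of engineered features
    refresh_freq="1 day",
)
fs.register_feature_view(feature_view=fv, version="V1")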
Step 1. Navigate to Worksheets in Snowsight and create the database and warehouse needed for the notebook execution. The commands are given below:
USE ROLE SYSADMIN;

CREATE DATABASE customer_db;

CREATE OR REPLACE WAREHOUSE ml_wh WITH
  WAREHOUSE_TYPE = STANDARD
  WAREHOUSE_SIZE = MEDIUM
  AUTO_SUSPEND = 5
  AUTO_RESUME = TRUE;
Step 2. Clone the GitHub repository.
Open and download the following notebook from the cloned repository, then import it into Snowsight under the Projects -> Notebooks section. Note that snowflake-ml-python is a required package; remember to add it in the package selector.
Step 3. Notebook Walkthrough. You can choose to run the notebook cell by cell or use Run All. The notebook carries out the data loading, feature engineering, model training, and inference steps described earlier, and then sets up model monitoring.
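For the monitor to have data to track, the inference step needs to write predictions (and, once known, the actual labels) into the table the monitor will read. Below is a minimal sketch of this step, assuming the model version has already been logged to the registry; the staging table name is hypothetical and the output columns may need renaming to match the monitor definition that follows.

from snowflake.ml.registry import Registry

# Retrieve the logged model version from the Snowflake Model Registry
registry = Registry(session=session)
mv = registry.get_model("QS_CustomerChurn_classifier").version("demo")

# Score new customer records with the model's predict function
new_customers = session.table("CUSTOMERS_NEW")        # hypothetical staging table
scored = mv.run(new_customers, function_name="predict")

# Append the scored rows to the table the model monitor reads
scored.write.mode("append").save_as_table("CUSTOMER_CHURN")

With predictions accumulating in CUSTOMER_CHURN alongside the EXITED labels, the notebook creates a model monitor over that table: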
CREATE OR REPLACE MODEL MONITOR QS_CHURN_MODEL_MONITOR
WITH
MODEL=QS_CustomerChurn_classifier
VERSION=demo
FUNCTION=predict
SOURCE=CUSTOMER_CHURN
BASELINE=CUSTOMERS
TIMESTAMP_COLUMN=TRANSACTIONTIMESTAMP
PREDICTION_CLASS_COLUMNS=(PREDICTED_CHURN)
ACTUAL_CLASS_COLUMNS=(EXITED)
ID_COLUMNS=(CUSTOMERID)
WAREHOUSE=ML_WH
REFRESH_INTERVAL='1 min'
AGGREGATION_WINDOW='1 day';
In the statement above, QS_CHURN_MODEL_MONITOR is the name of the monitor to be created, which must be unique within the schema. The monitor is attached to the demo version of the QS_CustomerChurn_classifier model and its predict function, reads inference data from the CUSTOMER_CHURN source table, compares it against the CUSTOMERS baseline, uses TRANSACTIONTIMESTAMP to order records and CUSTOMERID to identify them, and refreshes every minute with a one-day aggregation window using the ML_WH warehouse.
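After creation, the monitor can be inspected or paused from the same notebook session. The commands below are standard model monitor operations; this sketch simply issues them through Snowpark.

# List model monitors in the current schema along with their state
session.sql("SHOW MODEL MONITORS").show()

# Suspend or resume the monitor's background refreshes as needed
session.sql("ALTER MODEL MONITOR QS_CHURN_MODEL_MONITOR SUSPEND").collect()
session.sql("ALTER MODEL MONITOR QS_CHURN_MODEL_MONITOR RESUME").collect()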
Let's view the monitoring dashboards in Snowsight.
In the Monitors section of the model's details page, you can find the list of model monitors, including the model versions they are linked to, their status, and their creation dates. The monitoring dashboards track metrics such as:
Prediction Count - count of non-null values in the prediction column.
Actual Count - count of non-null values in the label column.
Precision - ratio of true positive predictions to total predicted positives, indicating the accuracy of positive predictions.
Classification Accuracy - ratio of correctly predicted instances to total instances in the dataset.
Recall - ratio of true positive predictions to total actual positives, measuring the ability to capture all relevant instances.
F1 Score - harmonic mean of precision and recall, providing a balance between the two metrics.
Difference of Mean - compares the average value of a feature between two datasets (for example, the baseline and the current data), a simple indicator of drift.
Additional metrics can be tracked depending on the model type.
In this guide, we explored how financial firms can build end-to-end, production-ready customer churn prediction models using Snowflake ML. With Snowflake's new features supporting MLOps, organizations can now monitor, optimize, and manage models at scale in production. This approach ensures models are deployed in a controlled, reliable manner and continue to evolve, delivering sustained value over time. Ultimately, Snowflake empowers organizations to move confidently from development to production, unlocking the full potential of their machine learning models and driving impactful, large-scale results.