By Justin Delisi
In this blog, we’ll explain Cortex, how its features can be used with simple SQL, and how it can help you make better business decisions.
Cortex offers pre-built ML functions for tasks like forecasting and anomaly detection and access to industry-leading large language models (LLMs) for working with unstructured text data.
COMPLETE simply takes in a prompt (and the model that should answer it) from the user and outputs a response from the model, much like other LLM interfaces. However, it is all done within SQL commands. It can be used with models such as snowflake-arctic.
EXTRACT_ANSWER will answer a question based on a text document, which can be plain English or a string representation of JSON.
SELECT
  SNOWFLAKE.CORTEX.EXTRACT_ANSWER(
    blog_post,
    'Why does the author think Mace Windu is the best Star Wars character?')
FROM blogs LIMIT 5;
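For the COMPLETE function described above, a call might look like the sketch below. The prompts are made-up examples, and we are assuming the snowflake-arctic model is available in your account's region; the blogs table and blog_post column are the same ones used in the other examples.

```sql
-- Minimal sketch: pass a model name and a prompt, get the model's response back.
SELECT
  SNOWFLAKE.CORTEX.COMPLETE(
    'snowflake-arctic',
    'In one sentence, what is a data warehouse?'
  ) AS response;

-- The prompt can also be built from a column, one call per row.
SELECT
  SNOWFLAKE.CORTEX.COMPLETE(
    'snowflake-arctic',
    CONCAT('List the three main points of this blog post: ', blog_post)
  ) AS main_points
FROM blogs LIMIT 5;
```

Because each row triggers its own model call, keeping a LIMIT on exploratory queries like this is a sensible way to control cost.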
EMBED_TEXT_768 takes any unstructured text and creates an embedding vector from it. These vectors can then be compared with other vectors to measure similarity. Again, a simple SQL command is used to create the vector.
SUMMARIZE returns a summary of the given content.
SELECT
  SNOWFLAKE.CORTEX.SUMMARIZE(blog_post)
FROM blogs LIMIT 10;
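For EMBED_TEXT_768, described above, a sketch of creating and comparing vectors could look like the following. The embedding model name snowflake-arctic-embed-m and the search phrase are our assumptions; check which embedding models are enabled in your region before relying on it.

```sql
-- Create a 768-dimension vector for each blog post.
SELECT
  SNOWFLAKE.CORTEX.EMBED_TEXT_768('snowflake-arctic-embed-m', blog_post) AS blog_vector
FROM blogs LIMIT 10;

-- Rank posts by how similar they are to a search phrase.
SELECT
  VECTOR_COSINE_SIMILARITY(
    SNOWFLAKE.CORTEX.EMBED_TEXT_768('snowflake-arctic-embed-m', blog_post),
    SNOWFLAKE.CORTEX.EMBED_TEXT_768('snowflake-arctic-embed-m', 'Star Wars characters')
  ) AS similarity
FROM blogs
ORDER BY similarity DESC
LIMIT 5;
```

In practice you would likely store the vectors in a VECTOR column once rather than recomputing them in every query.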
SENTIMENT returns a floating-point number between -1 and 1 based on the text, giving -1 for the most negative text, around 0 for neutral, and 1 for the most positive.
SELECT
  SNOWFLAKE.CORTEX.SENTIMENT(blog_post)
FROM blogs LIMIT 10;
TRANSLATE will translate text from one supported language to another.
Prepare the data to have:
At least one target column that you want the model to make a prediction on
A timestamp column with a fixed frequency (daily, weekly, hourly, etc.)
Optionally, you can include other columns that may or may not have influenced the changes in the target column
This can be a table or a view that is passed as a reference for the model to use
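As an illustration of the preparation step, the sketch below builds a view with one timestamp column and one target column. The source table sales_history and its columns sale_date and amount are hypothetical; only the view and output column names match the training example in this post. The cast to TIMESTAMP_NTZ reflects our understanding of what the forecasting function expects for the timestamp column.

```sql
-- Hypothetical source table: sales_history(sale_date, amount).
-- Aggregating to one row per day gives the fixed daily frequency the model needs.
CREATE OR REPLACE VIEW phdata_view AS
SELECT
  TO_TIMESTAMP_NTZ(sale_date) AS daily_timestamp,
  SUM(amount) AS sales
FROM sales_history
GROUP BY 1;
```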
Train the model
CREATE SNOWFLAKE.ML.FORECAST phdata_model(
  INPUT_DATA => SYSTEM$REFERENCE('VIEW', 'phdata_view'),
  TIMESTAMP_COLNAME => 'daily_timestamp',
  TARGET_COLNAME => 'sales'
);
NOTE: As soon as that forecast is created, the model will be trained on the data provided, incurring compute costs.
Retraining the model can be achieved by recreating the forecast object and should be done at regular intervals to improve accuracy.
Run a forecast
Then, it’s as easy as calling the model’s FORECAST method to receive a prediction.
This call returns a prediction for the next two timestamp intervals from our previously trained forecast object:
CALL phdata_model!FORECAST(FORECASTING_PERIODS => 2);
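If you want to keep the predictions rather than just view them, one option is to read the result of the CALL with RESULT_SCAN right afterwards. This is a sketch; the table name phdata_forecast is hypothetical.

```sql
-- Run the forecast, then capture its output in a table.
CALL phdata_model!FORECAST(FORECASTING_PERIODS => 2);

-- LAST_QUERY_ID() refers to the CALL above, so run these two statements back to back.
CREATE OR REPLACE TABLE phdata_forecast AS
SELECT * FROM TABLE(RESULT_SCAN(LAST_QUERY_ID()));
```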
Like Forecast, Anomaly Detection is another time series function. It allows you to train a model to find outliers in your data. Detecting and removing outliers from your data can significantly improve the accuracy of any other machine learning models you train on your data.
You create an ANOMALY_DETECTION object with the same parameters as the Forecast seen above. You are then able to use these methods with the trained model:
!DETECT_ANOMALIES
!EXPLAIN_FEATURE_IMPORTANCE
!SHOW_EVALUATION_METRICS
!SHOW_TRAINING_LOGS
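A sketch of training an anomaly detection model and then checking new data might look like this. The object name phdata_anomaly_model and the view phdata_new_view are hypothetical. As far as we know, the constructor also expects a LABEL_COLNAME argument, which can be an empty string when you have no labeled anomalies.

```sql
-- Train on historical data (same view and columns as the forecast example).
CREATE SNOWFLAKE.ML.ANOMALY_DETECTION phdata_anomaly_model(
  INPUT_DATA => SYSTEM$REFERENCE('VIEW', 'phdata_view'),
  TIMESTAMP_COLNAME => 'daily_timestamp',
  TARGET_COLNAME => 'sales',
  LABEL_COLNAME => ''  -- assumption: empty string means no labeled anomalies
);

-- Score newer rows; phdata_new_view is a hypothetical view with the same columns.
CALL phdata_anomaly_model!DETECT_ANOMALIES(
  INPUT_DATA => SYSTEM$REFERENCE('VIEW', 'phdata_new_view'),
  TIMESTAMP_COLNAME => 'daily_timestamp',
  TARGET_COLNAME => 'sales'
);
```

The data you score should come after the training period, so keeping training data and new data in separate views is one simple way to organize this.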
Don’t understand why your data is trending a certain way? Contribution Explorer will analyze your data and determine which data segments are driving trends within your target. This way, you can quickly determine what is driving an unwanted result and take immediate action to fix the problem. This can be an excellent feature, particularly when the dataset has a large number of dimensions.
Contribution Explorer is used through TOP_INSIGHTS, which takes dimension mappings, the target metric, and a flag that marks each row as test or control data. It then outputs a table with each contributor and a relative change indicator, which shows how much the contributor positively or negatively affected the target metric.
Using AI, Snowflake has created Universal Search, which takes natural language input from the user and interprets it to return results not only on the objects in your account but also from Snowflake Marketplace, Snowflake documentation, or Knowledge Base articles.
Within Cortex, Copilot is an LLM-powered assistant that works alongside your data analysts to help analyze data and build SQL queries. You always prompt Copilot in natural language, and you can ask it to analyze structured and unstructured data, build SQL queries, or refine and optimize queries written by humans.
As you can see, Cortex makes AI accessible to more businesses and users within your business. By simplifying data analysis, automating tasks, and fostering deeper insights, Cortex equips you to confidently make data-driven decisions and propel your business forward in the age of AI.