D1: Architecture & Features Week 3

DAY 15

Snowpark, Snowflake Notebooks & Streamlit in Snowflake

Week 3 opens with the AI/ML developer stack: Snowpark, Notebooks, and Streamlit-in-Snowflake as one integrated developer surface. Snowpark hosts non-SQL code that runs inside Snowflake, which is the answer to official Sample Q5 and the most-tested sentence in sub-objective 1.6.

🗣️ Plain-English First

Term	Plain meaning
Snowpark	A set of client libraries (Python, Java, Scala) plus a server-side runtime that lets you write code that runs inside Snowflake. No data movement, no external compute.
DataFrame	A table-like object you can manipulate in code. Snowpark’s `DataFrame` looks like pandas/Spark but compiles to SQL behind the scenes.
UDF / UDTF	User-Defined Function / User-Defined Table Function. Custom logic (often Python) that Snowflake stores and runs on a warehouse like a built-in function.
Stored procedure	A named, callable block of code (SQL, Python, Java, Scala, JavaScript). Called with `CALL`. Procedural: runs multiple statements, loops, conditionals.
Snowflake Notebook	An interactive Snowsight surface with cells. Each cell is SQL, Python, or Markdown. Like Jupyter, but lives inside Snowflake and uses a warehouse.
Streamlit in Snowflake	A native Snowflake object that hosts a Python web app. No external hosting, no separate URL infrastructure. Users open it through Snowsight.
Snowpark-Optimized warehouse	A warehouse type with ~16× the memory per node of a standard warehouse, built for memory-intensive Python (ML training, large UDFs).
Kernel / Runtime	The Python execution engine. In Notebooks, the kernel runs on either a warehouse (Warehouse Runtime) or a container (Container Runtime).

📘

Today’s Concept

Micro-Concept 1: Snowpark: The One Sentence

One sentence from today is non-negotiable:

Snowpark is how you host non-SQL code (Python, Java, Scala) so it runs inside Snowflake, on a warehouse, with no data movement.

That is the verbatim answer to official Sample Q5. The stem asks which feature lets a customer host non-SQL code (Java, Python) that operates on Snowflake data with high performance. The answer is Snowpark. Not External Functions, which call out to a third-party service. Not stored procedures on their own. Those are the typical callable container for Snowpark code, not the engine.

“Inside Snowflake” has three concrete components:

Code lives in Snowflake, as a Snowpark stored procedure, UDF, or UDTF object.
Compute is a Snowflake warehouse (standard or Snowpark-Optimized). Not your laptop, not a separate cluster.
Data never leaves. The DataFrame operations translate to SQL, run on the warehouse, and return results in place.

The Snowpark client libraries can run anywhere: your laptop, a notebook, an application server. When that client calls session.sql(...) or operates on a DataFrame, the actual work happens on the Snowflake warehouse. Candidates in training sessions I have run often assume that because they typed Python on their laptop, the Python ran on their laptop. It did not.

Micro-Concept 2: Snowpark vs Stored Procedure vs External Function

This is the distinction the exam returns to repeatedly. Three patterns, three answers:

You want to…	Use	Why
Run Python/Java/Scala code inside Snowflake, with no data leaving	Snowpark	Hosts the code on the warehouse. Can deliver as a stored procedure or UDF, but the engine is Snowpark.
Run a block of SQL (or other) statements as a single named, callable unit	Stored procedure	Procedural orchestration. Called with `CALL`. Often the wrapper for Snowpark code, but does not require Snowpark.
Call an external service (REST API, SaaS endpoint) from inside SQL	External Function	Sends data out to an external HTTPS endpoint via an API integration. Snowflake does not host the code; the third-party service does.
Apply custom logic inline inside a SELECT, on many rows	UDF (often a Snowpark UDF)	Returns a value; can be called many times per statement. Snowpark Python UDFs are the most common case.

The key rule is direction. Snowpark = code comes into Snowflake. External Function = data goes out to an external endpoint. Remember the direction and you will answer the scenario question correctly.

Micro-Concept 3: Snowpark-Optimized Warehouses

Standard warehouses are sized for SQL workloads. Snowpark code can hit memory limits on a standard warehouse and spill to disk. ML training and large pandas-style operations are the common cases. Snowpark-Optimized warehouses fix that with roughly 16× the memory per node of a same-size standard warehouse.

Property	Standard warehouse	Snowpark-Optimized warehouse
Memory per node	1× (baseline)	~16× baseline
Local SSD cache	1×	~10×
Minimum size	X-Small	Medium (XS/Small not supported in the classic mode)
Credit cost	1× (baseline)	~1.5× the same size’s standard credits/hour
Best for	SQL workloads, light Python	ML training, memory-heavy Python UDFs/SPROCs
Multi-cluster?	Yes (Enterprise+)	Yes (Enterprise+)
Query Acceleration Service?	Yes	No. Not supported.

Four facts to remember for the exam: minimum size Medium, 16× memory, ~1.5× credit cost, and recommended for memory-intensive Snowpark workloads. ML training is the headline scenario. The newer RESOURCE_CONSTRAINT clause (released March 2025) does let you specify MEMORY_1X on an XSMALL warehouse. That is a configuration extension, not what the exam tests today. Stick to “Medium minimum” on test day. One common distractor says Snowpark-Optimized is “gated to Enterprise Edition or higher.” That is multi-cluster behaviour, not Snowpark-Optimized itself.

Micro-Concept 4: Snowflake Notebooks

A Snowflake Notebook is a Snowsight surface with cells. SQL cells, Python cells, and Markdown cells. They are not a separate edition. They are a Snowsight feature available across all editions.

Two facts make Notebooks exam-testable:

(a) Two runtime choices.

Runtime	What it runs on	Best for
Warehouse Runtime	A virtual warehouse (the Notebook warehouse). Default Python 3.9 (3.10 also available).	General SQL + Python work, lightweight analytics, fast startup.
Container Runtime	A Snowpark Container Services compute pool (CPU or GPU). Many ML packages pre-installed.	ML training, GPU workloads, deep learning, custom Python packages.

(b) Two warehouses, not one. A notebook on the Warehouse Runtime can use two warehouses at the same time:

Notebook warehouse. Runs the kernel (the Python process driving the cells). Snowflake recommends X-Small here. The managed default is called SYSTEM$STREAMLIT_NOTEBOOK_WH, provisioned per account.
Query warehouse. Runs the SQL/Snowpark queries the notebook submits. Can be larger to handle heavy queries on demand.

The split exists because the kernel warehouse stays active for the entire notebook session. You will see an EXECUTE NOTEBOOK query parked in Query History the whole time. The query warehouse only spins up when a cell actually submits work. Keeping them separate lets you pay X-Small rates for sit-and-think time, and run the bigger warehouse only when queries actually fire. On the exam, watch for stems describing this two-warehouse model. The kernel goes on the Notebook warehouse. The queries go on the Query warehouse.

Micro-Concept 5: Streamlit in Snowflake

Streamlit is an open-source Python library for building data apps with a handful of lines of code. Streamlit-in-Snowflake hosts those apps as first-class Snowflake objects. Not on an external server, not on a separate URL, not on your laptop.

Five facts to remember:

Source lives in a named internal stage. The STREAMLIT object points to it. External stages are not supported.
Access is RBAC. Viewers need USAGE on the database, schema, warehouse (or compute pool), and the STREAMLIT object itself.
Two runtimes, mirroring Notebooks: Warehouse Runtime (the default) and Container Runtime (uses a compute pool).
Warehouse runtime gives each viewer a personal app instance. Container runtime shares one long-running instance across all viewers. Cheaper at scale.
Works with Snowpark, UDFs, and stored procedures. Anything you build with Snowpark can be wrapped in a Streamlit UI without leaving Snowflake.

Streamlit-in-Snowflake is the standard exam answer for any scenario that says “host a Python data app inside Snowflake without managing external infrastructure.” If the stem mentions Streamlit Community Cloud, a self-managed VM, or an external hosting platform, that is not Streamlit-in-Snowflake. Those are separate deployment options for the underlying open-source library.

⚡

Cheat Sheet

Concept	What to remember
Snowpark one-liner	Host non-SQL code (Python, Java, Scala) inside Snowflake. No data movement. Sample Q5 answer.
Snowpark vs External Function	Snowpark = code comes IN. External Function = data goes OUT.
Snowpark-Optimized warehouse	16× memory, ~1.5× credits, min size = Medium, NO Query Acceleration Service.
Best fit for Snowpark-Optimized	ML training, memory-intensive Python UDFs/stored procedures.
Notebook = SQL + Python + Markdown cells	Native to Snowsight. Default Python 3.9 (3.10 optional).
Notebook runtimes	Warehouse Runtime (on a warehouse) or Container Runtime (on a compute pool, CPU/GPU, for ML).
Notebook default warehouse	`SYSTEM$STREAMLIT_NOTEBOOK_WH` (Snowflake-managed, provisioned per account).
Two warehouses in a Notebook	Notebook warehouse (kernel, X-Small recommended) + Query warehouse (SQL pushdown, size on demand).
Streamlit-in-Snowflake = native object	Source on internal stage, RBAC for access, no external hosting.
Streamlit warehouse runtime	Each viewer gets a personal app instance.
Streamlit container runtime	All viewers share one long-running instance. More cost-effective at scale.
UDF vs stored procedure call rule	A single SQL statement can call many UDFs but only one stored procedure.
ML Models / Applications	Two of the 12 database objects (from Day 5). Both belong here in 1.6.

🎯 Exam Tip

Three near-identical-sounding distractors to keep straight:

(1) Snowpark hosts non-SQL code inside Snowflake on a warehouse. TRUE. This is the Sample Q5 answer and the most-tested sentence in sub-objective 1.6. (2) External Functions host non-SQL code inside Snowflake. FALSE. They call out to an external HTTPS endpoint, and the data is sent out. (3) Streamlit-in-Snowflake requires an external hosting service. FALSE. It is a native Snowflake object, opened through Snowsight, with no external infrastructure. One common trap: stems that mention “Java UDF for sentiment scoring” or “Python code with high performance on Snowflake data”. Those are Sample Q5 in disguise. The answer is Snowpark, even when the word Snowpark never appears in the question.

🛠️

Hands-On Lab

Type: LAB (guided, mostly GUI-driven) | Time: ~15 minutes | Credits: <0.1 | Prerequisite: Notebooks & Streamlit enabled (default on most accounts); ACCOUNTADMIN role; day10_orders table from Day 10 should still exist (Section 14 of memory: it persists through Day 39).

Confirm the Snowpark Python runtime is reachable from a regular worksheet. This proves Snowpark is server-side accessible. No local install required.

SQL

USE ROLE ACCOUNTADMIN;
USE WAREHOUSE lab_xs;

-- Inspect the Snowpark Python stored-procedure runtime
SHOW FUNCTIONS LIKE 'SYSTEM$ALLOW%' IN SNOWFLAKE.CORE;

-- Confirm day10_orders still exists from Week 2 (Day 10 lab)
SELECT COUNT(*) AS row_count FROM day10_orders;
-- Expect ~1.5M rows

👀 Observe: The Snowpark Python runtime is built into the account. No install step required server-side. day10_orders is your fixture for the next steps.

Create a Snowpark Python stored procedure. This is the “host non-SQL code in Snowflake” pattern, the exact mechanism Sample Q5 asks about.

SQL

CREATE OR REPLACE PROCEDURE day15_top_status()
  RETURNS TABLE (status STRING, order_count NUMBER)
  LANGUAGE PYTHON
  RUNTIME_VERSION = '3.10'
  PACKAGES = ('snowflake-snowpark-python')
  HANDLER = 'run'
AS
$$
def run(session):
    df = (session.table("day10_orders")
                 .group_by("O_ORDERSTATUS")
                 .count()
                 .sort("COUNT", ascending=False))
    return df
$$;

-- Call it. The Python code runs on the warehouse, no data leaves Snowflake.
CALL day15_top_status();

👀 Observe: The Python run() function executes inside Snowflake on lab_xs. The DataFrame operations translate to SQL internally. This is Snowpark’s defining property: non-SQL code, hosted in Snowflake, no data movement.

Create a Snowflake Notebook in Snowsight (GUI step).

In Snowsight: Projects » Notebooks » + Notebook. Fill in:

Name: day15_demo
Database / Schema: any database you have CREATE privileges on (e.g., your default database, public schema)
Run on: Warehouse
Python environment: Warehouse Runtime 1.0 (Python 3.9) is fine
Query warehouse: lab_xs
Notebook warehouse: SYSTEM$STREAMLIT_NOTEBOOK_WH (Snowflake-managed default, recommended)

Click Create. The notebook opens with an empty cell.

👀 Observe: Two warehouses are now bound to this one notebook. The Notebook warehouse runs the Python kernel. You can confirm it by opening Query History and looking for the EXECUTE NOTEBOOK query that stays running as long as the notebook is open. The Query warehouse only spins up when a SQL or Snowpark query is actually submitted from a cell.

Add and run a SQL cell, then a Python cell. Inside the notebook:

In the first cell, set the cell type to SQL and paste:

SQL

SELECT O_ORDERSTATUS, COUNT(*) AS c
FROM day10_orders
GROUP BY 1
ORDER BY c DESC;

Click Run. The output renders as an interactive Streamlit DataFrame.

Add a second cell, set the type to Python, and paste:

Python

from snowflake.snowpark.context import get_active_session
session = get_active_session()

df = session.table("day10_orders") \
            .group_by("O_ORDERSTATUS") \
            .count() \
            .sort("COUNT", ascending=False)

df  # In a Snowflake Notebook, just naming the DataFrame renders it.

Run the cell.

👀 Observe: SQL and Python in the same document, talking to the same data. The Snowpark Python code needed no local install. It ran on the Notebook warehouse. Inside a Snowflake Notebook, naming a DataFrame triggers eager evaluation and renders it. Outside this environment, you would need df.show() to get the same result.

Create a Streamlit-in-Snowflake app (GUI step). This is the third surface in today’s stack.

In Snowsight: Projects » Streamlit » + Streamlit App. Fill in:

Name: day15_streamlit
Database / Schema: same as your notebook
Run on: Warehouse
App warehouse: lab_xs

Click Create. Snowflake generates a starter streamlit_app.py. Replace its body with:

Python

import streamlit as st
from snowflake.snowpark.context import get_active_session

st.title("Day 15 — Order Status Explorer")
session = get_active_session()

status_options = [r[0] for r in
    session.sql("SELECT DISTINCT O_ORDERSTATUS FROM day10_orders").collect()]

picked = st.selectbox("Pick an order status:", status_options)

count = session.sql(
    f"SELECT COUNT(*) AS c FROM day10_orders WHERE O_ORDERSTATUS = '{picked}'"
).collect()[0]["C"]

st.metric(label=f"Orders with status '{picked}'", value=f"{count:,}")

Click Run. The app renders inline.

👀 Observe: A working Python data app, hosted as a Snowflake object, accessed through Snowsight, querying live data on a Snowflake warehouse. No external hosting, no separate URL infrastructure, no deployment script. Access is controlled by RBAC. Anyone you GRANT USAGE on the STREAMLIT object (plus the database, schema, warehouse) can open the app.

Concept check (no need to run). A Snowpark-Optimized warehouse for a memory-intensive ML training stored procedure would be created like this. Most trial accounts will not run this cheaply, so this step is illustrative only.

SQL

-- Concept check only — minimum size is MEDIUM in the classic mode.
-- Costs roughly 1.5x a standard MEDIUM credit rate.
-- DO NOT RUN unless you intend to spend the credits.
/*
CREATE WAREHOUSE day15_ml_wh
  WAREHOUSE_TYPE = 'SNOWPARK-OPTIMIZED'
  WAREHOUSE_SIZE = 'MEDIUM'
  AUTO_SUSPEND = 60
  AUTO_RESUME = TRUE
  INITIALLY_SUSPENDED = TRUE;
*/

-- The newer RESOURCE_CONSTRAINT clause (GA March 2025) lets you do this:
/*
CREATE WAREHOUSE day15_ml_wh
  WAREHOUSE_TYPE = 'SNOWPARK-OPTIMIZED'
  WAREHOUSE_SIZE = 'XSMALL'
  RESOURCE_CONSTRAINT = 'MEMORY_1X'
  INITIALLY_SUSPENDED = TRUE;
*/
-- For the exam, stick to the canonical answer: Snowpark-Optimized minimum = MEDIUM.

💡 Why concept-only: Snowpark-Optimized warehouses cost roughly 1.5× their standard equivalent. The minimum size in classic mode is Medium. Running one for even a minute on a trial account is wasteful when no real ML workload is queued. Learn the facts. Spin one up only when you actually need it.

Cleanup. Remove the stored procedure and the GUI-created notebook + Streamlit app. day10_orders stays. It is the persistent fixture through Day 39.

SQL

DROP PROCEDURE IF EXISTS day15_top_status();
-- In Snowsight:
--   Projects » Notebooks » day15_demo » (ellipsis) » Delete
--   Projects » Streamlit » day15_streamlit » (ellipsis) » Delete
-- DO NOT drop day10_orders — it persists through Day 39.

💡 Pro tip: An idle notebook keeps an EXECUTE NOTEBOOK query alive on its Notebook warehouse. The warehouse stays running until the session ends. I have seen idle notebooks run up meaningful overnight credit costs on a client project, because users closed the browser tab without ending the session. Always click the Active button at the top of the notebook and choose End session before walking away.

❄️

Snowflake Documentation

Official docs for today’s topics.

❄️

Snowpark Developer Guide

docs.snowflake.com/en/developer-guide/snowpark/index

→

❄️

Snowflake Notebooks

docs.snowflake.com/en/user-guide/ui-snowsight/notebooks

→

❄️

Streamlit in Snowflake

docs.snowflake.com/en/developer-guide/streamlit/about-streamlit

→

🔗

External References

Snowpark-Optimized and runtime details.

Docs Snowflake

Snowpark-Optimized Warehouses Reference

Docs Snowflake

Notebooks Setup: Two-Warehouse Model

Docs Snowflake

Streamlit Runtime Environments (Warehouse vs Container)

❓

Practice Questions

Q1: A team needs to host custom Java code that scores customer sentiment and Python code that performs feature engineering, both running with high performance directly on data stored in Snowflake. Which Snowflake feature should they use? ▸

Options:

A. External Functions
B. Snowpark
C. Snowpipe
D. Snowflake Cortex SQL functions

✅ Answer: B

Why B: Snowpark hosts non-SQL code (Java, Python, Scala) that executes inside Snowflake on a warehouse, with no data movement. This is the answer to official Sample Q5 and the most-tested fact in sub-objective 1.6. Whenever a stem mentions custom Java or Python with high performance on Snowflake data, the answer is Snowpark. The word itself need not appear in the question.

Why not A: External Functions call out to an HTTPS endpoint hosted by a third party. The data leaves Snowflake, the opposite direction of what the scenario describes.

Why not C: Snowpipe is for continuous data loading from a stage, not for hosting custom code. Re-read Day 4 if missed.

Why not D: Cortex SQL functions (SENTIMENT, SUMMARIZE, and so on) are pre-built and called from SQL. They cannot host the team’s own custom Java or Python code. Day 16 covers Cortex in detail.

Q2 (verbatim from SnowPro_Core_Practice_Review.docx Q29): How does a Snowflake stored procedure compare to a User-Defined Function (UDF)? ▸

Options:

A. A single executable statement can call only two stored procedures. In contrast, a single SQL statement can call multiple UDFs.
B. A single executable statement can call only one stored procedure. In contrast, a single SQL statement can call multiple UDFs.
C. A single executable statement can call multiple stored procedures. In contrast, multiple SQL statements can call the same UDFs.
D. Multiple executable statements can call more than one stored procedure. In contrast, a single SQL statement can call multiple UDFs.

✅ Answer: B

Why B: The Snowflake documentation states the rule plainly. A single SQL statement can call multiple UDFs, but only one stored procedure. A stored procedure is invoked via CALL as its own independent statement. It cannot be embedded in an expression. UDFs return a value and can sit inline in SELECT lists, WHERE clauses, and so on. That is why many can appear in a single statement.

Why not A, C, D: All three distort the actual call rule. Option D’s “multiple statements can call multiple procedures” is technically true on its own but misses the comparison the documentation actually makes.

Why this matters for Day 15: Snowpark code ships either as a stored procedure (one call per statement, procedural) or as a UDF (many calls per statement, returns a value). Knowing which packaging fits which use case is exam-relevant.

Q3: A Snowflake Notebook running on the Warehouse Runtime is configured with two warehouses: a Notebook warehouse and a Query warehouse. What runs on each? ▸

Options:

A. The Notebook warehouse runs SQL queries; the Query warehouse runs the Python kernel.
B. The Notebook warehouse runs the Python kernel (and cell execution); the Query warehouse runs SQL and Snowpark pushdown queries.
C. Both warehouses split SQL traffic round-robin for load balancing.
D. The Query warehouse is only used when the notebook is closed.

✅ Answer: B

Why B: The Notebook warehouse runs the kernel, the Python process driving the cells. It stays active for the entire session via an EXECUTE NOTEBOOK query. The Query warehouse spins up on demand to run the SQL and Snowpark queries submitted by the cells. Snowflake recommends a small Notebook warehouse (X-Small) paired with a separately-sized Query warehouse for heavy work.

Why not A: The roles are reversed. This is the classic flip-the-labels distractor. If you skim, you pick it.

Why not C: Snowflake does not load-balance round-robin between these two warehouses. They have distinct, non-overlapping roles.

Why not D: The Query warehouse is used while the notebook is actively running queries, not when it is closed.

Q4: Which statement about Streamlit in Snowflake is TRUE? ▸

Options:

A. Streamlit in Snowflake requires hosting on Streamlit Community Cloud and authenticates back to Snowflake via OAuth.
B. Streamlit in Snowflake is a native Snowflake object whose source files live on an internal stage, with access controlled through Snowflake RBAC.
C. Streamlit in Snowflake apps are only visible to the ACCOUNTADMIN role.
D. Streamlit in Snowflake apps can only read data via the Snowflake REST API.

✅ Answer: B

Why B: Streamlit-in-Snowflake stores the app’s source files on a named internal stage. External stages are not supported. The STREAMLIT object is a first-class Snowflake securable. Viewers need USAGE on the database, schema, warehouse (or compute pool), and the STREAMLIT object. That is standard RBAC.

Why not A: External hosting is exactly what Streamlit-in-Snowflake removes. Streamlit Community Cloud is a separate deployment option for the open-source library and is not how Streamlit-in-Snowflake works.

Why not C: Any role granted the right privileges can view the app. ACCOUNTADMIN is not required, and granting it for viewing would be a security anti-pattern.

Why not D: Inside the app, Snowpark and SQL run against the app’s warehouse directly. There is no requirement to route through the REST API.

Q5: Which TWO statements about Snowpark-Optimized warehouses are TRUE? (Select TWO.) ▸

Options:

A. They provide roughly 16× the memory per node compared to a standard warehouse of the same size.
B. They are available starting from the X-Small size.
C. The minimum supported size (in the classic mode without the RESOURCE_CONSTRAINT clause) is Medium.
D. They are billed at the same credit rate as a standard warehouse of the same size.
E. They are only available on Business Critical Edition or higher.

✅ Answer: A, C

Why A: Snowpark-Optimized warehouses provide approximately 16× the memory per node of a standard warehouse. This supports memory-intensive workloads like ML training without spilling to disk.

Why C: In the classic mode, the supported sizes start at Medium. X-Small and Small are not available. The newer RESOURCE_CONSTRAINT = 'MEMORY_1X' clause does let you run at XSMALL, but the standard exam answer remains “Medium minimum.”

Why not B: X-Small is not supported in the classic Snowpark-Optimized mode. This is the distractor designed to catch people who skip past the “in the classic mode” qualifier.

Why not D: Snowpark-Optimized warehouses cost roughly 1.5× the credits/hour of a standard warehouse of the same size, not the same rate.

Why not E: Snowpark-Optimized is available across editions. Multi-cluster capability is what is gated to Enterprise+, not the warehouse type itself. Re-read Day 9 if the multi-cluster edition gate is fuzzy.

📝 Recap

Today you learned: Snowpark is how non-SQL code (Python, Java, Scala) runs inside Snowflake on a warehouse. That is the answer to Sample Q5 and the single most-tested fact in sub-objective 1.6. External Functions go in the opposite direction (data out to an HTTPS endpoint). Snowpark-Optimized warehouses give roughly 16× memory and 1.5× credits, with a Medium minimum size, for memory-hungry ML workloads. Snowflake Notebooks combine SQL and Python in cells. They use two warehouses (kernel on the Notebook warehouse, queries on the Query warehouse) and default to the managed SYSTEM$STREAMLIT_NOTEBOOK_WH. Streamlit-in-Snowflake hosts Python data apps as native Snowflake objects on an internal stage, accessed through Snowsight, governed by RBAC. No external hosting.

Key takeaway: Snowpark, Notebooks, and Streamlit-in-Snowflake share one design principle. Bring the code to the data, not the other way around. Once you internalise that, sub-objective 1.6 stops being a “memorise three products” exercise and becomes a “recognise which surface fits which scenario” exercise. The exam tests exactly that.

Tomorrow (Day 16). Cortex AI: SQL Functions + Cortex Search. Still in sub-objective 1.6, but the angle shifts from “host your own code” to “use Snowflake’s pre-built AI.” Cortex SQL functions are pure SQL (no Python required) for sentiment, summarisation, translation, completion, classification, and answer extraction. Cortex Search adds semantic and lexical retrieval on text columns. Expect one to two exam questions on the SQL-only nature of these functions.

← Take Practice Test All 50 Days Day 16 →

About the Author

Abhay Krishnan

Senior Data & AI Consultant

Connect on LinkedIn

With over five years of data engineering experience at EY and Infosys, Abhay Krishnan specializes in building scalable data pipelines and cloud warehousing solutions. He is a certified SnowPro Core professional, alongside credentials in AWS and Azure. Abhay created this 50-day track to solve a problem he faced firsthand: the lack of a structured, free resource for Snowflake certification prep. Follow him on LinkedIn for more data engineering insights.

Snowpark, Snowflake Notebooks & Streamlit in Snowflake

Today’s Concept

Micro-Concept 1: Snowpark: The One Sentence

Micro-Concept 2: Snowpark vs Stored Procedure vs External Function

Micro-Concept 3: Snowpark-Optimized Warehouses

Micro-Concept 4: Snowflake Notebooks

Micro-Concept 5: Streamlit in Snowflake

Cheat Sheet

Hands-On Lab

Snowflake Documentation

External References

Practice Questions

Topics