Reading the Snowflake Query Profile & Query Insights
Week 6 opens Domain 4, the largest performance block on the COF-C03 (about 21% of the exam). Today is the diagnosis day. You learn to read the Query Profile and find the operator that costs the most. You also name the four problems it exposes: spilling, weak pruning, exploding joins, and queuing. Then you meet Query Insights, the automatic layer that flags those conditions for you.
Domain 4 leans on four words the exam uses as exact triggers. Each one has a plain meaning that points the wrong way.
| Word in the stem | What it sounds like | What it means in Query Profile |
|---|---|---|
| Spilling | A liquid overflowing a cup | Memory ran out, so Snowflake wrote intermediate results to disk. Local SSD first, then remote storage. Remote spilling means the warehouse was too small. |
| Pruning | Cutting branches off a plant | Skipping micro-partitions a query does not need. Pruning is good. Weak pruning is the problem, not pruning itself. |
| Exploding join | Something blowing up | A join that emits far more rows than it reads. Usually a missing join key (Cartesian product) or a many-to-many match. |
| Queue time | Standing in line at a shop | Time the query waited for a free slot on the warehouse before any work began. It is not execution time. |
Reading the Query Profile
Where it lives
Run any query, then open Snowsight. Go to Monitoring, then Query History (Activity in older Snowsight). Click the query, then open the Query Profile tab. The profile is the operator tree Snowflake actually executed.
Each box is an operator node. The arrows show data flowing between nodes. The number on each arrow is the row count passed forward. The tree reads top-to-bottom as last-to-first: the top node is the final step, the bottom node is the first.
The most-expensive node
On the right, the Most Expensive Nodes pane lists the operators by share of execution time, highest first. The number beside each node is the percentage of total execution time that node consumed. The node with the highest percentage is the most expensive operator by definition.
That node is the one to tune first. Fixing a node that takes 6% of the time cannot move the total much. Fixing the node at 53% can. The pane lists only nodes at 1% or more of total execution time, so a short list means a clear leverage point.
A TableScan or a Join sitting at the top of that list is normal. Those are the heaviest steps in most queries. The question the exam asks is not whether a node is large. It is which node is largest.
Where the time went
Select the whole query, with no node clicked. The Profile Overview breaks total time into categories. Processing is CPU work on joins, filters, and aggregations. Local Disk I/O and Remote Disk I/O are time blocked on disk access. The mix tells you the bottleneck before you read a single operator.
Four Problems the Profile Names
The COF-C03 sub-objective lists the exact issue types you must recognise. Each one has a specific metric in the profile and a specific fix. Learn the metric and the fix as a pair.
1. Bytes spilled (warehouse too small)
A query needs memory for sorts, large aggregations, and big joins. When the warehouse runs out of memory, Snowflake spills intermediate results to disk. It writes to local SSD first. If local fills up, it writes to remote cloud storage.
The profile reports two fields: Bytes spilled to local storage and Bytes spilled to remote storage. Local spilling slows a query. Remote spilling slows it far more, because remote storage is the slowest tier. A non-zero remote value is the headline signal.
The fix is to size up the warehouse, the lever from Day 8. A bigger warehouse holds more in memory. Processing smaller batches also helps. In production projects I have worked on, a non-zero Bytes spilled to remote storage is the clearest single sign that a warehouse is undersized.
2. Inefficient pruning (consider clustering)
Pruning is Snowflake skipping micro-partitions a query does not need, the Day 10 mechanism. You read its efficiency on the TableScan operator, by comparing two numbers. Partitions scanned versus Partitions total.
If scanned is a small fraction of total, pruning worked. If scanned is close to total, the filter pruned almost nothing. That is inefficient pruning. On a large table it reads as a near-full scan. The fix is clustering, so the data storage order matches the filter column. Day 39 covers clustering keys.
One trap matters here. Excessive pruning is not a thing. Pruning more is always better, so an option that calls pruning a problem is wrong by construction.
3. Exploding joins (Cartesian or many-to-many)
An exploding join produces far more rows than its inputs hold. The classic cause is a join with no join condition. Snowflake runs that as a Cartesian product. The other cause is a key that matches one row on the left to many on the right, repeated across the table.
You spot it on the row counts. The arrow leaving a Join node carries a row count many times larger than the arrows entering it. The join node also tends to dominate the time. That blowup is the tell.
4. Queue time (warehouse busy, not query slow)
A query’s total duration splits into time spent executing and time spent waiting. Queue time is the wait before any execution begins, while the warehouse finishes other work. High queue time means the warehouse was busy, not that the query was heavy.
This is one of the distinctions Domain 4 leans on. The fix for queuing is to scale out, by adding clusters, not to scale up. Among candidates I have trained, high queue time gets misread as a slow query far more often than as a busy warehouse. Day 37 takes the queuing columns apart in detail.
| Symptom in the profile | What it means | The fix |
|---|---|---|
| Bytes spilled to remote storage > 0 | Warehouse ran out of memory | Size up the warehouse, or batch smaller |
| Partitions scanned ≈ Partitions total | Filter pruned almost nothing | Add a clustering key on the filter column |
| Join output rows ≫ input rows | Cartesian or many-to-many join | Add or fix the join condition |
| High queue time, low execution time | Warehouse was busy when the query arrived | Scale out with multi-cluster, not up |
Query Insights: The Automatic Layer
The Query Profile is a manual tool. You read the tree and draw the conclusion yourself. Query Insights, released in 2025, does part of that reading for you. Snowflake detects known conditions and writes them down in plain language.
Each insight carries three things. A message describing the condition. The part of the query that produced it. A suggested next step, when the condition hurts performance. The detected conditions include a filter that matches no rows, a filter that barely reduces rows, and a LIKE pattern that starts with a wildcard.
You read insights in two places. In Snowsight, the Query Profile tab highlights nodes that have insights and lists them in a side pane. Programmatically, you query the SNOWFLAKE.ACCOUNT_USAGE.QUERY_INSIGHTS view. Its key columns are the insight type, the message, and the suggestion.
For the exam, hold the distinction clearly. The Query Profile shows you the operator tree to interpret. Query Insights surfaces named conditions with a recommendation attached. One is the X-ray. The other is the second opinion written next to it.
Cheat Sheet
| Concept | What to remember | Exam keyword |
|---|---|---|
| Most-expensive node | Highest percentage of execution time. Tune that one first | “Highest %” |
| Profile location | Query History, click query, Query Profile tab | “Query Profile” |
| Bytes spilled local | Memory full, wrote to local SSD. Slower | “Spilled to local” |
| Bytes spilled remote | Local full too. Warehouse undersized. Size up | “Spilled to remote” |
| Pruning check | Partitions scanned vs total on the TableScan | “Partitions scanned” |
| Weak pruning fix | Cluster the table on the filter column | “Clustering” |
| Exploding join | Output rows far exceed input rows. Missing or wide join key | “Cartesian” |
| Queue time | Waiting for the warehouse, not executing. Scale out | “Queued” |
| Query Insights | Auto-detected condition + message + suggestion | “Insights” |
| Insights via SQL | ACCOUNT_USAGE.QUERY_INSIGHTS view | “QUERY_INSIGHTS” |
Exam Tip
Domain 4 pairs a symptom with a fix. It tests whether you pair them right. The traps swap the fix. Spilling to remote storage points to a bigger warehouse, not to clustering. Weak pruning points to clustering, not to a bigger warehouse. Queue time points to scaling out, not to scaling up. Read the metric in the stem, then pick the fix that matches it.
Two distractor families repeat. The first invents a metric. “Excessive pruning” is the common one. It is never a real issue, because more pruning is always better. The second confuses the layers. It claims Query Insights replaces the Query Profile, or that it lives only in a higher edition. Neither is true. Insights sit alongside the profile and are read in Snowsight or from the QUERY_INSIGHTS view.
Hands-On Lab
lab_xs warehouse from Day 1, plus the shared SNOWFLAKE_SAMPLE_DATA database. This lab reads sample data only. It creates nothing.Run a date-filtered query and read its pruning. A selective filter on a date column is a good pruning candidate. Run it, then open the profile.
USE WAREHOUSE lab_xs;
SELECT *
FROM SNOWFLAKE_SAMPLE_DATA.TPCH_SF10.ORDERS
WHERE O_ORDERDATE = '1995-03-15';Find the most-expensive node. With no node selected, read the Most Expensive Nodes pane on the right. Note which operator carries the highest percentage and how the time splits in the Profile Overview.
Trigger an exploding join. A comma join with no condition is a Cartesian product. The LIMIT stops it early, so it returns fast and stays cheap.
SELECT *
FROM SNOWFLAKE_SAMPLE_DATA.TPCH_SF1.ORDERS o,
SNOWFLAKE_SAMPLE_DATA.TPCH_SF1.LINEITEM l
LIMIT 100;Compare against a real join key. Same two tables, now joined on the order key. Read the profile and contrast the join operator with the one from Step 3.
SELECT *
FROM SNOWFLAKE_SAMPLE_DATA.TPCH_SF1.ORDERS o
JOIN SNOWFLAKE_SAMPLE_DATA.TPCH_SF1.LINEITEM l
ON o.O_ORDERKEY = l.L_ORDERKEY
LIMIT 100;Cleanup. Nothing to drop. This lab created no objects and never touched day10_orders. Suspend the warehouse so it stops drawing credits.
ALTER WAREHOUSE lab_xs SUSPEND;
-- No DROP needed. The lab only read SNOWFLAKE_SAMPLE_DATA.Practice Questions
Options:
A. Aggregate[1]
B. Join[5]
C. TableScan[2]
D. TableScan[3]
Why D: Each node shows the percentage of total execution time it consumed. TableScan[3] is at 53.4%, well above the rest. The highest percentage is the most expensive operator, so it is the first place to tune.
Why not A: Aggregate[1] at 21.6% is the second-heaviest, not the heaviest. Picking it means reading the wrong line.
Why not B: Join[5] at 7.1% is minor here. Fixing it cannot move the total much.
Why not C: TableScan[2] at 6.3% is the smallest of the four. It is a distractor placed to reward a glance instead of a read.
Options:
A. Pruning
B. Spilling
C. Join explosion
D. Queue overloading
Why B: When memory runs out during a sort, aggregation, or large join, Snowflake spills intermediate results to local disk, then to remote storage. The profile reports Bytes spilled to local storage and Bytes spilled to remote storage. Remote spilling signals the warehouse should be sized up.
Why not A: Pruning is skipping micro-partitions a query does not need. It has nothing to do with memory or disk usage.
Why not C: Join explosion is row blowup in a join, not disk usage from low memory.
Why not D: Queue overloading is a query waiting for warehouse resources before it runs. It is not a memory-to-disk event.
Options:
A. Insufficient credit quota
B. Inefficient query pruning
C. Excessive query pruning
D. Exploding joins
E. Credit usage that exceeds a set threshold
Why B: Inefficient pruning shows on the TableScan, when Partitions scanned is close to Partitions total despite a selective filter. The profile makes that gap visible.
Why D: An exploding join shows on the row counts, when a join emits far more rows than it reads. That blowup is exactly what the profile is built to reveal.
Why not A: Credit quota is a Resource Monitor and ACCOUNT_USAGE concern, not a Query Profile metric. Re-read Day 26 if missed.
Why not C: Excessive pruning is not a real issue. More pruning is always better, so the phrase is a trap.
Why not E: Credit thresholds belong to Resource Monitors, not to the profile of a single query.
Options:
A. The warehouse was busy, so the query waited for a free slot
B. The warehouse memory was exhausted and data spilled to remote storage
C. The filter pruned almost no micro-partitions
D. A join produced a Cartesian product
Why A: Queue time is the wait before any execution begins. A long queue with short execution means the warehouse was occupied with other queries. The fix is to scale out by adding clusters, the topic of Day 37.
Why not B: Spilling shows in the Bytes spilled fields and inflates execution time, not queue time. This stem says execution time is low.
Why not C: Weak pruning makes a scan heavier. That raises execution time. It does not create queue time.
Why not D: A Cartesian product blows up row counts and execution time. It has no effect on time spent waiting in the queue.
Options:
A. Query Insights replaces the Query Profile in newer Snowflake accounts
B. Query Insights automatically flags conditions with a message and a suggestion, readable in Snowsight or from the QUERY_INSIGHTS view
C. Query Insights is available only on Business Critical edition
D. Query Insights shows real-time CPU usage per core for the warehouse
Why B: Query Insights detects known conditions, such as a non-selective filter. It attaches a plain-language message plus a suggested next step. You read it in the Snowsight profile pane or by querying SNOWFLAKE.ACCOUNT_USAGE.QUERY_INSIGHTS.
Why not A: Insights sit alongside the profile. They do not replace it. The operator tree is still where you read execution detail.
Why not C: No edition gate of that kind applies. The distractor invents one to sound plausible.
Why not D: Per-core CPU telemetry is not what insights report. They report named conditions and recommendations, not live hardware metrics.
Snowflake Documentation
Official docs for today’s topics. The exam pulls directly from these.
Today you learned: the Query Profile is the operator tree Snowflake executed. The most-expensive node is the one with the highest percentage of execution time. The profile names four problems: bytes spilled, inefficient pruning, exploding joins, and queue time. The fixes, in order: size up the warehouse, cluster the table, fix the join key, and scale out. Query Insights adds an automatic layer, flagging conditions with a message and a suggestion in Snowsight or in the QUERY_INSIGHTS view.
Key takeaway: every Domain 4 diagnosis question pairs a symptom with a fix. Read the metric in the stem first, then pick the fix that matches it. Spilling and weak pruning have different fixes. The exam swaps them.
Tomorrow (Day 37): Workload Management and the ACCOUNT_USAGE performance views. You separate ETL, BI, and ad-hoc work onto the right warehouse shapes. You read the queuing columns that split provisioning waits from overload waits. You also analyse query history at scale. It is the natural next step after learning to read a single query’s profile today.
