Week 6 Recap
No new material today. Six days into Domain 4, you have read Query Profiles, separated workloads, weighed Search Optimization against clustering, and unpacked caches, window functions, and FLATTEN. Today is a 30-minute closed-book drill, the recurring Domain 4 traps, and the Week 6 Practice Test. Domain 4 carries a large share of the scored questions. Query Profile reading shows up more than any single feature.
| Step | Time | What you do |
|---|---|---|
| 1. Closed-book drill | 30 min | Run the five drill tasks below. No tabs open. Write SQL on paper or a scratch buffer first. |
| 2. Self-grade | 5 min | Open Snowsight, run your SQL, score yourself honestly. |
| 3. Read the gotchas | 10 min | Speed-read the patterns below. These are the traps Domain 4 reuses in scenario form. |
| 4. Take the Week 6 Practice Test | 15 min | 10 mixed Domain 4 questions. Target 8 out of 10. Below that, re-read the day flagged in your wrong answers first. |
The 30-Minute Drill
Read a Query Profile from memory. State which node is the most expensive, and what a high bytes-spilled value tells you.
Most-expensive operator:
The node with the HIGHEST percentage of execution time.
Read that node first. It is where the query spends its time.
Bytes Spilled:
to LOCAL -> mild. Data overflowed memory onto local SSD.
to REMOTE -> severe. The warehouse is too small for this query.
Fix = size the warehouse UP (a bigger warehouse
has more memory per query).
Pruning:
Partitions scanned / Partitions total.
A LOW ratio means good pruning. A high ratio with a WHERE
filter present points to a clustering opportunity.State scale up versus scale out in one line each, and name the edition gate.
Scale UP : resize to a bigger warehouse.
Helps ONE large, complex, or spilling query.
Scale OUT : add clusters with a multi-cluster warehouse.
Helps CONCURRENCY (many users / queries queuing).
Requires Enterprise Edition or higher.
Queue signal in ACCOUNT_USAGE / QUERY_HISTORY:
QUEUED_OVERLOAD_TIME > 0 -> warehouse saturated by concurrency
-> scale OUT, not up.List the three caches, where each lives, and what clears each. No peeking.
Result cache : Cloud Services layer. Stores full query results.
Free, no warehouse. Reused on EXACT SQL match +
unchanged data + your privileges.
Held 24 hr, reset on each reuse, up to 31 days max.
Metadata cache : Cloud Services layer. Answers COUNT(*), MIN, MAX
on numeric/date columns with NO warehouse running.
Warehouse cache : Compute layer. Local SSD holding micro-partition
data. CLEARED when the warehouse suspends.Write a top-1-per-group query using QUALIFY, from memory. Then run it to verify.
USE WAREHOUSE lab_xs;
-- Highest-value order per customer, no subquery:
SELECT O_CUSTKEY, O_ORDERKEY, O_TOTALPRICE
FROM SNOWFLAKE_SAMPLE_DATA.TPCH_SF1.ORDERS
QUALIFY ROW_NUMBER() OVER (
PARTITION BY O_CUSTKEY
ORDER BY O_TOTALPRICE DESC) = 1
LIMIT 50;= 1 for <= 3 to get the top three per group. If you reached for a subquery instead, re-read Day 41.Explode a JSON array into rows with LATERAL FLATTEN, from memory. Then run it.
-- Self-contained, no table needed:
SELECT
v:name::STRING AS person,
f.value::STRING AS skill
FROM (SELECT PARSE_JSON(
'{"name":"Alice","skills":["SQL","Python","Snowpark"]}') AS v),
LATERAL FLATTEN(input => v:skills) f;LATERAL FLATTEN join did not come out, re-read Day 42. Then suspend the warehouse:ALTER WAREHOUSE lab_xs SUSPEND; -- keep the warehouse, just suspendWeek 6 Exam Gotchas
These are the Domain 4 patterns the exam reuses in scenario form. Each has a Tell, the phrase the question uses to set the trap. Learn the Tells and you will spot the trap before reading the options.
Gotcha 1: Bytes spilled to remote means the warehouse is too small
Spilling to local SSD is mild. Spilling to remote storage is the warning sign. The query ran out of memory and pushed data to slow remote storage. The fix is a bigger warehouse, not a query rewrite. Tell: a profile showing a non-zero “Bytes spilled to remote storage” value.
Gotcha 2: Concurrency is a scale-out problem, not scale-up
Scaling up (a bigger warehouse) helps one heavy query. Scaling out (multi-cluster) helps many users queuing at once. Multi-cluster warehouses need Enterprise Edition or higher. Tell: “many users hitting the dashboard at peak” or a high QUEUED_OVERLOAD_TIME. The right move is more clusters.
Gotcha 3: The result cache is not permanent
The query result cache holds a result for 24 hours from its last use. Each reuse resets that 24-hour window, up to 31 days from the first run. After 31 days it is purged regardless. Tell: any option claiming the result cache “lasts forever” or “until the data changes only.” Both are wrong.
Gotcha 4: Suspending a warehouse clears only the SSD cache
The warehouse cache lives on local SSD and is dropped on suspend. The result cache and metadata cache live in the Cloud Services layer and survive a suspend. Tell: “all warehouses suspended, the query still returned instantly.” That points to result or metadata cache, never the warehouse cache.
Gotcha 5: Metadata cache answers some aggregates with no warehouse
COUNT(*), and MIN or MAX on numeric and date columns, can be answered from metadata alone. No warehouse runs. The Query Profile shows a single METADATA-BASED RESULT node. Tell: a COUNT(*) returning instantly with every warehouse suspended. MIN or MAX on a string column does not use this path.
Gotcha 6: Cluster on filter columns first, join columns second
A clustering key helps most on the columns in your WHERE filters. Join predicate columns come next. GROUP BY and ORDER BY columns are not the priority. Tell: a multi-select asking which columns to cluster on. The WHERE filter column is almost always one of the answers.
Gotcha 7: Automatic Clustering runs on Standard; the others need Enterprise
Automatic Clustering is available in Standard Edition. Materialized Views, the Search Optimization Service, and the Query Acceleration Service all require Enterprise Edition or higher. Tell: a “minimum edition” question. If the feature is Automatic Clustering, the answer is Standard. The other three climb to Enterprise.
Gotcha 8: Search Optimization is for point lookups, not range scans
The Search Optimization Service speeds up selective lookups: equality, IN lists, substring and LIKE, and geospatial predicates. It builds a persistent search access path. It is not clustering. It does not help broad range scans. Tell: “find a single row by id” or “substring match on a large table.”
Gotcha 9: QUALIFY filters window functions without a subquery
QUALIFY is to window functions what HAVING is to GROUP BY. It filters on the result of ROW_NUMBER, RANK, or any window function in one statement. Tell: “return the top N per group without a subquery.” The answer is QUALIFY with a window function.
Gotcha 10: RANK skips on ties, DENSE_RANK does not
On a tie, RANK leaves a gap (1, 1, 3). DENSE_RANK does not (1, 1, 2). ROW_NUMBER assigns a unique number to every row, breaking ties arbitrarily. Tell: a result set with duplicate values in the ORDER BY, and an option that hinges on whether the next rank is 2 or 3.
Gotcha 11: JSON column name is case-insensitive, the key is case-sensitive
The VARIANT column name follows SQL rules and ignores case, so src and SRC are the same column. The JSON key inside is case-sensitive, so :Email and :email are different paths. Tell: two paths that differ only in the case of a key after the colon. They are not equivalent.
Gotcha 12: FLATTEN is a table function, used with LATERAL
FLATTEN returns a set of rows, not a scalar value. You join to it, usually with LATERAL, so it reads the VARIANT column from each row to its left. Tell: a query that selects FLATTEN like a normal function in the column list. That form does not produce the per-element rows the question wants.
Week 6 Readiness Checklist
| If you can confidently… | You’re solid on | Otherwise re-skim |
|---|---|---|
| Name the most-expensive node and read a bytes-spilled value | Day 36 | Day 36: Query Profile |
| Choose scale up for one query, scale out for concurrency | Day 37 | Day 37: Workload management |
| Pick Search Optimization for point lookups, QAS for big scans | Day 38 | Day 38: QAS and Search Optimization |
| State the WHERE-over-JOIN clustering rule, and MV edition | Day 39 | Day 39: Clustering and MVs |
| Place all three caches and say what each suspend clears | Day 40 | Day 40: Caching |
| Write a top-N-per-group query with QUALIFY | Day 41 | Day 41: Window functions |
| Explode a JSON array with LATERAL FLATTEN | Day 42 | Day 42: VARIANT and FLATTEN |
| Match Automatic Clustering to Standard, the rest to Enterprise | Day 38, Day 39 | Day 38, Day 39: editions |
Domain 4 questions hinge on the qualifier word. Read it before you read the options. MOST cost-effective rewards the serverless or cache answer over a bigger warehouse. FASTEST for a point lookup points to Search Optimization, not clustering. MINIMUM edition means start at Standard and climb only when the feature forces it. Automatic Clustering stops at Standard. Materialized Views, Search Optimization, and QAS reach Enterprise. Without a subquery in a top-N stem points to QUALIFY. On a Query Profile scenario, find the highest-percentage node first, then check for spilling and poor pruning. Most Domain 4 traps are won by matching the qualifier to the feature, not by recalling syntax.
You’ve covered: reading a Query Profile, workload isolation, and scale up versus out. Then the Query Acceleration Service, Search Optimization, clustering keys, materialized views, the three caches, QUALIFY, and FLATTEN. That is the core of Domain 4.
Take the Week 6 Practice Test (10 mixed Domain 4 questions). Target 8 out of 10. If you score 6 or below, re-read the days flagged in your wrong answers before starting Week 7.
Tomorrow (Day 44): Week 7 begins. We move into Domain 5 with Time Travel, Fail-safe, and Zero-Copy Cloning. You will learn who can recover data and when: Time Travel is yours to query, Fail-safe only Snowflake can reach. Cloning copies an object with no storage cost at first. Privileges do not come with it.
