Review Week 6

DAY 43

Week 6 Recap

No new material today. Six days into Domain 4, you have read Query Profiles, separated workloads, weighed Search Optimization against clustering, and unpacked caches, window functions, and FLATTEN. Today is a 30-minute closed-book drill, the recurring Domain 4 traps, and the Week 6 Practice Test. Domain 4 carries a large share of the scored questions. Query Profile reading shows up more than any single feature.

🗣️ How to use today

Step	Time	What you do
1. Closed-book drill	30 min	Run the five drill tasks below. No tabs open. Write SQL on paper or a scratch buffer first.
2. Self-grade	5 min	Open Snowsight, run your SQL, score yourself honestly.
3. Read the gotchas	10 min	Speed-read the patterns below. These are the traps Domain 4 reuses in scenario form.
4. Take the Week 6 Practice Test	15 min	10 mixed Domain 4 questions. Target 8 out of 10. Below that, re-read the day flagged in your wrong answers first.

🛠️

The 30-Minute Drill

Type: CLOSED-BOOK DRILL | Time: ~30 minutes | Credits: <0.05 | Rule: No docs, no Day 36 to 42 tabs. Write the answers first, run them after.

Read a Query Profile from memory. State which node is the most expensive, and what a high bytes-spilled value tells you.

CHECK

Most-expensive operator:
  The node with the HIGHEST percentage of execution time.
  Read that node first. It is where the query spends its time.

Bytes Spilled:
  to LOCAL  -> mild. Data overflowed memory onto local SSD.
  to REMOTE -> severe. The warehouse is too small for this query.
               Fix = size the warehouse UP (a bigger warehouse
               has more memory per query).

Pruning:
  Partitions scanned / Partitions total.
  A LOW ratio means good pruning. A high ratio with a WHERE
  filter present points to a clustering opportunity.

👀 Self-check: Among candidates I have trained, spill-to-remote is the signal most often misread as a slow query or a bug. It is neither. It is the warehouse running out of memory. The fix is a larger size, not a rewrite. Re-read Day 36 if that did not come out.

State scale up versus scale out in one line each, and name the edition gate.

CHECK

Scale UP   : resize to a bigger warehouse.
             Helps ONE large, complex, or spilling query.

Scale OUT  : add clusters with a multi-cluster warehouse.
             Helps CONCURRENCY (many users / queries queuing).
             Requires Enterprise Edition or higher.

Queue signal in ACCOUNT_USAGE / QUERY_HISTORY:
  QUEUED_OVERLOAD_TIME > 0  -> warehouse saturated by concurrency
                               -> scale OUT, not up.

👀 Self-check: The trap is answering “scale up” for a concurrency problem. Concurrency is a scale-out fix. If you wrote “bigger warehouse” for many queued users, re-read Day 37.

List the three caches, where each lives, and what clears each. No peeking.

CHECK

Result cache    : Cloud Services layer. Stores full query results.
                  Free, no warehouse. Reused on EXACT SQL match +
                  unchanged data + your privileges.
                  Held 24 hr, reset on each reuse, up to 31 days max.

Metadata cache  : Cloud Services layer. Answers COUNT(*), MIN, MAX
                  on numeric/date columns with NO warehouse running.

Warehouse cache : Compute layer. Local SSD holding micro-partition
                  data. CLEARED when the warehouse suspends.

👀 Self-check: The two Cloud Services caches (result and metadata) survive a suspend. The SSD warehouse cache does not. Getting that backwards is the most common cache mistake on the exam. Re-read Day 40.

Write a top-1-per-group query using QUALIFY, from memory. Then run it to verify.

SQL

USE WAREHOUSE lab_xs;

-- Highest-value order per customer, no subquery:
SELECT O_CUSTKEY, O_ORDERKEY, O_TOTALPRICE
FROM SNOWFLAKE_SAMPLE_DATA.TPCH_SF1.ORDERS
QUALIFY ROW_NUMBER() OVER (
          PARTITION BY O_CUSTKEY
          ORDER BY O_TOTALPRICE DESC) = 1
LIMIT 50;

👀 Self-check: QUALIFY filters the output of a window function without a subquery. Swap = 1 for <= 3 to get the top three per group. If you reached for a subquery instead, re-read Day 41.

Explode a JSON array into rows with LATERAL FLATTEN, from memory. Then run it.

SQL

-- Self-contained, no table needed:
SELECT
  v:name::STRING       AS person,
  f.value::STRING      AS skill
FROM (SELECT PARSE_JSON(
        '{"name":"Alice","skills":["SQL","Python","Snowpark"]}') AS v),
     LATERAL FLATTEN(input => v:skills) f;

👀 Self-check: One input row becomes three output rows, one per skill. If the colon notation or the LATERAL FLATTEN join did not come out, re-read Day 42. Then suspend the warehouse:

CLEANUP

ALTER WAREHOUSE lab_xs SUSPEND;  -- keep the warehouse, just suspend

🎯

Week 6 Exam Gotchas

These are the Domain 4 patterns the exam reuses in scenario form. Each has a Tell, the phrase the question uses to set the trap. Learn the Tells and you will spot the trap before reading the options.

Gotcha 1: Bytes spilled to remote means the warehouse is too small

Spilling to local SSD is mild. Spilling to remote storage is the warning sign. The query ran out of memory and pushed data to slow remote storage. The fix is a bigger warehouse, not a query rewrite. Tell: a profile showing a non-zero “Bytes spilled to remote storage” value.

Gotcha 2: Concurrency is a scale-out problem, not scale-up

Scaling up (a bigger warehouse) helps one heavy query. Scaling out (multi-cluster) helps many users queuing at once. Multi-cluster warehouses need Enterprise Edition or higher. Tell: “many users hitting the dashboard at peak” or a high QUEUED_OVERLOAD_TIME. The right move is more clusters.

Gotcha 3: The result cache is not permanent

The query result cache holds a result for 24 hours from its last use. Each reuse resets that 24-hour window, up to 31 days from the first run. After 31 days it is purged regardless. Tell: any option claiming the result cache “lasts forever” or “until the data changes only.” Both are wrong.

Gotcha 4: Suspending a warehouse clears only the SSD cache

The warehouse cache lives on local SSD and is dropped on suspend. The result cache and metadata cache live in the Cloud Services layer and survive a suspend. Tell: “all warehouses suspended, the query still returned instantly.” That points to result or metadata cache, never the warehouse cache.

Gotcha 5: Metadata cache answers some aggregates with no warehouse

COUNT(*), and MIN or MAX on numeric and date columns, can be answered from metadata alone. No warehouse runs. The Query Profile shows a single METADATA-BASED RESULT node. Tell: a COUNT(*) returning instantly with every warehouse suspended. MIN or MAX on a string column does not use this path.

Gotcha 6: Cluster on filter columns first, join columns second

A clustering key helps most on the columns in your WHERE filters. Join predicate columns come next. GROUP BY and ORDER BY columns are not the priority. Tell: a multi-select asking which columns to cluster on. The WHERE filter column is almost always one of the answers.

Gotcha 7: Automatic Clustering runs on Standard; the others need Enterprise

Automatic Clustering is available in Standard Edition. Materialized Views, the Search Optimization Service, and the Query Acceleration Service all require Enterprise Edition or higher. Tell: a “minimum edition” question. If the feature is Automatic Clustering, the answer is Standard. The other three climb to Enterprise.

Gotcha 8: Search Optimization is for point lookups, not range scans

The Search Optimization Service speeds up selective lookups: equality, IN lists, substring and LIKE, and geospatial predicates. It builds a persistent search access path. It is not clustering. It does not help broad range scans. Tell: “find a single row by id” or “substring match on a large table.”

Gotcha 9: QUALIFY filters window functions without a subquery

QUALIFY is to window functions what HAVING is to GROUP BY. It filters on the result of ROW_NUMBER, RANK, or any window function in one statement. Tell: “return the top N per group without a subquery.” The answer is QUALIFY with a window function.

Gotcha 10: RANK skips on ties, DENSE_RANK does not

On a tie, RANK leaves a gap (1, 1, 3). DENSE_RANK does not (1, 1, 2). ROW_NUMBER assigns a unique number to every row, breaking ties arbitrarily. Tell: a result set with duplicate values in the ORDER BY, and an option that hinges on whether the next rank is 2 or 3.

Gotcha 11: JSON column name is case-insensitive, the key is case-sensitive

The VARIANT column name follows SQL rules and ignores case, so src and SRC are the same column. The JSON key inside is case-sensitive, so :Email and :email are different paths. Tell: two paths that differ only in the case of a key after the colon. They are not equivalent.

Gotcha 12: FLATTEN is a table function, used with LATERAL

FLATTEN returns a set of rows, not a scalar value. You join to it, usually with LATERAL, so it reads the VARIANT column from each row to its left. Tell: a query that selects FLATTEN like a normal function in the column list. That form does not produce the per-element rows the question wants.

⚡

Week 6 Readiness Checklist

If you can confidently…	You’re solid on	Otherwise re-skim
Name the most-expensive node and read a bytes-spilled value	Day 36	Day 36: Query Profile
Choose scale up for one query, scale out for concurrency	Day 37	Day 37: Workload management
Pick Search Optimization for point lookups, QAS for big scans	Day 38	Day 38: QAS and Search Optimization
State the WHERE-over-JOIN clustering rule, and MV edition	Day 39	Day 39: Clustering and MVs
Place all three caches and say what each suspend clears	Day 40	Day 40: Caching
Write a top-N-per-group query with QUALIFY	Day 41	Day 41: Window functions
Explode a JSON array with LATERAL FLATTEN	Day 42	Day 42: VARIANT and FLATTEN
Match Automatic Clustering to Standard, the rest to Enterprise	Day 38, Day 39	Day 38, Day 39: editions

🎯 Exam Tip

Domain 4 questions hinge on the qualifier word. Read it before you read the options. MOST cost-effective rewards the serverless or cache answer over a bigger warehouse. FASTEST for a point lookup points to Search Optimization, not clustering. MINIMUM edition means start at Standard and climb only when the feature forces it. Automatic Clustering stops at Standard. Materialized Views, Search Optimization, and QAS reach Enterprise. Without a subquery in a top-N stem points to QUALIFY. On a Query Profile scenario, find the highest-percentage node first, then check for spilling and poor pruning. Most Domain 4 traps are won by matching the qualifier to the feature, not by recalling syntax.

📝 Week 6 Wrap

You’ve covered: reading a Query Profile, workload isolation, and scale up versus out. Then the Query Acceleration Service, Search Optimization, clustering keys, materialized views, the three caches, QUALIFY, and FLATTEN. That is the core of Domain 4.

Take the Week 6 Practice Test (10 mixed Domain 4 questions). Target 8 out of 10. If you score 6 or below, re-read the days flagged in your wrong answers before starting Week 7.

Tomorrow (Day 44): Week 7 begins. We move into Domain 5 with Time Travel, Fail-safe, and Zero-Copy Cloning. You will learn who can recover data and when: Time Travel is yours to query, Fail-safe only Snowflake can reach. Cloning copies an object with no storage cost at first. Privileges do not come with it.

← Day 42 All 50 Days Take Practice Test →

About the Author

Abhay Krishnan

Senior Data & AI Consultant

Connect on LinkedIn

With over five years of data engineering experience at EY and Infosys, Abhay Krishnan specializes in building scalable data pipelines and cloud warehousing solutions. He is a certified SnowPro Core professional, alongside credentials in AWS and Azure. Abhay created this 50-day track to solve a problem he faced firsthand: the lack of a structured, free resource for Snowflake certification prep. Follow him on LinkedIn for more data engineering insights.

Week 6 Recap

The 30-Minute Drill

Week 6 Exam Gotchas

Gotcha 1: Bytes spilled to remote means the warehouse is too small

Gotcha 2: Concurrency is a scale-out problem, not scale-up

Gotcha 3: The result cache is not permanent

Gotcha 4: Suspending a warehouse clears only the SSD cache

Gotcha 5: Metadata cache answers some aggregates with no warehouse

Gotcha 6: Cluster on filter columns first, join columns second

Gotcha 7: Automatic Clustering runs on Standard; the others need Enterprise

Gotcha 8: Search Optimization is for point lookups, not range scans

Gotcha 9: QUALIFY filters window functions without a subquery

Gotcha 10: RANK skips on ties, DENSE_RANK does not

Gotcha 11: JSON column name is case-insensitive, the key is case-sensitive

Gotcha 12: FLATTEN is a table function, used with LATERAL

Week 6 Readiness Checklist

Topics