Multi-Cluster Warehouses & Scaling Policies
Day 8 covered warehouse size. Day 9 covers cluster count. Resizing fixes one slow query. A multi-cluster warehouse fixes many queries queuing at the same time. The COF-C03 tests this distinction repeatedly. Three facts carry most of the marks: scale UP for complex queries, scale OUT for concurrency, and Enterprise Edition is the minimum for multi-cluster. STANDARD versus ECONOMY scaling is the fourth.
| Term | Plain meaning |
|---|---|
| Cluster | One MPP compute group inside a warehouse. A single-cluster warehouse has one. A multi-cluster warehouse can run several in parallel. |
| Scale UP | Make the warehouse bigger (XS → M → L). More compute per query. Fixes slow individual queries. |
| Scale OUT | Add more clusters of the same size. More queries running side by side. Fixes queuing under high concurrency. |
| Auto-scale mode | MIN_CLUSTER_COUNT < MAX_CLUSTER_COUNT. Snowflake adds and removes clusters as load changes. |
| Maximized mode | MIN_CLUSTER_COUNT == MAX_CLUSTER_COUNT, with both greater than 1. All clusters always run. No scaling decision. Used when you want guaranteed concurrency. |
| Scaling policy | Rule that controls how fast Snowflake spins clusters up and down in auto-scale mode. Two choices: STANDARD or ECONOMY. |
| Queuing | Queries waiting because the warehouse is out of resources. Visible in the QUEUED_OVERLOAD_TIME column of query history. |
Today’s Concept
Micro-Concept 1: Scale UP vs Scale OUT
One warehouse, two completely different problems. The decision tree below is the single most useful thing on this page.
| Symptom | Real problem | Fix |
|---|---|---|
| One complex query takes too long, even running alone | Compute complexity | Scale UP (resize XS → M → L) |
| Many queries queue up; each query alone is fast | Concurrency | Scale OUT (add clusters) |
| Mix of both: single query slow AND queues piling up | Both | Scale UP and scale OUT (resize larger and add clusters) |
Resizing does not fix concurrency. A bigger warehouse runs each query faster. It still processes the workload with one cluster’s worth of capacity. When 200 BI users hit dashboards at the same time, what helps is more clusters, not a bigger one. This is the most common trap on this topic, and Snowflake’s docs say it plainly: warehouse resizing is not intended for handling concurrency issues.
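As a rough sketch of how the two fixes look in SQL, assume a hypothetical warehouse named reporting_wh (it is not part of this course's lab) and a session with a current database set, so that INFORMATION_SCHEMA resolves. The first query is one possible way to tell the two symptoms apart. What counts as "high" or "low" in its output is a judgment call, not a documented threshold.

```sql
-- Diagnose: is time going to execution or to waiting in the queue?
-- reporting_wh is a hypothetical warehouse name used for illustration.
SELECT
    COUNT(*)                          AS queries,
    AVG(EXECUTION_TIME) / 1000        AS avg_exec_sec,
    AVG(QUEUED_OVERLOAD_TIME) / 1000  AS avg_queued_sec
FROM TABLE(INFORMATION_SCHEMA.QUERY_HISTORY_BY_WAREHOUSE(
    WAREHOUSE_NAME => 'REPORTING_WH',
    RESULT_LIMIT   => 1000));

-- High avg_exec_sec with little queuing: scale UP (resize).
ALTER WAREHOUSE reporting_wh SET WAREHOUSE_SIZE = 'LARGE';

-- Low avg_exec_sec with heavy queuing: scale OUT (add clusters).
-- Requires Enterprise Edition or higher.
ALTER WAREHOUSE reporting_wh SET
    MIN_CLUSTER_COUNT = 1
    MAX_CLUSTER_COUNT = 4;
```

In practice you would run only one of the two ALTER statements, or both for the mixed case in the last table row.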
Micro-Concept 2: Edition Gate, Enterprise+ Only
Multi-cluster warehouses require Enterprise Edition or higher. On Standard Edition, MAX_CLUSTER_COUNT is forced to 1. You can run as many warehouses as you like, but each one is single-cluster.
This is one of the most-tested edition-boundary facts on the COF-C03. When a question mentions a Standard Edition account with a concurrency problem, “enable multi-cluster” is never the right answer. The right answer is usually “upgrade the edition” or “create additional warehouses to distribute the load.”
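A minimal sketch of the "additional warehouses" workaround on Standard Edition follows. The warehouse names (bi_wh_sales, bi_wh_finance) and role names (sales_analyst, finance_analyst) are invented for illustration, and the roles are assumed to exist already. The idea is only that each team gets its own single-cluster warehouse, so the two groups no longer queue behind each other.

```sql
-- Standard Edition: one cluster per warehouse, so split the users instead.
-- All object names below are hypothetical.
CREATE WAREHOUSE IF NOT EXISTS bi_wh_sales WITH
    WAREHOUSE_SIZE      = 'MEDIUM'
    AUTO_SUSPEND        = 300      -- seconds
    AUTO_RESUME         = TRUE
    INITIALLY_SUSPENDED = TRUE;

CREATE WAREHOUSE IF NOT EXISTS bi_wh_finance WITH
    WAREHOUSE_SIZE      = 'MEDIUM'
    AUTO_SUSPEND        = 300
    AUTO_RESUME         = TRUE
    INITIALLY_SUSPENDED = TRUE;

-- Route each team to its own warehouse (assumes these roles exist).
GRANT USAGE ON WAREHOUSE bi_wh_sales   TO ROLE sales_analyst;
GRANT USAGE ON WAREHOUSE bi_wh_finance TO ROLE finance_analyst;
```

Unlike a multi-cluster warehouse, this split is static: Snowflake does not move queries between the two warehouses when one of them is busy.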
Micro-Concept 3: Maximized Mode vs Auto-Scale Mode
The mode of a multi-cluster warehouse is decided by one thing: whether MIN equals MAX.
| Mode | Setting | Behavior | When to use |
|---|---|---|---|
| Auto-scale | MIN < MAX (e.g., MIN=1, MAX=5) | Snowflake starts and stops clusters based on load. Scaling policy controls aggressiveness. | Default for variable workloads. Most common. |
| Maximized | MIN == MAX (e.g., MIN=3, MAX=3) | All clusters always run. No scaling decision needed. Scaling policy is ignored. | Guaranteed concurrency for predictable peak load (e.g., always 200+ concurrent BI users). |
Snowflake’s documentation recommends starting in auto-scale with MIN=1. Move to maximized only when there is a clear reason: high availability or predictably high baseline concurrency. On the exam, maximized mode is the exception, not the default. Treat any answer that says “maximized is the recommended default” as wrong.
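One way to check which mode each existing warehouse is in is to classify the output of SHOW WAREHOUSES. This is a sketch, not an official utility. It assumes both statements run back to back in the same session, so that LAST_QUERY_ID() points at the SHOW command, and it assumes an Enterprise Edition or higher account, where the cluster-count columns appear in the output.

```sql
SHOW WAREHOUSES;

-- SHOW output columns are lowercase, so they must be double-quoted.
SELECT
    "name",
    "min_cluster_count",
    "max_cluster_count",
    CASE
        WHEN "max_cluster_count" = 1                   THEN 'single-cluster'
        WHEN "min_cluster_count" = "max_cluster_count" THEN 'maximized'
        ELSE 'auto-scale'
    END AS mode
FROM TABLE(RESULT_SCAN(LAST_QUERY_ID()));
```

The CASE order matters: MIN = MAX = 1 is checked first, so a plain single-cluster warehouse is not mislabelled as maximized.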
Micro-Concept 4: The Two Scaling Policies
Scaling policy applies only in auto-scale mode. In maximized mode it is silently ignored. The policy controls how fast Snowflake reacts to load.
| Policy | Spin-up trigger | Spin-down trigger | Tradeoff |
|---|---|---|---|
| STANDARD (default) | ~20 seconds of sustained queuing adds a cluster | ~2–3 consecutive minutes of low utilization shuts down a cluster | Performance-first. Minimizes queuing. More credits during brief spikes. |
| ECONOMY | Adds a cluster only when Snowflake estimates there is enough load to keep it busy for at least ~6 minutes | ~5–6 consecutive minutes of low utilization shuts down a cluster. Wants to keep running clusters fully loaded. | Cost-first. Some queuing accepted. Cheaper for spiky workloads. |
Quick picker:
- BI dashboards, customer-facing analytics, interactive queries → STANDARD
- ETL/ELT, scheduled batches, background loads → ECONOMY
Practice banks frequently include “ECONOMY adds clusters faster than STANDARD.” It is wrong. The word “economy” makes people assume “fast and cheap,” but the policy is conservative. It deliberately waits about 6 minutes of sustained load before provisioning a new cluster, so the cluster will stay busy once running. STANDARD is the aggressive one. The direction to remember: STANDARD = fast, ECONOMY = patient.
Micro-Concept 5: Auto-Suspend + Auto-Resume in Multi-Cluster
Both settings apply at the warehouse level, not at the individual cluster level. Two rules from the documentation are exam-relevant:
- Auto-suspend triggers only when the warehouse is down to its minimum cluster count and idle for the configured time. Excess clusters above the minimum shut down first, under the scaling policy. Suspension happens after the warehouse hits MIN and then sits idle.
- Auto-resume applies only when the entire warehouse is suspended. That is, when no clusters are running at all.
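To make the warehouse-level scope concrete, here is a sketch of a multi-cluster warehouse created with both settings. The name demo_mc_wh and the 300-second idle period are arbitrary choices for illustration, not recommendations from the documentation.

```sql
-- demo_mc_wh is a hypothetical warehouse; requires Enterprise Edition or higher.
CREATE WAREHOUSE IF NOT EXISTS demo_mc_wh WITH
    WAREHOUSE_SIZE      = 'XSMALL'
    MIN_CLUSTER_COUNT   = 1
    MAX_CLUSTER_COUNT   = 3
    SCALING_POLICY      = 'STANDARD'
    AUTO_SUSPEND        = 300     -- seconds idle at MIN clusters before the whole warehouse suspends
    AUTO_RESUME         = TRUE    -- applies only when the entire warehouse is suspended
    INITIALLY_SUSPENDED = TRUE;   -- no credits are used until the first query arrives

-- Remove the demo warehouse when finished.
DROP WAREHOUSE IF EXISTS demo_mc_wh;
```

There is one AUTO_SUSPEND value and one AUTO_RESUME value for the whole warehouse. Clusters 2 and 3 come and go under the scaling policy instead.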
Micro-Concept 6: Credit Cost = Size × Active Clusters × Time
Each active cluster bills at the warehouse’s per-size credit rate. A Medium warehouse costs 4 credits per hour per cluster. With three clusters active, that is 12 credits per hour, but only for the time all three are running. The 60-second minimum applies per cluster start. Spin up a new cluster for one query and shut it down 10 seconds later. You have still paid for a full minute of that cluster.
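The arithmetic above can be checked with a small self-contained query. This is a sketch under assumed inputs: the three cluster runtimes are invented for illustration, the rate of 4 credits per hour is the Medium rate quoted above, and billing is assumed to be per second once the 60-second minimum has been met.

```sql
-- Hypothetical scenario: a Medium warehouse whose three clusters ran
-- for 1 hour, 30 minutes, and 10 seconds respectively.
SELECT
    cluster_id,
    runtime_sec,
    GREATEST(runtime_sec, 60)                      AS billed_sec,  -- 60-second minimum per start
    ROUND(4 * GREATEST(runtime_sec, 60) / 3600, 4) AS credits      -- 4 credits/hour for Medium
FROM (VALUES
        (1, 3600),
        (2, 1800),
        (3, 10)
     ) AS t (cluster_id, runtime_sec);
```

The three rows come to about 4, 2, and 0.067 credits. The 10-second cluster is billed for 60 seconds, which is the per-start minimum described above.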
Cost-control levers, in priority order. Right-size the warehouse first. Set a sensible MAX to cap the worst case. Prefer auto-scale over maximized. Pick ECONOMY where the workload can tolerate queuing. Tune auto-suspend last. The order matters. Many teams start with auto-suspend and miss the bigger wins above it.
Micro-Concept 7: What MIN > 1 Buys You
Setting MIN higher than 1 gives you high availability. If one cluster fails, the others keep serving queries with no gap. This is the right call for mission-critical production warehouses where any downtime is unacceptable. The cost is real, though. You are paying for at least MIN clusters whenever the warehouse is running, whether the load needs them or not.
Cheat Sheet
| Concept | What to remember |
|---|---|
| Edition gate | Multi-cluster = Enterprise+. Standard Edition forces MAX=1. |
| Scale UP | Resize. Fixes single-query complexity. |
| Scale OUT | Add clusters. Fixes concurrency and queuing. |
| Auto-scale mode | MIN < MAX. Clusters spin up and down by policy. |
| Maximized mode | MIN == MAX. All clusters always on. Policy ignored. |
| STANDARD policy | ~20 sec queue triggers spin-up. Performance-first. Default. |
| ECONOMY policy | ~6 min sustained load triggers spin-up. Cost-first. Tolerates queuing. |
| MAX cluster ceiling | Snowsight UI caps at 10. SQL allows higher. |
| MIN > 1 | For high availability or guaranteed concurrency. |
| Auto-suspend | Triggers only at MIN cluster count plus idle period. |
| Resize while running | Allowed. New compute joins, old drains. No disruption. |
Three traps to recognize before the question even finishes loading:
- “Resize bigger to fix concurrency.” False, and Snowflake’s documentation says so directly. Resize fixes per-query complexity. Fifty BI users queuing on a Medium warehouse will still queue on a 4XL with one cluster. What they need is multi-cluster.
- “ECONOMY adds clusters faster to save credits.” Backwards. ECONOMY is the conservative policy. It waits roughly six minutes of sustained load before adding a cluster, on purpose, so the new cluster will stay busy. STANDARD is the aggressive one, at roughly a 20 second queue trigger.
- “Multi-cluster works on Standard Edition.” False. Enterprise Edition is the minimum. When the question describes a Standard Edition account with concurrency pain, the right answer is upgrade the edition or split the workload across additional warehouses. “Enable multi-cluster” is never correct in that scenario.
I have seen all three of these wordings on practice banks in the last six months.
Hands-On Lab
Prerequisite: lab_xs warehouse present.
Convert lab_xs into a multi-cluster warehouse with auto-scale and the STANDARD policy.
USE ROLE SYSADMIN;
ALTER WAREHOUSE lab_xs SET
MIN_CLUSTER_COUNT = 1,
MAX_CLUSTER_COUNT = 3,
SCALING_POLICY = 'STANDARD';
SHOW WAREHOUSES LIKE 'lab_xs';
-- inspect min_cluster_count, max_cluster_count, scaling_policy
Trigger queuing by running concurrent heavy queries. Open 4 or 5 Snowsight worksheets, paste the same query into each one, and run them at the same time.
-- Run this in 4+ worksheets at the same time
USE WAREHOUSE lab_xs;
SELECT
L_SHIPMODE,
COUNT(*),
AVG(L_QUANTITY),
SUM(L_EXTENDEDPRICE)
FROM SNOWFLAKE_SAMPLE_DATA.TPCH_SF10.LINEITEM
GROUP BY L_SHIPMODE;
Inspect queuing in query history.
SELECT
QUERY_ID,
WAREHOUSE_NAME,
WAREHOUSE_SIZE,
CLUSTER_NUMBER,
QUEUED_PROVISIONING_TIME / 1000 AS queued_prov_sec,
QUEUED_OVERLOAD_TIME / 1000 AS queued_overload_sec,
TOTAL_ELAPSED_TIME / 1000 AS total_sec
FROM TABLE(INFORMATION_SCHEMA.QUERY_HISTORY())
WHERE WAREHOUSE_NAME = 'LAB_XS'
ORDER BY START_TIME DESC
LIMIT 20;
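Per-query history can be complemented with a warehouse-level view of load. The sketch below uses the WAREHOUSE_LOAD_HISTORY table function over the last hour. It assumes the current role can see the data: as far as I know, this function returns rows only for ACCOUNTADMIN or a role granted the MONITOR USAGE global privilege, so SYSADMIN may get an empty result.

```sql
-- Average running vs queued load for lab_xs over the last hour.
SELECT
    START_TIME,
    AVG_RUNNING,              -- queries executing
    AVG_QUEUED_LOAD,          -- queries queued because the warehouse was overloaded
    AVG_QUEUED_PROVISIONING   -- queries queued while compute was being provisioned
FROM TABLE(INFORMATION_SCHEMA.WAREHOUSE_LOAD_HISTORY(
    DATE_RANGE_START => DATEADD('hour', -1, CURRENT_TIMESTAMP()),
    WAREHOUSE_NAME   => 'LAB_XS'))
ORDER BY START_TIME DESC;
```

A nonzero AVG_QUEUED_LOAD during the concurrent run is the same signal as QUEUED_OVERLOAD_TIME in the per-query view, aggregated over time intervals.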
Switch to ECONOMY and watch the scale-up slow down.
ALTER WAREHOUSE lab_xs SET SCALING_POLICY = 'ECONOMY';
-- Re-run the concurrent test (the same query in 4+ worksheets at once).
-- Notice clusters take much longer to start (~6 min sustained load).
-- Some queries queue rather than triggering new clusters.
Try maximized mode. Set MIN = MAX so all clusters run continuously.
ALTER WAREHOUSE lab_xs SET
MIN_CLUSTER_COUNT = 2,
MAX_CLUSTER_COUNT = 2;
-- Now in maximized mode. SCALING_POLICY no longer applies.
SHOW WAREHOUSES LIKE 'lab_xs';
-- Note both clusters running even with no queries
Cleanup. Restore single-cluster STANDARD.
ALTER WAREHOUSE lab_xs SET
MIN_CLUSTER_COUNT = 1,
MAX_CLUSTER_COUNT = 1,
SCALING_POLICY = 'STANDARD';
ALTER WAREHOUSE lab_xs SUSPEND;
-- KEEP lab_xs — persistent lab warehouse through Day 37
Practice Questions
Options:
A. Resize the warehouse from Medium to 2X-Large
B. Enable multi-cluster with MAX_CLUSTER_COUNT = 4 and SCALING_POLICY = STANDARD
C. Set AUTO_SUSPEND = 30 seconds to free resources faster
D. Disable auto-resume so clusters stay warm
Why B: Each query is fast on its own, so the problem is purely concurrency. Scale OUT with multi-cluster is the documented fix. STANDARD is the right policy for dashboards because latency matters more than credit savings on interactive BI.
Why not A: Resize fixes per-query complexity, not concurrency. A 2XL is still a single cluster, so 50 concurrent users still compete for one cluster’s capacity. Faster queries, same queue. Re-read Day 8 if missed.
Why not C: Auto-suspend does not add capacity. Setting it more aggressively can actually increase cost on busy warehouses, because the 60-second minimum is billed again every time the warehouse resumes.
Why not D: Auto-resume controls whether a suspended warehouse wakes up automatically. It has no effect on queuing under active load.
Options:
A. All editions including Standard
B. Enterprise Edition and higher only
C. Business Critical Edition and higher only
D. Virtual Private Snowflake (VPS) only
Why B: The Snowflake docs are explicit. Multi-cluster warehouses require Enterprise Edition or higher. Standard Edition supports single-cluster warehouses only, with MAX_CLUSTER_COUNT forced to 1.
Why not A: Standard Edition is explicitly excluded. This is the classic “all editions” distractor.
Why not C/D: Available from Enterprise upward. Business Critical and VPS add other features (PrivateLink, HIPAA, dedicated metadata store) but multi-cluster is not gated to those tiers. Re-read Day 3 if missed.
Options:
A. ECONOMY adds clusters faster than STANDARD to keep latency low
B. STANDARD starts new clusters quickly (~20 sec of queuing) to minimize queuing; ECONOMY waits longer (~6 min sustained load) to favor credit savings
C. Both policies behave identically; the choice only affects logging
D. STANDARD is only available in Business Critical Edition
Why B: STANDARD is performance-first. Roughly 20 seconds of sustained queuing triggers a new cluster. ECONOMY is cost-first. It waits roughly 6 minutes of sustained load to confirm the new cluster will stay busy. This is the most common scaling-policy trap on the COF-C03. The wrong answer always sounds like “ECONOMY is fast because it saves money.”
Why not A: Direction reversed. STANDARD is the aggressive one. Anyone who remembers “economy” as “fast and cheap” will fall for this.
Why not C: The policies behave very differently. This is a real provisioning decision, not a logging flag.
Why not D: Both policies are available on Enterprise Edition and higher, the same edition that gates multi-cluster itself.
Options:
A. MIN_CLUSTER_COUNT = 1, MAX_CLUSTER_COUNT = 5
B. MIN_CLUSTER_COUNT = 3, MAX_CLUSTER_COUNT = 3
C. MIN_CLUSTER_COUNT = 2, MAX_CLUSTER_COUNT = 4
D. MIN_CLUSTER_COUNT = 1, MAX_CLUSTER_COUNT = 1
E. MIN_CLUSTER_COUNT = 5, MAX_CLUSTER_COUNT = 5
Why B & E: Maximized mode is defined by MIN equalling MAX, with both values greater than 1 so the warehouse is actually multi-cluster. All configured clusters run continuously and SCALING_POLICY is ignored. Both B (3=3) and E (5=5) match.
Why not A or C: MIN < MAX is auto-scale mode by definition, regardless of the size of the gap.
Why not D: MIN equals MAX, but at one cluster the warehouse is a single-cluster warehouse, not multi-cluster in maximized mode. The “maximized” label only applies to multi-cluster configurations.
Options:
A. STANDARD: scales up fast
B. ECONOMY: conservative cluster provisioning, lower credit consumption
C. Disable multi-cluster entirely
D. Maximized mode with MIN=MAX=5
Why B: Batch ETL with queuing tolerance is the textbook ECONOMY scenario. Waiting roughly 6 minutes before adding a cluster avoids spinning up capacity for transient spikes, which translates directly into fewer credits over the three-hour run.
Why not A: STANDARD over-provisions for batch workloads. It would add clusters for short bursts that the ETL could absorb by queuing for a few minutes.
Why not C: Some scale-out still helps during heavy phases of the ETL. Disabling multi-cluster gives up the upside while not addressing the cost concern.
Why not D: Five clusters running continuously is the maximum-cost configuration for this warehouse, the opposite of what the question asks for.
Today you learned: Multi-cluster warehouses scale OUT for concurrency, many queries at once rather than one big query. Enterprise Edition or higher is required. Auto-scale mode (MIN<MAX) lets Snowflake add and remove clusters automatically. Maximized mode (MIN=MAX) keeps every cluster running and ignores the scaling policy. The two scaling policies set the speed of that response: STANDARD (~20 second queue trigger) for performance, ECONOMY (~6 minutes sustained load) for cost. Resize fixes per-query complexity. Multi-cluster fixes queuing.
Key takeaway: Two scaling dimensions, one decision tree. Slow single query, scale UP. Many queries queuing, scale OUT. Mixed, do both. Edition gate: Enterprise+ for multi-cluster.
Tomorrow (Day 10): Micro-Partitions & Data Clustering. How Snowflake stores data in immutable 50–500 MB columnar chunks, how pruning skips them, and when (and when not) to add a clustering key.