Skip to main content
πŸ“–

Key Terms β€” Query Performance

QP

Query Profile

Snowsight's visual execution plan for a completed query β€” shows operator tree, timing, pruning stats, and spilling information.

β€”

Micro-Partition Pruning

Snowflake's automatic optimisation that skips micro-partitions whose metadata (min/max) indicates they cannot match the query's WHERE clause.

β€”

Spilling

When query intermediate data exceeds warehouse memory and overflows to local SSD, then to remote cloud storage β€” a key performance warning sign.

β€”

Result Cache

A 24-hour Cloud Services cache that returns identical query results instantly with zero warehouse credits.

β€”

EXPLAIN

SQL command that shows the logical execution plan without actually running the query β€” no credits consumed.

β€”

Scale Up

Increasing warehouse size to provide more memory/compute per query β€” fixes spilling.

β€”

Scale Out

Adding multi-cluster capacity to handle more concurrent queries β€” fixes queueing.


Query Lifecycle

Every SQL statement in Snowflake follows the same execution path through three layers:

Query Execution Flow

Flow diagram: 1) User submits SQL. 2) Cloud Services Layer parses, optimises, and creates execution plan. 3) Virtual Warehouse executes the plan against micro-partitions in cloud storage. 4) Results returned via Cloud Services Layer. Each step labelled with what happens: parse/compile, optimise/prune metadata, distribute to compute nodes, scan/join/aggregate, return result set.

Flow diagram showing SQL submission through Cloud Services parsing and optimisation, then execution on Virtual Warehouse compute nodes, then results returned
  1. Cloud Services β€” parses SQL, resolves objects, checks privileges, generates optimised execution plan, determines which micro-partitions to scan
  2. Virtual Warehouse β€” executes the plan: scans micro-partitions, applies filters, performs joins, aggregations, and sorts
  3. Cloud Services β€” returns results to client, caches result for potential reuse
ℹ️No Indexes, No VACUUM, No Manual Tuning

Unlike traditional databases, Snowflake has no indexes to create, no statistics to update, and no VACUUM to run. The query optimiser uses micro-partition metadata (min/max, NDV, null counts) automatically. Your main levers for performance are: warehouse sizing, clustering keys, and writing efficient SQL.


The Query Profile

The Query Profile in Snowsight is your most important diagnostic tool. It shows the actual execution plan as a tree of operator nodes.

Query Profile Operator Tree

Tree diagram of a typical Query Profile: Result node at top, below it an Aggregate node, feeding into a Join node with two branches β€” left branch shows TableScan on ORDERS with Filter, right branch shows TableScan on CUSTOMERS. Each node shows percentage of total time, rows processed, bytes scanned. The Join node consumes 45% of time. The TableScan shows partitions scanned 120 of 5000 total, indicating good pruning.

Tree diagram of Query Profile nodes: Result, Aggregate, Join, Filter, TableScan with timing statistics

Key Operator Nodes

NodePurpose
TableScanReads micro-partitions from storage β€” shows pruning stats
FilterApplies WHERE clause predicates
JoinCombines data from multiple tables (hash join, merge join)
AggregateGROUP BY, COUNT, SUM operations
SortORDER BY and merge join preparation
ExchangeRedistributes data between compute nodes β€” often the most expensive

What to Look For

  • Partitions scanned vs total β€” Is pruning effective?
  • Bytes spilled to local/remote storage β€” Is the warehouse too small?
  • Percentage bars β€” Which operators consume the most time?
  • Exploding joins β€” Does a join produce far more rows than its inputs?
Viewing Query Profile Programmatically
1-- Get the last query's ID
2SELECT LAST_QUERY_ID();
3
4-- Retrieve operator stats for any query
5SELECT *
6FROM TABLE(GET_QUERY_OPERATOR_STATS(LAST_QUERY_ID()));

Micro-Partition Pruning

Pruning is Snowflake’s primary query optimisation. Every micro-partition stores column-level metadata (min/max values). When your WHERE clause specifies a range or equality predicate, Snowflake compares the filter against each partition’s metadata and skips partitions that cannot contain matching rows.

Micro-Partition Pruning in Action

Eight micro-partitions labelled MP1-MP8 with date ranges. A WHERE clause for dates in February prunes 6 partitions (greyed out) and scans only 2 partitions (highlighted green). Shows: Partitions scanned 2 of 8 total β€” 75% pruning efficiency.

Diagram showing eight micro-partitions with date ranges, WHERE clause pruning six and scanning two
⚠️Pruning Killer: Functions on WHERE Columns

Applying a function to a column in WHERE disables pruning on that column. WHERE YEAR(order_date) = 2024 prevents Snowflake from using min/max metadata. Rewrite as WHERE order_date >= '2024-01-01' AND order_date < '2025-01-01' to preserve pruning.

Checking Pruning Effectiveness
1-- Run a filtered query and check Query Profile for partition stats
2SELECT order_id, customer_id, order_date, total_amount
3FROM sales.orders
4WHERE order_date BETWEEN '2024-06-01' AND '2024-06-30'
5AND region = 'EMEA';
6
7-- In Query Profile, click TableScan node:
8-- Partitions scanned: 150 / Partitions total: 12,000
9-- = 98.75% of partitions pruned

Spilling to Disk

When a query’s intermediate data exceeds warehouse memory, Snowflake spills data through a hierarchy:

Spilling Hierarchy

Three-tier pyramid: Top (green) = Memory/RAM (fastest, all processing ideally here). Middle (amber) = Local SSD (slower, data spills when memory exceeded). Bottom (red) = Remote Cloud Storage S3/Blob/GCS (slowest, severe performance degradation). Arrows point downward labelled Overflow.

Pyramid showing Memory at top, Local SSD in middle, Remote Storage at bottom
πŸ”‘Fixing Spilling: Scale Up the Warehouse

The primary fix for spilling is to increase warehouse size (e.g., Medium to Large). Each size doubles memory and compute. If spilling persists, also optimise the query: reduce data volume with better filters, avoid SELECT *, or break into smaller steps with temporary tables.

Finding Queries That Spill
1SELECT query_id, query_text, warehouse_name, warehouse_size,
2 execution_time / 1000 AS exec_seconds,
3 bytes_spilled_to_local_storage,
4 bytes_spilled_to_remote_storage
5FROM snowflake.account_usage.query_history
6WHERE bytes_spilled_to_remote_storage > 0
7ORDER BY bytes_spilled_to_remote_storage DESC
8LIMIT 20;

Query History

INFORMATION_SCHEMA vs ACCOUNT_USAGE Query History

Feature
INFORMATION_SCHEMA.QUERY_HISTORY
ACCOUNT_USAGE.QUERY_HISTORY
Retention
14 days
βœ“365 days
Latency
βœ“Real-time (no delay)
Up to 45 minutes
Scope
Current database only
βœ“Entire account
Dropped objects
Not available after drop
βœ“Available for full retention
Privileges
βœ“Any role with database access
IMPORTED PRIVILEGES on SNOWFLAKE db
🎯Exam Focus: 14 Days vs 365 Days

The exam tests the difference: INFORMATION_SCHEMA = 14 days, real-time, database-scoped. ACCOUNT_USAGE = 365 days, 45-minute latency, full account.


EXPLAIN Command

Using EXPLAIN
1-- Logical plan without execution (zero credits)
2EXPLAIN USING TABULAR
3SELECT c.name, SUM(o.total) AS total_spend
4FROM customers c
5JOIN orders o ON c.id = o.customer_id
6WHERE o.order_date >= '2024-01-01'
7GROUP BY c.name
8ORDER BY total_spend DESC;
9
10-- EXPLAIN shows operators and expected rows
11-- Query Profile shows ACTUAL execution after completion

Warehouse Sizing Strategy

Warehouse Sizing Decision Process

1
Identify the Problem

Check Query Profile for spilling, poor pruning, or queue times. Examine top 10 slowest queries via ACCOUNT_USAGE.

πŸ’‘Look for bytes_spilled_to_remote_storage > 0 first β€” that's the biggest red flag.
2
Spilling? Scale Up

Increase warehouse size (Medium to Large, Large to X-Large). Each step doubles memory and compute.

SQL
ALTER WAREHOUSE analytics_wh SET WAREHOUSE_SIZE = 'X-LARGE';
3
Queueing? Scale Out

Enable multi-cluster warehouses (Enterprise+). Adds clusters for concurrent queries.

SQL
ALTER WAREHOUSE analytics_wh SET
  MAX_CLUSTER_COUNT = 3
  SCALING_POLICY = 'STANDARD';
4
Over-provisioned? Scale Down

If queries finish in seconds with no spilling, use a smaller warehouse to save credits.

SQL
ALTER WAREHOUSE dashboard_wh SET WAREHOUSE_SIZE = 'X-SMALL';

Scale Up vs Scale Out

Scale Up vs Scale Out

Feature
Scale Up (Larger Warehouse)
Scale Out (Multi-Cluster)
Problem solved
Slow individual queries, spilling
Too many concurrent queries queueing
Effect on single query
βœ“Faster β€” more memory/compute per query
No change β€” same per-query resources
Effect on concurrency
Minimal improvement
βœ“Significant β€” queries distributed across clusters
Cost impact
Doubles credits per size step
Linear with clusters added
Best for
ETL, large aggregations, heavy joins
Dashboards, BI, many concurrent users

Common Anti-Patterns

⚠️Performance Anti-Patterns
  1. SELECT * on large tables β€” reads all columns, Snowflake is columnar β€” specify only needed columns
  2. Functions on WHERE columns β€” disables pruning: WHERE YEAR(date) = 2024
  3. Unnecessary DISTINCT β€” forces expensive sort/aggregate
  4. Cartesian joins β€” missing join conditions
  5. Wrong data types β€” VARCHAR compared to NUMBER forces implicit casting
  6. Overly complex CTEs β€” deeply nested CTEs can confuse the optimiser

Cheat Sheet

πŸ“‹
Quick Reference

Query Performance Quick Reference

πŸ”
Query Profile
Access
Snowsight β†’ Query History β†’ select query β†’ Query Profile
Pruning
TableScan node: partitions scanned vs total
Spilling
Sort/Join nodes: bytes spilled to local/remote
⚑
Warehouse Sizing
Spilling
Scale UP β€” increase warehouse size
Queueing
Scale OUT β€” multi-cluster (Enterprise+)
Result Cache
Same SQL + unchanged data = free, 24h TTLβ€” No warehouse needed
πŸ“Š
Query History
INFO_SCHEMA
14 days, real-time, database scope
ACCOUNT_USAGE
365 days, 45-min latency, full account

Practice Quiz

Query Performance

A Sort operator is spilling 50 GB to remote storage. What is the MOST appropriate fix?

Query Performance

A user runs the same SELECT twice in 10 minutes. Data unchanged. Warehouse is suspended. What happens on the second run?

Query Performance

A query filters WHERE YEAR(created_at) = 2024. Only 200 of 10,000 partitions are pruned. What is the best fix?


Flashcards

Query Performance
QUESTION

What are the two spilling locations and which is worse?

Click to reveal answer
ANSWER

First: local SSD (attached to warehouse nodes). Second: remote cloud storage (S3/Blob/GCS). Remote spilling is much slower and indicates the warehouse should be scaled up immediately.

Click to see question
Query Performance
QUESTION

What three conditions must be met for the result cache to fire?

Click to reveal answer
ANSWER

1) Exact same SQL text. 2) Underlying data unchanged since caching. 3) Same role and context (database, schema). Cache lasts 24 hours. No warehouse needed β€” served from Cloud Services.

Click to see question
Query Performance
QUESTION

INFORMATION_SCHEMA vs ACCOUNT_USAGE: retention and latency?

Click to reveal answer
ANSWER

INFORMATION_SCHEMA: 14 days, real-time, current database. ACCOUNT_USAGE: 365 days, up to 45-minute latency, full account scope.

Click to see question
Query Performance
QUESTION

How does EXPLAIN differ from Query Profile?

Click to reveal answer
ANSWER

EXPLAIN shows the logical plan BEFORE execution β€” zero credits. Query Profile shows ACTUAL execution AFTER completion β€” real timing, data volumes, and spilling.

Click to see question
Query Performance
QUESTION

When should you scale up vs scale out?

Click to reveal answer
ANSWER

Scale UP: individual queries are slow or spilling (more memory per query). Scale OUT (multi-cluster): many concurrent queries queueing (more clusters for concurrency).

Click to see question

Resources


Next Steps

Reinforce what you just read

Study the All flashcards with spaced repetition to lock it in.

Study flashcards β†’