Introduction to Snowflake Architecture

Snowflake’s architecture is fundamentally different from traditional data warehouses. It uses a hybrid architecture that combines the benefits of both shared-disk and shared-nothing architectures.

Three-Layer Architecture

Snowflake’s architecture consists of three distinct layers:

Database Storage Layer
Query Processing Layer (Virtual Warehouses)
Cloud Services Layer

Figure: Snowflake Three-Layer Architecture. Snowflake separates storage, compute, and services into independent, scalable layers.

1. Database Storage Layer

The storage layer is where all your data lives. Snowflake automatically manages:

Compression: Data is automatically compressed
Organisation: Micro-partitions and metadata
Optimisation: Columnar storage format

Figure: Micro-Partition Structure. Snowflake automatically organises table data into immutable micro-partitions that use a columnar format and carry metadata.

Flashcard (Architecture)

Question: What is the smallest unit of storage in Snowflake?

Answer: The micro-partition. Snowflake automatically divides tables into micro-partitions, each holding 50-500 MB of uncompressed data (less once compressed), without user intervention.

Key Concepts

Micro-Partitions:

Automatically created and sized by Snowflake
Typically 50-500 MB of uncompressed data (smaller on disk once compressed)
Immutable (never updated in place)
Columnar format for efficient scanning

Metadata:

Min/max values per column
Number of distinct values
Additional optimisation statistics
Enables automatic query pruning
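
To make pruning concrete, the following Python sketch models how per-column min/max metadata lets a query skip micro-partitions that cannot contain matching rows. The `MicroPartition` class, the `prune` function, and the sample value ranges are hypothetical illustrations, not Snowflake internals or APIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MicroPartition:
    """Hypothetical stand-in for one column's metadata in a micro-partition."""
    name: str
    min_value: int
    max_value: int


def prune(partitions, low, high):
    """Keep only partitions whose [min, max] range overlaps [low, high]."""
    return [p for p in partitions if p.max_value >= low and p.min_value <= high]


# Assumed sample metadata for an order_id column.
partitions = [
    MicroPartition("mp1", 1, 100),
    MicroPartition("mp2", 101, 200),
    MicroPartition("mp3", 201, 300),
]

# A filter such as WHERE order_id BETWEEN 150 AND 180 only needs mp2.
to_scan = prune(partitions, 150, 180)
print([p.name for p in to_scan])  # prints ['mp2']
```

The decision uses metadata only: no partition data is read to rule out mp1 and mp3, which is why pruning works best when the values in each partition cover a narrow range.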

Flashcard (Architecture)

Question: Are Snowflake micro-partitions mutable or immutable?

Answer: Immutable. When data is modified, Snowflake creates new micro-partitions rather than updating existing ones. This enables features like Time Travel and Zero-Copy Cloning.
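
The link between immutability, Time Travel, and Zero-Copy Cloning can be sketched with a toy Python model. It assumes a table version is nothing more than a list of references to immutable partitions; the `Table` class and its methods are hypothetical and greatly simplified compared with what Snowflake actually does.

```python
class Table:
    """Toy model: each table version is a tuple of references to immutable partitions."""

    def __init__(self, partitions=()):
        self.versions = [tuple(partitions)]  # history of partition lists

    @property
    def current(self):
        return self.versions[-1]

    def rewrite(self, old, new):
        """Model an UPDATE: point to a newly written partition instead of `old`."""
        self.versions.append(tuple(new if p is old else p for p in self.current))

    def clone(self):
        """Model a zero-copy clone: share partition references, copy no data."""
        return Table(self.current)


# Partitions are tuples, so they cannot be changed in place.
p1 = ("row-a", "row-b")
p2 = ("row-c",)

orders = Table([p1, p2])
dev_copy = orders.clone()

p1_new = ("row-a", "row-b-updated")
orders.rewrite(p1, p1_new)

print(orders.current)      # latest version references p1_new and p2
print(orders.versions[0])  # the earlier version is still readable
print(dev_copy.current[0] is p1)  # the clone still shares the original partition
```

Because the old partition is never overwritten, the earlier version remains available for as long as it is retained, and the clone costs no extra storage until one side changes.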

2. Query Processing Layer (Virtual Warehouses)

Virtual Warehouses are independent compute clusters that execute queries.

Virtual Warehouse Characteristics

Elastic: Resize up or down at any time, even while queries are running
Isolated: No resource contention between warehouses
MPP Clusters: Massively Parallel Processing
Auto-suspend/resume: Suspends when idle and resumes on the next query to save credits

Figure: Virtual Warehouse Scaling. Virtual warehouses can scale up (larger size) or scale out (more clusters) independently.

Creating a Virtual Warehouse

-- Create a virtual warehouse
CREATE WAREHOUSE ANALYTICS_WH
  WITH WAREHOUSE_SIZE = 'LARGE'
  AUTO_SUSPEND = 300
  AUTO_RESUME = TRUE
  INITIALLY_SUSPENDED = TRUE;

-- Resize warehouse on-the-fly
ALTER WAREHOUSE ANALYTICS_WH SET WAREHOUSE_SIZE = 'X-LARGE';

-- Start the warehouse
ALTER WAREHOUSE ANALYTICS_WH RESUME;

-- Suspend the warehouse
ALTER WAREHOUSE ANALYTICS_WH SUSPEND;

Flashcard (Virtual Warehouses)

Question: What happens to running queries when you resize a virtual warehouse?

Answer: Running queries continue on the old warehouse size until completion. New queries use the new warehouse size. This ensures no query interruption during resizing.

Warehouse Sizes

Size	Credits/Hour	Notes
X-Small	1	Development, testing
Small	2	Small workloads
Medium	4	General purpose
Large	8	Large datasets
X-Large	16	Very large datasets
2X-Large	32	Massive workloads
3X-Large	64	Extreme workloads
4X-Large	128	Very heavy, compute-intensive workloads
5X-Large	256	Largest-scale workloads
6X-Large	512	Maximum size

Flashcard (Virtual Warehouses)

Question: How does Snowflake charge for compute resources?

Answer: Snowflake charges credits based on warehouse size and time used (per-second billing with a 60-second minimum). A suspended warehouse consumes zero credits.
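
As a rough sketch of this billing rule, the Python snippet below estimates the credits for one continuous run of a warehouse. It assumes a single-cluster warehouse and uses hourly rates for some of the sizes in the table above; `credits_used` is a hypothetical helper, not a Snowflake function.

```python
# Hourly credit rates for some of the sizes in the table above.
CREDITS_PER_HOUR = {
    "X-Small": 1,
    "Small": 2,
    "Medium": 4,
    "Large": 8,
    "X-Large": 16,
    "2X-Large": 32,
    "3X-Large": 64,
    "4X-Large": 128,
}

MINIMUM_BILLED_SECONDS = 60  # each time the warehouse starts, at least 60 s are billed


def credits_used(size, seconds_running):
    """Credits for one continuous run of a single-cluster warehouse."""
    if seconds_running <= 0:
        return 0.0  # a suspended warehouse consumes no credits
    billed_seconds = max(seconds_running, MINIMUM_BILLED_SECONDS)
    return CREDITS_PER_HOUR[size] * billed_seconds / 3600


print(credits_used("Large", 30))      # billed as a full 60 seconds
print(credits_used("Large", 300))     # billed per second after the first minute
print(credits_used("X-Small", 3600))  # prints 1.0
```

One consequence of the 60-second minimum is that a very short AUTO_SUSPEND value can cost more than expected if the warehouse repeatedly resumes for queries that take only a few seconds.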

3. Cloud Services Layer

The Cloud Services Layer is the brain of Snowflake. It coordinates all activities across the platform:

Authentication: User login and security
Infrastructure Management: Provisioning and monitoring of resources
Metadata Management: Object definitions, table statistics, and micro-partition metadata
Query Parsing & Optimisation: Compilation and execution planning
Access Control: Role-based permissions

Figure: Complete Architecture Stack. End-to-end view of how Cloud Services orchestrates the Virtual Warehouse (compute) and Storage layers, including the data flow between them.

Key Services

┌─────────────────────────────────────────┐
│       Cloud Services Layer              │
├─────────────────────────────────────────┤
│  • Authentication & Access Control      │
│  • Query Compilation & Optimisation     │
│  • Transaction Management               │
│  • Metadata Management                  │
│  • Security & Encryption                │
└─────────────────────────────────────────┘
              ↓
┌─────────────────────────────────────────┐
│    Virtual Warehouses (Compute)         │
│  [WH1]  [WH2]  [WH3]  ...  [WHn]        │
└─────────────────────────────────────────┘
              ↓
┌─────────────────────────────────────────┐
│    Database Storage (Data)              │
│  Tables | Stages | Micro-partitions     │
└─────────────────────────────────────────┘

Flashcard (Cloud Services)

Question: Is the Cloud Services Layer charged separately?

Answer: Generally no. Cloud services usage is free up to 10% of daily virtual warehouse credit usage. Only the portion exceeding that 10% is billed.
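
The 10% rule can be illustrated with a small Python calculation. This is a simplified sketch that takes one day's totals as inputs; `billed_cloud_services` is a hypothetical helper, and the credit figures are made-up examples.

```python
def billed_cloud_services(warehouse_credits, cloud_services_credits):
    """Cloud services credits billed for one day under the 10% allowance."""
    allowance = warehouse_credits / 10  # 10% of the day's warehouse usage
    return max(0.0, cloud_services_credits - allowance)


# Day 1: 100 warehouse credits, 8 cloud services credits (under the allowance).
print(billed_cloud_services(100, 8))   # prints 0.0

# Day 2: 100 warehouse credits, 15 cloud services credits (5 over the allowance).
print(billed_cloud_services(100, 15))  # prints 5.0
```

The comparison is made per day, so a day with heavy metadata activity and little warehouse usage can still produce a cloud services charge.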

Data Sharing Architecture

Snowflake enables secure data sharing without copying data:

Live Access: Consumers query provider’s data directly
Zero-Copy: No data movement required
Real-Time: Always up-to-date data
Granular Control: Share specific objects only

Figure: Data Sharing Architecture. Data providers share live data with consumer accounts without physical copying, using metadata and access grants.

Creating a Share

-- Create a share
CREATE SHARE SALES_SHARE;

-- Grant usage on database
GRANT USAGE ON DATABASE SALES_DB TO SHARE SALES_SHARE;

-- Grant usage on schema
GRANT USAGE ON SCHEMA SALES_DB.PUBLIC TO SHARE SALES_SHARE;

-- Grant select on specific table
GRANT SELECT ON TABLE SALES_DB.PUBLIC.ORDERS TO SHARE SALES_SHARE;

-- Add account to share
ALTER SHARE SALES_SHARE ADD ACCOUNTS = xy12345;

Multi-Cluster Warehouses

For high-concurrency scenarios, Snowflake offers multi-cluster warehouses (Enterprise Edition or higher):

Maximized Mode: MIN_CLUSTER_COUNT equals MAX_CLUSTER_COUNT, so all clusters run whenever the warehouse is running
Auto-scale Mode: MIN_CLUSTER_COUNT is lower than MAX_CLUSTER_COUNT, so clusters start and stop based on load
Concurrent Query Handling: Queries are distributed across the running clusters

Multi-Cluster Warehouse

CREATE WAREHOUSE PRODUCTION_WH
  WITH WAREHOUSE_SIZE = 'LARGE'
  MIN_CLUSTER_COUNT = 1
  MAX_CLUSTER_COUNT = 5
  SCALING_POLICY = 'STANDARD'
  AUTO_SUSPEND = 300
  AUTO_RESUME = TRUE;
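
To see what this configuration can cost while running, the Python sketch below computes the lower and upper bound on hourly credit consumption. It assumes each running cluster consumes credits at the hourly rate of the warehouse size (8 credits per hour for Large, per the table above); `hourly_credit_range` is a hypothetical helper.

```python
def hourly_credit_range(credits_per_hour, min_clusters, max_clusters):
    """Return (lowest, highest) credits per hour while the warehouse is running."""
    if not 1 <= min_clusters <= max_clusters:
        raise ValueError("expected 1 <= min_clusters <= max_clusters")
    return credits_per_hour * min_clusters, credits_per_hour * max_clusters


# PRODUCTION_WH: Large (8 credits/hour per cluster), 1 to 5 clusters.
low, high = hourly_credit_range(8, 1, 5)
print(low, high)  # prints 8 40
```

In Maximized mode the two bounds are equal, so the warehouse always runs at the upper figure; in Auto-scale mode the actual consumption moves between the bounds as clusters start and stop.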

Flashcard (Multi-Cluster)

Question: What's the difference between scaling UP and scaling OUT in Snowflake?

Answer: Scaling UP = Increasing warehouse size (more resources per query). Scaling OUT = Adding more clusters (handles more concurrent queries). Use scaling UP for large queries, scaling OUT for high concurrency.

Key Architecture Benefits

Separation of Storage and Compute
- Scale independently
- Pay for what you use
- Multiple warehouses access same data
Automatic Optimisation
- No indexes to manage
- No partitioning required
- Automatic statistics
Data Sharing Without Copying
- Secure and controlled
- Real-time updates
- No ETL required
Zero Maintenance
- No infrastructure management
- Automatic updates
- Built-in disaster recovery

Figure: Architecture Benefits Summary. Comparison of traditional data warehouses with the advantages of Snowflake's architecture.

Practice Questions

Flashcard (Exam Prep)

Question: True or False: In Snowflake, you need to define partition keys for tables to optimise query performance.

Answer: FALSE. Snowflake automatically creates micro-partitions and maintains their metadata, so manual partitioning is not required. For very large tables, an optional clustering key can be defined to improve pruning.

Flashcard (Exam Prep)

Question: What happens to data when you drop a virtual warehouse?

Answer: Nothing. Data is stored independently in the storage layer. Dropping a warehouse only removes compute resources, not data.

Flashcard (Exam Prep)

Question: Can two different virtual warehouses query the same table simultaneously?

Answer: YES. Multiple warehouses can access the same data concurrently without any contention, thanks to the separation of storage and compute.

Flashcard (Exam Prep)

Question: Which layer is responsible for query optimisation and execution planning?

Answer: The Cloud Services Layer handles query compilation, optimisation, and execution planning. Virtual Warehouses execute the queries based on these plans.

Flashcard (Exam Prep)

Question: What is the maximum Time Travel retention period in Snowflake Enterprise Edition?

Answer: 90 days for permanent tables (requires Enterprise Edition or higher). Standard Edition supports up to 1 day.

Additional Resources

Official Snowflake Documentation

