📖

Key Terms — Data Unloading

—

COPY INTO <location>

SQL command to export data from a Snowflake table into files in a stage (internal or external). The reverse of COPY INTO <table>.

—

GET

SQL command to download files from an internal stage to a local machine. Only works with internal stages, not external.

—

SINGLE

COPY INTO option that writes all output to a single file instead of splitting across multiple files.

—

MAX_FILE_SIZE

Controls the maximum size of each output file. Default is 16 MB for most formats.

—

PARTITION BY

Organises unloaded files into a directory structure based on column expressions — useful for partitioned data lakes.

—

OVERWRITE

When TRUE, overwrites existing files in the target location. Default is FALSE.


What is Data Unloading?

Data unloading is the process of exporting data from Snowflake tables into files in a stage. This is the reverse of data loading โ€” instead of COPY INTO a table, you COPY INTO a location.

Unloading Flow

Flow diagram: 1) Snowflake table contains the source data. 2) COPY INTO location command executes on a virtual warehouse. 3) Data is written as files (CSV, JSON, Parquet) to internal stage (@~, @%, @named) or external stage (S3/Blob/GCS). 4) Optionally, GET command downloads from internal stage to local filesystem.

🎯 Exam Focus: Loading vs Unloading

COPY INTO table = loading (files to table). COPY INTO location = unloading (table to files). Both use a virtual warehouse. Snowpipe only supports loading โ€” you cannot use Snowpipe to unload data.
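The contrast can be sketched with two minimal statements; the stage and table names here are illustrative:

```sql
-- Loading: files in a stage -> table
COPY INTO sales.orders
FROM @my_stage/imports/
FILE_FORMAT = (TYPE = CSV);

-- Unloading: table -> files in a stage (arguments reversed)
COPY INTO @my_stage/exports/
FROM sales.orders
FILE_FORMAT = (TYPE = CSV);
```

The only syntactic difference is which side of COPY INTO the stage appears on, which is why the exam likes to test the distinction.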


COPY INTO Location Syntax

Basic Unloading
-- Unload to a named internal stage
COPY INTO @my_stage/export/
FROM sales.orders
FILE_FORMAT = (TYPE = CSV COMPRESSION = GZIP)
HEADER = TRUE;

-- Unload to an external stage (S3)
COPY INTO @my_s3_stage/daily_export/
FROM (
  SELECT order_id, customer_id, total_amount
  FROM sales.orders
  WHERE order_date = CURRENT_DATE()
)
FILE_FORMAT = (TYPE = PARQUET)
HEADER = TRUE;
โ„น๏ธQuery-Based Unloading

You can unload from a full table or from a SELECT query (wrapped in parentheses). This lets you filter, transform, and join data before exporting โ€” no need to create a temporary table first.


Key COPY INTO Options

Important Unloading Options
-- Single file output (default splits into multiple files)
COPY INTO @my_stage/single_export/
FROM sales.orders
SINGLE = TRUE
MAX_FILE_SIZE = 5368709120; -- 5 GB max for a single file

-- Overwrite existing files
COPY INTO @my_stage/export/
FROM sales.orders
OVERWRITE = TRUE;

-- Control file size (default 16 MB)
COPY INTO @my_stage/export/
FROM sales.orders
MAX_FILE_SIZE = 67108864; -- 64 MB per file

-- Include column headers
COPY INTO @my_stage/export/
FROM sales.orders
FILE_FORMAT = (TYPE = CSV)
HEADER = TRUE;
โš ๏ธSINGLE = TRUE Considerations

Writing to a single file is slower because Snowflake cannot parallelise the writes. Use SINGLE only when the consumer requires exactly one file; for large exports, unloading to multiple files is significantly faster.


PARTITION BY

PARTITION BY organises unloaded files into a directory structure based on column expressions โ€” ideal for creating partitioned data lakes.

Partitioned Unloading
-- Partition by month
COPY INTO @my_s3_stage/orders/
FROM (
  SELECT *, TO_VARCHAR(order_date, 'YYYY-MM') AS month
  FROM sales.orders
)
PARTITION BY (month)
FILE_FORMAT = (TYPE = PARQUET)
HEADER = TRUE;
-- Creates: orders/2024-01/data_0_0_0.parquet
--          orders/2024-02/data_0_0_0.parquet

-- Partition by a multi-part path expression
-- (PARTITION BY takes a single string expression, so concatenate)
COPY INTO @my_s3_stage/events/
FROM (
  SELECT *, TO_VARCHAR(event_date, 'YYYY') AS yr, TO_VARCHAR(event_date, 'MM') AS mo
  FROM analytics.events
)
PARTITION BY (yr || '/' || mo)
FILE_FORMAT = (TYPE = PARQUET);

Supported File Formats

Unloading File Formats

| Feature | Structured (CSV/TSV) | Semi-Structured (JSON/Parquet) |
| --- | --- | --- |
| Formats | CSV, TSV (delimited) | JSON, Parquet |
| Compression | GZIP (default), BZ2, BROTLI, ZSTD, NONE | GZIP for JSON, SNAPPY for Parquet |
| Headers | HEADER = TRUE supported | N/A for JSON; column names in Parquet metadata |
| Best for | Legacy systems, spreadsheets, simple integrations | Data lakes, analytics platforms, schema preservation |
| VARIANT support | Flattened to string | Native structure preserved |
🎯 Exam Focus: Parquet Unloading

Parquet is the preferred format for data lake exports. It preserves data types, supports columnar compression, and maintains schema metadata. Snowflake can unload directly to Parquet without any external tools.
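As a rough sketch of why this matters: column names and types written to Parquet survive in the file metadata, so the unloaded files can be queried straight from the stage. The stage, table, and file format names below are illustrative:

```sql
-- Unload typed columns to Parquet
COPY INTO @my_stage/typed_export/
FROM (SELECT order_id, total_amount, order_date FROM sales.orders)
FILE_FORMAT = (TYPE = PARQUET)
HEADER = TRUE;

-- Query the staged Parquet files directly via a named file format
CREATE FILE FORMAT IF NOT EXISTS pq_fmt TYPE = PARQUET;

SELECT $1:order_id::NUMBER   AS order_id,
       $1:order_date::DATE   AS order_date
FROM @my_stage/typed_export/
(FILE_FORMAT => 'pq_fmt');
```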


GET Command

The GET command downloads files from an internal stage to your local filesystem. It does not work with external stages.

GET Command
-- Download from named internal stage
GET @my_stage/export/ file:///tmp/local_data/;

-- Download from user stage
GET @~/export/ file:///tmp/local_data/;

-- Download from table stage
GET @%orders/export/ file:///tmp/local_data/;
🔑 GET = Internal Stages Only

GET only works with internal stages (@~, @%, @named_internal). For external stages, use cloud-native tools (AWS CLI, Azure CLI, gsutil) to download files directly from cloud storage.


Compression Options

📋
Quick Reference

Compression by Format

| Format | Default | Options | Recommendation |
| --- | --- | --- | --- |
| 📄 CSV / TSV | GZIP | GZIP, BZ2, BROTLI, ZSTD, DEFLATE, RAW_DEFLATE, NONE | GZIP for broad compatibility; ZSTD for best ratio |
| 📋 JSON | GZIP | GZIP, BZ2, BROTLI, ZSTD, DEFLATE, RAW_DEFLATE, NONE | |
| 📊 Parquet | SNAPPY | SNAPPY, LZO, NONE | Limited options for Parquet |
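Compression is chosen inside the FILE_FORMAT clause. A couple of hedged examples (stage and table names assumed):

```sql
-- CSV with ZSTD for a better compression ratio
COPY INTO @my_stage/export_zstd/
FROM sales.orders
FILE_FORMAT = (TYPE = CSV COMPRESSION = ZSTD);

-- Parquet with SNAPPY (the default, so COMPRESSION could be omitted)
COPY INTO @my_stage/export_parquet/
FROM sales.orders
FILE_FORMAT = (TYPE = PARQUET COMPRESSION = SNAPPY);
```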

Step-by-Step: Unloading Data

Unloading Data End-to-End

1. Choose your destination

Decide whether to unload to an internal stage (for GET download) or external stage (for cloud storage access).

💡 Use external stages for data lake patterns; internal stages for ad-hoc exports.

2. Write the COPY INTO statement

Specify the target location, source query, file format, and options.

SQL
COPY INTO @my_s3_stage/daily/
FROM (
  SELECT order_id, customer_id, total_amount
  FROM sales.orders
  WHERE order_date = CURRENT_DATE()
)
FILE_FORMAT = (TYPE = PARQUET)
HEADER = TRUE
OVERWRITE = TRUE;
3. Verify the output

List the files in the stage to confirm they were created.

SQL
LIST @my_s3_stage/daily/;
4. Download if needed

For internal stages, use GET to download locally.

SQL
-- Only for internal stages
GET @my_stage/daily/ file:///tmp/export/;

Cheat Sheet

📋
Quick Reference

Data Unloading Quick Reference

⌨️ Key Commands

| Task | Command |
| --- | --- |
| Unload | COPY INTO @stage/ FROM table_or_query |
| Download | GET @internal_stage/ file:///local/path/ (internal stages only) |
| List files | LIST @stage/path/ |
| Remove files | REMOVE @stage/path/file.csv.gz |

⚙️ Key Options

| Option | Behaviour |
| --- | --- |
| SINGLE | TRUE = one file; FALSE (default) = multiple parallel files |
| MAX_FILE_SIZE | Default 16 MB; controls split size |
| OVERWRITE | TRUE = replace existing; FALSE (default) = fail if exists |
| HEADER | TRUE = include column headers in CSV |
| PARTITION BY | Organise output into a directory structure |
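Pulling the key options together, a single export statement might look like this (the stage path and table are illustrative):

```sql
-- One GZIP-compressed CSV file with headers, replacing any previous export
COPY INTO @my_stage/full_export/orders.csv.gz
FROM sales.orders
FILE_FORMAT = (TYPE = CSV COMPRESSION = GZIP)
HEADER = TRUE
SINGLE = TRUE
MAX_FILE_SIZE = 5368709120  -- raise the per-file cap, since SINGLE disables splitting
OVERWRITE = TRUE;
```

With SINGLE = TRUE you can name the output file directly in the target path, as above.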

Practice Quiz

Data Loading

Which command exports data from a Snowflake table into files?

Data Loading

What is the default compression for Parquet unloading?

Data Loading

The GET command can download files from which type of stage?


Flashcards

Data Loading
QUESTION

What is the difference between COPY INTO table and COPY INTO location?

ANSWER

COPY INTO table = loading (files to table). COPY INTO location = unloading (table to files). Both require a virtual warehouse.

Data Loading
QUESTION

What does SINGLE = TRUE do when unloading?

ANSWER

Writes all output to a single file instead of splitting across multiple parallel files. Slower but useful when the consumer requires exactly one file.

Data Loading
QUESTION

What does PARTITION BY do in an unload operation?

ANSWER

Organises output files into a directory structure based on a column expression. For example, partitioning on year and month expressions creates folders like /2024/01/, /2024/02/. Ideal for data lake patterns.

Data Loading
QUESTION

Can you unload data using Snowpipe?

ANSWER

No. Snowpipe only supports loading (ingesting files into tables). For unloading, use COPY INTO <location> with a virtual warehouse, or schedule it with a Task.

Data Loading
QUESTION

What is the default MAX_FILE_SIZE for unloading?

ANSWER

16 MB. Snowflake splits output into multiple files of this size by default. You can increase it or use SINGLE = TRUE for one file.

