Skip to main content

Search Optimisation Service

The Search Optimisation Service (SOS) is a Snowflake Enterprise-tier feature designed to dramatically accelerate point lookup queries on large tables. Unlike clustering keys β€” which help range-based scans β€” SOS is purpose-built for queries that search for specific values in large datasets, such as finding a customer by ID or filtering on a phone number.

Understanding the distinction between SOS, clustering keys, and the result cache is a frequent topic on the SnowPro Core COF-C02 exam.

πŸ“–

Key Terms: Search Optimisation Service

SOS

Search Optimisation Service

A Snowflake Enterprise feature that builds and maintains a persistent search access path (similar to an index) on table columns to accelerate point lookup queries.

SEA

Search Access Path

A persistent structure maintained by Snowflake alongside table data that allows SOS to short-circuit standard partition scanning for qualifying queries.

POI

Point Lookup Query

A query that searches for one or more specific values β€” e.g. WHERE user_id = 99 or WHERE email IN (...) β€” rather than scanning a range of rows.

MIC

Micro-partition Pruning

Snowflake's default mechanism for skipping irrelevant micro-partitions based on metadata (min/max column values). SOS operates independently of this.

AUT

Automatic Clustering

A Snowflake service that keeps micro-partitions sorted on specified cluster keys, improving range-based query pruning. Separate from and complementary to SOS.

SER

Serverless Credits

A compute model where Snowflake manages resources on your behalf. SOS maintenance is billed via serverless credits, not your virtual warehouse.

EQU

Equality Predicate

A WHERE clause condition using = or IN operators. This is the primary query type that SOS is optimised to accelerate.

ILI

ILIKE

Snowflake's case-insensitive pattern matching operator. SOS supports substring and pattern search using ILIKE and LIKE predicates on configured columns.


Why SOS Exists: The Problem It Solves

Standard Snowflake query performance relies on micro-partition pruning β€” Snowflake reads the min/max metadata for each micro-partition and skips those that cannot contain the desired values. This works exceptionally well for range queries and sorted data, but breaks down for random point lookups.

Consider a table of 500 million customer records. If you query WHERE customer_id = 'C-00019284', Snowflake must scan potentially thousands of micro-partitions because customer IDs are randomly distributed β€” no pruning is possible without a sort order on that column.

The Search Optimisation Service solves this by building a persistent search access path β€” effectively a specialised index structure β€” that allows Snowflake to identify precisely which micro-partitions contain the target value, skipping the rest entirely.


How SOS Works Internally

Search Optimisation Service Architecture

Files arrive in a stage. Snowflake builds a persistent Search Access Path alongside table micro-partitions. When a point lookup query runs, SOS identifies qualifying micro-partitions, short-circuiting the standard full-partition scan. The warehouse then reads only the identified partitions.

Architecture diagram showing table micro-partitions on the left, a Search Access Path structure in the centre built and maintained by serverless compute, and a query flow that bypasses unnecessary micro-partitions via the access path

The process works as follows:

  1. Build Phase β€” When SOS is enabled on a table, Snowflake’s serverless background compute builds the initial search access path. This is a one-time build cost charged in serverless credits.
  2. Maintenance Phase β€” As DML operations (INSERT, UPDATE, DELETE, MERGE) change the table, Snowflake continuously and automatically updates the access path. Each update consumes additional serverless credits.
  3. Query Phase β€” When a qualifying query arrives (e.g. a WHERE id = ? predicate on an SOS-configured column), the query optimiser consults the access path to determine which micro-partitions are relevant. Only those partitions are read by the virtual warehouse.
πŸ”‘SOS is Separate from Table Data

The search access path is stored separately from the table’s micro-partitions. It consumes additional storage, which is billed at Snowflake’s standard storage rates. Factor this into cost modelling alongside the serverless compute credits for maintenance.


Supported Predicate Types

SOS does not help every query type. The exam tests whether you know exactly which predicates benefit.

SOS Supported vs Unsupported Predicates

SOS accelerates: equality predicates (=), IN lists, ILIKE/LIKE substring search, and geospatial predicates (ST_WITHIN, ST_INTERSECTS). SOS does NOT help: range predicates alone (>, <, BETWEEN), full table scans, JOIN performance, aggregations, or queries already pruning well via clustering keys.

Two-column diagram. Left column labelled Supported shows equality equals, IN lists, ILIKE pattern matching, and geospatial predicates. Right column labelled Not Supported shows range predicates, full scans, JOINs, and aggregate queries.

What SOS Accelerates

Predicate TypeExampleNotes
EqualityWHERE id = 'ABC123'Most common use case
IN listWHERE id IN ('A', 'B', 'C')Can include large IN lists
Substring / patternWHERE name ILIKE '%john%'Case-insensitive supported
LIKEWHERE code LIKE 'UK-%'Prefix and wildcard patterns
GeospatialST_WITHIN(point, region)Requires GEOGRAPHY columns

What SOS Does NOT Help

  • Range predicates alone: WHERE created_at > '2024-01-01' β€” clustering keys are better here
  • Full table scans with no selective predicate
  • JOIN performance β€” SOS does not accelerate join operations
  • Aggregate queries: COUNT(*), SUM(), AVG() without selective filters
  • Queries that already prune well via clustering keys (SOS adds cost without benefit)
🎯Exam Trap: Range Predicates

The exam may present a scenario with a date range filter β€” e.g. WHERE order_date BETWEEN '2024-01-01' AND '2024-03-31' β€” and ask which feature helps most. The answer is clustering keys, not SOS. SOS is for point lookups, not ranges.


Enabling Search Optimisation Service

SOS requires Snowflake Enterprise Edition or higher. It cannot be used on Standard Edition accounts.

Method 1: Enable on All Columns (Full Table)

Enable SOS on Entire Table
1-- Enable Search Optimisation Service on all columns
2-- Snowflake determines which column types are supported
3ALTER TABLE customer_events
4ADD SEARCH OPTIMIZATION;
5
6-- Verify configuration
7DESCRIBE SEARCH OPTIMIZATION ON TABLE customer_events;

Method 2: Enable on Specific Columns and Predicate Types

Enable SOS on Specific Columns
1-- Enable equality search on specific columns only
2ALTER TABLE customer_events
3ADD SEARCH OPTIMIZATION ON
4 EQUALITY(customer_id, session_id, email)
5 SUBSTRING(full_name, description)
6 GEO(location_point);
7
8-- Remove SOS from specific columns without disabling entirely
9ALTER TABLE customer_events
10DROP SEARCH OPTIMIZATION ON
11 EQUALITY(session_id);
12
13-- Remove SOS entirely from the table
14ALTER TABLE customer_events
15DROP SEARCH OPTIMIZATION;

Monitoring SOS

Monitoring Search Optimisation Service
1-- Check SOS configuration and optimised columns
2DESCRIBE SEARCH OPTIMIZATION ON TABLE customer_events;
3
4-- View SOS build and maintenance history (serverless credit usage)
5SELECT *
6FROM TABLE(
7INFORMATION_SCHEMA.SEARCH_OPTIMIZATION_HISTORY(
8 DATE_RANGE_START => DATEADD('day', -7, CURRENT_DATE()),
9 DATE_RANGE_END => CURRENT_DATE()
10)
11)
12ORDER BY start_time DESC;
13
14-- Confirm SOS is active on a table via SHOW TABLES
15SHOW TABLES LIKE 'customer_events';
16-- Look for the search_optimization_progress column (100 = fully built)
πŸ’‘Use Targeted Column Enablement

Enabling SOS on the entire table (ADD SEARCH OPTIMIZATION with no column list) can result in unnecessary storage and maintenance costs for columns that are never used in point lookups. Always prefer ADD SEARCH OPTIMIZATION ON EQUALITY(col1, col2) to target only the columns your queries actually filter on.


Step-by-Step: Implementing SOS

Enabling Search Optimisation Service on a Table

1
Confirm Enterprise Edition

SOS is only available on Enterprise Edition and above. Check your account edition before proceeding.

SQL
SELECT SYSTEM$BOOTSTRAP_DATA_REQUEST('ACCOUNT');
πŸ’‘You can also check your edition in Snowsight under Admin > Account.
2
Identify High-Cardinality Point Lookup Columns

Analyse your query patterns. SOS benefits columns used in equality filters where data is not naturally clustered β€” e.g. UUID-based IDs, email addresses, phone numbers. Run EXPLAIN on slow queries to confirm full partition scans.

SQL
EXPLAIN SELECT * FROM orders WHERE order_uuid = '550e8400-e29b-41d4-a716-446655440000';
πŸ’‘Look for 'TableScan' with high partition counts in the EXPLAIN output. These are prime SOS candidates.
3
Enable SOS on Target Columns

Use targeted column-level enablement to control costs. Specify the predicate types you actually use.

SQL
ALTER TABLE orders
  ADD SEARCH OPTIMIZATION ON
    EQUALITY(order_uuid, customer_id)
    SUBSTRING(customer_name);
πŸ’‘Start with equality β€” it covers most point lookup use cases.
4
Monitor the Build Progress

The initial build runs in the background via serverless compute. Track progress using SHOW TABLES.

SQL
SHOW TABLES LIKE 'orders';
-- Check search_optimization_progress column (0-100)
πŸ’‘Large tables may take some time to fully index. Queries will progressively benefit as the build proceeds.
5
Validate Query Improvement

Run your point lookup queries and compare query profiles before and after SOS enablement. Look for reduced 'Partitions scanned' in the Query Profile.

πŸ’‘In Snowsight Query Profile, compare 'Partitions scanned' vs 'Partitions total'. SOS should reduce the scanned count dramatically.
6
Monitor Ongoing Costs

Query SEARCH_OPTIMIZATION_HISTORY regularly to track serverless credit consumption from maintenance operations as data changes.

SQL
SELECT SUM(credits_used) AS total_sos_credits
FROM TABLE(
  INFORMATION_SCHEMA.SEARCH_OPTIMIZATION_HISTORY(
    DATE_RANGE_START => DATEADD('month', -1, CURRENT_DATE()),
    DATE_RANGE_END   => CURRENT_DATE()
  )
);

SOS vs Clustering Keys vs Result Cache

These three features are frequently compared on the exam. They are not mutually exclusive β€” tables can have both SOS and clustering keys simultaneously.

SOS vs Clustering Keys vs Result Cache

Feature
Search Optimisation Service
Clustering Keys
Best for
Point lookups (equality, IN, ILIKE)
Range scans (dates, numeric ranges, large sequential reads)
Edition required
Enterprise Edition+
βœ“Any edition (Automatic Clustering needs Enterprise+)
Setup cost
Serverless credits for initial build
Free to define; Automatic Clustering uses serverless credits
Ongoing maintenance cost
Serverless credits per DML change
Serverless credits (Automatic Clustering only)
Storage overhead
Additional storage for access path
Rewrites micro-partitions; storage similar
Query type benefited
Selective point lookups on any column
Queries filtering on cluster key columns (ranges)
Can be combined?
Yes β€” both can coexist on same table
Yes β€” both can coexist on same table
Requires data to be sorted?
βœ“No β€” works regardless of data order
Yes β€” effectiveness depends on data being clustered
ℹ️Result Cache vs SOS

The result cache serves identical query results instantly (free, 24-hour TTL). It helps when the exact same query text is repeated. SOS helps when query structure is the same but values differ β€” e.g. looking up a different customer ID each time. These two features solve different problems and complement each other.


Cost Breakdown

⚠️SOS Has Two Distinct Cost Components

Many candidates overlook one of the two SOS cost components. The exam may test both:

  1. Build cost β€” A one-time serverless credit charge when SOS is first enabled and the initial access path is constructed.
  2. Maintenance cost β€” Ongoing serverless credits consumed each time the access path is updated due to DML changes (INSERTs, UPDATEs, DELETEs, MERGEs).

Additionally, the search access path consumes storage, billed at standard Snowflake storage rates.

Query execution itself still uses your virtual warehouse β€” SOS reduces the data scanned but does not eliminate warehouse usage entirely.


CheatSheet: Search Optimisation Service

πŸ“‹
Quick Reference

Search Optimisation Service β€” COF-C02 Cheat Sheet

🏒
Edition & Requirements
Minimum edition
Enterprise Editionβ€” Not available on Standard
Setup command
ALTER TABLE t ADD SEARCH OPTIMIZATIONβ€” Enables on all supported columns
Targeted setup
ADD SEARCH OPTIMIZATION ON EQUALITY(col)β€” Preferred for cost control
Remove SOS
ALTER TABLE t DROP SEARCH OPTIMIZATIONβ€” Removes all access paths
βœ…
Supported Predicates
Equality
WHERE col = 'value'β€” Most common use case
IN list
WHERE col IN ('a', 'b', 'c')β€” Works for large lists too
Substring
WHERE col ILIKE '%pattern%'β€” Case-insensitive supported
Geospatial
ST_WITHIN, ST_INTERSECTSβ€” Requires GEOGRAPHY column type
❌
NOT Supported
Range predicates alone
WHERE date > '2024-01-01'β€” Use clustering keys instead
Full table scans
SELECT * FROM t (no filter)β€” No selective predicate
JOINs
JOIN performanceβ€” SOS does not help JOINs
Aggregations
COUNT, SUM, AVGβ€” Without selective filter
πŸ’°
Cost Components
Build cost
One-time serverless creditsβ€” Charged when SOS first enabled
Maintenance cost
Ongoing serverless creditsβ€” Per DML change to table
Storage
Additional storage billedβ€” Access path stored separately
Query cost
Still uses virtual warehouseβ€” Less data scanned = lower cost
πŸ“Š
Monitoring Commands
View config
DESCRIBE SEARCH OPTIMIZATION ON TABLE tβ€” Shows optimised columns
Credit history
INFORMATION_SCHEMA.SEARCH_OPTIMIZATION_HISTORYβ€” Serverless credit usage
Build progress
SHOW TABLES LIKE 't'β€” search_optimization_progress column
βš–οΈ
SOS vs Other Features
SOS vs Clustering
SOS=point lookups, Clustering=range scansβ€” Can use both simultaneously
SOS vs Result Cache
SOS helps varied values, Cache helps identical queriesβ€” Complementary features
SOS vs Pruning
SOS independent of micro-partition pruningβ€” Both active simultaneously

Flashcards: Search Optimisation Service

Search Optimisation Service
QUESTION

Which Snowflake edition is required to use the Search Optimisation Service?

Click to reveal answer
ANSWER

Enterprise Edition or higher. SOS is not available on Standard Edition accounts.

Click to see question
Search Optimisation Service
QUESTION

What are the two cost components of the Search Optimisation Service?

Click to reveal answer
ANSWER

1. Build cost β€” one-time serverless credits to construct the initial search access path. 2. Maintenance cost β€” ongoing serverless credits consumed each time DML changes (INSERTs, UPDATEs, DELETEs) update the access path. Storage for the access path is also billed separately.

Click to see question
Search Optimisation Service
QUESTION

A query uses WHERE order_date BETWEEN '2024-01-01' AND '2024-03-31'. Will SOS help this query?

Click to reveal answer
ANSWER

No. SOS does not accelerate range predicates. For date range queries, clustering keys (or Automatic Clustering) are the appropriate feature. SOS is designed for equality predicates, IN lists, and substring pattern matching.

Click to see question
Search Optimisation Service
QUESTION

What SQL command removes SOS from a specific column without disabling it on the whole table?

Click to reveal answer
ANSWER

ALTER TABLE t DROP SEARCH OPTIMIZATION ON EQUALITY(column_name); β€” This removes the access path for the specified predicate type and column while leaving SOS active on other columns.

Click to see question
Search Optimisation Service
QUESTION

Can a table have both Search Optimisation Service and Clustering Keys enabled at the same time?

Click to reveal answer
ANSWER

Yes. SOS and clustering keys are complementary and can coexist on the same table. Clustering helps range queries; SOS helps point lookups. Using both together allows the table to efficiently serve both query patterns.

Click to see question

Knowledge Check: Quiz

Search Optimisation Service

A data analyst frequently runs queries like: SELECT * FROM users WHERE email = 'user@example.com'; The users table has 800 million rows and email values are randomly distributed. Which feature would most improve this query's performance?

Search Optimisation Service

Which of the following queries would NOT benefit from Search Optimisation Service?

Search Optimisation Service

A Snowflake architect enables Search Optimisation Service on a large table that receives millions of INSERT and DELETE operations per day. Which cost concern should they flag with stakeholders?


Summary

The Search Optimisation Service fills a specific gap in Snowflake’s performance toolkit. When your workload includes frequent point lookups on high-cardinality, randomly distributed columns β€” and clustering keys cannot help β€” SOS delivers dramatic performance improvements by short-circuiting partition scanning.

The key exam takeaways are:

  • Edition: Enterprise Edition or higher required
  • Use case: Point lookups β€” equality, IN lists, ILIKE, geospatial
  • Not for: Range predicates, JOINs, aggregations, or full table scans
  • Costs: Build cost (one-time) + maintenance cost (ongoing, per DML) + storage
  • Monitoring: DESCRIBE SEARCH OPTIMIZATION ON TABLE t and SEARCH_OPTIMIZATION_HISTORY
  • SOS vs Clustering: Complementary, not competitive β€” can use both simultaneously
  • SOS vs Result Cache: Different problems β€” varied values vs identical queries
🎯Final Exam Reminder

Remember the three-way comparison: Result Cache (identical queries, free, 24h TTL) vs Clustering Keys (range scans, any edition) vs SOS (point lookups, Enterprise+, serverless credit cost). If you see a scenario asking which feature helps a WHERE id = ? query on a 500-million row table with no natural sort order β€” the answer is SOS.

Reinforce what you just read

Study the All flashcards with spaced repetition to lock it in.

Study flashcards β†’