Legacy snapshot configuration legacy
Use legacy SQL-based snapshot configurations with Jinja blocks in any dbt version. dbt v1.9 introduced YAML-based configs for better readability and environment awareness.
There are situations where you want to use the legacy syntax for snapshots in any dbt version or release track. This page details how you can use the legacy SQL-based configurations if you need to.
In dbt v1.9, this syntax was replaced with a YAML-based configuration in dbt's "Latest" release track. The benefits of YAML-based configurations are that the snapshots are environment aware, meaning you don't have to specify schema or database, and the syntax is more concise.
For new snapshots, we recommend using these latest YAML-based configs. If you'd like to move to the YAML-based configuration for existing snapshots, you can migrate over.
When would you want to use the SQL-based syntax and YAML-based syntax?
-
SQL-based syntax:
- Defined in
.sqlfiles within a snapshot Jinja block, typically located in yoursnapshotsdirectory. Available in all versions. - Useful for existing snapshots already using this syntax.
- Suitable for performing very light transformations (but creating a separate ephemeral model for transformations is recommended for better maintainability).
- Defined in
-
YAML-based syntax:
- Defined in
whatever_name.ymlor in thesnapshotsormodelsdirectory you prefer. Available in dbt's "Latest" release track and dbt v1.9 and later. - Ideal for new snapshots or existing snapshots that need to be migrated.
- Create transformations separate from the snapshot file by creating an ephemeral model and referencing it in the snapshot using the
relationfield.
- Defined in
Snapshot configurations
Although you can use the more performant YAML-based configuration, you might still want to use the legacy configuration to define your snapshots if it suits your needs.
Snapshots can be configured in two main ways:
- Using snapshot-specific configurations
- Or using general configurations
These configurations allow you to control how dbt detects changes in your data and where snapshots are stored. Both types of configurations can coexist in your project in the same config block (or from your dbt_project.yml file or properties.yaml file).
One of the most important configs you can decide is strategies, which tells dbt how to detect modified rows.
Snapshot specific configurations
Snapshot-specific configurations are applicable to only one dbt resource type rather than multiple resource types. You can define these settings within the resource’s file using the {{ config() }} macro (as well as in the project file (dbt_project.yml) or a property file (models/properties.yml for models, similarly for other resources)).
{ % snapshot orders_snapshot %}
{{ config(
target_schema="<string>",
target_database="<string>",
unique_key="<column_name_or_expression>",
strategy="timestamp" | "check",
updated_at="<column_name>",
check_cols=["<column_name>"] | "all"
invalidate_hard_deletes : true | false
)
}}
select * from {{ source('jaffle_shop', 'orders') }}
{% endsnapshot %}
General configuration
Use general configurations for broader operational settings applicable across multiple resource types. Like resource-specific configurations, these can also be set in the project file, property files, or within resource-specific files using a config block.
Snapshot strategies
Snapshot "strategies" define how dbt knows if a row has changed. There are two strategies built-in to dbt that require the strategy parameter:
- Timestamp — Uses an
updated_atcolumn to determine if a row has changed. - Check — Compares a list of columns between their current and historical values to determine if a row has changed. Uses the
check_colsparameter.
- Timestamp
- Check
The timestamp strategy uses an updated_at field to determine if a row has changed. If the configured updated_at column for a row is more recent than the last time the snapshot ran, then dbt will invalidate the old record and record the new one. If the timestamps are unchanged, then dbt will not take any action.
Example
{% snapshot orders_snapshot_timestamp %}
{{
config(
target_schema='snapshots',
strategy='timestamp',
unique_key='id',
updated_at='updated_at',
)
}}
select * from {{ source('jaffle_shop', 'orders') }}
{% endsnapshot %}
The check strategy is useful for tables which do not have a reliable updated_at column. It requires the check_cols parameter, which is a list of columns within the results of your snapshot query to check for changes. Alternatively, use all columns using the all value (however this may be less performant).
Example
{% snapshot orders_snapshot_check %}
{{
config(
strategy='check',
unique_key='id',
check_cols=['status', 'is_cancelled'],
)
}}
select * from {{ source('jaffle_shop', 'orders') }}
{% endsnapshot %}
Examples
Configuration reference
Configure your snapshot to tell dbt how to detect record changes. Snapshots are select statements, defined within a snapshot block in a .sql file (typically in your snapshots directory or any other directory).
The following table outlines the configurations available for snapshots:
Add snapshot to a project
To add a snapshot to your project:
- Create a file in your
snapshotsdirectory with a.sqlfile extension. For example,snapshots/orders.sql - Use a
snapshotblock to define the start and end of a snapshot:
{% snapshot orders_snapshot %}
{% endsnapshot %}
- Write a
selectstatement within the snapshot block (tips for writing a good snapshot query are below). This select statement defines the results that you want to snapshot over time. You can usesourcesorrefshere.
{% snapshot orders_snapshot %}
select * from {{ source('jaffle_shop', 'orders') }}
{% endsnapshot %}
-
Check whether the result set of your query includes a reliable timestamp column that indicates when a record was last updated. For our example, the
updated_atcolumn reliably indicates record changes, so we can use thetimestampstrategy. If your query result set does not have a reliable timestamp, you'll need to instead use thecheckstrategy — more details on this in the next step. -
Add configurations to your snapshot using a
configblock. You can also configure your snapshot from yourdbt_project.ymlfile.
- Run the
dbt snapshotcommand. For our example, a new table will be created atanalytics.snapshots.orders_snapshot. You can change thetarget_databaseconfiguration, thetarget_schemaconfiguration and the name of the snapshot (as defined in{% snapshot .. %}) will change how dbt names this table.
Running with dbt=1.8.0
15:07:36 | Concurrency: 8 threads (target='dev')
15:07:36 |
15:07:36 | 1 of 1 START snapshot snapshots.orders_snapshot...... [RUN]
15:07:36 | 1 of 1 OK snapshot snapshots.orders_snapshot..........[SELECT 3 in 1.82s]
15:07:36 |
15:07:36 | Finished running 1 snapshots in 0.68s.
Completed successfully
Done. PASS=2 ERROR=0 SKIP=0 TOTAL=1
-
Inspect the results by selecting from the table dbt created. After the first run, you should see the results of your query, plus the snapshot meta fields as described earlier.
-
Run the
dbt snapshotcommand again, and inspect the results. If any records have been updated, the snapshot should reflect this. -
Select from the
snapshotin downstream models using thereffunction.
select * from {{ ref('orders_snapshot') }}
- Snapshots are only useful if you run them frequently — schedule the
snapshotcommand to run regularly.
Examples
This section outlines some examples of how to apply configurations to snapshots using the legacy method.
Was this page helpful?
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.