CLI
This page works through the SQLMesh example project using the SQLMesh command-line interface.
1. Create the SQLMesh project
First, create a project directory and navigate to it:
If using a python virtual environment, ensure it's activated first by running the source .env/bin/activate
command from the folder used during installation.
Create a SQLMesh scaffold with the following command, specifying a default SQL dialect for your models. The dialect should correspond to the dialect most of your models are written in; it can be overridden for specific models in the model's MODEL
specification. All SQL dialects supported by the SQLGlot library are allowed.
In this example, we specify the snowflake
dialect:
See the quick start overview for more information about the project directories, files, data, and models.
2. Create a prod environment
SQLMesh's key actions are creating and applying plans to environments. At this point, the only environment is the empty prod
environment.
The first SQLMesh plan must execute every model to populate the production environment. Running sqlmesh plan
will generate the plan and the following output:
Line 3 of the output notes that sqlmesh plan
successfully executed the project's test tests/test_full_model.yaml
with duckdb.
Line 5 describes what environments the plan will affect when applied - a new prod
environment in this case.
Lines 7-10 of the output show that SQLMesh detected three new models relative to the current empty environment.
Lines 11-14 list each model that will be executed by the plan, along with the date intervals that will be run. Note that full_model
and incremental_model
both show 2020-01-01
as their start date because:
- The incremental model specifies that date in the
start
property of itsMODEL
statement and - The full model depends on the incremental model.
The seed_model
date range begins on the same day the plan was made because SEED
models have no temporality associated with them other than whether they have been modified since the previous SQLMesh plan.
Line 15 asks you whether to proceed with executing the model backfills described in lines 11-14. Enter y
and press Enter
, and SQLMesh will execute the models and return this output:
Lines 2-4 show the completion percentage and run time for each model (very fast in this simple example). Line 8 shows that the prod
environment now points to the tables created during model execution.
You've now created a new production environment with all of history backfilled.
3. Update a model
Now that we have have populated the prod
environment, let's modify one of the SQL models.
We modify the incremental SQL model by adding a new column to the query. Open the models/incremental_model.sql
file and add 'z' AS new_column
below item_id
as follows:
4. Work with a development environment
4.1 Create a dev environment
Now that you've modified a model, it's time to create a development environment so that you can validate the model change without affecting production.
Run sqlmesh plan dev
to create a development environment called dev
:
Line 5 of the output states that a new environment dev
will be created from the existing prod
environment.
Lines 6-10 summarize the differences between the modified model and the prod
environment, detecting that we directly modified incremental_model
and that full_model
was indirectly modified because it selects from the incremental model.
On line 24, we see that SQLMesh automatically classified the change as Non-breaking
because understood that the change was additive (added a column not used by full_model
) and did not invalidate any data already in prod
.
Hit Enter
at the prompt to backfill data from our start date 2020-01-01
. Another prompt will appear asking for a backfill end date; hit Enter
to backfill until now. Finally, enter y
and press Enter
to apply the plan and execute the backfill:
Line 8 of the output shows that SQLMesh applied the change to sqlmesh_example__dev.incremental_model
. In the model schema, the suffix "__dev
" indicates that it is in the dev
environment.
SQLMesh did not need to backfill anything for the full_model
since the change was Non-breaking
.
4.2 Validate updates in dev
You can now view this change by querying data from incremental_model
with sqlmesh fetchdf "select * from sqlmesh_example__dev.incremental_model"
.
Note that the environment name __dev
is appended to the schema namespace sqlmesh_example
in the query:
$ sqlmesh fetchdf "select * from sqlmesh_example__dev.incremental_model"
id item_id new_column ds
0 1 2 z 2020-01-01
1 2 1 z 2020-01-01
2 3 3 z 2020-01-03
3 4 1 z 2020-01-04
4 5 1 z 2020-01-05
5 6 1 z 2020-01-06
6 7 1 z 2020-01-07
You can see that new_column
was added to the dataset. The production table was not modified; you can validate this by querying the production table using sqlmesh fetchdf "select * from sqlmesh_example.incremental_model"
.
Note that nothing has been appended to the schema namespace sqlmesh_example
because prod
is the default environment.
$ sqlmesh fetchdf "select * from sqlmesh_example.incremental_model"
id item_id ds
0 1 2 2020-01-01
1 2 1 2020-01-01
2 3 3 2020-01-03
3 4 1 2020-01-04
4 5 1 2020-01-05
5 6 1 2020-01-06
6 7 1 2020-01-07
The production table does not have new_column
because the changes to dev
have not yet been applied to prod
.
5. Update the prod environment
5.1 Apply updates to prod
Now that we've tested the changes in dev, it's time to move them to production. Run sqlmesh plan
to plan and apply your changes to the prod
environment.
Enter y
and press Enter
at the Apply - Virtual Update [y/n]:
prompt to apply the plan and execute the backfill:
$ sqlmesh plan
======================================================================
Successfully Ran 1 tests against duckdb
----------------------------------------------------------------------
Summary of differences against `prod`:
├── Directly Modified:
│ └── sqlmesh_example.incremental_model
└── Indirectly Modified:
└── sqlmesh_example.full_model
---
+++
@@ -1,6 +1,7 @@
SELECT
id,
item_id,
+ 'z' AS new_column,
ds
FROM sqlmesh_example.seed_model
WHERE
Directly Modified: sqlmesh_example.incremental_model (Non-breaking)
└── Indirectly Modified Children:
└── sqlmesh_example.full_model
Apply - Virtual Update [y/n]: y
Virtually Updating 'prod' ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100.0% • 0:00:00
The target environment has been updated successfully
Virtual Update executed successfully
Note that a backfill was not necessary and only a Virtual Update occurred.
5.2 Validate updates in prod
Double-check that the data updated in prod
by running sqlmesh fetchdf "select * from sqlmesh_example.incremental_model"
:
$ sqlmesh fetchdf "select * from sqlmesh_example.incremental_model"
id item_id new_column ds
0 1 2 z 2020-01-01
1 2 1 z 2020-01-01
2 3 3 z 2020-01-03
3 4 1 z 2020-01-04
4 5 1 z 2020-01-05
5 6 1 z 2020-01-06
6 7 1 z 2020-01-07
6. Next steps
Congratulations, you've now conquered the basics of using SQLMesh!
From here, you can: