Snowflake
This page provides information about how to use SQLMesh with the Snowflake SQL engine.
It begins with a Connection Quickstart that demonstrates how to connect to Snowflake, or you can skip directly to information about using Snowflake with the built-in or airflow schedulers.
Connection quickstart
Connecting to cloud warehouses involves a few steps, so this connection quickstart provides the info you need to get up and running with Snowflake.
It demonstrates connecting to Snowflake with the snowflake-connector-python
library bundled with SQLMesh.
Snowflake provides multiple methods of authorizing a connection (e.g., password, SSO, etc.). This quickstart demonstrates authorizing with a password, but configurations for other methods are described below.
Tip
This quickstart assumes you are familiar with basic SQLMesh commands and functionality.
If you're not, work through the SQLMesh Quickstart before continuing!
Prerequisites
Before working through this connection quickstart, ensure that:
- You have a Snowflake account and know your username and password
- Your Snowflake account has at least one warehouse available for running computations
- Your computer has SQLMesh installed with the Snowflake extra available
- Install from the command line with the command
pip install "sqlmesh[snowflake]"
- Install from the command line with the command
- You have initialized a SQLMesh example project on your computer
- Open a command line interface and navigate to the directory where the project files should go
- Initialize the project with the command
sqlmesh init snowflake
Access control permissions
SQLMesh must have sufficient permissions to create and access different types of database objects.
SQLMesh's core functionality requires relatively broad permissions, including:
- Ability to create and delete schemas in a database
- Ability to create, modify, delete, and query tables and views in the schemas it creates
If your project uses materialized views or dynamic tables, SQLMesh will also need permissions to create, modify, delete, and query those object types.
We now describe how to grant SQLMesh appropriate permissions.
Snowflake roles
Snowflake allows you to grant permissions directly to a user, or you can create and assign permissions to a "role" that you then grant to the user.
Roles provide a convenient way to bundle sets of permissions and provide them to multiple users. We create and use a role to grant our user permissions in this quickstart.
The role must be granted USAGE
on a warehouse so it can execute computations. We describe other permissions below.
Database permissions
The top-level object container in Snowflake is a "database" (often called a "catalog" in other engines). SQLMesh does not need permission to create databases; it may use an existing one.
The simplest way to grant SQLMesh sufficient permissions for a database is to give it OWNERSHIP
of the database, which includes all the necessary permissions.
Alternatively, you may grant SQLMesh granular permissions for all the actions and objects it will work with in the database.
Granting the permissions
This section provides example code for creating a sqlmesh
role, granting it sufficient permissions, and granting it to a user.
The code must be executed by a user with USERADMIN
level permissions or higher. We provide two versions of the code, one that grants database OWNERSHIP
to the role and another that does not.
Both examples create a role named sqlmesh
, grant it usage of the warehouse compute_wh
, create a database named demo_db
, and assign the role to the user demo_user
. The step that creates the database can be omitted if the database already exists.
Get connection info
Now that our user has sufficient access permissions, we're ready to gather the information needed to configure the SQLMesh connection.
Account name
Snowflake connection configurations require the account
parameter that identifies the Snowflake account SQLMesh should connect to.
Snowflake account identifiers have two components: your organization name and your account name. Both are embedded in your Snowflake web interface URL, separated by a /
.
This shows the default view when you log in to your Snowflake account, where we can see the two components of the account identifier:
In this example, our organization name is idapznw
, and our account name is wq29399
.
We concatenate the two components, separated by a -
, for the SQLMesh account
parameter: idapznw-wq29399
.
Warehouse name
Your Snowflake account may have more than one warehouse available - any will work for this quickstart, which runs very few computations.
Some Snowflake user accounts may have a default warehouse they automatically use when connecting.
The connection configuration's warehouse
parameter is not required, but we recommend specifying the warehouse explicitly in the configuration to ensure SQLMesh's behavior doesn't change if the user's default warehouse changes.
Database name
Snowflake user accounts may have a "Default Namespace" that includes a default database they automatically use when connecting.
The connection configuration's database
parameter is not required, but we recommend specifying the database explicitly in the configuration to ensure SQLMesh's behavior doesn't change if the user's default namespace changes.
Configure the connection
We now have the information we need to configure SQLMesh's connection to Snowflake.
We start the configuration by adding a gateway named snowflake
to our example project's config.yaml file and making it our default_gateway
:
And we specify the account
, user
, password
, database
, and warehouse
connection parameters using the information from above:
Warning
Best practice for storing secrets like passwords is placing them in environment variables that the configuration file loads dynamically. For simplicity, this guide instead places the value directly in the configuration file.
This code demonstrates how to use the environment variable SNOWFLAKE_PASSWORD
for the configuration's password
parameter:
Check connection
We have now specified the snowflake
gateway connection information, so we can confirm that SQLMesh is able to successfully connect to Snowflake. We will test the connection with the sqlmesh info
command.
First, open a command line terminal. Now enter the command sqlmesh info
:
The output shows that our data warehouse connection succeeded:
However, the output includes a WARNING
about using the Snowflake SQL engine for storing SQLMesh state:
Warning
Snowflake is not designed for transactional workloads and should not be used to store SQLMesh state even in testing deployments.
Learn more about storing SQLMesh state here.
Specify state connection
We can store SQLMesh state in a different SQL engine by specifying a state_connection
in our snowflake
gateway.
This example uses the DuckDB engine to store state in the local snowflake_state.db
file:
Now we no longer see the warning when running sqlmesh info
, and we see a new entry State backend connection succeeded
:
Run a sqlmesh plan
Now we're ready to run a sqlmesh plan
in Snowflake:
And confirm that our schemas and objects exist in the Snowflake catalog:
Congratulations - your SQLMesh project is up and running on Snowflake!
Local/Built-in Scheduler
Engine Adapter Type: snowflake
Installation
Connection options
Option | Description | Type | Required |
---|---|---|---|
type |
Engine type name - must be snowflake |
string | Y |
account |
The Snowflake account name | string | Y |
user |
The Snowflake username | string | N |
password |
The Snowflake password | string | N |
authenticator |
The Snowflake authenticator method | string | N |
warehouse |
The Snowflake warehouse name | string | N |
database |
The Snowflake database name | string | N |
role |
The Snowflake role name | string | N |
token |
The Snowflake OAuth 2.0 access token | string | N |
private_key |
The optional private key to use for authentication. Key can be Base64-encoded DER format (representing the key bytes), a plain-text PEM format, or bytes (Python config only). | string | N |
private_key_path |
The optional path to the private key to use for authentication. This would be used instead of private_key . |
string | N |
private_key_passphrase |
The optional passphrase to use to decrypt private_key (if in PEM format) or private_key_path . Keys can be created without encryption so only provide this if needed. |
string | N |
session_parameters |
The optional session parameters to set for the connection. | dict | N |
Lowercase object names
Snowflake object names are case-insensitive by default, and Snowflake automatically normalizes them to uppercase. For example, the command CREATE SCHEMA sqlmesh
will generate a schema named SQLMESH
in Snowflake.
If you need to create an object with a case-sensitive lowercase name, the name must be double-quoted in SQL code. In the SQLMesh configuration file, it also requires outer single quotes.
For example, a connection to the database "my_db"
would include:
connection:
type: snowflake
<other connection options>
database: '"my_db"' # outer single and inner double quotes
Snowflake authorization methods
The simplest (but arguably least secure) method of authorizing a connection with Snowflake is with a username and password.
This section describes how to configure other authorization methods.
Snowflake SSO Authorization
SQLMesh supports Snowflake SSO authorization connections using the externalbrowser
authenticator method. For example:
gateways:
snowflake:
connection:
type: snowflake
account: ************
user: ************
authenticator: externalbrowser
warehouse: ************
database: ************
role: ************
Snowflake OAuth Authorization
SQLMesh supports Snowflake OAuth authorization connections using the oauth
authenticator method. For example:
Snowflake Private Key Authorization
SQLMesh supports Snowflake private key authorization connections by providing the private key as a path, Base64-encoded DER format (representing the key bytes), a plain-text PEM format, or as bytes (Python Only).
The account
and user
parameters are required for each of these methods.
Private Key Path
Note: private_key_passphrase
is only needed if the key was encrypted with a passphrase.
Private Key PEM
Note: private_key_passphrase
is only needed if the key was encrypted with a passphrase.
Private Key Base64
Note: This is base64 encoding of the bytes of the key itself and not the PEM file contents.
Private Key Bytes
Base64 encode the bytes and follow Private Key Base64 instructions.
from sqlmesh.core.config import (
Config,
GatewayConfig,
ModelDefaultsConfig,
SnowflakeConnectionConfig,
)
from cryptography.hazmat.primitives import serialization
key = """-----BEGIN PRIVATE KEY-----
...
-----END PRIVATE KEY-----""".encode()
p_key= serialization.load_pem_private_key(key, password=None)
pkb = p_key.private_bytes(
encoding=serialization.Encoding.DER,
format=serialization.PrivateFormat.PKCS8,
encryption_algorithm=serialization.NoEncryption(),
)
config = Config(
model_defaults=ModelDefaultsConfig(dialect="snowflake"),
gateways={
"my_gateway": GatewayConfig(
connection=SnowflakeConnectionConfig(
user="user",
account="account",
private_key=pkb,
),
),
}
)
The authenticator method is assumed to be snowflake_jwt
when private_key
is provided, but it can also be explicitly provided in the connection configuration.
Airflow Scheduler
Engine Name: snowflake
The SQLMesh Snowflake Operator is similar to the SnowflakeOperator, and relies on the same SnowflakeHook implementation.
To enable support for this operator, the Airflow Snowflake provider package should be installed on the target Airflow cluster along with SQLMesh with the Snowflake extra:
The operator requires an Airflow connection to determine the target Snowflake account. Refer to Snowflake connection for more details.
By default, the connection ID is set to snowflake_default
, but can be overridden using the engine_operator_args
parameter to the SQLMeshAirflow
instance as in the example below:
Configuring Virtual Warehouses
The Snowflake Virtual Warehouse a model should use can be specified in the session_properties
attribute of the model definition: