Multi-Engine guide
Organizations typically connect to a data warehouse through a single engine to ensure data consistency. However, there are cases where the processing capabilities of one engine may be better suited to specific tasks than another.
By decoupling storage from compute and with growing support for open table formats like Apache Iceberg and Hive, different engines can now interact with the same data.
With SQLMesh's new multi-engine feature, users can leverage multiple engine adapters within a single project, offering the flexibility to choose the best engine for each task.
This feature allows you to run each model on a specified engine, provided the data catalog is shared and the engines support read/write operations on it.
Configuring project with multiple engines
To configure a SQLMesh project with multiple engines, simply include all required gateway connections in your configuration.
Next, specify the appropriate gateway
in the MODEL
DDL for each model. If no gateway is explicitly defined, the default gateway will be used.
The virtual layer will be created within the engine corresponding to the default gateway.
Example
Below is a simple example of setting up a project with connections to both DuckDB and PostgreSQL.
In this setup, the PostgreSQL engine is set as the default, so it will be used to manage views in the virtual layer.
Meanwhile, the DuckDB's attach feature enables read-write access to the PostgreSQL catalog's physical tables.
Given this configuration, when a model’s gateway is set to duckdb, it will be materialized within the PostgreSQL main_db
catalog, but it will be evaluated using DuckDB’s engine.
In this model, the DuckDB engine can be used to scan and load data from an iceberg table and create the physical table in the PostgreSQL database.
While the PostgreSQL engine is responsible for creating the model's view for the virtual layer.