Tobiko Cloud: Getting Started
Tobiko Cloud is a data platform that extends SQLMesh to make it easy to manage data at scale without the waste.
We're here to make it easy to get started and feel confident that everything is working as expected. After you've completed the steps below, you'll have achieved the following:
- Logged in to Tobiko Cloud via the browser
- Connected Tobiko Cloud to your local machine via the CLI
- Connected Tobiko Cloud to your data warehouse
- Verified that Tobiko Cloud interacts with your data warehouse as expected
Prerequisites
Before you start, the Tobiko team must complete a few steps.
Your Tobiko Solutions Architect will:
- Set up a 1-hour meeting with you to fully onboard
- Request that a new Tobiko Cloud account be created for you (single tenant by default)
- Share a temporary password link that expires in 7 days
- Make sure you save the password in your own password manager
To prepare for the meeting, ensure you or another attendee have data warehouse administrator rights to:
- Update warehouse user and object permissions
- Create new users and grant them create/update/delete permissions on a specific database (ex: database.schema.table)
For migrations from SQLMesh (open source) to Tobiko Cloud only:
- Your Tobiko Solutions Architect will send you a script to extract your current state
- You send that state to the Tobiko Cloud engineers to validate before the migration occurs
- After validation, your Tobiko Solutions Architect will schedule a migration date and meeting to move your state to Tobiko Cloud. There will be some downtime if you are running SQLMesh in a production environment.
Note: if you must be on a VPN to access your data warehouse or have specific security requirements, please let us know so we can discuss options to ensure Tobiko Cloud can connect securely.
Technical Requirements:
- Tobiko Cloud requires Python version 3.9 or later
Log in to Tobiko Cloud
The first step to setting up Tobiko Cloud is logging in to the web interface:
- Open a browser and navigate to the Tobiko Cloud URL (ex: https://cloud.tobikodata.com/sqlmesh/tobiko/public-demo/observer/)
- Leave the username blank and use the temporary password from the link your Solutions Architect shared with you
- Once logged in, you should see the home page. Your view should be empty, but the figure below shows a populated example with Tobiko Cloud running in production:
Install the tcloud CLI
Now we need to configure the tcloud command line interface tool.
First, open a terminal in your terminal application or IDE (ex: VSCode). Then follow these steps to install the tcloud CLI:
- Create a new project directory and navigate into it:
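For example, a minimal sketch (the directory name tcloud_project matches the sample output later in this guide; any name works):

    mkdir tcloud_project
    cd tcloud_project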
- Create a new file called requirements.txt and add tcloud to it (PyPI source: tcloud):
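For example, a one-line sketch from inside the project directory (you can also create the file in your editor):

    echo 'tcloud' > requirements.txt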
- Create a Python virtual environment and install tcloud:
Note: you may need to run python3 or pip3 instead of python or pip, depending on your python installation.
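For example, a sketch assuming a POSIX shell and the .venv directory name shown in the sample output later in this guide:

    python -m venv .venv
    source .venv/bin/activate
    pip install -r requirements.txt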
- Create an alias to ensure use of tcloud:
We recommend using a command line alias to ensure all sqlmesh commands run on Tobiko Cloud.
Set the alias in the terminal by running alias sqlmesh='tcloud sqlmesh' in every session, or add it to your shell profile file (ex: ~/.zshrc or ~/.bashrc) so you don't have to run the command every time (see the sketch below).
Note: the rest of the commands in this document will NOT use the alias to avoid confusion with the open source SQLMesh CLI.
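As a sketch, the profile entry might look like this (assuming a bash or zsh profile):

    # ~/.zshrc or ~/.bashrc
    alias sqlmesh='tcloud sqlmesh'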
Connect Tobiko Cloud to Data Warehouse
Now we're ready to connect your data warehouse to Tobiko Cloud:
- Create a new file called tcloud.yaml and add the project configuration below, substituting the appropriate values for your project:

    projects:
      public-demo: # TODO: update this for the project name in the URL
        url: https://cloud.tobikodata.com/sqlmesh/tobiko/public-demo/ # TODO: update for your unique URL
        gateway: tobiko_cloud
        extras: bigquery,web,github # TODO: update bigquery for your data warehouse
    default_project: public-demo # TODO: update this for the project name in the URL
- Export the token from the Solutions Architect:
tcloud provides your security token to Tobiko Cloud via the TCLOUD_TOKEN environment variable, so we must create and export it.
Obtain the token from your Solutions Architect, then pass it to the environment variable with this command (substituting your token value in single quotes):
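For example (a sketch; substitute your actual token inside the single quotes):

    export TCLOUD_TOKEN='<your token>'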
Note: always include the single quotes ' ' around your token
Storing your token
Always follow your organization's procedures for storing secrets and credentials.
The command above will create the TCLOUD_TOKEN environment variable, but the variable will only exist for the duration of the terminal session. We need a mechanism to create and export the variable every time we use Tobiko Cloud.
If your organization doesn't have specific procedures for storing secrets, we recommend defining the environment variable in either an .env file in your root project directory or in your terminal's profile file.
If using the former, make sure the .env file is listed in your .gitignore file to prevent it from being tracked by Git and exposed in plain text (a one-line example follows the .env file below).
Example .env file:

    # TODO: add any other environment variables such as username, ports, etc. based on your data warehouse
    DATA_WAREHOUSE_CREDENTIALS=<your data warehouse credentials>
    TCLOUD_TOKEN=<your tcloud token>
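To keep the .env file out of Git, you can append it to .gitignore, for example with this one-line sketch:

    echo '.env' >> .gitignore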
Run these commands to load the environment variables from .env:

    set -a       # Turn on auto-export
    source .env  # Read the file, all variables are automatically exported
    set +a       # Turn off auto-export
Or add these lines to your ~/.bashrc or ~/.zshrc file to automatically export the environment variables when you open the terminal (instead of running the set -a and source .env commands every time):

    # .bashrc or .zshrc
    export DATA_WAREHOUSE_CREDENTIALS=<your data warehouse credentials>
    export TCLOUD_TOKEN=<your tcloud token>
Note that the automatically exported environment variables will be accessible to other programs on your computer.
- Initialize a new SQLMesh project:
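For example, running init through the tcloud wrapper (a sketch; the dialect argument, bigquery here, is an assumption and should match your data warehouse):

    tcloud sqlmesh init bigquery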
- Update your project's config.yaml with your data warehouse connection information:
Your new SQLMesh project will contain a configuration file named config.yaml that includes a DuckDB connection.
Replace the DuckDB connection information with your data warehouse's information.
This example shows a BigQuery warehouse connection; see more examples here.
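A minimal sketch of what the tobiko_cloud gateway might look like for BigQuery, assuming OAuth-based authentication (the project ID is a placeholder; the exact fields depend on your authentication method):

    gateways:
      tobiko_cloud:
        connection:
          type: bigquery
          project: <your-gcp-project-id>
          method: oauth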
- Create a tcloud user in the warehouse:
During your onboarding call, we will walk through instructions live to create a new tcloud data warehouse user with the necessary permissions.
SQLMesh will run as this user to create, update, and delete tables in your data warehouse. You can scope the user permissions to a specific database if needed.
Find additional data warehouse-specific instructions here: Data Warehouse Integrations.
- Verify the connection between Tobiko Cloud and data warehouse:
Now we're ready to verify that the connection between Tobiko Cloud and the data warehouse is working properly.
Run the info command from your terminal, as sketched below; it will return output confirming that Tobiko Cloud can reach your data warehouse.
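Using the tcloud wrapper set up earlier (a sketch; with the alias configured, plain sqlmesh info works as well):

    tcloud sqlmesh info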
Verify SQLMesh functionality
Let's run a plan to verify that SQLMesh is working correctly.
Run tcloud sqlmesh plan in your terminal and enter y at the prompt to apply the changes.
It will return output similar to this:
(.venv) ➜ tcloud_project git:(main) ✗ tcloud sqlmesh plan
======================================================================
Successfully Ran 1 tests against duckdb
----------------------------------------------------------------------
New environment `prod` will be created from `prod`
Summary of differences against `prod`:
Models:
└── Added:
├── sqlmesh_example.full_model
├── sqlmesh_example.incremental_model
└── sqlmesh_example.seed_model
Models needing backfill (missing dates):
├── sqlmesh_example.full_model: 2024-11-24 - 2024-11-24
├── sqlmesh_example.incremental_model: 2020-01-01 - 2024-11-24
└── sqlmesh_example.seed_model: 2024-11-24 - 2024-11-24
Apply - Backfill Tables [y/n]: y
Creating physical tables ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100.0% • 3/3 • 0:00:00
All model versions have been created successfully
[1/1] sqlmesh_example.seed_model evaluated in 0.00s
[1/1] sqlmesh_example.incremental_model evaluated in 0.01s
[1/1] sqlmesh_example.full_model evaluated in 0.01s
Evaluating models ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100.0% • 3/3 • 0:00:00
All model batches have been executed successfully
Virtually Updating 'prod' ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100.0% • 0:00:00
The target environment has been updated successfully
Tobiko Cloud and SQLMesh are working!
Next steps
Your tcloud project directory should look and feel like this:
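A rough sketch of the layout after the steps above (exact contents vary by SQLMesh version; the models, macros, seeds, audits, and tests directories come from sqlmesh init):

    tcloud_project/
    ├── .env
    ├── .venv/
    ├── requirements.txt
    ├── tcloud.yaml
    ├── config.yaml
    ├── models/
    ├── macros/
    ├── seeds/
    ├── audits/
    └── tests/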
From here, if you have an existing SQLMesh project, you can copy over your existing models and macros to the models and macros directories (along with other files as needed).
You are now fully onboarded with Tobiko Cloud. We recommend reviewing the helpful links below to get familiar with SQLMesh and Tobiko Cloud.
Here's to data transformation without the waste!