Qarion ETL Tutorials
Step-by-step tutorials for using Qarion ETL.
Tutorial 1: Building Your First Change Feed
This tutorial walks you through creating a change feed flow to track changes in customer data.
Step 1: Initialize Project
qarion-etl init --project-name my_project
cd my_project
Step 2: Create Flow Definition
Create flows/customers_change_feed.toml:
id = "customers_change_feed"
name = "Customers Change Feed"
flow_type = "change_feed"
namespace = "raw"
[input]
primary_key = ["customer_id"]
columns = ["customer_id", "name", "email", "status", "updated_at"]
[properties.load]
source_path = "data/customers"
file_pattern = "customers_*.csv"
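To make the idea of a change feed concrete, here is a minimal Python sketch of the comparison such a flow performs conceptually: two snapshots keyed by the primary key are diffed into insert, update, and delete events. The function diff_snapshots and the event labels are hypothetical and not part of Qarion ETL; the code Qarion ETL generates may work differently.

```python
from typing import Dict, List, Tuple

Row = Dict[str, str]


def diff_snapshots(previous: List[Row], current: List[Row], key: str) -> List[Tuple[str, str]]:
    """Return (change_type, key_value) pairs between two snapshots.

    Hypothetical helper for illustration; not part of Qarion ETL.
    """
    prev_by_key = {row[key]: row for row in previous}
    curr_by_key = {row[key]: row for row in current}
    changes = []
    # Rows in the current snapshot are either new or possibly modified.
    for k, row in curr_by_key.items():
        if k not in prev_by_key:
            changes.append(("insert", k))
        elif row != prev_by_key[k]:
            changes.append(("update", k))
    # Rows that disappeared from the current snapshot are deletes.
    for k in prev_by_key:
        if k not in curr_by_key:
            changes.append(("delete", k))
    return changes


# Made-up snapshots using a subset of the tutorial's columns.
previous = [
    {"customer_id": "1", "name": "Alice", "status": "active"},
    {"customer_id": "2", "name": "Bob", "status": "active"},
]
current = [
    {"customer_id": "1", "name": "Alice", "status": "inactive"},
    {"customer_id": "3", "name": "Charlie", "status": "active"},
]

changes = diff_snapshots(previous, current, key="customer_id")
print(changes)
```

Here customer 1 is reported as an update, customer 3 as an insert, and customer 2 as a delete. This is why the flow definition needs a primary key: without it, rows in two loads cannot be matched up.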
Step 3: Generate Datasets
qarion-etl generate-docs
This creates dataset definitions in datasets/.
Step 4: Generate Code
qarion-etl generate-code --format sql --flow customers_change_feed --output-dir output
Step 5: Build and Apply Migrations
# Generate datasets and migrations
qarion-etl build
# Apply migrations to create tables
qarion-etl apply-migrations
Tutorial 2: Financial Transaction Processing
Build a delta publishing flow for financial transactions.
Step 1: Create Flow
Create flows/transactions_delta.toml:
id = "transactions_delta"
name = "Transactions Delta Publishing"
flow_type = "delta_publishing"
[input]
primary_key = ["transaction_id"]
columns = ["transaction_id", "account_id", "amount", "transaction_date", "type"]
[properties]
namespace = "finance"
Step 2: Generate dbt Code
qarion-etl generate-code --format dbt --flow transactions_delta --output-dir dbt_project --dialect postgres
Step 3: Review Generated Code
Check the generated dbt models in dbt_project/models/.
Tutorial 3: User Session Analysis
Create a sessionization flow for web analytics.
Step 1: Create Flow
Create flows/user_sessions.toml:
id = "user_sessions"
name = "User Sessionization"
flow_type = "sessionization"
[input]
primary_key = ["event_id"]
columns = ["event_id", "user_id", "event_time", "event_type", "page_url"]
[properties]
session_timeout_minutes = 30
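The session_timeout_minutes setting is easiest to understand with a small example. The Python sketch below shows gap-based sessionization under one assumption: a new session starts when the gap since the user's previous event is strictly greater than the timeout. The function assign_sessions and the session label format are hypothetical; the models Qarion ETL generates may define session boundaries differently.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Tuple

Event = Tuple[str, str, datetime]  # (event_id, user_id, event_time)


def assign_sessions(events: List[Event], timeout_minutes: int = 30) -> Dict[str, str]:
    """Map each event_id to a session label such as "u1-1".

    A new session starts when the gap since the user's previous event
    exceeds the timeout. Hypothetical helper, not Qarion ETL's implementation.
    """
    timeout = timedelta(minutes=timeout_minutes)
    last_seen: Dict[str, datetime] = {}
    session_no: Dict[str, int] = {}
    result: Dict[str, str] = {}
    # Process events per user in chronological order.
    for event_id, user_id, ts in sorted(events, key=lambda e: (e[1], e[2])):
        previous = last_seen.get(user_id)
        if previous is None or ts - previous > timeout:
            session_no[user_id] = session_no.get(user_id, 0) + 1
        last_seen[user_id] = ts
        result[event_id] = f"{user_id}-{session_no[user_id]}"
    return result


# Made-up events for illustration.
events = [
    ("e1", "u1", datetime(2024, 1, 15, 9, 0)),
    ("e2", "u1", datetime(2024, 1, 15, 9, 20)),  # 20-minute gap: same session
    ("e3", "u1", datetime(2024, 1, 15, 10, 5)),  # 45-minute gap: new session
    ("e4", "u2", datetime(2024, 1, 15, 9, 10)),  # different user: own session
]

sessions = assign_sessions(events)
print(sessions)
```

With a 30-minute timeout, e1 and e2 share a session, e3 opens a second session for the same user, and e4 belongs to a separate user's first session.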
Step 2: Generate Code
qarion-etl generate-code --format dbt --flow user_sessions --output-dir dbt_project --dialect postgres
Step 3: Review Generated Code
Check the generated dbt models in dbt_project/models/.
Tutorial 4: Quick Start — Standard Flow End-to-End
Create a standard flow that loads CSV data, transforms it, and exports results.
Step 1: Initialize and Create Flow
qarion-etl init --project-name quick_start
cd quick_start
# Copy the example flow
cp examples/flows/standard.toml flows/my_first_flow.toml
Step 2: Prepare Sample Data
Create data/users.csv:
id,name,email,created_at
1,Alice,alice@example.com,2024-01-15
2,Bob,,2024-02-20
3,Charlie,charlie@example.COM,2024-03-10
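The sample rows include an empty email (Bob) and a mixed-case domain (Charlie), which gives a transformation step something to clean. The Python sketch below reads the same rows and shows one possible cleanup: flagging missing emails and lowercasing the rest. These rules are assumptions for illustration; the transformations actually applied are whatever examples/flows/standard.toml defines, which is not shown here.

```python
import csv
import io

SAMPLE_CSV = """id,name,email,created_at
1,Alice,alice@example.com,2024-01-15
2,Bob,,2024-02-20
3,Charlie,charlie@example.COM,2024-03-10
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

# Rows whose email field is empty (Bob in the sample data).
missing_email_ids = [row["id"] for row in rows if not row["email"]]

# Lowercased emails, with empty values mapped to None.
normalized = {row["id"]: (row["email"].lower() or None) for row in rows}

print(missing_email_ids)
print(normalized)
```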
Step 3: Build, Validate, and Run
# Validate your flow definition
qarion-etl validate-config
# Build datasets and migrations
qarion-etl build
# Run the flow (the flow ID comes from the id field inside the copied file, not from the file name)
qarion-etl trigger --flow-id example_standard --batch-id 1
Step 4: Check for Optimizations
qarion-etl suggest-optimizations --flow-id example_standard
Tutorial 5: Data Quality — Freshness and Quality Checks
Set up quality checks with alerting for a data table.
Step 1: Define Quality Check Flow
Create flows/user_quality.toml:
id = "user_quality"
name = "User Data Quality"
flow_type = "quality_check"
[input]
columns = ["id", "email", "age", "updated_at"]
primary_key = ["id"]
[properties]
source_table = "users"
[[properties.checks]]
id = "email_complete"
type = "completeness"
columns = ["email"]
severity = "error"
[[properties.checks]]
id = "data_freshness"
type = "freshness"
columns = ["updated_at"]
severity = "warning"
[properties.checks.config]
timestamp_column = "updated_at"
max_age_hours = 24
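The two check types above can be read as simple predicates over a column. The Python sketch below shows one plausible interpretation: completeness as the fraction of non-empty values, and freshness as whether the newest timestamp is within max_age_hours of the current time. Both functions are hypothetical and are not Qarion ETL's implementation, and the pass/fail threshold for completeness is not specified in the flow above.

```python
from datetime import datetime, timedelta
from typing import List, Optional


def completeness(values: List[Optional[str]]) -> float:
    """Fraction of values that are neither None nor an empty string."""
    if not values:
        return 1.0
    filled = sum(1 for v in values if v not in (None, ""))
    return filled / len(values)


def is_fresh(timestamps: List[datetime], now: datetime, max_age_hours: int) -> bool:
    """True when the most recent timestamp is within max_age_hours of now."""
    if not timestamps:
        return False
    return now - max(timestamps) <= timedelta(hours=max_age_hours)


# Made-up column values for illustration.
emails = ["alice@example.com", "", "charlie@example.com", None]
updated_at = [datetime(2024, 3, 10, 8, 0), datetime(2024, 3, 10, 20, 0)]
now = datetime(2024, 3, 11, 12, 0)

email_completeness = completeness(emails)
fresh = is_fresh(updated_at, now, max_age_hours=24)
print(email_completeness)  # 0.5: two of the four values are filled
print(fresh)               # True: the newest row is 16 hours old
```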
Step 2: Run Quality Checks with Alerting
# Run the quality suite
qarion-etl run-quality-checks --suite-id user_quality --alert-channel log
# Schedule recurring checks (runs every 6 hours)
qarion-etl schedule-quality-check \
--suite-id user_quality \
--cron "0 */6 * * *" \
--alert-channel file \
--alert-file quality_alerts.jsonl
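The cron expression "0 */6 * * *" fires at minute 0 of every hour divisible by 6, that is at 00:00, 06:00, 12:00, and 18:00. The Python sketch below computes the next run time for this one expression, which can help when you are waiting for the first scheduled alert. The function is hypothetical, handles only this expression, and ignores time zones.

```python
from datetime import datetime, timedelta


def next_six_hourly_run(now: datetime) -> datetime:
    """Next time strictly after `now` matching the cron expression "0 */6 * * *".

    Handles only this one expression; it is not a general cron parser.
    """
    # Start from the top of the next hour, then advance to an hour divisible by 6.
    candidate = now.replace(minute=0, second=0, microsecond=0) + timedelta(hours=1)
    while candidate.hour % 6 != 0:
        candidate += timedelta(hours=1)
    return candidate


print(next_six_hourly_run(datetime(2024, 3, 10, 10, 15)))  # 2024-03-10 12:00:00
print(next_six_hourly_run(datetime(2024, 3, 10, 18, 0)))   # 2024-03-11 00:00:00
```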
Tutorial 6: Multi-Engine — Same Flow, Different Engines
Run the same flow on SQLite and then DuckDB to compare engines.
Step 1: Create a Flow
Use the same standard flow from Tutorial 4.
Step 2: Run on SQLite (Default)
# qarion.toml already defaults to SQLite
qarion-etl trigger --flow-id example_standard --batch-id 1
Step 3: Switch to DuckDB
Edit qarion.toml:
[engine]
type = "duckdb"
path = "data/warehouse.db"
Step 4: Run the Same Flow on DuckDB
qarion-etl trigger --flow-id example_standard --batch-id 1
Both runs should produce the same result, because the flow definition is engine-agnostic.
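If you want to verify that the two engines agree, one approach is to export the output rows from each run and compare them without regard to row order, since engines are free to return rows in different orders. The Python sketch below compares two row lists as multisets. The rows are made-up placeholders, and how you fetch them from SQLite or DuckDB is left out.

```python
from collections import Counter
from typing import Iterable, Tuple


def same_rows(left: Iterable[Tuple], right: Iterable[Tuple]) -> bool:
    """True when both results hold the same rows with the same multiplicity, ignoring order."""
    return Counter(left) == Counter(right)


# Placeholder rows standing in for the output of each engine.
sqlite_rows = [(1, "Alice"), (2, "Bob"), (3, "Charlie")]
duckdb_rows = [(3, "Charlie"), (1, "Alice"), (2, "Bob")]  # same rows, different order

print(same_rows(sqlite_rows, duckdb_rows))      # True
print(same_rows(sqlite_rows, duckdb_rows[:2]))  # False: one row is missing
```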