Configuration Reference
Complete reference for Qarion ETL configuration options.
Configuration File
Qarion ETL uses a config.toml file for project configuration.
Engine Configuration
Qarion ETL supports two types of engines:
- Processing Engine ([engine]): Required. Used for data transformations and processing.
- Metadata Engine ([metadata_engine]): Optional. Used for storing metadata in database storage. Defaults to the processing engine if not specified.
Processing Engine
The processing engine is configured in the [engine] section:
SQLite
[engine]
name = "sqlite"
[engine.config]
path = "data/qarion-etl.db"
Pandas In-Memory
[engine]
name = "pandas_memory"
[engine.config]
# No configuration required
Pandas Local Storage
[engine]
name = "pandas_local"
[engine.config]
storage_dir = "data/pandas"
DuckDB
[engine]
name = "duckdb"
[engine.config]
path = "data/qarion-etl.duckdb"
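The engine tables above can be checked mechanically before a run. The following is a minimal sketch, not part of Qarion ETL: `ENGINE_KEYS` and `check_engine` are hypothetical names, and treating the config keys shown above as required is an assumption.

```python
from typing import Any

# Config keys shown for each engine in this reference. Treating them as
# required is an assumption of this sketch.
ENGINE_KEYS: dict[str, tuple[str, ...]] = {
    "sqlite": ("path",),
    "pandas_memory": (),
    "pandas_local": ("storage_dir",),
    "duckdb": ("path",),
}


def check_engine(engine: dict[str, Any]) -> list[str]:
    """Return a list of problems found in a parsed [engine] table."""
    name = engine.get("name")
    if name not in ENGINE_KEYS:
        return [f"unknown engine: {name!r}"]
    settings = engine.get("config", {})
    return [
        f"{name}: missing config key {key!r}"
        for key in ENGINE_KEYS[name]
        if key not in settings
    ]


print(check_engine({"name": "duckdb", "config": {"path": "data/qarion-etl.duckdb"}}))
```

An empty list means the table matches one of the four engine layouts documented here.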
Metadata Engine
The metadata engine is optional and configured in the [metadata_engine] section. If not specified, the processing engine is used for metadata storage.
When to configure a separate metadata engine:
- Using database storage for metadata (dataset_storage = "database", flow_storage = "database", etc.)
- Separating processing workloads from metadata management
- Using different engines optimized for different purposes
Example: Separate Metadata Engine
# Use database storage (top-level keys must come before any [table] header)
dataset_storage = "database"
flow_storage = "database"
# Processing engine
[engine]
name = "pandas_memory"
# Metadata engine (for database storage)
[metadata_engine]
name = "sqlite"
[metadata_engine.config]
path = "data/metadata.db"
Note: If metadata_engine is not specified and you're using database storage, the processing engine will be used for metadata operations.
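The fallback described in this section can be sketched in a few lines. This is an illustration only: `resolve_metadata_engine` is a hypothetical helper, not a Qarion ETL function, and the dictionaries stand in for the parsed contents of the configuration file.

```python
from typing import Any


def resolve_metadata_engine(config: dict[str, Any]) -> dict[str, Any]:
    """Return the engine settings to use for metadata operations.

    Hypothetical helper: mirrors the documented fallback, where
    [metadata_engine] wins when present and [engine] is used otherwise.
    """
    if "engine" not in config:
        raise KeyError("[engine] is required")
    return config.get("metadata_engine") or config["engine"]


# Parsed form of a config with a separate metadata engine.
separate = {
    "engine": {"name": "pandas_memory"},
    "metadata_engine": {"name": "sqlite", "config": {"path": "data/metadata.db"}},
}
# Parsed form of a config with only the required processing engine.
shared = {"engine": {"name": "duckdb", "config": {"path": "data/qarion-etl.duckdb"}}}

print(resolve_metadata_engine(separate)["name"])  # sqlite
print(resolve_metadata_engine(shared)["name"])    # duckdb
```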
Repository Storage Configuration
Local Storage
Store metadata (datasets, flows, migrations) in local files:
[dataset_storage]
type = "local"
config = { dataset_dir = "datasets" }
[flow_storage]
type = "local"
config = { flow_dir = "flows" }
[migration_storage]
type = "local"
config = { migration_dir = "migrations" }
Database Storage
Store metadata in database tables. When using database storage, you can optionally configure a separate [metadata_engine] for metadata operations. If not specified, the processing engine ([engine]) is used.
# Use database storage (top-level keys must come before any [table] header)
dataset_storage = "database"
flow_storage = "database"
schema_storage = "database"
# Processing engine (for data transformations)
[engine]
name = "pandas_memory"
# Metadata engine (optional - for metadata storage)
[metadata_engine]
name = "sqlite"
[metadata_engine.config]
path = "data/metadata.db"
Note: The metadata_engine configuration is used when dataset_storage, flow_storage, or schema_storage is set to "database". If metadata_engine is not specified, the processing engine is used for metadata operations.
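This reference shows storage settings in two shapes: a bare string ("database") and a table with a type key. A minimal sketch of how a tool might normalize both and decide whether the metadata engine is needed follows. The function names are hypothetical, and treating a missing setting as "local" is an assumption rather than documented behavior.

```python
from typing import Any

STORAGE_KEYS = ("dataset_storage", "flow_storage", "schema_storage")


def storage_type(value: Any) -> str:
    """Normalize a storage setting to its type string.

    Accepts a bare string such as "database", or a table such as
    {"type": "local", "config": {...}}. Treating a missing setting
    as "local" is an assumption of this sketch.
    """
    if value is None:
        return "local"
    if isinstance(value, str):
        return value
    return value.get("type", "local")


def uses_database_storage(config: dict[str, Any]) -> bool:
    """True if any of the three storage settings is set to "database"."""
    return any(storage_type(config.get(key)) == "database" for key in STORAGE_KEYS)


print(uses_database_storage({"dataset_storage": "database"}))        # True
print(uses_database_storage({"flow_storage": {"type": "local"}}))    # False
```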
Schema History Storage
Local Schema History
Read schema history from migration files:
[schema_storage]
type = "local"
config = { migration_dir = "migrations" }
Database Schema History
Store schema history in a database:
[schema_storage]
type = "database"
[schema_storage.config]
connection_string = "sqlite:///metadata.db"
namespace = "xt"
Storage Backends (Input Files)
Storage backends are automatically detected from file paths. For S3:
[properties.input_ingestion]
path = "s3://my-bucket/data/"
pattern = "orders_*.csv"
[properties.input_ingestion.credentials]
aws_access_key_id = "your-access-key"
aws_secret_access_key = "your-secret-key"
region_name = "us-east-1"
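Detection from a path usually comes down to inspecting the URL scheme. The sketch below shows one way that could work. `detect_backend` and the backend labels "s3" and "local" are hypothetical, and the actual detection rules in Qarion ETL may differ.

```python
from urllib.parse import urlparse


def detect_backend(path: str) -> str:
    """Guess a storage backend name from a path's URL scheme."""
    scheme = urlparse(path).scheme.lower()
    if scheme == "s3":
        return "s3"
    # No scheme, file://, or a single-letter "scheme" that is really a
    # Windows drive letter are all treated as local paths.
    if scheme in ("", "file") or len(scheme) == 1:
        return "local"
    raise ValueError(f"unsupported storage scheme: {scheme!r}")


print(detect_backend("s3://my-bucket/data/"))  # s3
print(detect_backend("data/input/"))           # local
```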
Flow Loading Configuration
CSV Loader
[flow.load]
type = "csv"
delimiter = ","
header = true
JSON Loader
[flow.load]
type = "json"
Parquet Loader
[flow.load]
type = "parquet"
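To make the loader options concrete, here is a minimal sketch of dispatching on the type key using only the standard library. `load_rows` is a hypothetical function, the defaults for delimiter and header are assumptions, and Parquet is left out because it needs a third-party reader.

```python
import csv
import io
import json
from typing import Any, TextIO


def load_rows(stream: TextIO, load_config: dict[str, Any]) -> list[dict[str, Any]]:
    """Read rows from a text stream according to a [flow.load]-style table."""
    kind = load_config["type"]
    if kind == "csv":
        delimiter = load_config.get("delimiter", ",")
        if load_config.get("header", True):
            return list(csv.DictReader(stream, delimiter=delimiter))
        # Without a header row, name the columns col_0, col_1, ...
        return [
            {f"col_{i}": cell for i, cell in enumerate(row)}
            for row in csv.reader(stream, delimiter=delimiter)
        ]
    if kind == "json":
        data = json.load(stream)
        return data if isinstance(data, list) else [data]
    raise ValueError(f"no loader sketched for type {kind!r}")


sample = io.StringIO("id;total\n1;9.50\n2;12.00\n")
rows = load_rows(sample, {"type": "csv", "delimiter": ";", "header": True})
print(rows[0])  # {'id': '1', 'total': '9.50'}
```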
Fernet Key
A Fernet encryption key is automatically generated on project initialization:
fernet_key = "gAAAAABh..." # Automatically generated, never commit to version control
Important:
- Generated automatically when you run qarion-etl new-project or qarion-etl init
- Required for credential encryption
- Never commit to version control: add qarion-etl.toml to .gitignore
- Each project should have its own unique key
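For reference, a Fernet key is 32 random bytes encoded as URL-safe base64. The sketch below generates one with the standard library, which is equivalent to what `Fernet.generate_key()` in the `cryptography` package returns. How Qarion ETL generates its key internally is not stated here, so `generate_fernet_key` is only an illustration.

```python
import base64
import os


def generate_fernet_key() -> str:
    """Return a new Fernet-compatible key as text.

    A Fernet key is 32 random bytes in URL-safe base64, the same shape
    that cryptography.fernet.Fernet.generate_key() produces.
    """
    return base64.urlsafe_b64encode(os.urandom(32)).decode("ascii")


key = generate_fernet_key()
print(len(key))  # 44
```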
Credential Store Configuration
Local Keystore
[credential_store]
type = "local_keystore"
[credential_store.config]
keystore_path = "~/.qarion_etl/credentials.keystore" # Optional
# fernet_key is automatically loaded from project config
Database Store
[credential_store]
type = "database"
[credential_store.config]
engine = { name = "sqlite", config = { path = "metadata.db" } }
table_name = "xt_credentials" # Optional
# fernet_key is automatically loaded from project config
AWS SSM Parameter Store
[credential_store]
type = "aws_ssm"
[credential_store.config]
parameter_prefix = "/qarion_etl/credentials/" # Optional, default: /qarion_etl/credentials/
region_name = "us-east-1"
kms_key_id = "alias/my-credentials-key" # Optional, uses default SSM key if not provided
Credential Definitions
[[credentials]]
id = "my_aws_creds"
name = "AWS Production Credentials"
credential_type = "aws"
description = "AWS credentials for production"
[credentials.metadata]
environment = "production"
Credential Types:
- aws: AWS credentials
- database: Database credentials
- api_key: API key credentials
- oauth: OAuth credentials
- basic_auth: Basic authentication
- custom: Custom credential type
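A credential definition can be checked against the type list above before it is stored. This is a minimal sketch: `validate_credential` is a hypothetical helper, and treating id, name, and credential_type as required fields is an assumption.

```python
from typing import Any

CREDENTIAL_TYPES = {"aws", "database", "api_key", "oauth", "basic_auth", "custom"}
# Treating these three fields as required is an assumption of this sketch.
REQUIRED_FIELDS = ("id", "name", "credential_type")


def validate_credential(entry: dict[str, Any]) -> list[str]:
    """Return a list of problems found in one [[credentials]] entry."""
    problems = [
        f"missing field: {field}" for field in REQUIRED_FIELDS if field not in entry
    ]
    ctype = entry.get("credential_type")
    if ctype is not None and ctype not in CREDENTIAL_TYPES:
        problems.append(f"unknown credential_type: {ctype}")
    return problems


entry = {
    "id": "my_aws_creds",
    "name": "AWS Production Credentials",
    "credential_type": "aws",
    "metadata": {"environment": "production"},
}
print(validate_credential(entry))  # []
```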
Environment Variables
Configuration files support environment variable substitution. See Configuration Guide for details.
Quick Reference
- ${VAR_NAME} - Substitute with optional default: ${VAR_NAME:-default}
- $VAR_NAME - Simple substitution (no default)
- XTRANSACT_CONFIG_PATH - Override configuration file path
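The substitution forms above can be illustrated with a small regular-expression sketch. `substitute` is a hypothetical function, not the Qarion ETL implementation. It applies the default only when the variable is unset, and it leaves unresolved references untouched; both choices are assumptions.

```python
import re
from typing import Mapping

# Matches ${VAR}, ${VAR:-default}, or bare $VAR.
_PATTERN = re.compile(
    r"\$\{(?P<braced>[A-Za-z_][A-Za-z0-9_]*)(?::-(?P<default>[^}]*))?\}"
    r"|\$(?P<bare>[A-Za-z_][A-Za-z0-9_]*)"
)


def substitute(text: str, env: Mapping[str, str]) -> str:
    """Replace variable references in text with values from env."""

    def replace(match: re.Match) -> str:
        name = match.group("braced") or match.group("bare")
        if name in env:
            return env[name]
        default = match.group("default")
        # Leaving unresolved references untouched is an assumption of this sketch.
        return default if default is not None else match.group(0)

    return _PATTERN.sub(replace, text)


print(substitute("${HOST:-localhost}:$PORT", {"PORT": "5432"}))  # localhost:5432
```

In real use, `os.environ` can be passed as the `env` mapping.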
Note: For production environments, consider using the Credential Store instead of environment variables for better security and management.
Related Documentation
- Configuration Guide
- Credential Management - Complete credential management guide
- Engines and Storage - Detailed guide on engines and storage
- Getting Started