Flow Variables
A comprehensive guide to passing variables to flows in Qarion ETL, enabling dynamic configuration and parameterization of flow execution.
Overview
Flow variables allow you to pass dynamic values to flows at execution time, making flows reusable and configurable. Variables are available in:
- Flow templates (Jinja2-style templating)
- Task configurations
- SQL queries
- File paths
- Any template-enabled configuration
Quick Start
Basic Variable Passing
qarion-etl trigger --flow-id my_flow --var environment=production --var region=us-east-1
Using Variables in Flow Templates
# flows/my_flow.toml
id = "my_flow"
name = "My Flow"
flow_type = "standard"
# Variables can be used in templates
[[tasks]]
id = "ingest_data"
type = "ingestion"
target_dataset_id = "data_{{ environment }}"
[tasks.config]
file_path = "s3://bucket-{{ region }}/data/input.csv"
CLI Usage
Trigger Command with Variables
# Single variable
qarion-etl trigger --flow-id my_flow --var key=value
# Multiple variables
qarion-etl trigger --flow-id my_flow \
--var environment=production \
--var region=us-east-1 \
--var batch_size=1000
# Variables with spaces (use quotes)
qarion-etl trigger --flow-id my_flow \
--var message="Hello World" \
--var description="Data processing flow"
Variable Types
Variables are automatically parsed based on their values:
# String (default)
--var name=John
# Number (integer)
--var count=123
# Number (float)
--var ratio=0.95
# Boolean
--var enabled=true
--var disabled=false
# Null
--var optional=null
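The coercion rules above can be sketched in Python. This is an illustrative helper only (`parse_var` is a hypothetical name, not part of the Qarion ETL API); it mirrors the documented behavior: booleans, null, integers, and floats are detected, and everything else stays a string.

```python
def parse_var(raw: str):
    """Sketch of the documented --var value coercion rules."""
    lowered = raw.lower()
    if lowered in ("true", "false"):   # Boolean
        return lowered == "true"
    if lowered in ("null", "none"):    # Null
        return None
    try:
        return int(raw)                # Integer
    except ValueError:
        pass
    try:
        return float(raw)              # Float
    except ValueError:
        pass
    return raw                         # String (default)

print(parse_var("123"), parse_var("0.95"), parse_var("true"), parse_var("John"))
```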
Variable Formats
String Variables
--var name=value
--var message="value with spaces"
--var path=/data/files/
Number Variables
--var count=100 # Integer
--var ratio=0.95 # Float
--var threshold=1000 # Integer
Boolean Variables
--var enabled=true
--var disabled=false
Null Variables
--var optional=null
--var missing=none
Using Variables in Flows
In Flow Definition
Variables can be defined in the flow definition and overridden at execution time:
# flows/my_flow.toml
id = "my_flow"
flow_type = "standard"
# Default variables (can be overridden)
[variables]
environment = "development"
region = "us-west-2"
batch_size = 100
[[tasks]]
id = "process_data"
type = "transformation"
source_dataset_id = "data_{{ environment }}"
In Task Configurations
[[tasks]]
id = "export_data"
type = "export"
[tasks.properties]
destination = "s3://bucket-{{ region }}/output/"
format = "parquet"
[tasks.properties.export_config]
batch_size = {{ batch_size }}
In SQL Queries
[[tasks]]
id = "filter_data"
type = "transformation"
[tasks.config]
sql = """
SELECT *
FROM source_table
WHERE region = '{{ region }}'
AND environment = '{{ environment }}'
AND batch_id = {{ batch_id }}
"""
In File Paths
[[tasks]]
id = "load_file"
type = "ingestion"
[tasks.config]
file_path = "s3://data-bucket/{{ environment }}/{{ region }}/input.csv"
Variable Precedence
Variables are merged in the following order (later values override earlier ones):
- Flow-level variables (defined in flow definition)
- CLI-provided variables (from --var arguments)
- System metadata (batch_id, execution_date)
CLI-provided variables take precedence over flow-level variables.
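The merge order above can be illustrated with plain dictionary updates. This is a conceptual sketch, not the actual Qarion ETL implementation; `merge_variables` and the sample values are hypothetical.

```python
def merge_variables(flow_vars: dict, cli_vars: dict, system_vars: dict) -> dict:
    """Illustrates the documented merge order: later updates override earlier ones."""
    merged = dict(flow_vars)    # 1. flow-level defaults from the flow definition
    merged.update(cli_vars)     # 2. --var arguments override defaults
    merged.update(system_vars)  # 3. system metadata (batch_id, execution_date)
    return merged

flow_vars = {"environment": "development", "region": "us-west-2"}
cli_vars = {"environment": "production"}
system_vars = {"batch_id": 42}

merged = merge_variables(flow_vars, cli_vars, system_vars)
# environment comes from the CLI; region keeps its flow-level default
```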
Examples
Example 1: Environment-Specific Configuration
# Production
qarion-etl trigger --flow-id data_pipeline \
--var environment=production \
--var region=us-east-1 \
--var s3_bucket=prod-data-bucket
# Development
qarion-etl trigger --flow-id data_pipeline \
--var environment=development \
--var region=us-west-2 \
--var s3_bucket=dev-data-bucket
Flow definition:
id = "data_pipeline"
flow_type = "standard"
[[tasks]]
id = "ingest"
type = "ingestion"
[tasks.config]
file_path = "s3://{{ s3_bucket }}/{{ environment }}/input/"
Example 2: Dynamic Batch Processing
qarion-etl trigger --flow-id batch_processor \
--var batch_size=5000 \
--var max_retries=3 \
--var timeout=300
Flow definition:
[[tasks]]
id = "process_batch"
type = "transformation"
[tasks.config]
sql = """
SELECT *
FROM source_table
LIMIT {{ batch_size }}
"""
[tasks.config.retry_config]
max_retries = {{ max_retries }}
timeout = {{ timeout }}
Example 3: Date-Based Processing
qarion-etl trigger --flow-id daily_report \
--var report_date="2024-01-15" \
--var include_weekend=false
Flow definition:
[[tasks]]
id = "generate_report"
type = "export"
[tasks.properties]
destination = "s3://reports/{{ report_date }}/daily_report.csv"
[tasks.config]
query = """
SELECT *
FROM daily_data
WHERE date = '{{ report_date }}'
{% if not include_weekend %}
AND DAYOFWEEK(date) NOT IN (1, 7)
{% endif %}
"""
Example 4: Multi-Environment Deployment
# Staging
qarion-etl trigger --flow-id deploy \
--var env=staging \
--var db_host=staging-db.example.com \
--var api_key="${STAGING_API_KEY}"
# Production
qarion-etl trigger --flow-id deploy \
--var env=production \
--var db_host=prod-db.example.com \
--var api_key="${PROD_API_KEY}"
System Variables
The following variables are automatically available in all flows:
- batch_id: Current batch ID (integer)
- execution_date: Execution date/time (datetime object)
These are always available and don't need to be passed via --var.
Variable Access in Templates
Variables are available in Jinja2 templates throughout the flow:
# In file paths
file_path = "s3://bucket/{{ environment }}/{{ date }}/data.csv"
# In SQL queries
sql = "SELECT * FROM {{ table_prefix }}_data WHERE region = '{{ region }}'"
# In conditions
{% if environment == 'production' %}
# Production-specific configuration
{% endif %}
# In loops
{% for region in regions %}
# Process each region
{% endfor %}
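To make the substitution step concrete, here is a toy stand-in for the rendering pass, assuming only simple `{{ name }}` placeholders. Real flows use full Jinja2, which additionally supports `{% if %}` / `{% for %}` blocks and filters; `render` here is a hypothetical helper.

```python
import re

def render(template: str, variables: dict) -> str:
    """Replace simple {{ name }} placeholders with values from `variables`."""
    def substitute(match):
        name = match.group(1)
        return str(variables[name])
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)

path = render("s3://bucket/{{ environment }}/data.csv",
              {"environment": "production"})
# → "s3://bucket/production/data.csv"
```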
Best Practices
- Use Descriptive Variable Names:
--var environment=production # Good
--var e=prod # Avoid
- Document Variables:
# flows/my_flow.toml
# Required variables:
# - environment: Deployment environment (production, staging, development)
# - region: AWS region (us-east-1, eu-west-1, etc.)
- Provide Defaults:
[variables]
environment = "development" # Default value
region = "us-west-2" # Default value
- Use Environment Variables:
# Pass environment variables as flow variables
qarion-etl trigger --flow-id my_flow \
--var api_key="${API_KEY}" \
--var db_password="${DB_PASSWORD}"
- Validate Variables:
- Use flow validation to ensure required variables are provided
- Check variable values in flow templates
Troubleshooting
Variable Not Available
Problem: Variable is None or not found in template.
Solution:
- Check variable name spelling
- Ensure the variable is passed via --var
- Verify the variable is merged correctly (CLI variables override flow-level)
Variable Type Issues
Problem: Variable is treated as string when it should be a number.
Solution:
- Numbers are automatically parsed
- For explicit type conversion, use Jinja2 filters:
{{ count | int }}
{{ ratio | float }}
Special Characters
Problem: Variable value contains special characters.
Solution:
- Use quotes for values with spaces or special characters:
--var message="Hello, World!"
--var path="/data/files/"