Quality Framework
Qarion's quality framework provides automated monitoring of data health through configurable checks that run on a schedule, detect problems, and surface alerts when something goes wrong. This page explains the quality dimensions the platform measures, the check types available, how severity and scheduling work, and how quality integrates with the broader governance model.
Quality Dimensions
Data quality is a multi-faceted concept. Qarion organizes it into five key dimensions, each addressing a different aspect of data health:
| Dimension | Question | How Qarion Measures |
|---|---|---|
| Freshness | Is the data current? | Timestamp age checks |
| Completeness | Is all expected data present? | Row counts, null checks |
| Uniqueness | Are there duplicates? | Primary key validation |
| Validity | Does data match expected formats? | Custom SQL assertions |
| Consistency | Is data aligned across sources? | Cross-product checks |
Not every product needs checks across all five dimensions, but thinking in these terms helps teams identify the most important quality risks for each dataset and prioritize monitoring accordingly.
Quality Check Types
Freshness
Freshness checks monitor how recently a dataset was updated. They work by examining a timestamp column (typically updated_at or loaded_at) and comparing the most recent value to a maximum age threshold. If the data is older than the threshold, the check fails.
{
  "check_type": "freshness",
  "config": {
    "timestamp_column": "updated_at",
    "max_age_hours": 24
  }
}
Freshness checks are often the single most valuable type of quality monitoring, because stale data is a symptom of almost every pipeline failure — whether it's a crashed job, a blocked dependency, or a permissions issue. If you only set up one check per product, make it a freshness check.
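Under the hood the evaluation is simple: take the newest timestamp and compare its age to the threshold. A minimal sketch, assuming a hypothetical run_query helper that executes SQL and returns a timezone-aware datetime (this illustrates the logic, not Qarion's implementation):

```python
from datetime import datetime, timezone

def evaluate_freshness(run_query, table, timestamp_column, max_age_hours):
    # run_query is a hypothetical helper that runs SQL and returns a scalar
    # (here assumed to be a timezone-aware datetime, or None for an empty table).
    latest = run_query(f"SELECT MAX({timestamp_column}) FROM {table}")
    if latest is None:
        return False  # assumption: an empty table counts as stale
    age_hours = (datetime.now(timezone.utc) - latest).total_seconds() / 3600
    return age_hours <= max_age_hours
```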
Row Count
Row count checks validate that the number of rows in a dataset falls within expected bounds. You specify minimum and maximum thresholds, and the check fails if the actual count falls outside that range.
{
  "check_type": "row_count",
  "config": {
    "min_rows": 1000,
    "max_rows": 1000000
  }
}
Unexpected drops in row count often indicate that a data load failed or was only partially completed, while unexpected spikes can signal duplicate ingestion or a runaway upstream process. Row count checks serve as a simple but effective sanity check on pipeline health.
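The pass condition itself is just a bounds check on a single count (a sketch, reusing the hypothetical run_query helper from the freshness example above):

```python
def evaluate_row_count(run_query, table, min_rows, max_rows):
    count = run_query(f"SELECT COUNT(*) FROM {table}")
    # Fails on drops below min_rows or spikes above max_rows.
    return min_rows <= count <= max_rows
```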
Uniqueness
Uniqueness checks detect duplicate records by examining a specific column — typically the primary key or a business key that should contain only distinct values.
{
  "check_type": "uniqueness",
  "config": {
    "column_name": "id"
  }
}
The check passes only when zero duplicates are found. Duplicates can cause double-counting in aggregations, incorrect joins, and misleading metrics, so uniqueness checks are especially important for products that serve as dimension tables or that feed into financial reporting.
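Conceptually the pass condition is a single aggregate: the number of non-distinct values in the column must be zero (a sketch; the table and helper names are illustrative, not Qarion's internals):

```python
def evaluate_uniqueness(run_query, table, column_name):
    # Non-null values minus distinct non-null values = number of duplicate rows.
    duplicates = run_query(
        f"SELECT COUNT({column_name}) - COUNT(DISTINCT {column_name}) FROM {table}"
    )
    return duplicates == 0  # passes only when zero duplicates are found
```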
Not Null
Not null checks ensure that required fields are populated. They examine a single column and fail if any null values are found.
{
  "check_type": "not_null",
  "config": {
    "column_name": "customer_id"
  }
}
This check type is essential for fields that downstream consumers depend on — such as foreign keys, required identifiers, and critical business attributes. A null value in a join key, for example, can silently drop rows from downstream query results without producing any visible error.
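The corresponding assertion is again a one-line aggregate (same hypothetical helper as above; a sketch, not the platform's internals):

```python
def evaluate_not_null(run_query, table, column_name):
    nulls = run_query(f"SELECT COUNT(*) FROM {table} WHERE {column_name} IS NULL")
    return nulls == 0  # fails if any null values are found
```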
Custom SQL
Custom SQL checks provide maximum flexibility by letting you write any SQL assertion and compare the result against an expected value. The query should return a single numeric value, which is then compared using a configurable operator.
{
  "check_type": "custom_sql",
  "config": {
    "sql_query": "SELECT COUNT(*) FROM orders WHERE amount < 0",
    "expected_value": 0,
    "comparison": "equals"
  }
}
Supported comparison operators include equals, greater_than, less_than, and between. Custom SQL checks are the right choice for business rule validation (e.g., "no orders should have negative amounts"), cross-table consistency checks (e.g., "the sum of line items should match the order total"), and any quality rule that can't be expressed by the built-in check types.
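One way to picture the comparison step is as a small operator dispatch (a sketch; in particular, the assumption that between takes a two-element [low, high] pair as its expected value is not documented behavior):

```python
def compare(actual, expected, comparison):
    # Maps the documented comparison operators onto Python comparisons.
    if comparison == "equals":
        return actual == expected
    if comparison == "greater_than":
        return actual > expected
    if comparison == "less_than":
        return actual < expected
    if comparison == "between":
        low, high = expected  # assumption: expected_value is a [low, high] pair
        return low <= actual <= high
    raise ValueError(f"unknown comparison: {comparison}")

# The negative-amounts rule above passes when compare(query_result, 0, "equals") is True.
```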
Severity Levels
Each quality check is assigned a severity level that determines how its failures are prioritized and communicated:
| Level | Icon | When to Use |
|---|---|---|
| Info | ℹ️ | Minor issues, FYI only |
| Warning | ⚠️ | Degraded quality, needs attention |
| Critical | 🚨 | Data unusable, immediate action required |
Severity affects how alerts appear in the platform — critical alerts are surfaced prominently in dashboards and may trigger immediate notifications, while info-level alerts are logged for trend analysis but don't demand immediate attention. Choosing the right severity is important: if everything is marked as critical, teams quickly develop alert fatigue and start ignoring notifications entirely.
Scheduling
Quality checks run on a schedule defined by a cron expression. The schedule should align with the cadence of the data pipeline that produces the product:
| Schedule | Cron | Use Case |
|---|---|---|
| Every hour | 0 * * * * | Streaming data, fast-moving pipelines |
| Daily 6 AM | 0 6 * * * | Batch pipelines, daily loads |
| Weekly Monday | 0 0 * * 1 | Weekly aggregations |
| Every 15 min | */15 * * * * | Real-time monitoring |
Running checks too frequently wastes compute resources and can generate noisy alerts during windows when the pipeline is still running normally; running them too infrequently means problems go undetected for longer. A good rule of thumb is to schedule the check shortly after the pipeline is expected to complete: for a daily batch job that finishes at 5 AM, scheduling a freshness check at 6 AM gives a reasonable buffer while still catching failures quickly.
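Before saving a schedule, it can help to preview the next few run times a cron expression will produce. A sketch using the third-party croniter package (not part of Qarion):

```python
from datetime import datetime, timezone
from croniter import croniter  # third-party: pip install croniter

def preview_runs(cron_expr, count=3, start=None):
    # Returns the next `count` run times after `start` (defaults to now, UTC).
    itr = croniter(cron_expr, start or datetime.now(timezone.utc))
    return [itr.get_next(datetime) for _ in range(count)]

# Daily 6 AM check, paired with a batch job that finishes around 5 AM:
print(preview_runs("0 6 * * *"))
```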
Quality Score
Each data product has an aggregated quality score that provides a quick summary of its overall health. The score is calculated as the ratio of passing checks to total checks:
Quality Score = Passing Checks / Total Checks
This score maps to a visual health indicator in the catalog:
| Score | Status | Color |
|---|---|---|
| ≥ 0.95 | Healthy | 🟢 Green |
| 0.80 - 0.94 | Warning | 🟡 Yellow |
| < 0.80 | Critical | 🔴 Red |
The quality score is a useful signal for data consumers deciding whether to trust a dataset, and it can also be referenced in data contracts as an SLA term (for example, requiring that a product maintain a minimum quality score of 0.95).
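In code, the score and its mapping to a health band look roughly like this (a sketch; how the platform treats products with zero checks, and how it rounds at the band edges, are assumptions):

```python
def quality_score(passing_checks, total_checks):
    if total_checks == 0:
        return None  # assumption: a product with no checks has no score
    return passing_checks / total_checks

def health_status(score):
    # Band thresholds from the table above; exact boundary handling is an assumption.
    if score >= 0.95:
        return "Healthy"   # green
    if score >= 0.80:
        return "Warning"   # yellow
    return "Critical"      # red

score = quality_score(19, 20)
print(score, health_status(score))  # 0.95 Healthy
```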
Alert Lifecycle
When a quality check fails, the platform creates an alert and tracks it through a lifecycle:
An alert begins in the Open state, indicating that it needs attention. When someone begins investigating, they can move it to Acknowledged to signal that the problem is being worked on. Once the underlying issue is fixed and the check passes again, the alert transitions to Resolved — either through manual intervention or automatically when the next scheduled run succeeds.
For problems that require more formal tracking, open alerts can be escalated into issues, which provide a richer workflow with assignments, priorities, and comment threads.
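The lifecycle described above can be summarized as a small set of allowed transitions (a sketch based on this description; the state names and the exact transitions the API permits are assumptions):

```python
# Open -> Acknowledged -> Resolved, with resolution also possible straight from
# Open (e.g. when the next scheduled run succeeds), and escalation to an issue
# available while an alert is still open.
ALERT_TRANSITIONS = {
    "open": {"acknowledged", "resolved", "escalated"},
    "acknowledged": {"resolved"},
    "resolved": set(),
}

def can_transition(current, target):
    return target in ALERT_TRANSITIONS.get(current, set())
```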
Quality Checks via API
Create Check
To create a new quality check, send a POST request with the check configuration:
POST /quality/checks
{
  "name": "Orders Freshness",
  "product_id": "product-uuid",
  "check_type": "freshness",
  "config": {
    "timestamp_column": "updated_at",
    "max_age_hours": 24
  },
  "schedule": "0 * * * *",
  "severity": "critical"
}
Trigger Manual Run
You can trigger a check to run immediately, outside its scheduled cadence. This is useful for validating that a newly created check works correctly or for re-checking after a fix:
POST /quality/checks/{id}/run
Get Check History
To review the recent execution history of a check, use the history endpoint with an optional days parameter:
GET /quality/checks/{id}/history?days=7
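Putting the three endpoints together, a typical setup-and-verify flow looks like this (a sketch using the requests library; the base URL, bearer-token auth, and the id field in the create response are assumptions about your deployment):

```python
import requests

BASE_URL = "https://qarion.example.com/api"    # assumption: your deployment's base URL
HEADERS = {"Authorization": "Bearer <token>"}  # assumption: token-based auth

# 1. Create the check.
check = requests.post(f"{BASE_URL}/quality/checks", headers=HEADERS, json={
    "name": "Orders Freshness",
    "product_id": "product-uuid",
    "check_type": "freshness",
    "config": {"timestamp_column": "updated_at", "max_age_hours": 24},
    "schedule": "0 * * * *",
    "severity": "critical",
}).json()

# 2. Trigger a manual run to verify the new check works.
requests.post(f"{BASE_URL}/quality/checks/{check['id']}/run", headers=HEADERS)

# 3. Review the last week of results.
history = requests.get(
    f"{BASE_URL}/quality/checks/{check['id']}/history",
    headers=HEADERS,
    params={"days": 7},
).json()
print(history)
```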
Integration with Governance
Quality monitoring doesn't exist in isolation — it connects to the broader governance framework at several points.
Steward Responsibility
The product Steward is the person most directly responsible for quality. They define the quality rules and thresholds, set appropriate severity levels, and are the first responder when quality alerts are triggered. By connecting quality checks to governance roles, Qarion ensures that quality monitoring has a clear human owner.
Contract SLAs
Data contracts between producers and consumers can include explicit quality requirements as part of their SLA terms:
{
  "sla": {
    "quality_score_min": 0.95
  }
}
When a product's quality score drops below the contracted threshold, the platform automatically tracks the violation, making quality a measurable and enforceable commitment — not just a best-effort aspiration.
Issue Integration
Critical alerts that require structured investigation and resolution can be escalated into formal issues. This creates a bridge between automated quality monitoring and the human-driven issue management workflow, ensuring that significant quality problems are tracked to resolution with full accountability.
Best Practices
Start with Freshness
If you're setting up quality monitoring for the first time, begin by adding a freshness check to every product. Freshness failures are the most common and most impactful data quality problem, and a single freshness check catches a wide variety of pipeline failures.
Layer Your Checks
As your quality program matures, build up monitoring in tiers:
Tier 1: Freshness + Row Count (all products)
Tier 2: Uniqueness + Not Null (critical fields)
Tier 3: Custom SQL (business rules)
This tiered approach ensures comprehensive coverage for important products without overwhelming your team with alerts on assets that don't warrant close monitoring.
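Tier 1 also lends itself to automation: because the create-check endpoint is uniform, you can roll out a freshness and a row count check for every product in a short loop (a sketch; the product list, thresholds, base URL, and auth details are placeholders you would supply):

```python
import requests

BASE_URL = "https://qarion.example.com/api"    # assumption: deployment-specific
HEADERS = {"Authorization": "Bearer <token>"}  # assumption: token-based auth

# Placeholder inventory: the products you want Tier 1 coverage for.
products = [{"id": "product-uuid", "name": "Orders"}]

for product in products:
    for payload in (
        {"check_type": "freshness",
         "config": {"timestamp_column": "updated_at", "max_age_hours": 24}},
        {"check_type": "row_count",
         "config": {"min_rows": 1, "max_rows": 100_000_000}},
    ):
        requests.post(f"{BASE_URL}/quality/checks", headers=HEADERS, json={
            "name": f"{product['name']} {payload['check_type']}",
            "product_id": product["id"],
            "schedule": "0 6 * * *",   # align with your pipeline's cadence
            "severity": "warning",
            **payload,
        })
```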
Right-Size Severity
Reserve the Critical severity for genuine emergencies — situations where data is unusable and downstream processes are at risk. Use Warning for degraded quality that needs attention but isn't an emergency, and Info for monitoring trends over time. If most of your alerts are critical, none of them effectively are.
Test Your Checks
Before enabling a new check on a production schedule, run it manually a few times to verify that the logic works correctly. Check historical data to confirm that the thresholds you've set would have produced reasonable results over the past few days or weeks, and adjust thresholds based on normal variance to avoid false positives.
Related
- DQ Config (YAML) — Define quality checks as code using YAML configuration files
- Drift Detection Guide — Implement continuous monitoring for AI systems
- Quality API — Endpoint reference
- Quality Automation — Programmatic setup
- Data Model — Entity relationships