Master Data Architecture
Qarion's Master Data Management system provides version-controlled reference and master datasets with a draft/publish lifecycle, schema editing, row management, and governed exports.
Core Concepts
Datasets
A dataset represents a reference or master data collection (e.g., country codes, product hierarchies). Each dataset has:
- Schema — Column definitions with types, constraints, and foreign keys
- Rows — The actual data records
- Versions — Immutable point-in-time snapshots
- Drafts — Editable working copies
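As a rough illustration of how a schema and its rows relate, the sketch below models a dataset with plain dataclasses. The class names echo the models listed under Key Files, but every field name (`data_type`, `required`, `foreign_key`) and the `validate_row` helper are assumptions made for this example, not the actual model definitions.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass(frozen=True)
class DatasetColumn:
    name: str
    data_type: str                     # e.g. "string" or "integer" (assumed type names)
    required: bool = False             # a simple example of a constraint
    foreign_key: Optional[str] = None  # hypothetical "dataset.column" reference


@dataclass
class ManagedDataset:
    name: str
    columns: list[DatasetColumn] = field(default_factory=list)
    rows: list[dict[str, Any]] = field(default_factory=list)


def validate_row(dataset: ManagedDataset, row: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the row fits the schema."""
    known = {col.name for col in dataset.columns}
    problems = [f"unknown column: {key}" for key in row if key not in known]
    for col in dataset.columns:
        if col.required and row.get(col.name) in (None, ""):
            problems.append(f"missing required column: {col.name}")
    return problems


countries = ManagedDataset(
    name="country_codes",
    columns=[
        DatasetColumn("code", "string", required=True),
        DatasetColumn("name", "string", required=True),
        DatasetColumn("region", "string", foreign_key="regions.code"),
    ],
)
```

Calling `validate_row(countries, {"code": "DE"})` reports the missing `name` column, which is the kind of check a service layer might run before accepting a row.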
Versioning
Each publish creates an immutable snapshot:
- Current schema and rows are frozen as a version
- Version number increments automatically
- Previous versions remain accessible for comparison and audit
- Version comparison shows added/modified/deleted rows
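The publish and comparison behaviour described above can be sketched in a few lines. This is a minimal in-memory illustration that covers rows only (a real snapshot would also freeze the schema); `VersionHistory`, `compare`, and the keyed-row layout are hypothetical names chosen for the example.

```python
import copy
from typing import Any

Rows = dict[str, dict[str, Any]]  # row key -> row values (assumed layout)


class VersionHistory:
    def __init__(self) -> None:
        self._versions: list[Rows] = []

    def publish(self, live_rows: Rows) -> int:
        """Freeze a deep copy of the live rows and return the new version number."""
        self._versions.append(copy.deepcopy(live_rows))
        return len(self._versions)

    def get(self, number: int) -> Rows:
        # Hand out a copy so callers cannot mutate the stored snapshot.
        return copy.deepcopy(self._versions[number - 1])


def compare(old: Rows, new: Rows) -> dict[str, list[str]]:
    """Classify row keys as added, deleted, or modified between two versions."""
    return {
        "added": sorted(new.keys() - old.keys()),
        "deleted": sorted(old.keys() - new.keys()),
        "modified": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
    }


history = VersionHistory()
live = {"DE": {"name": "Germany"}, "FR": {"name": "France"}}
v1 = history.publish(live)

# Later edits to the live rows do not affect version 1.
live["DE"]["name"] = "Deutschland"
del live["FR"]
live["IT"] = {"name": "Italy"}
v2 = history.publish(live)

diff = compare(history.get(v1), history.get(v2))
```

Copying on both publish and read is what keeps earlier versions stable for audit in this sketch; a database-backed implementation would more likely rely on append-only tables.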
Drafts
Drafts allow batch editing before publishing:
- Create a draft from the current state
- Edit rows and schema in the draft
- Apply the draft to update the live dataset
- Publish to create a new version
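The four steps above might look like the following in code. This is a simplified sketch under the assumption that a draft is an isolated copy of the live rows; the `Dataset` class and its method names are illustrative and are not taken from the service layer.

```python
import copy
from typing import Any

Rows = dict[str, dict[str, Any]]  # row key -> row values (assumed layout)


class Dataset:
    def __init__(self, rows: Rows) -> None:
        self.live: Rows = rows
        self.versions: list[Rows] = []

    def create_draft(self) -> Rows:
        """Step 1: copy the current state so edits stay isolated."""
        return copy.deepcopy(self.live)

    def apply_draft(self, draft: Rows) -> None:
        """Step 3: replace the live rows with the draft's contents."""
        self.live = copy.deepcopy(draft)

    def publish(self) -> int:
        """Step 4: freeze the live rows as a new version and return its number."""
        self.versions.append(copy.deepcopy(self.live))
        return len(self.versions)


dataset = Dataset({"DE": {"name": "Germany"}})

draft = dataset.create_draft()
draft["AT"] = {"name": "Austria"}               # Step 2: batch edits in the draft
live_before_apply = sorted(dataset.live)        # the live dataset is still unchanged

dataset.apply_draft(draft)
version_number = dataset.publish()
```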
Space Scoping
All datasets are scoped to spaces, the same access model used for data products. Authorization is enforced at the space level, and every mutation is recorded in the audit log.
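One way to picture space-level enforcement is a guard that checks membership before running a mutation and records it afterwards. The sketch below is an assumption-laden illustration: `SpaceGuard`, the `editors` map, and the audit entry fields are invented for the example and do not describe Qarion's actual authorization code.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable, TypeVar

T = TypeVar("T")


@dataclass
class SpaceGuard:
    # Hypothetical membership map: space name -> users allowed to mutate it.
    editors: dict[str, set[str]]
    audit_log: list[dict[str, str]] = field(default_factory=list)

    def mutate(self, user: str, space: str, action: str, fn: Callable[[], T]) -> T:
        """Run fn only if the user may edit the space, then record the mutation."""
        if user not in self.editors.get(space, set()):
            raise PermissionError(f"{user} may not modify datasets in space {space!r}")
        result = fn()
        self.audit_log.append({
            "user": user,
            "space": space,
            "action": action,
            "at": datetime.now(timezone.utc).isoformat(),
        })
        return result


guard = SpaceGuard(editors={"finance": {"alice"}})
rows: dict[str, str] = {}
guard.mutate("alice", "finance", "row.create", lambda: rows.update({"DE": "Germany"}))
```

A call such as `guard.mutate("bob", "finance", ...)` raises `PermissionError` before the mutation runs, so nothing is written and no audit entry is added.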
Import/Export
- CSV Import — Bulk load rows with upsert or replace strategies
- CSV/JSON Export — Download current or versioned row data
- Schema Export — Column definitions as structured metadata
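The difference between the two import strategies can be sketched with the standard `csv` module. The semantics here are assumptions: upsert is taken to mean "overwrite rows whose key matches, insert new ones, keep the rest", and replace is taken to mean "discard all existing rows". The `import_csv` function and its `key_column` parameter are hypothetical.

```python
import csv
import io
from typing import Any

Rows = dict[str, dict[str, Any]]  # row key -> row values (assumed layout)


def import_csv(existing: Rows, csv_text: str, key_column: str,
               strategy: str = "upsert") -> Rows:
    """Return the rows that result from loading csv_text with the given strategy."""
    if strategy not in ("upsert", "replace"):
        raise ValueError(f"unknown strategy: {strategy}")

    # Index incoming records by their key column; all values arrive as strings.
    incoming: Rows = {}
    for record in csv.DictReader(io.StringIO(csv_text)):
        incoming[record[key_column]] = dict(record)

    if strategy == "replace":
        return incoming

    # Upsert: start from a copy of the existing rows, then overlay the import.
    merged = {key: dict(values) for key, values in existing.items()}
    merged.update(incoming)
    return merged


existing = {
    "DE": {"code": "DE", "name": "Germany"},
    "FR": {"code": "FR", "name": "France"},
}
csv_text = "code,name\nDE,Deutschland\nIT,Italy\n"

upserted = import_csv(existing, csv_text, key_column="code", strategy="upsert")
replaced = import_csv(existing, csv_text, key_column="code", strategy="replace")
```

With this input, the upsert result keeps `FR`, updates `DE`, and adds `IT`, while the replace result contains only `DE` and `IT`.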
Key Files
| File | Purpose |
|---|---|
| app/api/endpoints/master_data.py | All API endpoints (66 operations) |
| app/services/master_data_service.py | Business logic and validation |
| app/models/master_data.py | ManagedDataset, DatasetVersion, DatasetRow, DatasetColumn |
| app/schemas/master_data.py | Request/response Pydantic schemas |