Skip to main content

Database Design

Qarion uses PostgreSQL as its primary relational database, managed via SQLAlchemy (ORM) and Alembic (migrations).

Schema Principles

1. Multi-Tenancy

The schema is designed for multi-tenancy. Most resources belong to a Space or an Organization.

  • Space Isolation: Critical data (Datasets, Quality Checks) is scoped to a Space ID.
  • Cross-Tenant Aggregation: Tables like audit_logs or alerts are designed to allow efficient querying across spaces for organization-level administrators.

2. Polymorphism (Standard #330-PolymorphicIntegrity)

We use a polymorphic foreign key pattern for shared resources like Comments, Attachments, and Annotations.

  • Implementation: A main table (e.g., comments) has a resource_type and resource_id.
  • Integrity: While database-level FK constraints are difficult with true polymorphism, we enforce integrity via application-level checks and "Shadow Foreign Key" patterns where applicable for performance.

3. Provider Models

For integrating with external systems (Data Warehouses, Alerting Tools), we use discriminator-based inheritance.

  • Base Table: SourceSystem (common fields: name, type, active).
  • Provider specifics: Stored in JSONB fields (config) or separate 1-to-1 tables if the schema is complex.

Migration Strategy

Database changes are strictly versioned using Alembic.

  • Versioning: Every schema change must have a corresponding migration script.
  • Offline Compatible: Migrations generate SQL scripts that can be reviewed and applied by DBAs if necessary.
  • No Auto-Generation: While Alembic can auto-generate changes, we manually review all migrations to ensure efficient index usage and prevent locking issues.

Consistency Standards

Standard #330-SL: Relationship Hardening

All relationships are defined with lazy='raise' by default. Developers must explicitly opt-in to loading relationships.

Binary Sanitization (Standard #185)

To prevent encoding errors and large payload issues, binary fields and potentially non-UTF-8 text fields are recursively sanitized before being returned in API responses.