Observability
This guide covers Qarion's observability infrastructure: structured logging, request correlation, and application metrics.
Overview
Qarion's multi-instance architecture requires first-class observability. Three middleware components work together to provide a unified picture of request processing:
| Component | Middleware | Purpose |
|---|---|---|
| Structured Logging | request_logging_middleware.py | JSON-formatted request/response logs |
| Correlation IDs | correlation_middleware.py | Trace requests across services |
| Metrics | metrics_middleware.py | Prometheus-compatible application metrics |
Structured Logging
All log output uses structured JSON format for machine-readable log aggregation.
Log Format
Each log entry includes:
| Field | Description |
|---|---|
| timestamp | ISO 8601 timestamp |
| level | Log level (INFO, WARNING, ERROR) |
| logger | Python logger name |
| message | Human-readable log message |
| correlation_id | Request correlation ID (see below) |
| method | HTTP method |
| path | Request path |
| status_code | Response status code |
| duration_ms | Request duration in milliseconds |
| user_id | Authenticated user ID (if available) |
| space_id | Space context (if available) |
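To make the field list concrete, here is a minimal sketch of a formatter that emits entries of this shape using only the Python standard library. It is not Qarion's actual implementation: the `JsonFormatter` class, the `REQUEST_FIELDS` tuple, and the way request fields are passed through `extra=` are assumptions made for illustration.

```python
import json
import logging
from datetime import datetime, timezone

# Assumed list of per-request fields, taken from the table above.
REQUEST_FIELDS = ("correlation_id", "method", "path", "status_code",
                  "duration_ms", "user_id", "space_id")


class JsonFormatter(logging.Formatter):
    """Hypothetical formatter: renders each record as one JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Values passed via `extra=` become attributes on the record.
        # Optional fields (user_id, space_id) are omitted when absent.
        for field in REQUEST_FIELDS:
            value = getattr(record, field, None)
            if value is not None:
                entry[field] = value
        return json.dumps(entry)


logger = logging.getLogger("app.middleware.request_logging_middleware")
record = logger.makeRecord(
    logger.name, logging.INFO, "example.py", 0, "request completed", (), None,
    extra={"correlation_id": "my-trace-123", "method": "GET",
           "path": "/health", "status_code": 200, "duration_ms": 12.5},
)
print(JsonFormatter().format(record))
```

Emitting one JSON object per line keeps each entry independently parseable, which is what most log aggregators expect.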
Configuration
Structured logging is enabled by default. Configure the log level via the LOG_LEVEL environment variable:
LOG_LEVEL=INFO # Default
LOG_LEVEL=DEBUG # Include SQL queries and detailed traces
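As an illustration of how an application might read this variable, the sketch below maps LOG_LEVEL to a standard `logging` level. The `resolve_log_level` helper and its fallback to INFO for unrecognized values are assumptions, not documented Qarion behavior.

```python
import logging
import os


def resolve_log_level(env=os.environ) -> int:
    """Hypothetical helper: map LOG_LEVEL to a logging constant."""
    name = env.get("LOG_LEVEL", "INFO").upper()
    level = logging.getLevelName(name)
    # For unknown names getLevelName returns a string such as "Level FOO",
    # so fall back to INFO (an assumed policy) in that case.
    return level if isinstance(level, int) else logging.INFO


logging.basicConfig(level=resolve_log_level())
```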
Log Filtering
API request logs are emitted at INFO level with a distinct logger name, making them easy to filter from SQL query noise. Use your log aggregator's filtering to isolate API activity:
logger:"app.middleware.request_logging_middleware"
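Outside a log aggregator, the same filter can be applied to raw JSON-lines output. The sketch below assumes one JSON object per line with a `logger` field as described in the log format table; `api_entries` is a hypothetical helper name.

```python
import json

API_LOGGER = "app.middleware.request_logging_middleware"


def api_entries(lines, logger_name=API_LOGGER):
    """Yield parsed entries emitted by the given logger, skipping bad lines."""
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate non-JSON lines mixed into the stream
        if isinstance(entry, dict) and entry.get("logger") == logger_name:
            yield entry


sample = [
    '{"logger": "sqlalchemy.engine", "message": "SELECT 1"}',
    '{"logger": "app.middleware.request_logging_middleware", "path": "/health"}',
    "not json",
]
for entry in api_entries(sample):
    print(entry["path"])
```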
Request Correlation
Every incoming request is assigned a unique Correlation ID that propagates through all log entries, database queries, and downstream service calls for that request.
How It Works
- The correlation_middleware checks for an incoming X-Correlation-ID header
- If present, that ID is used; otherwise a new UUID is generated
- The ID is stored in context and attached to all log entries
- The ID is returned in the X-Correlation-ID response header
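The steps above can be sketched as a small ASGI middleware. This is an illustration under stated assumptions, not Qarion's code: it assumes an ASGI application, and the names `CorrelationMiddleware` and `correlation_id_var` are hypothetical. A `contextvars.ContextVar` is one common way to keep a per-request value "in context".

```python
import contextvars
import uuid

# Hypothetical context variable that a log formatter could read from.
correlation_id_var = contextvars.ContextVar("correlation_id", default=None)
HEADER = b"x-correlation-id"  # ASGI header names are lowercase bytes


class CorrelationMiddleware:
    """Sketch: reuse an incoming correlation ID or generate a new UUID."""

    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return

        incoming = dict(scope.get("headers", [])).get(HEADER)
        cid = incoming.decode("latin-1") if incoming else str(uuid.uuid4())
        token = correlation_id_var.set(cid)  # store the ID in context

        async def send_with_header(message):
            # Echo the ID back on the response start message.
            if message["type"] == "http.response.start":
                headers = list(message.get("headers", []))
                headers.append((HEADER, cid.encode("latin-1")))
                message = {**message, "headers": headers}
            await send(message)

        try:
            await self.app(scope, receive, send_with_header)
        finally:
            correlation_id_var.reset(token)
```

Any code running while the request is handled can call `correlation_id_var.get()` to obtain the current ID, for example when formatting a log entry.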
Usage
Pass a correlation ID from your client to trace a request end-to-end:
curl -H "X-Correlation-ID: my-trace-123" https://api.qarion.com/...
All log entries for that request will include "correlation_id": "my-trace-123", enabling you to filter your log aggregator by this value.
Multi-Instance Tracing
In a multi-instance deployment, correlation IDs allow you to trace a single user action across multiple backend instances. The ID travels with the request regardless of which instance handles it.
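As a rough illustration, the sketch below reassembles one request's timeline from the JSON-lines logs of several instances. It assumes every entry carries `correlation_id` and an ISO 8601 `timestamp` with an explicit UTC offset; `trace_request` is a hypothetical helper, and in practice a log aggregator query would do the same job.

```python
import json
from datetime import datetime


def trace_request(correlation_id, *instance_logs):
    """Collect entries for one correlation ID across instances, oldest first."""
    matches = []
    for lines in instance_logs:
        for line in lines:
            entry = json.loads(line)
            if entry.get("correlation_id") == correlation_id:
                matches.append(entry)
    # Sort on the parsed timestamp so entries from different instances interleave.
    return sorted(matches, key=lambda e: datetime.fromisoformat(e["timestamp"]))


instance_a = [
    '{"timestamp": "2024-01-01T10:00:02+00:00", "correlation_id": "my-trace-123", "path": "/b"}',
    '{"timestamp": "2024-01-01T10:00:03+00:00", "correlation_id": "other", "path": "/x"}',
]
instance_b = [
    '{"timestamp": "2024-01-01T10:00:01+00:00", "correlation_id": "my-trace-123", "path": "/a"}',
]
for entry in trace_request("my-trace-123", instance_a, instance_b):
    print(entry["timestamp"], entry["path"])
```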
Application Metrics
Qarion exposes Prometheus-compatible metrics for monitoring request latency, error rates, and throughput.
Metrics Endpoint
GET /metrics
Returns metrics in Prometheus exposition format.
Available Metrics
| Metric | Type | Labels | Description |
|---|---|---|---|
| http_requests_total | Counter | method, path, status | Total HTTP requests |
| http_request_duration_seconds | Histogram | method, path | Request latency distribution |
| http_requests_in_progress | Gauge | (none) | Currently processing requests |
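To show how the three metric types differ, the following standard-library sketch tracks them in memory for a single process. It is a teaching aid rather than the real middleware: the `RequestMetrics` class and the bucket boundaries in `BUCKETS` are assumptions, and a production service would normally use a Prometheus client library instead.

```python
from collections import Counter

# Hypothetical bucket upper bounds in seconds; the real boundaries may differ.
BUCKETS = (0.1, 0.5, 1.0, float("inf"))


class RequestMetrics:
    """In-memory stand-in for the three metrics in the table above."""

    def __init__(self):
        self.requests_total = Counter()    # (method, path, status) -> count
        self.duration_buckets = Counter()  # (method, path, le) -> cumulative count
        self.in_progress = 0               # gauge: can go up and down

    def start(self):
        self.in_progress += 1

    def finish(self, method, path, status, seconds):
        self.in_progress -= 1
        self.requests_total[(method, path, str(status))] += 1
        # Histogram buckets are cumulative: an observation counts toward
        # every bucket whose upper bound is at least the observed value.
        for le in BUCKETS:
            if seconds <= le:
                self.duration_buckets[(method, path, le)] += 1


metrics = RequestMetrics()
metrics.start()
metrics.finish("GET", "/items", 200, 0.3)
print(metrics.requests_total, metrics.in_progress)
```

A counter only ever increases, a gauge reflects the current state, and a histogram counts observations per latency bucket so that quantiles can be estimated later.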
Prometheus Configuration
Add Qarion to your Prometheus scrape configuration:
scrape_configs:
  - job_name: 'qarion'
    scrape_interval: 15s
    metrics_path: '/metrics'
    static_configs:
      - targets: ['qarion-api:8000']
Grafana Dashboards
Use the exposed metrics to build dashboards for:
- Request rate: rate(http_requests_total[5m])
- Error rate: rate(http_requests_total{status=~"5.."}[5m])
- P95 latency: histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m]))
- Concurrent requests: http_requests_in_progress
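To clarify what the P95 query estimates, here is a simplified sketch of quantile estimation from cumulative histogram buckets, using linear interpolation within the bucket that contains the target rank. It mirrors the general approach of histogram_quantile but is not Prometheus's implementation; the bucket boundaries and counts are made-up sample data.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate quantile q from (upper_bound, cumulative_count) pairs.

    `buckets` must be sorted by upper bound and end with the +Inf bucket.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    lower_bound, lower_count = 0.0, 0.0
    for upper, count in buckets:
        if count >= rank:
            if math.isinf(upper):
                # Cannot interpolate into +Inf; report the highest finite bound.
                return lower_bound
            fraction = (rank - lower_count) / (count - lower_count)
            return lower_bound + (upper - lower_bound) * fraction
        lower_bound, lower_count = upper, count


# Sample data: 100 requests, 90 of them at or under 0.5 s, all under 1.0 s.
sample_buckets = [(0.1, 50), (0.5, 90), (1.0, 100), (math.inf, 100)]
print(histogram_quantile(0.95, sample_buckets))
```

Because the estimate is interpolated within a bucket, its precision depends on how finely the bucket boundaries are spaced around the latencies of interest.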
Instance Identification
The instance_middleware tags each request with the instance identifier, useful for debugging load balancer routing and identifying instance-specific issues.
| Header | Description |
|---|---|
| X-Instance-ID | Instance identifier returned in response headers |
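One way to use this header when debugging load balancer routing is to tally which instance answered each of a series of requests. The sketch below works on already-collected response header mappings; `tally_instances` and the instance names are hypothetical.

```python
from collections import Counter


def tally_instances(responses_headers):
    """Count responses per X-Instance-ID value (header names are case-insensitive)."""
    counts = Counter()
    for headers in responses_headers:
        normalized = {name.lower(): value for name, value in headers.items()}
        counts[normalized.get("x-instance-id", "<missing>")] += 1
    return counts


collected = [
    {"X-Instance-ID": "instance-a"},
    {"x-instance-id": "instance-b"},
    {"X-Instance-ID": "instance-a"},
    {"Content-Type": "application/json"},
]
print(tally_instances(collected))
```

A heavily skewed tally can point to sticky sessions or an unhealthy instance being taken out of rotation.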
Best Practices
- Always pass Correlation IDs from client applications to enable end-to-end tracing
- Set LOG_LEVEL=INFO in production to capture API requests without SQL noise
- Scrape /metrics with Prometheus for real-time monitoring and alerting
- Use structured log fields (not free-text search) for log aggregation queries
- Monitor P95 latency to catch performance regressions early