AI Ops
AI Ops gives administrators a live operational view of Qarion AI usage, failures, guardrails, workflow runs, evaluations, and authoring readiness.
Open Admin -> AI Ops.
Pipeline Authoring Readiness
The Pipeline Authoring readiness section summarizes whether generated-code workflows can run safely in the current instance. Use it before rolling out or debugging AI-assisted implementation. It combines runtime settings, recent workflow evidence, failure signals, and reliability audit findings.
The top status is intentionally conservative:
| Status | Meaning |
|---|---|
| Ready | Required rollout controls are configured and recent evidence is healthy. |
| Attention | The workflow can run, but reliability, coverage, or rollout posture needs review. |
| Blocked | A required control is missing, unsupported, or failing closed. |
| Unknown | Qarion does not yet have enough recent activity or definitions to grade the signal. |
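One way to read "intentionally conservative" is that the least healthy signal decides the top status. The sketch below assumes that rule, plus an assumed severity order (Ready, then Unknown, then Attention, then Blocked). Neither is confirmed by Qarion, and `Status` and `overall_status` are hypothetical names for illustration, not a product API.

```python
from enum import IntEnum


class Status(IntEnum):
    # Assumed severity order: higher values are less healthy.
    READY = 0
    UNKNOWN = 1
    ATTENTION = 2
    BLOCKED = 3


def overall_status(check_statuses):
    """Return the least healthy status; with no checks at all, report UNKNOWN."""
    statuses = list(check_statuses)
    if not statuses:
        return Status.UNKNOWN
    return max(statuses)


print(overall_status([Status.READY, Status.ATTENTION, Status.READY]).name)
```

Under this assumption a single Blocked check is enough to block the whole section, and an instance with no evidence never reports Ready.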
Key runtime chips include:
| Signal | Meaning |
|---|---|
| Code writer | Whether dedicated code-writer routing is enabled for executable Pipeline Authoring changes. |
| Code writer source | Which setting or environment source selected the code-writer state. |
| Sandbox | Whether generated-code sandbox validation is enabled. |
| Backend | The generated-code sandbox backend, such as Docker or Kubernetes. |
| Image | Whether the generated-code sandbox image is configured. |
| Namespace | Whether the Kubernetes sandbox namespace is configured when Kubernetes is used. |
| Timeout | The maximum generated-code sandbox execution time. |
| DB sandbox | Whether database sandbox validation is enabled for generated SQL/database work. |
| DB backend | The database sandbox backend, currently expected to be SQLite when enabled. |
| Dependency smoke | Whether dependency smoke validation is enabled. |
| Package fetch | The validation package-fetch policy. |
| Public index | Whether public package index access is configured. |
| Qarion base URL | Whether validation package fetch can resolve the Qarion package endpoint. |
| Profiler | Whether Pipeline Authoring performance profiling is enabled. |
| Slow module | The module load threshold that creates profiler warnings. |
| CPU spike | The CPU spike threshold that creates profiler warnings. |
| Memory growth | The memory growth threshold that creates profiler warnings. |
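The three profiler thresholds can be pictured as simple upper bounds on a trace sample. The following is a minimal sketch under that assumption; the field names, units, and dictionary keys are hypothetical and may not match the real settings or trace format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProfilerThresholds:
    # Hypothetical field names and units; the real settings may differ.
    slow_module_seconds: float
    cpu_spike_percent: float
    memory_growth_mb: float


def profiler_warnings(sample, thresholds):
    """Compare one trace sample (a dict) against the thresholds and list breaches."""
    warnings = []
    if sample.get("module_load_seconds", 0.0) > thresholds.slow_module_seconds:
        warnings.append("slow module")
    if sample.get("cpu_percent", 0.0) > thresholds.cpu_spike_percent:
        warnings.append("cpu spike")
    if sample.get("memory_growth_mb", 0.0) > thresholds.memory_growth_mb:
        warnings.append("memory growth")
    return warnings


thresholds = ProfilerThresholds(
    slow_module_seconds=2.0, cpu_spike_percent=90.0, memory_growth_mb=256.0
)
sample = {"module_load_seconds": 3.5, "cpu_percent": 40.0, "memory_growth_mb": 300.0}
print(profiler_warnings(sample, thresholds))
```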
Readiness Checks
Readiness checks combine runtime configuration, recent failures, and coding-agent diagnostics. Statuses are intended to be operational:
- Ready means the check has the expected configuration or recent evidence.
- Attention means the workflow can still run, but rollout or reliability needs review.
- Blocked means the feature is missing required configuration or is failing closed.
- Unknown means there is not enough recent evidence for that check.
Common checks include:
| Check | What to review |
|---|---|
| Review integrity | Persisted plan reviews, saved-file records, failed-file reviews, and review drift. |
| Failure signals | Recent blocked or degraded examples from the reliability audit. |
| Active jobs | Pipeline Authoring chat jobs that appear stale or stuck. |
| Code generation specialist | Whether dedicated code-writer routing is intentionally enabled or disabled. |
| Validation runtime | Generated-code sandbox settings and database sandbox settings. |
| Package fetch | Whether validation can install dependencies from Qarion repositories and, when allowed, a public index. |
| Dependency smoke | Whether generated dependencies are smoke-tested before rollout. |
| Performance profiler | CPU, memory, and module timing diagnostics in recent traces. |
| Recent AI activity | Failed, blocked, invalid-output, degraded, or running Pipeline Authoring incidents. |
| Prompt/cache health | Cache-aware prompt layout and stable prompt section evidence. |
| Memory freshness | Bounded conversation, clarification, and workspace memory provenance. |
| Subagent/tool policy | Read-only subagent evidence, denied tool-call recording, and tool-policy diagnostics. |
| Reliability trace health | Structured contract hashes, validation tiers, checkpoint coverage, repair counters, and terminal status. |
| Run trace health | Normalized workflow run traces for recent Pipeline Authoring activity. |
When a check has a link target, use it to jump to the related AI logs, workflow runs, jobs, or system settings. Metadata chips show compact evidence, such as selected backend, enabled state, source setting, or warning counts.
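As a rough mental model, each readiness check can be treated as a record with a status, an optional link target, and a small set of metadata chips. The sketch below assumes that shape and an assumed review order (Blocked first, then Attention, then Unknown). `ReadinessCheck`, `review_queue`, and the example route are hypothetical and do not describe Qarion's actual data model.

```python
from dataclasses import dataclass, field
from typing import Optional

# Assumed review order: most urgent first. Ready checks are not queued.
REVIEW_ORDER = {"Blocked": 0, "Attention": 1, "Unknown": 2}


@dataclass
class ReadinessCheck:
    name: str
    status: str  # "Ready", "Attention", "Blocked", or "Unknown"
    link_target: Optional[str] = None  # hypothetical route to logs, runs, or settings
    metadata: dict = field(default_factory=dict)  # compact evidence chips


def review_queue(checks):
    """Return the checks that are not Ready, most urgent first."""
    pending = [c for c in checks if c.status in REVIEW_ORDER]
    return sorted(pending, key=lambda c: REVIEW_ORDER[c.status])


checks = [
    ReadinessCheck("Dependency smoke", "Attention", metadata={"enabled": False}),
    ReadinessCheck("Validation runtime", "Blocked", "/admin/settings",
                   {"backend": "unsupported"}),
    ReadinessCheck("Review integrity", "Ready"),
]
for check in review_queue(checks):
    print(check.name, check.status, check.link_target, check.metadata)
```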
Failure Signals
Failure signals show recent Pipeline Authoring examples that need review. They can include stuck jobs, failed generated-code validation, sandbox failures, guardrail blocks, missing package access, command-approval pauses, or repeated repair loops.
Use the linked workflow, log, or authoring session to inspect the affected workspace. The AI Ops page should provide enough context to triage the failure without exposing raw prompts, secrets, or unsanitized command output.
Failure signals are separate from readiness findings. Findings describe persisted review or saved-file drift. Failure signals describe recent workflow behavior, such as validation failures, stale resume checkpoints, sandbox failures, blocked evidence gates, denied tool calls, or subagent budget caps.
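The distinction between findings and failure signals can be sketched as a lookup on the kind of item. The kind identifiers below are hypothetical labels derived from the categories named above; they are not Qarion's actual event names.

```python
from collections import Counter

# Hypothetical kind identifiers; real event names may differ.
FINDING_KINDS = {"review_drift", "saved_file_drift"}
FAILURE_SIGNAL_KINDS = {
    "validation_failure",
    "stale_resume_checkpoint",
    "sandbox_failure",
    "blocked_evidence_gate",
    "denied_tool_call",
    "subagent_budget_cap",
}


def classify(kind):
    """Label an item as a readiness finding or a failure signal."""
    if kind in FINDING_KINDS:
        return "finding"
    if kind in FAILURE_SIGNAL_KINDS:
        return "failure signal"
    return "unclassified"


recent = ["sandbox_failure", "review_drift", "sandbox_failure", "denied_tool_call"]
print(Counter(classify(kind) for kind in recent))
```

Counting by class gives a quick view of whether an instance is dominated by persisted drift or by recent workflow behavior.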
Rollout Checklist
Before enabling broad generated-code rollout:
- Confirm code-writer routing is intentionally enabled or intentionally disabled.
- Enable sandbox validation and verify the backend and image are configured.
- Enable database sandbox validation when generated SQL or database-facing support files are part of the rollout.
- Enable dependency smoke checks when package fetch and sandbox execution are available.
- Set validation package fetch to qarion_only or qarion_plus_public when generated dependencies need package installation.
- Confirm private package repositories, public-index access, and the Qarion base URL match the package policy.
- Review prompt/cache health, memory freshness, subagent policy, reliability trace health, and run trace health for warnings.
- Enable the performance profiler during rollout windows when CPU, memory, or module load behavior is part of the risk.
- Investigate recent failure signals before increasing rollout.
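Parts of the checklist above can be expressed as a preflight function over a settings dictionary. This is a minimal sketch: every key name is hypothetical, only the qarion_only and qarion_plus_public policy values and the Docker, Kubernetes, and SQLite backends come from this page, and the checks that require human review (trace health, failure signals) are left out.

```python
SUPPORTED_SANDBOX_BACKENDS = {"docker", "kubernetes"}
PACKAGE_FETCH_POLICIES = {"qarion_only", "qarion_plus_public"}


def rollout_gaps(settings):
    """Return checklist items that the given settings dict does not satisfy."""
    gaps = []
    sandbox_on = bool(settings.get("sandbox_enabled"))
    if not sandbox_on:
        gaps.append("enable sandbox validation")
    elif settings.get("sandbox_backend") not in SUPPORTED_SANDBOX_BACKENDS:
        gaps.append("use a Docker or Kubernetes sandbox backend")
    elif not settings.get("sandbox_image"):
        gaps.append("configure the sandbox image")

    if settings.get("uses_generated_sql") and not settings.get("db_sandbox_enabled"):
        gaps.append("enable database sandbox validation")

    policy = settings.get("package_fetch")
    fetch_on = policy in PACKAGE_FETCH_POLICIES
    if settings.get("needs_package_install") and not fetch_on:
        gaps.append("set package fetch to qarion_only or qarion_plus_public")
    if fetch_on and not settings.get("qarion_base_url"):
        gaps.append("set the Qarion base URL")
    if fetch_on and sandbox_on and not settings.get("dependency_smoke_enabled"):
        gaps.append("enable dependency smoke checks")
    return gaps


example = {
    "sandbox_enabled": True,
    "sandbox_backend": "docker",
    "sandbox_image": "",
    "package_fetch": "qarion_only",
    "needs_package_install": True,
    "qarion_base_url": "",
    "dependency_smoke_enabled": True,
}
for gap in rollout_gaps(example):
    print("-", gap)
```

An empty result only means the configuration items pass; the review items in the checklist still need a person to look at the AI Ops page.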
Troubleshooting
"Code writer is disabled" means executable changes use the main planning path. Enable the dedicated code-writer setting only when the specialist path is ready for rollout.
"Sandbox validation is disabled" means generated code is not executed in the sandbox before review. Enable sandbox validation before relying on generated code for broader teams.
"Sandbox backend is unsupported" means the configured backend is not one of the supported runtime backends. Use Docker or Kubernetes and configure the matching image and runtime settings.
"Database sandbox backend is unsupported" means database sandbox validation is enabled with a backend Qarion cannot run safely. Use the supported SQLite-backed database sandbox, or disable database sandbox validation until the backend is ready.
"Package fetch is disabled" means validation cannot install external dependencies required by generated code. Use qarion_only for private Qarion package repositories or qarion_plus_public when public-index access is allowed.
"Package fetch is enabled but the Qarion base URL is missing" means generated dependency installation cannot resolve private Qarion package endpoints. Set the validation package base URL before relying on private package dependencies.
"Dependency smoke is disabled" means dependencies may be selected but are not import-smoke-tested before review. Enable it once package fetch and the sandbox runtime are ready.
"Performance profiler warnings appear" means recent traces exceeded the module load, CPU spike, or memory growth thresholds. Treat these as rollout signals and inspect the linked workflow before assuming the model response is the root cause.
"Prompt/cache or memory freshness warnings appear" means recent authoring runs may be missing stable prompt layout, context freshness, or recovery state. Check the linked workflow runs before treating the issue as a model-quality problem.
"Reliability trace warnings appear" means recent runs are missing structured contract evidence, validation-tier evidence, checkpoint coverage, sandbox results, repair counters, or the expected terminal status. Review the run trace and the replay/eval output before increasing rollout.
Related Guides
- Pipeline Authoring for the user-facing review and validation flow.
- Authoring Overview for shared workspace and execution controls.
- Artifact Repositories for private package and OCI dependencies.
- Notebook Workers for dedicated notebook runtime monitoring.
- AI-Assisted Capabilities for the broader AI feature map.