AI Product Evidence
AI product evidence lives on the Product Detail page for catalog items such as ML Models, AI Systems, LLM Agents, AI Apps, and Agents. Use these tabs to keep the operational evidence for an AI product current before conformity review, launch approval, reassessment, or incident investigation.
For the broader EU AI Act workflow, see Governance Evidence Workflows.
Where To Work
Open the AI product from the Data Catalog, then use these Product Detail tabs:
| Tab | Use it for |
|---|---|
| AI Governance | Governance score, completeness checks, drift indicators, evidence counts, runtime event counts, and AI governance review requests. |
| Training Runs | Training, fine-tuning, evaluation, and deployment run history for the model or system. |
| Performance & Safety | Accuracy, robustness, fairness, cybersecurity, and drift metrics with thresholds and attestations. |
| Tech Docs | Annex IV technical documentation sections, attachments, export, auto-population, and AI-assisted change log drafting. |
These tabs complement the product governance tabs for Art. 5 screening, risk register entries, conformity assessments, transparency notices, and GPAI obligations.
AI Governance Tab
The AI Governance tab summarizes whether the product has enough evidence for its current lifecycle gate. It shows:
- A governance score based on completed checks.
- Lifecycle gate and review status.
- Completeness checks for owner, purpose, risk classification, data dependencies, evaluation evidence, policies, monitoring, and review.
- Drift indicators such as overdue review, upstream data changes, missing evaluation evidence, expired approval, unresolved high-risk mitigation, and stale documentation.
- Evidence counts for training runs, safety metrics, risk assessments, conformity assessments, policy evaluations, quality checks, contracts, and active alerts.
- Runtime governance event counts.
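To illustrate a score "based on completed checks", the sketch below assumes the score is the equal-weighted share of completed checks, expressed as a percentage. The scoring formula Qarion actually uses is not described on this page, so the weighting, the check identifiers, and the function names `governance_score` and `missing_checks` are all hypothetical.

```python
# Hypothetical sketch: equal-weighted score over the completeness checks
# listed above. The real formula may weight checks differently.

COMPLETENESS_CHECKS = (
    "owner",
    "purpose",
    "risk_classification",
    "data_dependencies",
    "evaluation_evidence",
    "policies",
    "monitoring",
    "review",
)


def governance_score(completed: set[str]) -> int:
    """Return the percentage of known checks that are complete (0 to 100)."""
    done = sum(1 for check in COMPLETENESS_CHECKS if check in completed)
    return round(100 * done / len(COMPLETENESS_CHECKS))


def missing_checks(completed: set[str]) -> list[str]:
    """List the incomplete checks in display order."""
    return [check for check in COMPLETENESS_CHECKS if check not in completed]
```

Under this assumption, a product with only an owner and a purpose recorded would score 25, and `missing_checks` would name the six remaining checks to complete.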
Use the review request action when one or more active drift indicators need governance follow-up. Qarion creates or reuses an open governance review request for the product so duplicate drift reviews do not pile up.
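The create-or-reuse behavior can be pictured as an idempotent lookup. The following is a minimal in-memory sketch, not Qarion's implementation: `ReviewRequest`, `ReviewQueue`, and `request_review` are hypothetical names, and the rule that at least one active indicator is required is inferred from the text above.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class ReviewRequest:
    id: int
    product_id: str
    indicators: list[str]
    status: str = "open"


@dataclass
class ReviewQueue:
    """In-memory stand-in for wherever review requests are stored (assumption)."""

    requests: list[ReviewRequest] = field(default_factory=list)
    _ids: count = field(default_factory=lambda: count(1))

    def request_review(self, product_id: str, active_indicators: list[str]) -> ReviewRequest:
        if not active_indicators:
            raise ValueError("a review request needs at least one active drift indicator")
        # Reuse an open request for the same product instead of creating a duplicate.
        for req in self.requests:
            if req.product_id == product_id and req.status == "open":
                return req
        req = ReviewRequest(next(self._ids), product_id, list(active_indicators))
        self.requests.append(req)
        return req
```

Calling `request_review` twice for the same product returns the same open request. A new request is created only once the earlier one is no longer open.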
Training Runs
Use Training Runs to record model and AI system execution history that matters for governance. Runs can represent:
- training
- fine_tuning
- evaluation
- deployment
Each run can capture the dataset name, version, and URI; training, validation, and test sample counts; random seed; framework version; environment; hyperparameters; metrics; model artifact URI; duration; notes; start and completion timestamps; status; and deployment status.
Only one run is treated as deployed for a product at a time. When a run is marked as deployed, Qarion clears the deployed flag from the product's other training runs and triggers governance reassessment for the product.
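The single-deployed-run rule is a small invariant that can be sketched as follows. This is an illustration only: `TrainingRun`, its fields, and `mark_deployed` are hypothetical names, and returning a boolean to signal "reassessment needed" is a design choice of the sketch rather than documented Qarion behavior.

```python
from dataclasses import dataclass


@dataclass
class TrainingRun:
    run_id: str
    run_type: str  # one of: training, fine_tuning, evaluation, deployment
    is_deployed: bool = False


def mark_deployed(runs: list[TrainingRun], run_id: str) -> bool:
    """Set the deployed flag on one run and clear it on all others.

    Returns True when any flag changed, which is the point where a caller
    would trigger governance reassessment for the product.
    """
    if not any(run.run_id == run_id for run in runs):
        raise KeyError(f"unknown run: {run_id}")
    changed = False
    for run in runs:
        should_be_deployed = run.run_id == run_id
        if run.is_deployed != should_be_deployed:
            run.is_deployed = should_be_deployed
            changed = True
    return changed
```

Validating the run ID before touching any flags keeps the list unchanged when the ID is unknown, so a typo cannot leave the product with no deployed run.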
Performance & Safety
Use Performance & Safety for measured evidence about model behavior. Metrics can be grouped by:
| Category | Typical evidence |
|---|---|
| accuracy | Benchmark score, test accuracy, precision, recall, or business KPI fit. |
| robustness | Stress tests, adversarial checks, fallback behavior, or resilience measures. |
| fairness | Demographic parity, equalized odds, or subgroup performance. |
| cybersecurity | Prompt-injection, abuse, sandbox, or exploit-resistance findings. |
| drift | Data drift, concept drift, embedding drift, or model-output drift. |
Metric status values are healthy, warning, critical, and not_available. Warning and critical thresholds let teams compare current values with expected operating bounds. Add the test dataset, dataset version, methodology, demographic group, external URL, notes, and measurement timestamp when they help reviewers reproduce or understand the result.
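One way to picture how thresholds relate to the four status values is the sketch below. It assumes the direction of "worse" can be inferred from the thresholds themselves: when the critical threshold is above the warning threshold, higher values are worse (for example, a drift score), and when it is below, lower values are worse (for example, accuracy). How Qarion actually derives status is not specified here, and `metric_status` is a hypothetical helper.

```python
from typing import Optional


def metric_status(value: Optional[float], warning: float, critical: float) -> str:
    """Classify a metric value against warning and critical thresholds.

    Assumption: direction is inferred from the threshold order.
    critical > warning means higher values are worse;
    critical < warning means lower values are worse.
    """
    if value is None:
        return "not_available"
    if critical == warning:
        raise ValueError("warning and critical thresholds must differ")
    if critical > warning:
        # Higher is worse, e.g. a drift score.
        if value >= critical:
            return "critical"
        if value >= warning:
            return "warning"
    else:
        # Lower is worse, e.g. accuracy.
        if value <= critical:
            return "critical"
        if value <= warning:
            return "warning"
    return "healthy"
```

For an accuracy metric with a warning threshold of 0.90 and a critical threshold of 0.80, a measured value of 0.85 would be classified as `warning` under these assumptions.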
Attestation records who reviewed the metric and when. Use it after the evidence has been checked, not while the metric is still being drafted.
Technical Documentation
Use Tech Docs to maintain Annex IV documentation for the product. The shipped sections are:
| Section | Evidence focus |
|---|---|
| General Description | Purpose, provider, version history, and interaction with other systems. |
| System Architecture | Design choices, algorithms, components, and computational resources. |
| Data Governance | Training data origin, preparation, labeling, and quality measures. |
| Risk Management | Risks, mitigations, residual risk, and foreseeable misuse. |
| Human Oversight | Oversight mechanisms, instructions, and limitations. |
| Performance & Testing | Accuracy, robustness, benchmark, and test-methodology evidence. |
| Change Log | Modifications after the initial documentation baseline. |
The tab can save the documentation back to the product's AI details, attach supporting files, export the documentation, auto-populate known product context into selected sections, and generate draft change log text from product snapshots, audit logs, version milestones, and training run history. Treat generated change log text as a draft that model owners or compliance reviewers edit before relying on it.
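Before exporting, it can help to check which sections are still empty. The sketch below assumes the documentation is available as a mapping from section name to text. That shape and the helper name `empty_sections` are hypothetical and do not reflect a documented export format.

```python
# Section names follow the table above. The dict-of-strings shape is an
# assumption made for illustration.
ANNEX_IV_SECTIONS = (
    "General Description",
    "System Architecture",
    "Data Governance",
    "Risk Management",
    "Human Oversight",
    "Performance & Testing",
    "Change Log",
)


def empty_sections(doc: dict[str, str]) -> list[str]:
    """Return the names of sections with no content, in table order."""
    return [name for name in ANNEX_IV_SECTIONS if not doc.get(name, "").strip()]
```

Whitespace-only text counts as empty, so a section that was opened and saved without content is still reported as a gap.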
Operating Practices
- Refresh training runs after material training, fine-tuning, evaluation, or deployment events.
- Keep a run marked as deployed only while it represents the current production or production-like model state.
- Add performance and safety metrics before a conformity review and whenever thresholds, datasets, or monitoring expectations change.
- Attest metrics after review so audit readers can distinguish measured evidence from reviewed evidence.
- Keep technical documentation aligned with risk register entries, conformity checklist evidence, transparency notices, GPAI obligations, and product attachments.
- Use review requests when active drift indicators need a formal governance decision.
Troubleshooting
| Symptom | What to check |
|---|---|
| AI evidence tabs are missing | Confirm the catalog item is an AI product type and the required feature flag is enabled for the space. |
| Governance score looks low | Review the incomplete checks on the AI Governance tab, especially owner, purpose, risk classification, data dependencies, evaluation evidence, monitoring, and review. |
| Review request action does not create a new item | Qarion reuses an existing open AI governance review request for the product when one already exists. |
| Training run deployment changed another run | Only one run can be deployed per product, so marking a run deployed clears the deployed flag on the others. |
| A metric still looks unreviewed | Check whether the metric has been attested. Attest it once the accountable reviewer has checked the evidence, not while the metric is still being drafted. |
| Generated change log text is incomplete | Check that product versions, audit activity, training runs, and existing change log content are available before generating a new entry. |
Developer Reference
Developers can automate these workflows with the AI Product Governance API.