AI Product Evidence

AI product evidence lives on the Product Detail page for catalog items such as ML Models, AI Systems, LLM Agents, AI Apps, and Agents. Use these tabs to keep the operational evidence for an AI product current before conformity review, launch approval, reassessment, or incident investigation.

For the broader EU AI Act workflow, see Governance Evidence Workflows.

Where To Work

Open the AI product from the Data Catalog, then use these Product Detail tabs:

Tab	Use it for
AI Governance	Governance score, completeness checks, drift indicators, evidence counts, runtime event counts, and AI governance review requests.
Training Runs	Training, fine-tuning, evaluation, and deployment run history for the model or system.
Performance & Safety	Accuracy, robustness, fairness, cybersecurity, and drift metrics with thresholds and attestations.
Tech Docs	Annex IV technical documentation sections, attachments, export, auto-population, and AI-assisted change log drafting.

These tabs complement the product governance tabs for Art. 5 screening, risk register entries, conformity assessments, transparency notices, and GPAI obligations.

AI Governance Tab

The AI Governance tab summarizes whether the product has enough evidence for its current lifecycle gate. It shows:

A governance score based on completed checks.
Lifecycle gate and review status.
Completeness checks for owner, purpose, risk classification, data dependencies, evaluation evidence, policies, monitoring, and review.
Drift indicators such as overdue review, upstream data changes, missing evaluation evidence, expired approval, unresolved high-risk mitigation, and stale documentation.
Evidence counts for training runs, safety metrics, risk assessments, conformity assessments, policy evaluations, quality checks, contracts, and active alerts.
Runtime governance event counts.
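
The exact scoring formula is not documented here. As a rough mental model only, the sketch below assumes the score is the unweighted percentage of completed checks. The check identifiers and the `governance_score` function are hypothetical names chosen for illustration and are not part of Qarion.

```python
# Hypothetical identifiers for the completeness checks listed above.
CHECKS = (
    "owner",
    "purpose",
    "risk_classification",
    "data_dependencies",
    "evaluation_evidence",
    "policies",
    "monitoring",
    "review",
)


def governance_score(completed):
    """Return (score, missing) for a set of completed check names.

    Assumption: every check carries equal weight. The real product may
    weight checks differently.
    """
    missing = [check for check in CHECKS if check not in completed]
    done = len(CHECKS) - len(missing)
    score = round(100 * done / len(CHECKS))
    return score, missing


score, missing = governance_score({"owner", "purpose", "monitoring", "review"})
print(score, missing)
```

Under this assumption, the list of missing checks is the practical output: it tells a model owner which evidence to add before the next lifecycle gate.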

Use the review request action when one or more active drift indicators need governance follow-up. Qarion creates or reuses an open governance review request for the product so duplicate drift reviews do not pile up.
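
The create-or-reuse behavior can be pictured as an idempotent lookup before insert. The following is a minimal sketch under stated assumptions: `ReviewRequest`, `ReviewQueue`, and the indicator strings are hypothetical, and the sketch does not model how Qarion stores requests or whether it merges new indicators into an existing request.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class ReviewRequest:
    # Hypothetical record shape, for illustration only.
    id: int
    product_id: str
    status: str = "open"
    indicators: list = field(default_factory=list)


class ReviewQueue:
    """In-memory stand-in for the store of governance review requests."""

    def __init__(self):
        self._requests = []
        self._ids = count(1)

    def request_review(self, product_id, active_indicators):
        """Return (request, created). Reuse an open request if one exists."""
        for req in self._requests:
            if req.product_id == product_id and req.status == "open":
                return req, False  # reuse: no duplicate is created
        req = ReviewRequest(next(self._ids), product_id,
                            indicators=list(active_indicators))
        self._requests.append(req)
        return req, True


queue = ReviewQueue()
first, created_first = queue.request_review("product-1", ["overdue_review"])
second, created_second = queue.request_review("product-1", ["stale_documentation"])
print(created_first, created_second, first is second)
```

The second call returns the request opened by the first, which is why the review request action can appear to do nothing when an open request already exists.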

Training Runs

Use Training Runs to record model and AI system execution history that matters for governance. Runs can represent:

training
fine_tuning
evaluation
deployment

Each run can capture the dataset name, version, and URI; training, validation, and test sample counts; random seed; framework version; environment; hyperparameters; metrics; model artifact URI; duration; notes; start and completion timestamps; status; and deployment status.

Only one run is treated as deployed for a product at a time. When a run is marked as deployed, Qarion clears the deployed flag from the product's other training runs and triggers governance reassessment for the product.
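
The single-deployed-run rule can be sketched as follows. This is an illustration under stated assumptions, not Qarion's implementation: `TrainingRun`, `is_deployed`, and `mark_deployed` are hypothetical names, and the governance reassessment is represented only by a comment.

```python
from dataclasses import dataclass


@dataclass
class TrainingRun:
    # Hypothetical, reduced record: real runs carry many more fields.
    run_id: str
    run_type: str  # training, fine_tuning, evaluation, or deployment
    is_deployed: bool = False


def mark_deployed(runs, run_id):
    """Mark one run as deployed and clear the flag on all others.

    Returns the ids of runs whose deployed flag was cleared.
    """
    if not any(run.run_id == run_id for run in runs):
        raise KeyError(f"unknown run: {run_id}")
    cleared = []
    for run in runs:
        if run.run_id == run_id:
            run.is_deployed = True
        elif run.is_deployed:
            run.is_deployed = False
            cleared.append(run.run_id)
    # In the product, governance reassessment is triggered at this point.
    return cleared


runs = [
    TrainingRun("run-1", "training", is_deployed=True),
    TrainingRun("run-2", "fine_tuning"),
]
print(mark_deployed(runs, "run-2"))
print([run.run_id for run in runs if run.is_deployed])
```

After the call, exactly one run in the list carries the deployed flag, regardless of how many were flagged before.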

Performance & Safety

Use Performance & Safety for measured evidence about model behavior. Metrics can be grouped by:

Category	Typical evidence
`accuracy`	Benchmark score, test accuracy, precision, recall, or business KPI fit.
`robustness`	Stress tests, adversarial checks, fallback behavior, or resilience measures.
`fairness`	Demographic parity, equalized odds, or subgroup performance.
`cybersecurity`	Prompt-injection, abuse, sandbox, or exploit-resistance findings.
`drift`	Data drift, concept drift, embedding drift, or model-output drift.

Each metric has one of four status values: `healthy`, `warning`, `critical`, or `not_available`. Warning and critical thresholds let teams compare current values with expected operating bounds. Add the test dataset, dataset version, methodology, demographic group, external URL, notes, and measurement timestamp when they help reviewers reproduce or understand the result.
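
One way to read the relationship between thresholds and status is sketched below. The function `metric_status`, the `higher_is_better` parameter, and the inclusive boundaries are assumptions made for illustration; this page does not specify how Qarion derives status or which direction counts as worse for a given metric.

```python
def metric_status(value, warning, critical, higher_is_better=True):
    """Map a metric value to one of the four documented status values.

    Assumptions: a missing value or threshold yields "not_available",
    and a value exactly on a threshold counts as breaching it.
    """
    if value is None or warning is None or critical is None:
        return "not_available"
    if higher_is_better:
        # Example: accuracy, where low values are bad.
        if value <= critical:
            return "critical"
        if value <= warning:
            return "warning"
    else:
        # Example: a drift score, where high values are bad.
        if value >= critical:
            return "critical"
        if value >= warning:
            return "warning"
    return "healthy"


print(metric_status(0.95, warning=0.90, critical=0.80))
print(metric_status(0.30, warning=0.20, critical=0.40, higher_is_better=False))
```

The direction flag matters because the same pair of thresholds means opposite things for an accuracy metric and a drift metric.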

Attestation records who reviewed the metric and when. Use it after the evidence has been checked, not while the metric is still being drafted.

Technical Documentation

Use Tech Docs to maintain Annex IV documentation for the product. The shipped sections are:

Section	Evidence focus
General Description	Purpose, provider, version history, and interaction with other systems.
System Architecture	Design choices, algorithms, components, and computational resources.
Data Governance	Training data origin, preparation, labeling, and quality measures.
Risk Management	Risks, mitigations, residual risk, and foreseeable misuse.
Human Oversight	Oversight mechanisms, instructions, and limitations.
Performance & Testing	Accuracy, robustness, benchmark, and test-methodology evidence.
Change Log	Modifications after the initial documentation baseline.

The tab can save the documentation back to the product's AI details, attach supporting files, export the documentation, auto-populate known product context into selected sections, and generate draft change log text from product snapshots, audit logs, version milestones, and training run history. Treat generated change log text as a draft that model owners or compliance reviewers edit before relying on it.

Operating Practices

Refresh training runs after material training, fine-tuning, evaluation, or deployment events.
Mark a run as deployed only when it represents the current production or production-like model state.
Add performance and safety metrics before a conformity review and whenever thresholds, datasets, or monitoring expectations change.
Attest metrics after review so audit readers can distinguish measured evidence from reviewed evidence.
Keep technical documentation aligned with risk register entries, conformity checklist evidence, transparency notices, GPAI obligations, and product attachments.
Use review requests when active drift indicators need a formal governance decision.

Troubleshooting

Symptom	What to check
AI evidence tabs are missing	Confirm the catalog item is an AI product type and the required feature flag is enabled for the space.
Governance score looks low	Review the incomplete checks on the AI Governance tab, especially owner, purpose, risk classification, data dependencies, evaluation evidence, monitoring, and review.
Review request action does not create a new item	Qarion reuses an existing open AI governance review request for the product when one already exists.
Training run deployment changed another run	Only one run can be deployed per product, so marking a run deployed clears the deployed flag on the others.
A metric still looks unreviewed	A metric counts as reviewed only after attestation. Attest it once the accountable reviewer has checked the metric evidence.
Generated change log text is incomplete	Check that product versions, audit activity, training runs, and existing change log content are available before generating a new entry.

Developer Reference

Developers can automate these workflows with the AI Product Governance API.
