AI Product Evidence

AI product evidence lives on the Product Detail page for catalog items such as ML Models, AI Systems, LLM Agents, AI Apps, and Agents. Use these tabs to keep the operational evidence for an AI product current before conformity review, launch approval, reassessment, or incident investigation.

For the broader EU AI Act workflow, see Governance Evidence Workflows.

Where To Work

Open the AI product from the Data Catalog, then use these Product Detail tabs:

| Tab | Use it for |
| --- | --- |
| AI Governance | Governance score, completeness checks, drift indicators, evidence counts, runtime event counts, and AI governance review requests. |
| Training Runs | Training, fine-tuning, evaluation, and deployment run history for the model or system. |
| Performance & Safety | Accuracy, robustness, fairness, cybersecurity, and drift metrics with thresholds and attestations. |
| Tech Docs | Annex IV technical documentation sections, attachments, export, auto-population, and AI-assisted change log drafting. |

These tabs complement the product governance tabs for Art. 5 screening, risk register entries, conformity assessments, transparency notices, and GPAI obligations.

AI Governance Tab

The AI Governance tab summarizes whether the product has enough evidence for its current lifecycle gate. It shows:

  • A governance score based on completed checks.
  • Lifecycle gate and review status.
  • Completeness checks for owner, purpose, risk classification, data dependencies, evaluation evidence, policies, monitoring, and review.
  • Drift indicators such as overdue review, upstream data changes, missing evaluation evidence, expired approval, unresolved high-risk mitigation, and stale documentation.
  • Evidence counts for training runs, safety metrics, risk assessments, conformity assessments, policy evaluations, quality checks, contracts, and active alerts.
  • Runtime governance event counts.
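
As a rough illustration of a score "based on completed checks", the sketch below computes the share of completeness checks that are done. The check identifiers are hypothetical names derived from the completeness list above, and the percentage formula is an assumption; Qarion's actual scoring may weight checks differently.

```python
# Hypothetical check identifiers, derived from the completeness checks listed above.
CHECKS = [
    "owner", "purpose", "risk_classification", "data_dependencies",
    "evaluation_evidence", "policies", "monitoring", "review",
]


def governance_score(completed):
    """Return the percentage of known checks that are completed.

    Assumption: every check carries equal weight.
    """
    done = sum(1 for check in CHECKS if check in completed)
    return round(100 * done / len(CHECKS), 1)


def incomplete_checks(completed):
    """List the checks that still need evidence, in display order."""
    return [check for check in CHECKS if check not in completed]
```

For example, a product with only `owner` and `purpose` completed would score 25.0 under this assumed formula, and `incomplete_checks` would list the remaining six checks.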

Use the review request action when one or more active drift indicators need governance follow-up. Qarion creates or reuses an open governance review request for the product so duplicate drift reviews do not pile up.
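
The create-or-reuse behavior can be pictured as a simple lookup before insert. The following is a minimal in-memory sketch, not Qarion's implementation; the `ReviewRequest` and `ReviewQueue` names, their fields, and the `"open"` status value are all hypothetical.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class ReviewRequest:
    # Hypothetical shape of a governance review request.
    id: int
    product_id: str
    indicators: list
    status: str = "open"


class ReviewQueue:
    def __init__(self):
        self._requests = []
        self._ids = count(1)

    def request_review(self, product_id, active_indicators):
        """Return (request, created). Reuse an open request if one exists."""
        for req in self._requests:
            if req.product_id == product_id and req.status == "open":
                return req, False  # reuse: no duplicate review is created
        req = ReviewRequest(next(self._ids), product_id, list(active_indicators))
        self._requests.append(req)
        return req, True
```

Calling `request_review` twice for the same product returns the same open request the second time; a new request appears only after the earlier one is no longer open.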

Training Runs

Use Training Runs to record model and AI system execution history that matters for governance. Runs can represent:

  • training
  • fine_tuning
  • evaluation
  • deployment

Each run can capture the dataset name, version, and URI; training, validation, and test sample counts; random seed; framework version; environment; hyperparameters; metrics; model artifact URI; duration; notes; start and completion timestamps; status; and deployment status.

Only one run is treated as deployed for a product at a time. When a run is marked as deployed, Qarion clears the deployed flag from the product's other training runs and triggers governance reassessment for the product.
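
The single-deployed-run rule can be sketched as follows. This is an illustrative model under stated assumptions: the `TrainingRun` fields, the `mark_deployed` function, and the `reassess` callback are hypothetical names, not part of the Qarion API.

```python
from dataclasses import dataclass


@dataclass
class TrainingRun:
    # Hypothetical subset of the run fields described above.
    run_id: str
    run_type: str  # "training", "fine_tuning", "evaluation", or "deployment"
    is_deployed: bool = False


def mark_deployed(product_id, runs, run_id, reassess):
    """Flag one run as deployed, clear the flag on the product's other runs,
    and trigger a reassessment. Returns the IDs of runs whose flag was cleared."""
    if not any(run.run_id == run_id for run in runs):
        raise KeyError(f"unknown run: {run_id}")
    cleared = []
    for run in runs:
        if run.run_id == run_id:
            run.is_deployed = True
        elif run.is_deployed:
            run.is_deployed = False
            cleared.append(run.run_id)
    reassess(product_id)  # assumed hook standing in for governance reassessment
    return cleared
```

Returning the cleared run IDs makes the side effect visible to the caller, which helps explain why another run's deployment status changed.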

Performance & Safety

Use Performance & Safety for measured evidence about model behavior. Metrics can be grouped by:

| Category | Typical evidence |
| --- | --- |
| accuracy | Benchmark score, test accuracy, precision, recall, or business KPI fit. |
| robustness | Stress tests, adversarial checks, fallback behavior, or resilience measures. |
| fairness | Demographic parity, equalized odds, or subgroup performance. |
| cybersecurity | Prompt-injection, abuse, sandbox, or exploit-resistance findings. |
| drift | Data drift, concept drift, embedding drift, or model-output drift. |

Metric status values are healthy, warning, critical, and not_available. Warning and critical thresholds let teams compare current values with expected operating bounds. Add the test dataset, dataset version, methodology, demographic group, external URL, notes, and measurement timestamp when they help reviewers reproduce or understand the result.
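
One way to think about how thresholds map to a status is sketched below. The four status strings come from the text above; the `metric_status` function, the `higher_is_better` flag, and the inclusive comparisons are assumptions made for illustration, since the source does not say how Qarion derives the status.

```python
def metric_status(value, warning=None, critical=None, higher_is_better=True):
    """Derive a status from a value and optional warning/critical thresholds.

    Assumptions: thresholds are inclusive, and the caller states whether
    larger values are better (accuracy) or worse (a drift score).
    """
    if value is None:
        return "not_available"

    def breached(threshold):
        if threshold is None:
            return False
        return value <= threshold if higher_is_better else value >= threshold

    # Check the more severe bound first so it wins when both are breached.
    if breached(critical):
        return "critical"
    if breached(warning):
        return "warning"
    return "healthy"
```

With a warning threshold of 0.90 and a critical threshold of 0.80 on an accuracy metric, a value of 0.85 evaluates to `warning` under these assumptions. For a drift score, pass `higher_is_better=False` so that larger values breach the thresholds.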

Attestation records who reviewed the metric and when. Use it after the evidence has been checked, not while the metric is still being drafted.

Technical Documentation

Use Tech Docs to maintain Annex IV documentation for the product. The shipped sections are:

| Section | Evidence focus |
| --- | --- |
| General Description | Purpose, provider, version history, and interaction with other systems. |
| System Architecture | Design choices, algorithms, components, and computational resources. |
| Data Governance | Training data origin, preparation, labeling, and quality measures. |
| Risk Management | Risks, mitigations, residual risk, and foreseeable misuse. |
| Human Oversight | Oversight mechanisms, instructions, and limitations. |
| Performance & Testing | Accuracy, robustness, benchmark, and test-methodology evidence. |
| Change Log | Modifications after the initial documentation baseline. |

The tab can save the documentation back to the product's AI details, attach supporting files, export the documentation, auto-populate known product context into selected sections, and generate draft change log text from product snapshots, audit logs, version milestones, and training run history. Treat generated change log text as a draft that model owners or compliance reviewers edit before relying on it.

Operating Practices

  • Refresh training runs after material training, fine-tuning, evaluation, or deployment events.
  • Keep a deployed run selected only when the run represents the current production or production-like model state.
  • Add performance and safety metrics before a conformity review and whenever thresholds, datasets, or monitoring expectations change.
  • Attest metrics after review so audit readers can distinguish measured evidence from reviewed evidence.
  • Keep technical documentation aligned with risk register entries, conformity checklist evidence, transparency notices, GPAI obligations, and product attachments.
  • Use review requests when active drift indicators need a formal governance decision.

Troubleshooting

| Symptom | What to check |
| --- | --- |
| AI evidence tabs are missing | Confirm the catalog item is an AI product type and the required feature flag is enabled for the space. |
| Governance score looks low | Review the incomplete checks on the AI Governance tab, especially owner, purpose, risk classification, data dependencies, evaluation evidence, monitoring, and review. |
| Review request action does not create a new item | Qarion reuses an existing open AI governance review request for the product when one already exists. |
| Training run deployment changed another run | Only one run can be deployed per product, so marking a run deployed clears the deployed flag on the others. |
| A metric still looks unreviewed | Use attestation only after the metric evidence has been checked by the accountable reviewer. |
| Generated change log text is incomplete | Check that product versions, audit activity, training runs, and existing change log content are available before generating a new entry. |

Developer Reference

Developers can automate these workflows with the AI Product Governance API.