Managing Datasets

This guide covers the day-to-day operations for working with master data datasets: editing schemas, managing versions, importing and exporting data, and configuring governance.

Schema Editing

Each dataset has a typed schema that defines its columns.

Adding Columns

  1. Open the dataset and navigate to the Schema tab
  2. Click Add Column
  3. Provide a column name and select a type (text, number, boolean, date, etc.)
  4. Optionally mark the column as required
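
As a rough mental model, a typed schema with required columns could be represented as below. This is only a sketch: the `Column` class and `missing_required` helper are hypothetical names for illustration, not part of the platform's API, and treating an empty string as "missing" is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Column:
    """One schema column (hypothetical model, not the platform's own type)."""
    name: str
    type: str              # e.g. "text", "number", "boolean", "date"
    required: bool = False


def missing_required(schema: List[Column], row: Dict[str, object]) -> List[str]:
    """Return the names of required columns that are absent or empty in a row."""
    return [
        col.name
        for col in schema
        if col.required and row.get(col.name) in (None, "")
    ]


schema = [Column("sku", "text", required=True), Column("price", "number")]
problems = missing_required(schema, {"price": 9.5})  # "sku" is not filled in
```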

Modifying Columns

  • Rename a column by clicking its header
  • Change the column type (data is coerced where possible)
  • Reorder columns with drag-and-drop
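
The exact coercion rules applied on a type change are not specified here. The sketch below shows one plausible interpretation of "coerced where possible": each value is converted to the target type, and values that cannot be converted become empty (`None`). The function name `coerce`, the accepted boolean spellings, and the ISO date format are all assumptions.

```python
from datetime import date
from typing import Any, Optional


def coerce(value: Any, target: str) -> Optional[Any]:
    """Convert a cell value to the target column type, or None if not possible."""
    if value is None:
        return None
    if target == "text":
        return str(value)
    text = str(value).strip()
    try:
        if target == "number":
            return float(text)
        if target == "date":
            # Assumes ISO 8601 dates such as 2024-01-31
            return date.fromisoformat(text)
    except ValueError:
        return None
    if target == "boolean":
        lowered = text.lower()
        if lowered in {"true", "yes", "1"}:
            return True
        if lowered in {"false", "no", "0"}:
            return False
        return None
    raise ValueError(f"unknown column type: {target}")
```

Under these assumptions, changing a text column to number would keep a value such as "42" and blank out a value such as "n/a", so it is worth reviewing the draft before publishing.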

Removing Columns

  • Click the column menu (⋮) and select Delete Column
  • Existing data in that column is removed on the next publish

Data Entry

Edit data directly in the browser with a spreadsheet-style interface:

  • Inline editing — Click a cell to edit
  • Row operations — Add, duplicate, or delete rows
  • Bulk paste — Paste tabular data from a spreadsheet

All edits are saved to the current draft — they don't affect the published version until you explicitly publish.

Versioning

Publishing a Version

  1. Click Publish when your draft is ready
  2. Add an optional version note describing the changes
  3. The draft becomes a new immutable version (e.g., v3)
  4. A new draft is automatically created for future edits

Comparing Versions

Click Version History to see all previous versions, then select two versions to compare. The comparison shows:

  • Added rows — highlighted in green
  • Modified rows — highlighted in yellow, with per-cell diffs
  • Deleted rows — highlighted in red
  • Schema changes — column additions, removals, and type changes
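
To make the row categories above concrete, here is a minimal sketch of a row-level diff between two versions. It assumes rows are matched by a key column named `id`; how the platform actually matches rows is not stated here, and `diff_versions` is a hypothetical helper.

```python
from typing import Dict, List

Row = Dict[str, object]


def diff_versions(old: List[Row], new: List[Row], key: str = "id") -> dict:
    """Classify rows as added, modified (with per-cell diffs), or deleted."""
    old_by_key = {row[key]: row for row in old}
    new_by_key = {row[key]: row for row in new}

    added = [row for k, row in new_by_key.items() if k not in old_by_key]
    deleted = [row for k, row in old_by_key.items() if k not in new_by_key]

    modified = {}
    for k in old_by_key.keys() & new_by_key.keys():
        before, after = old_by_key[k], new_by_key[k]
        # Per-cell diff: column name -> (old value, new value)
        cells = {
            col: (before.get(col), after.get(col))
            for col in before.keys() | after.keys()
            if before.get(col) != after.get(col)
        }
        if cells:
            modified[k] = cells

    return {"added": added, "modified": modified, "deleted": deleted}
```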

Reverting

To revert to a previous version, open it from the version history and click Restore as Draft. This replaces the current draft with the selected version's data.
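
The relationship between the draft, published versions, and Restore as Draft can be modelled roughly as follows. This is an illustrative sketch, not the platform's implementation: the `Dataset` and `Version` classes are hypothetical, and it assumes the new draft created after publishing starts from the data that was just published.

```python
import copy
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Version:
    number: int
    note: str
    rows: tuple  # snapshot of the draft rows taken at publish time


@dataclass
class Dataset:
    draft: list = field(default_factory=list)
    versions: list = field(default_factory=list)

    def publish(self, note: str = "") -> Version:
        """Freeze the draft as the next version and start a fresh draft."""
        snapshot = tuple(copy.deepcopy(self.draft))
        version = Version(len(self.versions) + 1, note, snapshot)
        self.versions.append(version)
        # Assumption: the new draft begins as a copy of the published data
        self.draft = copy.deepcopy(self.draft)
        return version

    def restore_as_draft(self, number: int) -> None:
        """Replace the current draft with the data of an earlier version."""
        self.draft = copy.deepcopy(list(self.versions[number - 1].rows))
```

Note that restoring only replaces the draft: the version history is left untouched, and the restored data reaches consumers only after the next publish.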

Import & Export

Importing Data

Supported formats: CSV, XLSX, JSON

  1. Click Import on the dataset toolbar
  2. Upload a file, or drag and drop one
  3. Map columns (auto-matched by header name)
  4. Preview the import and confirm

Imports replace the current draft data. The published version is unaffected until you publish.
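
The matching rule used for auto-mapping is not spelled out here. As an illustration, the sketch below matches file headers to schema columns while ignoring case, spaces, and punctuation; that normalization, and the `auto_map_columns` helper itself, are assumptions. Headers with no match map to `None` and would need to be mapped manually.

```python
from typing import Dict, List, Optional


def normalize(name: str) -> str:
    """Lower-case a header and drop everything except letters and digits."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def auto_map_columns(
    file_headers: List[str], schema_columns: List[str]
) -> Dict[str, Optional[str]]:
    """Map each file header to a schema column with the same normalized name."""
    by_normalized = {normalize(col): col for col in schema_columns}
    return {header: by_normalized.get(normalize(header)) for header in file_headers}


mapping = auto_map_columns(
    ["Customer ID", "email", "Notes"],
    ["customer_id", "Email"],
)
```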

Exporting Data

  1. Click Export on the dataset toolbar
  2. Choose format (CSV, XLSX, or JSON)
  3. Select sync mode:
    • Full snapshot — exports all rows
    • Incremental — exports only changes since the last export
  4. Configure frequency: on-demand, on-change, hourly, or daily

Scheduled exports run automatically via the platform's background job system.
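
The difference between the two sync modes can be sketched as follows. This assumes rows carry a key column named `id` and that an incremental export emits upserts for new or changed rows and deletes for removed ones; the actual output format is not documented here, and `export_rows` is a hypothetical helper.

```python
from typing import Dict, List

Row = Dict[str, object]


def export_rows(
    current: List[Row], last_exported: List[Row], mode: str, key: str = "id"
) -> List[dict]:
    """Return the records an export would emit for the given sync mode."""
    if mode == "full":
        # Full snapshot: every row, regardless of what was exported before
        return [{"op": "upsert", "row": row} for row in current]
    if mode != "incremental":
        raise ValueError(f"unknown sync mode: {mode}")

    previous = {row[key]: row for row in last_exported}
    current_keys = {row[key] for row in current}

    # Incremental: only rows that are new or differ from the last export
    changes = [
        {"op": "upsert", "row": row}
        for row in current
        if previous.get(row[key]) != row
    ]
    # ...plus rows that existed at the last export but are gone now
    changes += [
        {"op": "delete", "row": row}
        for row in last_exported
        if row[key] not in current_keys
    ]
    return changes
```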

Governance

Governance Mode

Each dataset has a governance mode that controls how changes are handled:

Mode               Behaviour
Direct Edit        Anyone with edit access can modify and publish
Approval Required  Changes require approval before publishing

Switch modes from the Governance tab on the dataset detail page.

Role Assignments

Datasets support the same role-based governance as data products:

  • Owner — Full control, can change governance mode
  • Steward — Can edit schema and data, approve changes
  • Viewer — Read-only access
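
One way to picture how roles and governance mode interact is the permission check below. It is a sketch built only from the descriptions above: the action names, the `can` and `can_publish` helpers, and the `approved` flag are hypothetical, and the real approval workflow may involve more steps.

```python
from enum import Enum


class Role(Enum):
    OWNER = "owner"
    STEWARD = "steward"
    VIEWER = "viewer"


class Mode(Enum):
    DIRECT_EDIT = "direct_edit"
    APPROVAL_REQUIRED = "approval_required"


# Hypothetical action names derived from the role descriptions
PERMISSIONS = {
    Role.OWNER: {"view", "edit", "approve", "change_governance_mode"},
    Role.STEWARD: {"view", "edit", "approve"},
    Role.VIEWER: {"view"},
}


def can(role: Role, action: str) -> bool:
    """Check whether a role is allowed to perform an action."""
    return action in PERMISSIONS[role]


def can_publish(role: Role, mode: Mode, approved: bool = False) -> bool:
    """Publishing needs edit access, plus approval in Approval Required mode."""
    if not can(role, "edit"):
        return False
    return mode is Mode.DIRECT_EDIT or approved
```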

Chatter

Each dataset has a built-in discussion thread (chatter) for team collaboration:

  • Comment on data quality, schema decisions, or upcoming changes
  • @mention teammates for notifications
  • Full comment history is preserved with the dataset

Audit Trail

All dataset operations are logged in the platform's audit log:

  • Schema changes
  • Data modifications
  • Publish events
  • Governance mode changes
  • Export operations

Access the audit log from the dataset detail page or from the global Audit Log in admin settings.