Managing Datasets
This guide covers the day-to-day operations for working with master data datasets: editing schemas, managing versions, importing and exporting data, and configuring governance.
Schema Editing
Each dataset has a typed schema that defines its columns. The Schema tab is where you add, modify, and remove them.
Adding Columns
- Open the dataset and navigate to the Schema tab
- Click Add Column
- Provide a column name and select a type (text, number, boolean, date, etc.)
- Optionally mark the column as required
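As an illustration of what a typed schema holds, here is a minimal Python sketch. The `Column` and `Schema` classes, the `COLUMN_TYPES` set, and the validation rules are assumptions made for this example and are not the platform's actual data model or API.

```python
from dataclasses import dataclass, field

# Hypothetical type names; the guide lists text, number, boolean, date, "etc."
COLUMN_TYPES = {"text", "number", "boolean", "date"}


@dataclass
class Column:
    name: str
    col_type: str
    required: bool = False


@dataclass
class Schema:
    columns: list = field(default_factory=list)

    def add_column(self, name: str, col_type: str, required: bool = False) -> Column:
        # Assumed validation: reject unknown types and duplicate names.
        if col_type not in COLUMN_TYPES:
            raise ValueError(f"unknown column type: {col_type}")
        if any(c.name == name for c in self.columns):
            raise ValueError(f"duplicate column name: {name}")
        column = Column(name, col_type, required)
        self.columns.append(column)
        return column


schema = Schema()
schema.add_column("country_code", "text", required=True)
schema.add_column("population", "number")
```

Each call mirrors one pass through the Add Column dialog: a name, a type, and an optional required flag.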
Modifying Columns
- Rename a column by clicking its header
- Change the column type (data is coerced where possible)
- Reorder columns with drag-and-drop
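The guide does not specify the coercion rules applied when a column type changes, so the following Python sketch only illustrates the general idea of "coerced where possible". The `coerce` function, the accepted formats, and the choice to return `None` for values that cannot be converted are all assumptions.

```python
from datetime import date


def coerce(value: str, target_type: str):
    """Return the coerced value, or None when the text cannot be converted."""
    text = value.strip()
    try:
        if target_type == "number":
            return float(text)
        if target_type == "boolean":
            # Only the literals "true" and "false" are accepted in this sketch.
            return {"true": True, "false": False}[text.lower()]
        if target_type == "date":
            # Only ISO dates (YYYY-MM-DD) are accepted in this sketch.
            return date.fromisoformat(text)
    except (ValueError, KeyError):
        return None
    # Any other target type (e.g. "text") keeps the trimmed string.
    return text


converted = [coerce(v, "number") for v in ["12.5", "7", "n/a"]]
```

Before changing the type of a populated column, it is worth checking how many values would fail to convert, for example by counting the `None` entries in a dry run like the one above.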
Removing Columns
- Click the column menu (⋮) and select Delete Column
- Existing data in that column is removed on the next publish
Data Entry
Edit data directly in the browser with a spreadsheet-style interface:
- Inline editing — Click a cell to edit
- Row operations — Add, duplicate, or delete rows
- Bulk paste — Paste tabular data from a spreadsheet
All edits are saved to the current draft. They do not affect the published version until you explicitly publish.
Versioning
Publishing a Version
- Click Publish when your draft is ready
- Add an optional version note describing the changes
- The draft becomes a new immutable version (e.g., v3)
- A new draft is automatically created for future edits
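The publish steps above can be modelled roughly as follows. This is a sketch, not the platform's implementation: the `Dataset` and `Version` classes are hypothetical, and it assumes the new draft starts as a copy of the rows that were just published.

```python
import copy
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Version:
    number: int
    note: str
    rows: tuple  # a tuple, so rows cannot be added to or removed from a version


@dataclass
class Dataset:
    draft: list = field(default_factory=list)
    versions: list = field(default_factory=list)

    def publish(self, note: str = "") -> Version:
        # Freeze a deep copy of the draft as the next version (v1, v2, ...).
        version = Version(len(self.versions) + 1, note,
                          tuple(copy.deepcopy(self.draft)))
        self.versions.append(version)
        # Assumption: the new draft begins as a copy of the published rows.
        self.draft = copy.deepcopy(list(version.rows))
        return version


ds = Dataset()
ds.draft.append({"id": 1, "name": "Alpha"})
v1 = ds.publish("initial load")
ds.draft[0]["name"] = "Changed"  # edits the new draft only
```

Because the version holds its own copy of the rows, later draft edits such as the last line do not leak into `v1`.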
Comparing Versions
Click Version History to see all previous versions, then select two versions to compare:
- Added rows — highlighted in green
- Modified rows — highlighted in yellow, with per-cell diffs
- Deleted rows — highlighted in red
- Schema changes — column additions, removals, and type changes
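A row-level comparison like the one described above can be sketched in Python. The guide does not say how rows are matched between versions, so this example assumes every row carries a stable key column named `id`; the `diff_rows` function is hypothetical.

```python
def diff_rows(old: list, new: list, key: str = "id") -> dict:
    """Classify rows as added, modified, or deleted, matching rows by `key`."""
    old_by_key = {row[key]: row for row in old}
    new_by_key = {row[key]: row for row in new}

    added = [new_by_key[k] for k in new_by_key if k not in old_by_key]
    deleted = [old_by_key[k] for k in old_by_key if k not in new_by_key]

    modified = {}
    for k in old_by_key.keys() & new_by_key.keys():
        before, after = old_by_key[k], new_by_key[k]
        # Per-cell diff: column -> (old value, new value) for changed cells.
        cells = {col: (before.get(col), after.get(col))
                 for col in before.keys() | after.keys()
                 if before.get(col) != after.get(col)}
        if cells:
            modified[k] = cells
    return {"added": added, "modified": modified, "deleted": deleted}


v1 = [{"id": 1, "name": "Alpha"}, {"id": 2, "name": "Beta"}]
v2 = [{"id": 1, "name": "Alpha Ltd"}, {"id": 3, "name": "Gamma"}]
result = diff_rows(v1, v2)
```

Here row 3 is reported as added, row 2 as deleted, and row 1 as modified with a per-cell diff on `name`.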
Reverting
To revert to a previous version, open it from the version history and click Restore as Draft. This replaces the current draft with the selected version's data.
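Conceptually, restoring copies the selected version into the draft rather than rewriting history. A minimal sketch, assuming versions are held in a plain dictionary keyed by version number (the `restore_as_draft` function is hypothetical):

```python
import copy


def restore_as_draft(versions: dict, number: int) -> list:
    """Return a new draft holding a copy of the selected version's rows."""
    return copy.deepcopy(versions[number])


versions = {
    1: [{"id": 1, "name": "Alpha"}],
    2: [{"id": 1, "name": "Alpha v2"}],
}
draft = restore_as_draft(versions, 1)
draft[0]["name"] = "Edited"  # editing the restored draft leaves v1 untouched
```

In this model the restored data only becomes a published version again once the draft is published.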
Import & Export
Importing Data
Supported formats: CSV, XLSX, JSON
- Click Import on the dataset toolbar
- Upload a file or drag-and-drop
- Map columns (auto-matched by header name)
- Preview the import and confirm
Imports replace the current draft data. The published version is unaffected until you publish.
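The guide states only that columns are auto-matched by header name. The sketch below shows one plausible way to do that, assuming matching is case-insensitive and ignores surrounding whitespace; the `auto_map_columns` function and the sample data are hypothetical.

```python
import csv
import io


def auto_map_columns(file_headers: list, schema_columns: list) -> dict:
    """Map each file header to a schema column, or None when nothing matches."""
    normalized = {name.strip().lower(): name for name in schema_columns}
    return {h: normalized.get(h.strip().lower()) for h in file_headers}


csv_text = "Country_Code, population ,notes\nDE,83000000,ok\n"
reader = csv.reader(io.StringIO(csv_text))
headers = next(reader)  # first row of the file holds the header names
mapping = auto_map_columns(headers, ["country_code", "population"])
```

Headers that map to `None` (here `notes`) are the ones that would need manual mapping, or would be skipped, in the Map columns step.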
Exporting Data
- Click Export on the dataset toolbar
- Choose format (CSV, XLSX, or JSON)
- Select sync mode:
  - Full snapshot — exports all rows
  - Incremental — exports only changes since the last export
- Configure frequency: on-demand, on-change, hourly, or daily
Scheduled exports run automatically via the platform's background job system.
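The difference between the two sync modes can be illustrated with a short Python sketch. It assumes each row carries a hypothetical `changed_in` field recording the version in which it last changed; the guide does not describe how the platform actually tracks changes.

```python
def export_rows(rows: list, mode: str, last_export_version: int = 0) -> list:
    """Select the rows to include in an export.

    Assumes each row has a hypothetical `changed_in` field holding the
    version number in which the row last changed.
    """
    if mode == "full":
        return list(rows)
    if mode == "incremental":
        return [r for r in rows if r["changed_in"] > last_export_version]
    raise ValueError(f"unknown sync mode: {mode}")


rows = [
    {"id": 1, "name": "Alpha", "changed_in": 1},
    {"id": 2, "name": "Beta", "changed_in": 3},
]
full = export_rows(rows, "full")
incremental = export_rows(rows, "incremental", last_export_version=2)
```

This sketch only covers added and modified rows. An incremental export would also need some way to convey deletions, which are not represented here.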
Governance
Governance Mode
Each dataset has a governance mode that controls how changes are handled:
| Mode | Behaviour |
|---|---|
| Direct Edit | Anyone with edit access can modify and publish |
| Approval Required | Changes require approval before publishing |
Switch modes from the Governance tab on the dataset detail page.
Role Assignments
Datasets support the same role-based governance as data products:
- Owner — Full control, can change governance mode
- Steward — Can edit schema and data, approve changes
- Viewer — Read-only access
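How roles and governance modes might combine when deciding whether a user can publish is sketched below. The permission names, the `can_publish` function, and the `approved` flag are assumptions for illustration; the guide does not specify who may approve a change or whether the approver must differ from the editor.

```python
from enum import Enum


class Mode(Enum):
    DIRECT_EDIT = "Direct Edit"
    APPROVAL_REQUIRED = "Approval Required"


# Hypothetical permission sets derived from the role descriptions.
ROLE_PERMISSIONS = {
    "owner": {"view", "edit", "approve", "change_mode"},
    "steward": {"view", "edit", "approve"},
    "viewer": {"view"},
}


def can_publish(role: str, mode: Mode, approved: bool = False) -> bool:
    """Return True when a user with `role` may publish under `mode`."""
    if "edit" not in ROLE_PERMISSIONS[role]:
        return False
    if mode is Mode.DIRECT_EDIT:
        return True
    # Approval Required: publishing waits until the change is approved.
    return approved


steward_direct = can_publish("steward", Mode.DIRECT_EDIT)
steward_pending = can_publish("steward", Mode.APPROVAL_REQUIRED)
```

Under these assumptions a steward can publish immediately in Direct Edit mode, but must wait for approval in Approval Required mode.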
Chatter
Each dataset has a built-in discussion thread (chatter) for team collaboration:
- Comment on data quality, schema decisions, or upcoming changes
- @mention teammates for notifications
- Full comment history is preserved with the dataset
Audit Trail
All dataset operations are logged in the platform's audit log:
- Schema changes
- Data modifications
- Publish events
- Governance mode changes
- Export operations
Access the audit log from the dataset detail page or from the global Audit Log in admin settings.