Usage Analytics API

The Usage Analytics API provides access to BI query scraping, query log mining, usage aggregation, and popularity scoring.

BI Sync

Execute BI Sync

POST /connectors/{connector_id}/bi-sync

Triggers a full BI scrape for a connector. Creates a SyncJob, invokes the registered BI scraper (Superset, Metabase, Looker, Tableau), upserts BiQuery rows, and triggers usage aggregation.

Supported connector types: superset, metabase, looker, tableau, power_bi

Response: 200 OK — SyncJob

Errors:

404 Not Found — Connector not found
400 Bad Request — Connector is not a BI type

Execute Query Log Mining

POST /connectors/{connector_id}/query-log-mining

Mines warehouse query logs directly from data warehouse audit logs. This is separate from BI connector scraping — it extracts query history from warehouses the platform already connects to.

Supported connector types: snowflake, bigquery, postgres

Response: 200 OK

{
  "rows_upserted": 156
}

Usage Aggregation

Aggregate Usage

POST /spaces/{slug}/usage/aggregate

Recomputes ProductUsageStats for every product in a space. Groups BiQuery records by referenced table, matches them to existing data products, and upserts aggregated counts.

Response: 200 OK

{
  "products_updated": 42
}

Get Unused Products

GET /spaces/{slug}/usage/unused

Returns products with zero BI query references in the lookback window.

Query Parameters:

Parameter	Type	Default	Description
`days`	int	30	Lookback window in days

Response: 200 OK — Array of DataProduct summaries with no detected usage.

Data Model

BiQuery

The BiQuery model stores individual query or chart references scraped from BI tools or mined from query logs.

Field	Type	Description
`id`	UUID	Record identifier
`connector_id`	UUID	Source connector
`resource_type`	string	`saved_query`, `chart`, `dashboard_chart`, `query_log`
`resource_id`	string	BI-tool-specific identifier
`title`	string \| null	Query/chart title
`sql_text`	string \| null	Anonymized SQL
`referenced_tables`	string[]	Table names referenced in the SQL
`referenced_columns`	string[]	Column names referenced
`executed_by`	string \| null	User/service account identity
`executed_at`	datetime \| null	Query execution time
`dashboard_id`	string \| null	Parent dashboard (for chart resources)
`dashboard_title`	string \| null	Dashboard name
`dashboard_url`	string \| null	Link-back URL to the BI tool
`created_at`	datetime	Record creation timestamp

ProductUsageStats

Aggregated usage statistics per product, computed by the UsageAggregationService.

Field	Type	Description
`product_id`	UUID	Product identifier
`space_id`	UUID	Space identifier
`query_count_30d`	int	Queries in the last 30 days
`unique_users_30d`	int	Distinct users in the last 30 days
`last_queried_at`	datetime \| null	Most recent query timestamp
`popularity_score`	float	0–100 score
`top_columns`	string[]	Most referenced columns
`updated_at`	datetime	Last aggregation timestamp

Popularity Score Formula

The popularity score is computed as:

raw = 0.5 × log₂(1 + query_count_30d)
    + 0.3 × unique_users_30d
    + 0.2 × recency_factor

score = min(100, raw / 10.0 × 100)

Where recency_factor decays based on how recently the product was last queried.

SQL Anonymization

All query text is anonymized before storage. String and numeric literals are replaced with ? placeholders:

'C-1234'  →  ?
1000      →  ?
3.14      →  ?

This produces normalized SQL patterns suitable for grouping (Popular Queries) without exposing sensitive filter values.

BI Sync​

Execute BI Sync​

Execute Query Log Mining​

Usage Aggregation​

Aggregate Usage​

Get Unused Products​

Data Model​

BiQuery​

ProductUsageStats​

Popularity Score Formula​

SQL Anonymization​