Data Lineage

Track the flow of data from source systems through transformations to final dashboards and reports.

What is Data Lineage?

Data lineage shows the complete path that data takes from its origin to its final destination. It answers critical questions:

  • Where does this data come from? - Trace any metric back to its source tables
  • What uses this data? - See which dashboards and reports depend on a table
  • What happens if this changes? - Impact analysis for schema or logic changes
  • Why are these numbers different? - Debug data discrepancies across systems

Lineage Sources

Lucaro builds lineage graphs from multiple sources:

dbt Models

Automatically parsed from dbt manifest files. Includes model dependencies, source references, and exposure definitions.
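As a rough illustration of what a dbt manifest offers a lineage builder, the sketch below walks the `nodes` and `exposures` sections of a parsed `manifest.json` and collects each entry's `depends_on.nodes` list as upstream/downstream pairs. This is not Lucaro's parser; the function name `manifest_edges` and the sample project `shop` are hypothetical, and the sample dictionary is a hand-written stand-in for a real manifest.

```python
def manifest_edges(manifest: dict) -> list[tuple[str, str]]:
    """Return sorted (upstream, downstream) pairs from a parsed dbt manifest."""
    edges = []
    # Models, seeds, and snapshots live under "nodes"; exposures
    # (dashboards and reports declared in dbt) live under "exposures".
    for section in ("nodes", "exposures"):
        for unique_id, node in manifest.get(section, {}).items():
            for parent in node.get("depends_on", {}).get("nodes", []):
                edges.append((parent, unique_id))
    return sorted(edges)


# Hand-written stand-in for target/manifest.json (hypothetical project "shop").
sample_manifest = {
    "nodes": {
        "model.shop.stg_customers": {
            "depends_on": {"nodes": ["source.shop.raw.customers"]}
        },
        "model.shop.dim_customers": {
            "depends_on": {"nodes": ["model.shop.stg_customers"]}
        },
    },
    "exposures": {
        "exposure.shop.sales_overview": {
            "depends_on": {"nodes": ["model.shop.dim_customers"]}
        },
    },
}

edges = manifest_edges(sample_manifest)
for upstream, downstream in edges:
    print(f"{upstream} -> {downstream}")
```

With a real project, the dictionary would come from `json.load()` on the manifest file that dbt writes to its target directory.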

Tableau Metadata API

Extracts column-level lineage from Tableau workbooks, including calculated fields and data source connections.

Power BI REST API

Discovers dataset and report dependencies, DAX measure definitions, and workspace relationships.

SQL Query Analysis

Parses SQL queries from Lucaro dashboards to identify table and column references.
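To give a feel for table-reference extraction, here is a deliberately naive sketch that pulls the identifier following each `FROM` or `JOIN` keyword with a regular expression. It is an assumption-laden toy, not how Lucaro analyzes queries: it ignores quoted identifiers, treats CTE names as tables, and can be fooled by expressions such as `EXTRACT(year FROM ...)`. A real implementation would need a proper SQL parser, and column-level references are out of scope here. The function name and the sample query are hypothetical.

```python
import re

# Matches the identifier that follows FROM or JOIN, e.g. "analytics.fct_orders".
_TABLE_REF = re.compile(r"\b(?:from|join)\s+([A-Za-z_][\w.]*)", re.IGNORECASE)


def referenced_tables(sql: str) -> set[str]:
    """Naively collect lower-cased table names that appear after FROM or JOIN."""
    return {match.lower() for match in _TABLE_REF.findall(sql)}


query = """
SELECT c.customer_segment, SUM(o.amount) AS revenue
FROM analytics.fct_orders AS o
JOIN analytics.dim_customers AS c ON c.customer_id = o.customer_id
GROUP BY c.customer_segment
"""

tables = referenced_tables(query)
print(sorted(tables))
```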

Understanding the Lineage Graph

The lineage graph shows nodes (data assets) connected by edges (data flow):

Source Tables → dbt Models → Metrics → Dashboards
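One way to picture the graph is as a list of (upstream, downstream) edges that can be walked in either direction. The sketch below is a minimal breadth-first traversal over such a list; the edge data and the function names `downstream` and `upstream` are made up for illustration and say nothing about how Lucaro stores or queries lineage internally.

```python
from collections import deque

# Hypothetical edges (upstream, downstream), following the flow above.
EDGES = [
    ("raw.customers", "dim_customers"),
    ("dim_customers", "revenue_by_segment"),
    ("revenue_by_segment", "Sales Overview"),
]


def downstream(edges, start):
    """Return every node reachable from `start`, in breadth-first order."""
    children = {}
    for parent, child in edges:
        children.setdefault(parent, []).append(child)
    seen, order, queue = {start}, [], deque([start])
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


def upstream(edges, start):
    """Same walk with every edge reversed."""
    return downstream([(child, parent) for parent, child in edges], start)


print(downstream(EDGES, "dim_customers"))
print(upstream(EDGES, "revenue_by_segment"))
```

Walking downstream answers "what uses this data?", while walking upstream answers "where does this data come from?".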

Node Types

Type        Description
Source      Raw tables from source systems
Model       Transformed tables (dbt models, views)
Metric      Defined metrics from the registry
Dashboard   Lucaro, Tableau, or Power BI dashboards
Report      Scheduled reports and exports

Impact Analysis

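Before changing a schema or transformation logic, check which downstream assets would be affected. In the request below, replace {projectId} and YOUR_API_TOKEN with your own values: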

# Run impact analysis
curl -X POST "https://api.lucaro.dev/v2/projects/{projectId}/medallion/impact-analysis" \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "asset_id": "model_dim_customers",
    "change_type": "column_removed",
    "column": "customer_segment"
  }'

# Response
{
  "direct_impacts": [
    {"type": "metric", "name": "revenue_by_segment", "severity": "breaking"},
    {"type": "dashboard", "name": "Sales Overview", "severity": "breaking"}
  ],
  "indirect_impacts": [
    {"type": "report", "name": "Weekly Revenue Email", "severity": "warning"}
  ],
  "total_affected": 3
}
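A response like the one above could be checked automatically, for example in a pre-merge script that stops when a change would break something. The sketch below parses the sample response and lists every impact whose severity is "breaking". It assumes the response shape shown above and only the two severity values that appear there; the helper name `breaking_assets` is hypothetical.

```python
import json

# The sample response from above, pasted in as a string.
RESPONSE = """
{
  "direct_impacts": [
    {"type": "metric", "name": "revenue_by_segment", "severity": "breaking"},
    {"type": "dashboard", "name": "Sales Overview", "severity": "breaking"}
  ],
  "indirect_impacts": [
    {"type": "report", "name": "Weekly Revenue Email", "severity": "warning"}
  ],
  "total_affected": 3
}
"""


def breaking_assets(report: dict) -> list[str]:
    """Names of direct or indirect impacts marked "breaking"."""
    impacts = report.get("direct_impacts", []) + report.get("indirect_impacts", [])
    return [item["name"] for item in impacts if item.get("severity") == "breaking"]


report = json.loads(RESPONSE)
blocked = breaking_assets(report)
if blocked:
    print("Change would break:", ", ".join(blocked))
```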

Orphan Detection

Lucaro flags assets that are no longer used or whose dependencies are broken:

  • Orphaned tables - Tables not referenced by any models or dashboards
  • Orphaned dashboards - Dashboards with no recent views
  • Broken dependencies - Assets referencing deleted sources

# Find orphaned assets
curl "https://api.lucaro.dev/v2/projects/{projectId}/medallion/orphans" \
  -H "Authorization: Bearer YOUR_API_TOKEN"

# Response
{
  "orphaned_tables": ["stg_legacy_orders", "tmp_migration_data"],
  "orphaned_dashboards": ["Old Sales Report", "Test Dashboard"],
  "broken_dependencies": []
}
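The first and third checks can be pictured as simple set operations over lineage edges. The sketch below is one possible reading of the definitions above, not Lucaro's implementation: a table is treated as orphaned when it never appears on the upstream side of an edge, and a dependency as broken when its upstream asset is missing from the known inventory. The inventory, edges, and function names are hypothetical. Orphaned dashboards are left out because they depend on view history rather than on the graph.

```python
def orphaned_tables(tables, edges):
    """Tables that never appear as the upstream side of an edge."""
    referenced = {upstream for upstream, _ in edges}
    return sorted(t for t in tables if t not in referenced)


def broken_dependencies(known_assets, edges):
    """Edges whose upstream asset no longer exists in the inventory."""
    return sorted((u, d) for u, d in edges if u not in known_assets)


# Hypothetical inventory; two table names reuse the sample response above.
tables = {"dim_customers", "stg_legacy_orders", "tmp_migration_data"}
edges = [("dim_customers", "revenue_by_segment"), ("raw_events", "fct_sessions")]
known = tables | {"revenue_by_segment", "fct_sessions"}

print(orphaned_tables(tables, edges))
print(broken_dependencies(known, edges))
```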

Viewing Lineage in the UI

  1. Navigate to any dashboard, metric, or data asset
  2. Click the Lineage tab to view the graph
  3. Use the controls to expand upstream or downstream nodes
  4. Click any node to see details and navigate to that asset
  5. Use Impact Analysis to simulate changes