Medallion Architecture

Understand how Lucaro uses the Medallion Architecture to organize data layers and track data quality across your pipeline.

What is Medallion Architecture?

The Medallion Architecture is a data design pattern that organizes data into three layers, each more refined and higher in quality than the one before it:

Bronze

Raw Data

Raw, unprocessed data exactly as it arrives from source systems. Preserved for audit and reprocessing.

Silver

Cleaned Data

Cleaned, validated, and deduplicated data. Conforms to schema standards and data quality rules.

Gold

Business Data

Business-level aggregates, metrics, and curated datasets ready for analytics and reporting.
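
To make the three layers concrete, here is a minimal sketch of one dataset moving from Bronze to Silver to Gold. The sample records and the helper names (to_silver, to_gold) are hypothetical and purely illustrative; they are not part of Lucaro.

```python
from collections import defaultdict

# Bronze: raw records exactly as received (hypothetical sample data).
bronze_orders = [
    {"order_id": "1", "region": "EU", "amount": "10.50"},
    {"order_id": "1", "region": "EU", "amount": "10.50"},  # duplicate delivery
    {"order_id": "2", "region": "US", "amount": "20.00"},
    {"order_id": "3", "region": "US", "amount": None},     # fails validation
]


def to_silver(rows):
    """Validate, type-cast, and deduplicate raw rows by order_id."""
    seen = set()
    cleaned = []
    for row in rows:
        # Skip rows with a missing amount and rows already seen.
        if row["amount"] is None or row["order_id"] in seen:
            continue
        seen.add(row["order_id"])
        cleaned.append({
            "order_id": int(row["order_id"]),
            "region": row["region"],
            "amount": float(row["amount"]),
        })
    return cleaned


def to_gold(rows):
    """Aggregate cleaned rows into revenue per region."""
    revenue = defaultdict(float)
    for row in rows:
        revenue[row["region"]] += row["amount"]
    return dict(revenue)


silver_orders = to_silver(bronze_orders)
gold_revenue_by_region = to_gold(silver_orders)
```

The Bronze list is never modified, so the Silver and Gold steps can be re-run at any time from the same raw input.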

Benefits

Data Quality Tracking

Track quality scores at each layer. Identify where data quality issues originate and propagate.

Clear Ownership

Assign owners to each layer and asset. Know who is responsible for data at each stage.

Impact Analysis

Understand how changes to source data flow through to business metrics and dashboards.

Reprocessing Capability

Because the Bronze layer preserves the raw data, you can re-run transformations from the original input whenever their logic changes.
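
Impact analysis, as described above, amounts to walking a lineage graph downstream from the asset that changed. The sketch below assumes a hypothetical lineage map; apart from bronze_orders and gold_revenue_summary, the asset names are invented for illustration, and the traversal is a generic breadth-first search, not Lucaro's documented implementation.

```python
from collections import deque

# Hypothetical lineage: each asset maps to the assets that read from it.
downstream = {
    "bronze_orders": ["silver_orders"],
    "silver_orders": ["gold_revenue_summary", "gold_order_counts"],
    "gold_revenue_summary": ["dashboard_revenue"],
    "gold_order_counts": [],
    "dashboard_revenue": [],
}


def impacted_assets(graph, changed):
    """Return every asset downstream of `changed`, in breadth-first order."""
    visited = []
    seen = {changed}
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for child in graph.get(current, []):
            if child not in seen:
                seen.add(child)
                visited.append(child)
                queue.append(child)
    return visited


affected = impacted_assets(downstream, "bronze_orders")
```

In this toy graph, a change to bronze_orders reaches the Silver table, both Gold tables, and the dashboard, while a change to gold_order_counts affects nothing else.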

How Lucaro Uses Medallion

Lucaro automatically detects and categorizes your data assets into medallion layers:

Asset Detection

When you connect data sources and BI tools, Lucaro scans for:

  • Raw tables from source systems (Bronze)
  • Transformed and cleaned tables (Silver)
  • Aggregated views and metrics tables (Gold)
  • Relationships between assets

Layer Assignment

Assets are assigned to layers based on:

# Layer detection heuristics
Bronze:
  - Tables with _raw or _staging suffix
  - Tables in raw_* or staging_* schemas
  - Direct loads from source systems

Silver:
  - Tables with cleaned, transformed data
  - Tables in analytics_* or processed_* schemas
  - dbt models without final aggregations

Gold:
  - Aggregate tables and materialized views
  - Tables in marts_* or reporting_* schemas
  - Metric tables and KPI summaries
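
The naming-based rules above can be approximated in a few lines. This is a minimal sketch that covers only the schema-prefix and table-suffix rules; the function name guess_layer and the order in which rules are checked are assumptions, and Lucaro's actual detection also considers signals (such as load type and dbt model structure) that are not modeled here.

```python
from typing import Optional

# Schema prefixes taken from the heuristics listed above.
SCHEMA_PREFIXES = {
    "bronze": ("raw_", "staging_"),
    "silver": ("analytics_", "processed_"),
    "gold": ("marts_", "reporting_"),
}
BRONZE_TABLE_SUFFIXES = ("_raw", "_staging")


def guess_layer(schema: str, table: str) -> Optional[str]:
    """Guess a medallion layer from naming alone; return None if nothing matches."""
    schema, table = schema.lower(), table.lower()
    # Assumed precedence: the table-suffix rule is checked before schema prefixes.
    if table.endswith(BRONZE_TABLE_SUFFIXES):
        return "bronze"
    for layer, prefixes in SCHEMA_PREFIXES.items():
        if schema.startswith(prefixes):
            return layer
    return None
```

For example, guess_layer("marts_finance", "revenue") returns "gold", while a table in a schema with no recognized prefix or suffix returns None and would need another signal or a manual assignment.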

Quality Tracking

Lucaro tracks the following quality metrics for each asset in the medallion architecture:

Quality Check    Description
Freshness        How recently the data was updated
Completeness     Percentage of non-null values in key columns
Schema Drift     Detection of unexpected column changes
Row Count        Anomaly detection on row count changes
Quality Score    Composite score from 0-100
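
As an illustration of how such metrics could combine, the sketch below computes completeness as the share of non-null values in key columns and folds several per-check scores into one 0-100 composite using a weighted average. The weights, the sample rows, and both function names are hypothetical; the formula Lucaro actually uses for its Quality Score is not specified here.

```python
def completeness(rows, key_columns):
    """Percentage of non-null values across the given key columns."""
    total = len(rows) * len(key_columns)
    if total == 0:
        return 100.0
    non_null = sum(
        1 for row in rows for col in key_columns if row.get(col) is not None
    )
    return 100.0 * non_null / total


def composite_score(check_scores, weights):
    """Weighted average of per-check scores, each on a 0-100 scale."""
    total_weight = sum(weights[name] for name in check_scores)
    weighted = sum(score * weights[name] for name, score in check_scores.items())
    return round(weighted / total_weight, 1)


rows = [
    {"order_id": 1, "customer_id": 7},
    {"order_id": 2, "customer_id": None},
]
# Hypothetical weights and per-check scores, chosen only for illustration.
weights = {"freshness": 3, "completeness": 3, "schema_drift": 2, "row_count": 2}
scores = {
    "freshness": 100.0,
    "completeness": completeness(rows, ["order_id", "customer_id"]),
    "schema_drift": 100.0,
    "row_count": 100.0,
}
quality_score = composite_score(scores, weights)
```

Here three of the four key-column values are present, so completeness is 75.0, and with the assumed weights the composite comes out at 92.5.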

API Access

Query medallion assets and their relationships via the API:

# List all medallion assets
# (replace {projectId} and YOUR_API_TOKEN with your own values)
curl "https://api.lucaro.dev/v2/projects/{projectId}/medallion/assets" \
  -H "Authorization: Bearer YOUR_API_TOKEN"

# Get lineage for an asset
curl "https://api.lucaro.dev/v2/projects/{projectId}/medallion/lineage?asset_id=gold_revenue_summary" \
  -H "Authorization: Bearer YOUR_API_TOKEN"

# Run impact analysis
curl -X POST "https://api.lucaro.dev/v2/projects/{projectId}/medallion/impact-analysis" \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"asset_id": "bronze_orders", "change_type": "schema_change"}'
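
The same impact-analysis call can be prepared from Python using only the standard library. This sketch mirrors the URL, headers, and JSON body of the curl example above; the function name build_impact_request and the project ID "my-project" are placeholders, and the shape of the response is not documented here, so the sending step is left commented out.

```python
import json
import urllib.request

API_BASE = "https://api.lucaro.dev/v2"  # base URL from the curl examples above


def build_impact_request(project_id, token, asset_id, change_type):
    """Build (but do not send) the impact-analysis POST request."""
    url = f"{API_BASE}/projects/{project_id}/medallion/impact-analysis"
    body = json.dumps({"asset_id": asset_id, "change_type": change_type})
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


request = build_impact_request(
    "my-project", "YOUR_API_TOKEN", "bronze_orders", "schema_change"
)

# To send it (requires a valid token and network access):
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```

Building the request separately from sending it makes it easy to inspect the URL and payload before anything reaches the API.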