Medallion Architecture
Understand how Lucaro uses the Medallion Architecture to organize data layers and track data quality across your pipeline.
What is Medallion Architecture?
The Medallion Architecture is a data design pattern that organizes data into three layers (Bronze, Silver, and Gold), each more refined and of higher quality than the one before it:
Raw Data (Bronze)
Raw, unprocessed data exactly as it arrives from source systems. Preserved for audit and reprocessing.
Cleaned Data (Silver)
Cleaned, validated, and deduplicated data. Conforms to schema standards and data quality rules.
Business Data (Gold)
Business-level aggregates, metrics, and curated datasets ready for analytics and reporting.
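To make the three layers concrete, here is a minimal sketch of one dataset moving from Bronze to Silver to Gold. The sample records, the `to_silver` and `to_gold` helpers, and the validation rules are all hypothetical and only illustrate the pattern; they are not how Lucaro or your pipeline processes data.

```python
from collections import defaultdict

# Bronze: raw records exactly as they arrived (hypothetical sample data).
bronze_orders = [
    {"order_id": "1", "customer": "acme", "amount": "100.0"},
    {"order_id": "1", "customer": "acme", "amount": "100.0"},           # duplicate delivery
    {"order_id": "2", "customer": "globex", "amount": "not-a-number"},  # invalid amount
    {"order_id": "3", "customer": "acme", "amount": "50.5"},
]


def to_silver(rows):
    """Validate, type-cast, and deduplicate Bronze rows by order_id."""
    seen, cleaned = set(), []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # drop rows that fail validation
        if row["order_id"] in seen:
            continue  # drop duplicates
        seen.add(row["order_id"])
        cleaned.append(
            {"order_id": int(row["order_id"]), "customer": row["customer"], "amount": amount}
        )
    return cleaned


def to_gold(rows):
    """Aggregate Silver rows into revenue per customer."""
    revenue = defaultdict(float)
    for row in rows:
        revenue[row["customer"]] += row["amount"]
    return dict(revenue)


silver_orders = to_silver(bronze_orders)  # two valid, unique orders remain
gold_revenue = to_gold(silver_orders)     # {"acme": 150.5}
```

Because `bronze_orders` is never modified, the Silver and Gold steps can be re-run at any time with different rules.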
Benefits
Data Quality Tracking
Track quality scores at each layer. Identify where data quality issues originate and propagate.
Clear Ownership
Assign owners to each layer and asset. Know who is responsible for data at each stage.
Impact Analysis
Understand how changes to source data flow through to business metrics and dashboards.
Reprocessing Capability
Because the Bronze layer preserves raw data unchanged, you can re-run transformations whenever their logic needs to be updated.
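The impact analysis benefit above amounts to walking a lineage graph downstream from the asset that changed. The sketch below assumes a simple in-memory adjacency map; apart from `bronze_orders` and `gold_revenue_summary`, the asset names are invented for illustration, and this is not a description of Lucaro's internal implementation.

```python
from collections import deque

# Hypothetical lineage: each asset maps to the assets that read from it.
downstream = {
    "bronze_orders": ["silver_orders"],
    "silver_orders": ["gold_revenue_summary", "gold_customer_kpis"],
    "gold_revenue_summary": ["revenue_dashboard"],
    "gold_customer_kpis": [],
    "revenue_dashboard": [],
}


def impacted_assets(graph, changed):
    """Return every asset reachable downstream of `changed`, in breadth-first order."""
    seen, order, queue = {changed}, [], deque([changed])
    while queue:
        for child in graph.get(queue.popleft(), []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


affected = impacted_assets(downstream, "bronze_orders")
# ["silver_orders", "gold_revenue_summary", "gold_customer_kpis", "revenue_dashboard"]
```

A change to a Gold asset touches far fewer downstream assets than a change to a Bronze one, which is why schema changes at the source deserve the most scrutiny.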
How Lucaro Uses Medallion
Lucaro automatically detects and categorizes your data assets into medallion layers:
Asset Detection
When you connect data sources and BI tools, Lucaro scans for:
- Raw tables from source systems (Bronze)
- Transformed and cleaned tables (Silver)
- Aggregated views and metrics tables (Gold)
- Relationships between assets
Layer Assignment
Assets are assigned to layers based on:
# Layer detection heuristics
Bronze:
  - Tables with _raw or _staging suffix
  - Tables in raw_* or staging_* schemas
  - Direct loads from source systems
Silver:
  - Tables with cleaned, transformed data
  - Tables in analytics_* or processed_* schemas
  - dbt models without final aggregations
Gold:
  - Aggregate tables and materialized views
  - Tables in marts_* or reporting_* schemas
  - Metric tables and KPI summaries
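The naming-based rules above can be approximated in a few lines. The sketch below covers only the schema-prefix and table-suffix rules; the function name `guess_layer`, the order in which rules are checked, and the `None` fallback are assumptions, and the content-based rules (such as "direct loads" or "dbt models without final aggregations") are left out because they need more than a name to evaluate.

```python
from typing import Optional

# Prefixes and suffixes taken from the heuristics listed above.
BRONZE_SCHEMA_PREFIXES = ("raw_", "staging_")
BRONZE_TABLE_SUFFIXES = ("_raw", "_staging")
SILVER_SCHEMA_PREFIXES = ("analytics_", "processed_")
GOLD_SCHEMA_PREFIXES = ("marts_", "reporting_")


def guess_layer(schema: str, table: str) -> Optional[str]:
    """Guess a medallion layer from naming alone; None means 'no rule matched'."""
    schema, table = schema.lower(), table.lower()
    # Assumption: Bronze rules are checked first, then Gold, then Silver.
    if schema.startswith(BRONZE_SCHEMA_PREFIXES) or table.endswith(BRONZE_TABLE_SUFFIXES):
        return "bronze"
    if schema.startswith(GOLD_SCHEMA_PREFIXES):
        return "gold"
    if schema.startswith(SILVER_SCHEMA_PREFIXES):
        return "silver"
    return None


print(guess_layer("raw_shopify", "orders"))     # bronze
print(guess_layer("marts_finance", "revenue"))  # gold
```

If your schemas follow these naming conventions, assets are more likely to land in the layer you expect.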
Quality Tracking
Each asset in the medallion architecture has quality metrics:
| Quality Check | Description |
|---|---|
| Freshness | How recently the data was updated |
| Completeness | Percentage of non-null values in key columns |
| Schema Drift | Detection of unexpected column changes |
| Row Count | Anomaly detection on row count changes |
| Quality Score | Composite score from 0 to 100 |
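
As a rough illustration of two of the checks in the table, the sketch below computes completeness and freshness for an in-memory list of rows. The function names, the treatment of an empty table, and the choice of hours as the freshness unit are assumptions. The composite Quality Score is not sketched because this page does not state how it is calculated.

```python
from datetime import datetime, timezone


def completeness(rows, key_columns):
    """Percentage (0 to 100) of non-null values across the key columns."""
    total = len(rows) * len(key_columns)
    if total == 0:
        return 100.0  # assumption: an empty table has nothing missing
    filled = sum(1 for row in rows for col in key_columns if row.get(col) is not None)
    return 100.0 * filled / total


def freshness_hours(last_updated, now=None):
    """Hours elapsed since the asset was last updated."""
    now = now or datetime.now(timezone.utc)
    return (now - last_updated).total_seconds() / 3600


# Hypothetical sample rows: one of four key values is missing.
rows = [{"id": 1, "email": "a@example.com"}, {"id": 2, "email": None}]
score = completeness(rows, ["id", "email"])  # 75.0
```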
API Access
Query medallion assets and their relationships via the API:
# List all medallion assets
curl "https://api.lucaro.dev/v2/projects/{projectId}/medallion/assets" \
  -H "Authorization: Bearer YOUR_API_TOKEN"

# Get lineage for an asset
curl "https://api.lucaro.dev/v2/projects/{projectId}/medallion/lineage?asset_id=gold_revenue_summary" \
  -H "Authorization: Bearer YOUR_API_TOKEN"

# Run impact analysis
curl -X POST "https://api.lucaro.dev/v2/projects/{projectId}/medallion/impact-analysis" \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"asset_id": "bronze_orders", "change_type": "schema_change"}'
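
If you prefer to call the API from a script, the same three requests can be built with Python's standard library. This is a minimal sketch: the endpoints, headers, and payload mirror the curl examples above, while the helper names `build_request` and `send`, the `my-project` ID, and the assumption that responses are JSON are illustrative only.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.lucaro.dev/v2"


def build_request(project_id, path, token, params=None, body=None):
    """Build a GET request, or a JSON POST request when `body` is given."""
    url = f"{BASE_URL}/projects/{project_id}/medallion/{path}"
    if params:
        url += "?" + urllib.parse.urlencode(params)
    headers = {"Authorization": f"Bearer {token}"}
    data = None
    if body is not None:
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    method = "POST" if body is not None else "GET"
    return urllib.request.Request(url, data=data, headers=headers, method=method)


def send(request):
    """Send the request and decode the response (assumed to be JSON)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# "my-project" is a placeholder project ID; replace the token with your own.
list_assets = build_request("my-project", "assets", "YOUR_API_TOKEN")
lineage = build_request(
    "my-project", "lineage", "YOUR_API_TOKEN",
    params={"asset_id": "gold_revenue_summary"},
)
impact = build_request(
    "my-project", "impact-analysis", "YOUR_API_TOKEN",
    body={"asset_id": "bronze_orders", "change_type": "schema_change"},
)
```

Nothing is sent until you call `send(list_assets)` (or one of the other requests), so you can inspect the URL and headers first.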