Data Modeling for Analytics
Dimensional design, slowly changing dimensions and the one-big-table debate
By Houssam Kodad
One-time purchase
€27.95
VAT included where applicable
- Instant download after purchase
- Readable on any device
- Free updates to this edition
- Secure checkout
About this book
What's inside
Most warehouse pain traces back to the model, not the tooling. This book teaches dimensional modelling as a practical craft for the cloud-warehouse era: facts and dimensions, grain, slowly changing dimensions, and when a wide denormalised table actually wins. You'll learn to design schemas that analysts can navigate without a map and that stay correct as the business changes underneath them.
What you'll learn
Skills you'll walk away with
- Pick the right grain for a fact table and never break it
- Design conformed dimensions shared across the business
- Implement Type 1, 2 and 3 slowly changing dimensions
- Model many-to-many relationships with bridge tables
- Decide between star schemas and one-big-table designs
- Handle late-arriving dimensions and dimension reloads
- Translate messy source systems into clean analytical models
Table of contents
9 chapters
01
Why Modelling Still Matters in the Cloud
- Cheap compute, expensive confusion
- The analyst as your real user
- Symptoms of a bad model
02
Facts, Dimensions and Grain
- Declaring the grain first
- Additive, semi-additive and non-additive facts
- Degenerate and factless facts
03
Designing Dimensions People Can Use
- Attributes and hierarchies
- Surrogate keys and natural keys
- Conformed dimensions across marts
04
Slowly Changing Dimensions in Practice
- Type 1, 2 and 3 explained
- Effective dates and current flags
- Auditing history without bloat
05
Many-to-Many and Bridge Tables
- When a foreign key is not enough
- Weighting factors and allocation
- Avoiding double counting
06
The One-Big-Table Debate
- Denormalisation in columnar stores
- Trade-offs in cost and clarity
- A pragmatic decision framework
07
Late-Arriving Data and Reloads
- Late-arriving facts
- Late-arriving dimensions
- Idempotent rebuilds
08
Modelling Messy Source Systems
- Taming application databases
- Event data into dimensional models
- Handling deletes and soft deletes
09
Metrics, Semantics and the Final Mile
- A single definition per metric
- The semantic layer
- Documenting the model for analysts
This is the full chapter list — exactly what you'll receive in the PDF.
More in Data Engineering
Keep exploring this track
Building Reliable Data Pipelines with dbt and Airflow
Orchestration, testing and incremental models for production warehouses
Streaming Data Engineering with Kafka and Flink
Real-time pipelines, exactly-once processing and stateful streams
Spark Performance Tuning: A Field Guide
Diagnosing shuffles, skew and memory pressure in production
Data Quality and Observability
Contracts, tests and lineage for pipelines you can trust