Data Modeling for Analytics

Dimensional design, slowly changing dimensions and the one-big-table debate

By Houssam Kodad

PDF, 232 pages, Intermediate, English

One-time purchase

€27.95

VAT included where applicable

Instant download after purchase
Readable on any device
Free updates to this edition
Secure checkout

About this book

What's inside

Most warehouse pain traces back to the model, not the tooling. This book teaches dimensional modelling as a practical craft for the cloud-warehouse era: facts and dimensions, grain, slowly changing dimensions, and when a wide denormalised table actually wins. You'll learn to design schemas that analysts can navigate without a map and that stay correct as the business changes underneath them.

What you'll learn

Skills you'll walk away with

Pick the right grain for a fact table and never break it
Design conformed dimensions shared across the business
Implement Type 1, 2 and 3 slowly changing dimensions
Model many-to-many relationships with bridge tables
Decide between star schemas and one-big-table designs
Handle late-arriving dimensions and dimension reloads
Translate messy source systems into clean analytical models

Table of contents

9 chapters

01
Why Modelling Still Matters in the Cloud
- Cheap compute, expensive confusion
- The analyst as your real user
- Symptoms of a bad model
02
Facts, Dimensions and Grain
- Declaring the grain first
- Additive, semi-additive and non-additive facts
- Degenerate and factless facts
03
Designing Dimensions People Can Use
- Attributes and hierarchies
- Surrogate keys and natural keys
- Conformed dimensions across marts
04
Slowly Changing Dimensions in Practice
- Type 1, 2 and 3 explained
- Effective dates and current flags
- Auditing history without bloat
05
Many-to-Many and Bridge Tables
- When a foreign key is not enough
- Weighting factors and allocation
- Avoiding double counting
06
The One-Big-Table Debate
- Denormalisation in columnar stores
- Trade-offs in cost and clarity
- A pragmatic decision framework
07
Late-Arriving Data and Reloads
- Late-arriving facts
- Late-arriving dimensions
- Idempotent rebuilds
08
Modelling Messy Source Systems
- Taming application databases
- Event data into dimensional models
- Handling deletes and soft deletes
09
Metrics, Semantics and the Final Mile
- A single definition per metric
- The semantic layer
- Documenting the model for analysts