Statistical Foundations for Data Scientists
Inference, experimentation and A/B testing done right
By Houssam Kodad
One-time purchase
€24.95
VAT included where applicable
- Instant download after purchase
- Readable on any device
- Free updates to this edition
- Secure checkout
About this book
What's inside
Plenty of data scientists can train a model but stumble when asked whether a result is real. This book rebuilds the statistical foundations that matter day to day: sampling, uncertainty, hypothesis testing and — above all — running and reading experiments. It's written for practitioners who want intuition and correct practice, not proofs, so you can design an A/B test that survives scrutiny.
What you'll learn
Skills you'll walk away with
- Reason about sampling, variance and the standard error
- Build and interpret confidence intervals correctly
- Run hypothesis tests without common misinterpretations
- Design A/B tests with adequate power and sample size
- Avoid peeking, p-hacking and multiple-comparison traps
- Choose the right test for proportions, means and counts
- Communicate uncertainty to non-technical stakeholders
Table of contents
9 chapters
01 Thinking in Distributions
- Populations and samples
- Variance and the standard error
- The central limit theorem in practice
02 Estimation and Confidence
- Point estimates and bias
- Confidence intervals
- Bootstrapping uncertainty
03 Hypothesis Testing Without Myths
- Null and alternative framing
- What a p-value is and is not
- Type I and Type II errors
04 Choosing the Right Test
- Means, proportions and counts
- Parametric vs non-parametric
- Paired and unpaired designs
05 Designing an A/B Test
- Hypotheses and metrics
- Power and sample size
- Randomisation pitfalls
06 Analysing an Experiment
- Effect size and intervals
- Segmentation and Simpson’s paradox
- Guardrail metrics
07 The Ways Experiments Lie
- Peeking and early stopping
- Multiple comparisons
- Novelty and network effects
08 Beyond the Basic Test
- Sequential testing
- CUPED and variance reduction
- Bayesian A/B testing
09 Communicating Results
- Decisions under uncertainty
- Visualising effects
- Writing a trustworthy readout
This is the full chapter list — exactly what you'll receive in the PDF.
More in Data Science & ML
Keep exploring this track
Feature Engineering for Machine Learning
Crafting, selecting and serving features that move metrics
Practical MLOps: From Notebook to Production
Packaging, deployment, monitoring and retraining that lasts
Time Series Forecasting in Practice
Classical models, gradient boosting and deep learning for demand and operations
Recommender Systems at Scale
Collaborative filtering, embeddings and multi-stage ranking
Gradient Boosting with XGBoost and LightGBM
A practitioner's guide to winning with tabular data