Feature Engineering for Machine Learning
Crafting, selecting and serving features that move metrics
By Houssam Kodad
One-time purchase
€27.95
VAT included where applicable
- Instant download after purchase
- Readable on any device
- Free updates to this edition
- Secure checkout
About this book
What's inside
Better features beat fancier models more often than anyone admits. This book is a practical tour of feature engineering for tabular, temporal and text data, with a hard focus on the thing that breaks real systems: training/serving skew. You'll learn to build features that are predictive, reproducible and available at inference time, not just impressive in a notebook.
What you'll learn
Skills you'll walk away with
- Engineer features for numerical, categorical and date fields
- Encode high-cardinality categories without leakage
- Build temporal and window features from event data
- Avoid target leakage in cross-validation and pipelines
- Close the gap between training and serving features
- Select features with importance scores, permutation importance and pruning
- Operate a feature store for online and offline parity
Table of contents
9 chapters
01
Why Features Decide the Outcome
- Model capacity vs signal
- The notebook-to-production gap
- A workflow for the book
02
Numerical Features Done Right
- Scaling and transforms
- Binning and outliers
- Interactions and ratios
03
Encoding Categorical Variables
- One-hot vs ordinal
- Target and frequency encoding
- High-cardinality strategies
04
Time, Dates and Window Features
- Calendar and cyclical features
- Lag and rolling-window features
- Avoiding lookahead bias
05
Text and Embeddings as Features
- Bag-of-words to TF-IDF
- Pretrained embeddings
- Dimensionality reduction
06
The Leakage Traps
- Target leakage in practice
- Leakage through preprocessing
- Leak-proof cross-validation
07
Selecting What Matters
- Filter, wrapper and embedded methods
- Permutation importance
- Pruning redundant features
08
Training/Serving Parity
- Why features drift apart
- Shared transformation code
- Point-in-time correctness
09
Feature Stores in Production
- Offline and online stores
- Backfills and freshness
- Governance and reuse
This is the full chapter list, exactly what you'll receive in the PDF.
More in Data Science & ML
Keep exploring this track
Practical MLOps: From Notebook to Production
Packaging, deployment, monitoring and retraining that lasts
Time Series Forecasting in Practice
Classical models, gradient boosting and deep learning for demand and operations
Recommender Systems at Scale
Collaborative filtering, embeddings and multi-stage ranking
Statistical Foundations for Data Scientists
Inference, experimentation and A/B testing done right
Gradient Boosting with XGBoost and LightGBM
A practitioner's guide to winning with tabular data