Gradient Boosting with XGBoost and LightGBM
A practitioner's guide to winning with tabular data
By Houssam Kodad
One-time purchase
€22.95
VAT included where applicable
- Instant download after purchase
- Readable on any device
- Free updates to this edition
- Secure checkout
About this book
What's inside
On tabular data, gradient-boosted trees still beat almost everything — when you use them well. This focused guide explains how boosting actually works and turns that understanding into practical skill with XGBoost and LightGBM. You'll learn to tune the parameters that matter, handle categoricals and imbalance, read feature importance honestly, and avoid the overfitting that quietly wrecks leaderboard models in production.
What you'll learn
Skills you'll walk away with
- Understand how gradient boosting builds trees sequentially
- Tune the handful of parameters that actually matter
- Handle categorical features and class imbalance
- Use early stopping and proper validation
- Interpret models with SHAP and feature importance the right way
- Compare the trade-offs between XGBoost, LightGBM and CatBoost
- Ship boosted models without overfitting surprises
Table of contents
8 chapters
01
How Boosting Actually Works
- Additive models and residuals
- Gradients and loss functions
- Trees as weak learners
02
XGBoost and LightGBM Under the Hood
- Histogram-based splits
- Leaf-wise vs level-wise growth
- Regularisation terms
03
The Parameters That Matter
- Learning rate and trees
- Depth, leaves and min child
- Subsampling and column sampling
04
Categoricals, Missing Values and Imbalance
- Native categorical handling
- Missing-value behaviour
- Class weights and sampling
05
Validation and Early Stopping
- Cross-validation schemes
- Early stopping rounds
- Avoiding validation leakage
06
Tuning Without Wasting Weeks
- Sensible search spaces
- Bayesian optimisation
- Knowing when to stop
07
Interpreting Boosted Models
- Gain vs permutation importance
- SHAP values in practice
- Partial dependence
08
Shipping to Production
- Serialisation and serving
- Monitoring for drift
- Retraining cadence
This is the full chapter list — exactly what you'll receive in the PDF.
More in Data Science & ML
Keep exploring this track
Feature Engineering for Machine Learning
Crafting, selecting and serving features that move metrics
Practical MLOps: From Notebook to Production
Packaging, deployment, monitoring and retraining that lasts
Time Series Forecasting in Practice
Classical models, gradient boosting and deep learning for demand and operations
Recommender Systems at Scale
Collaborative filtering, embeddings and multi-stage ranking
Statistical Foundations for Data Scientists
Inference, experimentation and A/B testing done right