Gradient Boosting with XGBoost and LightGBM

A practitioner's guide to winning with tabular data

By Houssam Kodad

PDF · 164 pages · Intermediate · English

One-time purchase

€22.95

VAT included where applicable


Instant download after purchase
Readable on any device
Free updates to this edition
Secure checkout

About this book

What's inside

On tabular data, gradient-boosted trees still beat almost everything, provided you use them well. This focused guide explains how boosting actually works and turns that understanding into practical skill with XGBoost and LightGBM. You'll learn to tune the parameters that matter, handle categorical features and class imbalance, read feature importance honestly, and avoid the overfitting that quietly wrecks leaderboard models once they reach production.

What you'll learn

Skills you'll walk away with

Understand how gradient boosting builds trees sequentially
Tune the handful of parameters that actually matter
Handle categorical features and class imbalance
Use early stopping and proper validation
Interpret models with SHAP and feature importance the right way
Compare XGBoost, LightGBM and CatBoost trade-offs
Ship boosted models without overfitting surprises

Table of contents

8 chapters

01 How Boosting Actually Works
- Additive models and residuals
- Gradients and loss functions
- Trees as weak learners
02 XGBoost and LightGBM Under the Hood
- Histogram-based splits
- Leaf-wise vs level-wise growth
- Regularisation terms
03 The Parameters That Matter
- Learning rate and trees
- Depth, leaves and min child
- Subsampling and column sampling
04 Categoricals, Missing Values and Imbalance
- Native categorical handling
- Missing-value behaviour
- Class weights and sampling
05 Validation and Early Stopping
- Cross-validation schemes
- Early stopping rounds
- Avoiding validation leakage
06 Tuning Without Wasting Weeks
- Sensible search spaces
- Bayesian optimisation
- Knowing when to stop
07 Interpreting Boosted Models
- Gain vs permutation importance
- SHAP values in practice
- Partial dependence
08 Shipping to Production
- Serialisation and serving
- Monitoring for drift
- Retraining cadence