Fine-Tuning and Adapting Open LLMs

LoRA, quantization and instruction tuning on your own data

By Houssam Kodad

PDF, 216 pages, Advanced level, English

One-time purchase

€28.95

VAT included where applicable

Instant download after purchase
Readable on any device
Free updates to this edition

About this book

What's inside

When prompting hits its limits, fine-tuning an open model on your own data is the next move, but it offers plenty of expensive ways to waste a week. This book gives you a grounded process for adapting open LLMs with LoRA and QLoRA, from building a clean instruction dataset to evaluating and serving the result. You'll learn what fine-tuning can and can't fix, and how to do it on a realistic GPU budget.

What you'll learn

Skills you'll walk away with

Decide when to fine-tune versus prompt or use RAG
Build and clean an instruction-tuning dataset
Apply LoRA and QLoRA for parameter-efficient tuning
Use quantization to train and serve on modest GPUs
Set hyperparameters that converge without overfitting
Evaluate a tuned model against honest baselines
Serve adapters efficiently in production

Table of contents

9 chapters

01 When Fine-Tuning Is the Right Tool
- Prompting vs RAG vs tuning
- What tuning can and cannot fix
- Cost and effort reality check

02 Datasets Make or Break It
- Instruction data design
- Cleaning and deduplication
- Synthetic data with care

03 Parameter-Efficient Fine-Tuning
- Full vs LoRA tuning
- How LoRA adapters work
- Targeting the right layers

04 Quantization for Modest GPUs
- 8-bit and 4-bit basics
- QLoRA end to end
- Memory and throughput trade-offs

05 Running a Training Job
- Hyperparameters that matter
- Monitoring loss and stability
- Checkpointing and resuming

06 Alignment and Preference Tuning
- Supervised fine-tuning
- DPO and preference data
- Avoiding capability regressions

07 Evaluating a Tuned Model
- Task-specific benchmarks
- Regression against the base model
- Human evaluation

08 Serving Fine-Tuned Models
- Merging vs serving adapters
- Multi-adapter serving
- Latency and batching

09 Maintaining Adapted Models
- Versioning data and weights
- Re-tuning on new data
- Governance and licensing