DRM-free · Yours to keep forever
Cloud & Infrastructure

BigQuery for Data Engineers

Warehouse design, optimization and cost control on Google Cloud

By Houssam Kodad

PDF · 208 pages · Intermediate · English

One-time purchase

€26.95

VAT included where applicable

  • Instant download after purchase
  • Readable on any device
  • Free updates to this edition
  • Secure checkout

About this book

What's inside

BigQuery makes it trivially easy to run a query, and just as easy to run a very expensive one. This book teaches data engineers to design schemas, ingestion pipelines and queries that work with BigQuery's architecture instead of fighting it. You'll master partitioning, clustering, slot economics and the cost controls that keep a serverless warehouse fast and predictable as usage grows.

What you'll learn

Skills you'll walk away with

  • Understand BigQuery storage and the slot execution model
  • Design tables with partitioning and clustering that pay off
  • Load and stream data efficiently and idempotently
  • Write queries that scan less and run faster
  • Choose between on-demand and capacity pricing
  • Control cost with quotas, reservations and monitoring
  • Schedule and transform data with scheduled queries and dbt

Table of contents

9 chapters
  1. How BigQuery Works
    • Separation of storage and compute
    • Slots and the execution model
    • Columnar storage internals
  2. Designing Tables
    • Partitioning by time and range
    • Clustering for pruning
    • Nested and repeated fields
  3. Getting Data In
    • Batch loads and external tables
    • Streaming inserts
    • Idempotent ingestion patterns
  4. Writing Queries That Scan Less
    • Pruning with partitions
    • Avoiding SELECT *
    • Reading the query plan
  5. Joins, Window Functions and Scale
    • Broadcast vs shuffle joins
    • Window functions at scale
    • Approximate aggregations
  6. Pricing and Slot Economics
    • On-demand vs editions
    • Reservations and autoscaling
    • Estimating query cost
  7. Cost Control That Sticks
    • Quotas and custom limits
    • Monitoring with INFORMATION_SCHEMA
    • Billing alerts and labels
  8. Transformations and Scheduling
    • Scheduled queries
    • dbt on BigQuery
    • Materialised views
  9. Governance and Sharing
    • IAM and authorized views
    • Column and row security
    • Analytics Hub sharing

This is the full chapter list: exactly what you'll receive in the PDF.