
Chapter 11 checkpoint

You can now reason about scale and reliability, the judgment calls they demand, and the career built around them. Recall the chapter, then prove it.

The throughline

  • Scaling out beats scaling up. Vertical scaling is simple but capped by the largest available machine, and that machine is a single point of failure; horizontal scaling has a far higher ceiling and survives the loss of a node. Its unlock is statelessness: push state into shared caches, databases, and object storage so any server can handle any request. Load balancers (L4 = fast, routes by IP/port; L7 = smart, routes on HTTP content) spread traffic; caches avoid redoing work; queues make slow work asynchronous and absorb spikes. Design assuming constant failure.
  • Distributed reality is hard. During a network partition, CAP forces a choice between consistency and availability; pick strong or eventual consistency per kind of data. State is the hard part: reach for replication and read replicas first, sharding last. Make operations idempotent, retry with backoff + jitter, and use circuit breakers. Build with redundancy, default to multi-AZ, degrade gracefully, and define RTO/RPO (how long recovery may take, how much data you may lose), then test backups by restoring them.
  • Match complexity to need. Plan capacity, load test (k6/Locust) to find the real ceiling, and autoscale: HPA adds or removes pods, VPA right-sizes pod resource requests, and the cluster autoscaler or Karpenter adds or removes nodes. Don't over-engineer: adopting microservices, Kubernetes, or multi-cloud before you have the problem they solve is the dominant failure mode.
  • Decide and write it down. Build-vs-buy, managed-vs-self-hosted, monolith-vs-microservices, single-vs-multi-cloud each default to the simpler option; deviate on a stated trigger. Record decisions in ADRs, align via RFCs, and measure delivery with DORA (deploy frequency, lead time, change failure rate, MTTR).
  • The career spans Cloud/DevOps/SRE/Platform/DevSecOps: the same foundation with different emphasis, on-call load, and compensation. Certs signal, portfolios prove; soft skills and incident leadership drive seniority. Invest in the durable (Linux, networking, distributed systems, declarative mindset, decisions) over the dated (specific tools).
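
As a refresher on the "retry with backoff + jitter, and use circuit breakers" point, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a production implementation: the names `CircuitBreaker`, `CircuitOpenError`, `backoff_delay`, and `retry`, and all default thresholds, are hypothetical choices made for this example. The jitter strategy shown is "full jitter" (a random delay between zero and the exponential cap), which is one common option among several.

```python
import random
import time


class CircuitOpenError(Exception):
    """Raised when the breaker is open and calls are rejected without being tried."""


class CircuitBreaker:
    """Hypothetical minimal breaker: opens after N consecutive failures."""

    def __init__(self, failure_threshold=3, reset_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after  # seconds to stay open before a trial call
        self.clock = clock              # injectable so tests need not wait
        self.failures = 0
        self.opened_at = None           # None means the circuit is closed

    def call(self, fn):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; failing fast")
            # Cool-down elapsed: fall through and allow one trial call (half-open).
        try:
            result = fn()
        except Exception:
            self.failures += 1
            # A failed trial call, or too many consecutive failures, (re)opens it.
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        # Success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result


def backoff_delay(attempt, base=0.1, cap=5.0, rng=random.random):
    """Full jitter: a random delay in [0, min(cap, base * 2**attempt))."""
    return rng() * min(cap, base * (2 ** attempt))


def retry(fn, max_attempts=5, base=0.1, cap=5.0, sleep=time.sleep, rng=random.random):
    """Retry fn on failure. Only safe when fn is idempotent."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except CircuitOpenError:
            raise  # the dependency is known to be down; do not hammer it
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(backoff_delay(attempt, base, cap, rng))
```

A caller would typically wrap the breaker inside the retry, for example `retry(lambda: breaker.call(fetch))`, where `fetch` is a hypothetical idempotent operation. The jitter spreads retries from many clients over time so they do not arrive in synchronized waves, and the breaker stops retries from piling onto a dependency that is already failing.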


Chapter 11 — Scale, Decisions & Career


That completes the guide. You can now explain the cloud and its primitives, provision them as code, run containers at scale, deliver and operate them, and manage their cost and ML workloads. From this chapter, you can also scale and harden a design, make the recurring decisions and write them down, measure whether your engineering is improving, and navigate the career that spans all of it. The throughline from page one holds: invest in the durable ideas, hold the dated tools loosely, and match complexity to actual need.
