Chapter 9 checkpoint
You can now read a cloud bill, attribute it, shrink it, and design so it never bloats in the first place. Recall the throughline, then prove it with the quiz.
The throughline
- The bill is meters, not a price tag: compute (you pay for running time, so idle is waste), storage (GB × tier × time), data transfer (egress plus easily overlooked cross-AZ and cross-region traffic), and requests. Hidden costs: observability ingestion, NAT gateways, and the managed-service premium.
- Manage unit economics, not the total: cost per customer / request / feature. A rising bill is fine if cost-per-customer is flat — that's growth.
- Three pricing models, layered: on-demand (flexible middle), commitments (RI/Savings Plan/CUD — 30–60% off the steady floor, watch coverage & utilization), spot/preemptible (60–90% off, reclaimable — fault-tolerant work only). Commit the valley, flex the middle, spot the peak.
- Kill waste: zombies (turn off) and overprovisioning (rightsize to measured usage). In K8s, requests are what you pay for; VPA sizes pods, HPA scales count, Karpenter/Cluster Autoscaler sizes nodes. Rightsize before committing.
- Allocate everything: consistent tags enforced in IaC turn an opaque bill into owned line items; drive unallocated % to zero. Showback alone cuts waste 15–20%. Split shared clusters with OpenCost/Kubecost (including idle). FOCUS normalizes billing across clouds.
- Run the loop: Inform → Optimize → Operate, forever — with budgets and anomaly detection for early warning. Decentralized ownership: engineers who create cost own it, enabled (not gatekept) by central FinOps; shift cost left with Infracost at PR time.
- Design for cost (an NFR): serverless-to-zero for spiky, reserved for steady; storage lifecycle; data locality to cut egress; managed-premium priced on purpose.
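To make two of the points above concrete (flat unit cost as a sign of healthy growth, and "commit the valley, flex the middle, spot the peak"), here is a minimal Python sketch. Everything in it is an assumption for illustration: the function names `unit_cost` and `layered_cost`, the 40% and 70% discounts (picked from inside the ranges quoted above), and the demand figures. It also ignores spot reclamation and partial-hour billing, so treat it as arithmetic, not a pricing model.

```python
"""Illustrative sketch of unit economics and layered pricing. All numbers are made up."""

# Assumed discounts, chosen from inside the ranges quoted in the recap.
COMMIT_DISCOUNT = 0.40  # commitments: 30-60% off on-demand
SPOT_DISCOUNT = 0.70    # spot/preemptible: 60-90% off on-demand


def unit_cost(total_cost: float, customers: int) -> float:
    """Cost per customer for one billing period."""
    if customers <= 0:
        raise ValueError("customers must be positive")
    return total_cost / customers


def layered_cost(demand, on_demand_rate, committed, flex_cap):
    """Total cost of serving hourly `demand` (instances needed per hour).

    committed: instances covered by a commitment, billed every hour
               whether used or not (this is why utilization matters).
    flex_cap:  max instances run on-demand above the commitment;
               anything beyond that is assumed to run on spot.
    """
    total = 0.0
    for needed in demand:
        # The committed floor is paid regardless of actual usage.
        total += committed * on_demand_rate * (1 - COMMIT_DISCOUNT)
        above_floor = max(needed - committed, 0)
        flex = min(above_floor, flex_cap)   # the on-demand "middle"
        spot = above_floor - flex           # the spot "peak"
        total += flex * on_demand_rate
        total += spot * on_demand_rate * (1 - SPOT_DISCOUNT)
    return total


if __name__ == "__main__":
    # The bill rises 50%, but cost per customer stays flat: that is growth.
    print(unit_cost(10_000, 200), unit_cost(15_000, 300))

    demand = [4, 4, 6, 10, 6, 4]  # hypothetical instances needed per hour
    layered = layered_cost(demand, 1.0, committed=4, flex_cap=2)
    all_on_demand = layered_cost(demand, 1.0, committed=0, flex_cap=10**9)
    print(f"layered: {layered:.2f}  all on-demand: {all_on_demand:.2f}")
```

Raising `committed` above the lowest demand in the series makes the committed line cost money in hours when it is not fully used, which is the reason to rightsize before committing.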
Quiz
Chapter 9 — Cost & FinOps
You can now read a bill as meters, manage unit economics instead of totals, layer pricing models, rightsize and allocate, run the inform→optimize→operate loop with decentralized ownership, and design cost out before it's spent. One of the fastest-growing, least-predictable cost frontiers is machine learning (GPUs and inference), where every lever in this chapter is stress-tested. That's Chapter 10.