Chapter 9 checkpoint

You can now read a cloud bill, attribute it, shrink it, and design so it never bloats in the first place. Recall the throughline, then prove it with the quiz.

The throughline

The bill is meters, not a price tag: compute (you pay for running time, so idle is waste), storage (GB × tier × time), data transfer (egress plus easy-to-miss cross-AZ and cross-region traffic), and requests. Hidden costs: observability ingestion, NAT gateways, and the premium charged for managed services.
Manage unit economics, not the total: cost per customer / request / feature. A rising bill is fine if cost-per-customer is flat — that's growth.
Three pricing models, layered: on-demand (flexible middle), commitments (RI/Savings Plan/CUD — 30–60% off the steady floor, watch coverage & utilization), spot/preemptible (60–90% off, reclaimable — fault-tolerant work only). Commit the valley, flex the middle, spot the peak.
Kill waste: zombies (turn off) and overprovisioning (rightsize to measured usage). In K8s, requests are what you pay for; VPA sizes pods, HPA scales count, Karpenter/Cluster Autoscaler sizes nodes. Rightsize before committing.
Allocate everything: consistent tags enforced in IaC turn an opaque bill into owned line items; drive unallocated % to zero. Showback alone cuts waste 15–20%. Split shared clusters with OpenCost/Kubecost (including idle). FOCUS normalizes billing across clouds.
Run the loop: Inform → Optimize → Operate, forever — with budgets and anomaly detection for early warning. Decentralized ownership: engineers who create cost own it, enabled (not gatekept) by central FinOps; shift cost left with Infracost at PR time.
Design for cost as a non-functional requirement (NFR): scale-to-zero serverless for spiky workloads, reserved capacity for steady ones; storage lifecycle policies; data locality to cut egress; and the managed-service premium paid on purpose, not by default.
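
Two of the ideas above are simple arithmetic, and a short sketch may make them easier to recall. The Python below is illustrative only: the `Rates` values, the function names, and the demand figures are all assumptions made up for this example, with the discounts chosen from inside the ranges quoted above (40% for commitments, 70% for spot). It is not a model of any provider's actual pricing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Hypothetical hourly price per instance (not real provider pricing)."""
    on_demand: float = 1.00
    commit_discount: float = 0.40  # assumption: inside the 30-60% range
    spot_discount: float = 0.70    # assumption: inside the 60-90% range


def layered_cost(demand, committed, on_demand_cap, rates=Rates()):
    """Total cost of serving hourly demand (in instances) with three layers.

    committed:      instances paid for every hour, used or not (the valley).
    on_demand_cap:  most instances served on-demand above the commitment.
    Demand above committed + on_demand_cap is served on spot (the peak).
    """
    commit_price = rates.on_demand * (1 - rates.commit_discount)
    spot_price = rates.on_demand * (1 - rates.spot_discount)
    total = 0.0
    for needed in demand:
        above_commit = max(needed - committed, 0)
        flex = min(above_commit, on_demand_cap)
        spot = above_commit - flex
        # The commitment is billed in full even when demand is below it.
        total += committed * commit_price
        total += flex * rates.on_demand
        total += spot * spot_price
    return total


def cost_per_customer(total_cost, customers):
    """Unit economics: the number to watch instead of the raw total."""
    if customers <= 0:
        raise ValueError("customers must be positive")
    return total_cost / customers


if __name__ == "__main__":
    demand = [4, 4, 6, 10, 6, 4]  # made-up hourly instance counts
    print("layered:      ", round(layered_cost(demand, 4, 2), 2))
    print("all on-demand:", round(layered_cost(demand, 0, 10), 2))
    print("overcommitted:", round(layered_cost(demand, 10, 0), 2))
    # The bill grows 50%, but so do customers: unit cost stays flat.
    print(cost_per_customer(10_000, 500), cost_per_customer(15_000, 750))
```

With these made-up numbers, committing the valley of 4 instances, flexing 2 on-demand, and sending the rest to spot comes to about 21.6, against 34.0 for all on-demand. Committing to the peak of 10 costs about 36.0, more than using no discount at all, which is why commitment utilization matters and why rightsizing comes before committing.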

Quiz

Required checkpoint

Chapter 9 — Cost & FinOps

You can now read a bill as meters, manage unit economics instead of totals, layer pricing models, rightsize and allocate, run the inform→optimize→operate loop with decentralized ownership, and design cost out before it's spent. One of the fastest-growing, least-predictable cost frontiers is machine learning — GPUs and inference — where every lever in this chapter is stress-tested. That's Chapter 10.

Next: Chapter 10: MLOps / LLMOps →
