
How cloud pricing actually works

Every chapter before this one taught you to build things. This one teaches you what each of those things costs — because in the cloud you are not billed once for a server, you are billed continuously, by the second, by the gigabyte, and by the request. The same pay-as-you-go model that lets you scale in seconds (Chapter 1) lets your bill grow in seconds too, and it does so silently. Before you can control a bill, you have to understand how it is actually computed. That is this lesson.

The bill is a sum of meters, not a price tag

The first mental shift: a cloud bill is not a list of products with prices. It is the output of thousands of meters, each ticking up as you consume a resource, multiplied by a unit rate. A meter is anything the provider counts — VM-seconds, gigabytes stored, gigabytes transferred, API calls made. Your monthly invoice is just the sum of every meter times its rate.
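As a minimal sketch of that sum, the snippet below multiplies each meter's quantity by its unit rate and adds the results. The meter names, quantities, and rates are all made up for illustration and are not any provider's real prices.

```python
# Hypothetical meters: name -> (quantity consumed, unit rate in dollars).
# Every number here is an illustrative assumption, not a real price.
usage = {
    "vm_seconds": (2_592_000, 0.00002),        # one VM running for a 30-day month
    "gb_month_stored": (500, 0.02),
    "gb_transferred_out": (1_000, 0.09),
    "api_calls": (10_000_000, 0.0000004),
}


def invoice_total(meters):
    """Return the sum of quantity x rate over every meter."""
    return sum(quantity * rate for quantity, rate in meters.values())


# Per-meter line items, so each contribution to the total can be named.
line_items = {name: quantity * rate for name, (quantity, rate) in usage.items()}
total = invoice_total(usage)

for name, cost in line_items.items():
    print(f"{name:<20} ${cost:>8.2f}")
print(f"{'total':<20} ${total:>8.2f}")
```

Keeping the per-meter line items next to the total is the point: the total alone cannot tell you which meter to look at.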

That's why two engineers can deploy "the same app" and get wildly different bills: one of them may have left a meter running (an idle VM at 3 a.m.), chosen a pricier meter (a premium storage tier), or triggered a meter they didn't even know existed (cross-zone data transfer). You can't reason about a bill until you can name its meters.

The four meters that dominate almost every bill are compute, storage, data transfer, and requests. Learn these four and you can explain the large majority of any invoice.

Meter 1 — Compute: you pay for time, not for work

Compute is billed by how long a resource exists and is running, not by how much useful work it did. A virtual machine (Chapter 2) that sits idle costs exactly the same as one running flat out — both are allocated to you, so both meters tick. The unit is roughly (instance size) × (time running): a bigger instance has a higher per-second rate; running it longer multiplies that rate by more seconds.

This has a sharp consequence: idle is the enemy. An over-sized VM that uses 5% of its CPU still bills 100% of its rate. A test environment nobody shut off on Friday bills all weekend. The cheapest compute is compute that isn't running — which is why serverless (scales to zero) and autoscaling (matches running capacity to load) are cost levers, not just performance ones.
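A small sketch of why idle is the enemy, assuming a hypothetical per-second rate and a test VM left on from Friday 18:00 to Monday 08:00. The `compute_cost` function and its `cpu_utilization` parameter are illustrative names, not a provider API; the parameter is accepted and deliberately ignored, which is the whole lesson.

```python
def compute_cost(rate_per_second, seconds_running, cpu_utilization):
    """Cost of a running instance.

    cpu_utilization is accepted but ignored on purpose: the meter counts
    allocated running time, not useful work.
    """
    return rate_per_second * seconds_running


RATE_PER_SECOND = 0.00005        # hypothetical rate, not a real price
WEEKEND_SECONDS = 62 * 3600      # Friday 18:00 to Monday 08:00

idle_cost = compute_cost(RATE_PER_SECOND, WEEKEND_SECONDS, cpu_utilization=0.05)
busy_cost = compute_cost(RATE_PER_SECOND, WEEKEND_SECONDS, cpu_utilization=1.00)
stopped_cost = compute_cost(RATE_PER_SECOND, 0, cpu_utilization=0.0)

print(f"idle VM:    ${idle_cost:.2f}")
print(f"busy VM:    ${busy_cost:.2f}")
print(f"stopped VM: ${stopped_cost:.2f}")
```

The idle and busy VMs cost the same; only the stopped one costs nothing, which is the effect scale-to-zero and autoscaling aim for.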

:::note Per-second vs per-hour, and the "minimum charge" trap
Most providers now bill compute per second (often with a small one-minute minimum). But many managed services round up: a serverless function may bill in 1 ms or 100 ms increments, and a managed database may have an hourly floor. The durable idea (pay for running time) holds; the exact rounding changes over time and is worth checking per service.
:::
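To make the rounding concrete, here is a rough sketch of how a billing increment and a minimum charge turn actual running time into billed time. The `billed_ms` helper is hypothetical, and the increments and minimums are only the examples mentioned above, not a statement of any specific service's current rules.

```python
def billed_ms(actual_ms, increment_ms, minimum_ms=0):
    """Round usage up to the next whole billing increment, then apply any minimum."""
    increments = -(-actual_ms // increment_ms)   # ceiling division on integers
    return max(increments * increment_ms, minimum_ms)


# A VM that ran 20 s, billed per second with a one-minute minimum.
vm = billed_ms(20_000, increment_ms=1_000, minimum_ms=60_000)

# A function that ran 101 ms, billed in 100 ms vs 1 ms increments.
fn_coarse = billed_ms(101, increment_ms=100)
fn_fine = billed_ms(101, increment_ms=1)

# A managed database used for 10 minutes, with an hourly floor.
db = billed_ms(10 * 60 * 1_000, increment_ms=1_000, minimum_ms=3_600_000)

print(vm, fn_coarse, fn_fine, db)
```

Working in integer milliseconds avoids floating-point surprises when rounding up.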

Meter 2 — Storage: volume × tier × time

Storage is billed by how many gigabytes you keep, in which tier, for how long — typically quoted as dollars per GB-month. Two things make the real cost non-obvious:

  • Tiers. As you saw in Storage, object storage offers tiers that trade retrieval speed and cost against storage price. A "hot" tier is cheap to read and pricier to store; a "cold"/"archive" tier is very cheap to store but slow and costly to retrieve. Keeping year-old logs in the hot tier can cost 5–10× what archive would.
  • It never sleeps. Storage bills whether or not anyone reads the data. Forgotten snapshots, orphaned disks from deleted VMs, and old backups are pure waste — a meter with no value flowing out of it.
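The volume × tier × time formula can be sketched as below. The tier names follow the text, but the per-GB-month rates are assumptions chosen so that the gap lands at the upper end of the 5–10× range mentioned above; retrieval charges for the archive tier are not modeled.

```python
# Hypothetical storage rates in dollars per GB-month (not real prices).
TIER_RATE = {"hot": 0.020, "archive": 0.002}


def storage_cost(gigabytes, tier, months):
    """Volume x tier rate x time. Retrieval fees are deliberately left out."""
    return gigabytes * TIER_RATE[tier] * months


# 10 TB of year-old logs kept for 12 months in each tier.
hot = storage_cost(10_000, "hot", 12)
archive = storage_cost(10_000, "archive", 12)

print(f"hot:     ${hot:,.2f}")
print(f"archive: ${archive:,.2f}")
print(f"ratio:   {hot / archive:.1f}x")
```

Whether archive really wins depends on how often the data is read back, since cold tiers charge more to retrieve.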

Meter 3 — Data transfer: the one everybody forgets

Data transfer (often called egress when it leaves the provider) is the single most overlooked line on a cloud bill. The pattern almost everywhere:

  • Ingress (data flowing in to the cloud) is usually free.
  • Egress (data flowing out to the internet) is charged per GB, and the rate is meaningful.
  • Cross-zone and cross-region transfer — data moving between availability zones or regions inside the provider — is also charged, and this is the part that ambushes people. Two services chatting across AZs for high availability can quietly run up a transfer bill larger than the compute they run on.
[Figure: traffic between the Internet and two services, service A in AZ-a and service B in AZ-b. Ingress: usually free. Egress: $ per GB. Cross-AZ: $ per GB.]

Related hidden transfer costs: NAT gateways (which let private-subnet servers reach the internet — Chapter 2) bill per GB processed on top of an hourly rate, so a chatty private service behind a NAT gateway pays twice. These don't appear as a tidy "data transfer" line; they hide inside service-specific charges, which is exactly why they surprise people.
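A sketch of how these transfer charges can stack up, assuming made-up per-GB rates for each path and a made-up hourly plus per-GB rate for a NAT gateway. The function names and all numbers are illustrative assumptions.

```python
# Hypothetical per-GB rates by traffic path (not real prices).
RATE_PER_GB = {"ingress": 0.00, "internet_egress": 0.09, "cross_az": 0.01}


def transfer_cost(gb_by_path):
    """Charge each traffic path at its own per-GB rate."""
    return {path: gb * RATE_PER_GB[path] for path, gb in gb_by_path.items()}


def nat_gateway_cost(hours, gb_processed, hourly_rate=0.05, per_gb_rate=0.05):
    """Hourly charge plus a per-GB processing charge (both rates assumed)."""
    return hours * hourly_rate + gb_processed * per_gb_rate


# One month of traffic for two services replicating across zones.
monthly_gb = {"ingress": 5_000, "internet_egress": 2_000, "cross_az": 150_000}
charges = transfer_cost(monthly_gb)
nat = nat_gateway_cost(hours=720, gb_processed=2_000)

for path, cost in charges.items():
    print(f"{path:<16} ${cost:>9.2f}")
print(f"{'nat_gateway':<16} ${nat:>9.2f}")
```

With these assumed numbers the cross-AZ line dwarfs internet egress even though its per-GB rate is far lower, simply because internal chatter moves so much more data.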

Meter 4 — Requests and operations

Many services bill per request or per operation, not per hour. A serverless function bills per invocation (and per GB-second of memory × duration). An object store bills tiny fractions of a cent per GET/PUT. A managed queue bills per message. Individually trivial — but at scale, count × tiny-rate becomes a real number. A function called a billion times a day, or a tight loop hammering an object store, can produce a startling bill from a per-unit price that looked like a rounding error.
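The count × tiny-rate effect can be sketched for a serverless function as follows. The two charges mirror the text (per invocation, plus GB-seconds of memory × duration); the rates, duration, and memory size are assumptions for illustration only.

```python
def serverless_cost(invocations, seconds_each, memory_gb,
                    per_request_rate, per_gb_second_rate):
    """Return (request charge, duration charge) for a serverless function."""
    request_charge = invocations * per_request_rate
    gb_seconds = invocations * seconds_each * memory_gb
    duration_charge = gb_seconds * per_gb_second_rate
    return request_charge, duration_charge


# A function called a billion times a day for a 30-day month.
request_charge, duration_charge = serverless_cost(
    invocations=1_000_000_000 * 30,
    seconds_each=0.1,                 # assumed average duration
    memory_gb=0.5,                    # assumed memory size
    per_request_rate=0.0000002,       # hypothetical: $0.20 per million requests
    per_gb_second_rate=0.00001,       # hypothetical rate
)

print(f"requests: ${request_charge:,.0f}")
print(f"duration: ${duration_charge:,.0f}")
```

A per-request price that rounds to zero on one call becomes thousands of dollars once the count reaches tens of billions.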

The other hidden costs

Beyond the big four, the line items that most often blindside teams:

  • Observability ingestion. Logs, metrics, and traces (Chapter 6) are usually billed per GB ingested and per GB retained. Verbose logging at scale can make your monitoring bill rival your compute bill.
  • Managed-service premium. A managed database, queue, or cache costs more per unit than running the same software yourself on a raw VM — you're paying for the operations the provider handles. Often worth it (Chapter 2's "managed by default"), but it is a premium, and it should be a conscious choice.
  • Inter-service glue. Load balancers, API gateways, and DNS each have their own small hourly + per-request meters that add up across a large architecture.

:::warning The five hidden-cost categories to always check
Cross-AZ/region & internet egress, NAT gateway processing, log/trace ingestion, storage tiers & orphaned snapshots, and the managed-service premium. These five are missing from most "how much will this cost?" estimates, and they're where surprise bills come from. Put them on a checklist.
:::

From "the bill" to unit economics

Here is the idea that separates a mature cloud team from a panicking one: the total bill is the wrong number to manage. A bill of $100k/month is not good or bad in isolation — it depends on what it produced. The number that matters is unit cost: cost divided by a unit of business value.

  • Cost per customer — total spend ÷ active customers. If it's flat or falling as you grow, your economics are healthy.
  • Cost per request — spend ÷ requests served. The infrastructure efficiency of your product.
  • Cost per feature / per team — spend attributed to a slice of the product (this requires tagging — Tagging & cost allocation).

Unit economics means tying spend to a denominator of value instead of looking at the total. Why it's the whole game: a total bill that doubles is alarming only if revenue didn't. If cost-per-customer is constant while you add customers, a rising total bill is just growth, and that's good. If cost-per-customer is rising, you have an efficiency problem the total bill alone would never reveal — it would look like "we're just getting bigger."
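A minimal sketch of that comparison: two invented spend histories whose totals both rise, where only the per-customer figure shows which one is healthy. The `unit_cost` and `per_customer` helpers and all figures are hypothetical.

```python
def unit_cost(total_spend, units):
    """Spend divided by a unit of business value."""
    if units <= 0:
        raise ValueError("units must be positive")
    return total_spend / units


def per_customer(series):
    """Cost per customer for each (monthly spend, active customers) pair."""
    return [unit_cost(spend, customers) for spend, customers in series]


# Invented data: (monthly spend in dollars, active customers).
healthy = [(50_000, 100_000), (75_000, 150_000), (100_000, 200_000)]
creeping = [(50_000, 100_000), (90_000, 150_000), (160_000, 200_000)]

print("healthy: ", per_customer(healthy))
print("creeping:", per_customer(creeping))
```

In the first series the bill doubles while cost per customer stays flat, so it is just growth. In the second, cost per customer climbs alongside the bill, which is the efficiency problem the total alone would hide.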

:::tip The unit-economics test for any cost question
Before reacting to a number, ask: "What's the denominator?" "We spent $1.2M on cloud this year" is unanswerable. "We spent $1.2M this year to serve 4 billion requests for 200k customers: $0.50/customer/month, down from $0.80 last year" is a story you can act on. A team that can't answer "what does one customer cost us?" cannot make its cost scale with the business.
:::

A worked example: tracing one feature's cost

A team ships an image-thumbnail feature. Naively they look at the monthly bill, see it rose $4,000, and shrug. Let's trace the meters instead:

| Meter | What happened | Roughly |
|---|---|---|
| Requests | Serverless function invoked 50M times to resize images | $1,000 |
| Compute | Each invocation runs ~2s at 1 GB memory (GB-seconds) | $1,600 |
| Storage | Resized thumbnails stored, hot tier, growing monthly | $400 |
| Egress | Thumbnails served to users over the internet, per GB | $1,000 |
| Total | | $4,000 |

Now divide by value: the feature serves 50M thumbnails to 200k customers → $0.02 per customer per month, $0.00008 per thumbnail. Now the team can decide intelligently: is $0.02/customer worth it? Could caching at the CDN (cutting egress and invocations) halve it? The total bill hid all of this; the meters and the unit cost reveal it. This trace — bill → meters → unit cost → decision — is the core analytical move of FinOps.
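The trace above can be reproduced in a few lines. The meter amounts, thumbnail count, and customer count come from the worked example; the caching what-if is an assumption added here for illustration (a 50% CDN cache hit ratio that halves requests, compute, and egress, leaves storage unchanged, and ignores the CDN's own cost).

```python
# Meter costs from the worked example, in dollars per month.
meters = {"requests": 1_000, "compute": 1_600, "storage": 400, "egress": 1_000}
THUMBNAILS = 50_000_000
CUSTOMERS = 200_000

total = sum(meters.values())
per_customer = total / CUSTOMERS
per_thumbnail = total / THUMBNAILS

# What-if (assumption): half of all thumbnail requests are served from a CDN cache.
CACHE_HIT_RATIO = 0.5
cached = {
    name: cost if name == "storage" else cost * (1 - CACHE_HIT_RATIO)
    for name, cost in meters.items()
}
cached_total = sum(cached.values())
cached_per_customer = cached_total / CUSTOMERS

print(f"total:               ${total:,}")
print(f"per customer:        ${per_customer:.3f}")
print(f"per thumbnail:       ${per_thumbnail:.5f}")
print(f"cached per customer: ${cached_per_customer:.3f}")
```

Under that assumption the per-customer cost falls by a little under half rather than exactly half, because the storage meter keeps ticking regardless of cache hits.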

Why it matters

A cloud bill is a sum of meters — compute (you pay for running time, so idle is pure waste), storage (GB × tier × time, and it never sleeps), data transfer (egress and the sneaky cross-AZ/region charges everyone forgets), and requests (tiny rates × huge counts). Hidden costs hide in observability ingestion, NAT gateways, and the managed-service premium. But the total bill is the wrong thing to manage: the durable skill is unit economics — cost per customer, per request, per feature — which tells you whether a rising bill is healthy growth or creeping waste. You can only act on a number when you know its denominator. Next, the biggest single lever on the compute meter: choosing the right pricing model — on-demand, committed, or spot.

Where this leads: the meters you just learned are what every later lesson optimizes — pricing models attack the compute meter, rightsizing attacks idle, architecting for cost attacks egress and tiers.
