Become a cloud engineer
This guide takes you from "I can write code but I've never provisioned a server" to being able to design, provision, ship, observe, secure, and pay for real cloud infrastructure. It is written linearly: read top to bottom and every term is defined the first time it appears. You do not need any prior operations, networking, or sysadmin background — chapter 1 starts at "what even is a region?".
It is also useful to working engineers who want a sharp 2026 refresh: the decision rules, the durable-vs-dated framing, and the patterns that actually hold up in production.
The one idea this whole guide rests on
The cloud is just someone else's computers — rented by the second and driven by an API.
Every cloud service, no matter how exotic its marketing name, is underneath one of a small number of primitives: a computer to run your code (compute), a disk to keep your data (storage), a database to query it, a network to connect it, and an identity system to say who is allowed to do what. The hundreds of branded products from Amazon, Google and Microsoft are recombinations of those few primitives. Learn the primitive once and the product catalog collapses into something you can hold in your head.
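To make the "catalog collapses" claim concrete, here is a minimal sketch that groups a handful of branded products under the five primitives named above. The product names are dated examples chosen for illustration, and the `PRODUCT_TO_PRIMITIVE` table and `group_by_primitive` helper are hypothetical names, not part of any provider SDK. Real products often blend primitives, so treat the one-to-one mapping as a simplification and check current provider docs before relying on any name.

```python
from collections import defaultdict

# Dated, illustrative product names mapped to the durable primitive underneath.
# The mapping is an assumption made for this sketch, not an official taxonomy.
PRODUCT_TO_PRIMITIVE = {
    "Amazon EC2": "compute",
    "Google Compute Engine": "compute",
    "Azure Virtual Machines": "compute",
    "Amazon S3": "storage",
    "Google Cloud Storage": "storage",
    "Azure Blob Storage": "storage",
    "Amazon RDS": "database",
    "Google Cloud SQL": "database",
    "Azure SQL Database": "database",
    "Amazon VPC": "network",
    "Azure Virtual Network": "network",
    "AWS IAM": "identity",
    "Google Cloud IAM": "identity",
}


def group_by_primitive(catalog):
    """Invert a product -> primitive mapping into primitive -> [products]."""
    grouped = defaultdict(list)
    for product, primitive in catalog.items():
        grouped[primitive].append(product)
    return dict(grouped)


if __name__ == "__main__":
    for primitive, products in group_by_primitive(PRODUCT_TO_PRIMITIVE).items():
        print(f"{primitive}: {', '.join(products)}")
```

Thirteen product names reduce to five keys. The keys are the durable part and the values are the dated part.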
Durable vs dated — the rule that keeps this guide useful
Cloud moves fast, but not all of it moves fast. Throughout this guide we separate two things:
- Durable — concepts that have been true for a decade and will be true for another: the shared-responsibility model, object vs block storage, least-privilege identity, declarative infrastructure, the idea of an SLO. Most of your study time should go here. These transfer across every provider and survive every rebrand.
- Dated — the specific button, the current free-tier limit, the exact product name, this quarter's "new" service. Useful to know, but it will change. We flag dated facts as dated so you never anchor your understanding on them.
:::tip How to read this guide
When something is marked durable, internalize it. When something is marked dated (a price, a console path, a product name), treat it as an example, not a law — verify it against current docs when you actually build.
:::
What you'll be able to do by the end
- Explain how a cloud request flows — from your CLI, through a region and an availability zone, to a running server, and back.
- Pick the right primitive — choose compute (VM vs container vs serverless), storage (object vs block vs file), and a database (SQL vs NoSQL) for a given job.
- Provision infrastructure as code — write Terraform, read a plan, understand state and drift, and factor things into modules.
- Run containers at scale — build an image, push it to a registry, and reason about Kubernetes pods, deployments, services, and the control plane.
- Operate what you ship — set an SLO, instrument the three pillars of observability, and run a calm incident.
- Make the recurring decisions — managed vs self-hosted, multi-cloud vs single, when Kubernetes is overkill — with a rule instead of a vibe.
How the guide is structured
The eleven chapters follow the real arc of getting something to production and keeping it there:
- Part A — Foundations & core (Ch. 1–4): the mental model, the primitives, how to provision them as code, and how to run containers on them.
- Part B — Delivery & operations (Ch. 5–8): ship continuously, observe and operate, build a platform for other engineers, and secure all of it.
- Part C — Economics, ML & scale (Ch. 9–11): control the bill, run machine-learning and LLM workloads, and adapt the whole playbook from solo to enterprise.
How this guide is built
All eleven chapters are authored in depth — every chapter has a full set of lessons with worked examples, pitfalls, "trace this" exercises, and a checkpoint quiz:
- Part A — Foundations & core (Ch. 1–4): Cloud Foundations, Core Services, Infrastructure as Code, Containers & Kubernetes.
- Part B — Delivery & operations (Ch. 5–8): CI/CD & GitOps, Observability & SRE, Platform Engineering, Cloud Security.
- Part C — Economics, ML & scale (Ch. 9–11): Cost & FinOps, MLOps / LLMOps, and Scale, Decisions & Career.
Read it top to bottom: every term is defined on first use, and each chapter builds on the ones before it.
Where this sits on the ladder
This guide assumes you can already write a small program. If you can't yet, start with Programming Basics. It deliberately does not re-teach application security (see the Modern Security Engineer Guide) or how LLMs work internally (see the Modern AI Guide) — instead it cross-links them where they meet cloud, in chapters 8 and 10.
Ready? Start with Chapter 1: What the cloud actually is →