Platform abstractions over Kubernetes

Lesson 7.4 was the storefront. Now we go into the warehouse: how does the platform actually fulfill a self-service request safely? When a developer clicks "give me a Postgres database," what turns that wish into real, correct, ongoing infrastructure? The answer is one pattern you already met in Chapter 4 — the reconciliation loop — extended beyond Kubernetes itself. This is the single most important technical primitive in platform engineering, so we build it up carefully.

The layer in the middle: the platform orchestrator

Before the mechanics, name the layer that sits between the portal (7.4) and the raw infrastructure. The portal is the storefront; the IaC modules, Crossplane Compositions, and Operators below are the warehouse. But something has to take a developer's intent — "deploy this workload to staging" — and resolve it into the concrete resources for that specific environment, wiring app to database to network correctly each time. That middle layer is the platform orchestrator.

A platform orchestrator is the engine that takes a developer's declared intent (a workload spec, see Score below) plus the environment it's targeting, and dynamically assembles and provisions the right concrete infrastructure for it — choosing the database flavor, the network, the secrets wiring per environment — instead of the developer or the portal hard-coding all of that. It's distinct from the portal (which is just the UI) and from any single provisioning tool (Terraform, Crossplane) that it drives: the orchestrator is the decision-making layer that turns "what I want" into "here's exactly how it's built here."

Humanitec is the canonical commercial example of this layer (its Platform Orchestrator coined the term); it reads a workload spec, applies environment-specific rules, and generates the deployment configuration so the same app lands correctly in dev, staging, and prod without per-environment copy-paste. The durable point isn't the product: it's recognizing that resolving intent-plus-environment into concrete infra is its own layer, separate from the portal above and the provisioning tools below — and the rest of this lesson is the machinery an orchestrator ultimately drives.
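To make the orchestrator's job concrete, here is a minimal sketch of resolving intent plus environment into concrete resources. It is not how Humanitec or any real orchestrator is implemented; `ENVIRONMENT_RULES`, `Workload`, and `resolve` are hypothetical names, and the rule values are made up for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical per-environment rules a platform team might maintain.
ENVIRONMENT_RULES = {
    "dev": {"postgres": {"flavor": "in-cluster", "replicas": 1}},
    "staging": {"postgres": {"flavor": "managed", "replicas": 1}},
    "prod": {"postgres": {"flavor": "managed", "replicas": 3}},
}


@dataclass
class Workload:
    """The developer's intent: what the workload is and what it needs."""
    name: str
    image: str
    needs: list[str] = field(default_factory=list)  # e.g. ["postgres"]


def resolve(workload: Workload, environment: str) -> list[dict]:
    """Turn intent plus target environment into concrete resource descriptions."""
    rules = ENVIRONMENT_RULES[environment]
    resources = [{"kind": "Deployment", "name": workload.name, "image": workload.image}]
    for need in workload.needs:
        if need not in rules:
            raise ValueError(f"no rule for {need!r} in {environment!r}")
        # The environment's rules, not the developer, decide the flavor and sizing.
        resources.append({"kind": need, "name": f"{workload.name}-{need}", **rules[need]})
    return resources


billing = Workload(name="billing", image="registry.example/billing:1.4.2", needs=["postgres"])
dev_plan = resolve(billing, "dev")    # in-cluster Postgres, one replica
prod_plan = resolve(billing, "prod")  # managed Postgres, three replicas
```

The same `Workload` produces different infrastructure per environment, which is the property that removes per-environment copy-paste.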

The core primitive: the reconciliation pattern

Recall from Chapter 4 how Kubernetes works: you declare desired state ("5 replicas of this image"), and a controller continuously compares desired state to actual state and acts to close the gap. That continuous compare-and-correct loop is reconciliation.

Three properties make this pattern the bedrock of platforms:

It's declarative — you state the what, not the how. The controller figures out the steps.
It's self-healing — if reality drifts (something deleted, crashed, changed), the next loop notices and corrects it, forever. This is automatic drift-correction, far stronger than running a script once.
It's continuous — the loop never stops, so the system stays converged on desired state rather than only at deploy time.

The core idea of modern platforms is simple: let developers declare what they want, and have a controller continuously make it true. Self-service safety comes from this split: the developer declares intent, and a controller the platform team wrote enforces how it is actually built.
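The compare-and-correct loop can be sketched in a few lines. This is a toy model, not Kubernetes' controller code: state is a plain dict of replica counts, and `reconcile` is a hypothetical function that performs one pass of the loop.

```python
def reconcile(desired: dict[str, int], actual: dict[str, int]) -> list[str]:
    """One pass of compare-and-correct; mutates `actual` toward `desired`."""
    actions = []
    for name, want in desired.items():
        have = actual.get(name, 0)
        if have != want:
            actions.append(f"scale {name}: {have} -> {want}")
            actual[name] = want
    # Anything running that nobody declared is removed.
    for name in list(actual):
        if name not in desired:
            actions.append(f"delete {name}")
            del actual[name]
    return actions


desired = {"web": 5}
actual = {}                          # nothing exists yet

first = reconcile(desired, actual)   # closes the initial gap
actual["web"] = 3                    # simulate drift: two replicas crash
second = reconcile(desired, actual)  # the next pass heals it
third = reconcile(desired, actual)   # already converged: nothing to do
```

A one-shot script corresponds to running only the first pass. The self-healing property comes from running the same function again after reality has changed.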

The big leap of platform engineering is realizing this pattern doesn't have to be limited to pods and deployments. You can teach Kubernetes about your own concepts — a "Database," a "Postgres," a "Tenant" — and write a controller that reconciles them. That's what the next pieces do.

CRDs and Operators: extending the pattern to your own concepts

Two terms unlock custom reconciliation:

A CRD — Custom Resource Definition — teaches the Kubernetes API a new kind of object. Out of the box Kubernetes knows Pod, Service, Deployment; a CRD lets you add, say, PostgresDatabase as a first-class object developers can declare with the same kubectl apply they already use.
An Operator is a custom controller that reconciles a CRD. It watches for your custom objects and does whatever's needed to make them real and keep them real — provision the database, set up backups, heal it if it drifts. An Operator is essentially operational knowledge encoded as a reconciliation loop: the playbook a human expert would follow, automated.

So the flow becomes: a developer writes a tiny YAML declaring a PostgresDatabase; your Operator sees it and provisions a real, backed-up, correctly-configured database; if anything drifts, the Operator fixes it on the next loop. The developer never touched the how.
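A rough sketch of that flow, under stated assumptions: the custom resource is shown as a Python dict rather than YAML, the API group `platform.example.com/v1` and the spec fields are hypothetical, and `FakeDatabaseBackend` stands in for whatever really hosts databases. A real Operator would be scaffolded with a framework and talk to the Kubernetes API; only the shape of the reconcile logic is illustrated here.

```python
# What the developer declares (in practice, a small YAML file).
postgres_claim = {
    "apiVersion": "platform.example.com/v1",  # hypothetical API group
    "kind": "PostgresDatabase",
    "metadata": {"name": "billing"},
    "spec": {"version": "16", "storageGB": 20},
}


class FakeDatabaseBackend:
    """Stand-in for the system that actually hosts database instances."""

    def __init__(self):
        self.instances: dict[str, dict] = {}


def reconcile_postgres(resource: dict, backend: FakeDatabaseBackend) -> str:
    """Operator logic: make the backend match the resource's spec."""
    name = resource["metadata"]["name"]
    # Operational knowledge (here: daily backups) lives in the Operator,
    # not in the developer's declaration.
    wanted = {**resource["spec"], "backups": "daily"}
    current = backend.instances.get(name)
    if current is None:
        backend.instances[name] = dict(wanted)
        return "created"
    if current != wanted:
        backend.instances[name] = dict(wanted)
        return "repaired"
    return "in-sync"


backend = FakeDatabaseBackend()
results = [reconcile_postgres(postgres_claim, backend)]
backend.instances["billing"]["backups"] = "off"   # someone disables backups by hand
results.append(reconcile_postgres(postgres_claim, backend))
results.append(reconcile_postgres(postgres_claim, backend))
```

The developer's declaration never mentions backups; the reconcile function adds and re-enforces them on every pass.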

:::note Building Operators (isolated as dated)
You don't write the reconciliation plumbing from scratch. Kubebuilder and the Operator SDK are the common frameworks that scaffold an Operator so you implement only the reconcile logic. The choice of framework will date; the durable idea is that a CRD (a new noun) plus a controller (which reconciles it) gives you self-service for your own concepts.
:::

Crossplane: provision cloud resources the Kubernetes way

Operators usually reconcile resources that live inside the cluster. Crossplane turns the same machinery outward to provision cloud infrastructure (managed databases, buckets, queues, whole networks), using Kubernetes as the control plane. With Crossplane, "give me an RDS database" is a Kubernetes object that a controller reconciles into a real AWS resource and keeps reconciled: drift gets corrected, which a one-shot script never does.

The platform-defining feature is the Composition: the platform team bundles several real cloud resources (a database, its network, its security rules, and least-privilege IAM) behind one simple custom resource that the developer asks for. The developer declares something like Database: name=billing, size=large; the Composition expands that into the dozen or so properly wired, secure cloud resources underneath.

This is a golden path as code (7.2): the secure, opinionated bundle is the only thing exposed, so the developer literally cannot forget the private subnet or over-broad IAM — the Composition already made those choices. It's self-service (7.3) with safety baked into the abstraction itself.
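The expansion a Composition performs can be approximated as a pure function from a small request to a list of wired-together resources. This sketch does not use Crossplane's actual Composition format; `expand_database`, the `SIZES` mapping, and the resource field names are hypothetical, and only four resources are generated to keep it short.

```python
SIZES = {"small": "db.t3.micro", "large": "db.m5.large"}  # illustrative mapping


def expand_database(name: str, size: str) -> list[dict]:
    """Expand one developer-facing request into several wired-together resources."""
    if size not in SIZES:
        raise ValueError(f"unknown size {size!r}; choose from {sorted(SIZES)}")
    subnet = f"{name}-private-subnet"
    security_group = f"{name}-db-sg"
    return [
        {"kind": "Subnet", "name": subnet, "public": False},
        {"kind": "SecurityGroup", "name": security_group, "ingress": ["app-tier:5432"]},
        # Security choices are fixed here; the developer has no knob to weaken them.
        {"kind": "DatabaseInstance", "name": name, "instanceClass": SIZES[size],
         "subnet": subnet, "securityGroup": security_group,
         "encrypted": True, "publiclyAccessible": False},
        {"kind": "IAMPolicy", "name": f"{name}-access", "actions": ["rds-db:connect"],
         "resource": name},
    ]


resources = expand_database(name="billing", size="large")
```

The function accepts only `name` and `size`, so there is no parameter through which a caller could request a public subnet or broader IAM. That narrow input surface is what makes the abstraction safe.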

:::note Neighbors in this space
Kratix builds platforms around the same idea using "Promises" (a packaged capability a team can request); Cluster API uses CRDs and controllers to provision whole Kubernetes clusters declaratively. All share the same DNA: declare a high-level intent, and a controller reconciles the messy reality.
:::

Score: a workload spec the developer actually writes

Even Crossplane resources can leak cloud detail. Score pushes abstraction one level higher: it's an open workload specification — a small, platform-agnostic file where a developer describes what their workload needs ("a container, a route, a Postgres database") without saying how or where it runs. The platform then translates that intent into the concrete target — Kubernetes manifests, a Compose file, whatever the environment uses.

The point is separation of concerns: developers express needs in one simple spec; the platform owns the implementation and can change it (new cluster, new cloud) without the developer rewriting anything. It's the 7.2 principle — developers declare intent, the platform owns the how — distilled to a single file.
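As an illustration of one spec feeding several targets, the sketch below renders a single workload description two ways. The `workload` dict is a simplified, hypothetical shape and not the actual Score schema; `to_kubernetes` and `to_compose` are made-up translator functions whose outputs are pared-down shapes, not complete manifests.

```python
# What the workload needs, with nothing about where it runs.
workload = {
    "name": "billing",
    "container": {"image": "registry.example/billing:1.4.2", "port": 8080},
    "resources": {"db": {"type": "postgres"}},
}


def to_kubernetes(spec: dict) -> dict:
    """Render the spec as a pared-down, Deployment-shaped dict."""
    container = spec["container"]
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": spec["name"]},
        "spec": {"template": {"spec": {"containers": [{
            "name": spec["name"],
            "image": container["image"],
            "ports": [{"containerPort": container["port"]}],
        }]}}},
    }


def to_compose(spec: dict) -> dict:
    """Render the same spec as a Compose-shaped dict with a local database service."""
    container = spec["container"]
    services = {spec["name"]: {
        "image": container["image"],
        "ports": [f"{container['port']}:{container['port']}"],
    }}
    for res_name, res in spec["resources"].items():
        if res["type"] == "postgres":
            # The platform, not the developer, picks the local stand-in image.
            services[res_name] = {"image": "postgres:16"}
    return {"services": services}


k8s = to_kubernetes(workload)
compose = to_compose(workload)
```

Swapping or adding a translator changes where the workload lands without touching `workload` itself.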

Terraform modules as products

Not every platform is Kubernetes-native. The same philosophy applies to Infrastructure as Code at platform scale. You met Terraform modules in Chapter 3 as reusable, parameterized infrastructure. A platform team elevates them into products: versioned, documented, secure-by-default modules that are the golden paths for infrastructure.

"Treat the module as a product" means it gets the same product discipline from 7.1:

Versioned and released, so teams pin a known-good version and upgrade deliberately — not a copy-pasted snapshot that drifts.
Secure and opinionated by default, encoding the org's standards (encryption on, no public exposure, tagged for cost) so consumers can't easily get it wrong.
Documented with examples, treated as an interface other teams depend on.
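One way to picture "pin a known-good version and upgrade deliberately" is a rule that accepts new releases only within the pinned major version. This is a simplified sketch, not Terraform's own version-constraint syntax; `parse_version` and `allowed_upgrade` are hypothetical helpers, and versions are assumed to be plain `major.minor.patch` strings.

```python
def parse_version(text: str) -> tuple[int, int, int]:
    """Split 'major.minor.patch' into comparable integers."""
    major, minor, patch = (int(part) for part in text.split("."))
    return major, minor, patch


def allowed_upgrade(pinned: str, candidate: str) -> bool:
    """Accept a release only if it is at least the pin and in the same major version."""
    pin, cand = parse_version(pinned), parse_version(candidate)
    return cand[0] == pin[0] and cand >= pin


releases = ["2.3.0", "2.3.1", "2.4.0", "3.0.0"]
auto = [v for v in releases if allowed_upgrade("2.3.0", v)]
```

Here `3.0.0` is excluded: a major release is treated as a deliberate, reviewed upgrade and is never picked up silently.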

And critically, platform-scale IaC means treating IaC as composable modules with state hygiene — remote state, locking, and drift detection from Chapter 3 — plus policy guardrails: automated checks (e.g. policy-as-code) that block a plan violating the rules before it applies. Self-service Terraform without state hygiene and policy guardrails is how self-service becomes a security and cost hole (7.6).
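A policy guardrail can be pictured as a function that inspects a plan and returns violations before anything is applied. Real setups typically use a dedicated policy engine; this sketch only shows the shape of the check. The plan format, `check_plan`, and the three rules are hypothetical, chosen to mirror the standards named above (encryption on, no public exposure, tagged for cost).

```python
def check_plan(planned_resources: list[dict]) -> list[str]:
    """Return human-readable violations; an empty list means the plan may apply."""
    violations = []
    for res in planned_resources:
        label = f"{res['type']}.{res['name']}"
        if res.get("public", False):
            violations.append(f"{label}: public exposure is not allowed")
        if not res.get("encrypted", False):
            violations.append(f"{label}: encryption must be enabled")
        if "cost-center" not in res.get("tags", {}):
            violations.append(f"{label}: missing cost-center tag")
    return violations


plan = [
    {"type": "bucket", "name": "reports", "public": False, "encrypted": True,
     "tags": {"cost-center": "fin-42"}},
    {"type": "bucket", "name": "scratch", "public": True, "encrypted": True, "tags": {}},
]

violations = check_plan(plan)
may_apply = not violations   # the gate blocks the apply if anything was flagged
```

Because the check runs on the plan rather than on live infrastructure, a violating change is stopped before it exists.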

:::note IaC platform tooling (isolated as dated)
Teams operationalize self-service IaC with runners, wrappers, and policy gates: Spacelift, Atlantis (pull-request-driven Terraform), and Terragrunt (DRY wrappers and remote-state wiring), alongside alternative engines such as OpenTofu (the open-source Terraform fork) and Pulumi. The durable idea is modules-as-products + state hygiene + policy-as-code, not any one runner.
:::

A pitfall this lesson exists to prevent

The biggest conceptual gap in platform engineering is failing to understand the reconciliation pattern, and therefore treating these tools as black boxes. If you don't grasp that an Operator and Crossplane are the same controller loop you already learned in Kubernetes, you'll mis-debug them, mis-design CRDs, and reach for one-shot scripts where a reconciling controller was the right tool. The second gap is treating IaC as scripts rather than as composable, versioned modules with state hygiene and policy guardrails, which is how self-service quietly becomes ungoverned.

Common pitfalls

Black-boxing the controller. Not seeing that Operators/Crossplane are the reconciliation loop leads to bad designs and bad debugging. The loop is the primitive; learn it once, apply it everywhere.
One-shot scripts where you needed a controller. A script provisions once and walks away; a controller keeps state correct and heals drift. Self-service infra wants the controller.
Leaky abstractions. A Composition or module that still forces the developer to understand the cloud underneath hasn't abstracted anything. Expose intent, hide implementation.
IaC as scripts. Unversioned, copy-pasted Terraform with no state locking or policy gates is technical debt and a security hole, not a platform. Modules are products with state hygiene and guardrails.
Over-abstracting. Building a deep custom platform abstraction for a need two teams have is premature. Match the abstraction's cost to the duplication it removes (7.1).

Why it matters

The technical heart of platform engineering is the reconciliation pattern from Chapter 4: declare desired state, and a controller continuously makes it true. CRDs (which teach Kubernetes new nouns) and Operators (controllers that reconcile them) extend the pattern beyond pods, and Crossplane Compositions extend it outward to cloud resources. Score gives developers a platform-agnostic intent spec, and Kratix and Cluster API share the same DNA. Off Kubernetes, the same product discipline turns Terraform modules into versioned, secure-by-default products governed by state hygiene and policy guardrails. The unifying primitive (expose intent, let a controller own the how) is what makes self-service both possible and safe. Next: how to keep many tenants on that self-service platform isolated, bounded, and out of each other's blast radius.

Next: 7.6 Multi-tenancy & self-service guardrails →
