Storage: object, block & file

Storage is the primitive that keeps your data at rest — the bytes that must survive when the compute that created them shuts down. The single most common beginner confusion is treating "storage" as one thing. It isn't: there are three distinct shapes of storage, each suited to different data, and picking the wrong one makes your system slow, expensive, or impossible to build. This lesson teaches the three and exactly when each fits.

The three shapes

[Diagram: the three shapes. Object storage: files addressed by name. Block storage: a raw virtual disk attached to one machine. File storage: a shared folder many machines mount at once.]

Object storage: the cloud's workhorse

Object storage holds data as objects — each object is a file plus some metadata, stored in a flat namespace and retrieved by a unique name (a key) over HTTP. There's no folder hierarchy underneath (folders are faked by putting slashes in names), no disk to attach, no size limit you'll ever realistically hit. You PUT an object to store it and GET it by name to read it. Containers for objects are called buckets.
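To make the flat namespace concrete, here is a minimal in-memory sketch. `ToyBucket` and its methods are hypothetical names invented for illustration, not any provider's API; a real service exposes the same ideas over HTTP. It shows that keys are plain strings and that "folders" are only shared prefixes.

```python
class ToyBucket:
    """In-memory stand-in for a bucket: a flat map from key to (data, metadata)."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data, metadata=None):
        # A PUT stores the whole object under its key, replacing any previous one.
        self._objects[key] = (bytes(data), dict(metadata or {}))

    def get(self, key):
        # A GET looks the object up by its exact key; there is no directory to walk.
        return self._objects[key][0]

    def list(self, prefix=""):
        # "Folders" are faked: listing just filters keys that share a prefix.
        return sorted(k for k in self._objects if k.startswith(prefix))


bucket = ToyBucket()
bucket.put("photos/2024/cat.jpg", b"\xff\xd8...", {"content-type": "image/jpeg"})
bucket.put("photos/2024/dog.jpg", b"\xff\xd8...", {"content-type": "image/jpeg"})
bucket.put("logs/app.log", b"started\n")

print(bucket.list("photos/2024/"))
print(bucket.get("logs/app.log"))
```

Listing with the prefix `photos/2024/` returns the two photo keys, even though no `photos` directory was ever created.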

Why it's the workhorse of the cloud:

  • Effectively infinite and cheap. Store a few kilobytes or many petabytes; the price per gigabyte is among the lowest in cloud, and you pay only for what you store.
  • Extremely durable. Providers typically replicate each object across multiple devices and availability zones automatically, and advertise very high durability (the famous "eleven nines", 99.999999999%). In practice, losing a stored object to hardware failure is vanishingly unlikely.
  • Accessible from anywhere over HTTP — perfect for serving to web apps and CDNs.
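A quick piece of arithmetic shows what "eleven nines" implies. This sketch assumes the figure is read as an annual, per-object durability and that losses are independent; both are simplifying assumptions, and the function names are made up for illustration.

```python
from fractions import Fraction

# The advertised figure from the text, read as annual per-object durability (assumption).
DURABILITY = Fraction("0.99999999999")


def expected_objects_lost_per_year(object_count):
    # Expected losses = number of objects x probability that one object is lost in a year.
    return object_count * (1 - DURABILITY)


def years_per_single_loss(object_count):
    # Average number of years between losing one object, at this scale.
    return 1 / expected_objects_lost_per_year(object_count)


stored = 10_000_000
print(expected_objects_lost_per_year(stored))
print(years_per_single_loss(stored))
```

Under these assumptions, someone storing ten million objects would expect to lose one object roughly every 10,000 years.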

It's ideal for: images, video, backups, logs, static website files, data-lake storage, and any large unstructured blob. It is not for things a running program treats like a local disk or database — you can't run a database file on it efficiently.

Brand names: S3 (AWS), Cloud Storage (GCP), Blob Storage (Azure).

:::note Storage classes / tiers: pay less for colder data
Object storage offers tiers that trade access speed and retrieval cost against storage price. "Hot" tiers are cheap to read but pricier to store; "cold" and "archive" tiers are very cheap to store but slow and costly to retrieve. Put frequently served images in hot, and put year-old backups you'll rarely touch in archive. Choosing tiers well is a real FinOps lever (Chapter 9). Exact tier names and prices vary by provider and change over time.
:::
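The tier trade-off can be sketched as a small cost comparison. The prices below are hypothetical placeholders chosen only to show the shape of the trade-off, not any provider's real rates, and the function names are invented for this example.

```python
# Hypothetical per-GB monthly prices, for illustration only (not real provider rates).
HYPOTHETICAL_PRICES = {
    "hot": {"store_per_gb": 0.020, "retrieve_per_gb": 0.000},
    "archive": {"store_per_gb": 0.002, "retrieve_per_gb": 0.050},
}


def monthly_cost(tier, gb_stored, gb_retrieved):
    # Total cost = what you pay to keep the data + what you pay to read it back.
    prices = HYPOTHETICAL_PRICES[tier]
    return gb_stored * prices["store_per_gb"] + gb_retrieved * prices["retrieve_per_gb"]


def cheapest_tier(gb_stored, gb_retrieved):
    # Pick the tier with the lowest total monthly cost for this access pattern.
    return min(
        HYPOTHETICAL_PRICES,
        key=lambda tier: monthly_cost(tier, gb_stored, gb_retrieved),
    )


# Frequently served images: 1,000 GB stored, 2,000 GB read per month.
print(cheapest_tier(1000, 2000))
# Old backups: 1,000 GB stored, never read.
print(cheapest_tier(1000, 0))
```

With these made-up numbers, the heavily read images are cheapest in the hot tier and the untouched backups are cheapest in archive. The point is that the answer depends on how often the data is read, not only on how much of it there is.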

Block storage: a raw disk for one machine

Block storage gives you a raw virtual hard drive — a volume you attach to a single virtual machine, which then formats it and uses it exactly like a physical disk. It's called "block" because the data is managed in fixed-size chunks (blocks), the same low-level way a real disk works. It's fast, low-latency, and behaves like local storage, but it's attached to one VM at a time.
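As a rough sketch of what "managed in fixed-size blocks" means, the toy volume below reads and writes whole blocks by index. `ToyVolume` and the 4096-byte block size are assumptions for illustration; a real volume is presented to the VM as a device by the operating system, not used through a Python class.

```python
BLOCK_SIZE = 4096  # bytes; an assumed block size for illustration


class ToyVolume:
    """In-memory stand-in for a block volume: a fixed array of equal-size blocks."""

    def __init__(self, block_count):
        self.block_count = block_count
        self._data = bytearray(block_count * BLOCK_SIZE)

    def _offset(self, index):
        if not 0 <= index < self.block_count:
            raise IndexError("block index out of range")
        return index * BLOCK_SIZE

    def read_block(self, index):
        # Reads always return exactly one block, addressed by number, not by name.
        start = self._offset(index)
        return bytes(self._data[start:start + BLOCK_SIZE])

    def write_block(self, index, payload):
        # Writes replace exactly one block in place; other blocks are untouched.
        if len(payload) != BLOCK_SIZE:
            raise ValueError("payload must be exactly one block")
        start = self._offset(index)
        self._data[start:start + BLOCK_SIZE] = payload


volume = ToyVolume(block_count=8)

# Change a few bytes inside block 3: read it, modify it, write it back.
block = bytearray(volume.read_block(3))
block[0:5] = b"hello"
volume.write_block(3, bytes(block))

print(volume.read_block(3)[:5])
```

A small in-place update touches a single block, which is part of why a database can run efficiently on block storage, whereas an object store replaces whole objects.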

It's ideal for: the boot disk of a VM, the data files of a database running on a VM, and any workload that needs a real, mountable, high-performance disk. It is not for sharing across many machines (one attachment at a time) or for storing huge archives cheaply (object storage is far cheaper for that).

Brand names: EBS (AWS), Persistent Disk (GCP), Managed Disks (Azure).

File storage: a shared folder for many machines

File storage is a shared file system — a network drive that many machines can mount and use at the same time, with the familiar folders-and-files hierarchy. It exists for the case block storage can't cover: several servers that all need to read and write the same files concurrently (a shared uploads directory, a content repository several app servers serve from).

It's ideal for: shared content across a fleet of servers, lift-and-shift of legacy apps that expect a network file share. It is generally pricier than object storage and used when you specifically need that shared-filesystem, mount-it-like-a-folder behavior.

Brand names: EFS (AWS), Filestore (GCP), Azure Files (Azure).

Choosing the right shape

The decision is usually obvious once you ask how the data is accessed:

| Question | Use |
| --- | --- |
| Is it a blob (image, video, backup, log) served or stored at scale? | Object |
| Does one VM need a fast disk to boot from or run a database on? | Block |
| Do many machines need to read/write the same files at once? | File |

:::tip The default-first instinct
When unsure, start with object storage. It's the cheapest, the most durable, the most scalable, and it fits the largest share of real workloads (assets, backups, logs, data lakes). Reach for block when something needs to behave like a local disk, and for file only when you genuinely need a shared filesystem. Most teams use a lot of object, some block, and a little file.
:::
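The decision questions and the default-to-object rule can be written as a tiny function. This is a simplified sketch, with hypothetical parameter names, that only encodes the questions asked in this lesson; real choices also weigh cost, performance, and provider limits.

```python
def choose_storage(*, mounted_as_filesystem, machines_sharing_files=1):
    """Pick a storage shape from how the data is accessed (simplified sketch)."""
    if not mounted_as_filesystem:
        # Blobs fetched by name, and anything you are unsure about: default to object.
        return "object"
    if machines_sharing_files > 1:
        # Many machines reading and writing the same files at once.
        return "file"
    # One machine that needs something behaving like a local disk.
    return "block"


# A nightly backup archive, fetched by name when needed.
print(choose_storage(mounted_as_filesystem=False))
# The data files of a database running on a single VM.
print(choose_storage(mounted_as_filesystem=True))
# A shared uploads directory used by four app servers.
print(choose_storage(mounted_as_filesystem=True, machines_sharing_files=4))
```

The three calls return `object`, `block`, and `file` respectively. Object is the fall-through case, which mirrors the advice to start there when unsure.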

A worked example

Picture a photo-sharing app:

  • The uploaded photos → object storage (huge, unstructured, served to users and a CDN, must be durable and cheap). Eleven-nines durability means a user's photos won't vanish.
  • The VM's operating system and the database's data files → block storage (the running machine needs a real, fast, mountable disk).
  • A shared directory of templates that all the app servers render from → file storage (many machines, same files, concurrent access).

One app, three storage shapes, each chosen by how the data is used — exactly the reasoning you'll apply everywhere.

Why it matters

Cloud storage isn't one thing — it's three shapes. Object storage (files by name over HTTP, infinitely scalable, ultra-durable, cheap) is the workhorse for blobs, backups, logs, and assets, with cost tiers for hot vs cold data. Block storage is a raw virtual disk attached to a single VM for boot volumes and databases. File storage is a shared network filesystem many machines mount at once. Choose by asking how the data is accessed; default to object storage when unsure. Data at rest needs structure and querying as well as raw bytes — and that's the database, the next primitive.
