Common anti-patterns & pointer to the rest of the series
02
What "The Cloud" Actually Is
The cloud is someone else's computer, rented by the second, behind an API. Everything else — IaaS, PaaS, SaaS — is a question of how much of the stack you rent versus run.
NIST's five essentials (SP 800-145, 2011)
On-demand self-service — provision via API, no human in the loop
Broad network access — reachable from anywhere
Resource pooling — physical hardware shared by many tenants
Rapid elasticity — scale up and down minute-to-minute
Measured service — usage is metered and billed transparently; you pay only for what you use
If a thing lacks any one of these, it isn't cloud — it's outsourced hosting.
The three deployment models
Public cloud — AWS / GCP / Azure / Cloudflare. Pay-as-you-go, multi-tenant by default.
Private cloud — same APIs, your hardware (OpenStack, VMware Cloud, Outposts).
Hybrid & multi-cloud — workloads split across both, often for sovereignty or burst capacity.
"Cloud-native" ≠ "in the cloud"
Lifting a VM image to EC2 is "in the cloud" but not "cloud-native". Cloud-native means designed to survive instance death, scale horizontally, and be deployed continuously — see CNCF for the canonical definition.
What the cloud is not
It is not magically secure, not magically cheap, and not magically reliable. Each of those is a property you have to design in — see decks 04, 05, and the CI/CD deck.
03
The *aaS Spectrum — What You Manage
Reading bottom-to-top: as you climb the stack, the provider takes more responsibility and you lose more control. Pick the highest layer that still gives you what you need.
04
Shared Responsibility — The Most Misread Diagram
The shared-responsibility model says "cloud security is a partnership". The provider runs security of the cloud; you run security in the cloud. The line moves up the stack as you move up the *aaS ladder.
What the provider always handles
Physical data-centre security & power
Hypervisor patching
Top-of-rack network & cross-region transit
Hardware decommissioning & media destruction
The bricks-and-mortar of compliance — they give you the SOC 2 Type II report; you have to use it correctly
What you always handle
Your data (encryption keys, classification, retention)
Your identities & access policies
Your client-side configuration & secrets
How you grant access to the data — see deck 05
What moves with the layer
Layer   OS         Runtime    App        Data
IaaS    you        you        you        you
CaaS    provider   you        you        you
PaaS    provider   provider   you        you
FaaS    provider   provider   you        you
SaaS    provider   provider   provider   shared
The classic mistake
"It's in S3, AWS handles security." S3 is multi-tenant object storage; AWS secures the storage substrate, but every major S3 data leak of the last decade — Capital One, Verizon, Accenture, the Pentagon, FedEx — came down to customer-side misconfiguration (public buckets, over-permissive IAM), not a provider failure. Deck 05 covers this in detail.
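The customer side of this failure mode can be checked mechanically. A minimal sketch, assuming a bucket policy document in the standard JSON shape; a real audit would lean on IAM Access Analyzer rather than hand-rolled checks like this.

```python
# Sketch: detect the classic "public bucket" misconfiguration in an
# S3-style bucket policy document. Illustrative only.

def allows_public_read(policy: dict) -> bool:
    """True if any Allow statement grants object read to everyone ('*')."""
    for stmt in policy.get("Statement", []):
        if stmt.get("Effect") != "Allow":
            continue
        principal = stmt.get("Principal")
        actions = stmt.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        is_everyone = principal == "*" or (
            isinstance(principal, dict) and principal.get("AWS") == "*"
        )
        if is_everyone and any(a in ("s3:GetObject", "s3:*", "*") for a in actions):
            return True
    return False

leaky_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": "*",
        "Action": "s3:GetObject",
        "Resource": "arn:aws:s3:::example-bucket/*",
    }],
}

print(allows_public_read(leaky_policy))  # → True
```

The point of the sketch: nothing in that policy is a provider bug. The provider faithfully enforces exactly what the customer wrote.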
05
IaaS — Infrastructure-as-a-Service
You rent the virtual hardware — VMs, virtual networks, virtual disks — and assemble everything on top yourself. The closest cloud equivalent of "a server in a rack".
What you get
Compute — a Linux/Windows VM you SSH into, a chosen CPU/RAM shape, a region, an availability zone.
Networking — a VPC, subnets, routes, security groups, public IPs, load balancers.
Identity — IAM users / roles / policies that govern everything.
Canonical examples
AWS EC2, EBS, VPC, IAM, S3
GCP Compute Engine, Persistent Disk, VPC, GCS
Azure Virtual Machines, Managed Disks, VNets
Linode / Vultr / Hetzner — cheaper, simpler IaaS
When IaaS still wins
You need very specific kernels, drivers, or kernel modules (eBPF, GPU, RDMA, real-time).
You're regulated and the auditor wants "we own and patch the OS" in writing.
You have a stateful workload that doesn't fit a managed pattern (legacy ERP, MPI cluster).
You want maximum negotiating leverage on cost (3-year reserved instances, savings plans).
The hidden cost
An EC2 instance is "cheap" until you realise you also need: patching, monitoring, log shipping, backups, a bastion, two of everything across AZs, and a person to wake up at 03:00. Deck 02 spells out the operational footprint.
The rule of thumb
If you can do the job at PaaS or above, do. IaaS is the floor — useful when you need it, expensive when you don't.
06
PaaS — Platform-as-a-Service
You give the provider a git push (or an artefact); they give you a running URL. The provider owns the OS, the runtime, the build pipeline, the load balancer, the autoscaler, the rolling deploy, the SSL cert, and the logs UI.
The "Heroku-style" PaaS
Heroku — the original (2007). Buildpacks, Procfile, dynos.
You trade control for velocity. The platform decides your build image, your config surface, your tail-latency budget, your scaling algorithm. In exchange you ship in minutes and never touch a hypervisor.
Deck 03 covers PaaS in detail with side-by-side examples.
07
SaaS — Software-as-a-Service
The provider runs a finished application on your behalf, multi-tenant, accessed over the network — usually a browser or an API. You log in, you use it, you pay a subscription. This is the part the user actually sees.
08
CaaS — Containers-as-a-Service
Halfway between IaaS and PaaS. You bring a container image; the provider runs it, scales it, networks it. No OS, no buildpack opinions, no servers — but unlike PaaS, the container is your world end-to-end.
"Run my container" services
Google Cloud Run — request-billed, scale-to-zero, <1s cold start
AWS ECS on Fargate — task-based, no EC2 to manage
Azure Container Apps — KEDA-driven autoscale, Dapr built in
EKS / GKE / AKS — provider runs the control plane, you run the workloads
Cluster-as-a-Service (cluster-ops) is its own emerging layer above plain managed K8s
When CaaS beats PaaS
Polyglot stack — your runtime is not on the PaaS menu
You need exact control over the image (CVE patching, FIPS mode)
You want portability — the same image runs anywhere Docker runs
When CaaS beats Kubernetes
You don't have a platform team
You don't need workload-level customisation (PSPs, custom CSI, CRDs)
Scale-to-zero and per-request billing matter for cost
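The scale-to-zero cost argument can be made concrete. A back-of-envelope sketch; the per-request and per-vCPU-second prices below are illustrative assumptions, not any provider's current rate card.

```python
# When does scale-to-zero (CaaS, per-request billing) beat an
# always-on VM? All prices are assumed for illustration.

VM_MONTHLY = 30.00            # small always-on instance, $/month (assumed)
PER_MILLION_REQUESTS = 0.40   # request fee, $/1M requests (assumed)
CPU_SECOND = 0.000024         # billed vCPU-second while serving (assumed)

def caas_monthly_cost(requests_per_month: int, avg_cpu_seconds: float) -> float:
    # You pay only while a request is in flight; idle time costs nothing
    return (requests_per_month / 1_000_000) * PER_MILLION_REQUESTS \
        + requests_per_month * avg_cpu_seconds * CPU_SECOND

low = caas_monthly_cost(100_000, 0.05)       # long-tail API, 50ms CPU/request
high = caas_monthly_cost(50_000_000, 0.05)   # busy API, same CPU/request

print(f"low traffic:  ${low:.2f}/month vs ${VM_MONTHLY:.2f} always-on")
print(f"high traffic: ${high:.2f}/month vs ${VM_MONTHLY:.2f} always-on")
```

At these assumed rates the long-tail API costs cents, while the busy one crosses the always-on price; the crossover point is the whole decision.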
The trap
"Container" doesn't mean "stateless". CaaS providers love stateless HTTP services and hate everything else. Stateful workloads (databases, queues, GPU training) push you back to IaaS or specialised managed services.
09
FaaS — Functions / Serverless
You upload a function. It runs when an event arrives — an HTTP request, a queue message, a file upload, a cron tick — and is billed per invocation, often to the millisecond. The platform handles scaling from zero to thousands of concurrent invocations and back.
Where FaaS shines
Glue between managed services (S3 → Lambda → DynamoDB)
Long-tail APIs where idle cost >> active cost
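The glue pattern can be sketched as a handler you can run locally. The event shape below mimics an S3 notification record; FAKE_TABLE is a hypothetical stand-in for a DynamoDB table so the example needs no AWS account.

```python
# Sketch of the S3 → Lambda → DynamoDB glue pattern, runnable locally.
import json

FAKE_TABLE: dict[str, dict] = {}  # stand-in for a DynamoDB table (assumed)

def handler(event: dict, context=None) -> dict:
    """Index every uploaded object's key and size into the 'table'."""
    for record in event.get("Records", []):
        obj = record["s3"]["object"]
        FAKE_TABLE[obj["key"]] = {"size": obj["size"]}
    return {"statusCode": 200, "body": json.dumps({"indexed": len(FAKE_TABLE)})}

# Simulated invocation — in production the platform calls handler() per event
event = {"Records": [{"s3": {"object": {"key": "uploads/report.pdf",
                                        "size": 1048576}}}]}
print(handler(event))
```

Note what's absent: no server, no listener loop, no scaling logic. The platform owns all of that; you own only the function body.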
Where FaaS hurts
Cold starts — first hit after idle is slow, breaks SLOs
Long-running jobs — most platforms cap at 5–15 minutes
Sticky / stateful workloads — no in-memory cache between invocations
Local dev — emulators are good but never quite the cloud
Edge FaaS — a different beast
Cloudflare Workers / Vercel Edge / Deno Deploy run V8 isolates instead of containers. Cold start is ~ms, but you lose Node APIs, native binaries, and long-running connections. Best for low-latency request rewriting, A/B routing, auth, geofencing — not heavy compute.
The "serverless monolith" trap
Resist the urge to ship 200 Lambdas where 5 services would do. Per-function complexity is real — observability, deploys, IAM, cold starts, dependency footprint. Function granularity should match your change boundary, not your route table.
10
AIaaS — AI / LLM-as-a-Service
The newest layer (~2020). The provider runs the model — increasingly, the entire agent — and gives you an API. Pricing is per token, per second, per request, or per agent-step.
Three sub-layers
Inference-as-a-Service — OpenAI, Anthropic, Google, Together, Groq, Fireworks, Replicate. Bring a prompt, get tokens.
Embedding & RAG-as-a-Service — Pinecone, Weaviate Cloud, Turbopuffer, Vespa Cloud. Bring documents, get a search API.
Agents-as-a-Service — Bedrock Agents, Vertex AI Agent Builder, OpenAI Assistants, LangSmith Hub. Bring a goal, get an agent.
Hyperscaler model gardens
AWS Bedrock — Anthropic, Meta, Mistral, Cohere, Amazon Nova
GCP Vertex AI — Gemini, Anthropic, Meta, partner zoo
Azure OpenAI — OpenAI under Microsoft compliance perimeter
Why AIaaS is its own *aaS
The shared-responsibility model is different — your prompt and tokenised output are data, but inference happens inside someone else's model. Whose data is the model's hidden state?
Pricing isn't per-second-of-VM — it's per token, with caching tiers that can be 10× cheaper. See prompt caching.
Latency is a first-class concern (TTFT, tokens-per-second).
Compliance has new shapes — model-card transparency, BYOK, no-training options, EU AI Act.
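The per-token economics, including the caching tier, reduce to simple arithmetic. A sketch with assumed prices (in $ per 1M tokens) chosen only to show the shape of the math, not any provider's rate card.

```python
# Rough cost model for AIaaS per-token pricing with a prompt-caching tier.

PRICE_IN = 3.00          # uncached input tokens, $/1M (assumed)
PRICE_IN_CACHED = 0.30   # cache-hit input tokens, 10x cheaper (assumed)
PRICE_OUT = 15.00        # output tokens, $/1M (assumed)

def request_cost(input_tokens: int, cached_tokens: int, output_tokens: int) -> float:
    uncached = input_tokens - cached_tokens
    return (uncached * PRICE_IN
            + cached_tokens * PRICE_IN_CACHED
            + output_tokens * PRICE_OUT) / 1_000_000

# A 20k-token system prompt reused across calls: caching dominates cost
cold = request_cost(21_000, 0, 500)        # first call, nothing cached
warm = request_cost(21_000, 20_000, 500)   # later calls hit the cache

print(f"cold: ${cold:.4f}  warm: ${warm:.4f}")
```

At these assumed rates the warm call is roughly a quarter of the cold one, which is why prompt structure (stable prefix first) is a billing concern, not just a quality one.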
Companion decks
Deck 06 covers managed LLM services in depth. For self-hosted alternatives see Local LLM Hosting.
11
A Brief History — 1999 → 2026
The pre-2006 world
1999: Salesforce — first SaaS at scale; "no software" as a slogan
2002: AWS-the-internal-team forms inside Amazon
Hosted apps existed (ASPs in the 90s) but lacked the NIST essentials — no API, no elasticity, no per-second billing
2006–2010 — the great unbundling
2006: EC2 + S3 redefine "hosting" as IaaS
2008: Heroku ships git push heroku master
2008: App Engine — Google's PaaS bet
2010: OpenStack formalises private-cloud IaaS
2014–2020 — containers and functions eat IaaS
2014: Lambda launches — FaaS goes mainstream
2015: Kubernetes 1.0; ECS already 1 year old
2017: Fargate — serverless containers
2019: Cloud Run — request-billed serverless containers
2020+ — the AI layer
2020: GPT-3 API; AIaaS arrives quietly
2023: ChatGPT, Bedrock, Vertex AI, Azure OpenAI all GA
2022+: repatriation debate — DHH at 37signals, Dropbox earlier, Stack Overflow — large steady-state workloads moving back on-prem
What you actually pay for (the long tail)
Egress — bytes leaving the cloud. AWS/GCP/Azure all charge $0.05–0.12/GB; pushing a TB from S3 to the internet runs roughly $90
Cross-AZ traffic — billed even between two of your instances
NAT gateways — flat hourly + per-GB; surprise on a busy week
Public IPv4 addresses — chargeable from Feb 2024 (~$4/month each)
Idle managed services — RDS doesn't scale to zero; one forgotten test cluster = real money
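The egress line item is easy to estimate once you model the tiering. A sketch with an assumed rate card in the $0.05–0.12/GB range cited above; the tier boundaries and the free first 100 GB are illustrative, not any provider's actual pricing.

```python
# Estimate a monthly egress bill from GB transferred, using a simple
# tiered rate card (all tiers assumed for illustration).

TIERS = [            # (up-to-GB cumulative, $/GB)
    (100, 0.00),     # a free first tranche is a common pattern
    (10_240, 0.09),
    (51_200, 0.085),
    (float("inf"), 0.07),
]

def egress_cost(gb: float) -> float:
    cost, prev_cap = 0.0, 0.0
    for cap, rate in TIERS:
        band = min(gb, cap) - prev_cap   # GB falling inside this tier
        if band <= 0:
            break
        cost += band * rate
        prev_cap = cap
    return cost

print(f"1 TB out: ${egress_cost(1024):,.2f}")
```

Run the same function over a year of projected traffic before committing an architecture that streams data out of the cloud; egress is the line item that scales with success.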
Cloud egress cartel — and the cracks
EU Data Act (effective 12 Sep 2025) requires the "switching charges" levied on customers leaving a cloud to be phased out, reaching zero by Jan 2027. AWS/GCP/Azure responded with limited free egress on full-account exit. Cloudflare's R2 (zero egress fees) has been pressuring this since 2022.
13
Single-Tenant vs Multi-Tenant
Every *aaS layer makes a choice: do all customers share one running stack (multi-tenant) or does each get their own (single-tenant)? This is the most important architectural decision in cloud after "which region".
Multi-tenant — the default for SaaS
One DB schema, one app deployment; tenant ID is a column on every table
Highest density, lowest unit cost — Salesforce, Slack, Notion
Failure-domain risk: one bad query, every customer suffers ("noisy-neighbour")
Hardest part: tenant isolation — the one bug that returns Tenant B's data to Tenant A is existential
Single-tenant — the enterprise upcharge
Each customer gets their own DB, their own deployment, sometimes their own VPC
Higher per-customer cost, lower density
Sold as "isolated" / "dedicated" / "Enterprise"
Mandatory for regulated workloads — HIPAA BAA, FedRAMP, EU sovereign clouds
The four real-world patterns
Pattern                                       Isolation   Density
Pool — one stack, tenant-id everywhere        logical     highest
Bridge — shared compute, isolated DB schema   db-level    high
Silo — full per-tenant stack                  full        low
Hybrid — pool by default, silo for whales     tiered      tuned
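The "logical isolation" cell of the Pool row is where the existential bug lives. A minimal sketch of the pattern, with SQLite standing in for the shared production database; the schema and helper names are illustrative.

```python
# The "pool" pattern: one schema, tenant_id on every table, and a query
# helper so the tenant filter lives in exactly one place.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE invoices (tenant_id TEXT, amount REAL)")
db.executemany("INSERT INTO invoices VALUES (?, ?)",
               [("tenant_a", 100.0), ("tenant_a", 50.0), ("tenant_b", 999.0)])

def invoices_for(tenant_id: str) -> list:
    # Never inline this WHERE clause at call sites: the one bug that
    # returns Tenant B's rows to Tenant A is the existential one.
    return db.execute(
        "SELECT amount FROM invoices WHERE tenant_id = ?", (tenant_id,)
    ).fetchall()

print(invoices_for("tenant_a"))
```

Bridge and Silo remove the need for this discipline by moving isolation down a layer (separate schema, separate stack) at the cost of density.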
Deck 04 of this series goes deep on these.
The most common failure
Starting pool, growing past the point where one bad customer can take down everyone, and then spending two years bolting silo onto pool. Build the silo escape hatch from day one even if you never use it.
14
Hyperscalers vs Specialists
The hyperscaler play
AWS / GCP / Azure are the only providers that span every row. The bargain: deeper integration (and bigger bills) than any specialist.
The specialist play
Each specialist owns one row and tries to be measurably better than the hyperscaler in DX, price, or geography. Vercel for frontend PaaS, Fly for global containers, Cloudflare for edge, Anthropic/OpenAI for AIaaS, PlanetScale/Neon for DBaaS.
15
The "Managed-Services Gradient"
Real systems live in multiple *aaS layers at once. A typical SaaS app uses VMs (IaaS) for batch ML, Cloud Run (CaaS) for the API, S3 (storage-as-a-service) for files, RDS (database-as-a-service) for state, Cognito (identity-as-a-service) for auth, and Bedrock (AIaaS) for the smart features.
The healthy mix
Default to the highest layer that meets your need
Drop a layer only when you've hit a real ceiling (cost, latency, control)
Most workloads get all the operational maturity they need at PaaS / CaaS
IaaS is where the rough edges are — only choose it deliberately
Where the gradient breaks
Stateful workloads — managed services exist (RDS, Cloud SQL, Spanner) but with steep cost and feature gaps
Realtime / very low-latency — sometimes cheaper to own the box
16
Capex vs Opex — What Cloud Actually Changes
Cloud doesn't make compute cheaper — it makes it elastic. Whether that's net-cheaper depends on how steady your workload is.
Pre-cloud — capex
Buy hardware up front, depreciate over 3–5 years
Provision for peak — Black Friday, end-of-quarter
Most racks ran at 10–20% utilisation
Lead time for new capacity: 6–12 weeks
Cloud — opex
Pay per second / GB / request
Provision for now; scale up / down on demand
Utilisation can hit 70–90% with autoscaling and spot
Lead time: seconds
When cloud is cheaper
Spiky / seasonal load
Early-stage, unknown demand curve
Many small services (managed glue is real)
Anywhere capex approval is slow
When on-prem wins
Steady, predictable, large workloads (DHH 37signals: ~$2M/yr saved)
Heavy egress (CDN, video) — cloud egress is the killer
Specialised hardware (FPGAs, custom ASICs, large GPU farms)
Strong cost-engineering culture & ops staff already in place
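The capex/opex trade reduces to one comparison: amortised purchase price plus running costs versus an elastic bill that tracks utilisation. A break-even sketch; every figure below is an assumption for illustration.

```python
# Break-even: amortised on-prem capex vs cloud opex as a function of
# average utilisation. All figures assumed for illustration.

CAPEX = 120_000          # hardware purchase, $ (assumed)
YEARS = 4                # depreciation window
OPEX_ONPREM = 2_000      # power, space, remote hands, $/month (assumed)
CLOUD_PEAK = 9_000       # cloud bill at peak capacity, $/month (assumed)

def onprem_monthly() -> float:
    # Capex spread over the depreciation window, plus fixed running costs
    return CAPEX / (YEARS * 12) + OPEX_ONPREM

def cloud_monthly(avg_utilisation: float) -> float:
    # Elasticity means you pay roughly in proportion to what you use
    return CLOUD_PEAK * avg_utilisation

for util in (0.15, 0.50, 0.90):
    print(f"util {util:.0%}: cloud ${cloud_monthly(util):,.0f}"
          f" vs on-prem ${onprem_monthly():,.0f}")
```

At these assumed numbers the lines cross around 50% average utilisation: spiky workloads (cloud at 15%) win decisively, steady ones (90%) lose decisively, which is exactly the repatriation argument.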
Hidden cost
The ops team that no longer racks servers still has to write Terraform, audit IAM, debug VPC routes, and run on-call. Cloud moves work; it doesn't always remove it.
17
Data Gravity & Vendor Lock-in
The compute layer is portable; the data is not. Once you have a petabyte in S3, every other workload wants to be where the data is — and getting it out costs egress, time, and rewrites.
Lock-in shapes
Data lock-in — bytes and the egress bill
API lock-in — DynamoDB, Cosmos, Spanner have no drop-in equivalents elsewhere
Compliance lock-in — e.g. Bedrock's compliance perimeter for HIPAA / FedRAMP-aligned LLM use
Lock-in is a tax. Sometimes it buys something worth the tax.
Multi-cloud — the false escape hatch
"Multi-cloud" usually means "we have two single-cloud deployments and double the ops burden". Use it when regulation demands it (sovereignty, two-vendor rule for critical infra), or when one workload genuinely fits a different provider — not as an architectural default.
18
Sovereignty & Regulated Regions
Why "where" matters as much as "what"
GDPR (EU) — protections follow EU residents' personal data wherever it is processed, including across cloud regions; transfers outside the EU need SCCs / DPF
UK Data Protection Act 2018 — post-Brexit GDPR equivalent, with the EU adequacy decision still in force as of 2025
"Control-plane traffic stays in-region" — not always; some metadata routes via the US
Provider personnel access — depends on the SKU; "sovereign" tiers add personnel-jurisdiction guarantees
Subpoena risk under the US CLOUD Act — applies to any US-headquartered provider regardless of region; this is the exposure sovereign clouds are built to address
The schism that did not happen
2018 fears that GDPR would force the internet into national silos largely didn't materialise — but the AI Act (2024), the EU Data Act (2025) and rising geopolitics are slowly producing the layered, residency-first cloud market they predicted. Architect for it now.
19
Decision Tree — Pick a Layer
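The tree on this slide can be sketched as a function. The questions and their order are one reading of this deck's advice ("pick the highest layer that meets your need"), not an official flowchart; every predicate name is illustrative.

```python
# The pick-a-layer decision tree, as code. Ordering follows the deck's
# rule of thumb: start at the highest layer and only descend on a hard
# requirement. Predicate names are assumptions for illustration.

def pick_layer(*, finished_app_exists: bool, is_ai_feature: bool,
               event_driven_short_tasks: bool, runtime_on_paas_menu: bool,
               needs_custom_image: bool, needs_os_or_kernel_control: bool) -> str:
    if finished_app_exists:
        return "SaaS"        # highest layer: just buy it
    if is_ai_feature:
        return "AIaaS"
    if event_driven_short_tasks:
        return "FaaS"
    if runtime_on_paas_menu and not needs_custom_image:
        return "PaaS"
    if not needs_os_or_kernel_control:
        return "CaaS"
    return "IaaS"            # the floor: choose it deliberately

print(pick_layer(finished_app_exists=False, is_ai_feature=False,
                 event_driven_short_tasks=False, runtime_on_paas_menu=False,
                 needs_custom_image=True, needs_os_or_kernel_control=False))
# → CaaS
```

The shape matters more than the specific predicates: every branch downward should be justified by a requirement you can name, never by familiarity.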
20
Common Anti-Patterns
"Lift & shift" without re-architecting
Moving your entire data centre onto VMware-on-AWS in 18 months, then acting surprised when the bill doubles. The real cost wins live one or two layers up: migrate the load-bearing services to PaaS/CaaS as you go.
"Multi-cloud from day one"
Two clouds = double the IAM model, double the network, double the on-call, half the depth on each. Solve your cloud well first.
"Serverless monolith"
200 Lambdas where 5 services would do. Every function deploys independently, but they share a database and tightly-coupled code paths. You've shipped a distributed monolith with extra cold starts.
"Build platform on Kubernetes"
If you're not Spotify or Goldman Sachs, you don't need a platform team running EKS, Istio, ArgoCD, Flux, OPA, Crossplane, Backstage. Cloud Run / App Runner / Container Apps will spare your team a world of pain.
"Free tier means free at scale"
Most "$0 forever" tiers cap at 1k requests/day; surprise spend hits at the 99th-percentile day. Set budget alarms and request quotas before you ship.
"AIaaS = unlimited intelligence"
Tokens are expensive, latency is real, agents loop, and prompts leak. Treat AIaaS calls like external API calls — observability, circuit breakers, caching, cost ceilings. Deck 06 covers this.
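"Circuit breakers and cost ceilings" can be a page of code, not a platform purchase. A minimal sketch; the thresholds, the budget field, and the fake model call are all illustrative assumptions, not a production pattern library.

```python
# Treat AIaaS calls like any external API: a minimal circuit breaker
# with a per-process cost ceiling. Illustrative sketch only.
import time

class LLMCircuit:
    def __init__(self, max_failures=3, cooldown_s=30.0, budget_usd=10.0):
        self.failures, self.spent = 0, 0.0
        self.max_failures, self.cooldown_s = max_failures, cooldown_s
        self.budget_usd, self.opened_at = budget_usd, None

    def call(self, fn, est_cost_usd: float):
        # Cost ceiling: refuse before spending, not after
        if self.spent + est_cost_usd > self.budget_usd:
            raise RuntimeError("cost ceiling reached")
        # Open circuit: fail fast while the provider is unhealthy
        if self.opened_at and time.monotonic() - self.opened_at < self.cooldown_s:
            raise RuntimeError("circuit open")
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()
            raise
        self.failures, self.opened_at = 0, None
        self.spent += est_cost_usd
        return result

circuit = LLMCircuit(budget_usd=0.05)
print(circuit.call(lambda: "fake completion", est_cost_usd=0.02))
```

In real use `fn` would wrap the provider SDK call and `est_cost_usd` would come from a token-count estimate; the structure stays the same.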
21
Summary & What's Next
Three takeaways
The *aaS layers form a spectrum of "how much of the stack you rent". Pick the highest layer that meets your need.
Cloud doesn't make compute cheaper — it makes it elastic. Whether that's net-cheaper depends on workload shape and cost discipline.
Lock-in, sovereignty, and the shared-responsibility model are architectural decisions you can't fix later cheaply.
"The cloud is a stack of rented layers; the architect's job is to pick the layer where the marginal control you keep is worth more than the operational tax it costs."