CLOUD SERVICE MODELS · PART 1 OF 6

The *aaS Series
Service Models

IaaS · PaaS · SaaS · CaaS · FaaS · AIaaS — the taxonomy that runs the modern internet

What each aaS layer actually is · the shared-responsibility model · 1999 → 2026 history · pricing · vendor matrix · how to choose.

Taxonomy  ·  Responsibility  ·  History  ·  Choice
01

Topics

Foundations

  • What "the cloud" actually is — and what it isn't
  • The shared-responsibility model
  • 1999 → 2026 timeline of cloud service models
  • Pricing models — usage, seat, tier, commit

The six layers

  • IaaS — Infrastructure-as-a-Service
  • PaaS — Platform-as-a-Service
  • CaaS — Containers-as-a-Service
  • FaaS — Functions / serverless
  • SaaS — Software-as-a-Service
  • AIaaS — AI / LLM-as-a-Service

Trade-offs

  • Single-tenant vs multi-tenant
  • Capex vs opex — what cloud actually changes
  • Data gravity & vendor lock-in
  • Sovereignty & regulated regions

Choosing & landscape

  • Vendor matrix — AWS / GCP / Azure / Cloudflare / Fly / Vercel
  • The "managed-services gradient"
  • A decision tree: pick a layer
  • Common anti-patterns & pointer to the rest of the series
02

What "The Cloud" Actually Is

The cloud is someone else's computer, rented by the second, behind an API. Everything else — IaaS, PaaS, SaaS — is a question of how much of the stack you rent versus run.

NIST's five essentials (SP 800-145, 2011)

  • On-demand self-service — provision via API, no human in the loop
  • Broad network access — reachable from anywhere
  • Resource pooling — physical hardware shared by many tenants
  • Rapid elasticity — scale up and down minute-to-minute
  • Measured service — usage is metered; you pay only for what you use

If a thing lacks any one of these, it isn't cloud — it's outsourced hosting.

The three deployment models

  • Public cloud — AWS / GCP / Azure / Cloudflare. Pay-as-you-go, multi-tenant by default.
  • Private cloud — same APIs, your hardware (OpenStack, VMware Cloud, Outposts).
  • Hybrid & multi-cloud — workloads split across both, often for sovereignty or burst capacity.

"Cloud-native" ≠ "in the cloud"

Lifting a VM image to EC2 is "in the cloud" but not "cloud-native". Cloud-native means designed to survive instance death, scale horizontally, and be deployed continuously — see CNCF for the canonical definition.

What the cloud is not

It is not magically secure, not magically cheap, and not magically reliable. Each of those is a property you have to design in — see decks 04, 05, and the CI/CD deck.

03

The *aaS Spectrum — What You Manage

Reading bottom-to-top: as you climb the stack, the provider takes more responsibility and you lose more control. Pick the highest layer that still gives you what you need.

[Stack diagram: layers from Building & power → Net & Storage → Servers → Virtualisation → OS → Container → Runtime → Data → Application, across the columns On-prem · IaaS · CaaS · PaaS · FaaS · SaaS · AIaaS — each cell shaded by who manages it: you, the provider, or shared (e.g. AIaaS data ↔ provider).]
04

Shared Responsibility — The Most Misread Diagram

The shared-responsibility model says "cloud security is a partnership". The provider runs security of the cloud; you run security in the cloud. The line moves up the stack as you move up the *aaS ladder.

What the provider always handles

  • Physical data-centre security & power
  • Hypervisor patching
  • Top-of-rack network & cross-region transit
  • Hardware decommissioning & media destruction
  • The bricks-and-mortar of compliance — they give you the SOC 2 Type II report; you have to use it correctly

What you always handle

  • Your data (encryption keys, classification, retention)
  • Your identities & access policies
  • Your client-side configuration & secrets
  • How you grant access to the data — see deck 05

What moves with the layer

Layer   OS         Runtime    App        Data
IaaS    you        you        you        you
CaaS    provider   you        you        you
PaaS    provider   provider   you        you
FaaS    provider   provider   you        you
SaaS    provider   provider   provider   shared

The classic mistake

"It's in S3, AWS handles security." S3 is multi-tenant object storage: AWS secures the storage substrate, but the headline S3 leaks of the last decade — Verizon, Accenture, the Pentagon, FedEx — were customer misconfigurations (public buckets, over-broad policies), not provider failures; even Capital One came down to a customer-side firewall and IAM misconfiguration. Deck 05 covers this in detail.
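To make the misconfiguration class concrete, here is a minimal sketch that scans an S3-style bucket policy document for statements granting access to everyone. It is illustrative only — real audits should lean on AWS's own tools (S3 Block Public Access, IAM Access Analyzer); the policy shown is a made-up example of the leaky shape.

```python
import json

def public_statements(policy_json: str) -> list:
    """Return Allow statements that grant access to any principal,
    unconditionally -- the classic public-bucket misconfiguration."""
    policy = json.loads(policy_json)
    flagged = []
    for stmt in policy.get("Statement", []):
        principal = stmt.get("Principal")
        is_public = principal == "*" or (
            isinstance(principal, dict) and principal.get("AWS") == "*"
        )
        if stmt.get("Effect") == "Allow" and is_public and "Condition" not in stmt:
            flagged.append(stmt)
    return flagged

leaky = """{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": "*",
    "Action": "s3:GetObject",
    "Resource": "arn:aws:s3:::my-bucket/*"
  }]
}"""
print(len(public_statements(leaky)))  # 1 flagged statement
```

The point: the storage substrate is the provider's problem; this JSON document — and the blast radius when it says `"Principal": "*"` — is yours.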

05

IaaS — Infrastructure-as-a-Service

You rent the virtual hardware — VMs, virtual networks, virtual disks — and assemble everything on top yourself. The closest cloud equivalent of "a server in a rack".

What you get

  • Compute — a Linux/Windows VM you SSH into, a chosen CPU/RAM shape, a region, an availability zone.
  • Networking — a VPC, subnets, routes, security groups, public IPs, load balancers.
  • Storage — block volumes (EBS), object stores (S3), file shares (EFS).
  • Identity — IAM users / roles / policies that govern everything.

Canonical examples

  • AWS EC2, EBS, VPC, IAM, S3
  • GCP Compute Engine, Persistent Disk, VPC, GCS
  • Azure Virtual Machines, Managed Disks, VNets
  • Linode / Vultr / Hetzner — cheaper, simpler IaaS

When IaaS still wins

  • You need very specific kernels, drivers, or kernel modules (eBPF, GPU, RDMA, real-time).
  • You're regulated and the auditor wants "we own and patch the OS" in writing.
  • You have a stateful workload that doesn't fit a managed pattern (legacy ERP, MPI cluster).
  • You want maximum negotiating leverage on cost (3-year reserved instances, savings plans).

The hidden cost

An EC2 instance is "cheap" until you realise you also need: patching, monitoring, log shipping, backups, a bastion, two of everything across AZs, and a person to wake up at 03:00. Deck 02 spells out the operational footprint.

The rule of thumb

If you can do the job at PaaS or above, do. IaaS is the floor — useful when you need it, expensive when you don't.

06

PaaS — Platform-as-a-Service

You give the provider a git push (or an artefact); they give you a running URL. The provider owns the OS, the runtime, the build pipeline, the load balancer, the autoscaler, the rolling deploy, the SSL cert, and the logs UI.

The "Heroku-style" PaaS

  • Heroku — the original (2007). Buildpacks, Procfile, dynos.
  • Render — modern Heroku spiritual successor; native HTTPS, free tier, preview environments.
  • Fly.io — Heroku-style DX, multi-region by default, Firecracker VMs under the hood.
  • Railway — generous free tier, Postgres-and-friends in a click.

Hyperscaler PaaS

  • AWS Elastic Beanstalk, App Runner, Amplify Hosting, Lightsail Containers
  • GCP App Engine (Standard & Flexible)
  • Azure App Service, Static Web Apps, Container Apps

Frontend-shaped PaaS

  • Vercel — Next.js native, edge functions, ISR, image optimisation
  • Netlify — JAMstack, Functions, Edge Handlers, Forms
  • Cloudflare Pages — globally distributed static + Workers backend

The PaaS bargain

You trade control for velocity. The platform decides your build image, your config surface, your tail-latency budget, your scaling algorithm. In exchange you ship in minutes and never touch a hypervisor.

Deck 03 covers PaaS in detail with side-by-side examples.

07

SaaS — Software-as-a-Service

The provider runs a finished application on your behalf, multi-tenant, accessed over the network — usually a browser or an API. You log in, you use it, you pay a subscription. This is the part the user actually sees.

Examples (you use most of them)

  • Communication — Slack, Microsoft 365, Zoom, Gmail
  • CRM & sales — Salesforce (the canonical SaaS, 1999), HubSpot, Pipedrive
  • Dev tools — GitHub, GitLab, Linear, Sentry, Datadog
  • Finance — Stripe, Xero, QuickBooks Online
  • Storage — Dropbox, Box, Google Drive, OneDrive

What makes it SaaS (not just hosted software)

  • Multi-tenant by default — one shared database, tenant-scoped queries
  • Continuously deployed — same version for everyone (mostly)
  • Usage- or seat-priced subscription billing
  • Self-service signup & cancellation

B2C vs B2B SaaS

  • B2C — individual users pay with a card; signup is friction. Notion, Spotify.
  • B2B — organisations pay; SSO, SCIM, audit logs, BYO domain, BAA, DPA, custom MSA. Slack Enterprise Grid.
  • Same code, very different go-to-market — and very different identity stories. Deck 04 covers B2B identity in detail.

SaaS architecture is its own discipline

  • Multi-tenancy patterns (silo / pool / bridge)
  • Per-tenant observability & rate-limiting
  • Metering & billing data models
  • Identity federation (SAML, SCIM, OIDC)
  • Data residency & compliance

→ Deck 04 of this series. Existing companion: Monetising & Distributing Software for the business side.

08

CaaS — Containers-as-a-Service

Halfway between IaaS and PaaS. You bring a container image; the provider runs it, scales it, networks it. No OS to patch, no buildpack opinions, no servers — but unlike PaaS, everything inside the container image is yours to build and patch.

"Run my container" services

  • Google Cloud Run — request-billed, scale-to-zero, <1s cold start
  • AWS ECS on Fargate — task-based, no EC2 to manage
  • Azure Container Apps — KEDA-driven autoscale, Dapr built in
  • Fly Machines — micro-VMs, multi-region, <500ms cold start
  • Cloudflare Containers — globally distributed, scale-to-zero (2025)

Managed Kubernetes (the heavyweight CaaS)

  • EKS / GKE / AKS — provider runs the control plane, you run the workloads
  • Cluster-as-a-Service (cluster-ops) is its own emerging layer above plain managed K8s

When CaaS beats PaaS

  • Polyglot stack — your runtime is not on the PaaS menu
  • You need exact control over the image (CVE patching, FIPS mode)
  • You want portability — the same image runs anywhere Docker runs

When CaaS beats Kubernetes

  • You don't have a platform team
  • You don't need workload-level customisation (PSPs, custom CSI, CRDs)
  • Scale-to-zero and per-request billing matter for cost

The trap

"Container" doesn't mean "stateless". CaaS providers love stateless HTTP services and hate everything else. Stateful workloads (databases, queues, GPU training) push you back to IaaS or specialised managed services.

09

FaaS — Functions / Serverless

You upload a function. It runs when an event arrives — an HTTP request, a queue message, a file upload, a cron tick — and is billed per invocation, often to the millisecond. The platform handles scaling from zero to thousands of concurrent invocations and back.
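A minimal sketch of the model: the `(event, context)` signature below is AWS Lambda's Python convention, and the sample event is a trimmed-down S3 "object created" notification — the platform invokes the function once per event, with no server for you to manage.

```python
def handler(event, context=None):
    """Lambda-style entry point: one stateless unit of work per event."""
    # Pull what you need from the event payload; each invocation starts fresh.
    records = event.get("Records", [])
    keys = [r["s3"]["object"]["key"] for r in records]
    return {"processed": len(keys), "keys": keys}

# A trimmed S3 notification, as the platform would deliver it:
sample_event = {
    "Records": [
        {"s3": {"object": {"key": "uploads/report.pdf"}}},
    ]
}
print(handler(sample_event))  # {'processed': 1, 'keys': ['uploads/report.pdf']}
```

Everything around this function — scaling from zero to thousands of concurrent copies, retries, billing per invocation — is the platform's job.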

The five canonical platforms

Platform                       Runtimes                         Cold start
AWS Lambda                     Node, Python, Go, Java, custom   50–500 ms
Google Cloud Functions / Run   Node, Python, Java, .NET         100–800 ms
Azure Functions                .NET, Node, Python, Java         100–1500 ms
Cloudflare Workers             JS / WASM (V8 isolates)          < 5 ms
Vercel / Netlify Functions     Lambda or edge                   variable

Where FaaS shines

  • Spiky / event-driven workloads (Slack bots, webhooks, ETL triggers)
  • Glue between managed services (S3 → Lambda → DynamoDB)
  • Long-tail APIs where idle cost >> active cost

Where FaaS hurts

  • Cold starts — first hit after idle is slow, breaks SLOs
  • Long-running jobs — most platforms cap at 5–15 minutes
  • Sticky / stateful workloads — no in-memory cache between invocations
  • Local dev — emulators are good but never quite the cloud

Edge FaaS — a different beast

Cloudflare Workers / Vercel Edge / Deno Deploy run V8 isolates instead of containers. Cold start is ~ms, but you lose Node APIs, native binaries, and long-running connections. Best for low-latency request rewriting, A/B routing, auth, geofencing — not heavy compute.

The "serverless monolith" trap

Resist the urge to ship 200 Lambdas where 5 services would do. Per-function complexity is real — observability, deploys, IAM, cold starts, dependency footprint. Function granularity should match your change boundary, not your route table.

10

AIaaS — AI / LLM-as-a-Service

The newest layer (~2020). The provider runs the model — increasingly, the entire agent — and gives you an API. Pricing is per token, per second, per request, or per agent-step.

Three sub-layers

  • Inference-as-a-Service — OpenAI, Anthropic, Google, Together, Groq, Fireworks, Replicate. Bring a prompt, get tokens.
  • Embedding & RAG-as-a-Service — Pinecone, Weaviate Cloud, Turbopuffer, Vespa Cloud. Bring documents, get a search API.
  • Agents-as-a-Service — Bedrock Agents, Vertex AI Agent Builder, OpenAI Assistants, LangSmith Hub. Bring a goal, get an agent.

Hyperscaler model gardens

  • AWS Bedrock — Anthropic, Meta, Mistral, Cohere, Amazon Nova
  • GCP Vertex AI — Gemini, Anthropic, Meta, partner zoo
  • Azure OpenAI — OpenAI under Microsoft compliance perimeter

Why AIaaS is its own *aaS

  • The shared-responsibility model is different — your prompt and tokenised output are data, but inference happens inside someone else's model. Whose data is the model's hidden state?
  • Pricing isn't per-second-of-VM — it's per token, with caching tiers that can be 10× cheaper. See prompt caching.
  • Latency is a first-class concern (TTFT, tokens-per-second).
  • Compliance has new shapes — model-card transparency, BYOK, no-training options, EU AI Act.
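A back-of-envelope sketch of why per-token pricing changes the cost model — the prices below are per-million-token placeholders, not any vendor's real rates, and the "~10× cheaper" cached tier is the caching effect mentioned above:

```python
def llm_call_cost(in_tokens, out_tokens, cached_tokens=0,
                  in_price=3.00, out_price=15.00, cached_price=0.30):
    """Estimate one call's cost in USD. Prices are hypothetical
    per-million-token placeholders; cached input tokens bill at
    the (much cheaper) caching-tier rate."""
    m = 1_000_000
    uncached = in_tokens - cached_tokens
    return (uncached * in_price
            + cached_tokens * cached_price
            + out_tokens * out_price) / m

# A 50-step agent loop re-sending a 20k-token context each step:
no_cache = sum(llm_call_cost(20_000, 500) for _ in range(50))
cached   = sum(llm_call_cost(20_000, 500, cached_tokens=18_000) for _ in range(50))
print(f"no cache ${no_cache:.2f} vs cached ${cached:.2f}")
```

The loop structure is what a per-second VM price never punished: the same context re-billed on every step, unless the caching tier absorbs it.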

Companion decks

Deck 06 covers managed LLM services in depth. For self-hosted alternatives see Local LLM Hosting.

11

A Brief History — 1999 → 2026

1999 Salesforce launches · 2006 EC2 / S3 — IaaS born · 2008 App Engine, Heroku — PaaS · 2010 Azure GA, OpenStack · 2014 AWS Lambda — FaaS arrives · 2017 K8s 1.0 → ECS, Fargate, Cloud Run · 2020 OpenAI API — AIaaS layer · 2024+ Agents-aaS, MCP, EU AI Act. Twenty-five years from "rent a CRM by the seat" to "rent an agent by the step" — every layer added on top, none removed.

The pre-2006 world

  • 1999: Salesforce — first SaaS at scale; "no software" as a slogan
  • 2002: AWS-the-internal-team forms inside Amazon
  • Hosted apps existed (ASPs in the 90s) but lacked the NIST essentials — no API, no elasticity, no per-second billing

2006–2010 — the great unbundling

  • 2006: EC2 + S3 redefine "hosting" as IaaS
  • 2008: Heroku ships git push heroku master
  • 2008: App Engine — Google's PaaS bet
  • 2010: OpenStack formalises private-cloud IaaS

2014–2020 — containers and functions eat IaaS

  • 2014: Lambda launches — FaaS goes mainstream
  • 2015: Kubernetes 1.0; ECS already 1 year old
  • 2017: Fargate — serverless containers
  • 2019: Cloud Run — request-billed serverless containers

2020+ — the AI layer

  • 2020: GPT-3 API; AIaaS arrives quietly
  • 2023: ChatGPT, Bedrock, Vertex AI, Azure OpenAI all GA
  • 2024–2025: Agents-as-a-Service, MCP, evaluation-as-a-service
  • 2024–2025: EU AI Act (in force Aug 2024) and US executive orders reshape AIaaS compliance
12

Pricing Models — Five Shapes

Shape                      Unit                          Best fit                         Trap
Pay-as-you-go (PAYG)       second / GB / request         variable, unpredictable load     infinite spend if a loop runs away
Reserved / committed-use   1- or 3-year commitment       steady-state baseline            locked in if your shape changes
Spot / preemptible         auction-priced VM-second      fault-tolerant batch, training   can be reclaimed mid-job with 30 s warning
Per-seat subscription      $N / user / month             SaaS                             every "license" gets shared internally
Per-token / per-call       $N / 1M tokens or $N / call   AIaaS                            retries, agent loops, long contexts blow it up
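The PAYG-vs-reserved choice reduces to a utilisation break-even, sketched below with a hypothetical $0.10/hr instance and a 40% commitment discount (real discounts vary by term and instance family):

```python
def monthly_cost_payg(hourly_rate, hours_used):
    """Pay only for hours actually used."""
    return hourly_rate * hours_used

def monthly_cost_reserved(hourly_rate, discount=0.40, hours_in_month=730):
    """A commitment bills every hour, used or not, at a discounted rate."""
    return hourly_rate * (1 - discount) * hours_in_month

rate = 0.10  # hypothetical on-demand $/hr
for utilisation in (0.2, 0.6, 0.9):
    payg = monthly_cost_payg(rate, 730 * utilisation)
    reserved = monthly_cost_reserved(rate)
    print(f"{utilisation:.0%} utilised: PAYG ${payg:.2f} vs reserved ${reserved:.2f}")
# Break-even sits at utilisation == 1 - discount (60% here):
# below it PAYG wins, above it the commitment wins.
```

That break-even is the whole FinOps game in one line: the discount only pays off if your shape stays steady enough to use it.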

The three eras of cloud cost

  • 2006–2014: cloud is cheap — laptops vs racks
  • 2014–2022: FinOps emerges — commitment discounts, rightsizing, idle reaping
  • 2022+: repatriation debate — DHH at 37signals, Dropbox earlier, Stack Overflow — large steady-state workloads moving back on-prem

What you actually pay for (the long tail)

  • Egress — bytes leaving the cloud. AWS/GCP/Azure all charge roughly $0.05–0.12/GB, so pushing 100 TB/month from S3 to the internet runs on the order of $9K
  • Cross-AZ traffic — billed even between two of your instances
  • NAT gateways — flat hourly + per-GB; surprise on a busy week
  • Public IPv4 addresses — chargeable from Feb 2024 (~$4/month each)
  • Idle managed services — RDS doesn't scale to zero; one forgotten test cluster = real money
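The long-tail items above add up faster than intuition suggests. A quick estimator, using illustrative order-of-magnitude rates (not current list prices — check your provider's pricing page):

```python
# Illustrative rates only -- not any provider's current price list.
EGRESS_PER_GB   = 0.09    # cloud -> internet
CROSS_AZ_PER_GB = 0.01    # between your own instances, each direction
NAT_HOURLY      = 0.045   # NAT gateway, flat hourly
NAT_PER_GB      = 0.045   # NAT gateway, per GB processed
IPV4_MONTHLY    = 3.60    # ~ $0.005/hr per public IPv4

def monthly_network_bill(egress_gb, cross_az_gb, nat_gb, public_ips, hours=730):
    """Sum the 'long tail' network line items for one month."""
    return (egress_gb * EGRESS_PER_GB
            + cross_az_gb * CROSS_AZ_PER_GB
            + NAT_HOURLY * hours + nat_gb * NAT_PER_GB
            + public_ips * IPV4_MONTHLY)

# A modest service: 2 TB egress, 5 TB chatty cross-AZ traffic,
# 1 TB through NAT, 4 public IPs:
print(f"${monthly_network_bill(2048, 5120, 1024, 4):,.2f}")
```

None of these appear on the compute line item, which is why they are the usual surprise on the first real bill.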

Cloud egress cartel — and the cracks

The EU Data Act (applicable from 12 Sep 2025) phases out "switching charges" for customers leaving a cloud — they must be free from Jan 2027. AWS/GCP/Azure responded with limited free egress on full-account exit. Cloudflare's R2 (zero egress fees) has been pressuring this since 2022.

13

Single-Tenant vs Multi-Tenant

Every *aaS layer makes a choice: do all customers share one running stack (multi-tenant) or does each get their own (single-tenant)? This is the most important architectural decision in cloud after "which region".

Multi-tenant — the default for SaaS

  • One DB schema, one app deployment; tenant ID is a column on every table
  • Highest density, lowest unit cost — Salesforce, Slack, Notion
  • Failure-domain risk: one bad query, every customer suffers ("noisy-neighbour")
  • Hardest part: tenant isolation — the one bug that returns Tenant B's data to Tenant A is existential
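The pool pattern in miniature, using sqlite3 so it runs anywhere: one shared table, tenant_id on every row, and every query scoped by it. The schema and tenant names are invented for illustration; the bug class to fear is any query path that forgets the tenant filter.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE invoices (tenant_id TEXT, amount REAL)")
db.executemany("INSERT INTO invoices VALUES (?, ?)",
               [("acme", 100.0), ("acme", 250.0), ("globex", 999.0)])

def invoices_for(tenant_id):
    """Tenant-scoped read: parameterised, and ALWAYS filtered by tenant."""
    rows = db.execute(
        "SELECT amount FROM invoices WHERE tenant_id = ?", (tenant_id,)
    ).fetchall()
    return [amount for (amount,) in rows]

print(invoices_for("acme"))  # globex's rows never appear
```

At scale, teams enforce this with query builders that inject the tenant predicate automatically, or with row-level security in Postgres — relying on every developer remembering the `WHERE` clause is how Tenant B's data ends up in Tenant A's export.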

Single-tenant — the enterprise upcharge

  • Each customer gets their own DB, their own deployment, sometimes their own VPC
  • Higher per-customer cost, lower density
  • Sold as "isolated" / "dedicated" / "Enterprise"
  • Mandatory for regulated workloads — HIPAA BAA, FedRAMP, EU sovereign clouds

The four real-world patterns

Pattern                                      Isolation   Density
Pool — one stack, tenant-id everywhere       logical     highest
Bridge — shared compute, isolated DB schema  db-level    high
Silo — full per-tenant stack                 full        low
Hybrid — pool by default, silo for whales    tiered      tuned

Deck 04 of this series goes deep on these.

The most common failure

Starting pool, growing past the point where one bad customer can take down everyone, and then spending two years bolting silo onto pool. Build the silo escape hatch from day one even if you never use it.

14

Vendor Matrix — Who Sells Each Layer

Layer | AWS | GCP | Azure | Cloudflare | Specialist
IaaS | EC2, EBS, VPC, S3 | Compute Engine, GCS, VPC | VMs, VNets, Blob | — | Hetzner, Linode, Vultr, OVH
CaaS | ECS, Fargate, App Runner, EKS | Cloud Run, GKE | Container Apps, AKS | Containers (2025) | Fly.io, Koyeb, Railway, Northflank
PaaS | Beanstalk, Amplify, Lightsail | App Engine | App Service, Static Web Apps | Pages | Heroku, Render, Vercel, Netlify, Railway
FaaS | Lambda, Step Functions | Cloud Functions, Cloud Run jobs | Functions, Logic Apps | Workers, Durable Objects | Vercel Functions, Deno Deploy
Managed K8s | EKS, EKS Auto Mode | GKE Autopilot | AKS | — | DigitalOcean, Civo, Linode, Scaleway
SaaS infra (DBaaS) | RDS, DynamoDB, Aurora | Cloud SQL, Spanner, Firestore | SQL, Cosmos DB | D1, KV, R2, Hyperdrive | Neon, PlanetScale, Supabase, MongoDB Atlas
AIaaS | Bedrock, SageMaker | Vertex AI, Gemini API | Azure OpenAI, AI Foundry | Workers AI, AI Gateway | OpenAI, Anthropic, Together, Groq, Fireworks, Replicate

The hyperscaler gravity

AWS / GCP / Azure are the only providers that span every row. The bargain: deeper integration (and bigger bills) than any specialist.

The specialist play

Each specialist owns one row and tries to be measurably better than the hyperscaler in DX, price, or geography. Vercel for frontend PaaS, Fly for global containers, Cloudflare for edge, Anthropic/OpenAI for AIaaS, PlanetScale/Neon for DBaaS.

15

The "Managed-Services Gradient"

Real systems live in multiple *aaS layers at once. A typical SaaS app uses VMs (IaaS) for batch ML, Cloud Run (CaaS) for the API, S3 (storage-as-a-service) for files, RDS (database-as-a-service) for state, Cognito (identity-as-a-service) for auth, and Bedrock (AIaaS) for the smart features.

on-prem → IaaS → CaaS → PaaS / FaaS → SaaS → AIaaS — from max control, max ops ("we own it") to min control, min ops ("we configure it")

The healthy mix

  • Default to the highest layer that meets your need
  • Drop a layer only when you've hit a real ceiling (cost, latency, control)
  • Most workloads get all the operational maturity they need at PaaS / CaaS
  • IaaS is where the rough edges are — only choose it deliberately

Where the gradient breaks

  • Stateful workloads — managed services exist (RDS, Cloud SQL, Spanner) but with steep cost and feature gaps
  • GPU compute — IaaS-only on hyperscalers; specialists (Modal, RunPod, Lambda Labs) sell GPU-as-a-Service
  • Realtime / very low-latency — sometimes cheaper to own the box
16

Capex vs Opex — What Cloud Actually Changes

Cloud doesn't make compute cheaper — it makes it elastic. Whether that's net-cheaper depends on how steady your workload is.

Pre-cloud — capex

  • Buy hardware up front, depreciate over 3–5 years
  • Provision for peak — Black Friday, end-of-quarter
  • Most racks ran at 10–20% utilisation
  • Lead time for new capacity: 6–12 weeks

Cloud — opex

  • Pay per second / GB / request
  • Provision for now; scale up / down on demand
  • Utilisation can hit 70–90% with autoscaling and spot
  • Lead time: seconds

When cloud is cheaper

  • Spiky / seasonal load
  • Early-stage, unknown demand curve
  • Many small services (managed glue is real)
  • Anywhere capex approval is slow

When on-prem wins

  • Steady, predictable, large workloads (DHH 37signals: ~$2M/yr saved)
  • Heavy egress (CDN, video) — cloud egress is the killer
  • Specialised hardware (FPGAs, custom ASICs, large GPU farms)
  • Strong cost-engineering culture & ops staff already in place
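The capex/opex trade reduces to a utilisation curve. A sketch with invented numbers (a $40k server amortised over 4 years vs a $4/hr rented equivalent — real figures vary enormously by hardware and provider):

```python
def onprem_monthly(capex, lifetime_months=48, ops_monthly=0.0):
    """Straight-line amortisation of purchased hardware plus ops labour."""
    return capex / lifetime_months + ops_monthly

def cloud_monthly(hourly_rate, utilisation, hours=730):
    """Rented compute: you pay only for the hours actually used."""
    return hourly_rate * hours * utilisation

# Hypothetical GPU box: $40k up front vs $4/hr rented.
for u in (0.1, 0.5, 1.0):
    print(f"{u:.0%} utilised: cloud ${cloud_monthly(4.0, u):,.0f}"
          f" vs on-prem ${onprem_monthly(40_000):,.0f}")
# Low utilisation: renting wins. Near-constant load: owning wins.
```

This is the whole repatriation argument in three lines: the crossover moves with utilisation, and companies with steady 24/7 load sit on the owning side of it.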

Hidden cost

The ops team that no longer racks servers still has to write Terraform, audit IAM, debug VPC routes, and run on-call. Cloud moves work; it doesn't always remove it.

17

Data Gravity & Vendor Lock-in

The compute layer is portable; the data is not. Once you have a petabyte in S3, every other workload wants to be where the data is — and getting it out costs egress, time, and rewrites.

Lock-in shapes

  • Data lock-in — bytes and the egress bill
  • API lock-in — DynamoDB, Cosmos, Spanner have no equivalents
  • Operational lock-in — IAM, secrets, observability tooling
  • Skill lock-in — your team only knows AWS
  • Compliance lock-in — re-certifying takes 9+ months

Mitigations (partial)

  • Stick to portable interfaces — Postgres > Aurora-only features; Kafka > Kinesis-only features; OpenTelemetry > vendor-specific tracing
  • Containerise everything — same image runs on any CaaS
  • Treat IaC (Terraform, Pulumi) as your contract with the cloud
  • Multi-cloud only when it pays for itself; usually it doesn't

When lock-in is a feature

  • Spanner, BigQuery, DynamoDB single-digit-ms reads — nothing portable matches them
  • Cloudflare Workers' <5ms cold start — V8 isolates aren't portable to Lambda
  • Bedrock's compliance perimeter for HIPAA / FedRAMP-aligned LLM use

Lock-in is a tax. Sometimes it buys something worth the tax.

Multi-cloud — the false escape hatch

"Multi-cloud" usually means "we have two single-cloud deployments and double the ops burden". Use it when regulation demands it (sovereignty, two-vendor rule for critical infra), or when one workload genuinely fits a different provider — not as an architectural default.

18

Sovereignty & Regulated Regions

Why "where" matters as much as "what"

  • GDPR (EU) — personal data of EU residents stays protected wherever it goes, cloud regions included; transfers outside the EU/EEA need SCCs or the EU–US DPF
  • UK Data Protection Act 2018 — post-Brexit GDPR equivalent, with the EU adequacy decision still in force as of 2025
  • US — sectoral: HIPAA (health), FedRAMP (federal), CJIS (criminal-justice), ITAR (defence)
  • China & Russia — data localisation is mandatory; foreign cloud presence is restricted

Sovereign clouds (a 2024–2026 wave)

  • AWS European Sovereign Cloud — GA late 2025, German-staffed, EU-only data plane
  • Microsoft Cloud for Sovereignty + Azure Local
  • Google Sovereign Solutions with T-Systems / Thales / Minsait partners
  • OVHcloud, Scaleway, Aruba — EU-headquartered hyperscalers
  • Bleu (Capgemini × Orange × Microsoft for France)

What "in-region" actually guarantees

  • Data at rest stays in the region's storage — yes
  • Control plane traffic stays — not always; some metadata routes via the US
  • Provider personnel access — depends on the SKU; "sovereign" tiers add personnel-jurisdiction guarantees
  • Subpoena risk under the US CLOUD Act — applies to any US-headquartered provider regardless of region; this is the exposure sovereign clouds are built to remove

The schism that did not happen

2018 fears that GDPR would force the internet into national silos largely didn't materialise — but the AI Act (2024), the EU Data Act (2025) and rising geopolitics are slowly producing the layered, residency-first cloud market they predicted. Architect for it now.

19

Decision Tree — Pick a Layer

need to ship something →
  smart features? — yes → AIaaS (deck 06)
  need control of OS / kernel? — yes → IaaS (deck 02)
  selling software to others? — yes → SaaS (deck 04)
  spiky & event-driven? — yes → FaaS (deck 03)
  otherwise → PaaS / CaaS (deck 03)
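The same questions can be linearised as code — one reading of the slide's tree, for illustration only (real decisions also weigh cost, latency, and team shape):

```python
def pick_layer(smart_features=False, need_os_control=False,
               selling_software=False, spiky_event_driven=False):
    """One linearisation of the decision tree: first matching
    question wins, PaaS/CaaS is the default answer."""
    if smart_features:
        return "AIaaS"        # deck 06
    if need_os_control:
        return "IaaS"         # deck 02
    if selling_software:
        return "SaaS"         # deck 04
    if spiky_event_driven:
        return "FaaS"         # deck 03
    return "PaaS / CaaS"      # deck 03

print(pick_layer(spiky_event_driven=True))  # FaaS
```

Note the default: when no special constraint fires, you land on the managed middle of the gradient, which is exactly the "highest layer that meets your need" rule from earlier.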
20

Common Anti-Patterns

"Lift & shift" without re-architecting

Lift your entire data centre onto VMware-on-AWS in 18 months, then act surprised when the bill doubles. The real cost wins live one or two layers up — migrate the load-bearing services to PaaS/CaaS as you go.

"Multi-cloud from day one"

Two clouds = double the IAM model, double the network, double the on-call, half the depth on each. Solve your cloud well first.

"Serverless monolith"

200 Lambdas where 5 services would do. Every function deploys independently, but they share a database and tightly-coupled code paths. You've shipped a distributed monolith with extra cold starts.

"Build platform on Kubernetes"

If you're not Spotify or Goldman Sachs, you don't need a platform team running EKS, Istio, ArgoCD, Flux, OPA, Crossplane, Backstage. Cloud Run / App Runner / Container Apps will serve your team far better.

"Free tier means free at scale"

Most "$0 forever" tiers cap at 1k requests/day; surprise spend hits at the 99th-percentile day. Set budget alarms and request quotas before you ship.

"AIaaS = unlimited intelligence"

Tokens are expensive, latency is real, agents loop, and prompts leak. Treat AIaaS calls like external API calls — observability, circuit breakers, caching, cost ceilings. Deck 06 covers this.
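A minimal in-process version of the "cost ceilings" point: wrap every AIaaS call in a spend guard so a looping agent fails closed instead of running up the bill. Names and the per-step price are hypothetical; production systems also want provider budget alerts and gateway-level limits.

```python
class BudgetExceeded(RuntimeError):
    pass

class CostCeiling:
    """Hard spend cap for a batch of AIaaS calls: refuse any charge
    that would push cumulative spend past the ceiling."""
    def __init__(self, max_usd):
        self.max_usd = max_usd
        self.spent = 0.0

    def charge(self, usd):
        if self.spent + usd > self.max_usd:
            raise BudgetExceeded(f"would exceed ${self.max_usd:.2f} ceiling")
        self.spent += usd

ceiling = CostCeiling(max_usd=1.00)
try:
    for step in range(1000):      # a runaway agent loop...
        ceiling.charge(0.03)      # ...charged a hypothetical $0.03/step
except BudgetExceeded:
    print(f"stopped after {step} steps, ${ceiling.spent:.2f} spent")
```

The same wrapper is where circuit breaking and caching naturally attach — the point is that the ceiling lives in your code path, not only on the provider's billing dashboard.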

21

Summary & What's Next

Three takeaways

  1. The *aaS layers form a spectrum of "how much of the stack you rent". Pick the highest layer that meets your need.
  2. Cloud doesn't make compute cheaper — it makes it elastic. Whether that's net-cheaper depends on workload shape and cost discipline.
  3. Lock-in, sovereignty, and the shared-responsibility model are architectural decisions you can't fix later cheaply.

Series ahead

  • 02 IaaS Foundations — VMs, VPC, storage, regions
  • 03 PaaS / FaaS / CaaS — managed compute in depth
  • 04 SaaS Architecture — multi-tenancy, B2B identity, metering
  • 05 Cloud Security — IAM, secrets, network, compliance
  • 06 LLM-as-a-Service — providers, RAG-aaS, agents-aaS, governance

Companion decks

One sentence

"The cloud is a stack of rented layers; the architect's job is to pick the layer where the marginal control you keep is worth more than the operational tax it costs."