CLOUD SERVICE MODELS · PART 1 OF 6

The *aaS Series
Service Models

IaaS · PaaS · SaaS · CaaS · FaaS · AIaaS — the taxonomy that runs the modern internet

What each aaS layer actually is · the shared-responsibility model · 1999 → 2026 history · pricing · vendor matrix · how to choose.

Taxonomy  ·  Responsibility  ·  History  ·  Choice
01

Topics

Foundations

  • What "the cloud" actually is — and what it isn't
  • The shared-responsibility model
  • 1999 → 2026 timeline of cloud service models
  • Pricing models — usage, seat, tier, commit

The six layers

  • IaaS — Infrastructure-as-a-Service
  • PaaS — Platform-as-a-Service
  • CaaS — Containers-as-a-Service
  • FaaS — Functions / serverless
  • SaaS — Software-as-a-Service
  • AIaaS — AI / LLM-as-a-Service

Trade-offs

  • Single-tenant vs multi-tenant
  • Capex vs opex — what cloud actually changes
  • Data gravity & vendor lock-in
  • Sovereignty & regulated regions

Choosing & landscape

  • Vendor matrix — AWS / GCP / Azure / Cloudflare / Fly / Vercel
  • The "managed-services gradient"
  • A decision tree: pick a layer
  • Common anti-patterns & pointer to the rest of the series
02

What "The Cloud" Actually Is

The cloud is someone else's computer, rented by the second, behind an API. Everything else — IaaS, PaaS, SaaS — is a question of how much of the stack you rent versus run.

NIST's five essentials (SP 800-145, 2011)

  • On-demand self-service — provision via API, no human in the loop
  • Broad network access — reachable from anywhere
  • Resource pooling — physical hardware shared by many tenants
  • Rapid elasticity — scale up and down minute-to-minute
  • Measured service — usage is metered; you pay only for what you use

If a thing lacks any one of these, it isn't cloud — it's outsourced hosting.

The three deployment models

  • Public cloud — AWS / GCP / Azure / Cloudflare. Pay-as-you-go, multi-tenant by default.
  • Private cloud — same APIs, your hardware (OpenStack, VMware Cloud, Outposts).
  • Hybrid & multi-cloud — workloads split across both, often for sovereignty or burst capacity.

"Cloud-native" ≠ "in the cloud"

Lifting a VM image to EC2 is "in the cloud" but not "cloud-native". Cloud-native means designed to survive instance death, scale horizontally, and be deployed continuously — see CNCF for the canonical definition.

What the cloud is not

It is not magically secure, not magically cheap, and not magically reliable. Each of those is a property you have to design in — see decks 04, 05, and the CI/CD deck.

03

The *aaS Spectrum — What You Manage

Reading bottom-to-top: as you climb the stack, the provider takes more responsibility and you lose more control. Pick the highest layer that still gives you what you need.

[Stack diagram: layers from Building & power → Net & Storage → Servers → Virtualisation → OS → Container → Runtime → Data → Application, across the columns On-prem · IaaS · CaaS · PaaS · FaaS · SaaS · AIaaS — each cell shaded by who manages it: you, the provider, or shared (e.g. AIaaS data ↔ provider).]
04

Shared Responsibility — The Most Misread Diagram

The shared-responsibility model says "cloud security is a partnership". The provider runs security of the cloud; you run security in the cloud. The line moves up the stack as you move up the *aaS ladder.

What the provider always handles

  • Physical data-centre security & power
  • Hypervisor patching
  • Top-of-rack network & cross-region transit
  • Hardware decommissioning & media destruction
  • The bricks-and-mortar of compliance — they give you the SOC 2 Type II report; you have to use it correctly

What you always handle

  • Your data (encryption keys, classification, retention)
  • Your identities & access policies
  • Your client-side configuration & secrets
  • How you grant access to the data — see deck 05

What moves with the layer

Layer   OS         Runtime    App        Data
IaaS    you        you        you        you
CaaS    provider   you        you        you
PaaS    provider   provider   you        you
FaaS    provider   provider   you        you
SaaS    provider   provider   provider   shared

The classic mistake

"It's in S3, AWS handles security." S3 is multi-tenant object storage: AWS secures the storage substrate, but the headline S3 leaks of the last decade — Verizon, Accenture, the Pentagon, FedEx — were customer misconfigurations (public buckets, over-broad policies), not provider failures; even Capital One came down to a customer-side firewall and IAM misconfiguration. Deck 05 covers this in detail.
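To make the misconfiguration class concrete, here is a minimal sketch that scans an S3-style bucket policy document for statements granting access to everyone. It is illustrative only — real audits should lean on AWS's own tools (S3 Block Public Access, IAM Access Analyzer); the policy shown is a made-up example of the leaky shape.

```python
import json

def public_statements(policy_json: str) -> list:
    """Return Allow statements that grant access to any principal,
    unconditionally -- the classic public-bucket misconfiguration."""
    policy = json.loads(policy_json)
    flagged = []
    for stmt in policy.get("Statement", []):
        principal = stmt.get("Principal")
        is_public = principal == "*" or (
            isinstance(principal, dict) and principal.get("AWS") == "*"
        )
        if stmt.get("Effect") == "Allow" and is_public and "Condition" not in stmt:
            flagged.append(stmt)
    return flagged

leaky = """{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": "*",
    "Action": "s3:GetObject",
    "Resource": "arn:aws:s3:::my-bucket/*"
  }]
}"""
print(len(public_statements(leaky)))  # 1 flagged statement
```

The point: the storage substrate is the provider's problem; this JSON document — and the blast radius when it says `"Principal": "*"` — is yours.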

05

IaaS — Infrastructure-as-a-Service

You rent the virtual hardware — VMs, virtual networks, virtual disks — and assemble everything on top yourself. The closest cloud equivalent of "a server in a rack".

What you get

  • Compute — a Linux/Windows VM you SSH into, a chosen CPU/RAM shape, a region, an availability zone.
  • Networking — a VPC, subnets, routes, security groups, public IPs, load balancers.
  • Storage — block volumes (EBS), object stores (S3), file shares (EFS).
  • Identity — IAM users / roles / policies that govern everything.

Canonical examples

  • AWS EC2, EBS, VPC, IAM, S3
  • GCP Compute Engine, Persistent Disk, VPC, GCS
  • Azure Virtual Machines, Managed Disks, VNets
  • Linode / Vultr / Hetzner — cheaper, simpler IaaS

When IaaS still wins

  • You need very specific kernels, drivers, or kernel modules (eBPF, GPU, RDMA, real-time).
  • You're regulated and the auditor wants "we own and patch the OS" in writing.
  • You have a stateful workload that doesn't fit a managed pattern (legacy ERP, MPI cluster).
  • You want maximum negotiating leverage on cost (3-year reserved instances, savings plans).

The hidden cost

An EC2 instance is "cheap" until you realise you also need: patching, monitoring, log shipping, backups, a bastion, two of everything across AZs, and a person to wake up at 03:00. Deck 02 spells out the operational footprint.

The rule of thumb

If you can do the job at PaaS or above, do. IaaS is the floor — useful when you need it, expensive when you don't.

06

PaaS — Platform-as-a-Service

You give the provider a git push (or an artefact); they give you a running URL. The provider owns the OS, the runtime, the build pipeline, the load balancer, the autoscaler, the rolling deploy, the SSL cert, and the logs UI.

The "Heroku-style" PaaS

  • Heroku — the original (2007). Buildpacks, Procfile, dynos.
  • Render — modern Heroku spiritual successor; native HTTPS, free tier, preview environments.
  • Fly.io — Heroku-style DX, multi-region by default, Firecracker VMs under the hood.
  • Railway — generous free tier, Postgres-and-friends in a click.

Hyperscaler PaaS

  • AWS Elastic Beanstalk, App Runner, Amplify Hosting, Lightsail Containers
  • GCP App Engine (Standard & Flexible)
  • Azure App Service, Static Web Apps, Container Apps

Frontend-shaped PaaS

  • Vercel — Next.js native, edge functions, ISR, image optimisation
  • Netlify — JAMstack, Functions, Edge Handlers, Forms
  • Cloudflare Pages — globally distributed static + Workers backend

The PaaS bargain

You trade control for velocity. The platform decides your build image, your config surface, your tail-latency budget, your scaling algorithm. In exchange you ship in minutes and never touch a hypervisor.

Deck 03 covers PaaS in detail with side-by-side examples.

07

SaaS — Software-as-a-Service

The provider runs a finished application on your behalf, multi-tenant, accessed over the network — usually a browser or an API. You log in, you use it, you pay a subscription. This is the part the user actually sees.

Examples (you use most of them)

  • Communication — Slack, Microsoft 365, Zoom, Gmail
  • CRM & sales — Salesforce (the canonical SaaS, 1999), HubSpot, Pipedrive
  • Dev tools — GitHub, GitLab, Linear, Sentry, Datadog
  • Finance — Stripe, Xero, QuickBooks Online
  • Storage — Dropbox, Box, Google Drive, OneDrive

What makes it SaaS (not just hosted software)

  • Multi-tenant by default — one shared database, tenant-scoped queries
  • Continuously deployed — same version for everyone (mostly)
  • Usage- or seat-priced subscription billing
  • Self-service signup & cancellation

B2C vs B2B SaaS

  • B2C — individual users pay with a card; signup is friction. Notion, Spotify.
  • B2B — organisations pay; SSO, SCIM, audit logs, BYO domain, BAA, DPA, custom MSA. Slack Enterprise Grid.
  • Same code, very different go-to-market — and very different identity stories. Deck 04 covers B2B identity in detail.

SaaS architecture is its own discipline

  • Multi-tenancy patterns (silo / pool / bridge)
  • Per-tenant observability & rate-limiting
  • Metering & billing data models
  • Identity federation (SAML, SCIM, OIDC)
  • Data residency & compliance

→ Deck 04 of this series. Existing companion: Monetising & Distributing Software for the business side.

08

CaaS — Containers-as-a-Service

Halfway between IaaS and PaaS. You bring a container image; the provider runs it, scales it, networks it. No OS to patch, no buildpack opinions, no servers — but unlike PaaS, everything inside the container image is yours to build and patch.

"Run my container" services

  • Google Cloud Run — request-billed, scale-to-zero, <1s cold start
  • AWS ECS on Fargate — task-based, no EC2 to manage
  • Azure Container Apps — KEDA-driven autoscale, Dapr built in
  • Fly Machines — micro-VMs, multi-region, <500ms cold start
  • Cloudflare Containers — globally distributed, scale-to-zero (2025)

Managed Kubernetes (the heavyweight CaaS)

  • EKS / GKE / AKS — provider runs the control plane, you run the workloads
  • Cluster-as-a-Service (cluster-ops) is its own emerging layer above plain managed K8s

When CaaS beats PaaS

  • Polyglot stack — your runtime is not on the PaaS menu
  • You need exact control over the image (CVE patching, FIPS mode)
  • You want portability — the same image runs anywhere Docker runs

When CaaS beats Kubernetes

  • You don't have a platform team
  • You don't need workload-level customisation (PSPs, custom CSI, CRDs)
  • Scale-to-zero and per-request billing matter for cost

The trap

"Container" doesn't mean "stateless". CaaS providers love stateless HTTP services and hate everything else. Stateful workloads (databases, queues, GPU training) push you back to IaaS or specialised managed services.

09

FaaS — Functions / Serverless

You upload a function. It runs when an event arrives — an HTTP request, a queue message, a file upload, a cron tick — and is billed per invocation, often to the millisecond. The platform handles scaling from zero to thousands of concurrent invocations and back.
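A minimal sketch of the model: the `(event, context)` signature below is AWS Lambda's Python convention, and the sample event is a trimmed-down S3 "object created" notification — the platform invokes the function once per event, with no server for you to manage.

```python
def handler(event, context=None):
    """Lambda-style entry point: one stateless unit of work per event."""
    # Pull what you need from the event payload; each invocation starts fresh.
    records = event.get("Records", [])
    keys = [r["s3"]["object"]["key"] for r in records]
    return {"processed": len(keys), "keys": keys}

# A trimmed S3 notification, as the platform would deliver it:
sample_event = {
    "Records": [
        {"s3": {"object": {"key": "uploads/report.pdf"}}},
    ]
}
print(handler(sample_event))  # {'processed': 1, 'keys': ['uploads/report.pdf']}
```

Everything around this function — scaling from zero to thousands of concurrent copies, retries, billing per invocation — is the platform's job.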

The five canonical platforms

Platform                       Runtimes                         Cold start
AWS Lambda                     Node, Python, Go, Java, custom   50–500 ms
Google Cloud Functions / Run   Node, Python, Java, .NET         100–800 ms
Azure Functions                .NET, Node, Python, Java         100–1500 ms
Cloudflare Workers             JS / WASM (V8 isolates)          < 5 ms
Vercel / Netlify Functions     Lambda or edge                   variable

Where FaaS shines

  • Spiky / event-driven workloads (Slack bots, webhooks, ETL triggers)
  • Glue between managed services (S3 → Lambda → DynamoDB)
  • Long-tail APIs where idle cost >> active cost

Where FaaS hurts

  • Cold starts — first hit after idle is slow, breaks SLOs
  • Long-running jobs — most platforms cap at 5–15 minutes
  • Sticky / stateful workloads — no in-memory cache between invocations
  • Local dev — emulators are good but never quite the cloud

Edge FaaS — a different beast

Cloudflare Workers / Vercel Edge / Deno Deploy run V8 isolates instead of containers. Cold start is ~ms, but you lose Node APIs, native binaries, and long-running connections. Best for low-latency request rewriting, A/B routing, auth, geofencing — not heavy compute.

The "serverless monolith" trap

Resist the urge to ship 200 Lambdas where 5 services would do. Per-function complexity is real — observability, deploys, IAM, cold starts, dependency footprint. Function granularity should match your change boundary, not your route table.

10

AIaaS — AI / LLM-as-a-Service

The newest layer (~2020). The provider runs the model — increasingly, the entire agent — and gives you an API. Pricing is per token, per second, per request, or per agent-step.

Three sub-layers

  • Inference-as-a-Service — OpenAI, Anthropic, Google, Together, Groq, Fireworks, Replicate. Bring a prompt, get tokens.
  • Embedding & RAG-as-a-Service — Pinecone, Weaviate Cloud, Turbopuffer, Vespa Cloud. Bring documents, get a search API.
  • Agents-as-a-Service — Bedrock Agents, Vertex AI Agent Builder, OpenAI Assistants, LangSmith Hub. Bring a goal, get an agent.

Hyperscaler model gardens

  • AWS Bedrock — Anthropic, Meta, Mistral, Cohere, Amazon Nova
  • GCP Vertex AI — Gemini, Anthropic, Meta, partner zoo
  • Azure OpenAI — OpenAI under Microsoft compliance perimeter

Why AIaaS is its own *aaS

  • The shared-responsibility model is different — your prompt and tokenised output are data, but inference happens inside someone else's model. Whose data is the model's hidden state?
  • Pricing isn't per-second-of-VM — it's per token, with caching tiers that can be 10× cheaper. See prompt caching.
  • Latency is a first-class concern (TTFT, tokens-per-second).
  • Compliance has new shapes — model-card transparency, BYOK, no-training options, EU AI Act.
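A back-of-envelope sketch of why per-token pricing changes the cost model — the prices below are per-million-token placeholders, not any vendor's real rates, and the "~10× cheaper" cached tier is the caching effect mentioned above:

```python
def llm_call_cost(in_tokens, out_tokens, cached_tokens=0,
                  in_price=3.00, out_price=15.00, cached_price=0.30):
    """Estimate one call's cost in USD. Prices are hypothetical
    per-million-token placeholders; cached input tokens bill at
    the (much cheaper) caching-tier rate."""
    m = 1_000_000
    uncached = in_tokens - cached_tokens
    return (uncached * in_price
            + cached_tokens * cached_price
            + out_tokens * out_price) / m

# A 50-step agent loop re-sending a 20k-token context each step:
no_cache = sum(llm_call_cost(20_000, 500) for _ in range(50))
cached   = sum(llm_call_cost(20_000, 500, cached_tokens=18_000) for _ in range(50))
print(f"no cache ${no_cache:.2f} vs cached ${cached:.2f}")
```

The loop structure is what a per-second VM price never punished: the same context re-billed on every step, unless the caching tier absorbs it.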

Companion decks

Deck 06 covers managed LLM services in depth. For self-hosted alternatives see Local LLM Hosting.

11

A Brief History — 1999 → 2026

1999 Salesforce launches · 2006 EC2 / S3 — IaaS born · 2008 App Engine, Heroku — PaaS · 2010 Azure GA, OpenStack · 2014 AWS Lambda — FaaS arrives · 2017 K8s 1.0 → ECS, Fargate, Cloud Run · 2020 OpenAI API — AIaaS layer · 2024+ Agents-aaS, MCP, EU AI Act. Twenty-five years from "rent a CRM by the seat" to "rent an agent by the step" — every layer added on top, none removed.

The pre-2006 world

  • 1999: Salesforce — first SaaS at scale; "no software" as a slogan
  • 2002: AWS-the-internal-team forms inside Amazon
  • Hosted apps existed (ASPs in the 90s) but lacked the NIST essentials — no API, no elasticity, no per-second billing

2006–2010 — the great unbundling

  • 2006: EC2 + S3 redefine "hosting" as IaaS
  • 2008: Heroku ships git push heroku master
  • 2008: App Engine — Google's PaaS bet
  • 2010: OpenStack formalises private-cloud IaaS

2014–2020 — containers and functions eat IaaS

  • 2014: Lambda launches — FaaS goes mainstream
  • 2015: Kubernetes 1.0; ECS already 1 year old
  • 2017: Fargate — serverless containers
  • 2019: Cloud Run — request-billed serverless containers

2020+ — the AI layer

  • 2020: GPT-3 API; AIaaS arrives quietly
  • 2023: ChatGPT, Bedrock, Vertex AI, Azure OpenAI all GA
  • 2024–2025: Agents-as-a-Service, MCP, evaluation-as-a-service
  • 2024–2025: EU AI Act (in force Aug 2024) and US executive orders reshape AIaaS compliance
12

Pricing Models — Five Shapes

Shape                      Unit                          Best fit                         Trap
Pay-as-you-go (PAYG)       second / GB / request         variable, unpredictable load     infinite spend if a loop runs away
Reserved / committed-use   1- or 3-year commitment       steady-state baseline            locked in if your shape changes
Spot / preemptible         auction-priced VM-second      fault-tolerant batch, training   can be reclaimed mid-job with 30 s warning
Per-seat subscription      $N / user / month             SaaS                             every "license" gets shared internally
Per-token / per-call       $N / 1M tokens or $N / call   AIaaS                            retries, agent loops, long contexts blow it up
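The PAYG-vs-reserved choice reduces to a utilisation break-even, sketched below with a hypothetical $0.10/hr instance and a 40% commitment discount (real discounts vary by term and instance family):

```python
def monthly_cost_payg(hourly_rate, hours_used):
    """Pay only for hours actually used."""
    return hourly_rate * hours_used

def monthly_cost_reserved(hourly_rate, discount=0.40, hours_in_month=730):
    """A commitment bills every hour, used or not, at a discounted rate."""
    return hourly_rate * (1 - discount) * hours_in_month

rate = 0.10  # hypothetical on-demand $/hr
for utilisation in (0.2, 0.6, 0.9):
    payg = monthly_cost_payg(rate, 730 * utilisation)
    reserved = monthly_cost_reserved(rate)
    print(f"{utilisation:.0%} utilised: PAYG ${payg:.2f} vs reserved ${reserved:.2f}")
# Break-even sits at utilisation == 1 - discount (60% here):
# below it PAYG wins, above it the commitment wins.
```

That break-even is the whole FinOps game in one line: the discount only pays off if your shape stays steady enough to use it.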

The three eras of cloud cost

  • 2006–2014: cloud is cheap — laptops vs racks
  • 2014–2022: FinOps emerges — commitment discounts, rightsizing, idle reaping
  • 2022+: repatriation debate — DHH at 37signals, Dropbox earlier, Stack Overflow — large steady-state workloads moving back on-prem

What you actually pay for (the long tail)

  • Egress — bytes leaving the cloud. AWS/GCP/Azure all charge roughly $0.05–0.12/GB, so pushing 100 TB/month from S3 to the internet runs on the order of $9K
  • Cross-AZ traffic — billed even between two of your instances
  • NAT gateways — flat hourly + per-GB; surprise on a busy week
  • Public IPv4 addresses — chargeable from Feb 2024 (~$4/month each)
  • Idle managed services — RDS doesn't scale to zero; one forgotten test cluster = real money
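The long-tail items above add up faster than intuition suggests. A quick estimator, using illustrative order-of-magnitude rates (not current list prices — check your provider's pricing page):

```python
# Illustrative rates only -- not any provider's current price list.
EGRESS_PER_GB   = 0.09    # cloud -> internet
CROSS_AZ_PER_GB = 0.01    # between your own instances, each direction
NAT_HOURLY      = 0.045   # NAT gateway, flat hourly
NAT_PER_GB      = 0.045   # NAT gateway, per GB processed
IPV4_MONTHLY    = 3.60    # ~ $0.005/hr per public IPv4

def monthly_network_bill(egress_gb, cross_az_gb, nat_gb, public_ips, hours=730):
    """Sum the 'long tail' network line items for one month."""
    return (egress_gb * EGRESS_PER_GB
            + cross_az_gb * CROSS_AZ_PER_GB
            + NAT_HOURLY * hours + nat_gb * NAT_PER_GB
            + public_ips * IPV4_MONTHLY)

# A modest service: 2 TB egress, 5 TB chatty cross-AZ traffic,
# 1 TB through NAT, 4 public IPs:
print(f"${monthly_network_bill(2048, 5120, 1024, 4):,.2f}")
```

None of these appear on the compute line item, which is why they are the usual surprise on the first real bill.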

Cloud egress cartel — and the cracks

The EU Data Act (applicable from 12 Sep 2025) phases out "switching charges" for customers leaving a cloud — they must be free from Jan 2027. AWS/GCP/Azure responded with limited free egress on full-account exit. Cloudflare's R2 (zero egress fees) has been pressuring this since 2022.

13

Single-Tenant vs Multi-Tenant

Every *aaS layer makes a choice: do all customers share one running stack (multi-tenant) or does each get their own (single-tenant)? This is the most important architectural decision in cloud after "which region".

Multi-tenant — the default for SaaS

  • One DB schema, one app deployment; tenant ID is a column on every table
  • Highest density, lowest unit cost — Salesforce, Slack, Notion
  • Failure-domain risk: one bad query, every customer suffers ("noisy-neighbour")
  • Hardest part: tenant isolation — the one bug that returns Tenant B's data to Tenant A is existential
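The pool pattern in miniature, using sqlite3 so it runs anywhere: one shared table, tenant_id on every row, and every query scoped by it. The schema and tenant names are invented for illustration; the bug class to fear is any query path that forgets the tenant filter.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE invoices (tenant_id TEXT, amount REAL)")
db.executemany("INSERT INTO invoices VALUES (?, ?)",
               [("acme", 100.0), ("acme", 250.0), ("globex", 999.0)])

def invoices_for(tenant_id):
    """Tenant-scoped read: parameterised, and ALWAYS filtered by tenant."""
    rows = db.execute(
        "SELECT amount FROM invoices WHERE tenant_id = ?", (tenant_id,)
    ).fetchall()
    return [amount for (amount,) in rows]

print(invoices_for("acme"))  # globex's rows never appear
```

At scale, teams enforce this with query builders that inject the tenant predicate automatically, or with row-level security in Postgres — relying on every developer remembering the `WHERE` clause is how Tenant B's data ends up in Tenant A's export.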

Single-tenant — the enterprise upcharge

  • Each customer gets their own DB, their own deployment, sometimes their own VPC
  • Higher per-customer cost, lower density
  • Sold as "isolated" / "dedicated" / "Enterprise"
  • Mandatory for regulated workloads — HIPAA BAA, FedRAMP, EU sovereign clouds

The four real-world patterns

Pattern                                      Isolation   Density
Pool — one stack, tenant-id everywhere       logical     highest
Bridge — shared compute, isolated DB schema  db-level    high
Silo — full per-tenant stack                 full        low
Hybrid — pool by default, silo for whales    tiered      tuned

Deck 04 of this series goes deep on these.

The most common failure

Starting pool, growing past the point where one bad customer can take down everyone, and then spending two years bolting silo onto pool. Build the silo escape hatch from day one even if you never use it.

14

Vendor Matrix — Who Sells Each Layer

Layer | AWS | GCP | Azure | Cloudflare | Specialist
IaaS | EC2, EBS, VPC, S3 | Compute Engine, GCS, VPC | VMs, VNets, Blob | — | Hetzner, Linode, Vultr, OVH
CaaS | ECS, Fargate, App Runner, EKS | Cloud Run, GKE | Container Apps, AKS | Containers (2025) | Fly.io, Koyeb, Railway, Northflank
PaaS | Beanstalk, Amplify, Lightsail | App Engine | App Service, Static Web Apps | Pages | Heroku, Render, Vercel, Netlify, Railway
FaaS | Lambda, Step Functions | Cloud Functions, Cloud Run jobs | Functions, Logic Apps | Workers, Durable Objects | Vercel Functions, Deno Deploy
Managed K8s | EKS, EKS Auto Mode | GKE Autopilot | AKS | — | DigitalOcean, Civo, Linode, Scaleway
SaaS infra (DBaaS) | RDS, DynamoDB, Aurora | Cloud SQL, Spanner, Firestore | SQL, Cosmos DB | D1, KV, R2, Hyperdrive | Neon, PlanetScale, Supabase, MongoDB Atlas
AIaaS | Bedrock, SageMaker | Vertex AI, Gemini API | Azure OpenAI, AI Foundry | Workers AI, AI Gateway | OpenAI, Anthropic, Together, Groq, Fireworks, Replicate

The hyperscaler gravity

AWS / GCP / Azure are the only providers that span every row. The bargain: deeper integration (and bigger bills) than any specialist.

The specialist play

Each specialist owns one row and tries to be measurably better than the hyperscaler in DX, price, or geography. Vercel for frontend PaaS, Fly for global containers, Cloudflare for edge, Anthropic/OpenAI for AIaaS, PlanetScale/Neon for DBaaS.

15

The "Managed-Services Gradient"

Real systems live in multiple *aaS layers at once. A typical SaaS app uses VMs (IaaS) for batch ML, Cloud Run (CaaS) for the API, S3 (storage-as-a-service) for files, RDS (database-as-a-service) for state, Cognito (identity-as-a-service) for auth, and Bedrock (AIaaS) for the smart features.

on-prem → IaaS → CaaS → PaaS / FaaS → SaaS → AIaaS — from max control, max ops ("we own it") to min control, min ops ("we configure it")

The healthy mix

  • Default to the highest layer that meets your need
  • Drop a layer only when you've hit a real ceiling (cost, latency, control)
  • Most workloads get all the operational maturity they need at PaaS / CaaS
  • IaaS is where the rough edges are — only choose it deliberately

Where the gradient breaks

  • Stateful workloads — managed services exist (RDS, Cloud SQL, Spanner) but with steep cost and feature gaps
  • GPU compute — IaaS-only on hyperscalers; specialists (Modal, RunPod, Lambda Labs) sell GPU-as-a-Service
  • Realtime / very low-latency — sometimes cheaper to own the box
16

Capex vs Opex — What Cloud Actually Changes

Cloud doesn't make compute cheaper — it makes it elastic. Whether that's net-cheaper depends on how steady your workload is.

Pre-cloud — capex

  • Buy hardware up front, depreciate over 3–5 years
  • Provision for peak — Black Friday, end-of-quarter
  • Most racks ran at 10–20% utilisation
  • Lead time for new capacity: 6–12 weeks

Cloud — opex

  • Pay per second / GB / request
  • Provision for now; scale up / down on demand
  • Utilisation can hit 70–90% with autoscaling and spot
  • Lead time: seconds

When cloud is cheaper

  • Spiky / seasonal load
  • Early-stage, unknown demand curve
  • Many small services (managed glue is real)
  • Anywhere capex approval is slow

When on-prem wins

  • Steady, predictable, large workloads (DHH 37signals: ~$2M/yr saved)
  • Heavy egress (CDN, video) — cloud egress is the killer
  • Specialised hardware (FPGAs, custom ASICs, large GPU farms)
  • Strong cost-engineering culture & ops staff already in place
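The capex/opex trade reduces to a utilisation curve. A sketch with invented numbers (a $40k server amortised over 4 years vs a $4/hr rented equivalent — real figures vary enormously by hardware and provider):

```python
def onprem_monthly(capex, lifetime_months=48, ops_monthly=0.0):
    """Straight-line amortisation of purchased hardware plus ops labour."""
    return capex / lifetime_months + ops_monthly

def cloud_monthly(hourly_rate, utilisation, hours=730):
    """Rented compute: you pay only for the hours actually used."""
    return hourly_rate * hours * utilisation

# Hypothetical GPU box: $40k up front vs $4/hr rented.
for u in (0.1, 0.5, 1.0):
    print(f"{u:.0%} utilised: cloud ${cloud_monthly(4.0, u):,.0f}"
          f" vs on-prem ${onprem_monthly(40_000):,.0f}")
# Low utilisation: renting wins. Near-constant load: owning wins.
```

This is the whole repatriation argument in three lines: the crossover moves with utilisation, and companies with steady 24/7 load sit on the owning side of it.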

Hidden cost

The ops team that no longer racks servers still has to write Terraform, audit IAM, debug VPC routes, and run on-call. Cloud moves work; it doesn't always remove it.

17

Data Gravity & Vendor Lock-in

The compute layer is portable; the data is not. Once you have a petabyte in S3, every other workload wants to be where the data is — and getting it out costs egress, time, and rewrites.

Lock-in shapes

  • Data lock-in — bytes and the egress bill
  • API lock-in — DynamoDB, Cosmos, Spanner have no equivalents
  • Operational lock-in — IAM, secrets, observability tooling
  • Skill lock-in — your team only knows AWS
  • Compliance lock-in — re-certifying takes 9+ months

Mitigations (partial)

  • Stick to portable interfaces — Postgres > Aurora-only features; Kafka > Kinesis-only features; OpenTelemetry > vendor-specific tracing
  • Containerise everything — same image runs on any CaaS
  • Treat IaC (Terraform, Pulumi) as your contract with the cloud
  • Multi-cloud only when it pays for itself; usually it doesn't

When lock-in is a feature

  • Spanner, BigQuery, DynamoDB single-digit-ms reads — nothing portable matches them
  • Cloudflare Workers' <5ms cold start — V8 isolates aren't portable to Lambda
  • Bedrock's compliance perimeter for HIPAA / FedRAMP-aligned LLM use

Lock-in is a tax. Sometimes it buys something worth the tax.

Multi-cloud — the false escape hatch

"Multi-cloud" usually means "we have two single-cloud deployments and double the ops burden". Use it when regulation demands it (sovereignty, two-vendor rule for critical infra), or when one workload genuinely fits a different provider — not as an architectural default.

18

Sovereignty & Regulated Regions

Why "where" matters as much as "what"

  • GDPR (EU) — personal data of EU residents stays protected wherever it goes, cloud regions included; transfers outside the EU/EEA need SCCs or the EU–US DPF
  • UK Data Protection Act 2018 — post-Brexit GDPR equivalent, with the EU adequacy decision still in force as of 2025
  • US — sectoral: HIPAA (health), FedRAMP (federal), CJIS (criminal-justice), ITAR (defence)
  • China & Russia — data localisation is mandatory; foreign cloud presence is restricted

Sovereign clouds (a 2024–2026 wave)

  • AWS European Sovereign Cloud — GA late 2025, German-staffed, EU-only data plane
  • Microsoft Cloud for Sovereignty + Azure Local
  • Google Sovereign Solutions with T-Systems / Thales / Minsait partners
  • OVHcloud, Scaleway, Aruba — EU-headquartered hyperscalers
  • Bleu (Capgemini × Orange × Microsoft for France)

What "in-region" actually guarantees

  • Data at rest stays in the region's storage — yes
  • Control plane traffic stays — not always; some metadata routes via the US
  • Provider personnel access — depends on the SKU; "sovereign" tiers add personnel-jurisdiction guarantees
  • Subpoena risk under the US CLOUD Act — applies to any US-headquartered provider regardless of region; this is the exposure sovereign clouds are built to remove

The schism that did not happen

2018 fears that GDPR would force the internet into national silos largely didn't materialise — but the AI Act (2024), the EU Data Act (2025) and rising geopolitics are slowly producing the layered, residency-first cloud market they predicted. Architect for it now.

19

Decision Tree — Pick a Layer

need to ship something →
  smart features? — yes → AIaaS (deck 06)
  need control of OS / kernel? — yes → IaaS (deck 02)
  selling software to others? — yes → SaaS (deck 04)
  spiky & event-driven? — yes → FaaS (deck 03)
  otherwise → PaaS / CaaS (deck 03)
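The same questions can be linearised as code — one reading of the slide's tree, for illustration only (real decisions also weigh cost, latency, and team shape):

```python
def pick_layer(smart_features=False, need_os_control=False,
               selling_software=False, spiky_event_driven=False):
    """One linearisation of the decision tree: first matching
    question wins, PaaS/CaaS is the default answer."""
    if smart_features:
        return "AIaaS"        # deck 06
    if need_os_control:
        return "IaaS"         # deck 02
    if selling_software:
        return "SaaS"         # deck 04
    if spiky_event_driven:
        return "FaaS"         # deck 03
    return "PaaS / CaaS"      # deck 03

print(pick_layer(spiky_event_driven=True))  # FaaS
```

Note the default: when no special constraint fires, you land on the managed middle of the gradient, which is exactly the "highest layer that meets your need" rule from earlier.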
20

Common Anti-Patterns

"Lift & shift" without re-architecting

Lift your entire data centre onto VMware-on-AWS in 18 months, then act surprised when the bill doubles. The real cost wins live one or two layers up — migrate the load-bearing services to PaaS/CaaS as you go.

"Multi-cloud from day one"

Two clouds = double the IAM model, double the network, double the on-call, half the depth on each. Solve your cloud well first.

"Serverless monolith"

200 Lambdas where 5 services would do. Every function deploys independently, but they share a database and tightly-coupled code paths. You've shipped a distributed monolith with extra cold starts.

"Build platform on Kubernetes"

If you're not Spotify or Goldman Sachs, you don't need a platform team running EKS, Istio, ArgoCD, Flux, OPA, Crossplane, Backstage. Cloud Run / App Runner / Container Apps will serve your team far better.

"Free tier means free at scale"

Most "$0 forever" tiers cap at 1k requests/day; surprise spend hits at the 99th-percentile day. Set budget alarms and request quotas before you ship.

"AIaaS = unlimited intelligence"

Tokens are expensive, latency is real, agents loop, and prompts leak. Treat AIaaS calls like external API calls — observability, circuit breakers, caching, cost ceilings. Deck 06 covers this.
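A minimal in-process version of the "cost ceilings" point: wrap every AIaaS call in a spend guard so a looping agent fails closed instead of running up the bill. Names and the per-step price are hypothetical; production systems also want provider budget alerts and gateway-level limits.

```python
class BudgetExceeded(RuntimeError):
    pass

class CostCeiling:
    """Hard spend cap for a batch of AIaaS calls: refuse any charge
    that would push cumulative spend past the ceiling."""
    def __init__(self, max_usd):
        self.max_usd = max_usd
        self.spent = 0.0

    def charge(self, usd):
        if self.spent + usd > self.max_usd:
            raise BudgetExceeded(f"would exceed ${self.max_usd:.2f} ceiling")
        self.spent += usd

ceiling = CostCeiling(max_usd=1.00)
try:
    for step in range(1000):      # a runaway agent loop...
        ceiling.charge(0.03)      # ...charged a hypothetical $0.03/step
except BudgetExceeded:
    print(f"stopped after {step} steps, ${ceiling.spent:.2f} spent")
```

The same wrapper is where circuit breaking and caching naturally attach — the point is that the ceiling lives in your code path, not only on the provider's billing dashboard.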

21

Summary & What's Next

Three takeaways

  1. The *aaS layers form a spectrum of "how much of the stack you rent". Pick the highest layer that meets your need.
  2. Cloud doesn't make compute cheaper — it makes it elastic. Whether that's net-cheaper depends on workload shape and cost discipline.
  3. Lock-in, sovereignty, and the shared-responsibility model are architectural decisions you can't fix later cheaply.

Series ahead

  • 02 IaaS Foundations — VMs, VPC, storage, regions
  • 03 PaaS / FaaS / CaaS — managed compute in depth
  • 04 SaaS Architecture — multi-tenancy, B2B identity, metering
  • 05 Cloud Security — IAM, secrets, network, compliance
  • 06 LLM-as-a-Service — providers, RAG-aaS, agents-aaS, governance

Companion decks

One sentence

"The cloud is a stack of rented layers; the architect's job is to pick the layer where the marginal control you keep is worth more than the operational tax it costs."