CLOUD SERVICE MODELS · PART 3 OF 6

PaaS · FaaS · CaaS

The managed-compute middle — where most workloads should live
[Cover diagram: platform logos grouped by model — PaaS (Heroku · Render · Fly · Vercel), CaaS (Cloud Run · Fargate · Container Apps), FaaS (Lambda · Cloud Functions · Workers), managed K8s (EKS · GKE · AKS) — above a git push → build image → deploy → live URL pipeline]

Side-by-side tour of every flavour of managed compute · cold-start economics · pricing & latency · decision matrix.

Platform  ·  Function  ·  Container  ·  Choose
01

Topics

PaaS

  • Heroku-style — Render, Fly, Railway
  • Hyperscaler — Beanstalk, App Engine, App Service
  • Frontend-shaped — Vercel, Netlify, Cloudflare Pages
  • Buildpacks & build orchestration

CaaS

  • "Run my container" — Cloud Run, Fargate, Container Apps, Fly Machines
  • Managed Kubernetes — EKS, GKE, AKS, Autopilot
  • The serverless container revolution

FaaS

  • Lambda anatomy — invoke → init → execute
  • Cold starts — what causes them, what fixes them
  • Edge FaaS — V8 isolates (Workers, Edge Functions)
  • Event-driven patterns & orchestration (Step Functions)

Choosing

  • Cost model comparison — per-hour vs per-request
  • Latency profiles — cold start, warm, p99
  • Decision matrix & anti-patterns
  • Deployment strategies — blue/green, canary, rolling
02

The Managed-Compute Spectrum

Reading left-to-right: your unit shrinks (server → container → function), scale-to-zero gets faster, the platform handles more, and you trade away more flexibility.

Layer | Unit | Scale-to-0 | Control | Billing
VMs (IaaS) | instance | never | max | $/hour — idle pays ("the box is yours")
Managed K8s | pod | nodes only | high | $/node-hour
CaaS | container | yes | medium | $/req-second
PaaS | app | some | low | $/instance-hour or req
FaaS | function | instant | minimal | $/invocation·ms ("a function ran somewhere")

More managed → less control · faster scale-to-zero · finer-grained billing.
03

Heroku-Style PaaS — git push, get URL

The original PaaS pattern — Heroku, 2007. Push code; the platform detects the language, runs a build, deploys an app, gives you a URL, scales it on a slider. Modern siblings have closed the gap on price and added preview environments, multi-region, and IPv6.

The five contenders

Platform | Build | Killer feature | From ($)
Heroku | Buildpacks | add-ons marketplace, 18 yrs of stability | $5/mo eco dyno
Render | Auto-detect / Dockerfile | preview environments, native HTTPS, free DBs | free tier · $7/mo paid
Fly.io | Dockerfile / nixpacks | multi-region by default, Firecracker μVMs | ~$5/mo
Railway | Nixpacks | Postgres-and-friends in a click | $5/mo
Koyeb | Buildpacks / Docker | edge global, scale-to-0 on paid tier | free hobby

A typical workflow

$ git push render main
==> Detected language: Node.js (package.json)
==> Running build: npm ci && npm run build
==> Building image (cached layers)
==> Provisioning HTTPS cert (Let's Encrypt)
==> Deploying to: my-app.onrender.com
==> Health check passing
==> Live in 2m18s

What you give up

  • Build-image choice (your runtime is whatever the platform says it is)
  • Networking depth — limited VPC, no NAT, no peering
  • Sidecars / multi-process pods (Heroku's multi-procfile buildpack and Fly's per-machine processes only partly cover this)
  • OS tweaks (no apt-get install at runtime; build-time only)

What you get back

  • Time. From "I have an idea" to "URL anyone can hit" in < 10 minutes
  • Free TLS, free preview env per PR, autoscaling, log UI, no on-call for hypervisor
  • Add-ons — Postgres, Redis, S3-compat object stores, CDN, Sentry — one click

When you outgrow it

When per-instance pricing overtakes CaaS (roughly beyond ~$2k/month of spend), when you need a private link to an internal data store, or when your stack hits a runtime limit, it's time to climb to CaaS or managed K8s.

04

Hyperscaler PaaS

Each big-three cloud has its own PaaS. They're more enterprisey, more integrated with the rest of the cloud (IAM, VPC, observability) and more boring than the indie PaaS — which is the point if you're already in the cloud.

AWS

  • Elastic Beanstalk — the original, EC2 underneath, dated UX
  • App Runner — modern, container-or-source, scale-to-zero (2024+)
  • Amplify Hosting — frontend SPAs & SSR, Vercel-like
  • Lightsail — VPS-style fixed-price PaaS

GCP

  • App Engine Standard — strict sandbox, sub-second autoscale, not recommended for greenfield (Google now steers new apps to Cloud Run)
  • App Engine Flex — VM-based, less popular than Cloud Run
  • Firebase Hosting — frontend + functions

Azure

  • App Service — broad runtime support, Windows + Linux, slot-based deployment
  • Static Web Apps — frontend + API in one
  • Container Apps — also CaaS, see slide 08

Why they exist

Selling "your team uses one cloud" — same IAM, same network, same support contract. Procurement loves them. Developers usually prefer Render, Vercel, or Cloud Run.

The Beanstalk warning

Old, opinionated, slow-deploying. Many teams have been trying to escape Beanstalk in 2024–2026; App Runner is the AWS-native escape hatch, ECS Fargate the polyglot one.

05

Frontend-Shaped PaaS

A new category: PaaS optimised for the JS frontend stack. Edge by default, framework-aware, preview-per-PR, and tightly bound to one or two frameworks (Next, Astro, SvelteKit).

Vercel

  • Made by the Next.js team — first-class for Next App Router, RSC, ISR, image optimisation
  • Edge functions (V8 isolates) + Lambda functions (Node)
  • Preview deployments per branch / PR — game-changing UX
  • Pricing: free hobby, $20/mo Pro, custom Enterprise; bandwidth gets expensive at scale

Netlify

  • The original JAMstack PaaS (2014)
  • Functions, Edge Handlers, Forms, Identity, A/B split testing
  • Strong agency / marketing-site fit

Cloudflare Pages + Workers

  • Sites pushed to 300+ edge locations globally
  • Workers = backend at the same edge, V8-isolate cold-start <5ms
  • D1 (SQLite), R2 (object), KV, Durable Objects, Queues — full-stack-on-the-edge
  • Pricing: free tier huge, $5/mo Workers Paid; no egress fees ever

Same-day frontends

The Vercel/Netlify/CF pattern collapses what used to be hosting + CDN + TLS + functions + previews + analytics into one git push. For greenfield consumer-facing apps on React/Next/Svelte, this is the new default.

The framework lock-in

Vercel works best with Next, Cloudflare Pages with their Workers SDK, Netlify with their Functions runtime. Self-hosting Next (via next start in a container) is harder than they advertise; the OpenNext project has emerged to bridge the gap.

06

Buildpacks · Dockerfiles · Nixpacks

How a PaaS turns your repo into something runnable. Three approaches — pick your trade-off between magic and control.

Buildpacks

  • Invented by Heroku (2011); Cloud Native Buildpacks (CNB) is the OSS standard
  • Auto-detect: package.json → Node, requirements.txt → Python
  • Reproducible, layered, rebases efficiently on base-image updates
  • Used by Heroku, GCP App Engine / Cloud Run source-deploy, IBM Code Engine

Dockerfile

  • You write the recipe; the platform builds & runs it
  • Maximum control, language-agnostic, portable across every CaaS
  • Multi-stage builds for tiny final images — see Multi-Stage Builds deck
  • Required for: heavy native deps, custom CA bundles, FIPS, BYO image

Nixpacks

  • Railway-led modern alternative, uses Nix under the hood
  • Faster than buildpacks, deterministic, cache-friendly
  • Used by Railway, Coolify, Sevalla

Build at PaaS vs build in CI

  • Build-in-PaaS — push source, platform builds. Simpler, less reproducible.
  • Build-in-CI — GitHub Actions builds the image, pushes to a registry, deploys. More complex, fully reproducible, scans for CVEs in pipeline. The pattern at scale.
  • Hybrid — CI builds & tags; PaaS pulls. Best of both for medium teams.

The "works in CI, fails in PaaS" trap

Different base images, different versions of node, different libc. Lock the runtime version in .tool-versions / package.json engines / runtime.txt — and ideally build the image yourself.
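One concrete version lock, as a sketch. The exact versions and file contents are illustrative; note that npm only warns on an engines mismatch unless engine-strict is set, so pairing the two is the usual belt-and-braces move.

```
# package.json — declare the runtime the app expects
{ "engines": { "node": ">=20.11 <21" } }

# .npmrc — make npm fail (not just warn) on an engines mismatch
engine-strict=true

# .tool-versions (asdf / mise) — pin the exact version used locally and in CI
nodejs 20.11.1
```

With all three in place, a PaaS that respects engines, your CI image, and your laptop all resolve to the same runtime.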

07

"Run My Container" CaaS

Push a container image; the platform runs it, scales it, networks it, gives you a URL. No VMs, no Kubernetes, no buildpack opinions. Most workloads should default here.

Service | Cloud | Cold start | Min CPU | Scale-to-0? | Pricing
Cloud Run | GCP | ~500 ms gen1, <100 ms gen2 | 0.08 vCPU | yes | per-100ms vCPU + GB-sec + req
ECS Fargate | AWS | 20–60 s (task launch) | 0.25 vCPU | no (min 1 task) | per-second vCPU + GB
App Runner | AWS | 10–30 s (warm pool) | 0.25 vCPU | yes (paused state) | per-second + provisioned
Container Apps | Azure | ~1 s | 0.25 vCPU | yes (KEDA) | per-second vCPU + GB + req
Fly Machines | Fly.io | 200–600 ms cold, <50 ms suspended | shared-1x CPU | yes (auto-stop) | per-second + bandwidth
Cloudflare Containers | CF | <1 s (2025 GA) | 0.25 vCPU | yes | per-10ms vCPU + GB

Why CaaS won

  • You bring a container — it works on your laptop, in prod, in CI, anywhere
  • Scale to zero — pay nothing when idle (most platforms)
  • No platform team needed; one team can ship, deploy, and operate
  • Migration path from PaaS or Kubernetes alike

A Cloud Run deploy

gcloud run deploy myapp \
  --image=us-docker.pkg.dev/proj/repo/app:v42 \
  --region=europe-west2 \
  --concurrency=80 \
  --min-instances=0 \
  --max-instances=100 \
  --cpu=1 --memory=512Mi \
  --service-account=app-runner@proj.iam... \
  --allow-unauthenticated
08

Managed Kubernetes — EKS · GKE · AKS

The cloud runs the control plane (etcd, API server, scheduler); you bring node pools and deploy workloads. The "platform you build on top of" — and the right answer when CaaS isn't enough.

When you actually need Kubernetes

  • You need workload-level customisation — sidecars, init containers, custom CSI / CNI
  • You're running enough services that Cloud Run / Fargate billing crosses break-even (~$3-5k/month workloads)
  • You need a service mesh, advanced traffic shaping, custom CRDs
  • Cross-cluster portability (on-prem ↔ cloud, or multi-cloud)

The three managed offerings

Service | Control plane | Auto mode | Notes
EKS | $0.10/hr (~$73/mo) | EKS Auto Mode (2024) | Most flexible, most ops
GKE | $0.10/hr per cluster after the first | Autopilot | Smartest defaults; Autopilot is the standout
AKS | Free (Standard SKU costs extra) | — | Tightest Azure integration

Auto / Autopilot mode

  • Provider runs both control plane and node lifecycle
  • You declare pods; the platform decides nodes (Karpenter-style)
  • Closes the gap with CaaS — pay per pod, scale to zero, but keep K8s API
  • GKE Autopilot is the most mature; EKS Auto Mode launched re:Invent 2024

Critical add-ons (real prod K8s)

  • Karpenter / cluster-autoscaler — node-level scaling
  • External-DNS, cert-manager — DNS + TLS automation
  • ArgoCD / Flux — GitOps
  • External Secrets Operator — secrets sync
  • Datadog / OpenTelemetry agents — observability
  • Kyverno / OPA Gatekeeper — admission policy

The truth about K8s cost

"Kubernetes is cheap" is a myth — the control plane is, the platform team isn't. A serious K8s install needs at least 1.5 SREs to operate well. If you don't have them, use CaaS.

09

FaaS — How a Lambda Actually Runs

Caller → FaaS frontend → worker (Firecracker μVM) → your handler:

  1. invoke (HTTP / event)
  2. assign a warm worker, or trigger a "cold start"
  3. download package, init runtime, init handler (cold only)
  4. handler executes — this is the only part billed pre-2024
  5. response to the frontend
  6. response to the caller (HTTP / SQS ack / etc.)

The worker stays warm for a few minutes — the next invocation skips steps 2–3 entirely (warm path).

Cold path

Steps 2–3. 50ms (Workers) → 1.5s (Java Lambda in VPC). Once per fresh worker.

Warm path

Step 4 only. Handler is already in memory. P50 latency = your code's latency.

Burst path

Sudden traffic → hundreds of cold workers spun up in parallel. P99 spikes; this is where load tests reveal the truth.

10

Cold Starts — Causes & Cures

What contributes to a cold start

  • μVM boot — ~125 ms on Firecracker
  • Runtime init — Node, Python ~50ms; JVM 300–800ms; .NET 200–600ms; Go < 50ms
  • Code download — bigger zip = longer; 50 MB Lambda zip ~150 ms slower than 1 MB
  • Handler init — module imports, DB connection setup, AWS SDK warmup. This is usually the biggest variable.
  • VPC ENI attach — historically seconds; "Hyperplane ENIs" (2019) brought it to ~250 ms

Mitigations

  • Provisioned concurrency / min instances — keep N warm; you pay for them
  • SnapStart — Lambda Java/Python — restore from a memory snapshot, ~100 ms instead of 800 ms
  • Smaller deploy package — split function dependencies, layer common code
  • Lazy-init — keep module-top work minimal and defer optional clients until first use
  • Compile-on-build — Go & Rust dominate cold-start charts
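The lazy-init bullet in code, as a minimal sketch: the table name and the boto3 call are illustrative stand-ins for any heavy client. Module scope runs once per cold start, so anything optional moves behind a cached factory.

```python
import functools

# Module scope runs once per cold start -- keep it cheap.
# Heavy clients go behind a cached factory so only the first
# invocation that needs them pays the cost.

@functools.lru_cache(maxsize=None)
def get_table():
    import boto3  # deferred import: skipped entirely on paths that never touch the DB
    return boto3.resource("dynamodb").Table("orders")  # illustrative table name

def handler(event, context):
    if event.get("action") == "read":
        return get_table().get_item(Key={"id": event["id"]})
    # Fast path: no DB client is ever constructed.
    return {"ok": True}
```

Requests that never hit the "read" branch pay neither the import nor the client construction, even on a cold start.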

Cold-start latency, real numbers (2024-25)

Runtime | p50 cold | p99 cold
Workers (V8 isolate) | <5 ms | <30 ms
Lambda Node 20 | 180 ms | 650 ms
Lambda Python 3.12 | 200 ms | 700 ms
Lambda Java + SnapStart | 200 ms | 800 ms
Lambda Java (no SnapStart) | 900 ms | 3500 ms
Cloud Run gen2 Node | 120 ms | 400 ms
11

Edge FaaS — V8 Isolates

A different beast: each function runs in a V8 isolate (the same construct as a browser tab) instead of a μVM. ~5 ms cold start. No Node-native APIs. Code runs at whichever of hundreds of edge locations is closest to the user.

What's running it

  • Cloudflare Workers — 300+ POPs, the original isolate-based platform
  • Vercel Edge Functions — also V8 isolates, on Vercel's edge
  • Deno Deploy — the Deno runtime at the edge, V8 isolates under the hood
  • Fastly Compute@Edge — WASM-based, multi-language

Best fits

  • Auth checks, A/B routing, geofencing — request rewriting at the edge
  • Lightweight APIs with global users (auth, KV reads, image proxying)
  • Edge SSR — render close to the user; revalidate from the origin

What you give up

  • No fs, no net as in Node — only fetch, Request, Response, Streams, WebCrypto
  • No native binaries (sharp, tensorflow.js without WASM)
  • Limited CPU per request (10 ms CPU on Workers Free, up to 30 s on Paid; varies by platform)
  • No long-running background jobs (Workers Cron, Durable Objects fill some of this)

Workers KV / D1 / Durable Objects

  • KV — eventually consistent global key-value, microseconds at edge
  • D1 — SQLite per region, replicated
  • Durable Objects — single-tenant in-memory state, transactional, the "actor" pattern at the edge

Closest equivalents elsewhere: Vercel KV/Postgres, Deno KV, PlanetScale Hyperdrive + edge functions.

12

Event-Driven Patterns

FaaS shines on events, not requests. Most production FaaS code runs in response to: a queue message, an S3 upload, a CloudWatch event, a webhook, a DynamoDB change, a Kinesis record.

Common event sources

  • Queues — SQS / Pub/Sub / Service Bus / EventBridge
  • Storage — S3 / GCS / Blob notifications
  • DBs — DynamoDB Streams, Firestore triggers, Cosmos change feed
  • Schedule — EventBridge / Cloud Scheduler / Cron
  • HTTP — API Gateway, ALB, Function URLs

Orchestration — when one function isn't enough

  • AWS Step Functions — JSON state machine; retries, parallel, choice; the standard for AWS pipelines
  • GCP Workflows — YAML state machine
  • Azure Durable Functions — code-first orchestration in C# / JS
  • Temporal — vendor-neutral workflow engine; runs on any cloud

Idempotency & at-least-once

Event sources retry. Your function will be called with the same event again. Design for it: idempotency keys, conditional writes (DynamoDB attribute_not_exists), de-dup via a transaction id.
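A minimal sketch of the idempotency-key pattern, with an in-memory set standing in for a conditional write (in DynamoDB this would be a put_item with a ConditionExpression of attribute_not_exists(id)); all names are illustrative.

```python
# At-least-once delivery means the same event can arrive twice.
# Record each event's idempotency key with an insert-if-absent;
# a duplicate fails the condition and is skipped.

processed: set = set()   # stands in for a table with a uniqueness guarantee
charges: list = []       # the side effect we must not repeat

def handle_payment(event: dict) -> str:
    key = event["idempotency_key"]   # caller-supplied, stable across retries
    if key in processed:             # in DynamoDB: ConditionalCheckFailedException
        return "duplicate-skipped"
    processed.add(key)               # must be atomic with the check in real storage
    charges.append((event["account"], event["amount"]))
    return "charged"
```

In real storage the check and the record must be one atomic conditional write; a separate read-then-write reintroduces the race this pattern exists to close.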

Dead-letter queues are not optional

  • Configure DLQ on every function
  • Set max-receive count (3–5)
  • Alarm on DLQ depth > 0
  • Have a redrive runbook

The "lambda calls lambda" smell

If one function synchronously calls another, you're paying double, doubling cold starts, and increasing tail latency. Use Step Functions, queues, or just one bigger function.
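The queue alternative in miniature, with queue.Queue standing in for SQS or Pub/Sub and all names illustrative: the first function acknowledges and returns instead of waiting on the second, so its billed time ends immediately.

```python
import queue

# Instead of function A synchronously invoking function B
# (two concurrent billed executions, two cold-start chances),
# A enqueues a message and returns; B consumes on its own schedule.
jobs = queue.Queue()   # stands in for SQS / Pub/Sub

def handler_a(event: dict) -> dict:
    jobs.put({"order_id": event["order_id"]})   # fire-and-forget
    return {"accepted": True}                   # A's billed time ends here

def handler_b() -> list:
    done = []
    while not jobs.empty():
        done.append(jobs.get()["order_id"])     # B drains at its own pace
    return done
```

The queue also buys you retries, buffering under burst, and a natural place to attach a DLQ — none of which a synchronous call gives you.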

13

Cost Comparison — Same Workload, Five Layers

A small JSON API at 100 requests/sec sustained and 100 ms median, 256 MB. Approximate, us-east-1, 2025 prices.

Layer | Service | Sizing | $/month (rough) | Comment
IaaS | 2× t3.small (HA) | 2 vCPU, 2 GB each | ~$30 (RI)–$60 (on-demand) | + ALB ($23) + NAT GW ($32) + EBS — closer to $120–150
K8s | EKS cluster + 2× t3.small nodes | 1 control plane + 2 nodes | ~$73 + $30–60 = $100–135 | Plus ALB / NAT; before any platform-team cost
CaaS | Cloud Run, max-instances 4, scale-to-0 | 1 vCPU, 256 MB | ~$15–25 | Excludes free tier; idle costs $0
PaaS | Render Standard or Heroku Basic ($7/mo each) × 2 | HA, autoscaling off | $14–25 | + $7/mo Postgres add-on
FaaS | Lambda + API GW | 256 MB, 100 ms | ~$70 (Lambda) + ~$45 (API GW) | API GW is the killer; a Function URL or ALB cuts the total to ~$25

Take-aways

  • CaaS & PaaS are the cheapest for steady ~100 RPS workloads
  • Lambda is cheapest only if you're not using API Gateway
  • K8s is rarely cheapest for small workloads — it's a platform, not an optimisation
  • IaaS only wins below this scale (single VM running everything) or above (3-year RI giants)

Where the curve flips

  • < 1 RPS, spiky → FaaS
  • 1–500 RPS, steady → CaaS / PaaS
  • 500+ RPS or stateful → managed K8s or IaaS with reserved capacity
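These thresholds fall out of the billing formulas. A back-of-envelope sketch for the FaaS end, using per-request and per-GB-second rates taken as assumptions from published Lambda list prices — verify against current pricing; free tier and tiered discounts are ignored, so raw list-rate math will not match rounded real-world figures exactly.

```python
# Assumed us-east-1 list rates -- check current pricing before relying on these.
REQ_RATE = 0.20 / 1_000_000   # $ per request
GBS_RATE = 0.0000166667       # $ per GB-second of billed duration

def lambda_monthly(rps: float, duration_s: float, mem_gb: float) -> float:
    """Rough monthly Lambda compute cost for a steady request rate."""
    requests = rps * 86_400 * 30                  # 30-day month
    gb_seconds = requests * duration_s * mem_gb   # billed duration x memory
    return requests * REQ_RATE + gb_seconds * GBS_RATE

# A low-volume workload (100 ms at 256 MB) is where per-invocation billing shines:
print(f"1 rps avg: ${lambda_monthly(1, 0.1, 0.25):,.2f}/mo")   # → $1.60/mo
```

Cost scales linearly with invocations, so at sustained triple-digit RPS a flat-rate container eventually wins — which is the curve flip the list above describes.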
14

Deployment Strategies on Managed Compute

Rolling

The default everywhere. Replace N pods/instances at a time; drain old, start new. Simple, slow, hard to roll back mid-deploy.

Blue / green

Stand up the new version alongside, swap traffic atomically. Cloud Run revisions, AWS App Runner deploys, K8s with two services + a switch. Fast rollback (swap back).

Canary

Send 1% / 5% / 25% / 100% of traffic to the new version while watching SLO metrics. Cloud Run native traffic-splitting, ALB weighted target groups, Linkerd / Istio + Argo Rollouts.
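The ramp can be pictured as a biased coin flipped per request. A toy simulation of the 1% → 100% progression (revision names illustrative — platforms like Cloud Run implement the weighted choice for you):

```python
import random

STAGES = [0.01, 0.05, 0.25, 1.00]   # canary weight at each promotion step

def pick_revision(weight: float, rng: random.Random) -> str:
    # Per-request weighted choice between the stable and canary revisions.
    return "v2-canary" if rng.random() < weight else "v1-stable"

def observed_share(weight: float, requests: int = 100_000, seed: int = 7) -> float:
    rng = random.Random(seed)
    hits = sum(pick_revision(weight, rng) == "v2-canary" for _ in range(requests))
    return hits / requests

for w in STAGES:
    # Promote to the next stage only while SLO metrics on v2 stay healthy.
    print(f"target {w:>4.0%} -> observed {observed_share(w):.1%}")
```

The point of the staged weights: at 1% a bad revision hurts a sliver of traffic, and rollback is just setting the weight back to zero.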

Feature flags & ring deploys

Different problem from infra deploys. Push code dark; flip flags per cohort. LaunchDarkly, Unleash, GrowthBook, Statsig, AWS AppConfig. Decouples "deploy" from "release".

The forgotten half — db migrations

Code can roll forward; data rarely can. Separate schema migrations from code deploys: ship the backwards-compatible migration first, then the code, then the drop-old-column migration. Separate deploys, never one big one.
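The expand → code → contract sequence, sketched with sqlite3 standing in for a real migration tool (table and column names illustrative):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
db.execute("INSERT INTO users (name) VALUES ('ada')")

# Deploy 1 -- expand: purely additive, so the old code still running is unaffected
db.execute("ALTER TABLE users ADD COLUMN email TEXT")

# Deploy 2 -- new code starts writing (and backfilling) the new column
db.execute("UPDATE users SET email = name || '@example.com'")

# Deploy 3 -- contract: dropping the old column ships only once nothing reads it,
# as its own later migration, never bundled with deploy 2

print(db.execute("SELECT name, email FROM users").fetchone())
# → ('ada', 'ada@example.com')
```

Rolling back deploy 2 is safe at every step because the schema each code version expects already exists.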

Where to read more

The CI/CD deck covers the pipeline; see Deploying Web Applications for beginner deploy strategies.

15

Decision Matrix — Which Layer?

You are… | Default | Why
Building a Next.js / SvelteKit consumer site | Vercel / Cloudflare Pages | Framework-aware, edge SSR, preview-per-PR, free TLS
Shipping a small Go / Python / Node API | Cloud Run / Render / Fly | One container, scale-to-0, deploy via gcloud run deploy or git push
Glue / webhook / S3-triggered ETL | Lambda / Cloud Functions | Per-event billing; nothing else fits this shape
Cron / scheduled jobs | EventBridge + Lambda; Cloud Scheduler + Cloud Run Jobs | $0 idle, robust retries, observability built in
Auth / geofence / A/B at the request edge | Cloudflare Workers / Vercel Edge | <5 ms cold start, runs at every POP
20+ services, custom CRDs, service mesh | Managed K8s (Autopilot / Auto Mode) | K8s API, but nodes are someone else's problem
GPU training / inference workload | IaaS GPUs or specialist (Modal, RunPod) | Lambda has no GPU; CaaS GPU support is thin
Stateful service (DB, queue, cache) | Managed service (RDS, MemoryStore, Aurora, Redis Cloud) | Don't run state on FaaS or PaaS — use the DBaaS
Existing Heroku app, growing | Render / Fly first; Cloud Run if the data side is on GCP | Same DX, modern infra, often cheaper
16

Anti-Patterns

"Lambda for everything"

200 functions, mutually coupled, sharing one Postgres, stitched together by a 14-step Step Function. Move steady-state HTTP services to CaaS; keep FaaS for events.

"K8s for everything"

Three engineers, one Helm chart per service, one Crossplane install per cloud. Cloud Run / App Runner / Container Apps will give back literal months of life.

"Vercel as our backend"

Long DB connections, background jobs, big files, 6 GB function deploys, 10s timeouts. Push your backend off the frontend PaaS; use a real CaaS.

"Multi-cloud abstraction layer"

Abstracting across Lambda + Cloud Functions + Azure Functions in one repo. The lowest common denominator is awful, debugging gets worse, and you still pick a cloud per region. Don't.

"Free tier serverless will never bill"

Until it does. A runaway Lambda + S3 + DynamoDB loop can rack up thousands of dollars overnight. Set hard quotas, concurrent-execution limits, and budget alarms before launch.

"Serverless = no monitoring"

If anything, serverless needs more instrumentation: cold starts, throttles, DLQ depth, init duration. Default CloudWatch is the floor; OpenTelemetry → Honeycomb / Datadog / Tempo is what you actually want.

17

Summary

Three takeaways

  1. Most workloads belong on CaaS (Cloud Run / Container Apps / App Runner). Default there; deviate when there's a real reason.
  2. FaaS is for events, not services. Use it where its shape fits — webhooks, queues, S3 triggers — not as a substitute for an API.
  3. Kubernetes is a platform you build on top of, not a faster way to ship. Choose it deliberately, or you'll feed it for years.

Next in the series

  • 04 SaaS Architecture — multi-tenancy, B2B identity, metering
  • 05 Cloud Security — IAM in depth, secrets, network, compliance
  • 06 LLM-as-a-Service

Companion decks

One sentence

"Pick the smallest unit of deploy that fits your workload — function for events, container for services, pod for platforms — and resist the temptation to use the same hammer for all three."