Deployment strategies — blue/green, canary, rolling
02
The Managed-Compute Spectrum
Reading left to right: your unit shrinks (server → container → function), scale-to-zero gets faster, the platform handles more, and you trade away more flexibility.
03
Heroku-Style PaaS — git push, get URL
The original PaaS pattern — Heroku, 2007. Push code; the platform detects the language, runs a build, deploys an app, gives you a URL, scales it on a slider. Modern siblings have closed the gap on price and added preview environments, multi-region, and IPv6.
The five contenders
| Platform | Build | Killer feature | From ($) |
|---|---|---|---|
| Heroku | Buildpacks | add-ons marketplace, 18 yrs of stability | $5/mo eco-dyno |
| Render | Auto-detect / Dockerfile | preview environments, native HTTPS, free DBs | free tier · $7/mo paid |
| Fly.io | Dockerfile / Nixpacks | multi-region by default, Firecracker μVMs | ~$5/mo |
| Railway | Nixpacks | Postgres-and-friends in a click | $5/mo |
| Koyeb | Buildpacks / Docker | edge global, scale-to-0 paid tier | free hobby |
A typical workflow
$ git push render main
==> Detected language: Node.js (package.json)
==> Running build: npm ci && npm run build
==> Building image (cached layers)
==> Provisioning HTTPS cert (Let's Encrypt)
==> Deploying to: my-app.onrender.com
==> Health check passing
==> Live in 2m18s
What you give up
Build-image choice (your runtime is whatever the platform says it is)
Networking depth — limited VPC, no NAT, no peering
Sidecars / multi-process pods (Heroku has the limited multi-procfile buildpack; Fly allows multiple processes per machine)
OS tweaks (no apt-get install at runtime; build-time only)
What you get back
Time. From "I have an idea" to "URL anyone can hit" in < 10 minutes
Free TLS, free preview env per PR, autoscaling, log UI, no on-call for hypervisor
When to leave
Per-instance pricing overtakes CaaS somewhere around $2k/month of spend; or you need a private link to an internal data store; or your stack hits a runtime limit. Climb to CaaS or managed K8s.
04
Hyperscaler PaaS
Each big-three cloud has its own PaaS. They're more enterprisey, more integrated with the rest of the cloud (IAM, VPC, observability) and more boring than the indie PaaS — which is the point if you're already in the cloud.
AWS
Elastic Beanstalk — the original; EC2 underneath, dated UX
GCP
App Engine Standard — strict sandbox, sub-second autoscale, effectively deprecated for greenfield
App Engine Flex — VM-based, less popular than Cloud Run
Firebase Hosting — frontend + functions
Azure
App Service — broad runtime support, Windows + Linux, slot-based deployment
Static Web Apps — frontend + API in one
Container Apps — also CaaS, see slide 07
Why they exist
Selling "your team uses one cloud" — same IAM, same network, same support contract. Procurement loves them. Developers usually prefer Render, Vercel, or Cloud Run.
The Beanstalk warning
Old, opinionated, slow-deploying. Many teams are trying to escape Beanstalk in 2024–2026 — App Runner is the AWS-native escape hatch; ECS Fargate is the polyglot one.
05
Frontend-Shaped PaaS
A new category: PaaS optimised for the JS frontend stack. Edge by default, framework-aware, preview-per-PR, and tightly bound to one or two frameworks (Next, Astro, SvelteKit).
Vercel
Made by the Next.js team — first-class for Next App Router, RSC, ISR, image optimisation
The Vercel/Netlify/CF pattern collapses what used to be hosting + CDN + TLS + functions + previews + analytics into one git push. For greenfield consumer-facing apps on React/Next/Svelte, this is the new default.
The framework lock-in
Vercel works best with Next, Cloudflare Pages with their Workers SDK, Netlify with their Functions runtime. Self-hosting Next (via next start in a container) is harder than they advertise — OpenNext has emerged to bridge the gap.
06
Buildpacks · Dockerfiles · Nixpacks
How a PaaS turns your repo into something runnable. Three approaches — pick your trade-off between magic and control.
Buildpacks
Invented by Heroku (2011); CNB (Cloud Native Buildpacks) is the OSS standard
Dockerfiles
Required for: heavy native deps, custom CA bundles, FIPS, BYO image
Nixpacks
Railway-led modern alternative, uses Nix under the hood
Faster than buildpacks, deterministic, cache-friendly
Used by Railway, Coolify, Sevalla
Build at PaaS vs build in CI
Build-in-PaaS — push source, platform builds. Simpler, less reproducible.
Build-in-CI — GitHub Actions builds the image, pushes to a registry, deploys (sketch after this list). More complex, fully reproducible, scans for CVEs in the pipeline. The pattern at scale.
Hybrid — CI builds & tags; PaaS pulls. Best of both for medium teams.
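The build-in-CI step, reduced to its three shell commands; the registry URL, image name, and deploy target are placeholders, not a prescribed setup:
$ docker build -t registry.example.com/my-app:$GIT_SHA .   # CI builds the image once
$ docker push registry.example.com/my-app:$GIT_SHA         # the registry is the source of truth
$ gcloud run deploy my-app \
    --image registry.example.com/my-app:$GIT_SHA \
    --region us-central1                                   # the platform just pulls the image
The same three commands give you the hybrid pattern: run the first two in CI, let the platform run the third on a webhook.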
The "works in CI, fails in PaaS" trap
Different base images, different Node versions, different libc. Lock the runtime version in .tool-versions / package.json engines / runtime.txt — and ideally build the image yourself.
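The three pin points side by side; a minimal sketch, and the version strings are placeholders:
# .tool-versions (asdf / mise): pins the toolchain for dev, CI, and any build that reads it
nodejs 20.11.1
# package.json "engines": makes a mismatched platform runtime fail loudly at build
"engines": { "node": "20.x" }
# runtime.txt (Heroku-style platforms): pins the language runtime at build
python-3.12.3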
07
"Run My Container" CaaS
Push a container image; the platform runs it, scales it, networks it, gives you a URL. No VMs, no Kubernetes, no buildpack opinions. Most workloads should default here (deploy sketch after the table).
| Service | Cloud | Cold start | Min CPU | Scale-to-0? | Pricing |
|---|---|---|---|---|---|
| Cloud Run | GCP | ~500 ms gen1, < 100 ms gen2 | 0.08 vCPU | yes | per-100 ms vCPU + GB-sec + req |
| ECS Fargate | AWS | 20–60 s (task launch) | 0.25 vCPU | no (min 1 task) | per-second vCPU + GB |
| App Runner | AWS | 10–30 s (warm pool) | 0.25 vCPU | yes (paused state) | per-second + provisioned |
| Container Apps | Azure | ~1 s | 0.25 vCPU | yes (KEDA) | per-second vCPU + GB + req |
| Fly Machines | Fly.io | 200–600 ms (cold), < 50 ms (suspended) | shared-1× CPU | yes (auto-stop) | per-second + bandwidth |
| Cloudflare Containers | CF | < 1 s (2025 GA) | 0.25 vCPU | yes | per-10 ms vCPU + GB |
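For example, the whole CaaS contract on Cloud Run fits in one command; a sketch assuming an image already pushed to Artifact Registry, with my-project, my-api, and the tag as placeholders:
$ gcloud run deploy my-api \
    --image us-docker.pkg.dev/my-project/apps/my-api:v42 \
    --region us-central1 \
    --memory 256Mi \
    --min-instances 0 --max-instances 20 \
    --allow-unauthenticated
# Returns a stable HTTPS URL; --min-instances 0 is the scale-to-zero from the table.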
Why CaaS won
You bring a container — works on your laptop, in prod, in CI, anywhere
Scale to zero — pay nothing when idle (most platforms)
No platform team needed; one team can ship, deploy, and operate
08
Managed Kubernetes
The cloud runs the control plane (etcd, API server, scheduler); you bring node pools and deploy workloads. The "platform you build on top of" — and the right answer when CaaS isn't enough.
When you actually need Kubernetes
You need workload-level customisation — sidecars, init containers, custom CSI / CNI
You're running enough services that Cloud Run / Fargate billing crosses break-even (~$3–5k/month workloads)
You need a service mesh, advanced traffic shaping, custom CRDs
Cross-cluster portability (on-prem ↔ cloud, or multi-cloud)
The three managed offerings
| Service | Control-plane price | Auto-mode | Notes |
|---|---|---|---|
| EKS | $0.10/hr (~$73/mo) | EKS Auto Mode (2024) | Most flexible, most ops |
| GKE | $0.10/hr per cluster after the first | Autopilot | Smartest defaults; Autopilot is the standout |
| AKS | Free (Standard SKU costs) | — | Tightest Azure integration |
Auto / Autopilot mode
Provider runs both control plane and node lifecycle
You declare pods; the platform decides nodes (Karpenter-style)
Closes the gap with CaaS — pay per pod, scale to zero, but keep K8s API
GKE Autopilot is the most mature; EKS Auto Mode launched re:Invent 2024
"Kubernetes is cheap" is a myth — the control plane is, the platform team isn't. A serious K8s install needs at least 1.5 SREs to operate well. If you don't have them, use CaaS.
09
FaaS — How a Lambda Actually Runs
Cold path
Steps 2–3. 50 ms (Workers) → 1.5 s (Java Lambda in a VPC). Paid once per fresh worker.
Warm path
Step 4 only. Handler is already in memory. P50 latency = your code's latency.
Burst path
Sudden traffic → hundreds of cold workers spun up in parallel. P99 spikes; this is where load tests reveal the truth.
Code download — bigger zip = longer; a 50 MB Lambda zip is ~150 ms slower than a 1 MB one
Handler init — module imports, DB connection setup, AWS SDK warmup. Usually the biggest variable.
VPC ENI attach — historically seconds; "Hyperplane ENIs" (2019) brought it down to ~250 ms
Mitigations
Provisioned concurrency / min instances — keep N warm; you pay for them (sketch after this list)
SnapStart (Lambda Java / Python) — restore from a memory snapshot; ~100 ms instead of ~800 ms
Smaller deploy package — split function dependencies, layer common code
Lazy-init — keep module-top work to what every invocation needs; defer optional clients until first use
Compile-on-build — Go & Rust dominate cold-start charts
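The first mitigation as a concrete command; a sketch assuming a function my-fn with a live alias (both names are placeholders), and note the warm sandboxes are billed whether or not they serve traffic:
$ aws lambda put-provisioned-concurrency-config \
    --function-name my-fn \
    --qualifier live \
    --provisioned-concurrent-executions 5
# 5 execution environments stay initialised; cold starts only hit traffic beyond them.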
Cold-start latency, real numbers (2024–25)
| Runtime | p50 cold | p99 cold |
|---|---|---|
| Workers (V8 isolate) | < 5 ms | < 30 ms |
| Lambda Node 20 | 180 ms | 650 ms |
| Lambda Python 3.12 | 200 ms | 700 ms |
| Lambda Java + SnapStart | 200 ms | 800 ms |
| Lambda Java (no SnapStart) | 900 ms | 3500 ms |
| Cloud Run gen2 Node | 120 ms | 400 ms |
11
Edge FaaS — V8 Isolates
A different beast: each function runs in a V8 isolate (the same construct as a browser tab) instead of a μVM. ~5 ms cold starts. No Node-native APIs. Code runs at whichever of hundreds of edge locations is closest to the user.
What's running it
Cloudflare Workers — 300+ POPs, the original isolate-based platform
Vercel Edge Functions — also V8 isolates, on Vercel's edge
12
FaaS Is for Events
FaaS shines on events, not requests. Most production FaaS code runs in response to a queue message, an S3 upload, a CloudWatch event, a webhook, a DynamoDB change, a Kinesis record.
Common event sources
Queues — SQS / Pub/Sub / Service Bus / EventBridge
Orchestration
AWS Step Functions — JSON state machine; retries, parallel, choice; the standard for AWS pipelines
GCP Workflows — YAML state machine
Azure Durable Functions — code-first orchestration in C# / JS
Temporal — vendor-neutral workflow engine; runs on any cloud
Idempotency & at-least-once
Event sources retry. Your function will be called with the same event again. Design for it: idempotency keys, conditional writes (DynamoDB attribute_not_exists), de-dup via a transaction id.
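One hedged way to implement the conditional write, using a plain DynamoDB table as the de-dup ledger; processed-events and evt_12345 are hypothetical names:
$ aws dynamodb put-item \
    --table-name processed-events \
    --item '{"eventId": {"S": "evt_12345"}}' \
    --condition-expression 'attribute_not_exists(eventId)' \
  && echo "first delivery: process it" \
  || echo "duplicate delivery: skip"
# A retry of the same event fails the condition (non-zero exit) and gets skipped.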
Dead-letter queues are not optional
Configure a DLQ on every function (sketch after this list)
Set max-receive count (3–5)
Alarm on DLQ depth > 0
Have a redrive runbook
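Items one and three of the checklist as commands; a sketch for an async-invoked Lambda, with ARNs and names as placeholders (for queue-triggered functions the DLQ lives on the source queue's redrive policy instead):
$ aws lambda update-function-configuration \
    --function-name my-fn \
    --dead-letter-config TargetArn=arn:aws:sqs:us-east-1:123456789012:my-fn-dlq
$ aws cloudwatch put-metric-alarm \
    --alarm-name my-fn-dlq-depth \
    --namespace AWS/SQS --metric-name ApproximateNumberOfMessagesVisible \
    --dimensions Name=QueueName,Value=my-fn-dlq \
    --statistic Maximum --period 300 \
    --threshold 0 --comparison-operator GreaterThanThreshold \
    --evaluation-periods 1 \
    --alarm-actions arn:aws:sns:us-east-1:123456789012:oncall
# Any message sitting in the DLQ pages someone; depth > 0 is always news.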
The "lambda calls lambda" smell
If one function synchronously calls another, you're paying double, doubling cold starts, and increasing tail latency. Use Step Functions, queues, or just one bigger function.
13
Cost Comparison — Same Workload, Five Layers
A small JSON API at 100 requests/sec sustained and 100 ms median, 256 MB. Approximate, us-east-1, 2025 prices.
| Layer | Setup | Config | $/mo | Notes |
|---|---|---|---|---|
| PaaS | Render Standard ($7/mo) or Heroku Basic ($7/mo) × 2 instances | HA, autoscaling off | $14–25 | + $7/mo Postgres add-on |
| FaaS | Lambda + API GW | 256 MB, 100 ms each | ~$70 (Lambda) + ~$45 (API GW) | API GW is the killer; switch to a Function URL or ALB to cut to ~$25 total |
Take-aways
CaaS & PaaS are the cheapest for steady ~100 RPS workloads
Lambda is cheapest only if you're not using API Gateway
K8s is rarely cheapest for small workloads — it's a platform, not an optimisation
IaaS only wins below this scale (single VM running everything) or above (3-year RI giants)
Where the curve flips
< 1 RPS, spiky → FaaS
1–500 RPS, steady → CaaS / PaaS
500+ RPS or stateful → managed K8s or IaaS with reserved capacity
14
Deployment Strategies on Managed Compute
Rolling
The default everywhere. Replace N pods/instances at a time; drain old, start new. Simple, slow, hard to roll back mid-deploy.
Blue / green
Stand up the new version alongside, swap traffic atomically. Cloud Run revisions, AWS App Runner deploys, K8s with two services + a switch. Fast rollback (swap back).
Canary
Send 1% / 5% / 25% / 100% of traffic to the new version while watching SLO metrics. Cloud Run native traffic-splitting, ALB weighted target groups, Linkerd / Istio + Argo Rollouts.
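On Cloud Run, for instance, each canary increment is one traffic-split command; a sketch with placeholder revision names:
$ gcloud run services update-traffic my-api \
    --region us-central1 \
    --to-revisions my-api-00042-new=5,my-api-00041-old=95
# Watch the SLO dashboards, then re-run with 25/75 and 100/0; rollback is the same command with the weights flipped.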
Feature flags & ring deploys
Different problem from infra deploys. Push code dark; flip flags per cohort. LaunchDarkly, Unleash, GrowthBook, Statsig, AWS AppConfig. Decouples "deploy" from "release".
The forgotten half — db migrations
Code can roll forward; data rarely can. Separate schema migrations from code deploys: backwards-compatible migration first, code, then drop-old-column migration. Two deploys, never one.
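The two schema deploys as a sketch with psql; table and column names are invented:
$ psql "$DATABASE_URL" <<'SQL'
-- deploy 1: expand. Backwards-compatible; old code keeps working.
ALTER TABLE users ADD COLUMN email_verified boolean NOT NULL DEFAULT false;
SQL
# ... ship the code that writes and reads the new column, wait for full rollout ...
$ psql "$DATABASE_URL" <<'SQL'
-- deploy 2: contract. Only after no running code touches the old column.
ALTER TABLE users DROP COLUMN legacy_email_flag;
SQL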
15
Which Platform for Which Workload
| Workload | Pick | Why |
|---|---|---|
| Steady HTTP API / web service | Cloud Run (or similar CaaS) | One container, scale-to-0, deploy via gcloud run deploy or git push |
| Glue / webhook / S3-triggered ETL | Lambda / Cloud Functions | Per-event billing; nothing else fits this shape |
| Cron / scheduled jobs | EventBridge + Lambda; Cloud Scheduler + Cloud Run Jobs | $0 idle, robust retries, observability built-in |
| Auth / geofence / A/B at request edge | Cloudflare Workers / Vercel Edge | < 5 ms cold start, runs at every POP |
| 20+ services, custom CRDs, service mesh | Managed K8s (Autopilot / Auto Mode) | K8s API, but nodes are someone else's problem |
| GPU training / inference workload | IaaS GPUs or specialist (Modal, RunPod) | Lambda has no GPU; CaaS GPU support is thin |
| Stateful service (DB, queue, cache) | Managed service (RDS, MemoryStore, Aurora, Redis Cloud) | Don't run state on FaaS or PaaS — use the DBaaS |
| Existing Heroku app, growing | Render / Fly first; Cloud Run if data-side is on GCP | Same DX, modern infra, often cheaper |
16
Anti-Patterns
"Lambda for everything"
200 functions, mutually coupled, sharing one Postgres, on a 14-step Step Function. Move steady-state HTTP services to CaaS; keep FaaS for events.
"K8s for everything"
Three engineers, one Helm chart per service, one Crossplane install per cloud. Cloud Run / App Runner / Container Apps will give back literal months of life.
"Vercel as our backend"
Long DB connections, background jobs, big files, 6 GB function deploys, 10s timeouts. Push your backend off the frontend PaaS; use a real CaaS.
"Multi-cloud abstraction layer"
Abstracting across Lambda + Cloud Functions + Azure Functions in one repo. The lowest common denominator is awful, debugging gets worse, and you still pick a cloud per region. Don't.
"Free tier serverless will never bill"
Until it does. A runaway Lambda + S3 + DynamoDB loop can rack up thousands of dollars overnight. Set hard quotas, concurrent-execution limits, and budget alarms before launch.
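Two of those guardrails as concrete commands; the account ID and function name are placeholders:
$ aws lambda put-function-concurrency \
    --function-name my-fn \
    --reserved-concurrent-executions 10
# Hard cap on parallel executions: a runaway event source throttles instead of billing.
$ aws budgets describe-budgets --account-id 123456789012
# Verify the budget (and its alarm actions) actually exists before launch.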
"Serverless = no monitoring"
If anything, you need more instrumentation: cold starts, throttles, DLQ depth, init duration. Default CloudWatch is the floor; OpenTelemetry → Honeycomb / Datadog / Tempo is what you actually want.
17
Summary
Three takeaways
Most workloads belong on CaaS (Cloud Run / Container Apps / App Runner). Default there; deviate when there's a real reason.
FaaS is for events, not services. Use it where its shape fits — webhooks, queues, S3 triggers — not as a substitute for an API.
Kubernetes is a platform you build on top of, not a faster way to ship. Choose it deliberately, or you'll feed it for years.
"Pick the smallest unit of deploy that fits your workload — function for events, container for services, pod for platforms — and resist the temptation to use the same hammer for all three."