CLOUD SERVICE MODELS · PART 4 OF 6
    

SaaS Architecture

Multi-tenancy · B2B identity · metering · per-tenant observability

Pool · Bridge · Silo | SAML · OIDC · SCIM | Metering · Stripe | Per-tenant SLOs

🏢 Tenant → 🔐 IdP → 🌐 App → 📊 Meter → 💳 Bill

The architectural side of SaaS: the patterns that decide whether your platform scales from ten to ten thousand tenants without rewrites. Companion to Monetising & Distributing Software (the business side).

Tenants · Identity · Meter · Observe

Topics

Multi-tenancy

The four patterns: Pool, Bridge, Silo, Hybrid
Tenant isolation — at app, DB, network
Postgres Row-Level Security
Cell-based architecture for scale

B2B identity

Why B2B is different from B2C
SAML / OIDC / SCIM federation
Auth-as-a-Service (Auth0, WorkOS, Stytch, Clerk, FrontEgg)
BYO IdP, multi-IdP per tenant

Metering & billing

Metering data models
Stripe / Lago / Orb / Metronome
Subscription & usage hybrids
Dunning & revenue recognition

Operations

Per-tenant observability & SLOs
Per-tenant rate-limiting
Audit logs as a feature
Data residency & compliance shapes

What "SaaS Architecture" Means

SaaS isn't a particular tech stack — it's the discipline of running one application for many customers at once, where each customer (tenant) believes the app exists for them alone, while underneath it's shared.

The unique problems SaaS poses

Isolation — Tenant A must never see Tenant B's data, ever
Noisy-neighbour — one tenant must not be able to degrade everyone else
Tenant lifecycle — sign-up, trial, paid, churn, GDPR delete
Tenant-aware billing — usage, seats, tiers, overage, mid-cycle changes
Tenant-aware identity — every tenant brings their own IdP, ideally
Tenant-aware compliance — DPA, BAA, residency, audit
Tenant-aware ops — debug Customer X's bug without copying their data

What it shares with everything else

SaaS still uses the same primitives: VPCs, containers, queues, databases. The art is partitioning them by tenant — sometimes logically, sometimes physically.

B2C vs B2B SaaS

B2C — millions of small accounts, individuals, no procurement, low ARPU. Notion-personal, Spotify, Calm.
B2B — hundreds-to-thousands of company tenants, each with users, SSO, SCIM, audit logs, BAA, DPA. Slack, Linear, Notion-teams.
Same code; very different identity, billing, and contractual surface.

PLG vs sales-led

Product-Led Growth — frictionless self-serve signup; viral inside companies. Sales-led — pilot, security review, MSA. Most successful SaaS today is hybrid: PLG funnels feed sales for enterprise tier.

Multi-Tenancy — The Four Patterns

Pool — One DB, tenant_id Everywhere

Highest density. One running application, one (sharded) database, one deploy pipeline. Every row carries a tenant_id; every query filters on it. Used by Slack at scale, Notion at scale, Linear from day one.

The schema pattern

CREATE TABLE projects (
  id          uuid PRIMARY KEY,
  tenant_id   uuid NOT NULL REFERENCES tenants(id),
  name        text NOT NULL,
  created_at  timestamptz NOT NULL DEFAULT now()
);

-- Index every tenant-scoped query on (tenant_id, ...)
CREATE INDEX projects_by_tenant
  ON projects (tenant_id, created_at DESC);

Every query must include WHERE tenant_id = $1. Forgetting once = cross-tenant leak.

Postgres Row-Level Security

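ALTER TABLE projects ENABLE ROW LEVEL SECURITY;
-- Table owners bypass RLS unless it is forced:
ALTER TABLE projects FORCE ROW LEVEL SECURITY;

CREATE POLICY tenant_isolation ON projects
  USING (tenant_id = current_setting('app.tenant_id')::uuid);

-- In your connection pooler / middleware, inside the request's transaction.
-- SET LOCAL resets at COMMIT / ROLLBACK, so a pooled connection
-- cannot carry one tenant's setting into the next request:
BEGIN;
SET LOCAL app.tenant_id = '...uuid of current request...';
-- ... tenant-scoped queries ...
COMMIT;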

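RLS turns a leak from "always" to "almost-impossible": Postgres applies the policy predicate to every query on the table, so an unscoped query sees only the current tenant's rows, and a query issued with app.tenant_id unset fails with an error instead of returning data. Superusers and roles with BYPASSRLS skip the policy, so the app must not connect as one.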

The good

Cheapest per-tenant operating cost
One backup, one upgrade, one schema migration
Easiest to add cross-tenant analytics features later
Scales to millions of small tenants if you shard sensibly

The hard parts

One bad tenant takes everyone down — N+1 query, runaway export, accidental cartesian. Per-tenant rate-limiting is essential.
Cross-tenant leak is existential — tests, RLS, query-shape audits, tenant-bound DB users
Per-tenant features hard — feature flags scoped by tenant, not by user
Customer-specific schema customisation impossible — use JSONB columns

Sharding pool

Shard tenants across multiple Postgres clusters by hash(tenant_id) % N. Citus, Vitess, AWS Aurora Limitless, PlanetScale. Most large SaaS are sharded pool, not single-DB pool.
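
A minimal sketch of the hash(tenant_id) % N routing, assuming string tenant IDs. The names shard_for, route, and the pinned override map are hypothetical. It uses SHA-256 rather than Python's built-in hash(), which is randomised per process for strings and so would send the same tenant to different shards after a restart.

```python
import hashlib
from typing import Optional


def shard_for(tenant_id: str, num_shards: int) -> int:
    """Map a tenant to a shard index with a hash that is stable across processes."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    digest = hashlib.sha256(tenant_id.encode("utf-8")).digest()
    # Use the first 8 bytes as an unsigned integer, then take it modulo N.
    return int.from_bytes(digest[:8], "big") % num_shards


def route(tenant_id: str, num_shards: int,
          pinned: Optional[dict[str, int]] = None) -> int:
    """Hypothetical override layer: explicitly pinned tenants win over the hash."""
    if pinned and tenant_id in pinned:
        return pinned[tenant_id]
    return shard_for(tenant_id, num_shards)
```

Plain modulo hashing remaps most tenants whenever N changes, which is one reason to keep an explicit tenant-to-shard directory (the pinned map here is the smallest version of that idea) once tenants need to be moved.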

Bridge — Schema- or DB-Per-Tenant

Compromise: shared compute, isolated data. Every tenant gets their own schema (Postgres) or own database in a shared Postgres cluster. The cluster is shared; the data is fully partitioned.

Schema per tenant

CREATE SCHEMA tenant_acme;
CREATE TABLE tenant_acme.projects (id uuid PRIMARY KEY, ...);

-- Per request, in middleware:
SET search_path = tenant_acme, public;

Same DDL applied to every schema. Migrations replay across thousands of schemas.

Where bridge wins

Auditors like hearing "Tenant A's data is in its own schema"
Per-tenant backup & restore is trivial — pg_dump --schema=…
GDPR delete = drop schema. Done.
Customer-specific schema tweaks become possible (rare, but useful for legacy)

Where bridge breaks

Postgres slows down beyond ~10k schemas — system catalogs balloon
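Migrations: 5,000 schemas × 30 s each ≈ 41.7 hours when run serially
Connections: PgBouncer in transaction-pooling mode does not scope a session-level SET search_path to one client, so one tenant's search_path can leak into the next request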
Cross-tenant analytics = read each schema and union — painful

Real-world

PostHog uses pool with tenant_id everywhere
GitLab ships pool too (the open-source product is single-tenant; the SaaS is pooled)
Atlassian Confluence historically used DB-per-tenant — now blends both
Salesforce originated bridge-style metadata-driven multi-tenancy

Bridge is rarely the long-term answer

Most SaaS that start at bridge end up moving to sharded pool (small tenants) + silo (whales) — the hybrid model.

Silo — A Stack Per Tenant

Each customer gets their own application instance, their own database, often their own VPC. Highest isolation, highest cost. Used for top-tier enterprise tenants and regulated industries.

Why customers ask for silo

"Our auditor demands data isolation" — HIPAA, FedRAMP, financial
"We need our own backup & restore SLA"
"We need a different SLA" — 99.99% on top, 99.9% in pool
"BYO encryption keys (BYOK)"
"Our network team needs to peer into your VPC"

How to build it cheaply

Same container image as pool — different config, different DB
One Terraform module, parameterised per tenant
One deploy pipeline that fans out to many silos
Often runs in customer's own AWS account ("BYO cloud") via cross-account IAM

"Single-tenant SaaS"

A product category of its own: companies whose entire offering is silo-only because of regulation, such as Privatemode AI, healthcare-specific platforms, and defence-tech. They typically charge 3–10× the pooled equivalent.

BYO cloud

Customer brings the AWS / GCP account; you run your software in it via cross-account roles. HashiCorp HCP, Confluent Cloud Networking, Databricks, Snowflake on private deploys all do this.

Silo's hidden tax

Your support team can't reproduce a bug without the customer's logs. Your release cadence slows — every silo upgrade is a maintenance window. Your on-call has N customers' alerts, not one. Charge accordingly.

Tenant Isolation — Defence in Depth

"Tenant isolation" is not a checkbox; it's layered enforcement — at app, DB, network, and operations.

Application layer

Tenant ID extracted from JWT / session, never from URL or query params
Middleware that refuses to query without one
Tests that simulate cross-tenant access and assert 403/404
Lint rule: any query without WHERE tenant_id = ? is a build failure
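
One way to sketch "middleware that refuses to query without a tenant", assuming a Python service. All names here (set_tenant, require_tenant, projects_query, MissingTenantError) are hypothetical, and the query helper only builds SQL and parameters rather than talking to a real database.

```python
from contextvars import ContextVar
from typing import Optional

# Holds the tenant for the current request; unset means "no tenant bound".
_current_tenant: ContextVar[Optional[str]] = ContextVar("current_tenant", default=None)


class MissingTenantError(RuntimeError):
    """Raised when data access is attempted outside a tenant context."""


def set_tenant(tenant_id: str) -> None:
    # Called by request middleware after it has verified the JWT / session.
    _current_tenant.set(tenant_id)


def require_tenant() -> str:
    tenant_id = _current_tenant.get()
    if tenant_id is None:
        raise MissingTenantError("no tenant bound to this request")
    return tenant_id


def projects_query() -> tuple[str, tuple[str, ...]]:
    """Build a tenant-scoped query; fails before any SQL is produced if no tenant is set."""
    tenant_id = require_tenant()
    return ("SELECT id, name FROM projects WHERE tenant_id = %s", (tenant_id,))
```

Because every data-access helper goes through require_tenant(), a code path that forgot to bind a tenant fails loudly instead of running an unscoped query.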

Database layer

Postgres RLS as the floor
Per-tenant DB roles — each request's transaction runs SET LOCAL ROLE tenant_acme
Encryption at rest with per-tenant DEKs (Tenant DEK, KMS-wrapped) for compliance tiers
Audit logging on every query

Network layer (silo / hybrid)

Per-tenant VPC or namespace
Per-tenant TLS cert / SNI
PrivateLink endpoints for BYO-cloud tenants
Cell-based architecture — see slide 17

Operational layer

Engineers cannot SSH/exec into anywhere with tenant data without an audited break-glass flow
Logs are tenant-tagged but not tenant-readable to other tenants
Backups are encrypted with per-tenant or per-cell keys
"Customer X support" — copy a sanitised slice into a debug tenant; never query prod

The bug that ends the company

An ORM "convenience" that lets you skip WHERE tenant_id "just for admin queries" and ships to prod via a feature flag. RLS or perish.

B2B Identity — Why It's Different

In B2C, users sign in with a personal account (email and password, Google, Apple). In B2B, every customer brings their own identity provider (Okta, Entra, Google Workspace, OneLogin, JumpCloud) and your app must federate to all of them at once.

The B2B identity matrix

Need	Standard	What it does
Login	SAML 2.0 or OIDC	Browser SSO from customer's IdP
User provisioning	SCIM 2.0	Customer's IdP creates / updates / deactivates users in your app
Just-in-time provisioning	via SAML/OIDC claims	Create user on first login from claims
Group sync	SCIM groups	Map IdP groups to your app's roles
Audit / SOC 2	Audit logs	Customer can answer "who did what"

Login flow (SAML SSO)

# User hits app.your-saas.com/login
# We detect their email domain → tenant → IdP

GET /sso?email=alice@acme.com
  → 302 to acme.okta.com/saml/yourapp

# user logs in at Okta
# Okta POSTs SAML assertion back

POST /sso/callback
<SAMLResponse>...</SAMLResponse>
  → we verify signature against Okta cert
  → extract email, name, groups
  → create / update user, set tenant
  → set session cookie

SCIM (provisioning) flow

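# Okta admin assigns user to your app
POST /scim/v2/Users HTTP/1.1
Authorization: Bearer <tenant-scoped-token>
Content-Type: application/scim+json

{
  "schemas":["urn:ietf:params:scim:schemas:core:2.0:User"],
  "userName":"alice@acme.com",
  "name":{"givenName":"Alice","familyName":"Brown"},
  "emails":[{"value":"alice@acme.com","primary":true}],
  "active": true
}

# Later: Alice leaves Acme; Okta deprovisions
PATCH /scim/v2/Users/<id> HTTP/1.1
Authorization: Bearer <tenant-scoped-token>
Content-Type: application/scim+json

{
  "schemas":["urn:ietf:params:scim:api:messages:2.0:PatchOp"],
  "Operations":[{"op":"replace","path":"active","value":false}]
}
# → your app immediately invalidates her sessions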

Roles & permissions

Role-based (RBAC) — Owner, Admin, Member, Guest
Attribute-based (ABAC) — finer per-resource (project, region)
ReBAC (relationship) — Zanzibar / OpenFGA / SpiceDB; "Alice can edit this doc because she's a member of the team that owns it"

"Email login" is not enough for enterprise

From the moment you sell to companies of more than ~50 people, you'll hear: "no SSO no sale". Build SAML/OIDC and SCIM before your first $50k contract.

Auth-as-a-Service — Don't Build SSO Yourself

Building SAML, OIDC, SCIM, MFA, magic links, password reset, brute-force protection, audit logs is months of work and a permanent maintenance burden. Use a platform.

Provider	Best fit	SAML/OIDC	SCIM	Pricing notes
Auth0 (Okta)	Mature B2C + B2B	✔	extra fee	per-MAU; cheap small, expensive at scale
WorkOS	B2B SaaS, "SSO as one API"	✔ (every IdP normalised)	✔	flat $125/connection/month, first 1M users free
Stytch	Passwordless-first, dev-native	✔	✔ (B2B SDK)	per-MAU; B2B SDK is the standout
Clerk	Frontend-led, React-shaped	✔ (Pro / Enterprise)	✔ (Enterprise)	per-MAU; great UI components out of the box
FrontEgg	B2B with built-in admin portal	✔	✔	flat by tier
AWS Cognito	If already deeply on AWS	✔ (limits)	✘	cheap; UX is the catch
Microsoft Entra External ID	Customer-facing apps for Azure shops	✔	✔	per-MAU, generous free tier
Keycloak / ZITADEL / Authentik	Self-hosted, EU-resident, OSS	✔	varies	free + ops cost; see OAuth for MCP

Pick by question

"I need SSO & SCIM tomorrow" → WorkOS
"I'm React/Next, want hosted UI" → Clerk
"I want passwordless / OTP / magic link" → Stytch
"I'm already in Auth0" → stay; migrate only if MAUs blow up
"I want EU sovereignty / OSS" → ZITADEL or Keycloak

Companion deck

The full provider tour, with self-hosted options and the OAuth specification trail, is in the Introduction to OAuth and OAuth for MCP decks.

Metering — The Data Model

SaaS billing is a write-mostly time-series workload. Every metered action emits an event; the billing engine aggregates and applies pricing rules.

A minimal usage event

{
  "event_id":     "evt_01HRX2...",
  "tenant_id":    "tnt_acme",
  "user_id":      "usr_alice",
  "metric":       "tokens_used",
  "value":        1284,
  "timestamp":    "2026-05-04T10:33:21Z",
  "idempotency":  "req_01HRX2...",
  "metadata": { "model":"gpt-5", "feature":"chat" }
}

Idempotency key prevents double-billing on retry. Stored append-only — never updated or deleted (audit, replay).
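
A small in-memory sketch of idempotent, append-only ingestion for events shaped like the one above. UsageLedger and its methods are hypothetical names; a real store would more likely enforce the same rule with a unique constraint on (tenant_id, idempotency key).

```python
from dataclasses import dataclass, field


@dataclass
class UsageLedger:
    """In-memory stand-in for an append-only usage store."""

    events: list[dict] = field(default_factory=list)
    _seen: set[tuple[str, str]] = field(default_factory=set, init=False, repr=False)

    def record(self, event: dict) -> bool:
        """Append the event once; return False if its key was already recorded."""
        key = (event["tenant_id"], event["idempotency"])
        if key in self._seen:
            return False  # a retry of an event we already stored
        self._seen.add(key)
        self.events.append(dict(event))  # copy so later caller mutations don't alter the log
        return True

    def total(self, tenant_id: str, metric: str) -> int:
        """Sum recorded values for one tenant and metric."""
        return sum(
            e["value"]
            for e in self.events
            if e["tenant_id"] == tenant_id and e["metric"] == metric
        )
```

Sending the same event twice, as a client retry would, leaves the total unchanged because the second record() call is a no-op.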

The pricing layer (separate)

plan = "growth"  # tier
rules = [
    {"metric": "seats", "flat": 12.00},  # per-seat price
    {
        "metric": "tokens_used",
        "tiered": [
            {"up_to": 1_000_000, "unit": 0.0},        # included
            {"up_to": 10_000_000, "unit": 0.000004},  # overage
            {"up_to": None, "unit": 0.0000035},       # no upper bound
        ],
    },
]
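
A sketch of how the tiered rule could be evaluated, under two assumptions that the slide does not state: the tiers are graduated (each unit is priced at the rate of the tier it falls in, rather than the whole quantity at the final tier's rate), and up_to is a cumulative upper bound. graduated_charge and TOKEN_TIERS are hypothetical names; Decimal avoids float rounding on money.

```python
from decimal import Decimal
from typing import Optional

# (cumulative upper bound, or None for "no limit"; price per unit)
Tier = tuple[Optional[int], Decimal]

TOKEN_TIERS: list[Tier] = [
    (1_000_000, Decimal("0")),          # included
    (10_000_000, Decimal("0.000004")),  # overage
    (None, Decimal("0.0000035")),       # everything beyond 10M
]


def graduated_charge(quantity: int, tiers: list[Tier]) -> Decimal:
    """Price each slice of usage at the rate of the tier it falls in."""
    if quantity < 0:
        raise ValueError("quantity must be non-negative")
    total = Decimal("0")
    lower = 0  # units already priced by earlier tiers
    for up_to, unit_price in tiers:
        if quantity <= lower:
            break
        upper = quantity if up_to is None else min(quantity, up_to)
        total += (upper - lower) * unit_price
        lower = upper
    return total
```

Under these assumptions, 12,000,000 tokens cost 43 currency units: the first 1M are free, the next 9M cost 36, and the last 2M cost 7.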

Build vs buy — the metering platform

Option	What it gives you
Stripe Billing + Meters	Subscriptions, prorations, invoicing; usage meters since 2024
Lago (OSS)	Self-hosted metering + pricing engine; SQL-friendly
Orb	Spec-the-pricing-as-code, reconciliation-first
Metronome	Used by OpenAI, Anthropic, Anysphere — usage at AIaaS scale
m3ter	UK-based, enterprise-billing depth

Three ways to charge

Per-seat — Slack, Notion. Predictable.
Per-usage — Stripe, Twilio, OpenAI. Scales with value.
Hybrid — Vercel ($20/seat + bandwidth), most modern SaaS

Reconciliation is the hard problem

Did our metering stream record everything Stripe billed for? For SOC 1 / SOX-aligned customers, you need a daily reconciliation report and a way to credit-note discrepancies. Lago / Orb / Metronome do this; rolling your own usually skips it and breaks at audit.
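
The core of a reconciliation report can be sketched as a per-tenant comparison of two totals. The function name and the input shape (tenant ID mapped to a quantity for one metric and one billing period) are assumptions; fetching those totals from the metering store and the billing provider is out of scope here.

```python
def reconcile(metered: dict[str, int], billed: dict[str, int]) -> dict[str, int]:
    """Return {tenant_id: metered - billed} for every tenant where the totals disagree.

    A positive delta means usage was metered but not billed (under-billing);
    a negative delta means the tenant was billed for more than was metered,
    which is the case that needs a credit note.
    """
    diffs: dict[str, int] = {}
    for tenant_id in sorted(metered.keys() | billed.keys()):
        delta = metered.get(tenant_id, 0) - billed.get(tenant_id, 0)
        if delta != 0:
            diffs[tenant_id] = delta
    return diffs
```

An empty result is the evidence an auditor wants to see each day; anything else is a list of tenants to investigate.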

Per-Tenant Observability & SLOs

"Our p99 is 250 ms" is meaningless to a SaaS — what matters is "Customer Acme's p99 is 250 ms". Build observability that sees tenants as first-class.

The four golden signals — per tenant

Latency by tenant — find noisy / starved tenants early
Traffic by tenant — detect ramp / leak / abuse
Errors by tenant — separate "their bug" from "our bug"
Saturation by tenant resource — DB connections, rate-limit tokens, queue depth

Make tenant_id a span attribute

// OpenTelemetry, every span:
span.setAttribute("tenant.id", req.tenant.id);
span.setAttribute("tenant.tier", req.tenant.tier);
span.setAttribute("tenant.region", req.tenant.region);

// Then in Honeycomb / Tempo / Datadog:
//   GROUP BY tenant.id  → per-tenant dashboards

Tier-aware SLOs

Tier	p99 latency	Availability
Free	1.5 s	99.5%
Growth	500 ms	99.9%
Enterprise	250 ms	99.95%
Enterprise+ (silo)	250 ms	99.99%

Each tier is monitored, alerted on, and reported to that tier's customers (status page).
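
To make the availability column concrete, an SLO can be converted into an error budget: the downtime a tier may accumulate in a window before the target is missed. The function name and the 30-day default window are assumptions.

```python
def error_budget_minutes(availability_pct: float, window_days: int = 30) -> float:
    """Minutes of allowed downtime in the window for a given availability target."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability_pct must be between 0 and 100")
    window_minutes = window_days * 24 * 60
    return window_minutes * (1 - availability_pct / 100)
```

Over 30 days the four tiers above work out to roughly 216 minutes (99.5%), 43.2 minutes (99.9%), 21.6 minutes (99.95%), and 4.32 minutes (99.99%), which shows why the silo tier needs a different operating model and not just a different number.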

Per-tenant rate-limits

Token bucket per (tenant, route) — Redis or in-memory
Defaults by tier; override per tenant for noisy ones or allow-listed partners
Headers — X-RateLimit-Limit, X-RateLimit-Remaining, Retry-After
Soft limits (warn) before hard limits (429)
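
A minimal in-memory sketch of a token bucket keyed by (tenant, route). TenantRateLimiter and its injectable clock are hypothetical; a multi-process deployment would keep the bucket state in a shared store such as Redis, as the list suggests, and would add per-tier defaults on top.

```python
import time
from typing import Callable


class TenantRateLimiter:
    """Token bucket per (tenant_id, route); each request costs one token."""

    def __init__(self, capacity: float, refill_per_sec: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self._clock = clock
        # (tenant_id, route) -> (tokens remaining, time of last update)
        self._buckets: dict[tuple[str, str], tuple[float, float]] = {}

    def allow(self, tenant_id: str, route: str) -> bool:
        """Return True if the request may proceed, False if it should get a 429."""
        now = self._clock()
        tokens, last = self._buckets.get((tenant_id, route), (self.capacity, now))
        # Refill for the time elapsed since the last request, capped at capacity.
        tokens = min(self.capacity, tokens + (now - last) * self.refill_per_sec)
        allowed = tokens >= 1.0
        if allowed:
            tokens -= 1.0
        self._buckets[(tenant_id, route)] = (tokens, now)
        return allowed
```

Because buckets are keyed by tenant, one tenant exhausting its budget on a route has no effect on any other tenant's bucket.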

Don't let one tenant cost you the cluster

Pool architecture's biggest existential risk. Per-tenant rate-limit + per-tenant query budget + circuit breakers on long-running ops.

Data Residency & Multi-Region SaaS

European customers want their data in EU regions, US-Federal customers want US-Gov, Asia customers want APAC. The earlier you think about residency, the cheaper it is to add.

Three architectures

Single-region — start here for B2C and small B2B
Multi-region active-passive — primary region serves; DR site warm. Easy.
Multi-region per-tenant — tenant pinned to a home region; data, compute, even auth stay there. The "real" multi-region SaaS.

Tenant pinning

At signup, ask: where do you want your data? The answer determines:

Which DB cluster the tenant lives in
Which compute region serves their requests
Where their backups are stored
Where their LLM calls go (deck 06)

Practical patterns

Region-routing at the edge — Cloudflare / API Gateway looks up tenant → region → forwards
Per-region cell — each region is an independent SaaS deployment
Globally-replicated control plane — billing, identity, settings; regional data plane — tenant data
Per-tenant encryption keys in regional KMS

Compliance side

GDPR + DPF for EU↔US transfers (post-Schrems II)
UK adequacy (the EU decisions carry a sunset clause: originally June 2025, since extended)
FedRAMP Moderate / High for US-Federal
Sovereign clouds (deck 01) for the strictest residency

The metadata leak

"Customer data is in EU" but billing IDs, audit logs, support chats route through the US. Auditors will catch this. Either keep all of it regional, or document and disclose what doesn't.

Audit Logs as a Feature

For B2B customers, audit logs are not a bonus — they are part of the product. SOC 2 and ISO 27001 customers will demand them; without them, you fail the security review.

What to log

Who — user_id, IP, user agent, session
What — action verb (create, update, delete, share, login)
On what — resource type + ID + before/after diff
When — server timestamp (UTC, ISO 8601)
Where — region, edge POP
How — auth method (password / SAML / API key)
Why? — request_id, parent operation

A useful audit row

{
  "id":"aud_01HRX...",
  "tenant_id":"tnt_acme",
  "actor":{"type":"user","id":"usr_alice","ip":"203.0.113.5","ua":"..."},
  "action":"document.share",
  "resource":{"type":"document","id":"doc_42","name":"Q1 plan"},
  "context":{"to":"bob@vendor.com","permission":"comment"},
  "session_id":"ses_...",
  "request_id":"req_01HRX...",
  "ts":"2026-05-04T10:33:21Z"
}

How to expose them

UI in the admin area — searchable, filterable, paginated
Export CSV/JSON
Streaming — webhook / Splunk / Sumo / Datadog HTTP source
SIEM integration — Microsoft Sentinel, Splunk Enterprise Security

Build vs buy

Postgres / OpenSearch + a UI you build — fine for ≤ 10M events/month
ClickHouse — column store, scales to billions
WorkOS Audit Logs — drop-in API, customer-facing UI included
Datadog Audit Trail — if already on DD
Cribl Stream / Vector — for the streaming side

Tamper-evidence

For HIPAA / FedRAMP / SOX, auditors want logs that can't be silently rewritten. Append-only storage + hash-chain (each row hashes the previous) + immutability (S3 Object Lock / Glacier Vault Lock).
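
The hash-chain part can be sketched in a few lines. The row layout (entry, prev_hash, hash), the all-zero genesis value, and the function names are assumptions; the point is only that each row's hash covers the previous row's hash, so editing or removing any earlier row breaks verification from that row onward.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first row


def entry_hash(prev_hash: str, entry: dict) -> str:
    """Hash the previous row's hash together with a canonical form of this entry."""
    payload = json.dumps(entry, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append(chain: list[dict], entry: dict) -> None:
    """Append an audit entry, linking it to the current tail of the chain."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"entry": entry, "prev_hash": prev, "hash": entry_hash(prev, entry)})


def verify(chain: list[dict]) -> bool:
    """Recompute every link; False if any row was altered, reordered, or removed mid-chain."""
    prev = GENESIS
    for row in chain:
        if row["prev_hash"] != prev or row["hash"] != entry_hash(prev, row["entry"]):
            return False
        prev = row["hash"]
    return True
```

A chain alone does not detect truncation of the newest rows or a full rewrite by someone who can recompute every hash, which is why it is paired with immutable storage such as the Object Lock options named above.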

Compliance Shapes — SaaS-Specific

Compliance is what distinguishes "SaaS for engineers" from "SaaS for the regulated world". Each framework asks the same things in slightly different language. (Deck 05 covers the cloud-side controls.)

Framework	Trigger	What customers will ask for	Year-1 cost (rough)
SOC 2 Type II	B2B mid-market, $50k+ contracts	Audit report, security questionnaire	$30–80k (auditor) + $30k tooling (Vanta/Drata/Secureframe)
ISO 27001	European customers, government	Certificate, SoA, ISMS	$25–60k
HIPAA (BAA)	Health data — patients, devices, providers	BAA signed, encryption, audit logs, isolation	$10–30k tooling, BAA from cloud provider
GDPR / DPF	Any EU personal data	DPA, sub-processor list, data residency option, deletion / export endpoints	Mostly engineering work, no auditor
PCI DSS	Storing card numbers	Tokenisation (Stripe handles for most)	Avoid — let Stripe / Adyen handle
FedRAMP	US Federal customers	FedRAMP Moderate / High, GovCloud / Azure Gov	$1M+ — only if there's revenue waiting
EU AI Act	Selling AI features in EU	Risk classification, transparency, model cards	Mostly process; ramp through 2025–2027

Compliance automation tools

Vanta · Drata · Secureframe · Tugboat Logic · Anecdotes
Ingest from your cloud / GitHub / IdP; map to controls; generate evidence
SOC 2 Type II in 6–9 months instead of 18+

When to start

Earlier than you think. Once your sales pipeline has a $50k deal, security review will demand it. Vanta + 6 months ≪ blowing the deal.

Tenant Lifecycle

From signup to GDPR-delete, every tenant moves through a state machine. Build it explicitly; don't let it evolve as a tangle of active booleans.

Trial → active

Time-bound, feature-gated. Convert via Stripe trial-end + 24h grace. Don't auto-charge silently: send a confirmation email first.

Past-due / paused

Read-only after 7 days; full pause after 30. Dunning emails (Stripe / Lago / Recurly handle this).

Deleted

GDPR right-to-erasure: 30-day soft-delete (recoverable), then hard delete (overwritten in DB, removed from backups via lifecycle, removed from search indexes). Document the SLA.
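
An explicit state machine for the lifecycle above might look like the sketch below. The state names and the exact transition table are assumptions drawn from this slide (for example, that a soft-deleted tenant is restored to active); adjust them to the real product rules.

```python
from enum import Enum


class TenantState(Enum):
    TRIAL = "trial"
    ACTIVE = "active"
    PAST_DUE = "past_due"          # read-only after the grace period
    PAUSED = "paused"
    SOFT_DELETED = "soft_deleted"  # recoverable during the retention window
    HARD_DELETED = "hard_deleted"  # terminal


S = TenantState
ALLOWED_TRANSITIONS: dict[TenantState, frozenset[TenantState]] = {
    S.TRIAL: frozenset({S.ACTIVE, S.SOFT_DELETED}),
    S.ACTIVE: frozenset({S.PAST_DUE, S.SOFT_DELETED}),
    S.PAST_DUE: frozenset({S.ACTIVE, S.PAUSED, S.SOFT_DELETED}),
    S.PAUSED: frozenset({S.ACTIVE, S.SOFT_DELETED}),
    S.SOFT_DELETED: frozenset({S.ACTIVE, S.HARD_DELETED}),
    S.HARD_DELETED: frozenset(),
}


class InvalidTransition(ValueError):
    """Raised when a lifecycle change is not in the transition table."""


def transition(current: TenantState, target: TenantState) -> TenantState:
    """Return the new state, or raise if the move is not explicitly allowed."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise InvalidTransition(f"{current.value} -> {target.value} is not allowed")
    return target
```

Keeping the table in one place means billing webhooks, admin tools, and deletion jobs all go through the same check instead of each flipping its own boolean.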

Scaling Up — Cell-Based Architecture

Past a certain size, "one big pool" stops scaling. The pattern: divide tenants into cells, each a fully independent slice of the platform. Used by Slack, Stripe, AWS itself.

What a cell is

An independent deployment of the SaaS — its own DB, queue, cache, services
Hosts N tenants, sized to a known capacity (e.g. 1000 tenants)
Failure of one cell ≠ failure of the platform — blast radius is one cell
Routing layer (cell-router) maps tenant → cell
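
A toy version of the cell-router, assuming an in-memory directory. CellRouter, CellFullError, and the "most free capacity" placement rule are all hypothetical; a real router would persist the mapping and cache it at the edge.

```python
from collections import Counter
from typing import Optional


class CellFullError(RuntimeError):
    """Raised when the chosen cell has no capacity left."""


class CellRouter:
    """Directory that maps each tenant to exactly one cell."""

    def __init__(self, capacities: dict[str, int]) -> None:
        self._capacities = dict(capacities)        # cell_id -> max tenants
        self._tenant_to_cell: dict[str, str] = {}  # tenant_id -> cell_id

    def assign(self, tenant_id: str, cell_id: Optional[str] = None) -> str:
        """Place a tenant; an explicit cell_id pins it (for example, a whale's own cell)."""
        if tenant_id in self._tenant_to_cell:
            return self._tenant_to_cell[tenant_id]  # assignments are sticky
        load = Counter(self._tenant_to_cell.values())
        if cell_id is None:
            # Default placement: the cell with the most free slots.
            cell_id = max(self._capacities, key=lambda c: self._capacities[c] - load[c])
        if load[cell_id] >= self._capacities[cell_id]:
            raise CellFullError(cell_id)
        self._tenant_to_cell[tenant_id] = cell_id
        return cell_id

    def cell_for(self, tenant_id: str) -> str:
        """Look up the cell that serves this tenant; KeyError if it was never assigned."""
        return self._tenant_to_cell[tenant_id]
```

The request path only ever calls cell_for(); placement decisions happen once at signup or during a planned migration.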

Why it works

Each cell is small enough to be reasoned about, deployed atomically, debugged
You can deploy to one cell at a time, "wave" deploys
You can put the noisy whales in their own cells (silo + cell convergence)
You can run cells in different regions for residency

Real-world examples

Slack — workspace ↔ channel-server, sharded
Stripe — entire stack replicated per cell, US-East has hundreds
Salesforce "PODs" — the original cell pattern (early 2000s)
AWS internal services — cellular by default

When to start

Don't build cells on day one. Build pool, with a tenant_id everywhere and a routing-aware data layer. When the database is bumping limits, slice by cell. The routing abstraction is the first thing to build right.

Cross-cell features hurt

If tenants need to talk to other tenants (Slack Connect, Notion guests), the routing-via-tenant model breaks. Solve at the application layer with explicit cross-cell contracts.

SaaS Anti-Patterns

"tenant_id is optional in some queries"

The first cross-tenant leak is a career event. Lint, RLS, code review every query. No exceptions for "internal-only" admin pages.

"DIY SAML"

SAML is signed XML, with edge cases dating to 2005. Signature-wrapping CVEs have turned up in many SAML and XML-signature libraries over the years. Use WorkOS/Auth0; don't roll it.

"We'll add metering before launch"

You won't. Metering plumbing should be live before billing rules; events flowing six months early is fine. Wiring it after launch into a working app means missing a quarter of usage.

"Audit logs go to the same DB as the app"

A performance footgun now (write amplification on the app DB) and a tamper-resistance footgun later: whoever can write to the app DB can rewrite or delete the log. Ship to a separate store from day one.

"One DB shared across customers, no cross-tenant rate limit"

Customer A's CSV export takes down customer B's logins. Per-tenant rate limits on every expensive operation.

"Build SOC 2 ourselves"

You can. You won't ship product. Vanta / Drata / Secureframe — pick one, $30k/yr, 70% time saved.

"Multi-region is just a checkbox"

Multi-region is an architecture, not a feature. Latency, replication lag, partition tolerance, conflict resolution — all surface immediately. Plan for it from the data model upward.

"GDPR delete by setting deleted_at"

The data is still in the DB, the backups, the analytics warehouse, the search index, the audit log, the LLM context cache… Build delete as a fan-out job that touches every stage.

Summary

Three takeaways

SaaS architecture is multi-tenancy + B2B identity + metering + per-tenant ops. Each is its own discipline; treat them as load-bearing from day one.
Build pool, expect to slice into cells, leave a silo escape hatch. The hybrid model is where every successful B2B SaaS ends up.
Buy the auth platform, buy the metering platform, buy compliance automation. The differentiation isn't there; the engineering cost is.

Next in the series

05 Cloud Security — the cross-cutting concerns above
06 LLM-as-a-Service — managed AI for SaaS features

Companion decks

Monetising & Distributing Software — the SaaS business side
Introduction to OAuth — identity foundations
OAuth for MCP — the full provider tour
Web Authentication — sessions, JWTs, MFA
Databases — the storage layer behind every SaaS

One sentence

"SaaS is the discipline of running one product for many strangers — every architectural choice you make ought to make that easier, not harder, six years from now."
