Where authorisation happens before the request reaches your service code — token validation, downscoping, rate-limit-as-policy, identity propagation. This deck covers north-south traffic; Workload Identity & Service-Mesh AuthZ covers the east-west complement.
Authorisation can be enforced at three places along a request: the edge (before the request enters your network), the gateway / mesh (between services), and in the application. Most production systems do all three; each layer covers what the others can't.
- **Edge** — coarse-grained checks, rate-limiting, attack mitigation, token-format validation, geofencing: anything that should drop the request before it costs your origin a CPU cycle.
- **Gateway / mesh** — workload-to-workload identity (mTLS), service-level authz, per-route policy bound to namespaces.
- **Application** — per-object decisions ("can Alice edit doc 42?"). The edge can't do this — it doesn't know your data model.
Three token-validation strategies:
- **Offline (JWT)** — verify the signature and `exp` locally; keep `exp` short (5–15 min) and rely on refresh-token rotation at the IdP.
- **Online (introspection)** — call the IdP's `/introspect` endpoint with the token; the IdP returns `active=true/false` plus claims.
- **Hybrid** — best of both: 99 % of requests get the offline fast path; high-value ones get the online check; revocation propagates within seconds.
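A minimal sketch of the hybrid strategy, assuming a pre-verified claims dict and a stubbed introspection call (`HIGH_VALUE_ROUTES`, `introspect`, and the revoked-token set are all hypothetical; real code would first verify the JWT signature against the issuer's JWKS):

```python
import time

HIGH_VALUE_ROUTES = {"/v1/payments", "/v1/transfers"}   # assumption for the sketch

def introspect(token_id):
    # Stand-in for a POST to the IdP's /introspect endpoint.
    revoked = {"tok_revoked"}
    return {"active": token_id not in revoked}

def authorize(route, claims, now=None):
    now = now or time.time()
    if claims["exp"] <= now:                # offline check: expiry
        return False, "expired"
    if route in HIGH_VALUE_ROUTES:          # online check only where it matters
        if not introspect(claims["jti"])["active"]:
            return False, "revoked"
    return True, "ok"
```

The online path fires only for the small set of routes where revocation latency is unacceptable; everything else stays on the fast offline path.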
Edge logs are usually sampled into central observability. Anyone with read access to those logs becomes the user. Strip Authorization at the proxy; log only sub + a hash if you need correlation.
Cache 5–60 minutes; refresh aggressively on unknown kid; fall back to the previous JWKS for one rotation cycle to survive issuer key roll.
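Those three caching rules can be sketched as a small class; `fetch` is a hypothetical callable returning `{kid: key}` from the JWKS endpoint:

```python
import time

class JwksCache:
    def __init__(self, fetch, ttl=600):          # fetch() returns {kid: key}
        self.fetch, self.ttl = fetch, ttl
        self.current, self.previous = {}, {}
        self.fetched_at = 0.0

    def _refresh(self):
        self.previous = self.current             # keep one generation back
        self.current = self.fetch()
        self.fetched_at = time.time()

    def key_for(self, kid):
        if time.time() - self.fetched_at > self.ttl:
            self._refresh()                      # normal TTL expiry
        if kid in self.current:
            return self.current[kid]
        if kid in self.previous:
            return self.previous[kid]            # survive issuer key roll
        self._refresh()                          # unknown kid: aggressive refetch
        return self.current.get(kid)
```

Tokens signed with the previous key keep validating for one rotation cycle, while a genuinely unknown `kid` triggers an immediate refetch.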
Envoy is the data plane underneath most modern edges and meshes (Istio, Cilium in service-mesh mode, cloud-native API gateways, AWS App Mesh). Its ext_authz filter is the canonical way to delegate authz to a separate service.
http_filters:
- name: envoy.filters.http.ext_authz
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.http.ext_authz.v3.ExtAuthz
    transport_api_version: V3
    grpc_service:
      envoy_grpc: { cluster_name: ext_authz }
    timeout: 0.25s
    failure_mode_allow: false        # fail-closed
    include_peer_certificate: true
    with_request_body:
      max_request_bytes: 8192
      allow_partial_message: true
The authz service implements Envoy's gRPC Check() API; OPA, for example, ships this as its envoy_ext_authz_grpc plugin. If the authz service is unreachable, does Envoy let the request through (failure_mode_allow: true) or block it? The "right" answer depends on whether you'd rather have an outage or a security incident.
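Transport aside, the decision an ext_authz server makes is a pure function from request attributes to allow/deny plus headers to inject toward the upstream. A sketch of that shape, with the gRPC plumbing and Envoy protos omitted and all names hypothetical:

```python
def decode_token(tok):
    # Stand-in for real JWT verification against the issuer's JWKS.
    fake = {"tok_alice": {"sub": "alice", "roles": ["user"]}}
    return fake.get(tok)

def check(headers, path):
    auth = headers.get("authorization", "")
    if not auth.startswith("Bearer "):
        return {"allowed": False, "status": 401}
    claims = decode_token(auth.removeprefix("Bearer "))
    if claims is None:
        return {"allowed": False, "status": 401}
    if path.startswith("/admin") and "admin" not in claims["roles"]:
        return {"allowed": False, "status": 403}
    return {"allowed": True,
            "inject": {"x-user-sub": claims["sub"]}}   # verified identity only
```

On allow, the injected headers are what Envoy forwards to the upstream; on deny, Envoy synthesises the 401/403 without ever touching your origin.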
| Type | Use |
|---|---|
| IAM | Caller signs request with SigV4 (other AWS services / SDKs). |
| JWT (HTTP API) | API Gateway validates JWTs against an OIDC issuer's JWKS. Sub-ms. |
| Lambda authorizer (REQUEST) | Custom Lambda decides; can read headers, query strings, source IP. |
| Lambda authorizer (TOKEN) | Same as REQUEST but only sees the Authorization header. |
| Cognito user-pool | Built-in for tenants who use Cognito as the IdP. |
def handler(event, context):
    token = event['headers']['authorization'].split()[1]
    user = verify_jwt(token, jwks_cache)   # verify_jwt / jwks_cache supplied elsewhere
    return {
        'principalId': user['sub'],
        'policyDocument': {
            'Version': '2012-10-17',
            'Statement': [{
                'Effect': 'Allow',
                'Action': 'execute-api:Invoke',
                'Resource': event['methodArn']
            }]
        },
        'context': {  # passed to backend as headers
            'sub': user['sub'],
            'tenant_id': user['tenant_id'],
            'scope': ' '.join(user['scope'])
        }
    }
REST API is older, more features (request validation, mapping templates), more expensive. HTTP API is newer, leaner, much cheaper, has the built-in JWT authorizer. Pick HTTP API unless you genuinely need REST features.
Cloudflare Access authenticates the user against your IdP and hands your origin a signed JWT carrying the identity claims (email, sub, groups).
// the JWT is in the CF-Access-Jwt-Assertion header
const token = req.headers['cf-access-jwt-assertion'];
const claims = await verifyJwt(token,
  'https://<team>.cloudflareaccess.com/cdn-cgi/access/certs');
if (claims.aud !== MY_AUD) throw new Error('wrong audience');
req.user = { sub: claims.sub, email: claims.email };
export default {
  async fetch(req, env) {
    const t = req.headers.get('authorization')?.split(' ')[1];
    const claims = await verifyJwt(t, env.JWKS_URL);
    if (!claims || !claims.scope.includes('api:read'))
      return new Response('forbidden', { status: 403 });
    const headers = new Headers(req.headers);
    headers.set('x-user-sub', claims.sub);
    headers.set('x-tenant', claims.tenant_id);
    headers.delete('authorization'); // strip before origin
    return fetch(req.url, { method: req.method, body: req.body, headers });
  }
}
| Gateway | Engine | Auth plugins | Sweet spot |
|---|---|---|---|
| Kong | Nginx + Lua / Wasm; OSS & commercial | JWT, OAuth 2.0 introspection, OIDC plugin, mTLS, key-auth, ACL, OPA plugin | K8s-native (Kong Ingress Controller); strong plugin ecosystem; popular self-host |
| Apigee | Google-managed | OAuth, OIDC, SAML, API key, custom JS / Java callouts | Enterprise-grade; deep analytics; expensive |
| Tyk | Go; OSS & cloud | JWT, OAuth, OIDC, key-auth, mTLS, custom middleware | Smaller-team alternative to Kong; good for self-host |
| Krakend | Go; OSS & commercial | JWT, OAuth introspection, mTLS, OPA | API aggregation (one inbound request fans out to many backends) more than authz; pairs well with stateless JWT |
| Envoy + Istio Gateway | Envoy | JWT (RequestAuthentication), ext_authz, mTLS, RBAC | If you're already on Istio mesh — same primitives at the edge |
| AWS / GCP / Azure managed | Cloud-managed | Native IdP integrations | Lowest-ops if you're cloud-native |
routes:
- name: payments
  paths: ["/v1/payments"]
  methods: ["POST"]
  plugins:
  - name: jwt
    config: { secret_is_base64: false }
  - name: acl
    config: { allow: ["payments-writer"] }
  - name: rate-limiting
    config: { minute: 60, policy: redis, redis_host: rl.svc }
Most rate-limit configurations are de-facto authorisation policies — "this caller may do at most this many requests per minute, per hour, per day". Treating them as such (and enforcing at the edge) is one of the highest-leverage pieces of authz you can ship.
Key limits on verified identity wherever possible — the JWT sub is signed; an IP is not.
# per-tenant ceiling
- key: req.headers["x-tenant-id"]
  limit: 10000/min
# per-user inside a tenant
- key: req.jwt.claims.sub
  limit: 600/min
# per-route ceiling
- key: req.method + req.path
  limit: 5000/min
Run all three; the most-restrictive wins.
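"Most-restrictive wins" can be sketched as a sliding-window limiter that checks every key before recording a hit (a real edge would back this with Redis or similar; names are hypothetical):

```python
import time
from collections import defaultdict, deque

class MultiKeyLimiter:
    def __init__(self, limits):                  # {name: (key_fn, max_hits, window_s)}
        self.limits = limits
        self.hits = defaultdict(deque)

    def allow(self, req, now=None):
        now = now or time.time()
        # Check every limit first: the first exhausted one denies the request.
        for name, (key_fn, max_hits, window) in self.limits.items():
            bucket = self.hits[(name, key_fn(req))]
            while bucket and bucket[0] <= now - window:
                bucket.popleft()                 # drop hits outside the window
            if len(bucket) >= max_hits:
                return False, name
        # Only record the hit once all limits passed.
        for name, (key_fn, _, _) in self.limits.items():
            self.hits[(name, key_fn(req))].append(now)
        return True, None
```

Returning the name of the exhausted limit is what lets the 429 body say *which* ceiling was hit.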
- Return the standard headers on every response: X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset.
- On 429, add a Retry-After header plus a JSON body explaining which limit was hit (per-tenant? per-user? per-route?).
- Rate-limit per-IP only when your traffic is mostly unauthenticated; otherwise a small tenant gets crushed by a noisy neighbour on the same NAT.
# nginx
proxy_set_header X-Client-DN          $ssl_client_s_dn;
proxy_set_header X-Client-Fingerprint $ssl_client_fingerprint;
proxy_set_header X-Client-Verify      $ssl_client_verify;
proxy_set_header X-Client-Cert        $ssl_client_escaped_cert;
# Envoy with forward_client_cert_details: SANITIZE_SET
http_connection_manager:
  forward_client_cert_details: SANITIZE_SET
  set_current_client_cert_details:
    subject: true
    uri: true      # SPIFFE URI SANs
    cert: true
If a client sends X-Client-DN directly and the edge doesn't strip it, the upstream is happily trusting the attacker. Envoy's SANITIZE_SET mode does the stripping automatically; nginx requires explicit proxy_set_header overriding. If you're already issuing SPIFFE SVIDs internally (see the Workload Identity deck), an edge that inspects the SAN URI of the client cert can give upstreams a SPIFFE ID directly — same identity model end-to-end.
Same idea, different mechanism. Edge validates the DPoP proof header; replaces the user's bearer with a server-to-server token (token exchange) that proves "request was bound to a key the user holds".
The user's token carries broad scopes (billing:read billing:write payments:refund); the billing route needs billing:read only. The edge calls the IdP's /token endpoint with grant_type=token-exchange (RFC 8693), requesting a narrower-scope subject token:
POST /token HTTP/1.1
Host: idp.acme.com
Authorization: Basic base64(edge_client:secret)
Content-Type: application/x-www-form-urlencoded

grant_type=urn:ietf:params:oauth:grant-type:token-exchange&
subject_token=<user's incoming access token>&
subject_token_type=urn:ietf:params:oauth:token-type:access_token&
audience=https://api-internal/billing-svc&
scope=billing:read
Result: a token usable only at billing-svc for billing:read, with the original user's sub preserved and the edge as act.
A token-exchange call per request adds a ~10 ms IdP round-trip. Cache the downscoped token by (user_sub, target_audience, scope) for the original token's TTL minus a safety margin.
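That caching rule is a few lines; `exchange` stands in for the real IdP call and every name here is hypothetical:

```python
import time

SAFETY_MARGIN = 30   # seconds before real expiry at which we stop serving the cache

class DownscopeCache:
    def __init__(self, exchange):    # exchange(sub, aud, scope) -> (token, exp_epoch)
        self.exchange = exchange
        self.cache = {}

    def get(self, sub, audience, scope, now=None):
        now = now or time.time()
        key = (sub, audience, scope)
        hit = self.cache.get(key)
        if hit and hit[1] - SAFETY_MARGIN > now:
            return hit[0]            # still safely inside the token's TTL
        token, exp = self.exchange(sub, audience, scope)   # the ~10 ms IdP round-trip
        self.cache[key] = (token, exp)
        return token
```

The safety margin ensures a cached downscoped token is never handed out so close to expiry that it dies in flight.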
If the rule depends on who the request is from (a verified identity) it belongs to authz. If it depends only on what the request looks like, it's WAF. Don't put "is the JWT valid?" in WAF rules; don't put "block this CIDR" in authz code.
GraphQL collapses many REST endpoints into a single POST. Edge authorisation needs different patterns to keep up.
Per-field policy can live in the schema itself via directives like @requiresScope. Apollo Router / Wundergraph Cosmo are GraphQL gateways with built-in JWT validation, per-operation policy and persisted-query support; they are an alternative to a generic edge for GraphQL-heavy stacks.
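Persisted queries are the edge-friendly half of this: clients send a hash instead of an arbitrary query body, and the edge only forwards operations registered at build time. A minimal sketch (registry and function names hypothetical):

```python
import hashlib

ALLOWED = {}   # hash -> query text, populated at build/deploy time

def register(query):
    h = hashlib.sha256(query.encode()).hexdigest()
    ALLOWED[h] = query
    return h

def resolve(query_hash):
    # Unknown hash: reject before it costs the GraphQL backend anything.
    return ALLOWED.get(query_hash)
```

This turns "any POST body is a query" back into a finite, reviewable set of operations the edge can reason about.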
Webhooks are the inverse pattern: an external service POSTs to your endpoint. There is no user-agent and no OAuth flow; you must authenticate the sender from the request itself.
The sender computes HMAC-SHA256(secret, body) and sends it in a header (Stripe-Signature, X-Hub-Signature-256):
const sig = req.headers['stripe-signature'];
const ts = parseTs(sig);
if (Math.abs(now() - ts) > 300) reject(); // replay window
const expected = hmacSha256(SECRET, ts + '.' + req.rawBody);
if (!constantTimeEq(sig.v1, expected)) reject();
if (eventStore.has(req.body.id)) return ok(); // idempotent
eventStore.add(req.body.id);
process(req.body);
Webhooks are at-least-once. Your handler must be idempotent — keyed on the event ID — or you'll double-charge customers when the sender retries.
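The whole receive path — signature over `<ts>.<body>`, replay window, idempotency set — fits in one runnable sketch. The header format is simplified relative to real Stripe/GitHub parsing, and the secret and event store are stand-ins:

```python
import hashlib, hmac, json, time

SECRET = b"whsec_demo"       # shared with the sender out of band
seen_events = set()          # a real store would be Redis/DB with a TTL

def sign(ts, raw_body):
    return hmac.new(SECRET, f"{ts}.".encode() + raw_body, hashlib.sha256).hexdigest()

def handle(ts, signature, raw_body, now=None):
    now = now or time.time()
    if abs(now - ts) > 300:
        return "rejected: stale"                      # replay window
    if not hmac.compare_digest(signature, sign(ts, raw_body)):
        return "rejected: bad signature"              # constant-time compare
    event = json.loads(raw_body)
    if event["id"] in seen_events:
        return "ok (duplicate)"                       # at-least-once -> idempotent
    seen_events.add(event["id"])
    return "ok"
```

Note the signature is computed over the raw body bytes, before any JSON parsing; re-serialised JSON will not round-trip byte-for-byte.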
See OAuth_for_MCP for the OAuth profile MCP uses; Workload_Identity_AuthZ for the agent's own identity; and this deck for the gateway pattern itself.
Three ways to identify the tenant at the edge:
- Subdomain: {tenant}.acme.com. Easiest. Requires a wildcard cert.
- Path prefix: /t/{tenant}/.... Trivial for proxies; ugly for end users.
- Token claim: a tenant_id claim in the user's access token. The right answer for B2B SaaS where the user belongs to one tenant.
Whichever you pick, rewrite identity headers at the edge:
# strip everything client-controlled
x-tenant-id: DELETE
x-user-sub: DELETE
x-user-scope: DELETE
x-org-id: DELETE
# inject from verified sources
x-tenant-id: {jwt.tenant_id}
x-user-sub: {jwt.sub}
x-user-scope: {jwt.scope}
x-request-id: {edge-generated UUID}
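The strip-and-reinject rule above, sketched as a plain function over header dicts (the forbidden-header set and claim names mirror the config; everything else is hypothetical):

```python
import uuid

CLIENT_FORBIDDEN = {"x-tenant-id", "x-user-sub", "x-user-scope", "x-org-id"}

def rewrite_headers(incoming, claims):
    # Drop every client-supplied identity header, and never forward the token.
    out = {k.lower(): v for k, v in incoming.items()
           if k.lower() not in CLIENT_FORBIDDEN
           and k.lower() != "authorization"}
    # Re-add values taken only from the verified token.
    out["x-tenant-id"] = claims["tenant_id"]
    out["x-user-sub"] = claims["sub"]
    out["x-user-scope"] = claims["scope"]
    out["x-request-id"] = str(uuid.uuid4())   # edge-generated correlation ID
    return out
```

Because the deny-list is applied before the re-injection, a spoofed x-user-sub from the internet can never survive to the upstream.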
Metrics worth emitting:
- edge.authz.deny_total{reason} — labelled by invalid_token, insufficient_scope, rate_limited, geo_blocked.
- edge.token.validate.duration_ms — quantile per IdP / per JWKS endpoint.
- edge.tenant.req_total{tenant_id, route, status} — fans out into per-customer dashboards.
- edge.upstream.duration_ms — to surface "slow because of upstream" vs "slow because of authz".
- edge.authz.cache.hit_rate — below 95 % means your caching strategy needs work.
Signals worth alerting on:
- A spike of invalid_token from a single IP/ASN — credential stuffing or token enumeration.
- insufficient_scope on routes the user has never hit — privilege probing.
A decision-log entry:
{
"ts": "2026-05-06T09:00:01Z",
"request_id": "abc…",
"edge_pop": "lhr01",
"tenant_id": "t_42",
"user_sub": "u_1234",
"method": "POST",
"path": "/v1/payments",
"decision": "allow",
"matched_rule": "policy:billing-write@v17",
"upstream": "billing-svc.eu-west-1",
"latency_ms": 4.2
}
The edge must strip and re-inject. Anything in X-User-* arriving at the upstream must originate from your edge, not the internet.
JWKS cached for > 24 h means key rotation breaks your edge. Cache 5–60 min; refresh on unknown kid.
failure_mode_allow: true is a footgun. If the IdP is down, do you want every request through, or every request blocked? Decide deliberately.
Tokens end up in CloudWatch / Datadog. Strip at the proxy.
Pattern matching belongs to WAF. Identity-aware decisions belong to authz. Mixing them produces brittle, slow rules.
Botnets and CGNAT defeat per-IP. Use per-user / per-tenant for authenticated traffic.
Your fancy edge authz blocks the browser's preflight, so the actual request is never sent. Allow OPTIONS through unauthenticated; respond with the right CORS headers; let the real request hit authz.
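A sketch of that routing rule; the allowed-origin set and function names are examples, not a recommendation for your origin policy:

```python
ALLOWED_ORIGINS = {"https://app.acme.com"}   # assumption for the sketch

def route(method, origin, authorize):
    if method == "OPTIONS":                  # preflight carries no credentials
        if origin in ALLOWED_ORIGINS:
            return 204, {"Access-Control-Allow-Origin": origin,
                         "Access-Control-Allow-Headers": "authorization, content-type"}
        return 403, {}
    return authorize()                       # the real request hits authz as usual
```

The preflight bypasses token validation but still gets an origin check; only the subsequent credentialed request pays the full authz cost.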
API keys are fine for partners — but pair them with mTLS, an IP allow-list, and per-key rate limits. A leaked key without those guards is a perpetual breach.
| If you're... | Likely best | Why |
|---|---|---|
| A consumer SaaS with a global audience | Cloudflare Access + Workers | ~300 PoPs; built-in IdP integrations; programmable for custom logic. |
| Heavily AWS-centric, cost-sensitive | AWS HTTP API + Lambda authorizers | Cheap, native IAM glue, JWT authorizer for free. |
| K8s-native, want OSS & plugins | Kong / Tyk | K8s ingress controllers, JWT/OIDC plugins, OPA integration. |
| Already on Istio mesh | Istio Gateway + RequestAuthentication + AuthorizationPolicy | Same primitives as your mesh; one place to maintain policy. |
| Internal tools / "make every app SSO" | Cloudflare Access or Pomerium or oauth2-proxy | Identity-aware proxy in front of unmodified apps; cheap to deploy. |
| Enterprise scale, deep analytics | Apigee or Mulesoft | Established enterprise controls; cost reflects it. |
| API aggregation across many backends | Krakend | One inbound request fans out, recombines responses; JWT/OIDC at the edge. |
| Building / shipping an MCP / agent product | Docker MCP Gateway + your usual edge | Specialised edge for agent ↔ MCP-server flows; coexists with whatever else is at your edge. |
Remember: ext_authz is the canonical delegation pattern, and anything in X-User-* must originate from your edge, not the internet. Related decks: Authorization Models · Workload Identity & Service-Mesh AuthZ · OAuth for MCP Servers · Advanced OpenID Connect · Cloud_aaS_05_Cloud_Security.
Envoy ext_authz docs · Istio RequestAuthentication / AuthorizationPolicy · AWS API Gateway authorizers · Cloudflare Access & Workers · Kong gateway docs · Apigee policy reference · OWASP API Security Top 10 (2023) · OWASP CRS (WAF) · CloudEvents v1.0 + JWS · Stripe / GitHub webhook signing · Apollo Router / Wundergraph · OpenID SSF / CAEP · RFC 6749 / 7519 / 8693 / 9449
The edge is your cheapest authorisation layer — drop bad requests there, propagate verified identity downstream, and let the app worry about per-object rules.