CLOUD SERVICE MODELS · PART 5 OF 6

Cloud Security
in Depth

Identity · secrets · network · encryption · compliance · supply chain
IAM · STS | KMS · HSM · Vault | PrivateLink · mTLS | Zero Trust | SBOM · SLSA · Sigstore
🔐 Identity 🗝 Secrets 🌐 Network 🛡 Encrypt 📜 Audit

The cross-cutting security disciplines for *aaS — what the cloud gives you, what it doesn't, and how the past decade's incidents shape the controls you actually need.

Identity  ·  Secrets  ·  Network  ·  Compliance  ·  Detect
01

Topics

Identity & access

  • The shared-responsibility model in detail
  • IAM — users, roles, policies, STS
  • Federation — SSO + workload identity
  • Least-privilege patterns & permission boundaries

Secrets & encryption

  • KMS / HSM / Secrets Manager / Vault
  • Encryption at rest / in transit / in use
  • Confidential compute (Nitro Enclaves, AMD SEV-SNP)
  • BYOK / HYOK — bring or hold your own key

Network & supply chain

  • Zero-trust networking
  • PrivateLink & service mesh
  • Supply chain — SBOM, SLSA, Sigstore
  • Container & image security

Compliance & ops

  • SOC 2 / ISO / HIPAA / GDPR / PCI / FedRAMP
  • CSPM / CNAPP / CWPP
  • SIEM, detection & response
  • Real incidents — what they tell us
02

Shared Responsibility — In Detail

The cloud secures the infrastructure; you secure everything you run on it. Where the line falls depends on the *aaS layer (deck 01) — but in every model, four things are always yours.

Always yours, every layer

  1. Your data — what you upload, encryption keys you choose
  2. Your identities — users, roles, who has what
  3. Your client config — secrets, OAuth apps, API keys
  4. Your access policies — IAM, sharing controls, public flags

Always theirs, every layer

  1. Physical security of data centres
  2. Hardware integrity, hypervisor patching
  3. Substrate network & cross-region transit
  4. Availability of the underlying API
  5. The compliance reports themselves (SOC 2 Type II, ISO 27001 certificates)

A worked example — S3 bucket

  • AWS ensures: hardware encryption, no cross-tenant access at the storage layer, durability (eleven nines, 99.999999999%), TLS termination
  • You must: set the bucket policy and public-access block, choose the encryption-at-rest key (SSE-S3 is the default), enable versioning + Object Lock and access logging, grant principal-based access, and notice when something is misconfigured

Where most breaches happen

Cloud Security Alliance and Verizon DBIR reporting points the same way year after year: the large majority of cloud incidents (roughly 80% is the figure usually quoted) trace to customer-side misconfiguration such as public storage, lax IAM, and leaked keys. Provider-side breaches are real but rare.

Read the contract

For each managed service, find the actual shared-responsibility doc: AWS publishes one per service, GCP has the "shared fate" model, Azure has the SaaS/PaaS/IaaS matrix. They differ.

03

IAM — Identity, Roles, Policies, STS

[Diagram: a Principal (user · role · workload) calls AssumeRole; STS issues short-lived credentials; the Policy evaluator (SCP · resource · identity policies) returns Allow / Deny for the Resource (S3 / RDS / Bedrock / …).]

Every API call is an authenticated, authorised, logged transaction. The four IAM controls (SCP, identity, resource, permission boundary) combine in a deny-wins evaluator:

explicit deny → deny | explicit allow → check boundaries → allow | otherwise → deny

Identity policies

Attached to a principal — "what can this user/role do?". Most common kind.

Resource policies

Attached to a resource — "who can use this S3 bucket / KMS key / SQS queue?". Govern cross-account access.

SCPs & boundaries

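SCPs are org-level guardrails: "even an admin in account X can't disable CloudTrail". Permission boundaries do the same per principal, capping what any single role or user can be granted. Together they are the compliance backstop.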
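
The deny-wins evaluation on this slide can be sketched in a few lines of Python. This is a simplified model, not the provider's actual engine: the `Statement` type and `evaluate` function are hypothetical, matching is exact (real IAM supports wildcards and conditions), resource policies are left out, and SCPs and permission boundaries are treated as plain allow-lists that must also permit the request.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    effect: str    # "Allow" or "Deny"
    action: str    # e.g. "s3:GetObject" (exact match only in this sketch)
    resource: str  # e.g. "reports-bucket/q1.csv" (exact match only)


def _matches(stmt, action, resource):
    return stmt.action == action and stmt.resource == resource


def _allows(statements, action, resource):
    return any(s.effect == "Allow" and _matches(s, action, resource) for s in statements)


def evaluate(action, resource, identity, scp, boundary):
    """Return "Allow" or "Deny" for a single request.

    identity: statements attached to the principal
    scp, boundary: guardrail statements; each must also allow the request
    """
    # 1. An explicit deny anywhere wins.
    for stmt in [*identity, *scp, *boundary]:
        if stmt.effect == "Deny" and _matches(stmt, action, resource):
            return "Deny"
    # 2. Without an explicit allow on the identity, the request is implicitly denied.
    if not _allows(identity, action, resource):
        return "Deny"
    # 3. The guardrails (SCP and permission boundary) must allow it as well.
    if not (_allows(scp, action, resource) and _allows(boundary, action, resource)):
        return "Deny"
    return "Allow"
```

The ordering is the point of the sketch: a single matching Deny short-circuits everything, and an identity-level Allow is necessary but not sufficient once guardrails are in play.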

04

STS & Federation — No Long-Lived Keys

Long-lived AKIA…/JSON-key/service-principal-secret credentials are the thing leaked into GitHub at 03:00 every day. Replace them everywhere with short-lived tokens minted on demand.

For humans

  • SSO via IdP — Okta, Entra, Google Workspace, JumpCloud → cloud federated login
  • AWS IAM Identity Center (was SSO) / GCP Cloud Identity / Microsoft Entra ID (was Azure AD)
  • SSO issues a short-lived role assumption (1–12 h)
  • No IAM users — ever

For workloads (in-cloud)

  • EC2 Instance Profile — VM gets a role via metadata service (IMDSv2)
  • EKS Pod Identity (2023) / IRSA — pods get role via service account
  • GKE Workload Identity — service account → IAM
  • Azure Managed Identity — VM/container/app gets identity automatically
  • Cloud Run / App Service service accounts — same idea

For CI/CD & external workloads

  • OIDC federation — GitHub Actions, GitLab CI, Buildkite, CircleCI all sign OIDC tokens; the cloud trusts those tokens via a configured issuer URL
  • No long-lived secret in GitHub Actions — the runner exchanges its OIDC token for STS creds
  • Same pattern for K8s clusters running outside the cloud — federate via OIDC issuer

A GitHub Actions OIDC role-trust

{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": { "Federated": "arn:aws:iam::...:oidc-provider/token.actions.githubusercontent.com" },
    "Action": "sts:AssumeRoleWithWebIdentity",
    "Condition": {
      "StringEquals": {
        "token.actions.githubusercontent.com:aud": "sts.amazonaws.com",
        "token.actions.githubusercontent.com:sub": "repo:acme/api:ref:refs/heads/main"
      }
    }
  }]
}

The static-key tax

Every long-lived key in your org has a non-trivial probability of being leaked this year. Audit-as-you-go: aws iam list-access-keys, gcloud iam service-accounts keys list, weekly cron, alarm on anything > 30 days old.
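
The weekly audit can be as small as one function. A minimal sketch, assuming the input is the `AccessKeyMetadata` array from `aws iam list-access-keys` JSON output (fields `AccessKeyId`, `Status`, `CreateDate`); the function name, the 30-day constant, and the example key IDs are illustrative, not part of any tool.

```python
from datetime import datetime, timedelta, timezone

MAX_AGE = timedelta(days=30)  # threshold from the slide; tune to taste


def stale_keys(access_key_metadata, now=None):
    """Return the IDs of active access keys older than MAX_AGE."""
    now = now or datetime.now(timezone.utc)
    stale = []
    for key in access_key_metadata:
        # The CLI emits ISO 8601 timestamps; normalise a trailing "Z" so
        # datetime.fromisoformat accepts it on older Python versions.
        created = datetime.fromisoformat(key["CreateDate"].replace("Z", "+00:00"))
        if key["Status"] == "Active" and now - created > MAX_AGE:
            stale.append(key["AccessKeyId"])
    return stale
```

Feed it the parsed CLI output from the cron job and alarm whenever the returned list is non-empty; inactive keys are skipped here on the assumption that they are already queued for deletion.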

05

Least Privilege — In Practice

"Least privilege" is the principle that every identity gets only the permissions it needs. Easy to say; in real cloud accounts most policies have grown to "*:*" by Tuesday.

Iterative tightening

  1. Start coarse (managed policies are fine for week 1)
  2. Use IAM Access Analyzer (AWS) / IAM Recommender (GCP) / Microsoft PIM to surface unused permissions
  3. Generate scoped policies from observed CloudTrail / Cloud Audit Logs
  4. Add conditions — region, IP, MFA, time of day, principal tag
  5. Revisit quarterly; permissions decay

Conditions that punch above their weight

"Condition": {
  "Bool":  {"aws:MultiFactorAuthPresent": "true"},
  "StringEquals": {
    "aws:RequestedRegion": ["eu-west-2","us-east-1"],
    "aws:PrincipalTag/team": "data"
  },
  "IpAddress": {"aws:SourceIp": ["203.0.113.0/24"]},
  "DateGreaterThan": {"aws:CurrentTime": "2026-01-01T00:00:00Z"}
}

Patterns to adopt

  • Permission boundaries — cap any role's permissions; useful for delegating role creation safely
  • SCPs at the org level — "no resource may be created in unapproved regions"
  • Tag-based access — a user can act only on resources tagged with their team
  • Just-in-time access — Teleport / OpenZiti / Tailscale + role-assumption brokers; "request, get for 4 hours"

When auto-tools help

  • AWS IAM Access Analyzer policy generation: emits a draft policy from observed activity
  • Dispel / Apono / SailPoint — JIT access platforms
  • Wiz / Orca / Lacework / Prisma — CSPM, surface "this role used 3 of its 240 permissions"

"Admin for now" policies

The most common security debt. Time-box them: every "Action":"*" policy gets a tag with an expiry date, and CI fails the deploy once that date has passed.
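
The CI gate for time-boxed admin policies could look like the sketch below. The input shape (dicts with `name`, `document`, `tags`) and the `expires` tag name are assumptions made for illustration; adapt them to however your pipeline exports policies.

```python
from datetime import date


def _is_wildcard_allow(statement):
    actions = statement.get("Action", [])
    if isinstance(actions, str):  # IAM accepts a single string or a list
        actions = [actions]
    return statement.get("Effect") == "Allow" and "*" in actions


def expired_admin_policies(policies, today):
    """Return names of wildcard-action policies whose expiry tag is missing or past."""
    failures = []
    for policy in policies:
        statements = policy["document"]["Statement"]
        if isinstance(statements, dict):  # a single statement object is also valid
            statements = [statements]
        if not any(_is_wildcard_allow(s) for s in statements):
            continue
        expires = policy.get("tags", {}).get("expires")  # hypothetical tag name
        if expires is None or date.fromisoformat(expires) < today:
            failures.append(policy["name"])
    return failures
```

A deploy step would call this with `date.today()` and exit non-zero when the list is non-empty. Treating a missing tag as a failure is a deliberate choice here, so that an untagged wildcard policy cannot slip through.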

06

Secrets — KMS / HSM / Secrets Manager / Vault

Tool | Class | What it stores | Notes
AWS KMS / GCP KMS / Azure Key Vault keys | Key management | Keys (you encrypt data with the key it returns) | FIPS 140-2 L3 with HSM tier; envelope encryption is the pattern
AWS Secrets Manager / Google Secret Manager / Azure Key Vault secrets | Secret store | Strings, JSON blobs, rotated | Versioned, IAM-gated, integrates with Lambda for rotation
SSM Parameter Store | Config + secrets | Strings (KMS-encrypted "SecureString") | Cheaper than Secrets Manager for low-rotation use
HashiCorp Vault | Secret store + dynamic creds | Anything; generates short-lived DB / cloud creds | The OSS standard; Boundary for SSH, Vault Enterprise for HSM
Doppler / 1Password / Infisical / Akeyless | SaaS secret manager | Application secrets | Developer-friendly, env-vars at deploy time
CloudHSM / Azure Dedicated HSM | HSM | Keys for compliance (FIPS 140-2 L3, single-tenant) | ~$1/hr; only when an auditor demands it

Envelope encryption

# 1. App generates a random data-encryption key (DEK)
# 2. Encrypts data with DEK (fast, AES-256-GCM)
# 3. Asks KMS to encrypt the DEK with a master key (KEK)
# 4. Stores: ciphertext + encrypted-DEK
# 5. To decrypt: ask KMS to decrypt DEK; use DEK to decrypt data

ciphertext  = AES-GCM(plaintext, DEK)
encrypted_DEK = KMS.Encrypt(DEK, KeyId="...master...")
store(ciphertext, encrypted_DEK)
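
To make the five steps concrete, here is a runnable sketch of the envelope structure using only the Python standard library. Two loud assumptions: `LocalKms` is a hypothetical in-memory stand-in for a cloud KMS, and `toy_seal` / `toy_open` are a toy authenticated cipher standing in for AES-256-GCM so the example has no third-party dependencies. Do not use the toy cipher in production; swap in AES-GCM from a vetted library.

```python
import hashlib
import hmac
import secrets


def _keystream(key, nonce, length):
    blocks = (length + 31) // 32
    stream = b"".join(
        hashlib.sha256(key + nonce + i.to_bytes(4, "big")).digest() for i in range(blocks)
    )
    return stream[:length]


def toy_seal(key, plaintext):
    """Toy stand-in for AES-256-GCM: XOR keystream plus HMAC tag. NOT for production."""
    nonce = secrets.token_bytes(16)
    body = bytes(p ^ s for p, s in zip(plaintext, _keystream(key, nonce, len(plaintext))))
    tag = hmac.new(key, nonce + body, hashlib.sha256).digest()
    return nonce + body + tag


def toy_open(key, blob):
    nonce, body, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(key, nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("ciphertext failed authentication")
    return bytes(c ^ s for c, s in zip(body, _keystream(key, nonce, len(body))))


class LocalKms:
    """Hypothetical in-memory KMS: the master key (KEK) never leaves this object."""

    def __init__(self):
        self._kek = secrets.token_bytes(32)

    def encrypt(self, dek):
        return toy_seal(self._kek, dek)

    def decrypt(self, encrypted_dek):
        return toy_open(self._kek, encrypted_dek)


def envelope_encrypt(kms, plaintext):
    dek = secrets.token_bytes(32)          # 1. random data-encryption key
    ciphertext = toy_seal(dek, plaintext)  # 2. encrypt the data locally with the DEK
    encrypted_dek = kms.encrypt(dek)       # 3. KMS wraps the DEK with the KEK
    return ciphertext, encrypted_dek       # 4. store both side by side


def envelope_decrypt(kms, ciphertext, encrypted_dek):
    dek = kms.decrypt(encrypted_dek)       # 5. unwrap the DEK, then decrypt the data
    return toy_open(dek, ciphertext)
```

The property worth noticing is that only the 32-byte DEK ever crosses the KMS boundary, so bulk data is encrypted locally at full speed while access to it is still gated by who may call the KMS decrypt operation.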

BYOK / HYOK

  • BYOK — Bring Your Own Key. Customer imports key material into the cloud KMS. Common in regulated SaaS.
  • HYOK — Hold Your Own Key. Customer keeps the master key in their HSM; decryption requires their HSM. Stronger; harder to operate.
  • External Key Stores — AWS XKS, GCP EKM. The cloud calls back to your HSM on every cryptographic operation.
07

Encryption — At Rest, In Transit, In Use

At rest — table-stakes

  • AWS: SSE-S3 (default), SSE-KMS (your CMK), SSE-C (you supply key)
  • GCP: Google-managed, customer-managed (CMEK), customer-supplied (CSEK)
  • Azure: Microsoft-managed, customer-managed
  • Persistent Disks / Managed Disks: encrypted by default; EBS: encrypted by default only once the account's per-region "encryption by default" setting is switched on
  • RDS / Cloud SQL / Cosmos — encrypted at rest, key choice yours

In transit

  • TLS 1.2+ everywhere; TLS 1.3 default in modern stacks
  • Cloud-native LBs handle certificate provisioning (ACM / Google-managed certificates / Azure-managed certificates)
  • Internal service-to-service: mTLS via service mesh (Linkerd, Istio, Consul) or sidecar (Envoy)
  • Private connections (Direct Connect, ExpressRoute) are not encrypted by default — use IPsec or MACsec

In use — confidential compute

  • Encrypts memory; the cloud operator can't read RAM contents
  • AWS Nitro Enclaves — isolated VMs, no SSH, no persistent storage; for KMS-key-handling and PII processing
  • AMD SEV-SNP — encrypted VMs (AWS, GCP, Azure)
  • Intel TDX — Trust Domain Extensions, follow-up to SGX
  • Confidential GPUs (Hopper H100, Blackwell B100) — for AIaaS / Bedrock confidential inference

Practical hierarchy

  1. Default-encrypt everything at rest with provider-managed keys (free)
  2. Move sensitive data to customer-managed keys (CMK) — get rotation, audit, revoke
  3. For regulated tenants, offer BYOK with their CMK
  4. For top-tier compliance / IP, evaluate HYOK / XKS / confidential compute — small fraction of workloads

Field-level encryption is its own problem

Database-level encryption protects against backup theft. Field-level (encrypt SSN before insert) protects against full DB compromise — but breaks SQL operators. Tokenisation (Vault Transform, Privacera, Skyflow) is usually the better answer.

08

Network Security — Beyond the VPC

East-west — service-to-service

  • Service mesh — Istio, Linkerd, Consul Connect, Cilium-based mesh; mTLS for free, traffic policy as code
  • Sidecarless mesh — eBPF-based (Cilium, Istio Ambient) — less overhead, no per-pod sidecar
  • SPIFFE / SPIRE — workload identity that predates the cloud's native ones; becomes the lingua franca across providers

North-south — at the edge

  • WAF — AWS WAF, Cloudflare WAF, GCP Cloud Armor, Azure Front Door — bot, OWASP, custom rules
  • DDoS — Cloudflare default, AWS Shield Advanced, Azure DDoS Protection
  • Anti-CSRF / origin-bound headers at the LB
  • Bot management — separate market: PerimeterX/HUMAN, DataDome, hCaptcha Enterprise

PrivateLink & service exposure

  • You expose one service to a customer's VPC, not the whole VPC
  • No NAT, no peering, no overlapping-CIDR conflicts
  • Single-tenant SaaS uses this to give enterprise customers a private DNS endpoint into your service
  • Cross-cloud: AWS PrivateLink, GCP Private Service Connect, Azure Private Link — same idea, slightly different APIs

Egress lockdown

  • Restrictive egress rules — workloads that don't need internet shouldn't have it
  • Egress proxies — Squid / NGINX / Cilium Network Policy + DNS filter to allow-list domains
  • VPC endpoints for AWS APIs — calls don't traverse the internet
  • Why: it stops a compromised library from opportunistically exfiltrating data

DNS leaks

You can lock down IPs perfectly and still leak data: a compromised workload resolves evil.example through the cloud's DNS resolver and encodes the payload in the query names. Watch Route 53 Resolver query logs / Cloud DNS audit / Azure DNS analytics.
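
Reviewing resolver query logs against a domain allow-list is a small amount of code. A minimal sketch, assuming each log record is a dict with a `query_name` field (the field name and the helper names are assumptions; map them to your log schema). The suffix check deliberately requires a dot boundary, so `notexample.com` does not pass for `example.com`.

```python
def is_allowed(hostname, allow_list):
    """True if hostname equals an allowed domain or is a subdomain of one."""
    host = hostname.rstrip(".").lower()  # resolver logs often carry a trailing dot
    return any(host == domain or host.endswith("." + domain) for domain in allow_list)


def unexpected_queries(records, allow_list):
    """Return the query names in the log records that fall outside the allow-list."""
    return [r["query_name"] for r in records if not is_allowed(r["query_name"], allow_list)]
```

Anything this returns is either a gap in the allow-list or a workload talking to somewhere it should not; both are worth an alert.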

09

Zero Trust — The Replacement for the VPN

"Never trust, always verify." Replaces the corporate VPN as the only thing between trusted users and crown-jewel data.

Five tenets (condensed from the seven in NIST SP 800-207)

  1. Every resource is treated as untrusted by default
  2. Authenticate & authorise every request, every time
  3. Use device posture (managed laptop, MDM, EDR running) as a factor
  4. Decisions are dynamic — recalculate as context changes
  5. Log everything; analyse continuously

What replaces the VPN

  • BeyondCorp (Google) — the original; published 2014
  • Cloudflare Zero Trust / Access — broker between user and HTTP/SSH/RDP services
  • Tailscale — WireGuard-based mesh + IdP-driven ACLs
  • Twingate · Pomerium · Teleport — focused on internal app access
  • AWS Verified Access · Microsoft Entra Private Access

Identity-aware proxy pattern

  • Every internal app sits behind an IAP (Identity-Aware Proxy)
  • The IAP sees: user identity, device posture, location, time, threat-intel signals
  • It allows / denies / steps-up (MFA) per request
  • App receives a signed header with verified identity
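
The per-request decision an identity-aware proxy makes can be sketched as a pure function. The `RequestContext` fields, the 60-minute MFA freshness window, and the country allow-list are illustrative assumptions, not any vendor's policy model.

```python
from dataclasses import dataclass


@dataclass
class RequestContext:
    user_authenticated: bool
    device_managed: bool   # device posture: MDM-enrolled, EDR running
    mfa_age_minutes: int   # minutes since the user last completed MFA
    country: str           # coarse location signal


def decide(ctx, allowed_countries, max_mfa_age_minutes=60):
    """Return "allow", "deny", or "step-up" for one request."""
    if not ctx.user_authenticated or not ctx.device_managed:
        return "deny"
    if ctx.country not in allowed_countries:
        return "deny"
    if ctx.mfa_age_minutes > max_mfa_age_minutes:
        return "step-up"  # ask for MFA again rather than refusing outright
    return "allow"
```

Because the function is evaluated on every request, a laptop that drops out of MDM compliance loses access on its next call instead of at the next login.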

Bastion / break-glass access

  • AWS SSM Session Manager — SSH without a public IP, all logged
  • GCP IAP TCP forwarding — same, via IAP
  • Teleport — multi-cloud, with session recording
  • Every break-glass action: ticketed, time-bound, audited, alerted

VPNs are still a leading ransomware entry point

FortiOS, Pulse, Cisco ASA, Ivanti — every year a new pre-auth RCE. Replace with zero-trust + IAP wherever it fits.

10

Supply Chain Security — SBOM, SLSA, Sigstore

Most cloud workloads ship with hundreds of third-party dependencies. The supply chain is now an attack vector — Codecov 2021, SolarWinds 2020, npm colors/faker 2022, xz utils 2024, every typosquatted-pypi incident.

SBOM — Software Bill of Materials

  • Machine-readable list of every dependency & version in your build
  • Formats: SPDX (Linux Foundation) and CycloneDX (OWASP)
  • Generated by Syft, Trivy, GitHub Dependency Graph; Grype consumes SBOMs to scan them for vulnerabilities
  • US Executive Order 14028 (2021) requires it for federal vendors

SLSA — provenance for builds

  • Four levels (1–4) in the original v0.1 draft; SLSA v1.0 (2023) defines Build levels 1–3
  • L1 — build is scripted & documented
  • L2 — build runs on a hosted service, provenance signed
  • L3 — provenance non-falsifiable; isolated build (GitHub Actions reusable workflows hit L3)
  • L4 (v0.1 only, dropped in v1.0): two-person review of builds; rare

Sigstore — sign without keys

  • Cosign + Fulcio (CA) + Rekor (transparency log)
  • OIDC-backed signing — your identity is your key
  • cosign sign $IMAGE on push; cosign verify on deploy
  • Adopted by Kubernetes release artifacts, Helm charts stored as OCI artifacts, Python (PyPI attestations), and npm package provenance

Practical pipeline

# In CI:
syft dir:. -o spdx-json > sbom.json
trivy fs --severity HIGH,CRITICAL --exit-code 1 .
cosign sign "$IMAGE"
# attach the SBOM as a signed attestation (cosign attach sbom is deprecated)
cosign attest --predicate sbom.json --type spdxjson "$IMAGE"

# At deploy admission (Kyverno / OPA), in pseudocode:
#   verify $IMAGE is signed by *@yourcompany.com
#   verify $IMAGE has an SBOM attestation
#   verify $IMAGE has no CRITICAL CVE in a scan from the last 24h

Dependency-confusion attacks

Internal package "acme-utils" lives in your private registry. Attacker publishes "acme-utils" 99.9.9 to npmjs.com. npm install picks the higher version. Mitigation: scoped names (@acme/utils), private-only resolution, package-lock pinning, Verdaccio or ProGet pull-through.
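
One way to catch dependency confusion in CI is to check that every package in your private scope resolves to your private registry. A minimal sketch, assuming an npm `package-lock.json` (lockfile version 2 or 3, with a top-level `packages` map and a `resolved` URL per entry); the scope `@acme` and the host `npm.acme.example` are hypothetical.

```python
from urllib.parse import urlparse


def confusable_packages(lock, private_scope, private_host):
    """Return private-scope packages that do not resolve to the private registry.

    A package with no "resolved" URL is also reported, since its origin
    cannot be verified from the lockfile.
    """
    problems = []
    for path, meta in lock.get("packages", {}).items():
        # Keys look like "node_modules/@acme/utils", possibly nested; the root is "".
        name = path.split("node_modules/")[-1]
        if not name.startswith(private_scope + "/"):
            continue
        host = urlparse(meta.get("resolved", "")).hostname
        if host != private_host:
            problems.append(name)
    return problems
```

Load the lockfile with `json.load` and fail the build if the list is non-empty. This complements scoped names and private-only resolution; it does not replace them.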

11

Container & Image Security

Image-side

  • Distroless / Alpine / scratch as base — fewer CVEs, smaller surface
  • Multi-stage build (see Docker Multi-Stage Builds deck)
  • Scan in CI — Trivy, Grype, Snyk, Wiz, Prisma, Twistlock
  • Scan in registry — ECR / Artifact Registry / ACR built-in
  • Sign with Cosign; admission rejects unsigned
  • SBOM attached; updated nightly

Runtime-side

  • Non-root user (USER 10001)
  • Read-only root filesystem (readOnlyRootFilesystem: true)
  • Drop all capabilities, add back only what's needed
  • seccomp + AppArmor / SELinux profiles
  • Falco / Tetragon (eBPF) — detect anomalous syscall behaviour at runtime

Beyond Docker — micro-VM isolation

  • Firecracker — Lambda, Fargate, Fly Machines; true VM boundary, ~125ms boot
  • gVisor — Google sandbox, intercepts syscalls in user space
  • Kata Containers — VM-per-pod for K8s
  • Used when "container escape" is in the threat model — multi-tenant runtimes (Cloud Run), CI runners, untrusted code execution

Admission policy

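# Kyverno - reject privileged pods
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: disallow-privileged
spec:
  validationFailureAction: Enforce
  rules:
    - name: privileged
      match: { any: [ { resources: { kinds: [Pod] } } ] }
      validate:
        message: "privileged pods are not allowed"
        pattern:
          spec:
            # "privileged" is a container-level field, so check every
            # container type; =() means "if present, must match"
            =(ephemeralContainers):
              - =(securityContext):
                  =(privileged): "false"
            =(initContainers):
              - =(securityContext):
                  =(privileged): "false"
            containers:
              - =(securityContext):
                  =(privileged): "false"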

Companion deck

The Docker Security deck covers the container-image side end-to-end. This deck adds the cloud-native pieces (signing, admission, runtime protection).

12

Compliance — The Major Frameworks

Framework | Scope | What it certifies | Audit cadence
SOC 2 Type II (AICPA) | Service organisations | Operating effectiveness of controls over a period (typ. 6–12 mo) | Annual
ISO/IEC 27001 | Information security mgmt | You have an ISMS; recertification every 3 yrs | 3 yr cycle, annual surveillance
ISO/IEC 27017 / 27018 | Cloud-specific add-ons to 27001 | Cloud security & PII in cloud | same as 27001
HIPAA / HITECH (US health) | Protected Health Information | Self-attestation + BAA chain | continuous; OCR investigates breaches
GDPR (EU) + UK DPA 2018 | Personal data | Self-assessment; supervisory authority enforces | continuous
PCI DSS v4 | Card data | Compliance against 12 control areas | annual
FedRAMP (US) | Federal procurement | Moderate / High / Tailored authorisation | continuous monitoring
CCPA / CPRA (California) | Consumer data | Self; AG enforces | continuous
EU AI Act | High-risk AI | Risk classification, transparency, conformity assessment | phased through 2027
DORA (EU finance) | Operational resilience | ICT risk, third-party (cloud) risk | since Jan 2025

Compliance automation tools

Vanta · Drata · Secureframe · Sprinto · Anecdotes · Tugboat Logic. They ingest from cloud / IdP / GitHub / HRIS, map to controls, generate evidence. SOC 2 in 4–6 months instead of 12+.

CSPM / CNAPP

Wiz · Orca · Lacework (Fortinet) · Prisma Cloud · Tenable Cloud Security · CrowdStrike Falcon Cloud Security. Continuous configuration scanning, attack-path analysis, IAM analysis. Increasingly merged into "Cloud Native Application Protection Platforms".

13

Detection & Response — SIEM, GuardDuty, Security Hub

Cloud-native detection

  • AWS GuardDuty — DNS / VPC flow / CloudTrail anomaly detection; ML-driven
  • GCP Security Command Center — equivalent
  • Microsoft Defender for Cloud — equivalent
  • CloudTrail / Cloud Audit Logs / Activity Log — the source-of-truth event stream
  • Network Firewall / Cloud NGFW / Azure Firewall — IDS/IPS at the VPC edge

SIEM platforms

  • Splunk — the historic standard
  • Microsoft Sentinel — cloud-native, often packaged with E5
  • Datadog Cloud SIEM — if already on Datadog
  • Elastic Security — OSS-rooted
  • Panther · SnapAttack · Anvilogic — modern detection-as-code platforms

Detection-as-code

  • Detection rules in Python / YAML, version-controlled
  • Same lifecycle as application code: PR review, test fixtures, CI
  • Sigma rules — vendor-neutral detection format
  • MITRE ATT&CK mapping per rule
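
A detection-as-code rule is just a function plus fixtures that run in CI. The sketch below flags a successful AWS console login without MFA, using the CloudTrail `ConsoleLogin` event fields `responseElements.ConsoleLogin` and `additionalEventData.MFAUsed`; the rule ID, metadata layout, and function name are illustrative, loosely modelled on how Python-based detection platforms structure rules.

```python
RULE_METADATA = {
    "id": "aws-console-login-without-mfa",  # hypothetical rule ID
    "severity": "high",
    "attack_technique": "T1078",            # MITRE ATT&CK: Valid Accounts
}


def rule(event):
    """Fire on a successful console login where MFA was not used."""
    response = event.get("responseElements") or {}
    extra = event.get("additionalEventData") or {}
    return (
        event.get("eventName") == "ConsoleLogin"
        and response.get("ConsoleLogin") == "Success"
        and extra.get("MFAUsed") == "No"
    )


# Test fixtures live next to the rule and run on every pull request.
FIXTURES = [
    ({"eventName": "ConsoleLogin",
      "responseElements": {"ConsoleLogin": "Success"},
      "additionalEventData": {"MFAUsed": "No"}}, True),
    ({"eventName": "ConsoleLogin",
      "responseElements": {"ConsoleLogin": "Success"},
      "additionalEventData": {"MFAUsed": "Yes"}}, False),
    ({"eventName": "ConsoleLogin",
      "responseElements": {"ConsoleLogin": "Failure"},
      "additionalEventData": {"MFAUsed": "No"}}, False),
    ({"eventName": "AssumeRole", "responseElements": None}, False),
]
```

Federated logins authenticate at the IdP and typically report `MFAUsed` as "No", so a production version of this rule would need to exclude them or it will be noisy.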

Response automation (SOAR)

  • Tines, Torq, Splunk SOAR, Cortex XSOAR — orchestrate response runbooks
  • Common play: "AccessKey appears on GitHub" → quarantine key → notify owner → rotate
  • "Suspicious AssumeRole from new country" → step-up MFA

Don't ship every log

SIEM is typically priced per GB ingested: CloudWatch Logs at about $0.50/GB, Datadog indexed logs at about $1.27/GB (list prices vary by region and plan). Tier your ingest: sample app logs; ingest audit, IAM and DNS logs in full. Otherwise the bill is the security incident.
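
The tiering argument is plain arithmetic, sketched below. The $0.50/GB price comes from the slide; the daily volumes (200 GB of app logs, 10 GB of audit logs) and the 10% sample rate are made-up numbers for illustration.

```python
def monthly_ingest_cost(gb_per_day, price_per_gb, sample_rate=1.0, days=30):
    """Monthly ingest cost in dollars for one log source."""
    return gb_per_day * sample_rate * days * price_per_gb


PRICE = 0.50  # $/GB, the CloudWatch Logs figure quoted above

# Hypothetical volumes: 200 GB/day of app logs, 10 GB/day of audit + IAM + DNS.
ship_everything = monthly_ingest_cost(200, PRICE) + monthly_ingest_cost(10, PRICE)
tiered = monthly_ingest_cost(200, PRICE, sample_rate=0.10) + monthly_ingest_cost(10, PRICE)

print(f"ship everything: ${ship_everything:,.0f}/month")
print(f"tiered ingest:   ${tiered:,.0f}/month")
```

With these assumed volumes the audit stream is a small share of the full bill, which is why it is the tier to keep at 100% while the app logs are sampled.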

14

Real Incidents — What They Tell Us

Capital One (2019)

SSRF in a misconfigured WAF allowed access to EC2 metadata service → STS credentials → S3 buckets → 100M records. Fix: IMDSv2 (mandatory, hop-limit-bound), tight IAM scopes, VPC endpoint restrictions.

SolarWinds (2020)

Compromised build of Orion shipped to 18,000 customers including US Treasury. Fix: SLSA L3+, signed builds, reproducible builds, two-person build promotion.

Codecov (2021)

Bash Uploader script modified in Codecov's Google Cloud Storage bucket, using a credential extracted from a Docker image; it exfiltrated CI secrets for about two months. Fix: short-lived tokens (OIDC) instead of static; signed scripts.

Okta support breach (2023)

HAR files uploaded to Okta support contained session cookies, used to compromise Okta customers. Fix: sanitised HAR uploads, customer-side session-binding (DPoP / mTLS).

MOVEit (2023)

Cl0p exploited a zero-day in Progress MOVEit Transfer; ~2,700 organisations breached; tens of millions of records. Fix: third-party-software risk management, network egress restriction, asset inventory of every internet-facing service.

Snowflake credential theft (2024)

Stealer-malware-collected creds reused on roughly 165 Snowflake customer accounts (per Mandiant; no MFA enforced) → Ticketmaster, Santander, AT&T. Fix: MFA mandatory by default; provider-wide auth-policy enforcement.

XZ Utils backdoor (2024)

Multi-year social engineering of an OSS maintainer slipped a backdoor into liblzma; it reached Debian unstable and Fedora pre-release builds, and through them sshd, before being caught ahead of any stable release. Fix: reproducible builds, multiple maintainers, sandboxed builders.

Microsoft / Storm-0558 (2023)

Stolen MSA signing key was usable for Azure AD tokens because of a validation bug. Fix: defence in depth at token validation; key isolation; audit-log surface for cross-tenant access.

15

Anti-Patterns

"0.0.0.0/0 on port 22 / 3389 / 3306 just for now"

Will be found by scanners and hit with brute-force attempts within the hour. Use SSM / IAP / Tailscale instead. No public DB ports, ever.
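
This anti-pattern is easy to detect mechanically. A minimal sketch, assuming the input is one entry of the `SecurityGroups` array from `aws ec2 describe-security-groups` JSON output (fields `IpPermissions`, `IpProtocol`, `FromPort`, `ToPort`, `IpRanges`, `Ipv6Ranges`); the function name and port list are illustrative.

```python
SENSITIVE_PORTS = {22, 3389, 3306}  # SSH, RDP, MySQL, as in the slide title


def world_open_ports(security_group):
    """Return the sensitive ports this security group exposes to the whole internet."""
    exposed = set()
    for perm in security_group.get("IpPermissions", []):
        open_to_world = any(
            r.get("CidrIp") == "0.0.0.0/0" for r in perm.get("IpRanges", [])
        ) or any(r.get("CidrIpv6") == "::/0" for r in perm.get("Ipv6Ranges", []))
        if not open_to_world:
            continue
        protocol = perm.get("IpProtocol")
        if protocol == "-1":  # "-1" means all protocols and all ports
            exposed |= SENSITIVE_PORTS
        elif protocol in ("tcp", "udp"):
            low, high = perm.get("FromPort", 0), perm.get("ToPort", 65535)
            exposed |= {p for p in SENSITIVE_PORTS if low <= p <= high}
    return sorted(exposed)
```

Run it across every group in every region on a schedule; a CSPM tool does the same job at scale, but a check like this is enough to fail a pull request that opens port 22 to the world.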

"MFA optional"

The Snowflake breach class — millions of credentials harvested by stealer malware, replayed on accounts without MFA. Make it mandatory at the IdP.

"Static keys committed to a private repo, it's fine"

Private becomes public on the first contractor offboard, fork, or SSO misconfiguration. GitGuardian / TruffleHog / Gitleaks scan; OIDC instead.

"Disable encryption for cost / perf"

Encryption-at-rest with provider-managed keys is free and adds < 1% perf overhead. Keep it on.

"CloudTrail off in dev"

Dev breaks first; you need the logs more there than in prod. Org-wide CloudTrail / Cloud Audit / Diagnostic Settings, no opt-out.

"Wildcard IAM permissions"

"Action":"*","Resource":"*" on a workload's role. Use Access Analyser, narrow on the next sprint.

"Compliance is a Vanta dashboard"

Auto-evidence is a productivity gain, not a strategy. The auditor still wants real controls; the customer still wants security, not green checkmarks.

"Security review at end of sprint"

Shift left. Threat model in design; SAST in CI; admission policies; pen test annually; bug bounty as the safety net.

16

Summary

Three takeaways

  1. The cloud secures the substrate; you secure everything you build on it. Most breaches are still customer-side IAM / config.
  2. Identity is the perimeter — short-lived tokens, federation, MFA mandatory, least privilege a habit.
  3. Compliance is process, not magic. Tooling speeds it up; it does not replace the controls.

Next in series

  • 06 LLM-as-a-Service — and the new security shapes that come with it

One sentence

"Cloud security is identity, secrets, network and supply-chain — applied with discipline, audited continuously, and revisited the day after every incident report."