MODERN CRYPTOGRAPHY SERIES · PRESENTATION 07

Post-Quantum Cryptography

Lattices · ML-KEM · ML-DSA · SLH-DSA · NIST PQC Standards · Hardware

Navigate with arrow keys · Press Esc for overview · Press F for fullscreen

Why Post-Quantum?

Every public-key cryptosystem in wide deployment today — RSA, Diffie-Hellman, ECDSA, ECDH, EdDSA — derives security from one of two problems: integer factorisation or the discrete logarithm problem.

Shor's algorithm (1994) solves both in polynomial time on a quantum computer, reducing exponential-class problems to O(n³) in qubit-circuit depth.

⚠ Broken by Shor's Algorithm

  • RSA (integer factorisation)
  • Diffie-Hellman / DSA (DLP over ℤp*)
  • ECDH / ECDSA / EdDSA (ECDLP)
  • ElGamal encryption

✓ Quantum-Resistant

  • AES-256 (Grover halves key strength → 128-bit)
  • SHA-3, SHAKE (output doubling suffices)
  • Lattice-based: ML-KEM, ML-DSA
  • Hash-based: SLH-DSA, LMS/XMSS

Grover's algorithm gives only a quadratic speedup against symmetric ciphers — doubling key length neutralises the threat.

The Quantum Threat Timeline

1994
Shor publishes factoring algorithm
1996
Grover's search algorithm — quadratic speedup
2016
NIST calls for PQC submissions (82 received)
2022
NIST selects Kyber, Dilithium, SPHINCS+ as finalists
Aug 2024
FIPS 203, 204, 205 finalised — ML-KEM, ML-DSA, SLH-DSA
Mar 2025
HQC selected as backup KEM (code-based)
2027
CNSA 2.0 — new NSS acquisitions must be PQC-compliant
2035
NIST deprecates all quantum-vulnerable algorithms

🕐 "Harvest Now, Decrypt Later"

Adversaries record encrypted traffic today, planning to decrypt it once a Cryptographically Relevant Quantum Computer (CRQC) is available. Long-lived secrets are already at risk.

Qubit Estimates to Break RSA-2048

~4,000 error-corrected logical qubits, which translates to millions of physical qubits with current error rates. Estimates for CRQC arrival: 2030s–2040s.

Symmetric Impact

Grover reduces AES-128 → ~64-bit security. Mitigation: use AES-256 (→128-bit post-quantum).

PQC Algorithm Families

Post-quantum schemes derive hardness from problems believed resistant to both classical and quantum algorithms.

🔷 Lattice-Based

Hard problem: Learning With Errors (LWE), Module-LWE, Ring-LWE
Standards: ML-KEM (FIPS 203), ML-DSA (FIPS 204), FN-DSA (draft FIPS 206)
Pros: Small keys, fast operations, versatile
Cons: Relatively newer hardness assumptions

🔶 Hash-Based

Hard problem: Preimage/collision resistance of hash functions
Standards: SLH-DSA (FIPS 205), LMS/XMSS (SP 800-208)
Pros: Very conservative, minimal assumptions, tiny keys
Cons: Large signatures, slower signing

🔵 Code-Based

Hard problem: Decoding random linear codes
Standards: HQC (selected March 2025, draft ~2026)
Pros: Decades of study (McEliece, 1978), diverse math basis
Cons: Very large public keys (Classic McEliece)

🟣 Multivariate / Isogeny

Hard problems: MQ problem, supersingular isogeny
Status: SIKE broken (2022); Rainbow broken (2022); research continues
Notes: Some multivariate signatures still in NIST additional round

Lattice Fundamentals

A lattice ℒ(B) is the set of all integer linear combinations of basis vectors B = {b₁, b₂, …, bₙ} ⊂ ℝⁿ:

ℒ(B) = { Σᵢ zᵢ · bᵢ | zᵢ ∈ ℤ }

Interactive 2D lattice — click to see closest vector problem (CVP)

Hard Lattice Problems

  • SVP — Shortest Vector Problem: find the shortest non-zero vector in ℒ
  • CVP — Closest Vector Problem: given a target point, find the nearest lattice point
  • SIVP — Shortest Independent Vectors: find n shortest linearly independent vectors
  • GapSVP — Decision variant: is λ₁(ℒ) ≤ d or > γ·d?

Best known classical algorithms (BKZ/LLL) run in 2O(n) time. No known quantum algorithm provides more than polynomial improvement over classical lattice algorithms.

Learning With Errors (LWE)

Introduced by Oded Regev (2005, Gödel Prize 2018). LWE is the computational foundation for ML-KEM and ML-DSA.

The Problem

Given a random matrix A ∈ ℤqm×n and a secret vector s ∈ ℤqn, compute:

b = A·s + e (mod q)

where e is a "small" error vector sampled from a discrete Gaussian χ.

The search problem: given (A, b), recover s.
The decision problem: distinguish (A, b) from uniform random.

Without noise: trivial Gaussian elimination — O(n³). With noise: as hard as worst-case lattice problems (GapSVP, SIVP).

Variants

Ring-LWE (RLWE)

Work in Rq = ℤq[x]/(xn+1). Structured — O(n log n) operations via NTT. Basis for Kyber/ML-KEM.

Module-LWE (MLWE)

Matrices of ring elements: A ∈ Rqk×k. Bridges Ring-LWE efficiency with LWE flexibility. Parameter k controls security/performance trade-off.

Worst-Case to Average-Case

Regev's reduction: solving random LWE instances is at least as hard as solving worst-case GapSVP — a unique guarantee among cryptographic assumptions.

Interactive: LWE in Action

This demonstrates why noise makes LWE hard. Without error, the system is trivially solvable.

Parameters: n=4, q=97

No noise (e=0) Small noise (σ=1) Medium noise (σ=3) Large noise (σ=8)

Error distribution: discrete samples from e mod q

NIST PQC Standards (Aug 2024)

FIPSAlgorithmOriginal NameTypeHard ProblemPurpose
203ML-KEMCRYSTALS-KyberLatticeModule-LWEKey Encapsulation
204ML-DSACRYSTALS-DilithiumLatticeModule-LWE + SISDigital Signatures
205SLH-DSASPHINCS+HashHash preimageDigital Signatures

Coming: FIPS 206 — FN-DSA

Based on FALCON. FFT over NTRU lattices. Compact signatures (~690 B at 128-bit security). Draft expected 2025.

Coming: HQC

Code-based KEM selected March 2025 as backup to ML-KEM. Different mathematical basis (Hamming Quasi-Cyclic codes). Draft standard ~2026, final ~2027.

SP 800-208: LMS/XMSS

Stateful hash-based signature schemes. Already approved. Require strict state management — suitable for firmware signing, not TLS.

NIST IR 8547: transition timeline — deprecate quantum-vulnerable algorithms by 2035, with high-risk systems transitioning much earlier.

ML-KEM (FIPS 203) — Deep Dive

Module-Lattice-Based Key-Encapsulation Mechanism, derived from CRYSTALS-Kyber.

Construction Overview

ML-KEM is built in two layers:

1. K-PKE — an IND-CPA public-key encryption scheme based on Module-LWE. Encrypts a fixed 32-byte message.

2. KEM transform — a modified Fujisaki-Okamoto (FO) transform applied to K-PKE, yielding IND-CCA2 security.

KeyGen: (A, t = As + e) where A ∈ Rqk×k, s,e ← CBD(η)
Encaps: c = Compress(Au + e₁, tTu + e₂ + ⌈q/2⌋·m)
Decaps: m' = Decompress(c₂ − sTc₁), re-encrypt, check

The ring Rq = ℤq[x]/(x256+1) with q = 3329. Polynomial multiplication via Number Theoretic Transform (NTT).

Parameter Sets

Setkη₁,η₂PK (B)SK (B)CT (B)NIST Cat
ML-KEM-51223, 280016327681
ML-KEM-76832, 21184240010883
ML-KEM-102442, 21568316815685

Shared secret: always 32 bytes. NIST Category 1 ≈ AES-128 security; Cat 3 ≈ AES-192; Cat 5 ≈ AES-256.

Why Module-LWE?

Module-LWE generalises Ring-LWE (one polynomial → k×k matrix of polynomials). Tuning k adjusts security without changing ring dimension. ML-KEM-768 (k=3) is the recommended general-purpose parameter set.

ML-KEM Protocol Flow

Alice (Initiator)
KeyGen():
1. Sample A ← ℤq[x]/(x²⁵⁶+1) via SHAKE-128
2. Sample s, e ← CBD(η₁)
3. t = NTT(A)·NTT(s) + NTT(e)
4. pk = (t̂, ρ), sk = (s, pk, H(pk), z)

Decaps(sk, ct):
5. Decrypt ct → m'
6. Re-encrypt m' → ct'
7. If ct' = ct: K = KDF(m', H(ct))
   Else: K = KDF(z, H(ct))
pk (800–1568 B)
→→→
←←←
ct (768–1568 B)
Bob (Responder)
Encaps(pk):
1. m ← random 32 bytes
2. (K̂, r) = G(m ‖ H(pk))
3. Sample r₁, r₂, e₁ ← CBD(η) using r
4. u = NTT⁻¹(ÂT·NTT(r₁)) + e₁
5. v = NTT⁻¹(t̂T·NTT(r₁)) + e₂ + Compress(m)
6. ct = (Compress(u), Compress(v))
7. K = KDF(K̂, H(ct))

Both parties derive the same 32-byte shared secret K. The FO transform ensures IND-CCA2 security via implicit rejection.

Number Theoretic Transform (NTT)

The NTT is the workhorse of lattice cryptography — it converts polynomial multiplication from O(n²) to O(n log n).

How It Works

The NTT is the finite-field analogue of the FFT, operating over ℤq instead of ℂ.

Given f(x), g(x) ∈ Rq = ℤq[x]/(x256+1), q = 3329:

1. f̂ = NTT(f)   — 7 layers of butterfly operations
2. ĝ = NTT(g)
3. ĥ = f̂ ⊙ ĝ   — pointwise multiply (128 pairs)
4. h = NTT⁻¹(ĥ) — inverse transform

Root of unity: ζ = 17 (ζ²⁵⁶ ≡ −1 mod 3329)
Why q = 3329? Prime, q ≡ 1 (mod 256), fits in 12 bits.

Butterfly Operation

a[j] a[j+m] ×ζ^k ×ζ^k a[j] + ζᵏ·a[j+m] a[j] − ζᵏ·a[j+m]

ML-KEM uses 7 layers of 128 butterflies. Each butterfly: 1 multiply + 1 add + 1 subtract in ℤ3329. Total: 128 × 7 = 896 butterflies per NTT.

Hardware: can be parallelised into systolic arrays. FPGA implementations achieve NTT in <1 μs at modest clock rates.

ML-DSA (FIPS 204) — Digital Signatures

Module-Lattice-Based Digital Signature Algorithm, derived from CRYSTALS-Dilithium.

Fiat-Shamir with Aborts

ML-DSA uses the "Fiat-Shamir with Aborts" paradigm (Lyubashevsky, 2009):

KeyGen: A ← Rqk×ℓ, (s₁,s₂) ← small, t = As₁ + s₂
  pk = (A, t), sk = (A, t, s₁, s₂)

Sign(sk, M):
  repeat:
    y ← uniform with ‖y‖∞ < γ₁
    w = Ay
    c̃ = H(HighBits(w) ‖ M)
    c = SampleInBall(c̃) — sparse polynomial, τ nonzero coeffs
    z = y + c·s₁
  until ‖z‖∞ < γ₁−β and ‖LowBits(w − c·s₂)‖∞ < γ₂−β
  σ = (c̃, z, hints)

Verify(pk, M, σ):
  w' = Az − c·t, check HighBits match c̃

Parameter Sets

Set(k,ℓ)PK (B)SK (B)Sig (B)Cat
ML-DSA-44(4,4)1,3122,5282,4202
ML-DSA-65(6,5)1,9524,0003,2933
ML-DSA-87(8,7)2,5924,8644,5955

Why "Aborts"?

The masking vector y hides s₁ in z = y + c·s₁. If z leaks information about s₁ (by being too large or having suspicious low bits), the signature is rejected and the signer tries again. Average ~4–7 iterations per signature.

Security Basis

Relies on both Module-LWE (key generation) and Module-SIS (Short Integer Solution — signature forgery). Both reduce to worst-case lattice hardness.

SLH-DSA (FIPS 205) — Hash-Based Signatures

Stateless Hash-Based Digital Signature Algorithm, derived from SPHINCS+. The conservative choice — security rests solely on hash function properties.

Architecture

Hypertree Root (Public Key)

↑ XMSS tree layer d−1

↑ ... (d layers of XMSS Merkle trees) ...

↑ XMSS tree layer 0

FORS Signature (Few-time OTS)

Message Hash → index selection

Each layer: WOTS+ one-time signatures authenticate tree roots. FORS (Forest of Random Subsets) signs the message. The hypertree root is the public key.

Key & Signature Sizes

Parameter SetPK (B)SK (B)Sig (B)Cat
SLH-DSA-*-128s32647,8561
SLH-DSA-*-128f326417,0881
SLH-DSA-*-192s489616,2243
SLH-DSA-*-192f489635,6643
SLH-DSA-*-256s6412829,7925
SLH-DSA-*-256f6412849,8565

Hash choice: SHA-256 or SHAKE-256. Suffix s = smaller sig / slower; f = fast sign / larger sig.

Trade-off: Keys vs Signatures

SLH-DSA has the smallest keys of any PQC signature (32–64 B public key!) but the largest signatures (7–50 KB). Ideal for code signing, firmware verification — not TLS handshakes.

Interactive: Key & Signature Sizes

Compare PQC algorithms against classical schemes. Select a metric to visualise:

Public Key Secret Key Ciphertext / Signature

All sizes in bytes. Classical schemes shown for reference. ML-KEM-768 is the recommended general-purpose KEM.

Performance Benchmarks

PQC algorithms are competitive with — and often faster than — classical schemes in software.

OperationML-KEM-768ECDH (X25519)RSA-2048
KeyGen~30 μs~60 μs~200 ms
Encaps / Encrypt~40 μs~60 μs~10 μs
Decaps / Decrypt~35 μs~60 μs~5 ms
OperationML-DSA-65Ed25519RSA-2048
KeyGen~80 μs~30 μs~200 ms
Sign~250 μs (avg)~50 μs~5 ms
Verify~90 μs~70 μs~10 μs

✓ PQC Advantage

ML-KEM key generation is ~6000× faster than RSA-2048. Total KEM round-trip is comparable to ECDH. NTT-based arithmetic is highly parallelisable.

⚠ PQC Trade-off

Larger key/signature sizes increase bandwidth. ML-DSA signing requires retry loops (~4–7 iterations). SLH-DSA signing is 100–1000× slower than ML-DSA.

Hardware Implementation

PQC accelerators for SoCs, FPGAs, and ASICs.

NTT Accelerator Architecture

Coeff RAM
(256×12-bit)
Butterfly Unit
mul + add/sub mod q
Result RAM
Twiddle ROM
ζk table
Control FSM
7 stages × 128 ops
  • Core operation: modular multiply in ℤ3329 — 12×12-bit multiplier + Barrett reduction
  • Parallelism: 1 butterfly/cycle → 896 cycles per NTT, or 2–4 parallel butterflies for throughput
  • Keccak/SHAKE co-processor for sampling and hashing

Implementation Results

PlatformAlgorithmKeyGenEncapsArea
Cortex-M4ML-KEM-768~500K cyc~600K cyc
Artix-7 FPGAML-KEM-768~3 μs~4 μs~5K LUTs
28nm ASICML-KEM-768<1 μs<1 μs~25K GE
Cortex-M4ML-DSA-65~1.5M cyc~5M cyc

Design Considerations for SoC Integration

  • Shared NTT engine between KEM and signature operations
  • SHAKE-256/Keccak permutation block — also used in SHA-3
  • CBD (Centered Binomial Distribution) sampler — simple XOR + popcount
  • Constant-time implementation essential — avoid timing side-channels in Barrett reduction and NTT butterfly

Python: Simplified ML-KEM Core

A pedagogical implementation of the core NTT and polynomial arithmetic in ℤ3329.


Q = 3329          # ML-KEM modulus
N = 256           # polynomial degree
ZETA = 17         # primitive 256th root of unity mod Q

# Precompute twiddle factors: ζ^(BitRev₇(i)) for NTT layers
def precompute_zetas():
    zetas = [0] * 128
    for i in range(128):
        br = int(f'{i:07b}'[::-1], 2)        # 7-bit bit-reversal
        zetas[i] = pow(ZETA, br, Q)
    return zetas

ZETAS = precompute_zetas()

def ntt(f):
    """Number Theoretic Transform: O(n log n) polynomial → NTT domain."""
    a = list(f)
    k, length = 0, 128
    while length >= 1:
        for start in range(0, N, 2 * length):
            zeta = ZETAS[k]; k += 1
            for j in range(start, start + length):
                t = (zeta * a[j + length]) % Q
                a[j + length] = (a[j] - t) % Q
                a[j]          = (a[j] + t) % Q
        length //= 2
    return a

def pointwise_mul(f_hat, g_hat):
    """Pointwise multiplication in NTT domain (128 degree-1 polynomials)."""
    h = [0] * N
    for i in range(0, N, 2):
        z = ZETAS[64 + i // 2]
        h[i]   = (f_hat[i]*g_hat[i] + f_hat[i+1]*g_hat[i+1]*z) % Q
        h[i+1] = (f_hat[i]*g_hat[i+1] + f_hat[i+1]*g_hat[i]) % Q
    return h
  

Python: Key Generation Sketch


import os, hashlib

def cbd(eta, seed, nonce):
    """Centered Binomial Distribution — sample small polynomial."""
    # In production: expand seed with SHAKE-256, extract 64·eta bytes
    # Here simplified for illustration
    coeffs = [0] * N
    data = hashlib.shake_256(seed + bytes([nonce])).digest(64 * eta)
    for i in range(N):
        a = sum((data[2*eta*i + j] >> b) & 1
                for j in range(eta) for b in range(8)) 
        # ... (simplified — real CBD sums bit-pairs from byte stream)
        coeffs[i] = (a % (2*eta + 1)) - eta
    return [c % Q for c in coeffs]

def ml_kem_keygen(k=3, eta1=2):
    """ML-KEM-768 key generation (simplified sketch)."""
    rho = os.urandom(32)            # public seed for matrix A
    sigma = os.urandom(32)          # private seed for s, e
    
    # Generate A ∈ R_q^{k×k} from rho (deterministic via SHAKE-128)
    A_hat = [[sample_ntt(rho, i, j) for j in range(k)] for i in range(k)]
    
    # Sample secret s and error e from CBD(eta1)
    s = [ntt(cbd(eta1, sigma, n)) for n in range(k)]
    e = [ntt(cbd(eta1, sigma, n + k)) for n in range(k)]
    
    # t = A·s + e  (all in NTT domain)
    t_hat = [
        [(A_hat[i][j] @ s[j]) for j in range(k)]  # pseudocode
        for i in range(k)
    ]
    # ... sum columns, add e ...
    
    pk = encode(t_hat, rho)         # 1184 bytes for k=3
    sk = encode(s, pk, H(pk), z)    # 2400 bytes for k=3
    return pk, sk
  

Production implementations: use liboqs, pqcrypto (Rust), or OpenSSL 3.5+ with ML-KEM support.

Hybrid Schemes & Migration

The transition strategy: combine classical + PQC for defence-in-depth.

Hybrid Key Exchange

TLS 1.3 Hybrid Handshake

X25519 Key Share
+
ML-KEM-768 Encaps Key

↓ Combined Key Share in ClientHello

SS₁ = X25519(a, B)
SS₂ = ML-KEM.Decaps(ct)

↓ KDF(SS₁ ‖ SS₂) → session keys

If either algorithm is broken, the other still protects the session. Already deployed by Cloudflare, Google Chrome, AWS.

Migration Roadmap

Phase 1: Inventory (Now)

Identify all uses of RSA, ECDH, ECDSA in protocols, certificates, and stored data. Prioritise long-lived secrets.

Phase 2: Hybrid Deploy (2025–2027)

Deploy hybrid TLS with ML-KEM + X25519. Update certificate chains. Test interoperability.

Phase 3: Pure PQC (2028–2035)

Transition to PQC-only where hybrid overhead is unacceptable. Deprecate classical public-key algorithms per NIST IR 8547.

Crypto-Agility

Design systems to swap algorithms without rewriting protocols. Use negotiation (like TLS cipher suites) and abstraction layers.

Algorithm Decision Matrix

AlgorithmTypePKSKCT/SigSpeedMaturityBest For
ML-KEM-768KEM1,184 B2,400 B1,088 BVery fastFIPS 203TLS, VPN, general KE
ML-DSA-65Sig1,952 B4,000 B3,293 BFastFIPS 204Code signing, auth, TLS
SLH-DSA-128fSig32 B64 B17,088 BSlow signFIPS 205Firmware, long-term archive
FN-DSA-512Sig897 B1,281 B690 BFastDraft 206Bandwidth-constrained
HQC-128KEM~2,249 B~2,289 B~4,497 BModerateSelected 2025ML-KEM backup (diversity)
RSA-2048KE+Sig256 B256 B256 BSlow KGBroken by QCLegacy (deprecate)
ECDSA P-256Sig64 B32 B64 BFastBroken by QCLegacy (deprecate)

Recommendation: Use ML-KEM-768 for key exchange and ML-DSA-65 for general signing. Add SLH-DSA where conservative hash-only assumptions are required. Deploy hybrid initially.

References & Standards

NIST Standards
FIPS 203 — ML-KEM (Module-Lattice-Based Key-Encapsulation Mechanism)
FIPS 204 — ML-DSA (Module-Lattice-Based Digital Signature Algorithm)
FIPS 205 — SLH-DSA (Stateless Hash-Based Digital Signature Algorithm)
NIST IR 8547 — Transition to Post-Quantum Cryptography Standards
SP 800-208 — Stateful Hash-Based Signature Schemes (LMS, XMSS)

Foundational Papers
Shor, "Polynomial-Time Algorithms for Prime Factorization and Discrete Logarithms on a Quantum Computer" (1994)
Regev, "On Lattices, Learning with Errors, Random Linear Codes, and Cryptography" (2005, Gödel Prize 2018)
Lyubashevsky, Peikert, Regev, "On Ideal Lattices and Learning with Errors Over Rings" (2010)
Bos et al., "CRYSTALS-Kyber: A CCA-Secure Module-Lattice-Based KEM" (2018)
Ducas et al., "CRYSTALS-Dilithium: A Lattice-Based Digital Signature Scheme" (2018)
Bernstein et al., "The SPHINCS+ Signature Framework" (2019)

Implementation Resources
Open Quantum Safe (liboqs) · PQ-Crystals reference code · OpenSSL 3.5+ PQC support · NIST ACVP test vectors