MODERN CRYPTOGRAPHY SERIES · PRESENTATION 07

Post-Quantum Cryptography

Lattices · ML-KEM · ML-DSA · SLH-DSA · NIST PQC Standards · Hardware

Navigate with arrow keys · Press Esc for overview · Press F for fullscreen

Why Post-Quantum?

Every public-key cryptosystem in wide deployment today — RSA, Diffie-Hellman, ECDSA, ECDH, EdDSA — derives security from one of two problems: integer factorisation or the discrete logarithm problem.

Shor's algorithm (1994) solves both in polynomial time on a quantum computer, reducing exponential-class problems to O(n³) in qubit-circuit depth.

⚠ Broken by Shor's Algorithm

RSA (integer factorisation)
Diffie-Hellman / DSA (DLP over ℤ_p*)
ECDH / ECDSA / EdDSA (ECDLP)
ElGamal encryption

✓ Quantum-Resistant

AES-256 (Grover halves key strength → 128-bit)
SHA-3, SHAKE (output doubling suffices)
Lattice-based: ML-KEM, ML-DSA
Hash-based: SLH-DSA, LMS/XMSS

Grover's algorithm gives only a quadratic speedup against symmetric ciphers — doubling key length neutralises the threat.

The Quantum Threat Timeline

1994
Shor publishes factoring algorithm

1996
Grover's search algorithm — quadratic speedup

2016
NIST calls for PQC submissions (82 received)

2022
NIST selects Kyber, Dilithium, SPHINCS+ as finalists

Aug 2024
FIPS 203, 204, 205 finalised — ML-KEM, ML-DSA, SLH-DSA

Mar 2025
HQC selected as backup KEM (code-based)

2027
CNSA 2.0 — new NSS acquisitions must be PQC-compliant

2035
NIST deprecates all quantum-vulnerable algorithms

🕐 "Harvest Now, Decrypt Later"

Adversaries record encrypted traffic today, planning to decrypt it once a Cryptographically Relevant Quantum Computer (CRQC) is available. Long-lived secrets are already at risk.

Qubit Estimates to Break RSA-2048

~4,000 error-corrected logical qubits, which translates to millions of physical qubits with current error rates. Estimates for CRQC arrival: 2030s–2040s.

Symmetric Impact

Grover reduces AES-128 → ~64-bit security. Mitigation: use AES-256 (→128-bit post-quantum).

PQC Algorithm Families

Post-quantum schemes derive hardness from problems believed resistant to both classical and quantum algorithms.

🔷 Lattice-Based

Hard problem: Learning With Errors (LWE), Module-LWE, Ring-LWE
Standards: ML-KEM (FIPS 203), ML-DSA (FIPS 204), FN-DSA (draft FIPS 206)
Pros: Small keys, fast operations, versatile
Cons: Relatively newer hardness assumptions

🔶 Hash-Based

Hard problem: Preimage/collision resistance of hash functions
Standards: SLH-DSA (FIPS 205), LMS/XMSS (SP 800-208)
Pros: Very conservative, minimal assumptions, tiny keys
Cons: Large signatures, slower signing

🔵 Code-Based

Hard problem: Decoding random linear codes
Standards: HQC (selected March 2025, draft ~2026)
Pros: Decades of study (McEliece, 1978), diverse math basis
Cons: Very large public keys (Classic McEliece)

🟣 Multivariate / Isogeny

Hard problems: MQ problem, supersingular isogeny
Status: SIKE broken (2022); Rainbow broken (2022); research continues
Notes: Some multivariate signatures still in NIST additional round

Lattice Fundamentals

A lattice ℒ(B) is the set of all integer linear combinations of basis vectors B = {b₁, b₂, …, bₙ} ⊂ ℝⁿ:

ℒ(B) = { Σᵢ zᵢ \cdot bᵢ | zᵢ \in ℤ }

Interactive 2D lattice — click to see closest vector problem (CVP)

Hard Lattice Problems

SVP — Shortest Vector Problem: find the shortest non-zero vector in ℒ
CVP — Closest Vector Problem: given a target point, find the nearest lattice point
SIVP — Shortest Independent Vectors: find n shortest linearly independent vectors
GapSVP — Decision variant: is λ₁(ℒ) ≤ d or > γ·d?

Best known classical algorithms (BKZ/LLL) run in 2^O(n) time. No known quantum algorithm provides more than polynomial improvement over classical lattice algorithms.

Learning With Errors (LWE)

Introduced by Oded Regev (2005, Gödel Prize 2018). LWE is the computational foundation for ML-KEM and ML-DSA.

The Problem

Given a random matrix A ∈ ℤ_q^m×n and a secret vector s ∈ ℤ_qⁿ, compute:

b = A\cdots + e (mod q)

where e is a "small" error vector sampled from a discrete Gaussian χ.

The search problem: given (A, b), recover s.
The decision problem: distinguish (A, b) from uniform random.

Without noise: trivial Gaussian elimination — O(n³). With noise: as hard as worst-case lattice problems (GapSVP, SIVP).

Variants

Ring-LWE (RLWE)

Work in R_q = ℤ_q[x]/(xⁿ+1). Structured — O(n log n) operations via NTT. Basis for Kyber/ML-KEM.

Module-LWE (MLWE)

Matrices of ring elements: A ∈ R_q^k×k. Bridges Ring-LWE efficiency with LWE flexibility. Parameter k controls security/performance trade-off.

Worst-Case to Average-Case

Regev's reduction: solving random LWE instances is at least as hard as solving worst-case GapSVP — a unique guarantee among cryptographic assumptions.

Interactive: LWE in Action

This demonstrates why noise makes LWE hard. Without error, the system is trivially solvable.

Parameters: n=4, q=97

No noise (e=0) Small noise (σ=1) Medium noise (σ=3) Large noise (σ=8)

Error distribution: discrete samples from e mod q

NIST PQC Standards (Aug 2024)

FIPS	Algorithm	Original Name	Type	Hard Problem	Purpose
203	ML-KEM	CRYSTALS-Kyber	Lattice	Module-LWE	Key Encapsulation
204	ML-DSA	CRYSTALS-Dilithium	Lattice	Module-LWE + SIS	Digital Signatures
205	SLH-DSA	SPHINCS+	Hash	Hash preimage	Digital Signatures

Coming: FIPS 206 — FN-DSA

Based on FALCON. FFT over NTRU lattices. Compact signatures (~690 B at 128-bit security). Draft expected 2025.

Coming: HQC

Code-based KEM selected March 2025 as backup to ML-KEM. Different mathematical basis (Hamming Quasi-Cyclic codes). Draft standard ~2026, final ~2027.

SP 800-208: LMS/XMSS

Stateful hash-based signature schemes. Already approved. Require strict state management — suitable for firmware signing, not TLS.

NIST IR 8547: transition timeline — deprecate quantum-vulnerable algorithms by 2035, with high-risk systems transitioning much earlier.

ML-KEM (FIPS 203) — Deep Dive

Module-Lattice-Based Key-Encapsulation Mechanism, derived from CRYSTALS-Kyber.

Construction Overview

ML-KEM is built in two layers:

1. K-PKE — an IND-CPA public-key encryption scheme based on Module-LWE. Encrypts a fixed 32-byte message.

2. KEM transform — a modified Fujisaki-Okamoto (FO) transform applied to K-PKE, yielding IND-CCA2 security.

KeyGen: (A, t = As + e) where A \in R q k\timesk, s,e \leftarrow CBD(η) Encaps: c = Compress(Au + e₁, t T u + e₂ + ⌈q/2⌋\cdotm) Decaps: m' = Decompress(c₂ - s T c₁), re-encrypt, check

The ring R_q = ℤ_q[x]/(x²⁵⁶+1) with q = 3329. Polynomial multiplication via Number Theoretic Transform (NTT).

Parameter Sets

Set	k	η₁,η₂	PK (B)	SK (B)	CT (B)	NIST Cat
ML-KEM-512	2	3, 2	800	1632	768	1
ML-KEM-768	3	2, 2	1184	2400	1088	3
ML-KEM-1024	4	2, 2	1568	3168	1568	5

Shared secret: always 32 bytes. NIST Category 1 ≈ AES-128 security; Cat 3 ≈ AES-192; Cat 5 ≈ AES-256.

Why Module-LWE?

Module-LWE generalises Ring-LWE (one polynomial → k×k matrix of polynomials). Tuning k adjusts security without changing ring dimension. ML-KEM-768 (k=3) is the recommended general-purpose parameter set.

ML-KEM Protocol Flow

Alice (Initiator)

KeyGen():
1. Sample A ← ℤ_q[x]/(x²⁵⁶+1) via SHAKE-128
2. Sample s, e ← CBD(η₁)
3. t = NTT(A)·NTT(s) + NTT(e)
4. pk = (t̂, ρ), sk = (s, pk, H(pk), z)

Decaps(sk, ct):
5. Decrypt ct → m'
6. Re-encrypt m' → ct'
7. If ct' = ct: K = KDF(m', H(ct))
Else: K = KDF(z, H(ct))

pk (800–1568 B)
→→→

←←←
ct (768–1568 B)

Bob (Responder)

Encaps(pk):
1. m ← random 32 bytes
2. (K̂, r) = G(m ‖ H(pk))
3. Sample r₁, r₂, e₁ ← CBD(η) using r
4. u = NTT⁻¹(Â^T·NTT(r₁)) + e₁
5. v = NTT⁻¹(t̂^T·NTT(r₁)) + e₂ + Compress(m)
6. ct = (Compress(u), Compress(v))
7. K = KDF(K̂, H(ct))

Both parties derive the same 32-byte shared secret K. The FO transform ensures IND-CCA2 security via implicit rejection.

Number Theoretic Transform (NTT)

The NTT is the workhorse of lattice cryptography — it converts polynomial multiplication from O(n²) to O(n log n).

How It Works

The NTT is the finite-field analogue of the FFT, operating over ℤ_q instead of ℂ.

Given f(x), g(x) \in R q = ℤ q [x]/(x 256 +1), q = 3329: 1. f̂ = NTT(f) — 7 layers of butterfly operations 2. ĝ = NTT(g) 3. ĥ = f̂ ⊙ ĝ — pointwise multiply (128 pairs) 4. h = NTT⁻¹(ĥ) — inverse transform Root of unity: ζ = 17 (ζ²⁵⁶ \equiv -1 mod 3329) Why q = 3329? Prime, q \equiv 1 (mod 256), fits in 12 bits.

Butterfly Operation

ML-KEM uses 7 layers of 128 butterflies. Each butterfly: 1 multiply + 1 add + 1 subtract in ℤ₃₃₂₉. Total: 128 × 7 = 896 butterflies per NTT.

Hardware: can be parallelised into systolic arrays. FPGA implementations achieve NTT in <1 μs at modest clock rates.

ML-DSA (FIPS 204) — Digital Signatures

Module-Lattice-Based Digital Signature Algorithm, derived from CRYSTALS-Dilithium.

Fiat-Shamir with Aborts

ML-DSA uses the "Fiat-Shamir with Aborts" paradigm (Lyubashevsky, 2009):

KeyGen: A \leftarrow R q k\timesℓ, (s₁,s₂) \leftarrow small, t = As₁ + s₂ pk = (A, t), sk = (A, t, s₁, s₂) Sign(sk, M): repeat: y \leftarrow uniform with ‖y‖\infty < γ₁ w = Ay c̃ = H(HighBits(w) ‖ M) c = SampleInBall(c̃) — sparse polynomial, τ nonzero coeffs z = y + c\cdots₁ until ‖z‖\infty < γ₁-β and ‖LowBits(w - c\cdots₂)‖\infty < γ₂-β σ = (c̃, z, hints) Verify(pk, M, σ): w' = Az - c\cdott, check HighBits match c̃

Parameter Sets

Set	(k,ℓ)	PK (B)	SK (B)	Sig (B)	Cat
ML-DSA-44	(4,4)	1,312	2,528	2,420	2
ML-DSA-65	(6,5)	1,952	4,000	3,293	3
ML-DSA-87	(8,7)	2,592	4,864	4,595	5

Why "Aborts"?

The masking vector y hides s₁ in z = y + c·s₁. If z leaks information about s₁ (by being too large or having suspicious low bits), the signature is rejected and the signer tries again. Average ~4–7 iterations per signature.

Security Basis

Relies on both Module-LWE (key generation) and Module-SIS (Short Integer Solution — signature forgery). Both reduce to worst-case lattice hardness.

SLH-DSA (FIPS 205) — Hash-Based Signatures

Stateless Hash-Based Digital Signature Algorithm, derived from SPHINCS+. The conservative choice — security rests solely on hash function properties.

Architecture

Hypertree Root (Public Key)

↑ XMSS tree layer d−1

↑ ... (d layers of XMSS Merkle trees) ...

↑ XMSS tree layer 0

FORS Signature (Few-time OTS)

↑

Message Hash → index selection

Each layer: WOTS+ one-time signatures authenticate tree roots. FORS (Forest of Random Subsets) signs the message. The hypertree root is the public key.

Key & Signature Sizes

Parameter Set	PK (B)	SK (B)	Sig (B)	Cat
SLH-DSA-*-128s	32	64	7,856	1
SLH-DSA-*-128f	32	64	17,088	1
SLH-DSA-*-192s	48	96	16,224	3
SLH-DSA-*-192f	48	96	35,664	3
SLH-DSA-*-256s	64	128	29,792	5
SLH-DSA-*-256f	64	128	49,856	5

Hash choice: SHA-256 or SHAKE-256. Suffix s = smaller sig / slower; f = fast sign / larger sig.

Trade-off: Keys vs Signatures

SLH-DSA has the smallest keys of any PQC signature (32–64 B public key!) but the largest signatures (7–50 KB). Ideal for code signing, firmware verification — not TLS handshakes.

Interactive: Key & Signature Sizes

Compare PQC algorithms against classical schemes. Select a metric to visualise:

Public Key Secret Key Ciphertext / Signature

All sizes in bytes. Classical schemes shown for reference. ML-KEM-768 is the recommended general-purpose KEM.

Performance Benchmarks

PQC algorithms are competitive with — and often faster than — classical schemes in software.

Operation	ML-KEM-768	ECDH (X25519)	RSA-2048
KeyGen	~30 μs	~60 μs	~200 ms
Encaps / Encrypt	~40 μs	~60 μs	~10 μs
Decaps / Decrypt	~35 μs	~60 μs	~5 ms

Operation	ML-DSA-65	Ed25519	RSA-2048
KeyGen	~80 μs	~30 μs	~200 ms
Sign	~250 μs (avg)	~50 μs	~5 ms
Verify	~90 μs	~70 μs	~10 μs

✓ PQC Advantage

ML-KEM key generation is ~6000× faster than RSA-2048. Total KEM round-trip is comparable to ECDH. NTT-based arithmetic is highly parallelisable.

⚠ PQC Trade-off

Larger key/signature sizes increase bandwidth. ML-DSA signing requires retry loops (~4–7 iterations). SLH-DSA signing is 100–1000× slower than ML-DSA.

Hardware Implementation

PQC accelerators for SoCs, FPGAs, and ASICs.

NTT Accelerator Architecture

Coeff RAM
(256×12-bit)

→

Butterfly Unit
mul + add/sub mod q

→

Result RAM

Twiddle ROM
ζ^k table

↗

Control FSM
7 stages × 128 ops

Core operation: modular multiply in ℤ₃₃₂₉ — 12×12-bit multiplier + Barrett reduction
Parallelism: 1 butterfly/cycle → 896 cycles per NTT, or 2–4 parallel butterflies for throughput
Keccak/SHAKE co-processor for sampling and hashing

Implementation Results

Platform	Algorithm	KeyGen	Encaps	Area
Cortex-M4	ML-KEM-768	~500K cyc	~600K cyc	—
Artix-7 FPGA	ML-KEM-768	~3 μs	~4 μs	~5K LUTs
28nm ASIC	ML-KEM-768	<1 μs	<1 μs	~25K GE
Cortex-M4	ML-DSA-65	~1.5M cyc	~5M cyc	—

Design Considerations for SoC Integration

Shared NTT engine between KEM and signature operations
SHAKE-256/Keccak permutation block — also used in SHA-3
CBD (Centered Binomial Distribution) sampler — simple XOR + popcount
Constant-time implementation essential — avoid timing side-channels in Barrett reduction and NTT butterfly

Python: Simplified ML-KEM Core

A pedagogical implementation of the core NTT and polynomial arithmetic in ℤ₃₃₂₉.


Q = 3329          # ML-KEM modulus
N = 256           # polynomial degree
ZETA = 17         # primitive 256th root of unity mod Q

# Precompute twiddle factors: ζ^(BitRev₇(i)) for NTT layers
def precompute_zetas():
    zetas = [0] * 128
    for i in range(128):
        br = int(f'{i:07b}'[::-1], 2)        # 7-bit bit-reversal
        zetas[i] = pow(ZETA, br, Q)
    return zetas

ZETAS = precompute_zetas()

def ntt(f):
    """Number Theoretic Transform: O(n log n) polynomial → NTT domain."""
    a = list(f)
    k, length = 0, 128
    while length >= 1:
        for start in range(0, N, 2 * length):
            zeta = ZETAS[k]; k += 1
            for j in range(start, start + length):
                t = (zeta * a[j + length]) % Q
                a[j + length] = (a[j] - t) % Q
                a[j]          = (a[j] + t) % Q
        length //= 2
    return a

def pointwise_mul(f_hat, g_hat):
    """Pointwise multiplication in NTT domain (128 degree-1 polynomials)."""
    h = [0] * N
    for i in range(0, N, 2):
        z = ZETAS[64 + i // 2]
        h[i]   = (f_hat[i]*g_hat[i] + f_hat[i+1]*g_hat[i+1]*z) % Q
        h[i+1] = (f_hat[i]*g_hat[i+1] + f_hat[i+1]*g_hat[i]) % Q
    return h

Python: Key Generation Sketch


import os, hashlib

def cbd(eta, seed, nonce):
    """Centered Binomial Distribution — sample small polynomial."""
    # In production: expand seed with SHAKE-256, extract 64·eta bytes
    # Here simplified for illustration
    coeffs = [0] * N
    data = hashlib.shake_256(seed + bytes([nonce])).digest(64 * eta)
    for i in range(N):
        a = sum((data[2*eta*i + j] >> b) & 1
                for j in range(eta) for b in range(8)) 
        # ... (simplified — real CBD sums bit-pairs from byte stream)
        coeffs[i] = (a % (2*eta + 1)) - eta
    return [c % Q for c in coeffs]

def ml_kem_keygen(k=3, eta1=2):
    """ML-KEM-768 key generation (simplified sketch)."""
    rho = os.urandom(32)            # public seed for matrix A
    sigma = os.urandom(32)          # private seed for s, e
    
    # Generate A ∈ R_q^{k×k} from rho (deterministic via SHAKE-128)
    A_hat = [[sample_ntt(rho, i, j) for j in range(k)] for i in range(k)]
    
    # Sample secret s and error e from CBD(eta1)
    s = [ntt(cbd(eta1, sigma, n)) for n in range(k)]
    e = [ntt(cbd(eta1, sigma, n + k)) for n in range(k)]
    
    # t = A·s + e  (all in NTT domain)
    t_hat = [
        [(A_hat[i][j] @ s[j]) for j in range(k)]  # pseudocode
        for i in range(k)
    ]
    # ... sum columns, add e ...
    
    pk = encode(t_hat, rho)         # 1184 bytes for k=3
    sk = encode(s, pk, H(pk), z)    # 2400 bytes for k=3
    return pk, sk

Production implementations: use liboqs, pqcrypto (Rust), or OpenSSL 3.5+ with ML-KEM support.

Hybrid Schemes & Migration

The transition strategy: combine classical + PQC for defence-in-depth.

Hybrid Key Exchange

TLS 1.3 Hybrid Handshake

X25519 Key Share

ML-KEM-768 Encaps Key

↓ Combined Key Share in ClientHello

SS₁ = X25519(a, B)

‖

SS₂ = ML-KEM.Decaps(ct)

↓ KDF(SS₁ ‖ SS₂) → session keys

If either algorithm is broken, the other still protects the session. Already deployed by Cloudflare, Google Chrome, AWS.

Migration Roadmap

Phase 1: Inventory (Now)

Identify all uses of RSA, ECDH, ECDSA in protocols, certificates, and stored data. Prioritise long-lived secrets.

Phase 2: Hybrid Deploy (2025–2027)

Deploy hybrid TLS with ML-KEM + X25519. Update certificate chains. Test interoperability.

Phase 3: Pure PQC (2028–2035)

Transition to PQC-only where hybrid overhead is unacceptable. Deprecate classical public-key algorithms per NIST IR 8547.

Crypto-Agility

Design systems to swap algorithms without rewriting protocols. Use negotiation (like TLS cipher suites) and abstraction layers.

Algorithm Decision Matrix

Algorithm	Type	PK	SK	CT/Sig	Speed	Maturity	Best For
ML-KEM-768	KEM	1,184 B	2,400 B	1,088 B	Very fast	FIPS 203	TLS, VPN, general KE
ML-DSA-65	Sig	1,952 B	4,000 B	3,293 B	Fast	FIPS 204	Code signing, auth, TLS
SLH-DSA-128f	Sig	32 B	64 B	17,088 B	Slow sign	FIPS 205	Firmware, long-term archive
FN-DSA-512	Sig	897 B	1,281 B	690 B	Fast	Draft 206	Bandwidth-constrained
HQC-128	KEM	~2,249 B	~2,289 B	~4,497 B	Moderate	Selected 2025	ML-KEM backup (diversity)
RSA-2048	KE+Sig	256 B	256 B	256 B	Slow KG	Broken by QC	Legacy (deprecate)
ECDSA P-256	Sig	64 B	32 B	64 B	Fast	Broken by QC	Legacy (deprecate)

Recommendation: Use ML-KEM-768 for key exchange and ML-DSA-65 for general signing. Add SLH-DSA where conservative hash-only assumptions are required. Deploy hybrid initially.

References & Standards

NIST Standards
FIPS 203 — ML-KEM (Module-Lattice-Based Key-Encapsulation Mechanism)
FIPS 204 — ML-DSA (Module-Lattice-Based Digital Signature Algorithm)
FIPS 205 — SLH-DSA (Stateless Hash-Based Digital Signature Algorithm)
NIST IR 8547 — Transition to Post-Quantum Cryptography Standards
SP 800-208 — Stateful Hash-Based Signature Schemes (LMS, XMSS)

Foundational Papers
Shor, "Polynomial-Time Algorithms for Prime Factorization and Discrete Logarithms on a Quantum Computer" (1994)
Regev, "On Lattices, Learning with Errors, Random Linear Codes, and Cryptography" (2005, Gödel Prize 2018)
Lyubashevsky, Peikert, Regev, "On Ideal Lattices and Learning with Errors Over Rings" (2010)
Bos et al., "CRYSTALS-Kyber: A CCA-Secure Module-Lattice-Based KEM" (2018)
Ducas et al., "CRYSTALS-Dilithium: A Lattice-Based Digital Signature Scheme" (2018)
Bernstein et al., "The SPHINCS+ Signature Framework" (2019)

Implementation Resources
Open Quantum Safe (liboqs) · PQ-Crystals reference code · OpenSSL 3.5+ PQC support · NIST ACVP test vectors