Presentation 04

Hash Functions & MACs

Integrity, Authentication & Commitment

SHA-2 · SHA-3 / Keccak · Merkle-Damgård · Sponge Construction
HMAC · KMAC · Poly1305 · Merkle Trees · Password Hashing

→ / ← navigate Esc overview F fullscreen ?print-pdf export

What Is a Cryptographic Hash Function?

A function H : {0,1}* → {0,1}ⁿ mapping arbitrary-length input to a fixed-length digest.

Core Properties

Deterministic — same input always produces the same output

Fixed output — regardless of input length, output is always n bits

Efficient — computing H(m) is fast (linear in |m|)

Avalanche effect — flipping one input bit changes ~50% of output bits

One-Way Function

Given a digest d = H(m), it must be computationally infeasible to find any message m' such that H(m') = d.

This is not about secrecy of the algorithm — the function is public. It is about the computational difficulty of inversion.

Compression

Since the domain is infinite and the range is finite (2ⁿ), collisions must exist by the pigeonhole principle. The security requirement is that they are hard to find.

Security Properties

Preimage Resistance

Given d, find any m such that H(m) = d.

Cost: O(2ⁿ)

Also called "one-wayness." Generic attack: brute-force search over messages.

Second-Preimage Resistance

Given m₁, find m₂ ≠ m₁ such that H(m₁) = H(m₂).

Cost: O(2ⁿ)

Harder than collision resistance — one input is fixed by the attacker's target.

Collision Resistance

Find any m₁ ≠ m₂ such that H(m₁) = H(m₂).

Cost: O(2^n/2)

The birthday bound — in a set of 2^n/2 random digests, a collision is expected with probability > 50%.

    Birthday Paradox: For an n-bit hash, collision search costs ~O(2n/2) evaluations. SHA-256 thus provides ~2128 collision resistance.
  

Implications for Design

Collision resistance ⇒ second-preimage resistance (for most practical constructions), but not vice versa. A hash function with an n-bit digest should target n/2 bits of collision security and n bits of (second-)preimage security.

Rogaway & Shrimpton, "Cryptographic Hash-Function Basics" (2004)

Merkle-Damgård Construction

The dominant paradigm for hash function design from 1979 through SHA-2. Reduces collision resistance of the full hash to collision resistance of a fixed-size compression function.

→

f
↑ m₁

→

f
↑ m₂

→

f
↑ m₃

→ ··· →

f
↑ m_k||pad||len

→

H(m)

How It Works

1. Pad message to multiple of block size (512 bits for SHA-256)

2. Append length field (Merkle-Damgård strengthening)

3. Split padded message into blocks m₁, m₂, ..., m_k

4. Chain: h₀ = IV, h_i = f(h_i-1, m_i)

5. Output: H(m) = h_k

Length Extension Attack

Given H(m) and |m|, an attacker can compute H(m || pad || m') without knowing m.

The attacker uses H(m) as the chaining value and continues the Merkle-Damgård iteration with new blocks m'.

Affects: MD5, SHA-1, SHA-256, SHA-512

Not affected: SHA-3, HMAC, SHA-512/256, truncated variants

Merkle, "One Way Hash Functions and DES" (CRYPTO 1989) · Damgård, "A Design Principle for Hash Functions" (CRYPTO 1989)

MD5 & SHA-1 DEPRECATED

MD5 (1992)

128-bit digest, 512-bit blocks, 64 rounds

2004 — Wang et al. find collisions in 2³⁹ operations

2006 — Practical collision in seconds on a laptop

2008 — Rogue CA certificate attack (Sotirov et al.)

2012 — Flame malware used MD5 collision for code signing

Only 128-bit ⇒ birthday bound is just 2⁶⁴, and actual attacks far below that.

SHA-1 (1995)

160-bit digest, 512-bit blocks, 80 rounds

NSA design, published as FIPS 180-1

2005 — Wang et al. theoretical collision in 2⁶³

2017 — SHAttered: first practical collision (Google/CWI), 2^63.1 SHA-1 computations, ~110 GPU-years

2020 — SHA-mbles: chosen-prefix collision for $45k in cloud compute

CA/Browser Forum banned SHA-1 certs in 2017. Git transitioning to SHA-256.

    Both MD5 and SHA-1 are cryptographically broken.
     Do not use for digital signatures, certificate validation, or any collision-resistant application. Legacy use for checksums (non-adversarial) persists but is discouraged.
  

Wang et al., "Collisions for Hash Functions MD4, MD5, HAVAL-128" (CRYPTO 2004) · Stevens et al., "The First Collision for Full SHA-1" (CRYPTO 2017)

SHA-2 Family FIPS 180-4

Published by NIST in 2001 (SHA-256/384/512), with SHA-224 and SHA-512/256 added later. Still the most widely deployed secure hash family.

Variant	Digest	Block Size	Word Size	Rounds	Operations
SHA-224	224 bits	512 bits	32 bits	64	+, AND, OR, XOR, SHR, ROTR
SHA-256	256 bits	512 bits	32 bits	64	+, AND, OR, XOR, SHR, ROTR
SHA-384	384 bits	1024 bits	64 bits	80	+, AND, OR, XOR, SHR, ROTR
SHA-512	512 bits	1024 bits	64 bits	80	+, AND, OR, XOR, SHR, ROTR
SHA-512/256	256 bits	1024 bits	64 bits	80	+, AND, OR, XOR, SHR, ROTR

Design Lineage

Merkle-Damgård structure with Davies-Meyer compression function: h_i = E_{m_i}(h_i-1) + h_i-1

Constants K_t derived from the fractional parts of cube roots of the first 64 (or 80) primes — nothing-up-my-sleeve numbers.

Why SHA-512/256?

Uses the 64-bit SHA-512 core (faster on 64-bit CPUs) but truncates to 256 bits.

Different IV than SHA-512 ⇒ no length extension attack between variants.

On x86-64: SHA-512/256 is often ~50% faster than SHA-256.

FIPS 180-4 — Secure Hash Standard (NIST, 2015)

SHA-256 Compression Function

Each round updates eight 32-bit working variables a, b, c, d, e, f, g, h using two auxiliary functions and modular addition.

↓

T₁ = h + Σ₁(e) + Ch(e,f,g) + K_t + W_t
T₂ = Σ₀(a) + Maj(a,b,c)

↓

T₁+T₂

d+T₁

Auxiliary Functions

Ch(e,f,g) = (e AND f) XOR (NOT e AND g)

"Choose" — e selects between f and g bitwise

Maj(a,b,c) = (a AND b) XOR (a AND c) XOR (b AND c)

"Majority" — output bit is the majority of the three inputs

Sigma Functions

Σ₀(a) = ROTR²(a) ⊕ ROTR¹³(a) ⊕ ROTR²²(a)

Σ₁(e) = ROTR⁶(e) ⊕ ROTR¹¹(e) ⊕ ROTR²⁵(e)

All additions are modulo 2³². Each round mixes non-linear selection (Ch), majority voting (Maj), and bit rotation (Σ) to achieve diffusion.

FIPS 180-4, Section 6.2

SHA-256 Message Schedule

The 512-bit message block is expanded into 64 thirty-two-bit words W₀ ... W₆₃.

W_t = σ₁(W_t-2) + W_t-7 + σ₀(W_t-15) + W_t-16 for t = 16..63

First 16 Words

W₀ through W₁₅ are taken directly from the 512-bit message block (big-endian 32-bit words).

For a message block m = m₀m₁...m₁₅, each m_i is a 32-bit word.

Expansion Functions

σ₀(x) = ROTR⁷(x) ⊕ ROTR¹⁸(x) ⊕ SHR³(x)

σ₁(x) = ROTR¹⁷(x) ⊕ ROTR¹⁹(x) ⊕ SHR¹⁰(x)

Note: lowercase σ (message schedule) vs uppercase Σ (compression function). SHR is logical right shift; ROTR is circular right rotation.

Design Rationale

The schedule ensures every input bit influences multiple rounds. Without expansion, only 16 rounds would see fresh message material.

The choice of offsets (2, 7, 15, 16) and rotation constants is optimized for diffusion speed — after ~20 rounds, every bit of the message block has influenced every bit of the state.

Constants K_t

64 constants from fractional parts of cube roots of the first 64 primes:

K₀ = 0x428a2f98 (∛2)
K₁ = 0x71374491 (∛3)
K₂ = 0xb5c0fbcf (∛5)
K₃ = 0xe9b5dba5 (∛7) ...

"Nothing-up-my-sleeve" numbers — verifiably not chosen to create backdoors.

SHA-2 Performance & Hardware

Platform	Implementation	Throughput	Notes
x86-64 (no extensions)	Optimized C/ASM	~500 MB/s	Single-core, ~4 GHz
x86-64 (SHA-NI)	Intel SHA Extensions	~2.5 GB/s	Ice Lake+, single core
ARM (Crypto Ext.)	ARMv8-A SHA2 insns	~1.8 GB/s	Cortex-A76 class
Xilinx Virtex-7	Iterative FPGA	~5 Gbps	~800 slices
14nm ASIC	Full unrolled pipeline	~40 Gbps	Custom design
Bitcoin ASIC (S21)	Application-specific	~200 TH/s	double-SHA-256 only, ~3000W

Intel SHA-NI

Instructions: SHA256RNDS2, SHA256MSG1, SHA256MSG2

Two rounds per SHA256RNDS2 instruction ⇒ 32 instructions for all 64 rounds

Available since Goldmont (Atom), Ice Lake (mainstream)

Bitcoin Mining as Extreme Case

SHA-256d = SHA-256(SHA-256(header)). Global hashrate (2025): ~600 EH/s = 6×10²⁰ double-SHA-256/s.

The entire world's SHA-256 mining hardware demonstrates that SHA-256's structure is extremely hardware-friendly but also that brute-force search scales massively.

Intel Architecture Instruction Set Reference · Bitmain Antminer S21 specifications

SHA-3 / Keccak FIPS 202

A fundamentally different hash construction selected via a public NIST competition (2007–2012).

Why a New Standard?

Diversity — SHA-2 is Merkle-Damgård. If a structural attack against M-D is found, we need a backup with a completely different design.

Length extension — M-D hashes inherently vulnerable; SHA-3's sponge construction is immune.

Flexibility — SHA-3 natively supports XOFs (extendable output functions), domain separation, and arbitrary output lengths.

The Competition

2007 — NIST calls for SHA-3 candidates

2008 — 64 submissions received

2010 — 5 finalists: BLAKE, Grøstl, JH, Keccak, Skein

2012 — Keccak selected as SHA-3

The Keccak Team

Guido Bertoni (STMicroelectronics)

Joan Daemen (STMicroelectronics) — co-designer of AES/Rijndael

Michaël Peeters (NXP Semiconductors)

Gilles Van Assche (STMicroelectronics)

The team also developed Keyak (AEAD), Ketje (lightweight AEAD), and KangarooTwelve.

Key Innovation

Sponge construction — a single permutation-based framework that yields hash functions, MACs, PRNGs, stream ciphers, and authenticated encryption from one primitive.

FIPS 202 — SHA-3 Standard (NIST, 2015) · Bertoni et al., "Keccak Reference" (2011)

Sponge Construction

A mode of operation for a fixed-width permutation f : {0,1}^b → {0,1}^b that can process arbitrary input and produce arbitrary output.

ABSORB

0^b

→

⊕m₁
f

→

⊕m₂
f

→

⊕m_k
f

→

SQUEEZE

f
z₁↓

→

f
z₂↓

→ ···

Parameters

State width b = r + c = 1600 bits (for SHA-3)

Rate (r) — bits absorbed/squeezed per call to f. Higher r = faster.

Capacity (c) — bits never directly exposed. Security level = c/2.

Example: SHA3-256 uses r=1088, c=512 ⇒ 256-bit security.

How It Works

Absorb: XOR message blocks (r bits each) into the rate portion of the state, apply f after each block.

Squeeze: Extract r bits from the rate portion. Apply f to get more output if needed.

The capacity portion acts as a "secret" internal state that the attacker cannot directly observe or control.

Bertoni et al., "Cryptographic Sponge Functions" (2011)

Keccak-f Permutation

The state is a 3D array of 5 × 5 × 64 = 1600 bits. Each of 24 rounds applies five step mappings.

θ (Theta)

XOR each bit with the parity of its column and the adjacent column (shifted by one position).

Linear diffusion — ensures every output bit depends on ≥11 input bits.

ρ (Rho)

Rotate each of the 25 lanes by a different offset (0 to 63 positions).

Inter-slice diffusion: bits spread across the z-axis of the state.

π (Pi)

Rearrange the positions of the 25 lanes: (x,y) → (y, 2x+3y mod 5).

Long-range transposition — disrupts any column/row structure.

χ (Chi)

a[x] = a[x] XOR ((NOT a[x+1]) AND a[x+2])

The only non-linear step. Provides resistance to linear and differential cryptanalysis.

Algebraic degree 2. After multiple rounds, the algebraic degree grows exponentially.

ι (Iota)

XOR a round constant into lane (0,0). Breaks symmetry between rounds.

Without ι, all rounds would be identical and the permutation would have exploitable self-similarity. Constants are derived from an LFSR.

    Round pipeline: θ → ρ → π → χ → ι — repeated 24 times. Total: only ~6 basic operations per bit per round.
  

Bertoni et al., "The Keccak Reference" (Jan 2011), Section 1.2

SHA-3 Variants & Extensions

Function	Output	Rate (r)	Capacity (c)	Security	Type
SHA3-224	224 bits	1152	448	112-bit collision	Hash
SHA3-256	256 bits	1088	512	128-bit collision	Hash
SHA3-384	384 bits	832	768	192-bit collision	Hash
SHA3-512	512 bits	576	1024	256-bit collision	Hash
SHAKE128	variable	1344	256	128-bit all	XOF
SHAKE256	variable	1088	512	256-bit all	XOF

XOFs (Extendable Output Functions)

SHAKE128/256 can produce any number of output bytes by continuing the squeeze phase. Useful for:

• Key derivation (KDF)

• Pseudorandom number generation

• Domain separation via customization strings

NIST SP 800-185 Extensions

cSHAKE — customizable SHAKE with function name + customization string

TupleHash — hash of tuples with unambiguous encoding

ParallelHash — parallelizable hashing for large messages

KMAC — Keccak-based MAC (see slide 16)

KangarooTwelve (K12)

Reduced-round (12 vs 24) Keccak-p with tree hashing. ~6× faster than SHA3-256. Proposed by the Keccak team as a fast, safe alternative.

NIST SP 800-185 — SHA-3 Derived Functions (2016) · Guido Bertoni et al., "KangarooTwelve" (2016)

SHA-2 vs SHA-3 Comparison

Property	SHA-2 (SHA-256)	SHA-3 (SHA3-256)
Construction	Merkle-Damgård	Sponge (Keccak)
Primitive	Compression function (Davies-Meyer)	Permutation (Keccak-f[1600])
Block size	512 bits	1088 bits (rate)
State size	256 bits	1600 bits
Rounds	64	24
Length extension	Vulnerable	Immune
XOF support	No (needs external construction)	Native (SHAKE)
Software speed (x86)	~15 cycles/byte (SHA-NI: ~3.5)	~12 cycles/byte (no HW accel)
HW area (ASIC)	~10 kGE (iterative)	~13 kGE (iterative)
Grover (quantum)	128-bit → 2¹²⁸	128-bit → 2¹²⁸

    Both SHA-2 and SHA-3 are considered secure as of 2025. SHA-2 has broader hardware acceleration support; SHA-3 offers structural advantages and versatility. Modern systems should support both.
  

Bernstein & Lange, "eBACS: ECRYPT Benchmarking" · Keccak team, "Software Performance" (keccak.team)

HMAC RFC 2104 / FIPS 198-1

The standard construction for building a Message Authentication Code from a hash function.

HMAC-H(K, m) = H( (K ⊕ opad) ‖ H( (K ⊕ ipad) ‖ m ) )

Construction Details

K = key (if |K| > block size, use H(K); if shorter, pad with zeros)

ipad = 0x36 repeated to block size (inner padding)

opad = 0x5C repeated to block size (outer padding)

Step by Step

1. Compute inner hash: H((K ⊕ ipad) || m)

2. Compute outer hash: H((K ⊕ opad) || inner_hash)

The double hashing prevents length extension — even with a Merkle-Damgård hash, the outer hash seals the construction.

Security

Proven secure by Bellare, Canetti & Krawczyk (1996)

HMAC is a PRF (pseudorandom function) if the compression function of H is a PRF.

Security does not require collision resistance of H — HMAC-MD5 remains a secure MAC even though MD5 collisions are trivial.

Why Not Just H(K || m)?

Vulnerable to length extension: attacker can compute H(K || m || pad || m') from H(K || m).

H(m || K) is also problematic — offline collision search on H(m || K) for different m values.

HMAC's nested structure avoids both attacks.

Bellare, Canetti & Krawczyk, "Keying Hash Functions for Message Authentication" (CRYPTO 1996) · RFC 2104

KMAC & Other MACs

KMAC (SP 800-185)

Keccak-based MAC — no nested construction needed.

KMAC256(K, X, L, S) = cSHAKE256(bytepad(encode(K),168) || X || right_encode(L), L, "KMAC", S)

Because the sponge construction has capacity bits that absorb the key, KMAC is simpler and avoids HMAC's double-hash overhead.

Also supports XOF mode (KMACXOF128/256) for variable-length output.

Poly1305 (Bernstein)

One-time MAC based on polynomial evaluation over GF(2¹³⁰-5).

Paired with ChaCha20 as ChaCha20-Poly1305 (RFC 8439) — the primary AEAD in TLS 1.3 alongside AES-GCM.

Extremely fast in software: ~1 cycle/byte on modern x86.

CMAC (SP 800-38B)

Cipher-based MAC built on block ciphers (typically AES).

Fixes CBC-MAC vulnerability (arbitrary-length messages) using a subkey derivation step and final XOR.

CMAC = AES-CBC with K1/K2 subkeys derived from AES(K, 0)

GMAC

The authentication component of AES-GCM used standalone (empty plaintext).

Based on GHASH — universal hashing in GF(2¹²⁸).

Requires unique nonces; nonce reuse completely breaks authenticity.

Comparison

HMAC — hash-based, widely deployed, works with any hash

KMAC — sponge-based, cleaner, SHA-3 ecosystem

CMAC — cipher-based, good when you already have AES

Poly1305 — fastest in software, one-time (needs cipher)

Hash-Based Signatures

Digital signatures built solely on the security of hash functions — the most conservative post-quantum signature approach.

Lamport One-Time Signatures

For each bit of the message hash:

• Generate two random values (sk₀, sk₁)

• Publish H(sk₀), H(sk₁) as public key

• To sign bit b, reveal sk_b

One-time only — revealing sk values for different messages leaks key material.

Key size for 256-bit security: ~16 KB private, ~16 KB public, ~8 KB signature.

Winternitz OTS (WOTS+)

Chains of hash evaluations trade signature size for computation. Parameter w controls the tradeoff.

WOTS+ with w=16: ~2.5 KB signatures (vs 8 KB for Lamport).

Stateful: XMSS & LMS

XMSS (RFC 8391) and LMS (RFC 8554, SP 800-208) use Merkle trees to authenticate many one-time keys.

Stateful: signer must never reuse a leaf index. State management is critical.

Approved by NIST for firmware/software updates where state tracking is feasible.

Stateless: SLH-DSA (SPHINCS+)

FIPS 205 — standardized 2024. Based on SPHINCS+.

Stateless — no state management required. Uses a hyper-tree of WOTS+ / FORS trees.

Tradeoff: larger signatures (~7–49 KB) and slower signing than lattice-based schemes, but security relies only on hash function properties.

Parameter sets: SLH-DSA-128s (small), SLH-DSA-128f (fast), up to 256-bit security.

FIPS 205 (SLH-DSA, 2024) · SP 800-208 (LMS/XMSS, 2020) · Bernstein et al., "SPHINCS+" (2019)

Merkle Trees

A tree of hashes enabling efficient proof of inclusion for any leaf in O(log n) hashes.

Root = H(H₀₁ || H₂₃)

H₀₁ = H(H₀||H₁)

H₂₃ = H(H₂||H₃)

H₀=H(D₀)

H₁=H(D₁)

H₂=H(D₂)

H₃=H(D₃)

D₀

D₁

D₂

D₃

To prove D₂ is in the tree, provide: H₃ and H₀₁ (2 hashes for 4 leaves — log₂(4) = 2).

Git

Every commit is a Merkle DAG. Tree objects hash file blobs; commit objects hash trees. Content-addressable storage.

Git uses SHA-1 (transitioning to SHA-256 via object-format).

Blockchain

Transactions in each block form a Merkle tree. SPV (lightweight) clients verify transaction inclusion without downloading full blocks.

Bitcoin: double-SHA-256 Merkle tree per block.

Certificate Transparency

RFC 6962: append-only Merkle tree log of all issued TLS certificates. Enables auditing without trust in any single CA.

Consistency proofs + inclusion proofs in O(log n).

Merkle, "A Digital Signature Based on a Conventional Encryption Function" (CRYPTO 1987) · RFC 6962

Password Hashing

Cryptographic hash functions like SHA-256 are designed to be fast. For password hashing, we need functions that are intentionally slow.

Why SHA-256 Is Wrong

A GPU can compute ~10 billion SHA-256 hashes/second.

An 8-character alphanumeric password has ~2×10¹⁴ possibilities → exhausted in ~6 hours on one GPU.

Fast hashes enable brute-force and dictionary attacks on leaked password databases.

Requirements

CPU-hard — configurable iteration count

Memory-hard — requires large memory allocation, defeating GPU/ASIC parallelism

Salt — unique random value per password prevents rainbow tables

Recommended Functions

bcrypt

Blowfish-based, adjustable cost factor (2^cost iterations). Mature, widely deployed. 72-byte password limit.

scrypt

Memory-hard: requires O(N) memory and O(N) time. Parameters: N (CPU/memory cost), r (block size), p (parallelism). Used in Litecoin, Tarsnap.

Argon2 PHC Winner 2015

Current best practice. Three variants:

Argon2d — data-dependent access (GPU-resistant)

Argon2i — data-independent access (side-channel resistant)

Argon2id — hybrid (recommended default)

OWASP recommendation: Argon2id, m=19456 (19 MB), t=2 iterations, p=1.

Biryukov et al., "Argon2" (PHC, 2015) · OWASP Password Storage Cheat Sheet

Hash Functions in TLS 1.3

TLS 1.3 (RFC 8446) relies heavily on hash functions for key derivation, transcript binding, and message authentication.

HKDF (RFC 5869)

HMAC-based Key Derivation Function — two stages:

Extract: PRK = HMAC-Hash(salt, IKM)

Concentrates entropy from input keying material.

Expand: OKM = HMAC-Hash(PRK, info || counter)

Derives arbitrary-length output keying material.

TLS 1.3 uses HKDF-SHA256 or HKDF-SHA384 depending on cipher suite.

Transcript Hash

TLS 1.3 maintains a running hash of all handshake messages:

Transcript-Hash(M1, M2, ...) = Hash(M1 || M2 || ...)

This hash binds the key schedule to the specific handshake, preventing transcript manipulation.

Finished Messages

The Finished message is an HMAC over the transcript hash using a key derived from the handshake secret:

verify_data = HMAC(finished_key, Transcript-Hash)

Provides key confirmation and handshake integrity.

    TLS 1.3 key schedule: PSK → HKDF-Extract → Early Secret → Derive-Secret → Handshake Secret → ... → Application Keys
  

RFC 8446 — TLS 1.3, Section 7.1 (Key Schedule) · RFC 5869 — HKDF

Hardware Implementation

SHA-256 Architecture

Iterative: One compression round per clock cycle. 64 cycles per block. Compact: ~10 kGE.

Unrolled (2x): Two rounds per cycle. 32 cycles/block. Critical path doubles but throughput 2x.

Fully pipelined: 64-stage pipeline. One block result per cycle after initial latency. Highest throughput; area ~64x iterative.

Critical Path

T₁ computation requires 5 additions in series: h + Σ₁(e) + Ch(e,f,g) + K_t + W_t

Carry-save adders (CSA) reduce the chain: 5-to-2 CSA tree followed by one carry-propagate add.

CSA optimization can improve Fmax by ~40%.

SHA-3 / Keccak Hardware

Lane-complementing: Store some lanes in complemented form to eliminate NOT operations in χ, replacing AND/NOT/XOR with AND/XOR only.

Bit-interleaving: Split each 64-bit lane into even/odd bit planes. Rotations by even amounts become free wire permutations on 32-bit datapaths.

Useful on 32-bit embedded processors or narrow FPGA datapaths.

FPGA Results

Design	Slices	Throughput
SHA-256 iterative	~800	~3 Gbps
SHA-3 iterative	~1,600	~5 Gbps
SHA-3 high-speed	~3,200	~24 Gbps

Kerckhof et al., "Compact FPGA Implementations of SHA-3 Candidates" (2011) · Homsirikamol et al., "SHA-3 Hardware Implementations" (2012)

SystemVerilog — SHA-256 Round

module sha256_round (
    input  logic [31:0] a_in, b_in, c_in, d_in,
    input  logic [31:0] e_in, f_in, g_in, h_in,
    input  logic [31:0] K_t,   // round constant
    input  logic [31:0] W_t,   // message schedule word
    output logic [31:0] a_out, b_out, c_out, d_out,
    output logic [31:0] e_out, f_out, g_out, h_out
);

    // -- Auxiliary functions --
    function automatic logic [31:0] Ch(
        input logic [31:0] e, f, g);
        return (e & f) ^ (~e & g);
    endfunction

    function automatic logic [31:0] Maj(
        input logic [31:0] a, b, c);
        return (a & b) ^ (a & c) ^ (b & c);
    endfunction

    function automatic logic [31:0] Sigma0(input logic [31:0] x);
        return {x[1:0],  x[31:2]}  ^   // ROTR 2
               {x[12:0], x[31:13]} ^   // ROTR 13
               {x[21:0], x[31:22]};    // ROTR 22
    endfunction

    function automatic logic [31:0] Sigma1(input logic [31:0] x);
        return {x[5:0],  x[31:6]}  ^   // ROTR 6
               {x[10:0], x[31:11]} ^   // ROTR 11
               {x[24:0], x[31:25]};    // ROTR 25
    endfunction

    // -- Round computation --
    logic [31:0] T1, T2;

    assign T1 = h_in + Sigma1(e_in) + Ch(e_in, f_in, g_in) + K_t + W_t;
    assign T2 = Sigma0(a_in) + Maj(a_in, b_in, c_in);

    assign a_out = T1 + T2;
    assign b_out = a_in;
    assign c_out = b_in;
    assign d_out = c_in;
    assign e_out = d_in + T1;
    assign f_out = e_in;
    assign g_out = f_in;
    assign h_out = g_in;

endmodule

Python — SHA-256 from Scratch (one block)

import struct

# Initial hash values: fractional parts of square roots of first 8 primes
H = [0x6a09e667, 0xbb67ae85, 0x3c6ef372, 0xa54ff53a,
     0x510e527f, 0x9b05688c, 0x1f83d9ab, 0x5be0cd19]

# Round constants: fractional parts of cube roots of first 64 primes
K = [0x428a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5,
     0x3956c25b, 0x59f111f1, 0x923f82a4, 0xab1c5ed5,
     0xd807aa98, 0x12835b01, 0x243185be, 0x550c7dc3,
     0x72be5d74, 0x80deb1fe, 0x9bdc06a7, 0xc19bf174,
     # ... 48 more constants (64 total) ...
     ]

def rotr(x, n):  return ((x >> n) | (x << (32 - n))) & 0xFFFFFFFF
def shr(x, n):   return x >> n

def sigma0(x):   return rotr(x,7)  ^ rotr(x,18) ^ shr(x,3)
def sigma1(x):   return rotr(x,17) ^ rotr(x,19) ^ shr(x,10)
def Sigma0(x):   return rotr(x,2)  ^ rotr(x,13) ^ rotr(x,22)
def Sigma1(x):   return rotr(x,6)  ^ rotr(x,11) ^ rotr(x,25)
def Ch(e,f,g):   return (e & f) ^ (~e & g)
def Maj(a,b,c):  return (a & b) ^ (a & c) ^ (b & c)
def add(*args):  return sum(args) & 0xFFFFFFFF

def sha256_block(block, h):
    """Process one 512-bit (64-byte) block."""
    W = list(struct.unpack('>16L', block))           # 16 words from block
    for t in range(16, 64):                          # expand to 64 words
        W.append(add(sigma1(W[t-2]), W[t-7], sigma0(W[t-15]), W[t-16]))

    a, b, c, d, e, f, g, hh = h                     # working variables
    for t in range(64):                              # 64 rounds
        T1 = add(hh, Sigma1(e), Ch(e,f,g), K[t], W[t])
        T2 = add(Sigma0(a), Maj(a,b,c))
        a, b, c, d, e, f, g, hh = add(T1,T2), a, b, c, add(d,T1), e, f, g

    return [add(x, y) for x, y in zip(h, [a,b,c,d,e,f,g,hh])]

Interactive: Hash Visualizer

Type text below to see its SHA-256 hash update in real time. Change a single character to observe the avalanche effect.

Avalanche Effect

A well-designed hash function changes ~50% of output bits when a single input bit flips. SHA-256 achieves this within 4–5 rounds of compression.

The strict avalanche criterion (SAC): flipping input bit i should flip each output bit with probability 1/2.

Try These

"abc" → ba7816bf...

"abd" → a52d1598...

A single letter change (c→d) flips ~130 of 256 output bits. There is no way to predict which bits flip without computing the hash.

Real-World Hash Vulnerabilities

Notable Incidents

2008 Rogue CA Certificate

MD5 collision used to forge a CA certificate that could sign arbitrary TLS certificates. Sotirov et al. at 25C3.

2012 Flame Malware

Nation-state malware used an MD5 chosen-prefix collision to forge a Windows Update code-signing certificate.

2017 SHAttered

Google/CWI produced two different PDFs with identical SHA-1 hashes. Cost: ~$110,000 in cloud compute.

2020 SHA-mbles

Chosen-prefix SHA-1 collision for ~$45,000 — enables PGP key impersonation.

Length Extension Exploits

Many web applications used H(secret || message) as a MAC.

Attackers exploit length extension to forge H(secret || message || padding || attacker_data) without knowing the secret.

Affected: Flickr API (2009), numerous custom authentication schemes.

Fix: use HMAC, never raw H(key || msg).

Hash DoS (2011)

Hash table collision attacks: crafted inputs that all hash to the same bucket, degrading O(1) lookup to O(n²) total insertion time.

Affected: PHP, Java, Python, Ruby, ASP.NET hash tables using non-cryptographic hashes (MurmurHash, DJBX33A).

Fix: randomized hash seeds (SipHash), per-process secret keys.

Selection Guide

Choosing the right hash function for your application.

Use Case	Recommended	Avoid	Notes
General-purpose hashing	SHA-256, SHA3-256	MD5, SHA-1	SHA-256 if hardware accel available
MAC / authentication	HMAC-SHA-256, KMAC	H(K\|\|m), raw hash	KMAC for SHA-3 ecosystem
Password hashing	Argon2id	SHA-256, MD5, bcrypt (if possible)	Memory-hard is essential
Key derivation	HKDF-SHA-256	Raw SHA-256	Extract-then-Expand paradigm
Digital signatures	SHA-256, SHA-384, SHA3-256	SHA-1	Match security level to key size
XOF / variable output	SHAKE256, SHAKE128	Truncated SHA-2	Native XOF is cleaner
Post-quantum signatures	SLH-DSA (SPHINCS+)	RSA, ECDSA alone	Hash-only security assumption
Hash table / non-crypto	SipHash, xxHash	CRC32, DJBX33A	Keyed for DoS resistance

    When in doubt: SHA-256 for hashing, HMAC-SHA-256 for authentication, Argon2id for passwords, HKDF for key derivation.
  

Quantum Impact on Hash Functions

Grover's Algorithm

Provides a quadratic speedup for unstructured search:

Preimage: 2ⁿ → 2^n/2 quantum operations

Second preimage: 2ⁿ → 2^n/2

For SHA-256: preimage reduced from 2²⁵⁶ to 2¹²⁸ — still computationally infeasible.

BHT Algorithm (Collisions)

Brassard-Høyer-Tapp (1997): quantum collision search in O(2^n/3).

SHA-256 collision: 2¹²⁸ → 2⁸⁵ quantum ops.

But requires O(2^n/3) quantum-accessible memory (qRAM), which is a very strong assumption.

Assessment

Hash functions are relatively quantum-resistant.

Doubling the digest size compensates for Grover's attack. 256-bit hashes provide 128-bit post-quantum security for preimage resistance.

NIST Guidance

SP 800-227 (draft): For 128-bit post-quantum security:

• Hash: SHA-256 or SHA3-256 (preimage-safe)

• Collision: SHA-384+ (for 128-bit collision resistance against BHT)

• HMAC: HMAC-SHA-256 remains secure

CNSA 2.0 mandates SHA-384 for signatures by 2025.

Grover, "A Fast Quantum Mechanical Algorithm for Database Search" (STOC 1996) · NIST SP 800-227 (draft)

Summary

Constructions

Merkle-Damgård — iterative compression, vulnerable to length extension

Sponge — absorb/squeeze, immune to length extension, supports XOFs

Two fundamentally different paradigms providing cryptographic diversity.

Standards

SHA-2 (FIPS 180-4) — workhorse, hardware-accelerated, unbroken

SHA-3 (FIPS 202) — backup, flexible, sponge-based

HMAC (RFC 2104) — standard MAC, proven secure

Both SHA-2 and SHA-3 families remain secure as of 2025.

Applications

Digital signatures, TLS key derivation, Merkle trees, password hashing, commitment schemes, hash-based signatures (SLH-DSA)

Hash functions are the most ubiquitous primitive in modern cryptography.

    
      Broken: MD5, SHA-1  
      Current standard: SHA-256, SHA-384, SHA-512, SHA3-256  
      Future: SLH-DSA, KMAC, KangarooTwelve

References

Standards

FIPS 180-4 — Secure Hash Standard (SHA-2), NIST, 2015

FIPS 202 — SHA-3 Standard, NIST, 2015

FIPS 198-1 — HMAC, NIST, 2008

NIST SP 800-185 — SHA-3 Derived Functions, 2016

FIPS 205 — SLH-DSA (SPHINCS+), NIST, 2024

NIST SP 800-208 — LMS and XMSS, 2020

RFC 2104 — HMAC (Krawczyk et al., 1997)

RFC 5869 — HKDF (Krawczyk & Eronen, 2010)

RFC 8446 — TLS 1.3 (Rescorla, 2018)

Design & Theory

Merkle, "A Digital Signature Based on a Conventional Encryption Function" (CRYPTO 1987)

Damgård, "A Design Principle for Hash Functions" (CRYPTO 1989)

Bellare, Canetti & Krawczyk, "Keying Hash Functions for Message Authentication" (CRYPTO 1996)

SHA-3 / Keccak

Bertoni et al., "Keccak Reference" (2011)

Bertoni et al., "Cryptographic Sponge Functions" (2011)

Bertoni et al., "KangarooTwelve" (2016)

Cryptanalysis

Wang et al., "Collisions for Hash Functions MD4, MD5, HAVAL-128" (CRYPTO 2004)

Wang et al., "Finding Collisions in the Full SHA-1" (CRYPTO 2005)

Stevens et al., "The First Collision for Full SHA-1" (CRYPTO 2017)

Leurent & Peyrin, "SHA-1 is a Shambles" (2020)

Password Hashing

Biryukov, Dinu & Khovratovich, "Argon2" (PHC, 2015)

Percival, "Stronger Key Derivation via Sequential Memory-Hard Functions" (2009)

Part of the Modern Cryptography Presentation Series

Hash Functions & MACs

What Is a Cryptographic Hash Function?

Core Properties

One-Way Function

Compression

Security Properties

Preimage Resistance

Second-Preimage Resistance

Collision Resistance

Implications for Design

Merkle-Damgård Construction

How It Works

Length Extension Attack

MD5 & SHA-1 DEPRECATED

MD5 (1992)

SHA-1 (1995)

SHA-2 Family FIPS 180-4

Design Lineage

Why SHA-512/256?

SHA-256 Compression Function

Auxiliary Functions

Sigma Functions

SHA-256 Message Schedule

First 16 Words

Expansion Functions

Design Rationale

Constants Kt

SHA-2 Performance & Hardware

Intel SHA-NI

Bitcoin Mining as Extreme Case

SHA-3 / Keccak FIPS 202

Why a New Standard?

The Competition

The Keccak Team

Key Innovation

Sponge Construction

Parameters

How It Works

Keccak-f Permutation

θ (Theta)

ρ (Rho)

π (Pi)

χ (Chi)

ι (Iota)

SHA-3 Variants & Extensions

XOFs (Extendable Output Functions)

NIST SP 800-185 Extensions

KangarooTwelve (K12)

SHA-2 vs SHA-3 Comparison

HMAC RFC 2104 / FIPS 198-1

Construction Details

Step by Step

Security

Why Not Just H(K || m)?

KMAC & Other MACs

KMAC (SP 800-185)

Poly1305 (Bernstein)

CMAC (SP 800-38B)

GMAC

Comparison

Hash-Based Signatures

Lamport One-Time Signatures

Winternitz OTS (WOTS+)

Stateful: XMSS & LMS

Stateless: SLH-DSA (SPHINCS+)

Merkle Trees

Git

Blockchain

Certificate Transparency

Password Hashing

Why SHA-256 Is Wrong

Requirements

Recommended Functions

Hash Functions in TLS 1.3

HKDF (RFC 5869)

Transcript Hash

Finished Messages

Hardware Implementation

SHA-256 Architecture

Critical Path

Constants K_t