Coding Theory Series — 03

Hamming Codes

The first practical error-correcting codes — elegant, efficient, and perfect

Mathematics Code History

Use arrow keys or swipe to navigate • Press ESC for overview

Roadmap

Hamming's Story

The Bell Labs frustration that launched error-correcting codes

Hamming(7,4)

Step-by-step construction and encoding

Syndrome Decoding

Detecting and locating single-bit errors

Matrix View

Generator and parity-check matrices

SECDED

Extending Hamming codes for double-error detection

Perfect Codes

Why Hamming codes are mathematically special

Hamming's Story at Bell Labs

History

In the late 1940s, Richard Hamming worked at Bell Labs alongside Claude Shannon. He used relay-based computers that ran programs over weekends.

The Problem

The machines used parity checking. When an error was detected, the machine would stop and flash lights indicating the error. On weekdays, operators would fix the error and restart. On weekends — no operators. Hamming would arrive Monday to find his jobs had failed early Friday evening.

The Insight

Hamming realized: "If the machine can detect an error, why can't it locate and correct the error?" He needed multiple parity checks, each covering a different subset of bits, so that the pattern of parity failures would point to the exact error location.

Hamming published his results in 1950: "Error Detecting and Error Correcting Codes" in the Bell System Technical Journal. Marcel Golay independently discovered similar codes the same year.

The Key Idea

Mathematics

Instead of one parity bit checking all data bits, use multiple parity bits, each covering a specific overlapping subset of data bits.

Single Parity (review)

1 parity bit → 1 check → 2 outcomes:

"No error detected" (might be wrong)
"Error detected" (but WHERE?)

Hamming's Approach

r parity bits → r checks → 2^r outcomes:

000...0 = "No error"
Any other pattern = position of the error!

With r parity bits, we can identify 2^r − 1 possible error positions
Need: 2^r − 1 ≥ n = r + k → 2^r ≥ k + r + 1

The syndrome (the pattern of parity check results) directly encodes the binary position number of the erroneous bit. This is the genius of Hamming codes.

Hamming(7,4): Bit Positions

Mathematics

In Hamming(7,4), we have 7 bit positions. Positions that are powers of 2 (1, 2, 4) are parity bits. The remaining positions (3, 5, 6, 7) hold data.

Position	1	2	3	4	5	6	7
Binary	001	010	011	100	101	110	111
Role	p₁	p₂	d₁	p₃	d₂	d₃	d₄

p₁ (pos 1)

Checks positions with bit 0 set in binary:

1, 3, 5, 7

p₂ (pos 2)

Checks positions with bit 1 set in binary:

2, 3, 6, 7

p₃ (pos 4)

Checks positions with bit 2 set in binary:

4, 5, 6, 7

Parity Check Coverage

Mathematics

Each parity bit covers all positions whose binary representation has a 1 in the corresponding bit position:

Position	Binary	p₁ checks?	p₂ checks?	p₃ checks?
1 (p₁)	001	Yes
2 (p₂)	010		Yes
3 (d₁)	011	Yes	Yes
4 (p₃)	100			Yes
5 (d₂)	101	Yes		Yes
6 (d₃)	110		Yes	Yes
7 (d₄)	111	Yes	Yes	Yes

Critical insight: Each position has a unique pattern of parity checks. If a single bit flips, the pattern of failing parity checks (the syndrome) gives the binary representation of the error position!

Encoding Example: Data = 1011

Mathematics Interactive

Encode data bits d₁d₂d₃d₄ = 1011

Step 1: Place data bits in positions 3, 5, 6, 7:

Position: 1 2 3 4 5 6 7 Bits: ? ? 1 ? 011

Step 2: Compute parity bits:

p₁: XOR positions 1,3,5,7

p₁ ⊕ 1 ⊕ 0 ⊕ 1 = 0

p₁ = 0

p₂: XOR positions 2,3,6,7

p₂ ⊕ 1 ⊕ 1 ⊕ 1 = 0

p₂ = 1

p₃: XOR positions 4,5,6,7

p₃ ⊕ 0 ⊕ 1 ⊕ 1 = 0

p₃ = 0

Result:

Codeword: 0 1 1 0 0 1 1 (positions 1-7)

Syndrome Decoding

Mathematics

When we receive a (possibly corrupted) codeword, we recompute each parity check. The results form the syndrome.

Example: Sent 0110011, received 0111011 (error at position 4)

s₁: XOR pos 1,3,5,7

0 ⊕ 1 ⊕ 0 ⊕ 1 = 0

s₂: XOR pos 2,3,6,7

1 ⊕ 1 ⊕ 1 ⊕ 1 = 0

s₃: XOR pos 4,5,6,7

1 ⊕ 0 ⊕ 1 ⊕ 1 = 1

Syndrome = s₃s₂s₁ = 100 = 4 in binary → Error at position 4!

Correction: Flip bit at position 4: 0111011 → 0110011. If syndrome = 000, no error detected.

Complete Syndrome Table

Mathematics

Syndrome (s₃s₂s₁)	Decimal	Meaning	Action
000	0	No error	Accept as-is
001	1	Error at position 1	Flip bit 1 (p₁)
010	2	Error at position 2	Flip bit 2 (p₂)
011	3	Error at position 3	Flip bit 3 (d₁)
100	4	Error at position 4	Flip bit 4 (p₃)
101	5	Error at position 5	Flip bit 5 (d₂)
110	6	Error at position 6	Flip bit 6 (d₃)
111	7	Error at position 7	Flip bit 7 (d₄)

Limitation: If 2 or more bits are flipped, the syndrome will point to the wrong position, causing miscorrection. Hamming(7,4) can only reliably correct single errors.

Generator Matrix G

Mathematics

The encoding process can be expressed as matrix multiplication over GF(2):
c = m · G, where m is the 1×4 message vector and G is the 4×7 generator matrix.

G =
p₁ p₂ d₁ p₃ d₂ d₃ d₄
[ 1 1 1 0 0 0 0 ] ← d₁ at pos 3
[ 1 0 0 1 1 0 0 ] ← d₂ at pos 5
[ 0 1 0 1 0 1 0 ] ← d₃ at pos 6
[ 1 1 0 1 0 0 1 ] ← d₄ at pos 7

Each row of G is a codeword. The generator matrix encodes by selecting which rows to XOR together (the rows corresponding to 1-bits in the message).

Systematic form: G = [P | I_k] where P contains the parity-check coefficients and I_k is the 4×4 identity matrix.

Parity-Check Matrix H

Mathematics

The parity-check matrix H is used for decoding. The syndrome is computed as s = H · r^T, where r is the received vector.

H = [ 1 0 1 0 1 0 1 ]
[ 0 1 1 0 0 1 1 ]
[ 0 0 0 1 1 1 1 ]

Key Property

The columns of H are the binary representations of the numbers 1 through 7:

Col 1: 001, Col 2: 010, Col 3: 011
Col 4: 100, Col 5: 101, Col 6: 110, Col 7: 111

Syndrome = Error Position

If there's a single error at position j, the syndrome s = H · e_j^T equals column j of H, which is the binary representation of j.

G · H^T = 0 (fundamental relationship)

Matrix Encoding Example

Mathematics Code

Encode message m = [1, 0, 1, 1] using c = m · G:

c = [1,0,1,1] \cdot G = 1\cdot[1,1,1,0,0,0,0] \oplus 0\cdot[1,0,0,1,1,0,0] \oplus 1\cdot[0,1,0,1,0,1,0] \oplus 1\cdot[1,1,0,1,0,0,1] = [1,1,1,0,0,0,0] \oplus [0,1,0,1,0,1,0] \oplus [1,1,0,1,0,0,1] = [0,1,1,0,0,1,1]

import numpy as np

G = np.array([
    [1, 1, 1, 0, 0, 0, 0],
    [1, 0, 0, 1, 1, 0, 0],
    [0, 1, 0, 1, 0, 1, 0],
    [1, 1, 0, 1, 0, 0, 1]
])

message = np.array([1, 0, 1, 1])
codeword = message @ G % 2
print(f"Codeword: {codeword}")  # [0 1 1 0 0 1 1]

Matrix Syndrome Decoding

Mathematics Code

H = np.array([
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1]
])

def decode_hamming74(received):
    """Decode a Hamming(7,4) codeword using syndrome decoding."""
    r = np.array(received)
    syndrome = H @ r % 2
    error_pos = syndrome[0] * 1 + syndrome[1] * 2 + syndrome[2] * 4

    if error_pos == 0:
        print("No error detected")
    else:
        print(f"Error at position {error_pos}, flipping bit")
        r[error_pos - 1] ^= 1  # Fix the error (0-indexed)

    # Extract data bits from positions 3, 5, 6, 7 (indices 2, 4, 5, 6)
    data = [r[2], r[4], r[5], r[6]]
    return data

# Test: introduce error at position 5
received = [0, 1, 1, 0, 1, 1, 1]  # bit 5 flipped (was 0)
decoded = decode_hamming74(received)
print(f"Decoded message: {decoded}")  # [1, 0, 1, 1]

The syndrome directly gives the error position in binary. No lookup table or search is needed — just a single matrix multiplication and a bit flip.

Complete Hamming(7,4) Codebook

Mathematics

Message	Codeword	Weight	Message	Codeword	Weight
0000	0000000	0	1000	1110000	3
0001	1101001	4	1001	0011001	3
0010	0110010	3	1010	1000010	4
0011	1011011	4	1011	0101011	3
0100	1010100	3	1100	0100100	4
0101	0111101	4	1101	1001101	3
0110	1100110	4	1110	0010110	3
0111	0001111	3	1111	1111111	7

Weight distribution: 1 word of weight 0, 7 of weight 3, 7 of weight 4, 1 of weight 7

Minimum weight: 3 (confirming d_min = 3 for this linear code)

SECDED: Extended Hamming Code

Mathematics

SECDED = Single Error Correction, Double Error Detection. Add one more parity bit (overall parity) to create an [8, 4, 4] code.

Hamming(7,4)
[7, 4, 3]

+ overall parity →

SECDED(8,4)
[8, 4, 4]

How It Works

Encode with Hamming(7,4) as before
Append bit p₀ = XOR of all 7 bits
On decoding, compute syndrome AND overall parity

Decision Table

Syndrome	Overall P	Meaning
= 0	OK	No error
≠ 0	Fail	1 error → correct it
≠ 0	OK	2 errors → detected!
= 0	Fail	Error in p₀ only

SECDED is used in virtually all modern ECC RAM (Error-Correcting Code memory). It corrects single-bit errors transparently and detects double-bit errors.

General Hamming Codes

Mathematics

Hamming(7,4) is just one member of a family. For any integer r ≥ 2, there is a Hamming(2^r−1, 2^r−1−r) code.

n = 2^r − 1, k = 2^r − 1 − r, d_min = 3

r	Code	[n, k, d]	Rate	Redundancy
2	Hamming(3,1)	[3, 1, 3]	0.333	2 bits
3	Hamming(7,4)	[7, 4, 3]	0.571	3 bits
4	Hamming(15,11)	[15, 11, 3]	0.733	4 bits
5	Hamming(31,26)	[31, 26, 3]	0.839	5 bits
6	Hamming(63,57)	[63, 57, 3]	0.905	6 bits
8	Hamming(255,247)	[255, 247, 3]	0.969	8 bits

As r grows, the rate R = (2^r−1−r)/(2^r−1) → 1. Longer Hamming codes are more efficient, but still only correct 1 error per block.

Perfect Codes

Mathematics

A code is perfect if it achieves the Hamming bound with equality — the Hamming spheres of radius t around each codeword tile the entire space with no overlaps or gaps.

2^k · ∑_i=0^t C(n, i) = 2ⁿ (exact equality)

Hamming Codes Are Perfect!

For Hamming(2^r−1, 2^r−1−r) with t = 1:

2^k · (1 + n) = 2^k · 2^r = 2^k+r = 2ⁿ ✓

Every vector in {0,1}ⁿ is within distance 1 of exactly one codeword.

All Binary Perfect Codes

Hamming codes [2^r−1, 2^r−1−r, 3]
Binary Golay code [23, 12, 7]
Trivial: repetition codes [n, 1, n] for odd n
Trivial: whole space {0,1}ⁿ
That's it! No other binary perfect codes exist.

Why Perfection Matters

Mathematics

For a perfect code, the nearest-codeword decoder is unambiguous for all error patterns of weight ≤ t:

Perfect Code (Hamming)

Every received vector r is within distance 1 of exactly one codeword. The decoder always makes a definite decision.

0 errors → correct decoding
1 error → correct decoding
2 errors → decodes to wrong codeword

Non-Perfect Code

Some received vectors are NOT within distance t of any codeword. These are "uncorrectable" patterns that the decoder can detect but not fix.

Can detect more errors than it corrects
More flexible error-handling strategies
Gap between spheres = detection region

Surprising consequence: Hamming codes CANNOT detect 2 errors without miscorrecting. Because they're perfect, every received vector is decoded — a 2-error pattern is "corrected" to the wrong codeword. This is why SECDED adds the extra parity bit.

Distance & Correction Capability

Mathematics

The relationship between minimum distance and error correction/detection is fundamental:

t-error correcting: d_min ≥ 2t + 1
s-error detecting: d_min ≥ s + 1
t-correcting AND s-detecting (s > t): d_min ≥ t + s + 1

Hamming(7,4): d_min = 3

Corrects t = 1 error (2·1+1 = 3 ≤ 3)
Detects s = 2 errors (2+1 = 3 ≤ 3)
But NOT both simultaneously!

SECDED(8,4): d_min = 4

Corrects t = 1 AND detects s = 2
Satisfies t + s + 1 = 1 + 2 + 1 = 4 ≤ 4
The extra bit enables simultaneous correction + detection

Python: Complete Hamming(7,4)

Code

class Hamming74:
    """Complete Hamming(7,4) encoder and decoder."""

    G = [[1,1,1,0,0,0,0], [1,0,0,1,1,0,0],
         [0,1,0,1,0,1,0], [1,1,0,1,0,0,1]]

    H = [[1,0,1,0,1,0,1], [0,1,1,0,0,1,1], [0,0,0,1,1,1,1]]

    @staticmethod
    def encode(message):
        """Encode 4-bit message to 7-bit codeword."""
        assert len(message) == 4
        codeword = [0] * 7
        for i in range(4):
            if message[i]:
                for j in range(7):
                    codeword[j] ^= Hamming74.G[i][j]
        return codeword

    @staticmethod
    def decode(received):
        """Decode 7-bit received word, correcting single errors."""
        r = list(received)
        # Compute syndrome
        syndrome = 0
        for i in range(3):
            s = 0
            for j in range(7):
                s ^= Hamming74.H[i][j] * r[j]
            syndrome += s * (2 ** i)

        if syndrome != 0:
            r[syndrome - 1] ^= 1  # Correct the error

        return [r[2], r[4], r[5], r[6]]  # Extract data bits

Testing Hamming(7,4)

Code

import random

def test_hamming74():
    """Test all possible messages and single-error patterns."""
    errors_corrected = 0
    total_tests = 0

    # Test all 16 possible 4-bit messages
    for msg_int in range(16):
        message = [(msg_int >> i) & 1 for i in range(3, -1, -1)]
        codeword = Hamming74.encode(message)

        # Test no-error case
        decoded = Hamming74.decode(codeword)
        assert decoded == message, f"Failed for {message}"

        # Test all 7 single-error patterns
        for err_pos in range(7):
            corrupted = list(codeword)
            corrupted[err_pos] ^= 1  # Introduce single error
            decoded = Hamming74.decode(corrupted)
            assert decoded == message, \
                f"Failed: msg={message}, err_pos={err_pos}"
            errors_corrected += 1
            total_tests += 1

    print(f"All tests passed! {errors_corrected} errors corrected "
          f"across {total_tests} test cases.")

test_hamming74()  # All tests passed! 112 errors corrected

16 messages × 7 error positions = 112 single-error patterns, all correctly decoded. Plus 16 no-error cases = 128 total vectors = 2⁷ = the entire space. This confirms perfection.

Application: ECC Memory

Applications

Modern server RAM uses SECDED Hamming codes to protect against bit flips caused by cosmic rays, alpha particles, and electrical noise.

How ECC RAM Works

Data is written in 64-bit blocks
8 parity bits are computed (SECDED)
72 bits total are stored (64 data + 8 check)
On read, syndrome is computed
Single-bit errors are silently corrected
Double-bit errors trigger a machine check exception

Why It Matters

~1/month

Single-bit errors per GB of RAM

10⁹+

Bytes in a server — errors are inevitable

Without ECC, a server with 128 GB of RAM could experience hundreds of silent data corruptions per year. ECC RAM is mandatory for servers, scientific computing, and financial systems.

Hamming vs. Simple Codes

Mathematics

Property	Single Parity	Repetition-3	Hamming(7,4)	SECDED(8,4)
Parameters	[n+1, n, 2]	[3, 1, 3]	[7, 4, 3]	[8, 4, 4]
Rate	~1.0	0.333	0.571	0.500
Errors detected	1	2	2	3
Errors corrected	0	1	1	1
Perfect?	No	Yes (trivially)	Yes	No
Decode complexity	O(n)	O(n)	O(n)	O(n)

Hamming(7,4) achieves the same error correction as repetition-3 but transmits 4 data bits instead of 1 — a 4× improvement in throughput.

Limitations of Hamming Codes

Mathematics

Strengths

Elegant mathematical structure
Simple encoding and decoding (XOR only)
Perfect — optimal for single-error correction
Efficient hardware implementation
Well-suited for memory protection

Limitations

Only corrects 1 error per block
Not suitable for channels with burst errors
d_min = 3 is fixed (cannot be increased within the family)
For stronger protection, need BCH, Reed-Solomon, or LDPC
Perfect property means NO detection beyond correction capability

When Hamming codes fail: On channels where errors come in bursts (e.g., wireless, disk storage), multiple bits in a block are corrupted simultaneously. Hamming codes miscorrect these patterns. For burst errors, interleaving or Reed-Solomon codes are needed.

Summary & What's Next

Key Takeaways

Hamming codes use overlapping parity checks to locate single-bit errors
Parity bits at positions 2⁰, 2¹, ..., 2^r-1
The syndrome directly gives the error position in binary
Generator matrix G encodes; parity-check matrix H decodes
SECDED extends Hamming for simultaneous correction + detection
Hamming codes are perfect — they optimally tile Hamming space

Coming Up: CRC & Error Detection

Hamming codes are great for correction, but many applications only need fast, reliable detection of errors.

CRC (Cyclic Redundancy Check) uses polynomial arithmetic to detect burst errors with extremely high probability.

Used in Ethernet, USB, ZIP files, and virtually every network protocol.

Hamming Codes

Roadmap

Hamming's Story

Hamming(7,4)

Syndrome Decoding

Matrix View

SECDED

Perfect Codes

Hamming's Story at Bell Labs

The Problem

The Insight

The Key Idea

Single Parity (review)

Hamming's Approach

Hamming(7,4): Bit Positions

p1 (pos 1)

p2 (pos 2)

p3 (pos 4)

Parity Check Coverage

Encoding Example: Data = 1011

Syndrome Decoding

Complete Syndrome Table

Generator Matrix G

Parity-Check Matrix H

Key Property

Syndrome = Error Position

Matrix Encoding Example

Matrix Syndrome Decoding

Complete Hamming(7,4) Codebook

SECDED: Extended Hamming Code

How It Works

Decision Table

General Hamming Codes

Perfect Codes

Hamming Codes Are Perfect!

All Binary Perfect Codes

Why Perfection Matters

Perfect Code (Hamming)

Non-Perfect Code

Distance & Correction Capability

Hamming(7,4): dmin = 3

SECDED(8,4): dmin = 4

Python: Complete Hamming(7,4)

Testing Hamming(7,4)

Application: ECC Memory

How ECC RAM Works

Why It Matters

Hamming vs. Simple Codes

Limitations of Hamming Codes

Strengths

Limitations

Summary & What's Next

Key Takeaways

Coming Up: CRC & Error Detection

p₁ (pos 1)

p₂ (pos 2)

p₃ (pos 4)

Hamming(7,4): d_min = 3

SECDED(8,4): d_min = 4