Coding Theory Series — 02

Parity Checks & Simple Codes

The simplest error-detecting and error-correcting schemes

Mathematics Code History Applications

Roadmap

History

Early telegraph and telephone error checking methods

Parity Bits

Even parity, odd parity, and single parity check codes

Repetition

Triple repetition codes and majority voting

2D Parity

Rectangular codes for stronger error detection

Real-World

ISBN check digits and the Luhn algorithm

Tradeoffs

Rate vs. error correction capability

Historical Context

History

Error checking is as old as communication itself. Long before Shannon's formal theory, engineers devised ad-hoc methods to detect transmission errors.

Telegraph (1840s–1900s)

  • Operators used "check sums" — adding up character values and transmitting the total
  • Baudot code (1870s) included error-checking conventions
  • Vernam cipher (1917) used XOR — same operation underlying parity

Early Computing (1940s)

  • Punched card machines used parity rows for verification
  • Bell Labs relay computers checked parity after each operation
  • Hamming's frustration: machines could detect errors but not fix them
The parity bit is arguably the oldest and most ubiquitous error-detection mechanism in digital systems.

The Parity Bit

Mathematics

A parity bit is a single bit appended to a data word to make the total number of 1-bits either even or odd.

Even Parity

The parity bit is set so the total number of 1s (data + parity) is even.

1011001 → 10110010 (four 1s)
1010001 → 10100011 (four 1s)

Odd Parity

The parity bit is set so the total number of 1s is odd.

1011001 → 10110011 (five 1s)
1010001 → 10100010 (three 1s)

Even parity: p = d1 ⊕ d2 ⊕ d3 ⊕ ... ⊕ dk
Parity is simply the XOR of all data bits. This operation is both computationally trivial and implementable with a single XOR gate chain.

Single Parity Check Code

Mathematics

The single parity check (SPC) code is an [n, n−1, 2] code. It appends one parity bit to n−1 data bits.

Properties

  • Rate: R = (n−1)/n (very high)
  • dmin = 2: detects 1 error
  • Cannot correct any errors
  • All codewords have even weight

Example: [8, 7, 2]

Data (7 bits)   Parity   Codeword
0000000         0        00000000
1010101         0        10101010
1100110         0        11001100
1111111         1        11111111
Critical limitation: SPC cannot detect an even number of errors. If 2 bits flip, parity still checks out and the error goes undetected!

Python: Parity Encoding

Code
def encode_even_parity(data: str) -> str:
    """Append an even parity bit to a binary string."""
    parity = sum(int(b) for b in data) % 2
    return data + str(parity)

def check_even_parity(codeword: str) -> bool:
    """Check if a codeword has valid even parity."""
    return sum(int(b) for b in codeword) % 2 == 0

# Encoding examples
print(encode_even_parity("1011001"))  # "10110010" (4 ones, parity=0)
print(encode_even_parity("1010001"))  # "10100011" (3 ones, parity=1)

# Detection examples
print(check_even_parity("10110010"))  # True  - no error
print(check_even_parity("10110011"))  # False - 1 bit flipped!
print(check_even_parity("10110001"))  # True  - 2 bits flipped, UNDETECTED!
The last check shows the fundamental weakness of single parity check codes: when 2 bits flip simultaneously, parity cannot detect the error.

When Parity Fails

Mathematics Interactive

Single Error — Detected

Sent:     1 0 1 1 0 0 1 0   (parity = 0, sum = 4)
Received: 1 0 0 1 0 0 1 0   (sum = 3, odd → parity fails) ERROR!

Double Error — Undetected!

Sent:     1 0 1 1 0 0 1 0   (parity = 0, sum = 4)
Received: 1 0 0 1 0 1 1 0   (sum = 4, even → parity passes) Looks OK!

Triple Error — Detected

Sent:     1 0 1 1 0 0 1 0   (parity = 0, sum = 4)
Received: 0 0 0 1 0 1 1 0   (sum = 3, odd → parity fails) ERROR!

Pattern: SPC detects odd numbers of errors but misses even numbers. In general, it detects all error patterns of weight 1, 3, 5, ... but misses all patterns of weight 2, 4, 6, ...
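The odd/even pattern can be confirmed exhaustively. The sketch below (with a `check_even_parity` helper mirroring the one from the Python slide above, and an illustrative `detection_count` function) counts how many error patterns of each weight the parity check catches on the codeword 10110010:

```python
from itertools import combinations

def check_even_parity(codeword: str) -> bool:
    """True if the codeword passes the even-parity check."""
    return codeword.count('1') % 2 == 0

def detection_count(codeword: str, weight: int):
    """Count how many weight-`weight` error patterns the parity check catches."""
    n = len(codeword)
    detected = total = 0
    for positions in combinations(range(n), weight):
        received = list(codeword)
        for p in positions:
            received[p] = '1' if received[p] == '0' else '0'  # flip the bit
        total += 1
        if not check_even_parity(''.join(received)):
            detected += 1
    return detected, total

for w in range(1, 5):
    print(w, detection_count("10110010", w))
```

Every odd-weight pattern is caught; every even-weight pattern slips through.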

Repetition Codes

Mathematics

The simplest error-correcting code: repeat each bit n times. The resulting [n, 1, n] code can correct up to ⌊(n−1)/2⌋ errors.

[3, 1, 3]

Triple repetition

0 → 000
1 → 111

Corrects 1 error

[5, 1, 5]

Quintuple repetition

0 → 00000
1 → 11111

Corrects 2 errors

[7, 1, 7]

Septuple repetition

0 → 0000000
1 → 1111111

Corrects 3 errors

Terrible rate: R = 1/n. The triple repetition code wastes 2/3 of bandwidth! This is the brute-force approach — effective but extremely inefficient.

Majority Voting Decoder

Mathematics Code

For repetition codes, decoding is simple: take the majority vote of the received bits.

Triple Repetition Examples

Received 000 → decode as 0 (unanimous)
Received 001 → decode as 0 (2 vs 1)
Received 010 → decode as 0 (2 vs 1)
Received 011 → decode as 1 (1 vs 2)
Received 111 → decode as 1 (unanimous)

Error Analysis

For BSC with crossover probability p:

Perror = 3p²(1−p) + p³ = 3p² − 2p³

For p = 0.01: Perror ≈ 0.000298 (much better than raw p!)

def repetition_encode(bit: str, n: int = 3) -> str:
    """Encode a single bit using n-fold repetition."""
    return bit * n

def majority_vote_decode(received: str) -> str:
    """Decode using majority voting."""
    ones = received.count('1')
    return '1' if ones > len(received) // 2 else '0'

# Encode and simulate error
encoded = repetition_encode('1', 3)  # "111"
received = "110"  # one bit flipped
decoded = majority_vote_decode(received)  # "1" - correct!
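A small exhaustive check (the two functions are repeated here so the snippet runs standalone) confirms that majority voting corrects every single-bit error on a [3,1,3] codeword:

```python
def repetition_encode(bit: str, n: int = 3) -> str:
    """Encode a single bit using n-fold repetition."""
    return bit * n

def majority_vote_decode(received: str) -> str:
    """Decode using majority voting."""
    return '1' if received.count('1') > len(received) // 2 else '0'

# Every single-bit error on a [3,1,3] codeword decodes correctly
for bit in ('0', '1'):
    codeword = repetition_encode(bit)
    for i in range(3):
        flipped = '1' if codeword[i] == '0' else '0'
        corrupted = codeword[:i] + flipped + codeword[i+1:]
        assert majority_vote_decode(corrupted) == bit
print("all 6 single-bit error patterns corrected")
```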

Repetition Code Performance

Mathematics Code

How does error probability decrease as we increase repetition factor n?

from math import comb

def repetition_error_prob(p, n):
    """Probability of decoding error for [n,1,n] repetition code.
    Error occurs when more than n//2 bits are flipped."""
    t = n // 2  # correction capability
    p_error = sum(comb(n, i) * p**i * (1-p)**(n-i)
                  for i in range(t + 1, n + 1))
    return p_error

# Compare for p = 0.01
for n in [1, 3, 5, 7, 9]:
    pe = repetition_error_prob(0.01, n)
    rate = 1/n
    print(f"[{n},1,{n}]: P_error = {pe:.2e}, Rate = {rate:.3f}")
Code       Rate    Perror (p = 0.01)   Improvement
Uncoded    1.000   1.00 × 10⁻²         —
[3,1,3]    0.333   2.98 × 10⁻⁴         ~34×
[5,1,5]    0.200   9.85 × 10⁻⁶         ~1,000×
[7,1,7]    0.143   3.42 × 10⁻⁷         ~29,000×
Error probability drops exponentially with n, but the code rate falls as 1/n. This is the fundamental tradeoff.
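As a sanity check, the closed form 3p² − 2p³ from the majority-voting slide should agree with the general binomial sum for n = 3 — a quick sketch:

```python
from math import comb, isclose

def repetition_error_prob(p: float, n: int) -> float:
    """P(decoding error) for the [n,1,n] repetition code on a BSC(p)."""
    t = n // 2  # correction capability
    return sum(comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(t + 1, n + 1))

# For n = 3 the binomial sum collapses to 3p² − 2p³:
# C(3,2)p²(1−p) + p³ = 3p² − 3p³ + p³ = 3p² − 2p³
for p in (0.001, 0.01, 0.1):
    assert isclose(repetition_error_prob(p, 3), 3*p**2 - 2*p**3)
print("closed form matches the binomial sum")
```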

Two-Dimensional Parity

Mathematics

Arrange data in a rectangular grid and compute parity for each row AND each column. This creates a 2D parity check code.

Layout

  Data bits   | Row
  ------------+----
  1 0 1 1     |  1
  0 1 1 0     |  0
  1 1 0 1     |  1
  ------------+----
  0 0 0 0     |  0   ← column parities (corner bit covers the parity bits)

Properties

  • Detects all 1, 2, and 3-bit errors
  • Corrects all single-bit errors
  • Can locate the error at the intersection of the failed row and column checks
  • Some 4-bit error patterns go undetected
Single error correction: If row i and column j both fail parity, the error is at position (i, j). Flip that bit to correct it!

2D Parity: Worked Example

Mathematics Interactive

Original (with parity)

  1 0 1 1 | 1
  0 1 1 0 | 0
  1 1 0 1 | 1
  ---------+--
  0 0 0 0 | 0
                

All row/column parities correct

Error at position (1,1)

  1 0 1 1 | 1 ✓
  0 0 1 0 | 0 ✗
  1 1 0 1 | 1 ✓
  --------+---
  0 0 0 0 | 0
     
                

Row 1 fails, Col 1 fails → error at (1,1)

Rate = (r · c) / ((r+1) · (c+1))    For 4×3: R = 12/20 = 0.60
Limitation: A rectangular pattern of 4 errors (forming a rectangle in the grid) is undetectable — each row and column parity still checks out.

Python: 2D Parity

Code
def encode_2d_parity(data, rows, cols):
    """Encode data with 2D parity (row + column + corner)."""
    grid = [data[i*cols:(i+1)*cols] for i in range(rows)]
    # Add row parity
    for row in grid:
        row.append(sum(row) % 2)
    # Add column parity row
    col_parity = [sum(grid[r][c] for r in range(rows)) % 2
                  for c in range(cols + 1)]
    grid.append(col_parity)
    return grid

def check_2d_parity(grid):
    """Check 2D parity; return error position or None."""
    rows, cols = len(grid), len(grid[0])
    bad_rows = [r for r in range(rows) if sum(grid[r]) % 2 != 0]
    bad_cols = [c for c in range(cols)
                if sum(grid[r][c] for r in range(rows)) % 2 != 0]
    if not bad_rows and not bad_cols:
        return None  # No error detected
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        return (bad_rows[0], bad_cols[0])  # Single error located!
    return "Multiple errors detected"

# Example
data = [1,0,1,1, 0,1,1,0, 1,1,0,1]
grid = encode_2d_parity(data, 3, 4)
grid[1][1] ^= 1  # Flip bit at (1,1)
print(check_2d_parity(grid))  # (1, 1)
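The rectangular blind spot noted above is easy to demonstrate. The sketch below uses a compact re-implementation of the 2D parity encoder (names like `encode_2d` and `all_checks_pass` are illustrative, not from the code above) and flips four bits forming a rectangle:

```python
def parity(bits):
    """Even parity of a list of bits."""
    return sum(bits) % 2

def encode_2d(data, rows, cols):
    """Build a (rows+1) x (cols+1) grid with row, column, and corner parity."""
    grid = [data[i*cols:(i+1)*cols] for i in range(rows)]
    for row in grid:
        row.append(parity(row))  # row parity
    grid.append([parity([grid[r][c] for r in range(rows)])
                 for c in range(cols + 1)])  # column parity row (incl. corner)
    return grid

def all_checks_pass(grid):
    """True if every row and every column has even parity."""
    return (all(parity(row) == 0 for row in grid) and
            all(parity([row[c] for row in grid]) == 0
                for c in range(len(grid[0]))))

data = [1,0,1,1, 0,1,1,0, 1,1,0,1]
grid = encode_2d(data, 3, 4)

# Flip four bits at the corners of a rectangle: rows 0 & 1, columns 0 & 2.
# Each affected row and column sees TWO flips, so every parity still holds.
for r, c in [(0, 0), (0, 2), (1, 0), (1, 2)]:
    grid[r][c] ^= 1

print(all_checks_pass(grid))  # True — four errors, yet every check passes
```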

ISBN Check Digits

Applications History

The International Standard Book Number (ISBN) uses a check digit system to catch transcription errors when book numbers are typed or spoken.

ISBN-10 (pre-2007)

10 digits: d1d2...d10

1·d1 + 2·d2 + ... + 10·d10 ≡ 0 (mod 11)

Last digit can be 'X' (representing 10)

Detects: all single-digit errors AND all transpositions of adjacent digits!

ISBN-13 (current)

13 digits: d1d2...d13

w1·d1 + w2·d2 + ... + w13·d13 ≡ 0 (mod 10)

Weights alternate: 1, 3, 1, 3, ...

Detects: all single-digit errors but NOT all transpositions.

Why modular arithmetic? Using mod 11 (a prime) for ISBN-10 ensures that all single-digit substitutions and adjacent transpositions are caught — a property mod 10 cannot guarantee.

Python: ISBN Validation

Code
def validate_isbn10(isbn: str) -> bool:
    """Validate an ISBN-10 check digit."""
    isbn = isbn.replace('-', '')
    if len(isbn) != 10:
        return False
    total = 0
    for i, ch in enumerate(isbn):
        if ch == 'X' and i == 9:
            val = 10
        elif ch.isdigit():
            val = int(ch)
        else:
            return False
        total += (i + 1) * val
    return total % 11 == 0

def validate_isbn13(isbn: str) -> bool:
    """Validate an ISBN-13 check digit."""
    isbn = isbn.replace('-', '')
    if len(isbn) != 13 or not isbn.isdigit():
        return False
    weights = [1, 3] * 6 + [1]
    total = sum(int(d) * w for d, w in zip(isbn, weights))
    return total % 10 == 0

# Examples
print(validate_isbn10("0-306-40615-2"))  # True
print(validate_isbn13("978-0-306-40615-7"))  # True
print(validate_isbn10("0-306-40615-3"))  # False (wrong check digit)

ISBN-10: Transposition Detection

Mathematics

Why does ISBN-10 detect all adjacent transpositions? Consider swapping digits at positions j and j+1:

Original: ∑ = ... + j·a + (j+1)·b + ...
Swapped:  ∑' = ... + j·b + (j+1)·a + ...
∑' − ∑ = j(b−a) + (j+1)(a−b) = (a−b)(j+1−j) = (a−b)

For this to go undetected: (a − b) ≡ 0 (mod 11)

Since 0 ≤ a, b ≤ 10 and a ≠ b: |a − b| ranges from 1 to 10, none of which are divisible by 11.

The magic of mod 11: Because 11 is prime and larger than all digit values, no non-trivial digit difference is zero mod 11. This algebraic property is why ISBN-10 is mathematically superior to ISBN-13 for transposition detection.
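The argument can also be confirmed by brute force — enumerating every adjacent position and every pair of distinct digit values (0–10, treating X as 10 at every position for simplicity, a superset of legal ISBNs) and checking that the weighted sum always changes mod 11:

```python
# Exhaustive check: no adjacent transposition in ISBN-10 escapes detection
undetected = []
for j in range(1, 10):          # 1-based position of the left digit
    for a in range(11):         # digit values, X counted as 10
        for b in range(11):
            if a == b:
                continue
            # change in the weighted sum after swapping positions j and j+1;
            # algebraically this simplifies to (a − b)
            delta = j*(b - a) + (j + 1)*(a - b)
            if delta % 11 == 0:
                undetected.append((j, a, b))

print(len(undetected))  # 0
```

Since |a − b| is between 1 and 10 and 11 is prime, no case survives.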

The Luhn Algorithm

Applications History

Invented by Hans Peter Luhn at IBM in 1954, the Luhn algorithm is used to validate credit card numbers, IMEI numbers, and various ID systems.

Algorithm Steps

  1. Starting from the rightmost digit, double every second digit
  2. If doubling produces a number > 9, subtract 9
  3. Sum all digits
  4. If total mod 10 = 0, the number is valid

Example: 4539 1488 0343 6467

Original:  4 5 3 9 1 4 8 8 0 3 4 3 6 4 6 7
Doubled:   8 5 6 9 2 4 7 8 0 3 8 3 3 4 3 7
Sum:       8+5+6+9+2+4+7+8+0+3+8+3+3+4+3+7 = 80
80 mod 10 = 0 ✓ Valid!

Python: Luhn Algorithm

Code
def luhn_validate(number: str) -> bool:
    """Validate a number using the Luhn algorithm."""
    digits = [int(d) for d in number if d.isdigit()]
    # Process from right to left, doubling every second digit
    checksum = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # Every second digit from right
            d *= 2
            if d > 9:
                d -= 9
        checksum += d
    return checksum % 10 == 0

def luhn_generate_check_digit(partial: str) -> int:
    """Generate the Luhn check digit for a partial number."""
    for check in range(10):
        if luhn_validate(partial + str(check)):
            return check
    return -1  # Should never reach here

# Examples
print(luhn_validate("4539148803436467"))  # True
print(luhn_validate("4539148803436468"))  # False
print(luhn_generate_check_digit("453914880343646"))  # 7
Luhn detects: all single-digit errors and most (but not all) transpositions of adjacent digits. It fails for transpositions of 0↔9.
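The 0↔9 blind spot is easy to exhibit with a minimal 3-digit example (the validator is repeated here so the snippet runs standalone): "091" carries a valid Luhn check digit, and so does its transposed form "901".

```python
def luhn_validate(number: str) -> bool:
    """Luhn check (same algorithm as above)."""
    digits = [int(d) for d in number if d.isdigit()]
    checksum = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        checksum += d
    return checksum % 10 == 0

# "091" is Luhn-valid; swapping the adjacent 0 and 9 gives "901",
# which is ALSO valid — the transposition slips through undetected
print(luhn_validate("091"), luhn_validate("901"))  # True True
```

The reason: whichever of the two positions is doubled, 0 contributes 0 and 9 contributes 9 (since 2·9 − 9 = 9), so the checksum is unchanged by the swap.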

Luhn: Strengths & Limitations

Mathematics

Detects

  • All single-digit substitution errors
  • Most adjacent transpositions
  • Simple to implement (no division)
  • Works with standard decimal digits

Misses

  • Transposition of 0 ↔ 9 (09 ↔ 90)
  • Some twin errors (e.g., 22 ↔ 55)
  • Any error pattern that preserves the checksum
  • NOT a security measure — easily forged
Important: The Luhn algorithm is a sanity check, not a security feature. It catches accidental typos but provides zero protection against intentional fraud. Never confuse error detection with authentication!

Better alternative: The Verhoeff algorithm (1969) uses the dihedral group D5 to catch ALL single errors and ALL adjacent transpositions, but is more complex to implement.

Rate vs. Correction Tradeoff

Mathematics

Every error-control code faces a fundamental tradeoff: more redundancy means better error protection but lower throughput.

Code              [n, k, d]     Rate    Detects   Corrects
Uncoded           [n, n, 1]     1.000   0         0
Single parity     [8, 7, 2]     0.875   1         0
2D parity (4×3)   [20, 12, 4]   0.600   3         1
Repetition-3      [3, 1, 3]     0.333   2         1
Repetition-5      [5, 1, 5]     0.200   4         2
Hamming(7,4)      [7, 4, 3]     0.571   2         1
Notice: Hamming(7,4) corrects 1 error just like repetition-3, but at rate 0.571 vs 0.333. Better codes achieve more correction with less redundancy. This is what coding theory optimization is all about.

Limitations of Simple Codes

Mathematics

Single Parity Check

  • Cannot correct errors, only detect
  • Misses all even-weight error patterns
  • No information about error location
  • Requires ARQ (retransmission) protocol

Repetition Codes

  • Catastrophically low code rate
  • Only encodes 1 bit at a time
  • Bandwidth cost grows linearly with protection
  • Far from any theoretical bound

2D Parity

  • Misses rectangular error patterns
  • Limited to single-error correction
  • Overhead grows with grid size
  • Not algebraically elegant

Check Digit Schemes

  • Application-specific, not general-purpose
  • Detect only, cannot correct
  • Limited error pattern coverage
  • Not suitable for binary channels

Summary & What's Next

Key Takeaways

  • Parity bits are the simplest error detection — XOR of all data bits
  • Repetition codes can correct errors but at terrible rates
  • 2D parity improves on both by combining row and column checks
  • Real-world check digits (ISBN, Luhn) use modular arithmetic for human-facing validation
  • All simple codes fall far short of Shannon's channel capacity

Coming Up: Hamming Codes

Richard Hamming asked: can we do much better? His answer was a resounding yes.

Hamming(7,4): Corrects 1 error with rate 4/7 ≈ 0.57, compared to repetition-3's rate of 0.33 for the same correction capability.

We'll see how clever placement of parity bits enables not just detection but location of errors.

Simple Codes
Hamming Codes
CRC
Advanced Codes