The simplest error-detecting and error-correcting schemes
Early telegraph and telephone error checking methods
Even parity, odd parity, and single parity check codes
Triple repetition codes and majority voting
Rectangular codes for stronger error detection
ISBN check digits and the Luhn algorithm
Rate vs. error correction capability
Error checking is as old as communication itself. Long before Shannon's formal theory, engineers devised ad-hoc methods to detect transmission errors.
A parity bit is a single bit appended to a data word to make the total number of 1-bits either even or odd.
Even parity: the parity bit is set so the total number of 1s (data + parity) is even.
1011001 → 10110010 (four 1s)
1010001 → 10100011 (four 1s)
Odd parity: the parity bit is set so the total number of 1s (data + parity) is odd.
1011001 → 10110011 (five 1s)
1010001 → 10100010 (three 1s)
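The odd-parity rule can be sketched in a few lines of Python; the function name is illustrative:

```python
def encode_odd_parity(data: str) -> str:
    """Append a parity bit so the total number of 1s is odd."""
    parity = (sum(int(b) for b in data) + 1) % 2
    return data + str(parity)

print(encode_odd_parity("1011001"))  # "10110011" (five 1s)
print(encode_odd_parity("1010001"))  # "10100010" (three 1s)
```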
The single parity check (SPC) code is an [n, n−1, 2] code: it appends one parity bit to n−1 data bits. With minimum distance 2, it detects any single-bit error (in fact, any odd number of bit flips) but cannot correct any.
| Data (7 bits) | Parity | Codeword |
|---|---|---|
| 0000000 | 0 | 00000000 |
| 1010101 | 0 | 10101010 |
| 1100110 | 0 | 11001100 |
| 1111111 | 1 | 11111111 |
```python
def encode_even_parity(data: str) -> str:
    """Append an even parity bit to a binary string."""
    parity = sum(int(b) for b in data) % 2
    return data + str(parity)

def check_even_parity(codeword: str) -> bool:
    """Check if a codeword has valid even parity."""
    return sum(int(b) for b in codeword) % 2 == 0

# Encoding examples
print(encode_even_parity("1011001"))  # "10110010" (4 ones, parity=0)
print(encode_even_parity("1010001"))  # "10100011" (3 ones, parity=1)

# Detection examples
print(check_even_parity("10110010"))  # True  - no error
print(check_even_parity("10110011"))  # False - 1 bit flipped!
print(check_even_parity("10110001"))  # True  - 2 bits flipped, UNDETECTED!
```
```
One bit flipped (detected):
  Sent:     1 0 1 1 0 0 1 0   (parity = 0, sum = 4)
  Received: 1 0 0 1 0 0 1 0   (parity check: sum = 3, odd)   ERROR!

Two bits flipped (undetected):
  Sent:     1 0 1 1 0 0 1 0   (parity = 0, sum = 4)
  Received: 1 0 0 1 0 1 1 0   (parity check: sum = 4, even)  Looks OK!

Three bits flipped (detected):
  Sent:     1 0 1 1 0 0 1 0   (parity = 0, sum = 4)
  Received: 0 0 0 1 0 1 1 0   (parity check: sum = 3, odd)   ERROR!
```
The simplest error-correcting code: repeat each bit n times. The resulting [n, 1, n] code can correct up to ⌊(n−1)/2⌋ errors.
Triple repetition
0 → 000
1 → 111
Corrects 1 error
Quintuple repetition
0 → 00000
1 → 11111
Corrects 2 errors
Septuple repetition
0 → 0000000
1 → 1111111
Corrects 3 errors
For repetition codes, decoding is simple: take the majority vote of the received bits.
Received 000 → decode as 0 (unanimous)
Received 001 → decode as 0 (2 vs 1)
Received 010 → decode as 0 (2 vs 1)
Received 011 → decode as 1 (1 vs 2)
Received 111 → decode as 1 (unanimous)
For the BSC with crossover probability p, triple repetition fails when two or more of the three bits flip:

P_error = 3p²(1 − p) + p³

For p = 0.01: P_error = 3(0.01)²(0.99) + (0.01)³ ≈ 2.98 × 10⁻⁴ (much better than the raw p!)
```python
def repetition_encode(bit: str, n: int = 3) -> str:
    """Encode a single bit using n-fold repetition."""
    return bit * n

def majority_vote_decode(received: str) -> str:
    """Decode using majority voting."""
    ones = received.count('1')
    return '1' if ones > len(received) // 2 else '0'

# Encode and simulate an error
encoded = repetition_encode('1', 3)       # "111"
received = "110"                          # one bit flipped
decoded = majority_vote_decode(received)  # "1" - correct!
```
How does error probability decrease as we increase repetition factor n?
```python
from math import comb

def repetition_error_prob(p, n):
    """Probability of decoding error for the [n,1,n] repetition code.

    A decoding error occurs when more than n//2 bits are flipped.
    """
    t = n // 2  # correction capability
    return sum(comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(t + 1, n + 1))

# Compare for p = 0.01
for n in [1, 3, 5, 7, 9]:
    pe = repetition_error_prob(0.01, n)
    print(f"[{n},1,{n}]: P_error = {pe:.2e}, Rate = {1/n:.3f}")
```
| Code | Rate | P_error (p = 0.01) | Improvement |
|---|---|---|---|
| Uncoded | 1.000 | 1.00 × 10⁻² | — |
| [3,1,3] | 0.333 | 2.98 × 10⁻⁴ | ≈34× |
| [5,1,5] | 0.200 | 9.85 × 10⁻⁶ | ≈1,000× |
| [7,1,7] | 0.143 | 3.42 × 10⁻⁷ | ≈29,000× |
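These closed-form values can be sanity-checked by simulation. A small Monte Carlo sketch (the larger p = 0.05 here is an arbitrary choice so the estimate stabilizes with fewer trials; trial count and seed are likewise illustrative):

```python
import random

def simulate_repetition(p: float, n: int, trials: int) -> float:
    """Estimate the decoding-error probability of the [n,1,n] code on a BSC(p)."""
    random.seed(0)  # reproducible run
    errors = 0
    for _ in range(trials):
        # Send a bit as n copies; count how many the channel flips
        flips = sum(random.random() < p for _ in range(n))
        if flips > n // 2:  # majority vote decodes wrongly
            errors += 1
    return errors / trials

est = simulate_repetition(0.05, 3, 200_000)
print(f"estimated P_error: {est:.5f}")  # analytic: 3*p**2*(1-p) + p**3 ≈ 0.00725
```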
Arrange data in a rectangular grid and compute parity for each row AND each column. This creates a 2D parity check code.
```
Data bits   | Row parity
------------+-----------
 1 0 1 1    |  1
 0 1 1 0    |  0
 1 1 0 1    |  1
------------+-----------
 0 0 0 0    |  0   ← column parities (corner bit checks the parity bits)
```
No error:

```
1 0 1 1 | 1
0 1 1 0 | 0
1 1 0 1 | 1
--------+--
0 0 0 0 | 0
```

All row/column parities correct.

Single error at (1,1):

```
1 0 1 1 | 1   ✓
0 0 1 0 | 0   ✗
1 1 0 1 | 1   ✓
--------+--
0 0 0 0 | 0
  ✗
```

Row 1 fails, Col 1 fails → error at (1,1)
```python
def encode_2d_parity(data, rows, cols):
    """Encode data with 2D parity (row + column + corner)."""
    grid = [data[i*cols:(i+1)*cols] for i in range(rows)]
    # Add a parity bit to each row
    for row in grid:
        row.append(sum(row) % 2)
    # Add the column-parity row (its last entry is the corner bit)
    col_parity = [sum(grid[r][c] for r in range(rows)) % 2
                  for c in range(cols + 1)]
    grid.append(col_parity)
    return grid

def check_2d_parity(grid):
    """Check 2D parity; return an error position, None, or a message."""
    rows, cols = len(grid), len(grid[0])
    bad_rows = [r for r in range(rows) if sum(grid[r]) % 2 != 0]
    bad_cols = [c for c in range(cols)
                if sum(grid[r][c] for r in range(rows)) % 2 != 0]
    if not bad_rows and not bad_cols:
        return None  # No error detected
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        return (bad_rows[0], bad_cols[0])  # Single error located!
    return "Multiple errors detected"

# Example
data = [1,0,1,1, 0,1,1,0, 1,1,0,1]
grid = encode_2d_parity(data, 3, 4)
grid[1][1] ^= 1               # Flip the bit at (1,1)
print(check_2d_parity(grid))  # (1, 1)
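One caveat worth demonstrating: with two flipped bits, the row/column syndrome no longer points to a unique cell, so 2D parity can only detect, not correct. A self-contained sketch (compact re-implementation for illustration; helper names are mine):

```python
def parity_grid(rows):
    """Append even parity to each row, then a column-parity row."""
    g = [r + [sum(r) % 2] for r in rows]
    g.append([sum(col) % 2 for col in zip(*g)])
    return g

def syndrome(g):
    """Return (bad_rows, bad_cols): indices of failing parity checks."""
    bad_rows = [i for i, r in enumerate(g) if sum(r) % 2]
    bad_cols = [j for j, c in enumerate(zip(*g)) if sum(c) % 2]
    return bad_rows, bad_cols

g = parity_grid([[1,0,1,1], [0,1,1,0], [1,1,0,1]])
g[0][0] ^= 1
g[1][1] ^= 1
print(syndrome(g))  # ([0, 1], [0, 1]) - four candidate cells, ambiguous
```

The errors could be at (0,0) and (1,1) or at (0,1) and (1,0); both patterns produce the same syndrome.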
The International Standard Book Number (ISBN) uses a check digit system to catch transcription errors when book numbers are typed or spoken.
10 digits: d₁d₂…d₁₀; the last digit can be 'X' (representing 10)
Check condition: 1·d₁ + 2·d₂ + ⋯ + 10·d₁₀ ≡ 0 (mod 11)
Detects: all single-digit errors AND all transpositions of adjacent digits!
13 digits: d₁d₂…d₁₃
Weights alternate: 1, 3, 1, 3, …
Check condition: 1·d₁ + 3·d₂ + 1·d₃ + ⋯ + 1·d₁₃ ≡ 0 (mod 10)
Detects: all single-digit errors but NOT all transpositions.
```python
def validate_isbn10(isbn: str) -> bool:
    """Validate an ISBN-10 check digit."""
    isbn = isbn.replace('-', '')
    if len(isbn) != 10:
        return False
    total = 0
    for i, ch in enumerate(isbn):
        if ch == 'X' and i == 9:
            val = 10
        elif ch.isdigit():
            val = int(ch)
        else:
            return False
        total += (i + 1) * val
    return total % 11 == 0

def validate_isbn13(isbn: str) -> bool:
    """Validate an ISBN-13 check digit."""
    isbn = isbn.replace('-', '')
    if len(isbn) != 13 or not isbn.isdigit():
        return False
    weights = [1, 3] * 6 + [1]
    total = sum(int(d) * w for d, w in zip(isbn, weights))
    return total % 10 == 0

# Examples
print(validate_isbn10("0-306-40615-2"))      # True
print(validate_isbn13("978-0-306-40615-7"))  # True
print(validate_isbn10("0-306-40615-3"))      # False (wrong check digit)
```
Why does ISBN-10 detect all adjacent transpositions? Suppose adjacent digits a = dⱼ and b = dⱼ₊₁ are swapped. The weighted sum changes by [j·b + (j+1)·a] − [j·a + (j+1)·b] = a − b.
For this to go undetected: (a − b) ≡ 0 (mod 11)
Since 0 ≤ a, b ≤ 10 and a ≠ b: |a − b| ranges from 1 to 10, none of which is divisible by 11, so every adjacent transposition is caught.
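By contrast, ISBN-13's alternating 1/3 weights miss some swaps: transposing adjacent digits a and b changes the weighted sum by ±2(a − b), which is invisible mod 10 whenever |a − b| = 5. A quick numeric check (self-contained re-implementation of the digit condition):

```python
def isbn13_ok(digits):
    """ISBN-13 check: alternating 1,3 weights, sum divisible by 10."""
    weights = [1, 3] * 6 + [1]
    return sum(d * w for d, w in zip(digits, weights)) % 10 == 0

valid = [9, 7, 8, 0, 3, 0, 6, 4, 0, 6, 1, 5, 7]  # 978-0-306-40615-7
print(isbn13_ok(valid))    # True

# Swap the adjacent digits 6 and 1 (indices 9 and 10, |a - b| = 5):
# the weighted sum changes by 2*(6 - 1) = 10, invisible mod 10.
swapped = valid.copy()
swapped[9], swapped[10] = swapped[10], swapped[9]
print(isbn13_ok(swapped))  # True - transposition UNDETECTED!
```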
Invented by Hans Peter Luhn at IBM in 1954, the Luhn algorithm is used to validate credit card numbers, IMEI numbers, and various ID systems.
```
Original: 4 5 3 9 1 4 8 8 0 3 4 3 6 4 6 7
Doubled:  8 5 6 9 2 4 7 8 0 3 8 3 3 4 3 7   (every 2nd digit from the right doubled; subtract 9 if > 9)
Sum:      8+5+6+9+2+4+7+8+0+3+8+3+3+4+3+7 = 80
          80 mod 10 = 0  ✓  Valid!
```
```python
def luhn_validate(number: str) -> bool:
    """Validate a number using the Luhn algorithm."""
    digits = [int(d) for d in number if d.isdigit()]
    # Process from right to left, doubling every second digit
    checksum = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        checksum += d
    return checksum % 10 == 0

def luhn_generate_check_digit(partial: str) -> int:
    """Generate the Luhn check digit for a partial number."""
    for check in range(10):
        if luhn_validate(partial + str(check)):
            return check
    return -1  # unreachable: exactly one digit in 0-9 works

# Examples
print(luhn_validate("4539148803436467"))             # True
print(luhn_validate("4539148803436468"))             # False
print(luhn_generate_check_digit("453914880343646"))  # 7
```
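The Luhn check has one blind spot: swapping adjacent 0 and 9 leaves the checksum unchanged, because 9 doubled gives 18 → 9, so both orderings of the pair contribute 9. A brute-force sketch (helper name is mine) confirming this is the only adjacent transposition Luhn misses:

```python
def luhn_pair_contrib(a, b):
    """Checksum contribution of an adjacent pair where `a` sits in a doubled position."""
    da = 2 * a - 9 if 2 * a > 9 else 2 * a
    return da + b

# A swap of adjacent digits a, b goes undetected iff both orderings
# contribute the same amount mod 10.
missed = [(a, b) for a in range(10) for b in range(a + 1, 10)
          if luhn_pair_contrib(a, b) % 10 == luhn_pair_contrib(b, a) % 10]
print(missed)  # [(0, 9)] - only the 09 <-> 90 swap escapes detection
```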
Better alternative: the Verhoeff algorithm (1969) uses the dihedral group D₅ to catch ALL single-digit errors and ALL adjacent transpositions, but is more complex to implement.
Every error-control code faces a fundamental tradeoff: more redundancy means better error protection but lower throughput.
| Code | [n, k, d] | Rate | Detects | Corrects |
|---|---|---|---|---|
| Uncoded | [n, n, 1] | 1.000 | 0 | 0 |
| Single parity | [8, 7, 2] | 0.875 | 1 | 0 |
| 2D parity (4×3) | [20, 12, 4] | 0.600 | 3 | 1 |
| Repetition-3 | [3, 1, 3] | 0.333 | 2 | 1 |
| Repetition-5 | [5, 1, 5] | 0.200 | 4 | 2 |
| Hamming(7,4) | [7, 4, 3] | 0.571 | 2 | 1 |
Richard Hamming asked: can we do much better? His answer was a resounding yes.
Hamming(7,4): Corrects 1 error with rate 4/7 ≈ 0.57, compared to repetition-3's rate of 0.33 for the same correction capability.
We'll see how clever placement of parity bits enables not just detection but location of errors.