The most widely deployed error correction code in history — from deep space to your pocket
Reed & Solomon at MIT Lincoln Lab, 1960
GF(2m) arithmetic, MDS property, code parameters
Generator polynomial, systematic encoding, worked examples
Full pipeline: syndromes, BM, Forney, Chien
CDs, QR codes, deep space, RAID 6
Python RS encoder/decoder over GF(28)
Born in Seattle. Caltech PhD, worked at MIT Lincoln Lab. Also co-invented the Reed-Muller codes (1954).
Later, Reed contributed to JPEG image compression and automatic target recognition.
At MIT Lincoln Lab and JPL. The 1960 paper "Polynomial Codes over Certain Finite Fields" was just 3 pages — one of the most impactful short papers in engineering history.
The original encoding was based on polynomial evaluation, not the generator polynomial approach used today.
Reed-Solomon codes are BCH codes where the alphabet and the root field are the same.
Codeword symbols: GF(2)
Roots of g(x): GF(2m)
Different fields!
Minimal polynomial degree can be up to m, so g(x) degree grows fast.
Codeword symbols: GF(2m)
Roots of g(x): GF(2m)
Same field!
Every root has minimal polynomial of degree 1: (x - αi). So g(x) = product of linear factors.
GF(28) = GF(256): each element is a byte (8 bits).
This is the polynomial used in AES, QR codes, and many RS implementations.
XOR of bytes — trivial and fast:
No carries, no overflow. a + a = 0 for all a.
Polynomial multiplication mod p(x):
Using log/antilog tables makes multiplication a table lookup + addition mod 255.
Start with α = 0x02 (the primitive element x), repeatedly multiply:
| Power | Value (hex) | Value (binary) |
|---|---|---|
| α0 = 1 | 0x01 | 00000001 |
| α1 | 0x02 | 00000010 |
| α2 | 0x04 | 00000100 |
| α7 | 0x80 | 10000000 |
| α8 | 0x1D | 00011101 |
α8 = α4 + α3 + α2 + 1 = 0x1D (from the reduction: x8 mod p(x)).
| Code | Field | Parity | Correction | Use |
|---|---|---|---|---|
| RS(255, 223) | GF(28) | 32 | 16 symbols | Deep space |
| RS(255, 239) | GF(28) | 16 | 8 symbols | DVB |
| RS(255, 249) | GF(28) | 6 | 3 symbols | DiskStorage |
| RS(204, 188) | GF(28) | 16 | 8 symbols | DVB-T |
| RS(28, 24) | GF(28) | 4 | 2 symbols | CD (C1) |
No code with the same n and k can have a larger minimum distance. RS codes are optimal in this sense.
With s erasures and e errors:
Each error "costs" 2 (unknown position + unknown value). Each erasure "costs" 1 (known position, unknown value).
The generator polynomial for RS(n, k) has 2t = n-k consecutive roots:
Each factor is linear (degree 1) since roots are in the same field as coefficients!
A degree-32 polynomial over GF(256). Coefficients are bytes.
First few coefficients: g(x) = x32 + 0x10·x31 + 0x77·x30 + ...
Step 1: Represent message as polynomial m(x) of degree k-1
Step 2: Shift message: x2t · m(x)
Step 3: Compute remainder: r(x) = [x2t · m(x)] mod g(x)
Step 4: Codeword: c(x) = x2t · m(x) - r(x)
GF(8): p(x) = x3 + x + 1, α3 = α + 1
RS(7, 3): t = 2, g(x) = (x-α)(x-α2)(x-α3)(x-α4)
Message: m = [α2, α0, α4] → m(x) = α2x2 + x + α4
Step 1: x4·m(x) = α2x6 + x5 + α4x4
Step 2: Divide by g(x) using GF(8) polynomial long division
Step 3: Remainder r(x) = r3x3 + r2x2 + r1x + r0
Step 4: Codeword c(x) = x4·m(x) + r(x)
Binary BCH only needs error positions (flip the bit). RS needs both positions and values — what is the correct symbol value?
This is why RS decoding adds Forney's algorithm.
| Step | Complexity |
|---|---|
| Syndromes | O(n · 2t) |
| Berlekamp-Massey | O(t2) |
| Chien search | O(n · t) |
| Forney | O(t2) |
Since c(αj) = 0 for valid codewords, the syndrome depends only on errors:
If all 2t syndromes are zero, the received word is a valid codeword — no errors detected (or an undetectable error pattern, which requires ≥ d errors).
Iteratively builds σ(x) = 1 + σ1x + σ2x2 + ... + σvxv whose roots are Xl-1.
def berlekamp_massey(syndromes, t, gf):
"""Berlekamp-Massey algorithm over GF(2^m)."""
n = 2 * t
sigma = [1] # error locator polynomial
old_sigma = [1] # previous sigma
L = 0 # current LFSR length
for r in range(n):
# Compute discrepancy
delta = syndromes[r]
for j in range(1, len(sigma)):
delta = gf.add(delta, gf.mul(sigma[j], syndromes[r - j]))
# Shift old_sigma: multiply by x
old_sigma = [0] + old_sigma
if delta == 0:
continue
# Update sigma
new_sigma = list(sigma)
for j in range(len(old_sigma)):
if j < len(new_sigma):
new_sigma[j] = gf.add(new_sigma[j],
gf.mul(delta, old_sigma[j]))
else:
new_sigma.append(gf.mul(delta, old_sigma[j]))
if 2 * L <= r:
# Length change
old_sigma = [gf.mul(gf.inv(delta), s) for s in sigma]
# Pad to account for the shift we already did
old_sigma = [0] + old_sigma
L = r + 1 - L
sigma = new_sigma
return sigma
Suppose syndromes: S₁ = α3, S₂ = α0, S₃ = α5, S₄ = α2
| Iteration r | Δ | σ(x) | L | Action |
|---|---|---|---|---|
| 0 | α3 | 1 + α3x | 1 | Length update |
| 1 | α6 | 1 + α5x + α6x2 | 2 | Length update |
| 2 | 0 | unchanged | 2 | No change |
| 3 | 0 | unchanged | 2 | No change |
Once we know the error positions Xl, we need the error magnitudes Yl.
where S(x) = S₁ + S₂x + ... + S₂ₜx2t-1 is the syndrome polynomial.
σ'(x) is the formal derivative of σ(x).
In characteristic 2: σ'(x) = σ₁ + σ₃x2 + σ₅x4 + ... (only odd-index coefficients survive).
Evaluate σ(x) at x = α0, α-1, α-2, ..., α-(n-1):
def chien_search(sigma, n, gf):
"""Find roots of sigma(x) by exhaustive evaluation."""
error_positions = []
for i in range(n):
# Evaluate sigma at alpha^(-i)
x_inv = gf.exp_table[(255 - i) % 255] # alpha^(-i)
val = 0
x_power = 1
for coeff in sigma:
val = gf.add(val, gf.mul(coeff, x_power))
x_power = gf.mul(x_power, x_inv)
if val == 0:
error_positions.append(i)
if len(error_positions) != len(sigma) - 1:
return None # Decoding failure!
return error_positions
Unknown position, unknown value.
Costs 2 in the budget: 2e ≤ d-1
Decoder must find both WHERE and WHAT.
Known position, unknown value.
Costs 1 in the budget: s ≤ d-1
Position is flagged by demodulator. Only need to find WHAT.
CCSDS standard adopted by NASA, ESA, JAXA. Used from Voyager to Mars missions.
The Compact Disc (1982) uses a brilliant two-layer RS scheme designed by Philips and Sony:
RS(32, 28) over GF(256)
Corrects 2 byte errors per frame
Handles random errors from disc read
RS(28, 24) over GF(256)
Corrects 2 byte errors or 4 erasures
Handles burst errors (scratches up to ~2.4mm)
| Level | Recovery | Use Case |
|---|---|---|
| L (Low) | ~7% | Clean environments |
| M (Medium) | ~15% | General purpose |
| Q (Quartile) | ~25% | Industrial |
| H (High) | ~30% | Harsh environments |
QR data is encoded as bytes (GF(256)). RS code adds parity bytes. Higher EC level = more parity = less data capacity but more resilience.
QR codes use RS over GF(256) with p(x) = x8 + x4 + x3 + x2 + 1 — the same field as our examples.
RAID 6 uses an RS-like code across disk drives:
P = D₁ ⊕ D₂ ⊕ ... ⊕ Dn
Survives 1 disk failure. Simple XOR.
P = D₁ ⊕ D₂ ⊕ ... ⊕ Dn
Q = g₁·D₁ ⊕ g₂·D₂ ⊕ ... ⊕ gn·Dn
Where gi = αi in GF(256)!
Scale: Modern storage arrays with 20+ disks rely on RAID 6 (or triple-parity RAID) to avoid data loss. The probability of simultaneous failures during rebuild is non-negligible with large drives.
class GF256:
"""Galois Field GF(2^8) with primitive polynomial x^8+x^4+x^3+x^2+1."""
def __init__(self):
self.exp_table = [0] * 512 # double-sized for convenience
self.log_table = [0] * 256
# Generate tables using primitive element alpha = 2
x = 1
for i in range(255):
self.exp_table[i] = x
self.log_table[x] = i
x <<= 1 # multiply by alpha
if x & 0x100: # if x >= 256, reduce mod p(x)
x ^= 0x11D # p(x) = x^8 + x^4 + x^3 + x^2 + 1
# Extend exp table for easy mod-free multiplication
for i in range(255, 512):
self.exp_table[i] = self.exp_table[i - 255]
def add(self, a, b):
return a ^ b # addition = XOR
def sub(self, a, b):
return a ^ b # same as addition in char 2
def mul(self, a, b):
if a == 0 or b == 0: return 0
return self.exp_table[self.log_table[a] + self.log_table[b]]
def div(self, a, b):
if b == 0: raise ZeroDivisionError
if a == 0: return 0
return self.exp_table[(self.log_table[a] - self.log_table[b]) % 255]
def inv(self, a):
if a == 0: raise ZeroDivisionError
return self.exp_table[255 - self.log_table[a]]
def pow(self, a, n):
if a == 0: return 0
return self.exp_table[(self.log_table[a] * n) % 255]
class ReedSolomon:
"""Reed-Solomon encoder/decoder over GF(256)."""
def __init__(self, n=255, k=223):
self.n = n
self.k = k
self.t = (n - k) // 2 # error correction capability
self.gf = GF256()
self.generator = self._build_generator()
def _build_generator(self):
"""Build generator polynomial g(x) = prod(x - alpha^i)."""
g = [1]
for i in range(1, 2 * self.t + 1):
# Multiply g by (x - alpha^i)
alpha_i = self.gf.exp_table[i]
new_g = [0] * (len(g) + 1)
for j in range(len(g)):
new_g[j] = self.gf.add(new_g[j], g[j])
new_g[j + 1] = self.gf.add(new_g[j + 1],
self.gf.mul(g[j], alpha_i))
g = new_g
return g
def encode(self, message):
"""Systematic RS encoding."""
assert len(message) == self.k
# Multiply message by x^(n-k)
msg_padded = message + [0] * (self.n - self.k)
# Compute remainder of msg_padded / generator
remainder = list(msg_padded)
for i in range(self.k):
if remainder[i] != 0:
coef = remainder[i]
for j in range(len(self.generator)):
remainder[i + j] = self.gf.add(
remainder[i + j],
self.gf.mul(self.generator[j], coef))
# Codeword = message + parity
parity = remainder[self.k:]
return message + parity
def decode(self, received):
"""Decode received RS codeword, correct up to t errors."""
# Step 1: Compute syndromes
syndromes = []
for i in range(1, 2 * self.t + 1):
s = 0
for j in range(self.n):
s = self.gf.add(s, self.gf.mul(received[j],
self.gf.pow(self.gf.exp_table[i], j)))
syndromes.append(s)
if all(s == 0 for s in syndromes):
return received[:self.k] # No errors
# Step 2: Berlekamp-Massey -> error locator sigma(x)
sigma = self._berlekamp_massey(syndromes)
# Step 3: Chien search -> error positions
positions = self._chien_search(sigma)
if positions is None:
raise ValueError("Decoding failure: too many errors")
# Step 4: Forney's algorithm -> error values
omega = self._error_evaluator(syndromes, sigma)
sigma_deriv = self._formal_derivative(sigma)
# Step 5: Compute error values and correct
corrected = list(received)
for pos in positions:
x_inv = self.gf.inv(self.gf.exp_table[pos])
# Evaluate omega and sigma' at x_inv
omega_val = self._poly_eval(omega, x_inv)
sigma_d_val = self._poly_eval(sigma_deriv, x_inv)
# Error value
error_val = self.gf.mul(self.gf.exp_table[pos],
self.gf.div(omega_val, sigma_d_val))
corrected[pos] = self.gf.add(corrected[pos], error_val)
return corrected[:self.k]
# Create RS(255, 223) codec
rs = ReedSolomon(n=255, k=223)
# Create a test message (223 bytes)
message = list(range(223)) # [0, 1, 2, ..., 222]
# Encode
codeword = rs.encode(message)
print(f"Message length: {len(message)}")
print(f"Codeword length: {len(codeword)}")
print(f"Parity bytes: {len(codeword) - len(message)}")
# Introduce 16 random symbol errors (maximum correctable)
import random
received = list(codeword)
error_positions = random.sample(range(255), 16)
for pos in error_positions:
received[pos] ^= random.randint(1, 255) # corrupt byte
errors = sum(a != b for a, b in zip(codeword, received))
print(f"\nErrors introduced: {errors}")
# Decode
try:
decoded = rs.decode(received)
if decoded == message:
print("Decoding SUCCESS - all 16 errors corrected!")
else:
print("Decoding produced wrong result")
except ValueError as e:
print(f"Decoding failed: {e}")
# Test with 17 errors (beyond correction capability)
received2 = list(codeword)
for pos in random.sample(range(255), 17):
received2[pos] ^= random.randint(1, 255)
try:
rs.decode(received2)
print("WARNING: 17 errors decoded (miscorrection!)")
except ValueError:
print("17 errors correctly detected as uncorrectable")
| Property | RS Codes | Binary BCH | Convolutional |
|---|---|---|---|
| Symbol size | m-bit symbols | Individual bits | Individual bits |
| Burst error handling | Excellent | Poor | Moderate |
| MDS? | Yes! | No | N/A |
| Random errors | Good | Better per bit | Best (soft) |
| Decoding complexity | O(n·t + t2) | O(n·t + t2) | O(2K·n) |
Compact Disc
First mass RS deployment
DVD
Enhanced CIRC
QR Codes
Mobile revolution