From deep space to your pocket — how error-correcting codes silently protect every bit of digital information in the modern world.
Voyager, Mars rovers, JWST — coding at the edge of the solar system.
CDs, DVDs, Blu-ray, HDDs, SSDs, RAID — protecting your data.
QR codes, NFC, Bluetooth — codes in your pocket.
Ethernet, Wi-Fi, 5G, submarine cables — the digital backbone.
Digital TV, satellite — reaching millions simultaneously.
Quantum error correction, DNA storage — what's next.
Every digital device you use relies on error-correcting codes. They're invisible when they work — and catastrophic when absent.
Right now, as you view this presentation, error-correcting codes are operating in your display controller, your Wi-Fi chip, your storage drive, your RAM (ECC), and the network infrastructure delivering this content.
The most distant human-made objects. Voyager 1 is over 24 billion km from Earth.
Signal power at Earth: ~10^-16 watts. The coding must be exceptional.
| Year | Code | Gain |
|---|---|---|
| 1977 | Golay (24,12) | Baseline |
| 1979 | Convolutional (K=7, R=1/2) | +2 dB |
| 1981 | RS(255,223) + Conv | +2 dB more |
| 1989 | Software update: improved decoding | +1 dB |
Voyager 2's coding was upgraded in flight via software upload. The concatenated RS+convolutional code enabled high-resolution images from Neptune — 4.5 billion km away.
Voyager demonstrated a principle: as spacecraft travel farther, you can compensate by upgrading the coding — software can replace hardware.
Voyager 1 continues transmitting today (2026) using the same RS+convolutional code. Still works perfectly after nearly 50 years.
Spirit/Opportunity: Turbo codes (CCSDS standard). Enabled high-res panoramic images from Mars surface.
Curiosity/Perseverance: Enhanced turbo codes. Data rates up to 2 Mbps from Mars orbit relay.
At L2, 1.5 million km from Earth. Transmitting infrared images of the early universe.
| Parameter | Value |
|---|---|
| Coding | RS(255,223) + LDPC |
| Data rate | 28 Mbps (Ka-band) |
| Distance | 1.5 million km |
| Daily data | ~57 GB downlink |
JWST's coding enables it to transmit as much data daily as Voyager transmitted in its entire mission.
Cross-Interleaved Reed-Solomon Code:
C2: RS(28,24) — first level correction
Interleave: Spread symbols across ~3 mm of track
C1: RS(32,28) — second level correction
A CD can correct a burst error of up to ~4,000 consecutive bits (~2.5 mm of track). This is why CDs play through minor scratches.
| Format | Inner code | Outer code |
|---|---|---|
| CD | RS(32,28) | RS(28,24) |
| DVD | RS(182,172) | RS(208,192) |
| Blu-ray | LDC + BIS | RS-based (picket code) |
A scratch produces a burst error. Without interleaving, one codeword gets destroyed. With interleaving, the burst is spread across many codewords — each sees only 1-2 errors.
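The effect of interleaving on a burst can be sketched with a simple block interleaver. This is an illustration only: the CD's actual cross-interleave uses delay lines rather than a rectangular block, and the sizes (8 codewords of 8 symbols) and the burst position are hypothetical values chosen for the demo.

```python
def interleave(symbols, depth, width):
    """Write `depth` codewords of `width` symbols row by row, read column by column."""
    assert len(symbols) == depth * width
    return [symbols[r * width + c] for c in range(width) for r in range(depth)]


def deinterleave(symbols, depth, width):
    """Inverse of interleave(): restore the original row-by-row order."""
    assert len(symbols) == depth * width
    return [symbols[c * depth + r] for r in range(depth) for c in range(width)]


def erase_burst(stream, start, length):
    """Model a scratch: mark `length` consecutive symbols as lost (None)."""
    return [None if start <= i < start + length else s for i, s in enumerate(stream)]


def losses_per_codeword(stream, depth, width):
    """Count lost symbols in each codeword (each row of the original layout)."""
    return [stream[r * width:(r + 1) * width].count(None) for r in range(depth)]


depth, width = 8, 8                      # hypothetical demo sizes
data = list(range(depth * width))        # symbol values double as positions

# Without interleaving: the burst lands inside one or two codewords.
errors_plain = losses_per_codeword(erase_burst(data, 19, 8), depth, width)

# With interleaving: the same burst on the channel is spread out on receive.
received = erase_burst(interleave(data, depth, width), 19, 8)
errors_interleaved = losses_per_codeword(deinterleave(received, depth, width), depth, width)

print("plain:      ", errors_plain)        # [0, 0, 5, 3, 0, 0, 0, 0]
print("interleaved:", errors_interleaved)  # [1, 1, 1, 1, 1, 1, 1, 1]
```

A burst no longer than the interleaving depth touches each codeword at most once, so a code that corrects even one symbol per codeword recovers everything in this toy setup.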
Every QR code uses RS error correction over GF(2^8). The error correction level determines how much damage the code can survive:
| Level | Recovery | Overhead |
|---|---|---|
| L (Low) | ~7% | Minimal |
| M (Medium) | ~15% | Moderate |
| Q (Quartile) | ~25% | Significant |
| H (High) | ~30% | Maximum |
Level H: up to 30% of the QR code can be destroyed or obscured and it still scans correctly.
This is why you can put logos in the center of QR codes — the RS code recovers the hidden data.
Invented by Denso Wave (1994). ISO/IEC 18004 standard. ~10 billion scans per day worldwide.
Modern HDDs use LDPC codes (since ~2010, replacing RS/Reed-Muller). The channel: magnetic medium read at extreme density.
| Metric | Value |
|---|---|
| Raw BER | ~10^-2 |
| After LDPC | ~10^-15 |
| BER improvement | ~13 orders of magnitude |
| Iterations | 5-50 (adaptive) |
NAND flash cells degrade with writes. Coding is essential:
| Flash type | Raw BER | Code |
|---|---|---|
| SLC (1 bit/cell) | 10^-6 | BCH |
| MLC (2 bits/cell) | 10^-4 | BCH/LDPC |
| TLC (3 bits/cell) | 10^-3 | LDPC |
| QLC (4 bits/cell) | 10^-2 | LDPC (strong) |
QLC SSDs literally cannot work without LDPC codes. About 1 in 100 bits is wrong coming off the flash; the LDPC decoder corrects this to fewer than 1 in 10^15.
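To get a feel for what these raw bit error rates mean per read, the sketch below multiplies each rate from the flash table by a page size. The 16 KiB page is a hypothetical value chosen for illustration; real page sizes vary by device.

```python
# Raw bit error rates taken from the flash table above.
RAW_BER = {"SLC": 1e-6, "MLC": 1e-4, "TLC": 1e-3, "QLC": 1e-2}

# Assumption for illustration only: a 16 KiB flash page.
PAGE_BITS = 16 * 1024 * 8


def expected_raw_errors(ber, bits=PAGE_BITS):
    """Expected number of wrong bits in one page read, before decoding."""
    return ber * bits


for flash_type, ber in RAW_BER.items():
    print(f"{flash_type}: ~{expected_raw_errors(ber):.1f} raw bit errors per page")
```

Under that assumption a QLC page arrives with more than a thousand wrong bits, while an SLC page usually arrives with none.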
| RAID | Method | Code | Tolerance |
|---|---|---|---|
| RAID 0 | Striping | None | 0 drives |
| RAID 1 | Mirroring | Repetition | 1 drive |
| RAID 5 | Parity | XOR parity | 1 drive |
| RAID 6 | Double parity | RS / P+Q | 2 drives |
RAID 5 uses simple XOR parity, the most basic erasure-correcting code. RAID 6 adds a second parity computed with a Reed-Solomon-like code over GF(2^8).
If any one drive fails, XOR the others to recover it. With five drives, this is a [5,4] single-parity-check code!
Two independent syndromes enable recovery from any 2 drive failures. Exactly the RS code principle.
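The single-parity recovery used by RAID 5 can be sketched in a few lines. The block contents and the index of the failed drive are hypothetical; the second (Q) parity of RAID 6 needs GF(2^8) arithmetic and is not shown.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Four data blocks plus one parity block: a [5,4] single-parity-check stripe.
data = [b"ABCD", b"EFGH", b"IJKL", b"MNOP"]   # hypothetical contents
parity = xor_blocks(data)
stripe = data + [parity]

# Simulate losing one drive, then rebuild it from the four survivors.
lost = 2
survivors = [block for i, block in enumerate(stripe) if i != lost]
recovered = xor_blocks(survivors)

print(recovered)  # b'IJKL'
```

The same XOR rebuilds the parity block itself if that is the one lost, because the XOR of all five blocks in a stripe is zero.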
Every Ethernet frame ends with a 4-byte Frame Check Sequence — a CRC-32 computed over the entire frame.
This polynomial detects:
Modern Ethernet uses FEC for the physical layer:
| Standard | FEC |
|---|---|
| 10GBase-T | LDPC (2048,1723) |
| 25/50/100GBase | RS(544,514) + CRC |
| 400GBase | RS(544,514) concatenated |
| 800GBase | RS + LDPC (proposed) |
Billions of CRC-32 computations happen every second across the world's Ethernet networks. Dedicated hardware computes CRC at line rate.
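A software sketch of the frame check follows. Python's `zlib.crc32` uses the same CRC-32 polynomial as Ethernet. The helper names `append_fcs` and `fcs_ok` are hypothetical, and appending the value in little-endian byte order is an assumption made for this illustration; real NICs do all of this in hardware.

```python
import struct
import zlib


def append_fcs(frame: bytes) -> bytes:
    """Append a 4-byte CRC-32 computed over the whole frame."""
    fcs = zlib.crc32(frame) & 0xFFFFFFFF
    return frame + struct.pack("<I", fcs)   # byte order is an assumption here


def fcs_ok(frame_with_fcs: bytes) -> bool:
    """Recompute the CRC over the body and compare with the trailing 4 bytes."""
    body, fcs = frame_with_fcs[:-4], frame_with_fcs[-4:]
    return struct.pack("<I", zlib.crc32(body) & 0xFFFFFFFF) == fcs


frame = bytes(range(64))        # hypothetical 64-byte frame contents
sent = append_fcs(frame)

corrupted = bytearray(sent)
corrupted[10] ^= 0x01           # flip a single bit in transit

print(fcs_ok(sent))              # True
print(fcs_ok(bytes(corrupted)))  # False
```

Note that the CRC only detects the error; recovering the frame is left to retransmission or to the physical-layer FEC.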
| Standard | Year | FEC | Max Rate |
|---|---|---|---|
| 802.11a/g | 1999/2003 | Convolutional (BCC) | 54 Mbps |
| 802.11n (Wi-Fi 4) | 2009 | BCC + LDPC (optional) | 600 Mbps |
| 802.11ac (Wi-Fi 5) | 2013 | BCC + LDPC | 6.9 Gbps |
| 802.11ax (Wi-Fi 6) | 2020 | LDPC mandatory for high MCS | 9.6 Gbps |
| 802.11be (Wi-Fi 7) | 2024 | LDPC for all advanced modes | 46 Gbps |
LDPC provides ~1.5 dB gain over convolutional coding at high data rates (256-QAM, 1024-QAM). This translates to 30-50% range improvement or higher throughput at the same range.
Three codeword lengths: 648, 1296, 1944 bits. Four rates: 1/2, 2/3, 3/4, 5/6. Parity check matrix specified as circulant shift values.
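Specifying a parity-check matrix by circulant shift values means storing one small integer per Z x Z block instead of the full binary matrix. The sketch below expands such a table. The base matrix, the block size Z = 4, the use of -1 for an all-zero block, and the shift direction are all assumptions chosen for illustration, not values from the 802.11 standard.

```python
def circulant(shift, z):
    """Z x Z identity matrix cyclically shifted by `shift`; -1 gives an all-zero block."""
    if shift < 0:
        return [[0] * z for _ in range(z)]
    return [[1 if c == (r + shift) % z else 0 for c in range(z)] for r in range(z)]


def expand(base, z):
    """Expand a table of shift values into the full binary parity-check matrix."""
    rows = []
    for base_row in base:
        blocks = [circulant(s, z) for s in base_row]
        for r in range(z):
            rows.append([bit for block in blocks for bit in block[r]])
    return rows


# Hypothetical toy base matrix: 2 x 3 blocks, expanded with Z = 4.
BASE = [
    [0, 1, -1],
    [2, -1, 0],
]
H = expand(BASE, 4)

for row in H:
    print("".join(map(str, row)))
```

Each row of the expanded matrix has exactly one 1 per non-empty block, which keeps the matrix sparse and lets a hardware decoder process Z checks in parallel.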
5G NR is the first standard to use three different modern code families simultaneously: LDPC, polar, and CRC.
| Channel | Code | Block Size | Why |
|---|---|---|---|
| Data (PDSCH/PUSCH) | LDPC (QC) | 100-8000+ bits | High throughput, parallel decoding |
| Control (PDCCH/PUCCH) | Polar (CA-SCL) | 12-140 bits | Short block excellence |
| Broadcast (PBCH) | Polar | 32 bits + CRC | Reliability for system info |
| Small control | Repetition/simplex | 1-11 bits | Very short messages |
20 Gbps peak downlink
10 Gbps peak uplink
1 ms latency target
99.999% reliability (URLLC)
Without LDPC/polar, 5G would need ~3 dB more transmit power or 50% more spectrum to achieve the same performance.
Over 500 submarine cables carry >95% of intercontinental data. Lengths: up to 20,000 km. Fiber signals degrade over thousands of km.
Net coding gain: 11-12 dB. This is the difference between a working trans-Pacific link and a useless one.
| Gen | FEC | NCG |
|---|---|---|
| 1G | RS + hard decision | 6 dB |
| 2G | Concatenated RS + BCH | 8 dB |
| 3G | Soft-decision LDPC | 11+ dB |
| 4G | Probabilistic shaping + LDPC | 12+ dB |
Each 1 dB of coding gain allows ~25% more cable length or ~25% more capacity. Modern FEC enables 400 Gbps per wavelength across the Atlantic.
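The "~25% per dB" figure matches the plain decibel-to-ratio conversion, since 1 dB is a power factor of about 1.26. The sketch below shows only that conversion for the NCG values in the table; how power margin maps to cable length or capacity depends on the link design and is not modeled here.

```python
def db_to_ratio(db: float) -> float:
    """Convert a gain in decibels to a linear power ratio."""
    return 10 ** (db / 10)


# NCG values from the FEC generations table above, plus the 1 dB step.
for gain_db in (1, 6, 8, 11, 12):
    print(f"{gain_db:>2} dB -> x{db_to_ratio(gain_db):.2f} in power")
```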
| Standard | Medium | Inner FEC | Outer FEC |
|---|---|---|---|
| DVB-S (1994) | Satellite | Convolutional | RS(204,188) |
| DVB-S2 (2004) | Satellite | LDPC (64800) | BCH |
| DVB-T (1997) | Terrestrial | Convolutional | RS(204,188) |
| DVB-T2 (2009) | Terrestrial | LDPC (64800) | BCH |
| DVB-C (1994) | Cable | — | RS(204,188) |
| DVB-C2 (2010) | Cable | LDPC | BCH |
DVB-S2's LDPC code provides ~2.5 dB improvement over DVB-S's convolutional code. This translates to ~30% more TV channels on the same satellite transponder.
DVB-S2X extends DVB-S2 with new LDPC rates (2/9, 13/45, ...) and modulations up to 256-APSK. Within 0.5 dB of the Shannon limit across all rates.
USB 3.x/4: 128b/132b encoding with CRC-32 for link-layer error detection.
USB4 (2019): Adds link-layer retry and FEC for tunneled protocols.
CRC-5 protects the token packet; CRC-16 protects data packets.
Classic BT: 1/3 and 2/3 rate convolutional codes + CRC-16.
BLE (Low Energy): CRC-24 for data integrity. Bluetooth 5.x adds coded PHY with convolutional codes for extended range.
BLE coded PHY: S=2 or S=8 spreading with convolutional code — 4x range vs standard BLE.
NFC-A: Modified Miller encoding + CRC.
NFC-V: CRC-16 for data integrity.
Short range (~10 cm) means noise is low, so simple CRC suffices. The bigger concern is collision resolution with multiple tags.
Choosing the right code for a system depends on many factors. Here's a guide to matching codes to requirements.
| Requirement | Best Code Family | Why |
|---|---|---|
| Maximum throughput | LDPC (QC) | Parallel decoding, hardware friendly |
| Short messages (<256 bits) | Polar (CA-SCL) | Best known short-block performance |
| Algebraic guarantees | RS / BCH | Guaranteed correction capability |
| Error detection only | CRC | Simple, fast, well-understood |
| Broadcast/multicast | Raptor / RaptorQ | Rateless, no feedback needed |
| Very long blocks, near capacity | LDPC (irregular) | Within 0.01 dB of Shannon |
| Burst errors | RS + interleaving | RS naturally handles bursts |
| Ultra-low latency | Convolutional (Viterbi) | Streaming decode, no block delay |
| Code | Year | Gap to Shannon |
|---|---|---|
| Uncoded BPSK | — | ~9.6 dB |
| Hamming (7,4) | 1950 | ~6.0 dB |
| Convolutional (K=7) | 1955 | ~3.0 dB |
| RS(255,128) + Conv | 1977 | ~2.0 dB |
| Turbo (N=65536) | 1993 | ~0.7 dB |
| LDPC (irregular, N=10^6) | 1996 | ~0.04 dB |
| Polar (CA-SCL, N=2^20) | 2009 | ~0.1 dB |
| Year | Milestone | Code |
|---|---|---|
| 1948 | Shannon's channel coding theorem | Theoretical foundation |
| 1950 | Hamming code in IBM computers | Hamming (7,4) |
| 1960 | Reed-Solomon codes invented | RS |
| 1967 | Viterbi algorithm | Convolutional |
| 1977 | Voyager 1 & 2 launch | Golay + Conv |
| 1982 | Compact Disc (CD) | RS (CIRC) |
| 1993 | Turbo codes announced | PCCC |
| 1994 | QR codes invented | RS |
| 1996 | LDPC codes rediscovered | LDPC |
| 1999 | 3G UMTS standardized | Turbo |
| 2004 | DVB-S2 standard | LDPC + BCH |
| 2009 | Polar codes invented | Polar |
| 2009 | Wi-Fi 4 (802.11n) | LDPC (optional) |
| 2018 | 5G NR Release 15 | LDPC + Polar |
| 2024 | Wi-Fi 7 (802.11be) | LDPC (mandatory) |
A single smartphone uses dozens of different error-correcting codes simultaneously — perhaps more coding theory per gram than any other device in history.
| Component | Code(s) | Purpose |
|---|---|---|
| 5G modem (data) | LDPC | Wireless data protection |
| 5G modem (control) | Polar | Control signaling |
| Wi-Fi 6/7 | LDPC | Local wireless |
| Bluetooth | Convolutional + CRC | Short-range wireless |
| NFC | CRC | Contactless payments |
| Flash storage (NAND) | LDPC | Data integrity in storage |
| RAM (LPDDR5) | ECC (Hamming-like) | Memory error correction |
| GPS | Convolutional + CRC | Navigation signals |
| Camera (QR scan) | RS | QR code decoding |
Major cloud providers replace 3x replication with erasure codes to reduce storage overhead:
| Provider | Scheme | Overhead |
|---|---|---|
| Google (GFS) | RS(6,3) | 1.5x |
| Facebook (f4) | RS(10,4) | 1.4x |
| Azure (LRC) | LRC(12,2,2) | 1.33x |
| HDFS 3.x | RS(6,3) or RS(10,4) | 1.5x/1.4x |
Compare: 3x replication = 3.0x overhead!
Azure's innovation: groups of data blocks have local parity (repairs from 2-3 blocks) plus global parity (handles worst-case).
Repair cost: read 2-3 blocks instead of 6-10. Crucial when drives fail daily in million-disk data centers.
At exabyte scale, switching from 3x replication to erasure coding saves millions of dollars in storage hardware annually. Coding theory has enormous economic impact.
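The overhead figures in the provider table follow from counting stored blocks per data block. The sketch below assumes the table's notation lists data blocks first and parity blocks second, so that LRC(12,2,2) means 12 data blocks with 2 local and 2 global parities; 3x replication is modeled as one data block plus two extra copies.

```python
def storage_overhead(data_blocks: int, redundant_blocks: int) -> float:
    """Total blocks stored per block of user data."""
    return (data_blocks + redundant_blocks) / data_blocks


# (data blocks, redundant blocks); notation interpreted as described above.
SCHEMES = {
    "RS(6,3)": (6, 3),
    "RS(10,4)": (10, 4),
    "LRC(12,2,2)": (12, 2 + 2),
    "3x replication": (1, 2),
}

for name, (k, m) in SCHEMES.items():
    print(f"{name:<15} {storage_overhead(k, m):.2f}x")
```

Under these assumptions the output reproduces the 1.5x, 1.4x, 1.33x, and 3.0x values quoted above.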
Quantum computers need error correction even more desperately than classical systems. Qubit error rates: ~10^-3 (1000x worse than transistors).
But quantum information cannot be copied (no-cloning theorem) — classical codes don't directly apply.
Surface codes: 2D lattice of qubits with nearest-neighbor checks. Leading candidate for near-term quantum computers.
CSS codes: Constructed from pairs of classical codes (Calderbank-Shor-Steane).
Quantum LDPC: Sparse parity checks on qubits. Active research frontier.
Current estimates: 1,000-10,000 physical qubits per logical qubit (with surface codes).
A useful quantum computer (1000 logical qubits) would need millions of physical qubits — most performing error correction!
Quantum error correction is the single biggest bottleneck in quantum computing. Better quantum codes = practical quantum computers sooner. The field is desperate for coding theory breakthroughs.
DNA stores information at ~1 exabyte per gram. Stable for thousands of years. Replicated naturally.
Alphabet: {A, C, G, T} — quaternary, not binary.
Synthesis errors: ~1% per base
Sequencing errors: ~1-5%
Insertions and deletions (indels)
Missing strands (erasures)
Reed-Solomon over GF(4) or GF(256) for substitution errors.
Fountain codes (Erlich & Zielinski, 2017): Each DNA strand is a fountain-encoded symbol. Perfect match — read order is random!
Synchronization codes for insertion/deletion correction — a new frontier.
DNA storage combines multiple coding challenges: quaternary alphabet, insertion/deletion errors, and random-access reads. A rich frontier for coding theory research.
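The quaternary alphabet means each base can carry two bits. The naive mapping below is a sketch of that idea only: the bit-to-base assignment is arbitrary, the function names are hypothetical, and practical schemes add constraints (for example avoiding long runs of one base) plus the error correction discussed above.

```python
BASES = "ACGT"   # arbitrary assignment: A=00, C=01, G=10, T=11


def bytes_to_dna(data: bytes) -> str:
    """Map each byte to four bases, most significant bit pair first."""
    return "".join(BASES[(byte >> shift) & 0b11] for byte in data for shift in (6, 4, 2, 0))


def dna_to_bytes(strand: str) -> bytes:
    """Inverse mapping; the strand length must be a multiple of four."""
    values = [BASES.index(base) for base in strand]
    return bytes(
        (values[i] << 6) | (values[i + 1] << 4) | (values[i + 2] << 2) | values[i + 3]
        for i in range(0, len(values), 4)
    )


strand = bytes_to_dna(b"Hi")
print(strand)                # CAGACGGC
print(dna_to_bytes(strand))  # b'Hi'
```

A single substituted base corrupts two bits of one byte here, while an inserted or deleted base shifts every later symbol, which is why indels need dedicated synchronization codes.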
AWGN/Fading: LDPC, Turbo, Polar
Erasure: RS, Fountain, LDPC
Burst errors: RS + interleaving
Mixed: Concatenated codes
Very short (<128): Polar, BCH, short RS
Medium (128-10K): Polar, Turbo, LDPC
Long (>10K): LDPC, Turbo
Very long (>64K): LDPC
Max throughput: QC-LDPC (parallel decoder)
Min latency: Convolutional (streaming)
Min power: Min-sum LDPC
Min complexity: CRC, Hamming, RS
No feedback: Fountain codes
Guaranteed correction: Algebraic (RS, BCH)
No error floor: Polar, RS
Rate adaptation: Polar, Raptor
Mobile industry: LDPC/Polar in 5G enables ~$800B/year mobile ecosystem
Storage: LDPC enables QLC flash — 4x density, saving billions in SSD costs
Cloud: Erasure coding saves ~50% storage costs vs replication
Satellites: Better codes = same performance from cheaper/smaller hardware
Coding theory enables the digital economy itself. Without error-correcting codes:
Coding theory is among the most commercially impactful branches of mathematics in human history — rivaling calculus and statistics in economic contribution.
1. Every digital system uses error-correcting codes — they're the invisible foundation.
2. The right code depends on channel, block length, latency, and throughput requirements.
3. We've nearly closed the gap to Shannon's limit — but new challenges (quantum, DNA) await.
4. Coding theory continuously evolves — 5G uses codes invented in 2009 (polar) and rediscovered in 1996 (LDPC).
Textbooks:
Richardson & Urbanke, "Modern Coding Theory" (LDPC, turbo)
Lin & Costello, "Error Control Coding" (comprehensive)
Arikan, "Channel Polarization" (original polar paper)
Standards:
3GPP TS 38.212 (5G NR coding)
IEEE 802.11ax (Wi-Fi 6 LDPC)
ETSI EN 302 307 (DVB-S2)
Online:
errorcorrectionzoo.org — catalog of known codes
End of Coding Theory Series. Thank you for following along.