Codes that generate an endless stream of encoded symbols — collect enough drops from the fountain, and you can reconstruct the original message.
The digital fountain metaphor and why fixed-rate codes fall short.
Luby Transform codes — the first practical fountain code.
Linear-time encoding and decoding with pre-coding.
Degree distributions, soliton distributions, and analysis.
Streaming, broadcast, multicast, and distributed storage.
RaptorQ (RFC 6330) and real-world deployment.
Imagine a fountain that continuously sprays water drops. Each drop carries a tiny bit of information. You hold a cup under the fountain and collect drops.
It doesn't matter which drops you catch — once you've collected enough, you can reconstruct the message.
A fountain code generates a potentially unlimited number of encoded symbols from k source symbols. Any (1+ε)k received symbols suffice for decoding.
No feedback needed: Sender doesn't need to know what was received.
No rate decision: No need to predict the channel quality in advance.
Universal: Works for any erasure rate without code redesign.
Scalable: One sender, millions of receivers with different loss rates.
First articulated by John Byers, Michael Luby, and Michael Mitzenmacher (1998).
A satellite broadcasts data to millions of receivers. Each has a different packet loss rate (1% to 30%).
With a fixed-rate code (e.g., a rate-1/2 RS code), the rate must be provisioned for the worst receiver: a receiver losing 1% of packets wastes nearly half the bandwidth, while any receiver losing more than 50% cannot decode at all.
Consider 10 receivers, each missing a different 10% of packets. With fixed codes you must retransmit the specific missing packets — but what's missing varies per receiver, so each retransmission helps only some of them.
This is the "coupon collector" problem — terribly inefficient!
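The inefficiency is easy to quantify: gathering all k distinct pieces from uniformly random retransmissions is the classic coupon-collector process, so the expected number of draws is k·H_k ≈ k ln k. A quick back-of-the-envelope check:

```python
from math import log

# Coupon collector: drawing uniformly from k distinct pieces, the expected
# number of draws needed to see every piece is k * H_k, roughly k * ln(k).
k = 1000
H_k = sum(1.0 / i for i in range(1, k + 1))
expected_draws = k * H_k
print(f"Expected draws to collect all {k} pieces: {expected_draws:.0f}")
print(f"k * ln(k) approximation: {k * log(k):.0f}")
```

For k = 1000 this works out to roughly 7.5k draws: about 7.5x the data just to finish the tail of the collection.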
Fountain codes solve both problems: every encoded symbol is useful to every receiver, regardless of which symbols they've already received.
The BEC is the natural channel for fountain codes:
Each transmitted symbol is either received perfectly or erased (lost entirely).
Erasure probability = ε
Beautifully simple! The BEC has capacity 1 − ε: to transmit k bits, you need at least k/(1−ε) channel uses.
Internet packets (TCP/UDP): lost or received
Broadcast/multicast: missed or received
Wireless: fading below threshold = erasure
Storage: sector failures = erasures
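The channel model is simple enough to simulate in a few lines. A minimal sketch (erased symbols are marked `None`; the seed is arbitrary):

```python
import random

def bec(symbols, eps, seed=0):
    """Binary erasure channel: each symbol arrives intact with
    probability 1 - eps and is erased (None) with probability eps."""
    rng = random.Random(seed)
    return [s if rng.random() >= eps else None for s in symbols]

# Sending k/(1-eps) symbols over BEC(eps) delivers about k of them.
k, eps = 1000, 0.2
sent = list(range(round(k / (1 - eps))))      # 1250 channel uses
received = [s for s in bec(sent, eps) if s is not None]
print(f"{len(received)} of {len(sent)} symbols survived")
```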
Each encoded symbol is a random linear combination of source symbols over GF(2):

yi = ⊕_{j ∈ Si} xj

where Si is a random subset of {1, 2, ..., k}.
1. Choose a random binary vector gi of length k
2. Compute yi = gi · x (inner product mod 2)
3. Transmit (yi, gi) or seed for gi
Collect n ≥ k symbols. Form matrix G (n × k). Solve Gx = y by Gaussian elimination.
Succeeds with probability at least 1 − 2⁻ᶜ when n = k + c: only a constant number of extra symbols is needed.
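The decoding step above can be sketched directly. A minimal Gauss-Jordan solver over GF(2); the function name and the 16-extra-symbol margin in the demo are illustrative choices:

```python
import numpy as np

def solve_gf2(G, y):
    """Solve G x = y over GF(2) by Gauss-Jordan elimination.

    G: (n, k) 0/1 matrix of generator rows; y: length-n 0/1 vector.
    Returns the k source bits, or None if G is not yet full rank.
    """
    G = G.copy() % 2
    y = y.copy() % 2
    n, k = G.shape
    row = 0
    for col in range(k):
        # Find a pivot row with a 1 in this column
        hits = np.nonzero(G[row:, col])[0]
        if len(hits) == 0:
            return None  # rank-deficient: collect more symbols
        p = row + hits[0]
        G[[row, p]] = G[[p, row]]
        y[[row, p]] = y[[p, row]]
        # Eliminate this column from every other row
        for r in range(n):
            if r != row and G[r, col]:
                G[r] ^= G[row]
                y[r] ^= y[row]
        row += 1
    return y[:k]

# Demo: 16 extra symbols make the failure probability about 2^-16
rng = np.random.default_rng(0)
k, n = 20, 36
x = rng.integers(0, 2, k)
G = rng.integers(0, 2, (n, k))   # dense random generator rows
y = (G @ x) % 2
decoded = solve_gf2(G, y)
```

The cubic cost is visible in the nested loops: each of k pivot columns sweeps all n rows of length k.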
Problem: Encoding is O(k) per symbol, decoding is O(k³) — impractical for large k! This motivated the search for structured fountain codes.
Michael Luby introduced LT (Luby Transform) codes in 2002 — the first rateless erasure code with efficient encoding and decoding. The key insight: use a sparse random generator instead of a dense one.
For each encoded symbol:
1. Sample degree d from distribution Ω(d)
2. Choose d source symbols uniformly at random
3. XOR them together: y = x_{i1} ⊕ x_{i2} ⊕ ... ⊕ x_{id}
Not all degrees are equally useful. The degree distribution Ω must be carefully designed for efficient decoding.
1. Find an encoded symbol with degree 1 (covers one source symbol)
2. That source symbol is now known — "peel" it off
3. XOR it from all other encoded symbols that include it
4. This may create new degree-1 symbols — repeat!
Like a chain reaction: one decoded symbol triggers others.
The ideal soliton distribution, ρ(1) = 1/k and ρ(d) = 1/(d(d−1)) for d = 2, ..., k, is designed so that in expectation exactly one encoded symbol has degree 1, and peeling releases exactly one new degree-1 symbol at each step.
The ideal soliton fails in practice: variance is too high. The peeling process often stalls because no degree-1 symbols are available.
Luby's fix: add extra degree-1 symbols and boost certain degrees:

μ(d) = (ρ(d) + τ(d)) / Z, with S = c · √k · ln(k/δ) and

τ(d) = S/(kd) for d = 1, ..., ⌊k/S⌋ − 1; τ(⌊k/S⌋) = (S/k) · ln(S/δ); τ(d) = 0 otherwise,

where τ(d) adds "spikes" and Z = Σd (ρ(d) + τ(d)) normalizes. Parameters δ (decoder failure probability) and c (a tuning constant) control the trade-off.
| Parameter | Effect |
|---|---|
| More degree-1 | Reliable start, higher overhead |
| Higher avg degree | Better coverage, slower encoding |
| Optimal | k + O(√k ln²(k/δ)) symbols needed |
| Symbol | Degree | Composition |
|---|---|---|
| y1 | 1 | B |
| y2 | 2 | A ⊕ C |
| y3 | 3 | A ⊕ B ⊕ D |
| y4 | 2 | C ⊕ D |
| y5 | 1 | A |
Step 1: y1 has degree 1 → B = y1
Step 2: y5 has degree 1 → A = y5
Step 3: Remove A from y2 → y2' = C (degree 1!) → C = y2'
Step 4: Remove C from y4 → y4' = D (degree 1!) → D = y4'
All 4 source symbols recovered from 5 encoded symbols!
A compact NumPy implementation of robust soliton sampling, LT encoding, and peeling decoding:

```python
import numpy as np
from math import log, sqrt

def robust_soliton(k, c=0.1, delta=0.5):
    """Generate the robust soliton distribution for k source symbols."""
    # Ideal soliton: rho(1) = 1/k, rho(d) = 1/(d(d-1))
    rho = [0.0] * (k + 1)
    rho[1] = 1.0 / k
    for d in range(2, k + 1):
        rho[d] = 1.0 / (d * (d - 1))
    # Tau component: 1/d tail up to the pivot, plus a spike at the pivot
    S = c * sqrt(k) * log(k / delta)
    tau = [0.0] * (k + 1)
    pivot = int(k / S)
    for d in range(1, min(pivot, k) + 1):
        tau[d] = S / (k * d)
    if 1 <= pivot <= k:
        tau[pivot] = S * log(S / delta) / k
    # Normalize
    total = sum(rho[d] + tau[d] for d in range(1, k + 1))
    return [(rho[d] + tau[d]) / total for d in range(1, k + 1)]  # degrees 1..k

def lt_encode(source_symbols, n_encoded, degree_dist):
    """
    LT encode: generate n_encoded symbols from source.
    Returns list of (encoded_value, neighbors).
    """
    k = len(source_symbols)
    degrees = list(range(1, k + 1))
    encoded = []
    for _ in range(n_encoded):
        # Sample degree from distribution
        d = np.random.choice(degrees, p=degree_dist)
        # Choose d distinct source symbols uniformly at random
        neighbors = sorted(np.random.choice(k, size=d, replace=False))
        # XOR them together
        value = 0
        for idx in neighbors:
            value ^= source_symbols[idx]
        encoded.append((value, neighbors))
    return encoded

def lt_decode(encoded_symbols, k):
    """Peeling decoder for LT codes."""
    decoded = [None] * k
    remaining = [(val, list(nbrs)) for val, nbrs in encoded_symbols]
    progress = True
    while progress:
        progress = False
        for i, (val, nbrs) in enumerate(remaining):
            if len(nbrs) == 1:
                idx = nbrs[0]
                if decoded[idx] is None:
                    decoded[idx] = val
                    progress = True
                    # Peel this source symbol out of every symbol that includes it
                    for j, (v2, n2) in enumerate(remaining):
                        if idx in n2:
                            remaining[j] = (v2 ^ val, [x for x in n2 if x != idx])
                    remaining[i] = (0, [])  # consumed
    return decoded

# Demo
k = 100
source = np.random.randint(0, 256, k).tolist()
dist = robust_soliton(k)
encoded = lt_encode(source, int(k * 1.2), dist)  # 20% overhead
result = lt_decode(encoded, k)
recovered = sum(1 for x in result if x is not None)
print(f"Recovered {recovered}/{k} source symbols")
```
With the robust soliton distribution, LT codes need

n = k + O(√k · ln²(k/δ))

symbols to decode k source symbols with probability 1 − δ.
Overhead ε = n/k - 1 ≈ 5-10% for practical k.
Average degree under robust soliton: O(ln(k/δ))
Each encoded symbol: O(ln k) XOR operations
Total encoding: O(k ln k) — near-linear!
Peeling decoder: each source symbol "ripples" through O(ln k) encoded symbols.
Total decoding: O(k ln k)
LT code limitation: The O(√k ln²k) extra-symbol term shrinks only slowly relative to k. For k = 10,000 you still need ~5-10% extra symbols. Raptor codes fix this.
| Code | Overhead | Decode |
|---|---|---|
| Random linear | O(1) | O(k³) |
| LT | O(√k ln²k) | O(k ln k) |
| Raptor | O(1) | O(k) |
Amin Shokrollahi (2006) invented Raptor codes — achieving O(k) encoding/decoding with constant overhead. The name: Rapid Tornado + LT.
First apply a high-rate pre-code (inner code) to the k source symbols, producing n' = k(1+ε) intermediate symbols.
Then apply an LT code with weak degree distribution (low average degree) to the intermediate symbols.
The pre-code handles the symbols that LT misses!
The weak LT code recovers ~95-99% of intermediate symbols. The pre-code (e.g., LDPC) fills in the remaining 1-5%.
Because the LT code is weak, its average degree is O(1) — constant! This gives linear-time encoding and decoding.
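The two-stage pipeline can be sketched as follows. Note the pre-code here is a toy stand-in (a few random degree-3 XOR parities) for a real LDPC/HDPC pre-code, and the uniform constant-degree draw is illustrative rather than an optimized distribution:

```python
import numpy as np

rng = np.random.default_rng(1)

def precode(source, n_parity):
    """Toy high-rate pre-code: append a few sparse random XOR parities.

    A real Raptor code uses structured LDPC/HDPC pre-codes; random
    degree-3 checks just illustrate the two-stage shape.
    """
    k = len(source)
    intermediate = list(source)
    for _ in range(n_parity):
        members = rng.choice(k, size=3, replace=False)
        p = 0
        for i in members:
            p ^= source[i]
        intermediate.append(p)
    return intermediate

def weak_lt_encode(intermediate, n_encoded, max_degree=4):
    """Weak LT stage: constant average degree, so O(1) work per symbol."""
    m = len(intermediate)
    encoded = []
    for _ in range(n_encoded):
        d = int(rng.integers(1, max_degree + 1))  # O(1) degree, not O(ln k)
        nbrs = sorted(rng.choice(m, size=d, replace=False).tolist())
        v = 0
        for i in nbrs:
            v ^= intermediate[i]
        encoded.append((v, nbrs))
    return encoded

source = list(np.random.default_rng(2).integers(0, 256, 10))
intermediate = precode(source, n_parity=2)    # rate 10/12 pre-code
encoded = weak_lt_encode(intermediate, 15)    # LT over intermediate symbols
```

Decoding reverses the stages: peel the weak LT code first, then use the parity checks to repair whichever intermediate symbols the peeling missed.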
| Pre-code | Rate | Decoding |
|---|---|---|
| LDPC | 0.95-0.99 | Belief propagation |
| Hamming-like | ~0.95 | Bounded distance |
| HDPC | ~0.97 | Gaussian elimination |
| Combination | ~0.96 | RaptorQ approach |
Much simpler than standalone LT:
Average degree: O(1) (e.g., ~4-5)
Distribution: optimized for pre-code residual
No need for heavy tail of soliton
Total overhead: k(1+ε) symbols suffice with ε < 0.05 for practical implementations. Encoding and decoding are both O(k).
RaptorQ is the IETF standard implementation of Raptor codes. It's the most powerful erasure code deployed at scale.
| Parameter | Value |
|---|---|
| Source symbols | 1 to 56,403 |
| Symbol size | Arbitrary (bytes) |
| Overhead | <2 extra symbols |
| Failure prob | <10⁻⁶ at k+2 |
| Encoding speed | ~1 Gbps |
Uses a sophisticated combination:
1. LDPC: Sparse parity checks for bulk error correction
2. HDPC: Half-density parity checks (dense GF(256) rows) for guaranteed recovery
3. LT: Degree distribution optimized for the specific pre-code
RaptorQ achieves information-theoretically optimal overhead: it can decode from k+0 to k+2 symbols in most cases. Essentially zero waste.
| Property | ARQ (TCP-style) | Fountain Code |
|---|---|---|
| Feedback required? | Yes (ACK/NACK) | No |
| Multicast efficiency | Poor (N separate streams) | Excellent (one stream) |
| Latency | 2+ RTT per loss event | One one-way delay |
| Bandwidth overhead | Feedback + retransmissions | ~2-5% coding overhead |
| Adapts to loss rate? | Reactively (slow) | Automatically |
| Deep space viable? | No (30 min RTT!) | Yes |
Fountain codes are ideal when feedback is expensive (satellite), impossible (broadcast), or when serving heterogeneous receivers with different loss rates.
3GPP Multimedia Broadcast/Multicast Service uses Raptor codes (RFC 5053) for file delivery and streaming to mobile devices.
One base station broadcasts; thousands of phones decode. Each phone misses different packets but all recover.
Mobile TV standards use Raptor codes as the application-layer FEC, complementing physical-layer coding (LDPC/turbo).
ATSC 3.0 (next-gen US digital TV) supports Raptor for robust mobile reception.
IETF file delivery over unidirectional transport. RaptorQ provides reliable file delivery without feedback.
Used in satellite content distribution and emergency broadcasting.
Low-latency video: Raptor codes protect against packet loss without retransmission delay.
Particularly valuable for live streaming where retransmission would violate latency budgets.
Distributing a 1 GB update to millions of devices:
Without fountain: Each device needs the exact missing pieces. Server tracks state per device.
With fountain: Server sprays encoded symbols. Every symbol helps every device. No per-device state.
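A stateless server loop can be sketched by deriving every encoded symbol from a seed, so the receiver regenerates the neighbor set itself (the degree draw below is illustrative, not a tuned distribution):

```python
import numpy as np

def symbol_stream(source, start_seed=0):
    """Stateless encoded-symbol stream: each symbol is fully determined
    by its seed, so the sender keeps no per-receiver state and any
    receiver can rebuild the neighbor set from the seed alone.
    (The degree draw is illustrative, not a tuned distribution.)
    """
    k = len(source)
    seed = start_seed
    while True:
        rng = np.random.default_rng(seed)
        d = int(rng.integers(1, min(k, 4) + 1))
        nbrs = sorted(rng.choice(k, size=d, replace=False).tolist())
        value = 0
        for i in nbrs:
            value ^= source[i]
        yield seed, value          # transmit (seed, value)
        seed += 1

stream = symbol_stream([10, 20, 30, 40])
drops = [next(stream) for _ in range(6)]
```

Every device simply collects (seed, value) pairs until it has enough; the server never needs to know who received what.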
In P2P, each peer has a random subset of encoded symbols. Any peer's symbols are useful to any other — no "rare piece" problem (unlike BitTorrent).
Each server generates independent encoded symbols. Client collects from all servers — no coordination needed!
Store encoded symbols across n nodes. Any k+ε nodes suffice to recover the data. Nodes can fail and be replaced without coordination.
| Method | Storage | Tolerance |
|---|---|---|
| 3x Replication | 3x | 2 failures |
| RS (10,4) | 1.4x | 4 of 14 failures |
| Fountain (k=10) | ~1.4x | Any 30% failures |
Fountain codes achieve the same reliability as RS with the added benefit of no fixed code rate.
When a node fails, generating a new encoded symbol requires reading k source symbols — expensive bandwidth!
This spawned research into regenerating codes and locally repairable codes — important extensions for storage systems.
Cloud storage systems (Azure, Google) use specialized erasure codes optimized for repair bandwidth.
Include the original k source symbols as the first k encoded symbols. Repair symbols are generated on demand.
Advantage: No decoding needed if no erasures!
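A minimal sketch of this systematic construction (the repair-symbol degree draw is illustrative, not an optimized distribution):

```python
import numpy as np

rng = np.random.default_rng(7)

def systematic_fountain(source, n_repair, avg_degree=3):
    """Systematic fountain sketch: emit the k source symbols verbatim as
    degree-1 encoded symbols, then append random XOR repair symbols.
    The repair-degree draw is illustrative, not an optimized distribution.
    """
    k = len(source)
    out = [(s, [i]) for i, s in enumerate(source)]       # systematic part
    for _ in range(n_repair):
        d = int(rng.integers(1, 2 * avg_degree))         # degree in 1..5
        nbrs = sorted(rng.choice(k, size=min(d, k), replace=False).tolist())
        v = 0
        for i in nbrs:
            v ^= source[i]
        out.append((v, nbrs))
    return out

symbols = systematic_fountain([5, 17, 42, 99], n_repair=3)
# With no erasures, a receiver just reads the first k symbols directly.
```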
Standard fountain codes assume erasure channels. Extensions handle noisy channels by combining with soft-decision inner codes.
Weight the degree distribution to protect important data (e.g., video base layer) more heavily than less critical data.
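One simple way to realize this is to bias neighbor selection toward the important prefix of the source symbols (the `boost` factor below is a hypothetical illustration, not from any standard):

```python
import numpy as np

rng = np.random.default_rng(3)

def uep_pick_neighbors(k, d, important, boost=3.0):
    """Unequal error protection sketch: bias neighbor selection so the
    first `important` source symbols appear in more encoded symbols.
    The `boost` weighting is a hypothetical illustration."""
    weights = np.ones(k)
    weights[:important] *= boost
    p = weights / weights.sum()
    return sorted(rng.choice(k, size=d, replace=False, p=p).tolist())

# Important symbols (indices 0-1) end up in far more encoded symbols,
# so they are recovered earlier and survive higher loss rates.
picks = [uep_pick_neighbors(10, 2, important=2) for _ in range(1000)]
```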
Random linear network coding is essentially a fountain code applied at every relay node in a network. Deep connection between the two fields.
Fountain codes represent a fundamental shift in coding philosophy: from "design a fixed code for a known channel" to "generate symbols on demand and let the receiver decide when it has enough."
| Year | Milestone |
|---|---|
| 1998 | Digital fountain concept (Byers, Luby, Mitzenmacher) |
| 2002 | LT codes (Luby) |
| 2006 | Raptor codes theory (Shokrollahi) |
| 2007 | RFC 5053: Raptor codes |
| 2012 | RFC 6330: RaptorQ |
| 2017 | ATSC 3.0 adopts Raptor |
Rateless: no rate decision needed
Universal: works for any erasure rate
Efficient: O(k) encoding and decoding
Near-optimal: overhead < 2 symbols
Fountain codes for IoT/sensor networks, coded computation, blockchain scalability, and DNA data storage (where read order is inherently random — a perfect fit for fountain codes).