Coding Theory Series — 11

Fountain &
Rateless Codes

Codes that generate an endless stream of encoded symbols — collect enough drops from the fountain, and you can reconstruct the original message.

Mathematics Code History Applications

Roadmap

Motivation

The digital fountain metaphor and why fixed-rate codes fall short.

LT Codes

Luby Transform codes — the first practical fountain code.

Raptor Codes

Linear-time encoding and decoding with pre-coding.

Theory

Degree distributions, soliton distributions, and analysis.

Applications

Streaming, broadcast, multicast, and distributed storage.

Standards

RaptorQ (RFC 6330) and real-world deployment.

The Digital Fountain Metaphor

History

The Idea

Imagine a fountain that continuously sprays water drops. Each drop carries a tiny bit of information. You hold a cup under the fountain and collect drops.

It doesn't matter which drops you catch — once you've collected enough, you can reconstruct the message.

A fountain code generates a potentially unlimited number of encoded symbols from k source symbols. Any (1+ε)k received symbols suffice for decoding.

Why This Is Revolutionary

No feedback needed: Sender doesn't need to know what was received.

No rate decision: No need to predict the channel quality in advance.

Universal: Works for any erasure rate without code redesign.

Scalable: One sender, millions of receivers with different loss rates.

First articulated by John Byers, Michael Luby, Michael Mitzenmacher, and Ashutosh Rege (1998).

Why Fixed-Rate Codes Struggle

Mathematics

The Broadcast Problem

A satellite broadcasts data to millions of receivers. Each has a different packet loss rate (1% to 30%).

With a fixed-rate code (e.g., rate 1/2 RS code):

  • Wasteful for good channels (1% loss)
  • Insufficient for bad channels (30% loss)
  • Can't adapt without feedback

The Multicast Problem

10 receivers, each missing different 10% of packets. With fixed codes, you retransmit specific missing packets — but what's missing varies per receiver.

Retransmissions needed ≈ N · ln N to deliver all N packets

This is the "coupon collector" problem — terribly inefficient!
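A quick simulation makes the inefficiency concrete (a sketch; `retransmissions_needed` and its parameters are illustrative, not from any library):

```python
import random

def retransmissions_needed(n_packets, n_receivers, loss=0.1, seed=0):
    """Naive multicast repair: keep rebroadcasting every packet that some
    receiver still lacks until all receivers have everything."""
    rng = random.Random(seed)
    # Each receiver independently misses each packet with probability `loss`.
    missing = [{p for p in range(n_packets) if rng.random() < loss}
               for _ in range(n_receivers)]
    sends = 0
    while any(missing):
        for p in set().union(*missing):
            sends += 1                      # one rebroadcast reaches everyone,
            for r in missing:
                if p in r and rng.random() >= loss:
                    r.discard(p)            # unless it is lost again
    return sends

print(retransmissions_needed(n_packets=1000, n_receivers=10))
```

Even though each receiver misses only ~100 packets, the union of missing packets across receivers forces hundreds of rebroadcasts.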

Fountain codes solve both problems: every encoded symbol is useful to every receiver, regardless of which symbols they've already received.

The Binary Erasure Channel (BEC)

Mathematics

Channel Model

The BEC is the natural channel for fountain codes:

Each transmitted symbol is either received perfectly or erased (lost entirely).

Erasure probability = ε

[Channel diagram: input 0 → output 0 with probability 1−ε; input 1 → output 1 with probability 1−ε; either input → erasure symbol "?" with probability ε]

Capacity of BEC

C = 1 - ε

Beautifully simple! To transmit k bits, you need at least k/(1-ε) channel uses.
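A small simulation (the helper `channel_uses_until_k` is ours, for illustration) shows the k/(1-ε) bound in action:

```python
import random

def channel_uses_until_k(k, eps, seed=0):
    """Count BEC(eps) transmissions until k symbols arrive unerased."""
    rng = random.Random(seed)
    uses = received = 0
    while received < k:
        uses += 1
        if rng.random() >= eps:   # a symbol survives with probability 1 - eps
            received += 1
    return uses

k, eps = 1000, 0.2
print("lower bound k/(1-eps):", k / (1 - eps))
print("simulated channel uses:", channel_uses_until_k(k, eps))
```

The simulated count hovers just above the capacity bound of 1250 uses.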

Real-World BECs

Internet packets (TCP/UDP): lost or received

Broadcast/multicast: missed or received

Wireless: fading below threshold = erasure

Storage: sector failures = erasures

Random Linear Fountain Codes

Mathematics

The Simplest Fountain Code

Each encoded symbol is a random linear combination of source symbols over GF(2):

y_i = ⊕_{j ∈ S_i} x_j

where S_i is a random subset of {1, 2, ..., k}.

Encoding

1. Choose a random binary vector g_i of length k

2. Compute y_i = g_i · x (inner product mod 2)

3. Transmit (y_i, g_i), or a seed from which g_i can be regenerated

Decoding

Collect n ≥ k symbols. Form matrix G (n × k). Solve Gx = y by Gaussian elimination.

Succeeds w.h.p. when n ≈ k + O(1).
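A minimal sketch of this scheme (the names `rlf_encode`/`rlf_decode` are ours; masks are packed as k-bit integers so GF(2) elimination is just XOR on integers):

```python
import random

def rlf_encode(source_bits, n, seed=0):
    """Random linear fountain over GF(2): emit n pairs (mask, value) where
    mask is a random nonzero k-bit row and value is the parity of the
    selected source bits."""
    k = len(source_bits)
    packed = sum(b << i for i, b in enumerate(source_bits))
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        mask = rng.getrandbits(k) or 1           # avoid the all-zero row
        out.append((mask, bin(mask & packed).count("1") & 1))
    return out

def rlf_decode(symbols, k):
    """Gaussian elimination on bitmask rows; None if rank-deficient."""
    pivots = {}                                  # leading column -> (mask, value)
    for mask, val in symbols:
        for col in sorted(pivots, reverse=True):
            if (mask >> col) & 1:
                pm, pv = pivots[col]
                mask ^= pm
                val ^= pv
        if mask:
            pivots[mask.bit_length() - 1] = (mask, val)
    if len(pivots) < k:
        return None
    bits = [0] * k
    for col in sorted(pivots):                   # lower columns solved first
        mask, val = pivots[col]
        for c in range(col):
            if (mask >> c) & 1:
                val ^= bits[c]
        bits[col] = val
    return bits

src = [1, 0, 1, 1, 0, 0, 1, 0]
decoded = rlf_decode(rlf_encode(src, 30, seed=3), len(src))
print(decoded == src)
```

With a healthy surplus of symbols the random matrix is full rank with overwhelming probability, and decoding recovers the source exactly.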

Problem: Encoding is O(k) per symbol, decoding is O(k³) — impractical for large k! This motivated the search for structured fountain codes.

LT Codes: The First Practical Fountain Code

History Mathematics

Michael Luby introduced LT (Luby Transform) codes in 2002 — the first rateless erasure code with efficient encoding and decoding. The key insight: use a sparse random generator instead of a dense one.

LT Encoding

For each encoded symbol:

1. Sample degree d from distribution Ω(d)

2. Choose d source symbols uniformly at random

3. XOR them together: y = x_{i1} ⊕ x_{i2} ⊕ ... ⊕ x_{id}

The Magic: Degree Distribution

Not all degrees are equally useful. The degree distribution Ω must be carefully designed for efficient decoding.

LT Decoding: Peeling

1. Find an encoded symbol with degree 1 (covers one source symbol)

2. That source symbol is now known — "peel" it off

3. XOR it from all other encoded symbols that include it

4. This may create new degree-1 symbols — repeat!

Like a chain reaction: one decoded symbol triggers others.

The Soliton Distribution

Mathematics

Ideal Soliton Distribution

ρ(1) = 1/k
ρ(d) = 1/(d(d-1))   for d = 2, ..., k

In expectation, exactly one encoded symbol has degree 1, and peeling releases exactly one new degree-1 symbol at each step.
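The distribution is properly normalized because 1/(d(d−1)) = 1/(d−1) − 1/d telescopes, so the tail sums to 1 − 1/k. A two-line check:

```python
k = 1000
# Ideal soliton: rho(1) = 1/k, rho(d) = 1/(d(d-1)) for d = 2..k
rho = [1.0 / k] + [1.0 / (d * (d - 1)) for d in range(2, k + 1)]
# Telescoping: sum_{d=2}^{k} 1/(d(d-1)) = 1 - 1/k, plus rho(1) = 1/k
print(sum(rho))
```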

The ideal soliton fails in practice: variance is too high. The peeling process often stalls because no degree-1 symbols are available.

Robust Soliton Distribution

Luby's fix: add extra degree-1 symbols and boost certain degrees:

μ(d) = (ρ(d) + τ(d)) / Z

where τ(d) adds "spikes" and Z normalizes. Parameters δ (failure probability) and c (tuning constant) control the trade-off.

Trade-offs

Parameter            Effect
More degree-1        Reliable start, higher overhead
Higher avg degree    Better coverage, slower encoding
Optimal choice       O(k · ln(k/δ)) symbol operations to decode

LT Coding: Visual Example

Mathematics

Source: 4 symbols [A, B, C, D]


Encoded Symbols

Symbol  Degree  Composition
y1      1       B
y2      2       A ⊕ C
y3      3       A ⊕ B ⊕ D
y4      2       C ⊕ D
y5      1       A

Peeling Decode

Step 1: y1 has degree 1 → B = y1

Step 2: y5 has degree 1 → A = y5

Step 3: Remove A from y2 → y2' = C (degree 1!) → C = y2'

Step 4: Remove C from y4 → y4' = D (degree 1!) → D = y4'

All 4 source symbols recovered from 5 encoded symbols!

LT Encoding in Python

Code
import numpy as np
from math import log, sqrt, ceil

def robust_soliton(k, c=0.1, delta=0.5):
    """Generate robust soliton distribution for k source symbols."""
    # Ideal soliton
    rho = [0] * (k + 1)
    rho[1] = 1.0 / k
    for d in range(2, k + 1):
        rho[d] = 1.0 / (d * (d - 1))

    # Tau component
    S = c * sqrt(k) * log(k / delta)
    tau = [0] * (k + 1)
    pivot = max(1, int(k / S))  # spike location k/S, clamped to a valid degree
    for d in range(1, min(pivot, k) + 1):
        tau[d] = S / (k * d)
    if pivot <= k:
        tau[pivot] = S * log(S / delta) / k

    # Normalize
    total = sum(rho[d] + tau[d] for d in range(1, k + 1))
    mu = [(rho[d] + tau[d]) / total for d in range(k + 1)]
    return mu[1:]  # degrees 1 to k

def lt_encode(source_symbols, n_encoded, degree_dist):
    """
    LT encode: generate n_encoded symbols from source.
    Returns list of (encoded_value, neighbors).
    """
    k = len(source_symbols)
    degrees = list(range(1, k + 1))
    encoded = []

    for _ in range(n_encoded):
        # Sample degree from distribution
        d = np.random.choice(degrees, p=degree_dist)
        # Choose d random source symbols
        neighbors = sorted(np.random.choice(k, size=d, replace=False))
        # XOR them together
        value = 0
        for idx in neighbors:
            value ^= source_symbols[idx]
        encoded.append((value, neighbors))

    return encoded

def lt_decode(encoded_symbols, k):
    """Peeling decoder for LT codes."""
    decoded = [None] * k
    remaining = [(val, list(nbrs)) for val, nbrs in encoded_symbols]

    progress = True
    while progress:
        progress = False
        for i, (val, nbrs) in enumerate(remaining):
            if len(nbrs) == 1:
                idx = nbrs[0]
                if decoded[idx] is None:
                    decoded[idx] = val
                    progress = True
                    # Remove this source from all other symbols
                    for j, (v2, n2) in enumerate(remaining):
                        if idx in n2:
                            remaining[j] = (v2 ^ val, [x for x in n2 if x != idx])
                remaining[i] = (0, [])  # consumed

    return decoded

# Demo
k = 100
source = np.random.randint(0, 256, k).tolist()
dist = robust_soliton(k)
encoded = lt_encode(source, int(k * 1.2), dist)  # 20% overhead
result = lt_decode(encoded, k)
recovered = sum(1 for x in result if x is not None)
print(f"Recovered {recovered}/{k} source symbols")

LT Code Performance Analysis

Mathematics

Overhead

With robust soliton distribution, LT codes need:

n = k + O(√k · ln²(k/δ))

symbols to decode k source symbols with probability 1-δ.

Overhead ε = n/k - 1 ≈ 5-10% for practical k.

Encoding Cost

Average degree under robust soliton: O(ln(k/δ))

Each encoded symbol: O(ln k) XOR operations

Total encoding: O(k ln k) — near-linear!

Decoding Cost

Peeling decoder: each source symbol "ripples" through O(ln k) encoded symbols.

Total decoding: O(k ln k)

LT code limitation: The O(√k) overhead term means LT codes have non-vanishing overhead. For k = 10,000, you need ~5-10% extra symbols. Raptor codes fix this.

Comparison

Code           Overhead      Decode
Random linear  O(1)          O(k³)
LT             O(√k · ln²k)  O(k ln k)
Raptor         O(1)          O(k)

Raptor Codes: Linear-Time Fountain Codes

History Mathematics

Amin Shokrollahi (2006) invented Raptor codes — achieving O(k) encoding/decoding with constant overhead. The name: Rapid Tornado + LT.

The Key Idea: Pre-Coding

First apply a high-rate pre-code (inner code) to the k source symbols, producing n' = k(1+ε) intermediate symbols.

Then apply an LT code with weak degree distribution (low average degree) to the intermediate symbols.

The pre-code handles the symbols that LT misses!

k source symbols → Pre-code (LDPC/Hamming) → n' intermediate symbols → LT code (weak) → encoded stream

Why This Works

The weak LT code recovers ~95-99% of intermediate symbols. The pre-code (e.g., LDPC) fills in the remaining 1-5%.

Because the LT code is weak, its average degree is O(1) — constant! This gives linear-time encoding and decoding.
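The mechanism can be sketched end-to-end. This is a toy construction with invented parameters — a random sparse XOR pre-code plus a constant-degree LT layer, not the RFC 6330 design. The trick: each pre-code constraint ("parity_j XOR its sources = 0") is fed to the same peeling decoder as a known-zero equation, so it can resolve symbols the LT layer missed.

```python
import random

def peel(equations, m):
    """Peeling decoder. Each equation is (value, set of symbol indices);
    an equation with exactly one unknown neighbor resolves that neighbor."""
    known = {}
    progress = True
    while progress and len(known) < m:
        progress = False
        for val, nbrs in equations:
            unknown = nbrs - known.keys()
            if len(unknown) == 1:
                for i in nbrs & known.keys():
                    val ^= known[i]
                known[unknown.pop()] = val
                progress = True
    return known

def toy_raptor(k=60, n_parity=12, n_lt=80, seed=2):
    rng = random.Random(seed)
    source = [rng.randrange(256) for _ in range(k)]
    # Pre-code: each parity symbol XORs a small random set of sources.
    parity_sets = [rng.sample(range(k), 6) for _ in range(n_parity)]
    inter = list(source)
    for s in parity_sets:
        v = 0
        for i in s:
            v ^= source[i]
        inter.append(v)
    m = k + n_parity
    # Weak LT layer: constant, low average degree.
    lt = []
    for _ in range(n_lt):
        nbrs = set(rng.sample(range(m), rng.choice([1, 2, 3, 4, 4, 5])))
        v = 0
        for i in nbrs:
            v ^= inter[i]
        lt.append((v, nbrs))
    # Each pre-code constraint is a known-zero equation over intermediates.
    constraints = [(0, set(s) | {k + j}) for j, s in enumerate(parity_sets)]
    return len(peel(lt, m)) / m, len(peel(lt + constraints, m)) / m

lt_only, with_precode = toy_raptor()
print(f"LT alone: {lt_only:.0%} recovered; with pre-code: {with_precode:.0%}")
```

Adding equations can only create more peeling opportunities, so the combined decode always recovers at least as much as the LT layer alone.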

Raptor Code Architecture

Mathematics

Pre-Code Options

Pre-code      Rate       Decoding
LDPC          0.95-0.99  Belief propagation
Hamming-like  ~0.95      Bounded distance
HDPC          ~0.97      Gaussian elimination
Combination   ~0.96      RaptorQ approach

LT Component

Much simpler than standalone LT:

Average degree: O(1) (e.g., ~4-5)

Distribution: optimized for pre-code residual

No need for heavy tail of soliton

Raptor Decoding

Receive n symbols → LT peeling recovers ~95% → pre-code decoding fills the gaps → all k source symbols recovered

Total overhead: k(1+ε) symbols suffice with ε < 0.05 for practical implementations. Encoding and decoding are both O(k).

RaptorQ: The State of the Art

Standards Applications

RFC 6330 (2012)

RaptorQ is the IETF standard implementation of Raptor codes. It's the most powerful erasure code deployed at scale.

Source symbols  1 to 56,403
Symbol size     Arbitrary (bytes)
Overhead        < 2 extra symbols
Failure prob.   < 10⁻⁶ at k+2 symbols
Encoding speed  ~1 Gbps

RaptorQ Pre-Code

Uses a sophisticated combination:

1. LDPC: Sparse parity checks for bulk error correction

2. HDPC: Half-density parity checks (dense GF(256) rows) for guaranteed recovery

3. LT: Degree distribution optimized for the specific pre-code

RaptorQ achieves near-optimal overhead: decoding usually succeeds from exactly k received symbols, and two extra symbols push the failure probability below 10⁻⁶. Essentially zero waste.

Fountain Codes vs ARQ

Applications
Property              ARQ (TCP-style)             Fountain code
Feedback required?    Yes (ACK/NACK)              No
Multicast efficiency  Poor (N separate streams)   Excellent (one stream)
Latency               2+ RTTs per loss event      One one-way delay
Bandwidth overhead    Feedback + retransmissions  ~2-5% coding overhead
Adapts to loss rate?  Reactively (slow)           Automatically
Deep-space viable?    No (30-minute RTTs!)        Yes

Fountain codes are ideal when feedback is expensive (satellite), impossible (broadcast), or when serving heterogeneous receivers with different loss rates.

Streaming & Broadcast Applications

Applications

MBMS / eMBMS

3GPP Multimedia Broadcast/Multicast Service uses Raptor codes (RFC 5053) for file delivery and streaming to mobile devices.

One base station broadcasts; thousands of phones decode. Each phone misses different packets but all recover.

DVB-H / ATSC 3.0

Mobile TV standards use Raptor codes as the application-layer FEC, complementing physical-layer coding (LDPC/turbo).

ATSC 3.0 (next-gen US digital TV) supports Raptor for robust mobile reception.

FLUTE / ROUTE

IETF file delivery over unidirectional transport. RaptorQ provides reliable file delivery without feedback.

Used in satellite content distribution and emergency broadcasting.

Video Streaming

Low-latency video: Raptor codes protect against packet loss without retransmission delay.

Particularly valuable for live streaming where retransmission would violate latency budgets.

Content Distribution & CDNs

Applications

Software Updates

Distributing a 1 GB update to millions of devices:

Without fountain: Each device needs the exact missing pieces. Server tracks state per device.

With fountain: Server sprays encoded symbols. Every symbol helps every device. No per-device state.

Peer-to-Peer

In P2P, each peer has a random subset of encoded symbols. Any peer's symbols are useful to any other — no "rare piece" problem (unlike BitTorrent).

Multi-Source Download

Server A
Server B
Server C

↓ different encoded symbols ↓

Client

Each server generates independent encoded symbols. Client collects from all servers — no coordination needed!
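A sketch of the idea with a random linear fountain (server names and seeds are invented; each "server" is simply an independent RNG, and the client pools all streams into one GF(2) solve):

```python
import random

def server_stream(packed, k, n, seed):
    """One server's independent fountain stream of (mask, parity) pairs."""
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        mask = rng.getrandbits(k) or 1
        out.append((mask, bin(mask & packed).count("1") & 1))
    return out

def solve_gf2(symbols, k):
    """Gaussian elimination over GF(2); returns bit list or None."""
    pivots = {}
    for mask, val in symbols:
        for col in sorted(pivots, reverse=True):
            if (mask >> col) & 1:
                pm, pv = pivots[col]
                mask, val = mask ^ pm, val ^ pv
        if mask:
            pivots[mask.bit_length() - 1] = (mask, val)
    if len(pivots) < k:
        return None
    bits = [0] * k
    for col in sorted(pivots):
        mask, val = pivots[col]
        for c in range(col):
            if (mask >> c) & 1:
                val ^= bits[c]
        bits[col] = val
    return bits

k = 16
rng = random.Random(7)
data = [rng.randrange(2) for _ in range(k)]
packed = sum(b << i for i, b in enumerate(data))

# Three independent servers, no coordination -- just different seeds.
stream = []
for seed in (101, 202, 303):
    stream += server_stream(packed, k, 10, seed)   # 10 symbols from each

print(solve_gf2(stream, k) == data)
```

Because every symbol is an independent random equation, symbols from different servers combine as usefully as symbols from one.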

Fountain Codes for Distributed Storage

Applications Mathematics

Store encoded symbols across n nodes. Any (1+ε)k of them suffice to recover the data. Nodes can fail and be replaced without coordination.

Replication vs Erasure Coding

Method           Storage  Failure tolerance
3x replication   3x       2 node failures
RS (10,4)        1.4x     any 4 of 14 nodes
Fountain (k=10)  ~1.4x    any ~30% of nodes

Fountain codes achieve the same reliability as RS with the added benefit of no fixed code rate.

Repair Problem

When a node fails, generating a new encoded symbol requires reading k source symbols — expensive bandwidth!

This spawned research into regenerating codes and locally repairable codes — important extensions for storage systems.

Cloud storage systems (Azure, Google) use specialized erasure codes optimized for repair bandwidth.

Beyond Basic Fountain Codes

Mathematics

Systematic Raptor Codes

Include the original k source symbols as the first k encoded symbols. Repair symbols are generated on demand.

Advantage: No decoding needed if no erasures!
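A minimal sketch of the systematic idea (illustrative only — `systematic_stream`, the fixed degree-3 repair symbols, and the tagging scheme are ours, not the RFC 6330 systematic encoding):

```python
import random

def systematic_stream(source, n_repair, seed=0):
    """Emit the k source symbols verbatim, then XOR repair symbols."""
    rng = random.Random(seed)
    k = len(source)
    stream = [("src", i, source[i]) for i in range(k)]
    for _ in range(n_repair):
        nbrs = sorted(rng.sample(range(k), 3))   # each repair XORs 3 sources
        v = 0
        for i in nbrs:
            v ^= source[i]
        stream.append(("repair", nbrs, v))
    return stream

src = [5, 17, 9, 44, 2, 63]
stream = systematic_stream(src, 4, seed=1)
# No erasures: the receiver reads the first k symbols directly -- no decoding.
print([v for _, _, v in stream[:len(src)]] == src)
```

Repair symbols are only consulted when something is actually lost, which is exactly why systematic codes are the default in deployed standards.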

Fountain over Noisy Channels

Standard fountain codes assume erasure channels. Extensions handle noisy channels by combining with soft-decision inner codes.

Unequal Error Protection

Weight the degree distribution to protect important data (e.g., video base layer) more heavily than less critical data.

Network Coding Connection

Random linear network coding is essentially a fountain code applied at every relay node in a network. Deep connection between the two fields.

Fountain codes represent a fundamental shift in coding philosophy: from "design a fixed code for a known channel" to "generate symbols on demand and let the receiver decide when it has enough."

Fountain Codes: Summary & Future

History Applications

Timeline

1998  Digital fountain concept (Byers, Luby, Mitzenmacher)
2002  LT codes (Luby)
2006  Raptor codes theory (Shokrollahi)
2007  RFC 5053: Raptor codes
2012  RFC 6330: RaptorQ
2017  ATSC 3.0 adopts Raptor

Key Takeaways

Rateless: no rate decision needed

Universal: works for any erasure rate

Efficient: O(k) encoding and decoding

Near-optimal: overhead < 2 symbols

Future Directions

Fountain codes for IoT/sensor networks, coded computation, blockchain scalability, and DNA data storage (where read order is inherently random — a perfect fit for fountain codes).