The first codes provably achieving channel capacity with efficient encoding and decoding — from Arikan's elegant theory to the 5G control channel.
Arikan's 2009 breakthrough and the concept of channel polarization.
How channels split into perfect and useless — the core phenomenon.
Choosing information bits and the Kronecker product encoding.
Successive cancellation, list decoding, and CRC-aided variants.
How polar codes won the battle for 5G control channels.
Polar vs LDPC vs turbo — strengths and trade-offs.
Since Shannon (1948), the field searched for codes that: (1) achieve channel capacity, (2) with a rigorous proof, (3) with efficient encoding, and (4) with efficient decoding.
For 60 years, no code satisfied all four conditions simultaneously.
Professor at Bilkent University, Turkey. His 2009 paper "Channel Polarization: A Method for Constructing Capacity-Achieving Codes for Symmetric Binary-Input Discrete Memoryless Channels" solved the problem.
Polar codes are the first family of codes with a rigorous proof of achieving Shannon capacity (on symmetric binary-input memoryless channels) with polynomial encoding and decoding complexity.
Arikan received the Claude E. Shannon Award and the IEEE Richard W. Hamming Medal (both in 2019) for this contribution.
Start with N copies of a channel W. Apply a specific linear transform. The resulting N "virtual channels" polarize: they become either nearly perfect (capacity → 1) or nearly useless (capacity → 0).
Two uses of channel W with capacity I(W) are transformed into:
W-: Channel for u1 (seeing only y1, y2), with I(W-) ≤ I(W): worse than W
W+: Channel for u2 (seeing y1, y2, u1), with I(W+) ≥ I(W): better than W
Total capacity is preserved: I(W-) + I(W+) = 2·I(W). But:
One channel gets better, the other gets worse. Capacity is redistributed, not created or destroyed.
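For the binary erasure channel this split has a closed form: BEC(ε) polarizes into BEC(2ε − ε²) and BEC(ε²). A minimal numeric check (the channel parameter here is illustrative):

```python
# One polarization step for a BEC with erasure probability eps.
# For the BEC: W- is BEC(2*eps - eps^2), W+ is BEC(eps^2),
# and capacity is I = 1 - eps.
def polarize_bec(eps):
    return 2 * eps - eps**2, eps**2   # (worse channel, better channel)

eps = 0.5                      # illustrative channel: BEC(0.5), I(W) = 0.5
e_minus, e_plus = polarize_bec(eps)
I_minus, I_plus = 1 - e_minus, 1 - e_plus
print(I_minus, I_plus)         # capacities of W- and W+
print(I_minus + I_plus)        # equals 2 * I(W): capacity is conserved
```

For ε = 0.5 the two virtual channels have capacities 0.25 and 0.75, summing to 2 × 0.5 as the conservation law requires.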
Apply the basic transform recursively n = log2(N) times. After n levels, N = 2^n virtual channels emerge.
As N → ∞, the fraction of channels with capacity near 1 approaches I(W), and the fraction with capacity near 0 approaches 1 - I(W).
| N | Good (>0.99) | Bad (<0.01) | Middle |
|---|---|---|---|
| 4 | 1 (25%) | 1 (25%) | 2 (50%) |
| 16 | 5 (31%) | 5 (31%) | 6 (38%) |
| 256 | 107 (42%) | 107 (42%) | 42 (16%) |
| 4096 | 1946 (48%) | 1946 (48%) | 204 (5%) |
| ∞ | 50% | 50% | 0% |
Polarization converges slowly: the un-polarized "middle" channels vanish, and the block error probability decays roughly as O(2^(-N^β)) for any β < 1/2 (the Arikan-Telatar rate-of-polarization result). This means polar codes need long block lengths for best performance.
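The shrinking-middle trend can be checked numerically. A short sketch for BEC(0.5) using the exact BEC recursion for the Bhattacharyya parameter (the exact counts depend on the channel and the thresholds chosen, so they need not match the table above):

```python
# Count polarized virtual channels for BEC(0.5) at several block
# lengths, using the exact BEC recursion Z- = 2Z - Z^2, Z+ = Z^2.
# For the BEC, capacity = 1 - Z, so Z < 0.01 means capacity > 0.99.
def bec_z_values(n_levels, eps=0.5):
    z = [eps]
    for _ in range(n_levels):
        z = [v for zi in z for v in (2 * zi - zi**2, zi**2)]
    return z

for n in (2, 4, 8, 12):           # N = 4, 16, 256, 4096
    z = bec_z_values(n)
    N = len(z)
    good = sum(1 for zi in z if zi < 0.01)   # capacity near 1
    bad = sum(1 for zi in z if zi > 0.99)    # capacity near 0
    print(f"N={N}: good={good}, bad={bad}, middle={N - good - bad}")
```

By the symmetry of BEC(0.5), the good and bad counts are equal at every N, and the middle fraction shrinks steadily as N grows.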
Z(W) ∈ [0, 1] quantifies channel reliability:
Z(W) = 0: Perfect channel (noiseless)
Z(W) = 1: Useless channel (pure noise)
Upper bounds error probability: Pe ≤ Z(W)
Under the transform, Z(W+) = Z(W)² and Z(W-) ≤ 2Z(W) − Z(W)², with equality for the BEC: the "minus" channel degrades, while the "plus" channel improves quadratically.
For each of the N virtual channels, compute Z(W_N^(i)). Sort by reliability.
Best channels (lowest Z) carry information bits.
Worst channels (highest Z) carry frozen bits (set to 0).
The Bhattacharyya parameter gives us a simple, recursive way to identify which channels are reliable — this is how we construct polar codes.
For a polar code (N, K):
1. Compute Z(W_N^(i)) for all i = 0, ..., N-1
2. Choose K channels with smallest Bhattacharyya parameter
3. These carry information bits; the rest are "frozen" (set to known values, usually 0)
| Method | Complexity | Optimality |
|---|---|---|
| Bhattacharyya | O(N) | Exact for BEC |
| Density evolution | O(N · M) | Near-optimal AWGN |
| Gaussian approx. | O(N) | Good approx. |
| Monte Carlo | Slow | Channel-specific |
| 5G NR tables | O(1) lookup | Pre-computed |
Frozen bits are known to both encoder and decoder. They carry no information but are crucial for the decoder to work.
Think of them as "scaffolding" — they support the decoding process but add no data.
Rate = K/N = fraction of good channels used
The 2×2 kernel F = [[1, 0], [1, 1]] is the fundamental building block. The Kronecker power F^⊗n creates the N×N generator matrix.
In Arikan's original formulation the generator is G_N = B_N · F^⊗n, where B_N is a bit-reversal permutation matrix; many implementations absorb or omit B_N and use G_N = F^⊗n directly.
Encoding has a recursive butterfly structure (like FFT):
Each stage: XOR pairs of values. Total: n stages of N/2 XORs each.
The encoding is simply: set frozen positions to 0, place information bits in good positions, then multiply by GN (using the butterfly).
```python
import numpy as np

def polar_encode(info_bits, frozen_positions, N):
    """
    Polar encoder using butterfly (Kronecker) structure.
    info_bits: K information bits
    frozen_positions: set of frozen bit indices
    N: code length (must be power of 2)
    """
    n = int(np.log2(N))
    assert N == 2**n, "N must be a power of 2"
    # Place bits: frozen = 0, info = data
    u = np.zeros(N, dtype=int)
    info_idx = 0
    for i in range(N):
        if i not in frozen_positions:
            u[i] = info_bits[info_idx]
            info_idx += 1
    # Butterfly encoding (in-place Kronecker product)
    x = u.copy()
    for stage in range(n):
        step = 2 ** (stage + 1)
        half = step // 2
        for i in range(0, N, step):
            for j in range(half):
                x[i + j] ^= x[i + j + half]
                # x[i + j + half] stays the same
    return x

def construct_polar_code_bec(N, K, epsilon):
    """
    Construct polar code for BEC(epsilon).
    Returns (frozen positions, Bhattacharyya parameters).
    """
    n = int(np.log2(N))
    # Compute Bhattacharyya parameters recursively
    z = [epsilon]  # Start with single channel
    for stage in range(n):
        new_z = []
        for zi in z:
            z_minus = 2 * zi - zi**2  # Worse channel
            z_plus = zi**2            # Better channel
            new_z.append(z_minus)
            new_z.append(z_plus)
        z = new_z
    # Sort channels by reliability (lower Z = more reliable)
    indexed = [(z[i], i) for i in range(N)]
    indexed.sort()
    # Best K channels carry information; rest are frozen
    frozen = set(idx for _, idx in indexed[K:])
    return frozen, z

# Example: (8, 4) polar code for BEC(0.5)
N, K = 8, 4
frozen, bhatt = construct_polar_code_bec(N, K, epsilon=0.5)
print(f"Frozen positions: {sorted(frozen)}")
print(f"Info positions: {sorted(set(range(N)) - frozen)}")
info = np.array([1, 0, 1, 1])
codeword = polar_encode(info, frozen, N)
print(f"Info bits: {info}")
print(f"Codeword: {codeword}")
```
SC decoding estimates bits one at a time, from u_0 to u_{N-1}, using previously decoded bits as side information, mimicking the genie that knows u_0, ..., u_{i-1} in the polarization analysis.
For each bit position i = 0, 1, ..., N-1:
If i is frozen: set û_i = 0 (known).
If i is an information bit: compute the LLR for u_i from the received signal y and the previously decoded bits û_0, ..., û_{i-1}. Decide: û_i = 0 if LLR ≥ 0, else 1.
Same butterfly structure as encoding, but processing LLRs instead of bits. Like a "reverse FFT" on log-likelihoods.
f-function (check-node-like): combines two LLRs, f(a, b) ≈ sign(a)·sign(b)·min(|a|, |b|) in the min-sum form
g-function (variable-node-like): uses a decoded bit û to refine, g(a, b, û) = b + (1 − 2û)·a
SC decoding is sequential — each bit depends on all previous bits. This makes it inherently serial and limits throughput. Also, a single wrong decision propagates errors.
Instead of making a single hard decision at each step, SCL maintains a list of L candidate paths. At each information bit, the list doubles (try both 0 and 1), then prune back to L best paths.
1. Start with one path (all frozen bits known)
2. At frozen bit: extend all paths with known value
3. At info bit: split each path into two (0 and 1)
4. Keep only the L most likely paths (by path metric)
5. After all N bits, select the best surviving path
| L | Performance | Complexity |
|---|---|---|
| 1 | SC (baseline) | O(N log N) |
| 4 | ~0.5 dB gain | 4x SC |
| 8 | ~0.7 dB gain | 8x SC |
| 32 | ~1.0 dB gain | 32x SC |
| ML bound | Optimal | Exponential |
SCL with L=32 approaches maximum-likelihood decoding performance for practical block lengths. The gain over basic SC is substantial — this made polar codes competitive.
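The list procedure above can be sketched in code. This is a didactic decoder, not an efficient one: each path re-derives its decision LLR from scratch via the SC recursion (roughly O(L·N²·log N) work, versus O(L·N·log N) for a real implementation), and it uses the common |LLR| path-metric approximation. The (8, 4) example reuses the frozen set constructed earlier; all helper names are illustrative:

```python
import numpy as np

def butterfly(bits):
    """Re-encode u -> u * F^(kron n) with the XOR butterfly."""
    x = list(bits)
    step = 2
    while step <= len(x):
        half = step // 2
        for i in range(0, len(x), step):
            for j in range(half):
                x[i + j] ^= x[i + j + half]
        step *= 2
    return x

def bit_llr(llrs, prefix):
    """Decision LLR for bit u_i, i = len(prefix), given decided bits."""
    if len(llrs) == 1:
        return llrs[0]
    half = len(llrs) // 2
    if len(prefix) < half:                    # still inside the left subtree
        f = (np.sign(llrs[:half]) * np.sign(llrs[half:])
             * np.minimum(np.abs(llrs[:half]), np.abs(llrs[half:])))
        return bit_llr(f, prefix)
    v = np.array(butterfly(prefix[:half]))    # left-subtree partial sums
    g = llrs[half:] + (1 - 2 * v) * llrs[:half]
    return bit_llr(g, prefix[half:])

def scl_decode(llr, frozen, N, L=4):
    """Successive cancellation list decoding: keep the L best paths."""
    paths = [([], 0.0)]                       # (decided bits, path metric)
    for i in range(N):
        candidates = []
        for bits, pm in paths:
            l = bit_llr(llr, bits)
            options = (0,) if i in frozen else (0, 1)   # info bits split
            for b in options:
                # Penalize decisions that contradict the LLR sign
                penalty = abs(l) if (l >= 0) != (b == 0) else 0.0
                candidates.append((bits + [b], pm + penalty))
        candidates.sort(key=lambda t: t[1])
        paths = candidates[:L]                # prune back to L best
    return paths                              # best path first

# Round trip on the (8, 4) code from earlier: encode, then list-decode
N, frozen = 8, {0, 1, 2, 4}
u = [0, 0, 0, 1, 0, 0, 1, 1]                  # frozen = 0, info = 1,0,1,1
x = butterfly(u)
llr = np.array([4.0 if xi == 0 else -4.0 for xi in x])  # noiseless LLRs
best_bits, best_pm = scl_decode(llr, frozen, N, L=4)[0]
print(best_bits)   # recovers u
```

On noiseless LLRs the true path accumulates zero penalty, so it always survives pruning and comes out first.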
The breakthrough that made polar codes practical: append a CRC to the information bits before encoding. The CRC helps the list decoder select the correct path.
1. Before encoding: compute CRC over K' info bits, append r CRC bits. Total K = K' + r info bits encoded.
2. Decode with SCL (list size L)
3. Among L surviving paths, pick the one that passes the CRC check
4. If no path passes CRC, report decoding failure
CA-SCL with L=32 and CRC-11:
~1.5 dB better than SC alone
Matches or exceeds turbo/LDPC at short blocks
Reliable error detection via CRC
The CRC bits reduce the effective code rate slightly (e.g., 11 CRC bits out of 128 info bits = 8.6% overhead). But the performance gain far outweighs this cost.
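The CRC attach/check step can be sketched independently of the list decoder. The polynomial below is a toy CRC-4, not one of the 5G polynomials (those are defined in 3GPP TS 38.212); all names here are illustrative:

```python
# Bitwise CRC over GF(2). `poly` lists polynomial coefficients from
# the highest power down, e.g. [1,0,0,1,1] = x^4 + x + 1 (a toy CRC-4).
def crc_remainder(bits, poly):
    r = len(poly) - 1                      # number of CRC bits
    padded = list(bits) + [0] * r
    for i in range(len(bits)):             # long division over GF(2)
        if padded[i]:
            for j in range(len(poly)):
                padded[i + j] ^= poly[j]
    return padded[-r:]

def crc_attach(info_bits, poly):
    """Append CRC bits: this is what gets polar-encoded in CA-SCL."""
    return list(info_bits) + crc_remainder(info_bits, poly)

def crc_check(bits_with_crc, poly):
    """A valid info+CRC word leaves zero remainder."""
    return all(b == 0 for b in crc_remainder(bits_with_crc, poly))

# In CA-SCL, the decoder picks the most likely surviving path that
# passes this check (and reports a decoding failure if none does).
poly = [1, 0, 0, 1, 1]                     # toy CRC-4: x^4 + x + 1
word = crc_attach([1, 0, 1, 1, 0, 1], poly)
print(crc_check(word, poly))               # True
corrupted = word.copy()
corrupted[2] ^= 1                          # single bit error
print(crc_check(corrupted, poly))          # False
```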
```python
import numpy as np

def sc_decode(llr, frozen_positions, N):
    """
    Successive Cancellation decoder for polar codes.
    llr: channel LLRs (length N), positive = bit 0 more likely
    frozen_positions: set of frozen bit indices
    Returns: decoded input bits u (length N)
    """
    n = int(np.log2(N))
    decoded = np.zeros(N, dtype=int)

    def recurse(llrs, depth, node_idx):
        """Decode the subtree at (depth, node_idx): record leaf decisions
        in `decoded` and return this subtree's re-encoded bits (partial
        sums), which the parent's g-function needs."""
        if depth == n:
            # Leaf: frozen bits are known zeros, info bits by LLR sign
            bit = 0 if (node_idx in frozen_positions or llrs[0] >= 0) else 1
            decoded[node_idx] = bit
            return np.array([bit])
        half = len(llrs) // 2
        # f-function (min-sum): combine upper and lower LLRs
        f_llr = np.sign(llrs[:half]) * np.sign(llrs[half:]) * \
                np.minimum(np.abs(llrs[:half]), np.abs(llrs[half:]))
        # Decode left (upper) half
        left = recurse(f_llr, depth + 1, 2 * node_idx)
        # g-function: use left partial sums to refine
        g_llr = llrs[half:] + (1 - 2 * left) * llrs[:half]
        # Decode right (lower) half
        right = recurse(g_llr, depth + 1, 2 * node_idx + 1)
        # Partial sums for the parent: [left XOR right, right]
        return np.concatenate([left ^ right, right])

    recurse(llr, 0, 0)   # top-level return value (codeword estimate) unused
    return decoded

# Example: decode (8,4) polar code
N = 8
frozen = {0, 1, 2, 4}  # Example frozen set
# Simulated received LLRs (positive = likely 0)
channel_llr = np.array([2.1, -0.5, 1.3, -1.8, 0.9, -2.3, 1.1, -0.7])
decoded = sc_decode(channel_llr, frozen, N)
print(f"Decoded bits: {decoded}")
```
| Property | Polar (CA-SCL) | LDPC | Turbo |
|---|---|---|---|
| Capacity achieving? | Proven | Empirical (near) | Empirical (near) |
| Short blocks (N<256) | Excellent | Poor | Good |
| Long blocks (N>10K) | Good | Excellent | Good |
| Decoding latency | High (serial SC) | Low (parallel BP) | Medium |
| Error floor | None (SC) | Low (trapping sets) | Moderate |
| Rate flexibility | Any K/N (change frozen set) | Needs new H matrix | Puncturing |
| Encoding | O(N log N), simple | O(N) QC, complex | O(N), simple |
3GPP selected polar codes for control channels (PDCCH, PUCCH) in 5G NR Release 15 (2018). This was a historic first deployment of polar codes in a major standard.
| Channel | PDCCH (DL control) |
| Also | PUCCH, PBCH |
| Info bits K | 12 to 140 |
| Code length N | 32 to 512 (max 1024) |
| CRC | CRC-6, CRC-11, CRC-24 |
| Decoder | CA-SCL (L=8 typical) |
Short blocks: Control info is 20-140 bits. Polar codes excel here.
Flexible rate: Just change the frozen set — no new code design needed.
Built-in CRC: Error detection comes free with CA-SCL.
No error floor: Critical for control reliability (BLER < 10^-5).
| Channel | Candidates | Winner |
|---|---|---|
| Data (long) | LDPC, Polar, Turbo | LDPC |
| Control (short) | Polar, LDPC, TBCC | Polar |
Turbo codes — the incumbent from 4G — were eliminated early due to throughput limitations at 5G data rates.
The decision was intensely political. Huawei championed polar codes; Qualcomm pushed LDPC. The final split — LDPC for data, polar for control — was a compromise.
LDPC for data: superior throughput at long block lengths (1000+ bits), parallel decoding, mature hardware.
Polar for control: superior performance at short block lengths (20-200 bits), flexible rate, no error floor.
The 5G adoption validated 10 years of polar code research and established them as a mainstream coding family alongside LDPC.
Polar codes can achieve the source coding bound (lossless compression at entropy rate).
Dual of channel coding: freeze the "good" channels (carry randomness) and compress via the "bad" channels.
Polar codes can simultaneously compress and protect — achieving the optimal rate-distortion-capacity trade-off.
This unification was one of Arikan's key insights: the same polarization phenomenon governs both compression and error correction.
Polar codes achieve the secrecy capacity of wiretap channels — providing information-theoretic security.
Polarization extends to multi-user channels, achieving the capacity region of the multiple access channel.
Pro: Simple, low area
Con: Sequential — latency O(N log N). For N=1024, that's ~10,000 clock cycles.
Throughput: limited to ~100 Mbps.
Challenge: L parallel SC decoders + sorting/pruning. Area grows linearly with L.
For L=8 in 5G: manageable in modern ASIC processes.
| Technique | Speedup |
|---|---|
| Rate-0 node skip | 2-3x |
| Rate-1 node skip | 2-3x |
| REP/SPC merge | 1.5x |
| Unrolled decoder | 5-10x |
| Combined | 10-30x |
Fast-SSC (Simplified SC) identifies subtree patterns (all-frozen, all-info) and decodes them in one step.
Modern fast-SCL decoders achieve 1+ Gbps throughput — sufficient for 5G control channel requirements. Research into BP decoding of polar codes continues for higher throughput.
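The two simplest special nodes can be sketched directly. A rate-0 subtree (all leaves frozen) decodes to all zeros with no computation; a rate-1 subtree (all leaves information) is decoded by hard-slicing the LLRs and inverting the transform, which works because F^⊗k is its own inverse over GF(2). A sketch under those assumptions, not a full Fast-SSC decoder:

```python
import numpy as np

def butterfly(bits):
    """u -> u * F^(kron k); over GF(2) this transform is self-inverse."""
    x = np.array(bits, dtype=int)
    step = 2
    while step <= len(x):
        half = step // 2
        for i in range(0, len(x), step):
            for j in range(half):
                x[i + j] ^= x[i + j + half]
        step *= 2
    return x

def decode_rate0(llrs):
    """All-frozen subtree: the input bits are known zeros."""
    return np.zeros(len(llrs), dtype=int)

def decode_rate1(llrs):
    """All-information subtree: slice each LLR, then invert the
    encoding to recover all input bits in one shot."""
    x_hat = (np.asarray(llrs) < 0).astype(int)   # hard decisions
    return butterfly(x_hat)                      # u_hat = x_hat * F^(kron k)

# Noiseless check: encode 4 info bits, decode the rate-1 node in one step
u = np.array([1, 0, 1, 1])
x = butterfly(u)
llrs = np.where(x == 0, 4.0, -4.0)
print(decode_rate1(llrs))   # recovers [1 0 1 1]
```

Skipping the bit-by-bit traversal inside such subtrees is where the 2-3x speedups in the table come from.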
Polar codes require N = 2n. But 5G needs arbitrary output lengths M. Solution: encode at N ≥ M, then rate-match to M bits.
M < N: Delete N-M coded bits. Receiver treats punctured positions as erasures (LLR = 0).
Used when M is slightly less than N.
M < N: Fix input bits so that N-M coded bits are known in advance, and skip transmitting them. The receiver knows these bits perfectly (LLR = ∞).
Typically preferred over puncturing at higher code rates.
M > N: Repeat some coded bits. Receiver combines repeated copies for improved LLRs.
Used for very low rates.
5G NR uses a combination of these techniques with interleaving to support any desired output length and code rate. The frozen set must be adapted for the rate-matching scheme.
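The receiver-side LLR handling for the three schemes can be sketched in a few lines. The positions, the large-LLR stand-in for infinity, and the layout are illustrative (the actual 5G sub-block interleaving patterns live in 3GPP TS 38.212):

```python
import numpy as np

BIG = 1e9   # stands in for "known perfectly" (LLR = infinity)

def llrs_punctured(received, punctured_pos, N):
    """Puncturing: deleted bits are unknown -> treat as erasures (LLR 0)."""
    llr = np.zeros(N)
    idx = [i for i in range(N) if i not in punctured_pos]
    llr[idx] = received
    return llr

def llrs_shortened(received, shortened_pos, N):
    """Shortening: untransmitted bits are known -> very large LLR."""
    llr = np.full(N, BIG)
    idx = [i for i in range(N) if i not in shortened_pos]
    llr[idx] = received
    return llr

def llrs_repeated(received, N):
    """Repetition: fold the extra copies back and sum their LLRs."""
    llr = np.zeros(N)
    for i, r in enumerate(received):
        llr[i % N] += r
    return llr

# Example: M = 6 received LLRs for an N = 8 code, positions 0,1 punctured
rx = np.array([2.0, -1.0, 0.5, 3.0, -2.0, 1.0])
print(llrs_punctured(rx, {0, 1}, N=8))
```

In each case the output is a length-N LLR vector that feeds the ordinary SC/SCL decoder unchanged.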
Polarization works over any finite field GF(q). Non-binary kernels can achieve faster polarization — fewer "middle" channels at finite N.
Trade-off: better performance vs higher decoding complexity.
Replace the 2×2 kernel with larger matrices (3×3, 4×4, ...). Larger kernels polarize faster — better finite-length performance.
Polarization-Adjusted Convolutional codes (Arikan, 2019): pre-transform the data with a convolutional code before polar encoding. Remarkable short-block performance.
Polar codes have been extended to quantum error correction, achieving the quantum channel capacity. An active research area.
This is exactly the flow used in 5G NR for control channel encoding/decoding. The entire process is defined in 3GPP TS 38.212.
| 2009 | Arikan's foundational paper |
| 2011 | SC List decoding (Tal & Vardy) |
| 2012 | CRC-aided SCL |
| 2015 | Fast-SSC decoding |
| 2018 | 5G NR standardization |
| 2019 | PAC codes (Arikan) |
First provably capacity-achieving codes with efficient encoding/decoding
Elegant theory: one simple transform, recursive structure
Excellent short-block performance with CA-SCL
Deployed in the world's most advanced wireless standard
Polar codes closed a 60-year gap between Shannon's promise and practical reality. They represent the most significant theoretical advance in coding since turbo codes, and arguably since Shannon himself.