Memory-based encoding, trellis representations, and the algorithm that reached the edge of the solar system
Elias (1955), Viterbi (1967), history and motivation
Shift registers, generator polynomials, state machines
State diagrams, trellis diagrams, code trees
Path metrics, branch metrics, survivor paths, trace-back
Soft decisions, puncturing, sequential decoding
Deep space, GSM, WiFi, modems — everywhere!
At MIT, Elias introduced convolutional codes as an alternative to block codes. Unlike block codes that operate on fixed-size message blocks, convolutional codes process data as a continuous stream.
Elias was motivated by the impracticality of achieving Shannon's capacity with block codes of the lengths available at the time.
At UCLA/Linkabit, Viterbi invented the decoding algorithm that bears his name. Originally described for decoding convolutional codes, the Viterbi algorithm is now recognized as a general solution for finding the most likely sequence of hidden states.
Viterbi later co-founded Qualcomm. The algorithm is used in speech recognition, DNA sequence analysis, and more.
| Property | Block Codes | Convolutional Codes |
|---|---|---|
| Input | Fixed k-bit blocks | Continuous bit stream |
| Memory | None (memoryless) | Depends on previous m inputs |
| Encoding | Matrix multiplication | Shift register + XOR |
| Output | n-bit codewords | n bits per input bit (continuous) |
| Decoding | Syndrome, algebraic | Viterbi (trellis search) |
| Latency | One block delay | 5× constraint length typical |
Each output is a mod-2 sum of selected register taps. For the common rate-1/2 code with octal generators g₁ = 7, g₂ = 5:
v₁ = u ⊕ s₁ ⊕ s₂,  v₂ = u ⊕ s₂
The encoder output can be described as a convolution of the input with the generator sequences (hence the name!):
vᵢ(D) = u(D) · gᵢ(D)  (mod 2)
where D is the delay operator (D-transform).
g₁(D) = 1 + D + D²
Binary: 111, Octal: 7
Taps: input, s₁, s₂
g₂(D) = 1 + D²
Binary: 101, Octal: 5
Taps: input, s₂ only
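The octal shorthand expands mechanically into tap positions. A minimal sketch (the helper name `octal_taps` is ours, not from any library):

```python
def octal_taps(g_octal, K):
    """Expand an octal generator string into its K taps (MSB = current input)."""
    bits = bin(int(g_octal, 8))[2:].zfill(K)
    return [int(b) for b in bits]

octal_taps('7', 3)     # [1, 1, 1] -> g(D) = 1 + D + D²
octal_taps('5', 3)     # [1, 0, 1] -> g(D) = 1 + D²
octal_taps('171', 7)   # [1, 1, 1, 1, 0, 0, 1]
```

The same conversion works for the longer K=7 Voyager/WiFi generators (171, 133).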
State Transitions
| Current State | Next (input 0) | Output (input 0) | Next (input 1) | Output (input 1) |
|---|---|---|---|---|
| 00 | 00 | 00 | 10 | 11 |
| 01 | 00 | 11 | 10 | 00 |
| 10 | 01 | 10 | 11 | 01 |
| 11 | 01 | 01 | 11 | 10 |
Reading the Table
Example: from current state 10 with input 1, the encoder outputs 01 and moves to state 11.
The state diagram is a finite state machine with 2ᵐ = 4 states, where m = K − 1 = 2 is the number of memory elements.
The trellis unfolds the state diagram in time. Each column represents one time step; each node represents a state.
Time: t=0 t=1 t=2 t=3
State 00: 00 ——00—→ 00 ——00—→ 00 ——00—→ 00
State 01: 01 01 01
State 10: 10 10 10
State 11: 11 11 11
Solid lines = input 0, dashed = input 1. Labels = output bits.
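Rather than deriving the transition table by hand, the trellis edges can be generated from the generators directly. A sketch for the (7,5) code (the function name and bit conventions are ours):

```python
def transitions(K=3, gens=(0b111, 0b101)):
    """Next state and output pair for every (state, input) of a rate-1/2 code."""
    table = {}
    for state in range(1 << (K - 1)):
        for bit in (0, 1):
            reg = (bit << (K - 1)) | state           # input bit, then state bits
            out = tuple(bin(reg & g).count('1') & 1  # parity of the tapped bits
                        for g in gens)
            table[(state, bit)] = (reg >> 1, out)
    return table

t = transitions()
t[(0b10, 1)]   # -> (0b11, (0, 1)): from state 10, input 1: go to 11, emit 01
```

Each entry reproduces one row of the state transition table above.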
Common rates:
| Rate | Meaning | Use Case |
|---|---|---|
| 1/2 | 1 in → 2 out | Deep space, WiFi |
| 1/3 | 1 in → 3 out | Very noisy channels |
| 2/3 | 2 in → 3 out | Bandwidth-limited |
| 3/4 | 3 in → 4 out | Punctured codes |
K determines the encoder's memory and code strength:
| K | States | Viterbi Complexity |
|---|---|---|
| 3 | 4 | Trivial |
| 5 | 16 | Easy |
| 7 | 64 | Standard (Voyager) |
| 9 | 256 | Moderate |
| 14 | 8192 | Near practical limit |
The free distance dfree is the minimum Hamming distance between any two codeword sequences that start and end in the same state. Because the code is linear, this equals the minimum Hamming weight of any nonzero codeword sequence.
Equivalently: the minimum-weight path that diverges from and re-merges with the all-zeros path.
The code corrects t = ⌊(dfree − 1)/2⌋ channel errors within a few constraint lengths. For the (7,5) code: dfree = 5, so t = 2.
| Code | dfree |
|---|---|
| R=1/2, K=3 (7,5) | 5 |
| R=1/2, K=5 (23,35) | 7 |
| R=1/2, K=7 (171,133) | 10 |
| R=1/3, K=3 (7,7,5) | 8 |
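The "diverge and re-merge" view makes dfree computable by a shortest-path search over the trellis: leave state 0 with a first input of 1, then find the minimum-weight way back, where each edge weighs the Hamming weight of its output bits. A sketch (helper names and conventions are ours):

```python
import heapq

def dfree(gens_octal, K):
    """Free distance: minimum output weight over paths that leave
    state 0 (first input = 1) and first re-merge with state 0."""
    gens = [int(g, 8) for g in gens_octal]

    def step(state, bit):
        reg = (bit << (K - 1)) | state            # input bit + register bits
        weight = sum(bin(reg & g).count('1') & 1  # output parity per generator
                     for g in gens)
        return reg >> 1, weight

    start, w0 = step(0, 1)                        # diverge from all-zeros path
    dist, pq = {start: w0}, [(w0, start)]
    while pq:                                     # Dijkstra over trellis states
        w, s = heapq.heappop(pq)
        if s == 0:                                # first re-merge: done
            return w
        if w > dist.get(s, float('inf')):
            continue
        for bit in (0, 1):
            ns, bw = step(s, bit)
            if w + bw < dist.get(ns, float('inf')):
                dist[ns] = w + bw
                heapq.heappush(pq, (w + bw, ns))

dfree(['7', '5'], 3)        # 5
dfree(['171', '133'], 7)    # 10
```

This reproduces the table entries above without enumerating codewords.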
Given received sequence r, find the most likely transmitted codeword sequence: v̂ = arg maxᵥ P(r | v).
Distance between received bits and expected output on each trellis branch:
Hard decision: Hamming distance (count differing bits)
Soft decision: Euclidean distance or correlation metric
Cumulative metric along a path through the trellis: PM(path) = Σ BM(branch), summed over the branches on the path.
The path with the smallest cumulative metric is the most likely (for hard decisions on a binary symmetric channel).
For each state at time t, exactly 2 paths arrive. Add branch metric to each predecessor's path metric, compare the two candidates, select the survivor (lower metric wins).
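In code, one add-compare-select step per state is just a minimum over the two incoming candidates. The numbers below are illustrative only:

```python
def acs(candidates):
    """One add-compare-select: candidates are (path metric + branch metric,
    predecessor state) pairs; the survivor has the smaller total metric."""
    return min(candidates)

# Illustrative: paths arrive from state 00 (PM 2, BM 0) and 01 (PM 3, BM 2).
acs([(2 + 0, 0b00), (3 + 2, 0b01)])   # -> (2, 0): survivor comes from 00
```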
Transmitted message: u = [1, 1, 0, 1, 0]
Encoded output (5 message pairs + 2 tail pairs): v = [11, 01, 01, 00, 10, 11, 00]
Received with an error: r = [11, 00, 01, 00, 10, 11, 00]
(Error in second pair: received 00 instead of 01)
| State | Metric at t=0 |
|---|---|
| 00 (start) | 0 |
| 01, 10, 11 | ∞ (impossible) |
From state 00: input 0 → output 00, BM = d(11,00) = 2 → state 00, PM = 2
From state 00: input 1 → output 11, BM = d(11,11) = 0 → state 10, PM = 0
| To State | From | Path Metric | Survivor? |
|---|---|---|---|
| 00 | 00 (input 0, out 00) | 2 + d(00,00) = 2 | Yes |
| 00 | 01 (input 0, out 11) | ∞ | No |
| 01 | 10 (input 0, out 10) | 0 + d(00,10) = 1 | Yes |
| 10 | 00 (input 1, out 11) | 2 + d(00,11) = 4 | Yes |
| 10 | 01 (input 1, out 00) | ∞ | No |
| 11 | 10 (input 1, out 01) | 0 + d(00,01) = 1 | Yes |
After processing all received pairs, each state has a single surviving path and a cumulative path metric.
Starting from state 00 at the final time step (after tail bits flush the encoder), trace the survivor path backward:
Demodulator outputs 0 or 1 (binary).
Branch metric: Hamming distance
Simpler hardware, but throws away reliability information.
Demodulator outputs analog values (e.g., 3-bit quantization: 0-7).
Branch metric: Euclidean or correlation
More complex but preserves channel reliability.
Most modern Viterbi decoders use 3-bit (8-level) soft quantization. Diminishing returns beyond 4 bits.
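The difference between the two metrics, as a sketch. The 3-bit levels and the squared-distance soft metric below are one common choice, not the only one:

```python
expected = (1, 0)      # branch output bits on some trellis edge

# Hard decision: demodulator slices to bits; metric = Hamming distance.
hard_rx = (1, 1)
bm_hard = sum(r != e for r, e in zip(hard_rx, expected))

# Soft decision: 3-bit levels 0..7 (7 = confident '1', 0 = confident '0');
# one common metric is squared distance to the ideal levels 7*bit.
soft_rx = (6, 5)       # both slice to '1', but the second is much less sure
bm_soft = sum((r - 7 * e) ** 2 for r, e in zip(soft_rx, expected))

print(bm_hard)   # 1
print(bm_soft)   # (6-7)² + (5-0)² = 26
```

The soft metric distinguishes a confident disagreement from a marginal one; the hard metric cannot.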
Start with a low-rate code (e.g., R=1/2) and periodically delete (puncture) some output bits to increase the effective rate.
Base: R=1/2, K=7
Every 2 input bits: transmit 3 of 4 output bits.
Decoder inserts "erasures" at punctured positions and runs standard Viterbi.
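A sketch of the R=1/2 → R=2/3 scheme described above, using the common [1 1 / 1 0] pattern (helper names are ours):

```python
PATTERN = [1, 1, 1, 0]   # per 2 input bits (4 coded bits): send 3, drop the 4th

def puncture(coded):
    """Drop the coded bits marked 0 in the repeating pattern."""
    return [b for i, b in enumerate(coded) if PATTERN[i % 4]]

def depuncture(punct, n_coded):
    """Re-insert erasures (None) at punctured positions for the decoder."""
    it = iter(punct)
    return [next(it) if PATTERN[i % 4] else None for i in range(n_coded)]

coded = [1, 1, 0, 1, 0, 1, 0, 0]   # 8 coded bits = 4 input bits at R=1/2
sent = puncture(coded)             # 6 bits on the air -> rate 4/6 = 2/3
depuncture(sent, len(coded))       # [1, 1, 0, None, 0, 1, 0, None]
```

The standard Viterbi decoder then treats each `None` as an erasure (zero branch-metric contribution).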
Voyager 1 & 2 (1977)
R=1/2, K=7 convolutional code (171, 133)
Later concatenated with RS outer code
Still transmitting from interstellar space at 160 bps!
Mars Rovers (2004+)
Concatenated turbo code + convolutional inner code
Operates within about 0.8 dB of the Shannon limit in Eb/N0
New Horizons (2015)
K=15 convolutional code for Pluto flyby
Signal power: ~10⁻¹⁶ watts at Earth!
James Webb (2022)
Modern LDPC codes, but heritage from convolutional era
R=1/2, K=5 convolutional code
Different rates for voice (R=1/2) and data (R=1/3)
Deployed 1991, billions of phones
R=1/2, K=7 base code (171, 133)
Punctured to R=2/3, 3/4, 5/6
Viterbi decoder in every WiFi chip
Trellis-coded modulation (TCM)
Ungerboeck's coding + modulation
Enabled 28.8/33.6 kbps over phone lines
Convolutional inner code + RS outer code
Multiple rate options for different conditions
Replaced by LDPC in DVB-T2
For large constraint lengths (K > 15), Viterbi is impractical. Sequential decoding explores the trellis selectively rather than exhaustively.
Variable computation time — fast when channel is good.
Near-maximum-likelihood performance in practice, though unlike Viterbi it does not guarantee finding the ML path, and computation can surge on very noisy channels.
class ConvolutionalEncoder:
"""Rate 1/n convolutional encoder with K constraint length."""
def __init__(self, generators, K):
"""
generators: list of generator polynomials (octal)
K: constraint length
"""
self.generators = [int(oct_str) for oct_str in generators]
self.K = K
self.n = len(generators)
self.state = 0 # shift register state
def _octal_to_taps(self, octal_val):
"""Convert octal generator to binary tap positions."""
binary = bin(int(str(octal_val), 8))[2:]
return [int(b) for b in binary.zfill(self.K)]
def encode_bit(self, input_bit):
"""Encode single input bit, return n output bits."""
# Shift input into register
combined = (input_bit << (self.K - 1)) | self.state
outputs = []
for gen in self.generators:
taps = self._octal_to_taps(gen)
out = 0
for i, tap in enumerate(taps):
if tap:
out ^= (combined >> (self.K - 1 - i)) & 1
outputs.append(out)
# Update state
self.state = (combined >> 1) & ((1 << (self.K-1)) - 1)
return outputs
def encode(self, message):
"""Encode full message with tail bits."""
self.state = 0
output = []
# Encode message bits
for bit in message:
output.extend(self.encode_bit(bit))
# Add tail bits to flush encoder to state 0
for _ in range(self.K - 1):
output.extend(self.encode_bit(0))
return output
# Create rate 1/2, K=3 encoder with generators (7, 5)
encoder = ConvolutionalEncoder(['7', '5'], K=3)
# Encode a message
message = [1, 1, 0, 1, 0]
encoded = encoder.encode(message)
print(f"Message: {message}")
print(f"Encoded: {encoded}")
# Output: [1,1, 0,1, 0,1, 0,0, 1,0, 1,1, 0,0]
# Show output as pairs
pairs = [(encoded[i], encoded[i+1]) for i in range(0, len(encoded), 2)]
print(f"As pairs: {pairs}")
# [(1,1), (0,1), (0,1), (0,0), (1,0), (1,1), (0,0)]
# Verify by tracing state machine manually
encoder2 = ConvolutionalEncoder(['7', '5'], K=3)
print("\nStep-by-step trace:")
print(f"{'Time':>4} {'Input':>5} {'State':>7} {'Output':>7}")
for i, bit in enumerate(message + [0, 0]): # +tail
state_before = format(encoder2.state, '02b')
out = encoder2.encode_bit(bit)
print(f"{i:>4} {bit:>5} {state_before:>7} {''.join(map(str,out)):>7}")
# Effective rate: message bits / coded bits (the K-1 tail bits lower it slightly)
print(f"\nEffective rate: {len(message)}/{len(encoded)} "
f"= {len(message)/len(encoded):.3f}")
def viterbi_decode(received, generators, K):
"""Hard-decision Viterbi decoder for rate 1/n convolutional code."""
n_outputs = len(generators)
n_states = 1 << (K - 1)
n_steps = len(received) // n_outputs
# Precompute branch outputs for each state and input
enc = ConvolutionalEncoder([str(g) for g in generators], K)
branch_output = {}
for state in range(n_states):
for inp in [0, 1]:
enc.state = state
out = enc.encode_bit(inp)
next_state = enc.state
branch_output[(state, inp)] = (next_state, out)
# Initialize path metrics
INF = float('inf')
path_metric = [INF] * n_states
path_metric[0] = 0
survivor = [[[] for _ in range(n_states)] for _ in range(n_steps)]
# Forward pass: ACS
for t in range(n_steps):
rx = received[t * n_outputs : (t + 1) * n_outputs]
new_metric = [INF] * n_states
for state in range(n_states):
if path_metric[state] == INF:
continue
for inp in [0, 1]:
ns, out = branch_output[(state, inp)]
bm = sum(r != o for r, o in zip(rx, out))
candidate = path_metric[state] + bm
if candidate < new_metric[ns]:
new_metric[ns] = candidate
survivor[t][ns] = survivor[max(0,t-1)][state] + [inp] \
if t > 0 else [inp]
path_metric = new_metric
# Trace back from state 0
return survivor[n_steps - 1][0][: -(K-1)] # remove tail bits
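Putting encoder and decoder together end to end: a compact, self-contained sketch of the same (7,5) code (helper names are ours, conventions match the classes above) that checks a single channel error is corrected:

```python
def conv_encode(msg, gens=(0b111, 0b101), K=3):
    """Rate-1/2 (7,5) encoder with K-1 tail bits."""
    state, out = 0, []
    for bit in msg + [0] * (K - 1):
        reg = (bit << (K - 1)) | state
        out += [bin(reg & g).count('1') & 1 for g in gens]
        state = reg >> 1
    return out

def viterbi(rx, gens=(0b111, 0b101), K=3):
    """Hard-decision Viterbi; returns the message with tail bits stripped."""
    n, S, INF = len(gens), 1 << (K - 1), float('inf')
    pm = [0] + [INF] * (S - 1)                 # start in state 0
    paths = [[] for _ in range(S)]
    for t in range(0, len(rx), n):
        r = rx[t:t + n]
        new_pm, new_paths = [INF] * S, [None] * S
        for s in range(S):
            if pm[s] == INF:
                continue
            for bit in (0, 1):
                reg = (bit << (K - 1)) | s
                out = [bin(reg & g).count('1') & 1 for g in gens]
                cand = pm[s] + sum(a != b for a, b in zip(r, out))
                ns = reg >> 1
                if cand < new_pm[ns]:          # add-compare-select
                    new_pm[ns], new_paths[ns] = cand, paths[s] + [bit]
        pm, paths = new_pm, new_paths
    return paths[0][:-(K - 1)]                 # tail bits force final state 0

msg = [1, 1, 0, 1, 0]
rx = conv_encode(msg)
rx[2] ^= 1                                     # inject one channel error
print(viterbi(rx))                             # [1, 1, 0, 1, 0]
```

Since dfree = 5 for this code, any single bit error leaves the true path strictly closest to the received sequence, so the decoder recovers the message exactly.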
| Scheme | Eb/N0 @ BER = 10⁻⁵ | Coding Gain |
|---|---|---|
| Uncoded BPSK | 9.6 dB | — |
| R=1/2 K=3 Hard Viterbi | 6.5 dB | 3.1 dB |
| R=1/2 K=7 Hard Viterbi | 5.2 dB | 4.4 dB |
| R=1/2 K=7 Soft Viterbi | 3.3 dB | 6.3 dB |
| R=1/2 K=7 Soft + RS outer | 2.5 dB | 7.1 dB |
| Shannon limit R=1/2 | 0.2 dB | 9.4 dB |