Polynomial arithmetic for fast, reliable error detection in every network packet
Peterson's 1961 paper and the origins of CRC
The mathematical foundation: arithmetic mod 2
Polynomial long division step by step
CRC-8, CRC-16, CRC-32, CRC-64 and their generators
Shift register implementation for blazing-fast CRC
Ethernet, USB, ZIP, and beyond
The Cyclic Redundancy Check was formalized by W. Wesley Peterson in his 1961 paper "Cyclic Codes for Error Detection", published in the Proceedings of the IRE.
Peterson showed that cyclic codes — codes where any cyclic shift of a codeword is also a codeword — could be efficiently computed using polynomial division and implemented with simple shift register circuits.
GF(2) is the finite field with two elements: {0, 1}. Arithmetic is modular: addition and subtraction are both XOR. Multiplication is AND.
| + | 0 | 1 | × | 0 | 1 | |
|---|---|---|---|---|---|---|
| 0 | 0 | 1 | 0 | 0 | 0 | |
| 1 | 1 | 0 | 1 | 0 | 1 |
Key: 1 + 1 = 0 (no carry!), and subtraction = addition
Binary strings map to polynomials:
1011 → x3 + x + 1
11001 → x4 + x3 + 1
10111 → x4 + x2 + x + 1
(x3 + x + 1) + (x3 + x2 + 1)
= x2 + x
Bitwise: 1011 ⊕ 1101 = 0110
Note: x3 + x3 = 0 and 1 + 1 = 0 (they cancel!)
(x2 + 1) × (x + 1)
= x3 + x2 + x + 1
(101) × (11) = 1111
Multiply like normal polynomials, then reduce coefficients mod 2
CRC treats the message as a polynomial M(x) and divides it by a fixed generator polynomial G(x). The remainder is the CRC checksum.
Message: 1101011 Generator: G(x) = x3 + x + 1 = 1011
Step 1: Append r = 3 zeros to message: 1101011000
Step 2: Perform polynomial long division (XOR division):
1101011000 ÷ 1011
1011
----
0110011000
1011
----
011111000
1011
----
01001000
1011
----
0010000
1011
----
00110
The receiver divides the received message (with CRC appended) by the same generator polynomial G(x).
Received: 1101011110
Divide by 1011:
Remainder = 000
Result: PASS
Received: 1100011110 (bit 4 flipped)
Divide by 1011:
Remainder = non-zero
Result: FAIL — error detected!
The choice of generator polynomial G(x) determines which errors CRC can and cannot detect. Good generators have specific algebraic properties.
A burst error of length b is an error pattern where the first and last erroneous bits are b positions apart: E(x) = xi(xb-1 + ... + 1).
Burst errors are the most common type in practice (noise affects consecutive bits). CRC is specifically designed to catch them because:
A burst of length ≤ r corresponds to a polynomial of degree < r. Since G(x) has degree r, it cannot divide any polynomial of degree < r.
With r = 32:
| Name | Polynomial | Hex | Used In |
|---|---|---|---|
| CRC-8 | x8 + x2 + x + 1 | 0x07 | ATM headers, I2C |
| CRC-16-CCITT | x16 + x12 + x5 + 1 | 0x1021 | Bluetooth, HDLC, X.25 |
| CRC-16-IBM | x16 + x15 + x2 + 1 | 0x8005 | USB, Modbus |
| CRC-32 | x32 + x26 + x23 + x22 + x16 + x12 + x11 + x10 + x8 + x7 + x5 + x4 + x2 + x + 1 | 0x04C11DB7 | Ethernet, ZIP, PNG, MPEG-2 |
| CRC-64-ECMA | x64 + x62 + x57 + x55 + ... | 0x42F0E1EBA9EA3693 | XZ, HDFS |
CRC-32 (IEEE 802.3 / ITU-T V.42) is defined by the generator polynomial:
In Ethernet, CRC-32 is computed with all bits complemented initially, and the final CRC is also complemented. This catches the edge case of inserted/deleted zeros.
def crc_remainder(data: str, generator: str) -> str:
"""Compute CRC remainder using binary polynomial division.
Args:
data: Binary string of the message
generator: Binary string of the generator polynomial
Returns:
Binary string of the CRC remainder
"""
gen_len = len(generator)
# Append zeros (degree of generator - 1)
padded = list(data + '0' * (gen_len - 1))
for i in range(len(data)):
if padded[i] == '1': # Only XOR when leading bit is 1
for j in range(gen_len):
padded[i + j] = str(int(padded[i + j]) ^ int(generator[j]))
# Remainder is the last (gen_len - 1) bits
return ''.join(padded[-(gen_len - 1):])
def crc_check(data_with_crc: str, generator: str) -> bool:
"""Verify CRC by checking if remainder is zero."""
return all(b == '0' for b in crc_remainder(data_with_crc, generator))
# Example
message = "1101011"
generator = "1011" # x^3 + x + 1
crc = crc_remainder(message, generator)
print(f"CRC: {crc}") # "110"
transmitted = message + crc
print(f"Check: {crc_check(transmitted, generator)}") # True
def crc32(data: bytes) -> int:
"""Compute CRC-32 (IEEE 802.3) from scratch."""
# CRC-32 polynomial (reversed/reflected representation)
POLY = 0xEDB88320
crc = 0xFFFFFFFF # Initial value (all 1s)
for byte in data:
crc ^= byte
for _ in range(8): # Process each bit
if crc & 1: # LSB is 1
crc = (crc >> 1) ^ POLY
else:
crc >>= 1
return crc ^ 0xFFFFFFFF # Final XOR
# Verify against Python's built-in
import binascii
message = b"Hello, World!"
our_crc = crc32(message)
lib_crc = binascii.crc32(message) & 0xFFFFFFFF
print(f"Our CRC-32: 0x{our_crc:08X}")
print(f"Lib CRC-32: 0x{lib_crc:08X}")
print(f"Match: {our_crc == lib_crc}") # True
Processing bit-by-bit is slow. A lookup table of 256 pre-computed CRC values lets us process a full byte at a time — an 8× speedup.
def make_crc32_table():
"""Pre-compute CRC-32 lookup table (256 entries)."""
POLY = 0xEDB88320
table = []
for i in range(256):
crc = i
for _ in range(8):
if crc & 1:
crc = (crc >> 1) ^ POLY
else:
crc >>= 1
table.append(crc)
return table
CRC32_TABLE = make_crc32_table()
def crc32_fast(data: bytes) -> int:
"""Table-driven CRC-32 (byte-at-a-time)."""
crc = 0xFFFFFFFF
for byte in data:
index = (crc ^ byte) & 0xFF
crc = (crc >> 8) ^ CRC32_TABLE[index]
return crc ^ 0xFFFFFFFF
Further optimization: Modern implementations use 4KB or 16KB "slicing-by-4" / "slicing-by-8" tables to process 4 or 8 bytes at once. Intel/AMD CPUs have dedicated CRC32 instructions (SSE 4.2).
CRC can be computed in hardware using a Linear Feedback Shift Register (LFSR). For a generator of degree r, you need r flip-flops and a few XOR gates.
XOR gates are placed where the generator polynomial has coefficient 1 (at x1 and x0). Feedback from R0 loops back to the XOR gates.
| Protocol / Format | CRC Type | Protected Data | Notes |
|---|---|---|---|
| Ethernet (802.3) | CRC-32 | Frame data | 4-byte FCS field |
| USB | CRC-5, CRC-16 | Token/data packets | CRC-5 for tokens, CRC-16 for data |
| Bluetooth | CRC-24 | PDU data | Adaptive frequency hopping |
| ZIP / gzip | CRC-32 | Compressed data | Detects corruption during storage |
| PNG images | CRC-32 | Each chunk | Per-chunk integrity |
| HDLC / PPP | CRC-16-CCITT | Frame contents | Used in serial links |
| SATA / PCIe | CRC-32 | Data layer | High-speed bus integrity |
| iSCSI | CRC-32C | PDU data | Uses Castagnoli polynomial |
CRC32 in 2008, computing CRC-32C at hardware speed.
Both CRC and checksums detect errors, but they have very different mathematical foundations and error detection capabilities.
| Property | Internet Checksum | CRC-32 |
|---|---|---|
| Algorithm | One's complement sum | Polynomial division (GF(2)) |
| Check bits | 16 | 32 |
| Single-bit errors | All detected | All detected |
| Burst errors ≤ 16 | Some missed | All detected |
| Byte swaps | Undetected! | Detected |
| Speed | Very fast | Fast (with tables/HW) |
| Used in | IP, TCP, UDP headers | Ethernet, storage, protocols |
CRC could theoretically correct errors (the syndrome contains information about the error), but there are fundamental and practical reasons it's used only for detection.
01: Foundations — Shannon, capacity, distance, bounds
02: Parity & Simple Codes — detection basics
03: Hamming Codes — first practical FEC
04: CRC — polynomial error detection