Coding Theory Series — 04

CRC & Error Detection

Polynomial arithmetic for fast, reliable error detection in every network packet

Mathematics Code Standards Applications

Use arrow keys or swipe to navigate • Press ESC for overview

Roadmap

History

Peterson's 1961 paper and the origins of CRC

GF(2) Polynomials

The mathematical foundation: arithmetic mod 2

CRC Computation

Polynomial long division step by step

Standard CRCs

CRC-8, CRC-16, CRC-32, CRC-64 and their generators

Hardware

Shift register implementation for blazing-fast CRC

Applications

Ethernet, USB, ZIP, and beyond

History of CRC

History

The Cyclic Redundancy Check was formalized by W. Wesley Peterson in his 1961 paper "Cyclic Codes for Error Detection", published in the Proceedings of the IRE.

The Innovation

Peterson showed that cyclic codes — codes where any cyclic shift of a codeword is also a codeword — could be efficiently computed using polynomial division and implemented with simple shift register circuits.

Why CRC Won

Speed: computable with XOR and shift operations
Hardware: trivial to implement in circuits
Burst detection: excellent at catching burst errors
Standardized: widely adopted across all protocols

CRC has been the dominant error detection mechanism in digital communications for over 60 years. Every Ethernet frame, USB packet, and ZIP file uses CRC.

Polynomial Arithmetic over GF(2)

Mathematics

GF(2) is the finite field with two elements: {0, 1}. Arithmetic is modular: addition and subtraction are both XOR. Multiplication is AND.

GF(2) Operations

+	0	1		×	0	1
0	0	1		0	0	0
1	1	0		1	0	1

Key: 1 + 1 = 0 (no carry!), and subtraction = addition

Polynomials over GF(2)

Binary strings map to polynomials:

1011 → x³ + x + 1
11001 → x⁴ + x³ + 1
10111 → x⁴ + x² + x + 1

Why GF(2)? Over GF(2), polynomial addition is XOR (bitwise), and polynomial division reduces to repeated XOR with shifts. No multiplication or division of integers — just bit operations.

GF(2) Polynomial Operations

Mathematics

Addition (= XOR)

(x³ + x + 1) + (x³ + x² + 1)
= x² + x

Bitwise: 1011 ⊕ 1101 = 0110

Note: x³ + x³ = 0 and 1 + 1 = 0 (they cancel!)

Multiplication

(x² + 1) × (x + 1)
= x³ + x² + x + 1

(101) × (11) = 1111

Multiply like normal polynomials, then reduce coefficients mod 2

Key identity: a + a = 0 for all a ∈ GF(2)
Therefore subtraction = addition = XOR

No carries, no borrows. This is what makes GF(2) arithmetic so hardware-friendly. Every operation reduces to shifts and XORs.

The CRC Concept

Mathematics

CRC treats the message as a polynomial M(x) and divides it by a fixed generator polynomial G(x). The remainder is the CRC checksum.

Message
M(x)

→

Shift left
M(x) · x^r

→

Divide by G(x)
mod G(x)

→

Remainder
R(x) = CRC

Transmitted: T(x) = M(x) · x^r + R(x)
where R(x) = M(x) · x^r mod G(x) and r = deg(G)

Verification: The receiver divides T(x) by G(x). If the remainder is 0, no error detected. If non-zero, an error occurred. This works because T(x) is constructed to be exactly divisible by G(x).

CRC Computation: Step by Step

Mathematics Interactive

Message: 1101011 Generator: G(x) = x³ + x + 1 = 1011

Step 1: Append r = 3 zeros to message: 1101011000

Step 2: Perform polynomial long division (XOR division):

  1101011000  ÷  1011
  1011
  ----
  0110011000
   1011
   ----
   011111000
    1011
    ----
    01001000
     1011
     ----
     0010000
      1011
      ----
      00110

Remainder R(x) = 110 → Transmit: 1101011110

CRC Verification

Mathematics

The receiver divides the received message (with CRC appended) by the same generator polynomial G(x).

No Error

Received: 1101011110

Divide by 1011:

Remainder = 000

Result: PASS

With Error

Received: 1100011110 (bit 4 flipped)

Divide by 1011:

Remainder = non-zero

Result: FAIL — error detected!

If received R(x) = T(x) + E(x), then:
R(x) mod G(x) = E(x) mod G(x)
Error undetected only if G(x) divides E(x)

CRC fails to detect an error pattern E(x) only when E(x) is a multiple of G(x). By choosing G(x) carefully, we can make this extremely unlikely for common error patterns.

Generator Polynomial Properties

Mathematics

The choice of generator polynomial G(x) determines which errors CRC can and cannot detect. Good generators have specific algebraic properties.

Must-Have Properties

G(x) has (x+1) as factor:
Detects all odd-weight errors
G(x) is irreducible over GF(2):
Maximum-length sequence detection
Degree r: produces r-bit CRC value

Detection Guarantees

All single-bit errors (if x^r + ... + 1)
All double-bit errors (if G has appropriate period)
All odd numbers of bit errors (if (x+1) | G(x))
All burst errors of length ≤ r
Fraction 1 − 2^−r of longer bursts

Choosing poorly: A generator like G(x) = x^r (just a shift) would be useless — it only checks if the last r bits are zero. Real generators must be carefully selected for maximum error coverage.

Burst Error Detection

Mathematics

A burst error of length b is an error pattern where the first and last erroneous bits are b positions apart: E(x) = xⁱ(x^b-1 + ... + 1).

A CRC with generator G(x) of degree r detects:
• ALL burst errors of length ≤ r
• Fraction (1 − 2^−(r−1)) of bursts of length r + 1
• Fraction (1 − 2^−r) of bursts of length > r + 1

Why CRC Excels at Bursts

Burst errors are the most common type in practice (noise affects consecutive bits). CRC is specifically designed to catch them because:

A burst of length ≤ r corresponds to a polynomial of degree < r. Since G(x) has degree r, it cannot divide any polynomial of degree < r.

CRC-32 Example

With r = 32:

Detects ALL bursts ≤ 32 bits
Misses only 1 in 2³¹ bursts of length 33
Misses only 1 in 2³² bursts of length > 33
That's a miss rate of ~0.00000005%

Standard CRC Polynomials

Standards

Name	Polynomial	Hex	Used In
CRC-8	x⁸ + x² + x + 1	0x07	ATM headers, I²C
CRC-16-CCITT	x¹⁶ + x¹² + x⁵ + 1	0x1021	Bluetooth, HDLC, X.25
CRC-16-IBM	x¹⁶ + x¹⁵ + x² + 1	0x8005	USB, Modbus
CRC-32	x³² + x²⁶ + x²³ + x²² + x¹⁶ + x¹² + x¹¹ + x¹⁰ + x⁸ + x⁷ + x⁵ + x⁴ + x² + x + 1	0x04C11DB7	Ethernet, ZIP, PNG, MPEG-2
CRC-64-ECMA	x⁶⁴ + x⁶² + x⁵⁷ + x⁵⁵ + ...	0x42F0E1EBA9EA3693	XZ, HDFS

CRC-32 is the most widely used. Its polynomial was carefully chosen by committee to maximize error detection for common frame sizes in networking (up to ~12,000 bits).

CRC-32 in Detail

Standards Mathematics

CRC-32 (IEEE 802.3 / ITU-T V.42) is defined by the generator polynomial:

G(x) = x³² + x²⁶ + x²³ + x²² + x¹⁶ + x¹² + x¹¹ + x¹⁰ + x⁸ + x⁷ + x⁵ + x⁴ + x² + x + 1

Check bits
appended to data

Non-zero terms
in generator

2³²

Possible CRC values
(~4.3 billion)

Detection Capabilities

All single-bit errors
All double-bit errors (for messages up to 2³¹−1 bits)
All odd numbers of bit errors
All burst errors up to 32 bits long

Practical Note

In Ethernet, CRC-32 is computed with all bits complemented initially, and the final CRC is also complemented. This catches the edge case of inserted/deleted zeros.

Python: CRC from Scratch

Code

def crc_remainder(data: str, generator: str) -> str:
    """Compute CRC remainder using binary polynomial division.

    Args:
        data: Binary string of the message
        generator: Binary string of the generator polynomial
    Returns:
        Binary string of the CRC remainder
    """
    gen_len = len(generator)
    # Append zeros (degree of generator - 1)
    padded = list(data + '0' * (gen_len - 1))

    for i in range(len(data)):
        if padded[i] == '1':  # Only XOR when leading bit is 1
            for j in range(gen_len):
                padded[i + j] = str(int(padded[i + j]) ^ int(generator[j]))

    # Remainder is the last (gen_len - 1) bits
    return ''.join(padded[-(gen_len - 1):])

def crc_check(data_with_crc: str, generator: str) -> bool:
    """Verify CRC by checking if remainder is zero."""
    return all(b == '0' for b in crc_remainder(data_with_crc, generator))

# Example
message = "1101011"
generator = "1011"  # x^3 + x + 1
crc = crc_remainder(message, generator)
print(f"CRC: {crc}")  # "110"
transmitted = message + crc
print(f"Check: {crc_check(transmitted, generator)}")  # True

Python: CRC-32

Code

def crc32(data: bytes) -> int:
    """Compute CRC-32 (IEEE 802.3) from scratch."""
    # CRC-32 polynomial (reversed/reflected representation)
    POLY = 0xEDB88320
    crc = 0xFFFFFFFF  # Initial value (all 1s)

    for byte in data:
        crc ^= byte
        for _ in range(8):  # Process each bit
            if crc & 1:  # LSB is 1
                crc = (crc >> 1) ^ POLY
            else:
                crc >>= 1

    return crc ^ 0xFFFFFFFF  # Final XOR

# Verify against Python's built-in
import binascii

message = b"Hello, World!"
our_crc = crc32(message)
lib_crc = binascii.crc32(message) & 0xFFFFFFFF

print(f"Our CRC-32:  0x{our_crc:08X}")
print(f"Lib CRC-32:  0x{lib_crc:08X}")
print(f"Match: {our_crc == lib_crc}")  # True

The "reflected" form processes bits LSB-first, which matches hardware shift register behavior. The initial value 0xFFFFFFFF and final XOR catch edge cases like leading zeros.

Optimized: Table-Driven CRC

Code

Processing bit-by-bit is slow. A lookup table of 256 pre-computed CRC values lets us process a full byte at a time — an 8× speedup.

def make_crc32_table():
    """Pre-compute CRC-32 lookup table (256 entries)."""
    POLY = 0xEDB88320
    table = []
    for i in range(256):
        crc = i
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ POLY
            else:
                crc >>= 1
        table.append(crc)
    return table

CRC32_TABLE = make_crc32_table()

def crc32_fast(data: bytes) -> int:
    """Table-driven CRC-32 (byte-at-a-time)."""
    crc = 0xFFFFFFFF
    for byte in data:
        index = (crc ^ byte) & 0xFF
        crc = (crc >> 8) ^ CRC32_TABLE[index]
    return crc ^ 0xFFFFFFFF

Further optimization: Modern implementations use 4KB or 16KB "slicing-by-4" / "slicing-by-8" tables to process 4 or 8 bytes at once. Intel/AMD CPUs have dedicated CRC32 instructions (SSE 4.2).

Hardware: Shift Register

Applications Mathematics

CRC can be computed in hardware using a Linear Feedback Shift Register (LFSR). For a generator of degree r, you need r flip-flops and a few XOR gates.

LFSR for G(x) = x³ + x + 1

Input
bit

→

⊕

→

R₂

→

⊕

→

R₁

→

R₀

→ feedback

XOR gates are placed where the generator polynomial has coefficient 1 (at x¹ and x⁰). Feedback from R₀ loops back to the XOR gates.

Advantages

Processes 1 bit per clock cycle
Constant-time, no memory needed
Trivial gate count: r flip-flops + wt(G)−1 XOR gates

CRC-32 in Hardware

32 flip-flops
14 XOR gates
At 1 GHz: processes 1 Gbit/sec
Parallel versions process multiple bits/clock

CRC in Ethernet (IEEE 802.3)

Standards Applications

Preamble
7 bytes

SFD
1 byte

Dest MAC
6 bytes

Src MAC
6 bytes

Type/Len
2 bytes

Payload
46-1500 B

FCS (CRC-32)
4 bytes

Frame Check Sequence (FCS)

CRC-32 is computed over the entire frame (destination MAC through payload)
Initial CRC register set to 0xFFFFFFFF (not zero!)
Final CRC is complemented before appending
Receiver computes CRC over entire frame including FCS
If result is the "magic" constant 0xC704DD7B, frame is valid

Every Ethernet frame on the internet is protected by CRC-32. At 10 Gbps, a NIC computes CRC-32 on approximately ~800,000 frames per second.

CRC Across Protocols

Standards Applications

Protocol / Format	CRC Type	Protected Data	Notes
Ethernet (802.3)	CRC-32	Frame data	4-byte FCS field
USB	CRC-5, CRC-16	Token/data packets	CRC-5 for tokens, CRC-16 for data
Bluetooth	CRC-24	PDU data	Adaptive frequency hopping
ZIP / gzip	CRC-32	Compressed data	Detects corruption during storage
PNG images	CRC-32	Each chunk	Per-chunk integrity
HDLC / PPP	CRC-16-CCITT	Frame contents	Used in serial links
SATA / PCIe	CRC-32	Data layer	High-speed bus integrity
iSCSI	CRC-32C	PDU data	Uses Castagnoli polynomial

CRC is so universal that modern CPUs include dedicated CRC instructions. Intel's SSE 4.2 added CRC32 in 2008, computing CRC-32C at hardware speed.

CRC vs. Checksums

Mathematics

Both CRC and checksums detect errors, but they have very different mathematical foundations and error detection capabilities.

Property	Internet Checksum	CRC-32
Algorithm	One's complement sum	Polynomial division (GF(2))
Check bits	16	32
Single-bit errors	All detected	All detected
Burst errors ≤ 16	Some missed	All detected
Byte swaps	Undetected!	Detected
Speed	Very fast	Fast (with tables/HW)
Used in	IP, TCP, UDP headers	Ethernet, storage, protocols

The Internet checksum is weak! It uses 1's complement addition, which means swapping two 16-bit words is undetectable. CRC provides far stronger guarantees but is computed at a lower layer (Ethernet) to compensate.

CRC vs. Cryptographic Hashes

Mathematics

CRC-32

Purpose: Detect accidental errors
Speed: Extremely fast (hardware support)
Output: 32 bits
Security: None — trivially forgeable
Math: Linear (GF(2) polynomial)
CRC(A ⊕ B) = CRC(A) ⊕ CRC(B)

SHA-256

Purpose: Detect tampering (integrity)
Speed: Slower (~500 MB/s in SW)
Output: 256 bits
Security: Collision-resistant
Math: Non-linear (confusion/diffusion)
No algebraic relationship between inputs/outputs

CRC is NOT a security mechanism! Because CRC is linear, an attacker can compute exactly which bits to flip in the message to produce any desired CRC value. Always use cryptographic hashes (SHA-256, BLAKE3) or MACs (HMAC) for integrity against adversaries.

Why CRC Detects But Doesn't Correct

Mathematics

CRC could theoretically correct errors (the syndrome contains information about the error), but there are fundamental and practical reasons it's used only for detection.

Why Not Correct?

Ambiguity: Many different error patterns produce the same syndrome (remainder). CRC-32 has 2³² syndromes but errors can occur in any of millions of bit positions.
Design goal: CRC is optimized for detection, not correction. The generator polynomial is chosen for maximum detection coverage.
Practical: In networking, retransmission (ARQ) is cheap. Correction adds complexity for little benefit.

When Correction Is Needed

Deep space: Reed-Solomon + convolutional (round-trip time too long for ARQ)
Storage: Reed-Solomon in CDs, DVDs, QR codes
Memory: Hamming SECDED in ECC RAM
Wireless: LDPC / turbo codes in 4G/5G

The layered approach: CRC detects at the link layer, and if an error is found, the protocol retransmits (TCP, reliable transport). This separation of concerns is both simple and effective.

Summary & Series Recap

CRC Key Takeaways

CRC uses polynomial division over GF(2) for error detection
The generator polynomial determines detection capabilities
Detects ALL burst errors up to r bits and nearly all longer ones
Efficiently implemented with shift registers or lookup tables
Used in virtually every network protocol and storage format
CRC is for detection only — not correction, not security

Series Summary

01: Foundations — Shannon, capacity, distance, bounds

02: Parity & Simple Codes — detection basics

03: Hamming Codes — first practical FEC

04: CRC — polynomial error detection

Foundations

→

Simple Codes

→

Hamming FEC

→

CRC Detection

→

BCH, RS, LDPC...

CRC & Error Detection

Roadmap

History

GF(2) Polynomials

CRC Computation

Standard CRCs

Hardware

Applications

History of CRC

The Innovation

Why CRC Won

Polynomial Arithmetic over GF(2)

GF(2) Operations

Polynomials over GF(2)

GF(2) Polynomial Operations

Addition (= XOR)

Multiplication

The CRC Concept

CRC Computation: Step by Step

CRC Verification

No Error

With Error

Generator Polynomial Properties

Must-Have Properties

Detection Guarantees

Burst Error Detection

Why CRC Excels at Bursts

CRC-32 Example

Standard CRC Polynomials

CRC-32 in Detail

Detection Capabilities

Practical Note

Python: CRC from Scratch

Python: CRC-32

Optimized: Table-Driven CRC

Hardware: Shift Register

LFSR for G(x) = x3 + x + 1

Advantages

CRC-32 in Hardware

CRC in Ethernet (IEEE 802.3)

Frame Check Sequence (FCS)

CRC Across Protocols

CRC vs. Checksums

CRC vs. Cryptographic Hashes

CRC-32

SHA-256

Why CRC Detects But Doesn't Correct

Why Not Correct?

When Correction Is Needed

Summary & Series Recap

CRC Key Takeaways

Series Summary

LFSR for G(x) = x³ + x + 1