Coding Theory Series — 04

CRC & Error Detection

Polynomial arithmetic for fast, reliable error detection in every network packet

Mathematics Code Standards Applications

Roadmap

History

Peterson's 1961 paper and the origins of CRC

GF(2) Polynomials

The mathematical foundation: arithmetic mod 2

CRC Computation

Polynomial long division step by step

Standard CRCs

CRC-8, CRC-16, CRC-32, CRC-64 and their generators

Hardware

Shift register implementation for blazing-fast CRC

Applications

Ethernet, USB, ZIP, and beyond

History of CRC

History

The Cyclic Redundancy Check was formalized by W. Wesley Peterson in his 1961 paper "Cyclic Codes for Error Detection", published in the Proceedings of the IRE.

The Innovation

Peterson showed that cyclic codes — codes where any cyclic shift of a codeword is also a codeword — could be efficiently computed using polynomial division and implemented with simple shift register circuits.

Why CRC Won

  • Speed: computable with XOR and shift operations
  • Hardware: trivial to implement in circuits
  • Burst detection: excellent at catching burst errors
  • Standardized: widely adopted across all protocols
CRC has been the dominant error detection mechanism in digital communications for over 60 years. Every Ethernet frame, USB packet, and ZIP file uses CRC.

Polynomial Arithmetic over GF(2)

Mathematics

GF(2) is the finite field with two elements: {0, 1}. Arithmetic is modular: addition and subtraction are both XOR. Multiplication is AND.

GF(2) Operations

+01×01
001000
110101

Key: 1 + 1 = 0 (no carry!), and subtraction = addition

Polynomials over GF(2)

Binary strings map to polynomials:

1011 → x3 + x + 1
11001 → x4 + x3 + 1
10111 → x4 + x2 + x + 1

Why GF(2)? Over GF(2), polynomial addition is XOR (bitwise), and polynomial division reduces to repeated XOR with shifts. No multiplication or division of integers — just bit operations.

GF(2) Polynomial Operations

Mathematics

Addition (= XOR)

  (x3 + x + 1) + (x3 + x2 + 1)
= x2 + x

Bitwise: 1011 ⊕ 1101 = 0110

Note: x3 + x3 = 0 and 1 + 1 = 0 (they cancel!)

Multiplication

  (x2 + 1) × (x + 1)
= x3 + x2 + x + 1

(101) × (11) = 1111

Multiply like normal polynomials, then reduce coefficients mod 2

Key identity:   a + a = 0   for all a ∈ GF(2)
Therefore subtraction = addition = XOR
No carries, no borrows. This is what makes GF(2) arithmetic so hardware-friendly. Every operation reduces to shifts and XORs.

The CRC Concept

Mathematics

CRC treats the message as a polynomial M(x) and divides it by a fixed generator polynomial G(x). The remainder is the CRC checksum.

Message
M(x)
Shift left
M(x) · xr
Divide by G(x)
mod G(x)
Remainder
R(x) = CRC
Transmitted: T(x) = M(x) · xr + R(x)
where R(x) = M(x) · xr mod G(x)   and   r = deg(G)
Verification: The receiver divides T(x) by G(x). If the remainder is 0, no error detected. If non-zero, an error occurred. This works because T(x) is constructed to be exactly divisible by G(x).

CRC Computation: Step by Step

Mathematics Interactive

Message: 1101011   Generator: G(x) = x3 + x + 1 = 1011

Step 1: Append r = 3 zeros to message: 1101011000

Step 2: Perform polynomial long division (XOR division):

  1101011000  ÷  1011
  1011
  ----
  0110011000
   1011
   ----
   011111000
    1011
    ----
    01001000
     1011
     ----
     0010000
      1011
      ----
      00110
            
Remainder R(x) = 110  →  Transmit: 1101011110

CRC Verification

Mathematics

The receiver divides the received message (with CRC appended) by the same generator polynomial G(x).

No Error

Received: 1101011110

Divide by 1011:

Remainder = 000

Result: PASS

With Error

Received: 1100011110 (bit 4 flipped)

Divide by 1011:

Remainder = non-zero

Result: FAIL — error detected!

If received R(x) = T(x) + E(x), then:
R(x) mod G(x) = E(x) mod G(x)
Error undetected only if G(x) divides E(x)
CRC fails to detect an error pattern E(x) only when E(x) is a multiple of G(x). By choosing G(x) carefully, we can make this extremely unlikely for common error patterns.

Generator Polynomial Properties

Mathematics

The choice of generator polynomial G(x) determines which errors CRC can and cannot detect. Good generators have specific algebraic properties.

Must-Have Properties

  • G(x) has (x+1) as factor:
    Detects all odd-weight errors
  • G(x) is irreducible over GF(2):
    Maximum-length sequence detection
  • Degree r: produces r-bit CRC value

Detection Guarantees

  • All single-bit errors (if xr + ... + 1)
  • All double-bit errors (if G has appropriate period)
  • All odd numbers of bit errors (if (x+1) | G(x))
  • All burst errors of length ≤ r
  • Fraction 1 − 2−r of longer bursts
Choosing poorly: A generator like G(x) = xr (just a shift) would be useless — it only checks if the last r bits are zero. Real generators must be carefully selected for maximum error coverage.

Burst Error Detection

Mathematics

A burst error of length b is an error pattern where the first and last erroneous bits are b positions apart: E(x) = xi(xb-1 + ... + 1).

A CRC with generator G(x) of degree r detects:
ALL burst errors of length ≤ r
• Fraction (1 − 2−(r−1)) of bursts of length r + 1
• Fraction (1 − 2−r) of bursts of length > r + 1

Why CRC Excels at Bursts

Burst errors are the most common type in practice (noise affects consecutive bits). CRC is specifically designed to catch them because:

A burst of length ≤ r corresponds to a polynomial of degree < r. Since G(x) has degree r, it cannot divide any polynomial of degree < r.

CRC-32 Example

With r = 32:

  • Detects ALL bursts ≤ 32 bits
  • Misses only 1 in 231 bursts of length 33
  • Misses only 1 in 232 bursts of length > 33
  • That's a miss rate of ~0.00000005%

Standard CRC Polynomials

Standards
NamePolynomialHexUsed In
CRC-8 x8 + x2 + x + 1 0x07 ATM headers, I2C
CRC-16-CCITT x16 + x12 + x5 + 1 0x1021 Bluetooth, HDLC, X.25
CRC-16-IBM x16 + x15 + x2 + 1 0x8005 USB, Modbus
CRC-32 x32 + x26 + x23 + x22 + x16 + x12 + x11 + x10 + x8 + x7 + x5 + x4 + x2 + x + 1 0x04C11DB7 Ethernet, ZIP, PNG, MPEG-2
CRC-64-ECMA x64 + x62 + x57 + x55 + ... 0x42F0E1EBA9EA3693 XZ, HDFS
CRC-32 is the most widely used. Its polynomial was carefully chosen by committee to maximize error detection for common frame sizes in networking (up to ~12,000 bits).

CRC-32 in Detail

Standards Mathematics

CRC-32 (IEEE 802.3 / ITU-T V.42) is defined by the generator polynomial:

G(x) = x32 + x26 + x23 + x22 + x16 + x12 + x11 + x10 + x8 + x7 + x5 + x4 + x2 + x + 1
32
Check bits
appended to data
15
Non-zero terms
in generator
232
Possible CRC values
(~4.3 billion)

Detection Capabilities

  • All single-bit errors
  • All double-bit errors (for messages up to 231−1 bits)
  • All odd numbers of bit errors
  • All burst errors up to 32 bits long

Practical Note

In Ethernet, CRC-32 is computed with all bits complemented initially, and the final CRC is also complemented. This catches the edge case of inserted/deleted zeros.

Python: CRC from Scratch

Code
def crc_remainder(data: str, generator: str) -> str:
    """Compute CRC remainder using binary polynomial division.

    Args:
        data: Binary string of the message
        generator: Binary string of the generator polynomial
    Returns:
        Binary string of the CRC remainder
    """
    gen_len = len(generator)
    # Append zeros (degree of generator - 1)
    padded = list(data + '0' * (gen_len - 1))

    for i in range(len(data)):
        if padded[i] == '1':  # Only XOR when leading bit is 1
            for j in range(gen_len):
                padded[i + j] = str(int(padded[i + j]) ^ int(generator[j]))

    # Remainder is the last (gen_len - 1) bits
    return ''.join(padded[-(gen_len - 1):])

def crc_check(data_with_crc: str, generator: str) -> bool:
    """Verify CRC by checking if remainder is zero."""
    return all(b == '0' for b in crc_remainder(data_with_crc, generator))

# Example
message = "1101011"
generator = "1011"  # x^3 + x + 1
crc = crc_remainder(message, generator)
print(f"CRC: {crc}")  # "110"
transmitted = message + crc
print(f"Check: {crc_check(transmitted, generator)}")  # True

Python: CRC-32

Code
def crc32(data: bytes) -> int:
    """Compute CRC-32 (IEEE 802.3) from scratch."""
    # CRC-32 polynomial (reversed/reflected representation)
    POLY = 0xEDB88320
    crc = 0xFFFFFFFF  # Initial value (all 1s)

    for byte in data:
        crc ^= byte
        for _ in range(8):  # Process each bit
            if crc & 1:  # LSB is 1
                crc = (crc >> 1) ^ POLY
            else:
                crc >>= 1

    return crc ^ 0xFFFFFFFF  # Final XOR

# Verify against Python's built-in
import binascii

message = b"Hello, World!"
our_crc = crc32(message)
lib_crc = binascii.crc32(message) & 0xFFFFFFFF

print(f"Our CRC-32:  0x{our_crc:08X}")
print(f"Lib CRC-32:  0x{lib_crc:08X}")
print(f"Match: {our_crc == lib_crc}")  # True
The "reflected" form processes bits LSB-first, which matches hardware shift register behavior. The initial value 0xFFFFFFFF and final XOR catch edge cases like leading zeros.

Optimized: Table-Driven CRC

Code

Processing bit-by-bit is slow. A lookup table of 256 pre-computed CRC values lets us process a full byte at a time — an 8× speedup.

def make_crc32_table():
    """Pre-compute CRC-32 lookup table (256 entries)."""
    POLY = 0xEDB88320
    table = []
    for i in range(256):
        crc = i
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ POLY
            else:
                crc >>= 1
        table.append(crc)
    return table

CRC32_TABLE = make_crc32_table()

def crc32_fast(data: bytes) -> int:
    """Table-driven CRC-32 (byte-at-a-time)."""
    crc = 0xFFFFFFFF
    for byte in data:
        index = (crc ^ byte) & 0xFF
        crc = (crc >> 8) ^ CRC32_TABLE[index]
    return crc ^ 0xFFFFFFFF

Further optimization: Modern implementations use 4KB or 16KB "slicing-by-4" / "slicing-by-8" tables to process 4 or 8 bytes at once. Intel/AMD CPUs have dedicated CRC32 instructions (SSE 4.2).

Hardware: Shift Register

Applications Mathematics

CRC can be computed in hardware using a Linear Feedback Shift Register (LFSR). For a generator of degree r, you need r flip-flops and a few XOR gates.

LFSR for G(x) = x3 + x + 1

Input
bit
R2
R1
R0
→ feedback

XOR gates are placed where the generator polynomial has coefficient 1 (at x1 and x0). Feedback from R0 loops back to the XOR gates.

Advantages

  • Processes 1 bit per clock cycle
  • Constant-time, no memory needed
  • Trivial gate count: r flip-flops + wt(G)−1 XOR gates

CRC-32 in Hardware

  • 32 flip-flops
  • 14 XOR gates
  • At 1 GHz: processes 1 Gbit/sec
  • Parallel versions process multiple bits/clock

CRC in Ethernet (IEEE 802.3)

Standards Applications
Preamble
7 bytes
SFD
1 byte
Dest MAC
6 bytes
Src MAC
6 bytes
Type/Len
2 bytes
Payload
46-1500 B
FCS (CRC-32)
4 bytes

Frame Check Sequence (FCS)

  • CRC-32 is computed over the entire frame (destination MAC through payload)
  • Initial CRC register set to 0xFFFFFFFF (not zero!)
  • Final CRC is complemented before appending
  • Receiver computes CRC over entire frame including FCS
  • If result is the "magic" constant 0xC704DD7B, frame is valid
Every Ethernet frame on the internet is protected by CRC-32. At 10 Gbps, a NIC computes CRC-32 on approximately ~800,000 frames per second.

CRC Across Protocols

Standards Applications
Protocol / FormatCRC TypeProtected DataNotes
Ethernet (802.3)CRC-32Frame data4-byte FCS field
USBCRC-5, CRC-16Token/data packetsCRC-5 for tokens, CRC-16 for data
BluetoothCRC-24PDU dataAdaptive frequency hopping
ZIP / gzipCRC-32Compressed dataDetects corruption during storage
PNG imagesCRC-32Each chunkPer-chunk integrity
HDLC / PPPCRC-16-CCITTFrame contentsUsed in serial links
SATA / PCIeCRC-32Data layerHigh-speed bus integrity
iSCSICRC-32CPDU dataUses Castagnoli polynomial
CRC is so universal that modern CPUs include dedicated CRC instructions. Intel's SSE 4.2 added CRC32 in 2008, computing CRC-32C at hardware speed.

CRC vs. Checksums

Mathematics

Both CRC and checksums detect errors, but they have very different mathematical foundations and error detection capabilities.

PropertyInternet ChecksumCRC-32
AlgorithmOne's complement sumPolynomial division (GF(2))
Check bits1632
Single-bit errorsAll detectedAll detected
Burst errors ≤ 16Some missedAll detected
Byte swapsUndetected!Detected
SpeedVery fastFast (with tables/HW)
Used inIP, TCP, UDP headersEthernet, storage, protocols
The Internet checksum is weak! It uses 1's complement addition, which means swapping two 16-bit words is undetectable. CRC provides far stronger guarantees but is computed at a lower layer (Ethernet) to compensate.

CRC vs. Cryptographic Hashes

Mathematics

CRC-32

  • Purpose: Detect accidental errors
  • Speed: Extremely fast (hardware support)
  • Output: 32 bits
  • Security: None — trivially forgeable
  • Math: Linear (GF(2) polynomial)
  • CRC(A ⊕ B) = CRC(A) ⊕ CRC(B)

SHA-256

  • Purpose: Detect tampering (integrity)
  • Speed: Slower (~500 MB/s in SW)
  • Output: 256 bits
  • Security: Collision-resistant
  • Math: Non-linear (confusion/diffusion)
  • No algebraic relationship between inputs/outputs
CRC is NOT a security mechanism! Because CRC is linear, an attacker can compute exactly which bits to flip in the message to produce any desired CRC value. Always use cryptographic hashes (SHA-256, BLAKE3) or MACs (HMAC) for integrity against adversaries.

Why CRC Detects But Doesn't Correct

Mathematics

CRC could theoretically correct errors (the syndrome contains information about the error), but there are fundamental and practical reasons it's used only for detection.

Why Not Correct?

  • Ambiguity: Many different error patterns produce the same syndrome (remainder). CRC-32 has 232 syndromes but errors can occur in any of millions of bit positions.
  • Design goal: CRC is optimized for detection, not correction. The generator polynomial is chosen for maximum detection coverage.
  • Practical: In networking, retransmission (ARQ) is cheap. Correction adds complexity for little benefit.

When Correction Is Needed

  • Deep space: Reed-Solomon + convolutional (round-trip time too long for ARQ)
  • Storage: Reed-Solomon in CDs, DVDs, QR codes
  • Memory: Hamming SECDED in ECC RAM
  • Wireless: LDPC / turbo codes in 4G/5G
The layered approach: CRC detects at the link layer, and if an error is found, the protocol retransmits (TCP, reliable transport). This separation of concerns is both simple and effective.

Summary & Series Recap

CRC Key Takeaways

  • CRC uses polynomial division over GF(2) for error detection
  • The generator polynomial determines detection capabilities
  • Detects ALL burst errors up to r bits and nearly all longer ones
  • Efficiently implemented with shift registers or lookup tables
  • Used in virtually every network protocol and storage format
  • CRC is for detection only — not correction, not security

Series Summary

01: Foundations — Shannon, capacity, distance, bounds

02: Parity & Simple Codes — detection basics

03: Hamming Codes — first practical FEC

04: CRC — polynomial error detection

Foundations
Simple Codes
Hamming FEC
CRC Detection
BCH, RS, LDPC...