🔐

Finite Field Arithmetic
for Cryptography

Theory, Standards, and Hardware/Software Implementation

Modern Cryptography Series · Presentation 1 of 10

Navigate: arrow keys or swipe · Press ESC for overview

Agenda

01 What is a Finite Field?

Algebraic foundations — groups, rings, fields

02 GF(p) — Prime Fields

Modular arithmetic, properties, and crypto primes

03 GF(2ⁿ) — Binary Extension Fields

Polynomial basis, irreducible polynomials, XOR arithmetic

04 Fields in Cryptographic Standards

AES, ECC curves, and why each field was chosen

05 Hardware Implementation

SoC, FPGA, software — Montgomery multiplication

06 SystemVerilog Examples

GF(2⁸) multiplier, modular arithmetic RTL

07 Python Examples

Galois field classes, ECC field arithmetic

08 Performance & Trade-offs

LUT vs. combinational vs. pipelined comparison

What is a Finite Field?

Group

(G, +)

Closure, associativity, identity, inverses

Ring

(R, +, ×)

Group under +; closure & associativity under ×; distributive law

Field

(F, +, ×)

Ring where every non-zero element has a multiplicative inverse

Finite Field

GF(pⁿ)

A field with a finite number of elements. Exists iff |F| = pⁿ (p prime)

Key Properties

Exists uniquely for every prime power pⁿ

Characteristic p: adding 1 to itself p times gives 0

GF(p): prime field — integers mod p

GF(pⁿ): extension field — polynomials mod irreducible poly

All arithmetic stays within the field (closure)

Every non-zero element has a multiplicative inverse

Named after Évariste Galois (1811–1832)

Foundation of AES, ECC, error-correcting codes, and more

GF(p) — Prime Fields

GF(p) = { 0, 1, 2, …, p−1 } where p is prime

All operations performed modulo p. Addition, subtraction, multiplication, and division (via modular inverse) produce results within the field.

Example: GF(7) Addition Table

+	0	1	2	3	4	5	6
0	0	1	2	3	4	5	6
1	1	2	3	4	5	6	0
2	2	3	4	5	6	0	1
3	3	4	5	6	0	1	2
4	4	5	6	0	1	2	3
5	5	6	0	1	2	3	4
6	6	0	1	2	3	4	5

Notice the wrap-around — results never exceed 6

Prime Fields in Cryptography

P-256 128-bit security
p = 2²⁵⁶ − 2²²⁴ + 2¹⁹² + 2⁹⁶ − 1
TLS, ECDSA, ECDH, FIDO2

Curve25519 ~128-bit security
p = 2²⁵⁵ − 19
Signal, WireGuard, SSH, TLS 1.3

P-384 192-bit security
p = 2³⁸⁴ − 2¹²⁸ − 2⁹⁶ + 2³² − 1
Suite B, high-security TLS

P-521 ~260-bit security
p = 2⁵²¹ − 1 (Mersenne prime)

Curve448 224-bit security
p = 2⁴⁴⁸ − 2²²⁴ − 1 (Goldilocks prime)

GF(2ⁿ) — Binary Extension Fields

Elements are polynomials over GF(2) with binary coefficients {0, 1}
Arithmetic mod an irreducible polynomial of degree n. Addition = XOR, Multiplication = polynomial multiply + reduce.

🔒 GF(2⁸) — The AES Field

Irreducible polynomial (AES standard):

m(x) = x⁸ + x⁴ + x³ + x + 1 (0x11B)

256 elements — each is a byte (8 bits)

Add: 0x57 ⊕ 0x83 = 0xD4

01010111 XOR 10000011 = 11010100

Mul: 0x57 × 0x83 = 0xC1 (mod m(x))

Polynomial multiply, then reduce mod x⁸+x⁴+x³+x+1

Why Binary Extension Fields?

Addition = XOR → no carry propagation, single gate per bit

Maps naturally to digital hardware (bits = coefficients)

Constant-time operations → side-channel resistant

Efficient in HW (LUT, shift-XOR) and SW (bitwise ops)

Used in AES, GCM, CRC, Reed-Solomon, McEliece

The xtime operation (multiply by x) is a left-shift with conditional XOR — the fundamental building block of GF(2ⁿ) hardware.

Finite Fields in Cryptographic Standards

Standard	Field	Why This Field?	Used For
AES (Rijndael)	GF(2⁸) m(x)=x⁸+x⁴+x³+x+1	Byte-oriented; XOR = no carry chain; efficient S-box via multiplicative inverse	Symmetric encryption (TLS, disk, VPN)
AES-GCM	GF(2¹²⁸) m(x)=x¹²⁸+x⁷+x²+x+1	GHASH authentication tag; 128-bit block matches AES; carry-less multiply (PCLMULQDQ)	Authenticated encryption (HTTPS, IPsec)
NIST P-256	GF(p), p=2²⁵⁶−2²²⁴+2¹⁹²+2⁹⁶−1	Pseudo-Mersenne prime → fast reduction via shift/add; 128-bit security	TLS certificates, code signing, FIDO2
Curve25519	GF(p), p=2²⁵⁵−19	Near-Mersenne prime → single-word reduction; constant-time ladder; cofactor 8	Signal, WireGuard, SSH, TLS 1.3
Curve448	GF(p), p=2⁴⁴⁸−2²²⁴−1	Goldilocks prime: efficient Karatsuba; 224-bit security	High-security TLS, NIST SP 800-186
SM2	GF(p), 256-bit	National standard prime; similar structure to P-256	Chinese gov encryption/signatures

Why Were These Fields Chosen?

🔒 AES: Why GF(2⁸)?

Byte-oriented design
Each byte = one element of GF(2⁸). Maps perfectly to 8-bit processors and memory widths.

S-Box = Multiplicative Inverse
Computes multiplicative inverse in GF(2⁸) followed by affine transform. Provides optimal non-linearity against differential/linear cryptanalysis.

MixColumns = Matrix over GF(2⁸)
Column polynomial multiplied by fixed polynomial c(x) mod x⁴+1. Provides diffusion.

No carry propagation
Addition is XOR — constant-time, no carries, no branching.

🔑 ECC: Why Pseudo-Mersenne Primes?

Fast modular reduction
For p = 2²⁵⁵−19, reducing mod p needs only multiply by 19 and add — no division. For P-256, reduction uses shifts and additions.

Constant-time implementation
Curve25519 uses a Montgomery ladder: every scalar bit executes identical operations. The prime allows this with efficient limb arithmetic.

Security margin
P-256 → 128-bit; P-384 → 192-bit; P-521 → ~260-bit. Curve25519 achieves ~128-bit with superior performance.

Rigidity / Verifiability
Curve25519: smallest p ≡ 5 (mod 8) near 2²⁵⁶. Transparent, auditable parameter choices.

Hardware Implementation of FF Arithmetic

⚙ SoC / ASIC

Dedicated crypto accelerator blocks (AES-NI in x86)
Composite field: GF(2⁸) → GF(((2²)²)²) reduces gates ~20%
Montgomery multipliers with systolic arrays for ECC
Side-channel: masking, dual-rail logic
Intel 45nm: 53 Gbps native GF((2⁴)²) AES

🔧 FPGA

LUT-based S-box: ~32 LUTs per byte on Xilinx 7-series
Composite field S-box: Canright's method for compact design
Pipelined: 7+ Gbps on Virtex-E
Block RAM for precomputed tables (256×8 AES S-box)
DSP slices for large-field Montgomery multiply

💻 Software

Table-driven: 256-entry lookup for GF(2⁸) (cache-timing risk)
Bitsliced: 64 blocks in parallel via 64-bit registers
SIMD: AVX2/NEON for parallel field ops
PCLMULQDQ for GF(2¹²⁸) in AES-GCM
Limb arithmetic: 5×51-bit for Curve25519

Montgomery Multiplication for Hardware

Core Idea

Replace division by N with division by R (a power of 2)

MonPro(ã, b̃, N):
  R  = 2ⁿ    (n = bit-width of N)
  N' = -N⁻¹ mod R

  t = ã × b̃
  m = t × N' mod R      ← just mask lower n bits
  u = (t + m × N) / R   ← just right-shift n bits
  if u ≥ N: u = u − N
  return u               (= a·b·R mod N)

mod R = mask lower n bits

÷ R = right-shift by n bits

No expensive modular division!

Hardware Architectures

Bit-Serial

1 bit/cycle. Minimal area, n+1 cycles. IoT SoCs.

Word-Serial (CIOS)

w-bit words, w×w multiplier + CSA. Best FPGA ECC trade-off.

Systolic Array

Pipelined PEs, regular data flow. High throughput. Ideal for ASIC.

Full-Parallel

1-cycle multiply. Enormous area, maximum throughput. AES-NI.

SystemVerilog: GF(2⁸) Multiplier

module gf2_8_mult (
  input  logic [7:0] a, b,
  output logic [7:0] p
);
  // AES irreducible polynomial: x^8 + x^4 + x^3 + x + 1
  localparam [8:0] POLY = 9'h11B;
  logic [7:0] partial [8];

  // Bit-serial "Russian peasant" multiply
  always_comb begin
    partial[0] = a;
    for (int i = 0; i < 8; i++) begin
      if (i == 0)
        p = b[0] ? a : 8'h00;
      else
        p = p ^ (b[i] ? partial[i] : 8'h00);

      // xtime: shift left, reduce if MSB was set
      if (i < 7)
        partial[i+1] = {partial[i][6:0], 1'b0}
                      ^ (partial[i][7] ? POLY[7:0] : 8'h00);
    end
  end
endmodule

How It Works

Russian Peasant Algorithm
Shift-and-add adapted for GF(2ⁿ). Process one bit per iteration.

xtime Operation
Left-shift by 1. If MSB was 1, XOR with reduction polynomial.

Irreducible Poly
0x11B ensures 256 unique elements in the field.

Synthesis
Fully combinational. ~100–150 gates. Can be pipelined.

Usage
AES MixColumns (×02, ×03) and S-box (inverse via Itoh-Tsujii).

SystemVerilog: Modular Arithmetic for GF(p)

module mod_add #(parameter WIDTH = 256)(
  input  logic [WIDTH-1:0] a, b, p,
  output logic [WIDTH-1:0] sum
);
  logic [WIDTH:0] temp;  // 1 extra bit for overflow
  always_comb begin
    temp = a + b;
    sum  = (temp >= {1'b0, p}) ? temp - p
                               : temp[WIDTH-1:0];
  end
endmodule

module mod_sub #(parameter WIDTH = 256)(
  input  logic [WIDTH-1:0] a, b, p,
  output logic [WIDTH-1:0] diff
);
  always_comb begin
    if (a >= b)
      diff = a - b;
    else
      diff = p - (b - a);  // wrap around via p
  end
endmodule

Design Notes

Parameterized
Same module for 255-bit (Curve25519), 256-bit (P-256), 384-bit (P-384).

Conditional Reduction
a+b < 2p, so one subtraction suffices.

Carry Bit
Extra bit catches overflow when a+b > 2^WIDTH.

⚠ Side-Channel Warning
Conditional subtract creates data-dependent timing. Production crypto must use constant-time MUX selection.

Python: GF(2⁸) for AES

class GF2_8:
    """Galois Field GF(2^8) with AES polynomial."""
    POLY = 0x11B  # x^8 + x^4 + x^3 + x + 1

    def __init__(self, val):
        self.val = val & 0xFF

    def __add__(self, other):        # Addition = XOR
        return GF2_8(self.val ^ other.val)

    def __mul__(self, other):        # Russian peasant multiply
        a, b, p = self.val, other.val, 0
        for _ in range(8):
            if b & 1:  p ^= a
            hi = a & 0x80
            a = (a << 1) & 0xFF
            if hi:     a ^= self.POLY & 0xFF
            b >>= 1
        return GF2_8(p)

    def inverse(self):               # Fermat: a^(2^8 - 2) = a^254
        r = GF2_8(1)
        base = GF2_8(self.val)
        for bit in bin(254)[2:]:
            r = r * r
            if bit == '1':  r = r * base
        return r

# ── Usage ──
a, b = GF2_8(0x57), GF2_8(0x83)
print(f"0x57 * 0x83 = 0x{(a * b).val:02X}")       # 0xC1
print(f"inv(0x57)   = 0x{a.inverse().val:02X}")     # 0xFE
print(f"0x57 * inv  = 0x{(a * a.inverse()).val:02X}") # 0x01 ✓

Key Points

XOR Addition
No carry chain — just bitwise XOR. Subtraction is identical.

Shift-and-XOR Multiply
Mirrors the hardware: shift left, conditionally XOR with polynomial.

Fermat Inverse
a⁻¹ = a^pⁿ−2 by Fermat's little theorem. For GF(2⁸): a²⁵⁴.

No Dependencies
Pure Python — no external libraries needed.

Python: GF(p) for Elliptic Curves

class GFp:
    """Prime field GF(p) arithmetic."""
    def __init__(self, val, p):
        self.val = val % p
        self.p = p

    def __add__(self, o):  return GFp(self.val + o.val, self.p)
    def __sub__(self, o):  return GFp(self.val - o.val, self.p)
    def __mul__(self, o):  return GFp(self.val * o.val, self.p)
    def __truediv__(self, o): return self * o.inverse()
    def __eq__(self, o):   return self.val == o.val
    def __repr__(self):    return f"GFp({self.val})"

    def inverse(self):     # Fermat's little theorem
        return GFp(pow(self.val, self.p - 2, self.p), self.p)

# ── Curve25519 prime field ──
p25519 = 2**255 - 19
a = GFp(42, p25519)
b = GFp(17, p25519)
assert a * b / a == b          # field division works ✓
assert a * a.inverse() == GFp(1, p25519)  # inverse ✓

# ── P-256 prime field ──
p256 = 2**256 - 2**224 + 2**192 + 2**96 - 1
x = GFp(123456789, p256)
assert x * x.inverse() == GFp(1, p256)    # ✓

Key Points

Modular Arithmetic
Python's % operator handles the reduction. pow(a, p-2, p) uses fast modular exponentiation.

Division = Multiply by Inverse
a / b = a × b⁻¹, where b⁻¹ = b^p−2 mod p.

Big Integer Support
Python natively handles 256-bit and 521-bit integers with arbitrary precision.

⚠ Not constant-time
Python's bigint operations are variable-time. Use libraries like pynacl or cryptography for production.

Performance & Trade-offs

AES-128 Throughput by Implementation

SW (OpenSSL)

3.2

FPGA LUT

5.5

FPGA Composite

7.0

FPGA Pipelined

21.5

ASIC (45nm)

53.0

Gbps · Sources: Canright (2005), McCanny (2003), Mathew/Intel (2011)

Implementation Trade-offs

Approach	Area	Speed	Side-Ch.
LUT / Table	High (RAM)	Moderate	Risk
Composite Field	Low (~80 gates)	Moderate	Good
Pipelined	Medium	Very High	Good
Bit-sliced (SW)	N/A	High	Excellent
Full-parallel	Very High	Maximum	Excellent

Key Takeaways

1 Finite fields are the algebraic foundation of nearly all modern cryptography — from AES to ECC to post-quantum schemes.

2 Two families dominate: GF(2ⁿ) for symmetric ciphers and GF(p) for public-key / elliptic curve cryptography.

3 Field parameters are chosen for both security and efficiency — Mersenne primes, carry-free XOR, structured reduction.

4 Hardware ranges from minimal bit-serial designs (IoT) to 53+ Gbps ASIC accelerators (data centers).

5 Side-channel resistance is as important as raw performance — constant-time code is essential.

References: FIPS 197 (AES) · NIST SP 800-186 (ECC) · RFC 7748 (Curve25519/448) · Canright, "A Very Compact Rijndael S-box"

Modern Cryptography Series

Finite Field Arithmetic

This presentation

Elliptic Curve Cryptography

Coming next

AES — Design & Implementation

Planned

Hash Functions & MACs

Planned

Public Key Cryptography

Planned

Digital Signatures (ECDSA, EdDSA)

Planned

Post-Quantum Cryptography

Planned

Key Exchange Protocols

Planned

Side-Channel Attacks

Planned

Crypto HW Accelerator Design

Planned

github.com/modern-cryptography | Contributions welcome!

Finite Field Arithmeticfor Cryptography

Agenda

01 What is a Finite Field?

02 GF(p) — Prime Fields

03 GF(2n) — Binary Extension Fields

04 Fields in Cryptographic Standards

05 Hardware Implementation

06 SystemVerilog Examples

07 Python Examples

08 Performance & Trade-offs

What is a Finite Field?

Key Properties

GF(p) — Prime Fields

Example: GF(7) Addition Table

Prime Fields in Cryptography

GF(2n) — Binary Extension Fields

🔒 GF(28) — The AES Field

Why Binary Extension Fields?

Finite Fields in Cryptographic Standards

Why Were These Fields Chosen?

🔒 AES: Why GF(28)?

🔑 ECC: Why Pseudo-Mersenne Primes?

Hardware Implementation of FF Arithmetic

⚙ SoC / ASIC

🔧 FPGA

💻 Software

Montgomery Multiplication for Hardware

Core Idea

Hardware Architectures

Bit-Serial

Word-Serial (CIOS)

Systolic Array

Full-Parallel

SystemVerilog: GF(28) Multiplier

How It Works

SystemVerilog: Modular Arithmetic for GF(p)

Design Notes

Python: GF(28) for AES

Key Points

Python: GF(p) for Elliptic Curves

Key Points

Performance & Trade-offs

AES-128 Throughput by Implementation

Implementation Trade-offs

Key Takeaways

Modern Cryptography Series

Finite Field Arithmetic
for Cryptography

03 GF(2ⁿ) — Binary Extension Fields

GF(2ⁿ) — Binary Extension Fields

🔒 GF(2⁸) — The AES Field

🔒 AES: Why GF(2⁸)?

SystemVerilog: GF(2⁸) Multiplier

Python: GF(2⁸) for AES