Navigate: arrow keys or swipe · Press ESC for overview
Algebraic foundations — groups, rings, fields
Modular arithmetic, properties, and crypto primes
Polynomial basis, irreducible polynomials, XOR arithmetic
AES, ECC curves, and why each field was chosen
SoC, FPGA, software — Montgomery multiplication
GF(28) multiplier, modular arithmetic RTL
Galois field classes, ECC field arithmetic
LUT vs. combinational vs. pipelined comparison
| + | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|---|
| 0 | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
| 1 | 1 | 2 | 3 | 4 | 5 | 6 | 0 |
| 2 | 2 | 3 | 4 | 5 | 6 | 0 | 1 |
| 3 | 3 | 4 | 5 | 6 | 0 | 1 | 2 |
| 4 | 4 | 5 | 6 | 0 | 1 | 2 | 3 |
| 5 | 5 | 6 | 0 | 1 | 2 | 3 | 4 |
| 6 | 6 | 0 | 1 | 2 | 3 | 4 | 5 |
Notice the wrap-around — results never exceed 6
Irreducible polynomial (AES standard):
m(x) = x8 + x4 + x3 + x + 1 (0x11B)
256 elements — each is a byte (8 bits)
The xtime operation (multiply by x) is a left-shift with conditional XOR — the fundamental building block of GF(2n) hardware.
| Standard | Field | Why This Field? | Used For |
|---|---|---|---|
| AES (Rijndael) | GF(28) m(x)=x8+x4+x3+x+1 | Byte-oriented; XOR = no carry chain; efficient S-box via multiplicative inverse | Symmetric encryption (TLS, disk, VPN) |
| AES-GCM | GF(2128) m(x)=x128+x7+x2+x+1 | GHASH authentication tag; 128-bit block matches AES; carry-less multiply (PCLMULQDQ) | Authenticated encryption (HTTPS, IPsec) |
| NIST P-256 | GF(p), p=2256−2224+2192+296−1 | Pseudo-Mersenne prime → fast reduction via shift/add; 128-bit security | TLS certificates, code signing, FIDO2 |
| Curve25519 | GF(p), p=2255−19 | Near-Mersenne prime → single-word reduction; constant-time ladder; cofactor 8 | Signal, WireGuard, SSH, TLS 1.3 |
| Curve448 | GF(p), p=2448−2224−1 | Goldilocks prime: efficient Karatsuba; 224-bit security | High-security TLS, NIST SP 800-186 |
| SM2 | GF(p), 256-bit | National standard prime; similar structure to P-256 | Chinese gov encryption/signatures |
Byte-oriented design
Each byte = one element of GF(28). Maps perfectly to 8-bit processors and memory widths.
S-Box = Multiplicative Inverse
Computes multiplicative inverse in GF(28) followed by affine transform. Provides optimal non-linearity against differential/linear cryptanalysis.
MixColumns = Matrix over GF(28)
Column polynomial multiplied by fixed polynomial c(x) mod x4+1. Provides diffusion.
No carry propagation
Addition is XOR — constant-time, no carries, no branching.
Fast modular reduction
For p = 2255−19, reducing mod p needs only multiply by 19 and add — no division. For P-256, reduction uses shifts and additions.
Constant-time implementation
Curve25519 uses a Montgomery ladder: every scalar bit executes identical operations. The prime allows this with efficient limb arithmetic.
Security margin
P-256 → 128-bit; P-384 → 192-bit; P-521 → ~260-bit. Curve25519 achieves ~128-bit with superior performance.
Rigidity / Verifiability
Curve25519: smallest p ≡ 5 (mod 8) near 2256. Transparent, auditable parameter choices.
Replace division by N with division by R (a power of 2)
MonPro(ã, b̃, N):
R = 2ⁿ (n = bit-width of N)
N' = -N⁻¹ mod R
t = ã × b̃
m = t × N' mod R ← just mask lower n bits
u = (t + m × N) / R ← just right-shift n bits
if u ≥ N: u = u − N
return u (= a·b·R mod N)
mod R = mask lower n bits÷ R = right-shift by n bits1 bit/cycle. Minimal area, n+1 cycles. IoT SoCs.
w-bit words, w×w multiplier + CSA. Best FPGA ECC trade-off.
Pipelined PEs, regular data flow. High throughput. Ideal for ASIC.
1-cycle multiply. Enormous area, maximum throughput. AES-NI.
module gf2_8_mult (
input logic [7:0] a, b,
output logic [7:0] p
);
// AES irreducible polynomial: x^8 + x^4 + x^3 + x + 1
localparam [8:0] POLY = 9'h11B;
logic [7:0] partial [8];
// Bit-serial "Russian peasant" multiply
always_comb begin
partial[0] = a;
for (int i = 0; i < 8; i++) begin
if (i == 0)
p = b[0] ? a : 8'h00;
else
p = p ^ (b[i] ? partial[i] : 8'h00);
// xtime: shift left, reduce if MSB was set
if (i < 7)
partial[i+1] = {partial[i][6:0], 1'b0}
^ (partial[i][7] ? POLY[7:0] : 8'h00);
end
end
endmodule
Russian Peasant Algorithm
Shift-and-add adapted for GF(2n). Process one bit per iteration.
xtime Operation
Left-shift by 1. If MSB was 1, XOR with reduction polynomial.
Irreducible Poly
0x11B ensures 256 unique elements in the field.
Synthesis
Fully combinational. ~100–150 gates. Can be pipelined.
Usage
AES MixColumns (×02, ×03) and S-box (inverse via Itoh-Tsujii).
module mod_add #(parameter WIDTH = 256)(
input logic [WIDTH-1:0] a, b, p,
output logic [WIDTH-1:0] sum
);
logic [WIDTH:0] temp; // 1 extra bit for overflow
always_comb begin
temp = a + b;
sum = (temp >= {1'b0, p}) ? temp - p
: temp[WIDTH-1:0];
end
endmodule
module mod_sub #(parameter WIDTH = 256)(
input logic [WIDTH-1:0] a, b, p,
output logic [WIDTH-1:0] diff
);
always_comb begin
if (a >= b)
diff = a - b;
else
diff = p - (b - a); // wrap around via p
end
endmodule
Parameterized
Same module for 255-bit (Curve25519), 256-bit (P-256), 384-bit (P-384).
Conditional Reduction
a+b < 2p, so one subtraction suffices.
Carry Bit
Extra bit catches overflow when a+b > 2WIDTH.
⚠ Side-Channel Warning
Conditional subtract creates data-dependent timing. Production crypto must use constant-time MUX selection.
class GF2_8:
"""Galois Field GF(2^8) with AES polynomial."""
POLY = 0x11B # x^8 + x^4 + x^3 + x + 1
def __init__(self, val):
self.val = val & 0xFF
def __add__(self, other): # Addition = XOR
return GF2_8(self.val ^ other.val)
def __mul__(self, other): # Russian peasant multiply
a, b, p = self.val, other.val, 0
for _ in range(8):
if b & 1: p ^= a
hi = a & 0x80
a = (a << 1) & 0xFF
if hi: a ^= self.POLY & 0xFF
b >>= 1
return GF2_8(p)
def inverse(self): # Fermat: a^(2^8 - 2) = a^254
r = GF2_8(1)
base = GF2_8(self.val)
for bit in bin(254)[2:]:
r = r * r
if bit == '1': r = r * base
return r
# ── Usage ──
a, b = GF2_8(0x57), GF2_8(0x83)
print(f"0x57 * 0x83 = 0x{(a * b).val:02X}") # 0xC1
print(f"inv(0x57) = 0x{a.inverse().val:02X}") # 0xFE
print(f"0x57 * inv = 0x{(a * a.inverse()).val:02X}") # 0x01 ✓
XOR Addition
No carry chain — just bitwise XOR. Subtraction is identical.
Shift-and-XOR Multiply
Mirrors the hardware: shift left, conditionally XOR with polynomial.
Fermat Inverse
a−1 = apn−2 by Fermat's little theorem. For GF(28): a254.
No Dependencies
Pure Python — no external libraries needed.
class GFp:
"""Prime field GF(p) arithmetic."""
def __init__(self, val, p):
self.val = val % p
self.p = p
def __add__(self, o): return GFp(self.val + o.val, self.p)
def __sub__(self, o): return GFp(self.val - o.val, self.p)
def __mul__(self, o): return GFp(self.val * o.val, self.p)
def __truediv__(self, o): return self * o.inverse()
def __eq__(self, o): return self.val == o.val
def __repr__(self): return f"GFp({self.val})"
def inverse(self): # Fermat's little theorem
return GFp(pow(self.val, self.p - 2, self.p), self.p)
# ── Curve25519 prime field ──
p25519 = 2**255 - 19
a = GFp(42, p25519)
b = GFp(17, p25519)
assert a * b / a == b # field division works ✓
assert a * a.inverse() == GFp(1, p25519) # inverse ✓
# ── P-256 prime field ──
p256 = 2**256 - 2**224 + 2**192 + 2**96 - 1
x = GFp(123456789, p256)
assert x * x.inverse() == GFp(1, p256) # ✓
Modular Arithmetic
Python's % operator handles the reduction. pow(a, p-2, p) uses fast modular exponentiation.
Division = Multiply by Inverse
a / b = a × b−1, where b−1 = bp−2 mod p.
Big Integer Support
Python natively handles 256-bit and 521-bit integers with arbitrary precision.
⚠ Not constant-time
Python's bigint operations are variable-time. Use libraries like pynacl or cryptography for production.
Gbps · Sources: Canright (2005), McCanny (2003), Mathew/Intel (2011)
| Approach | Area | Speed | Side-Ch. |
|---|---|---|---|
| LUT / Table | High (RAM) | Moderate | Risk |
| Composite Field | Low (~80 gates) | Moderate | Good |
| Pipelined | Medium | Very High | Good |
| Bit-sliced (SW) | N/A | High | Excellent |
| Full-parallel | Very High | Maximum | Excellent |
1 Finite fields are the algebraic foundation of nearly all modern cryptography — from AES to ECC to post-quantum schemes.
2 Two families dominate: GF(2n) for symmetric ciphers and GF(p) for public-key / elliptic curve cryptography.
3 Field parameters are chosen for both security and efficiency — Mersenne primes, carry-free XOR, structured reduction.
4 Hardware ranges from minimal bit-serial designs (IoT) to 53+ Gbps ASIC accelerators (data centers).
5 Side-channel resistance is as important as raw performance — constant-time code is essential.
References: FIPS 197 (AES) · NIST SP 800-186 (ECC) · RFC 7748 (Curve25519/448) · Canright, "A Very Compact Rijndael S-box"
github.com/modern-cryptography | Contributions welcome!