Modern Cryptography Series — 02

Elliptic Curve
Cryptography

From the mathematics of elliptic curves over finite fields through NIST/IETF standards to hardware accelerator design and side-channel resistant implementations.

Mathematics Standards Hardware Code Attacks Interactive

Arrow keys or swipe to navigate · Esc for overview · F fullscreen

Roadmap

I. Foundations

Elliptic curves over ℝ, the group law, geometric intuition, algebraic addition formulas

II. Curves over Finite Fields

𝔽p arithmetic, ECDLP, Hasse's theorem, scalar multiplication algorithms

III. Curve Forms

Short Weierstrass, Montgomery, twisted Edwards — coordinate systems & birational maps

IV. Standards & Protocols

NIST P-256/P-384/P-521, Curve25519, Curve448, ECDH, ECDSA, EdDSA

V. Attacks & Security

BSGS, Pollard's ρ, Pohlig-Hellman, MOV, Smart's attack, invalid curve attacks

VI. Hardware & Implementation

FPGA/ASIC accelerators, Montgomery ladder, projective coordinates, side-channel countermeasures

Why Elliptic Curve Cryptography?

The core advantage: equivalent security with dramatically smaller keys.

Security LevelRSA KeyECC KeyRatio
80 bits1024 bits160 bits6:1
112 bits2048 bits224 bits9:1
128 bits3072 bits256 bits12:1
192 bits7680 bits384 bits20:1
256 bits15360 bits521 bits30:1

No subexponential algorithm is known for the ECDLP on properly chosen curves — unlike integer factorization (GNFS) or the DLP in 𝔽p* (index calculus).

ECC is deployed in TLS 1.3, SSH, Bitcoin/Ethereum, Signal Protocol, Apple iMessage, WhatsApp, code signing, passkeys/FIDO2, and virtually all modern PKI.

Elliptic Curves over ℝ

Mathematics

An elliptic curve in short Weierstrass form:

y² = x³ + ax + b

where 4a³ + 27b² ≠ 0 (non-singular condition — no cusps or self-intersections).

The set of points (x, y) satisfying this equation, together with a special point at infinity 𝒪, forms an abelian group under a geometric addition law.

Discriminant

Δ = −16(4a³ + 27b²)

Δ > 0 → two connected components
Δ < 0 → one connected component

y² = x³ − x + 1 over ℝ

The Group Law — Geometric

Mathematics

Point Addition (P ≠ Q)

  1. Draw a line through P and Q
  2. The line intersects the curve at a third point R′
  3. Reflect R′ over the x-axis to get R = P + Q

Point Doubling (P = Q)

  1. Draw the tangent to the curve at P
  2. The tangent intersects the curve at R′
  3. Reflect R′ to get R = 2P
Identity element: The point at infinity 𝒪.
Inverse: −(x, y) = (x, −y) (reflection across x-axis).
Closure, associativity, commutativity all hold → abelian group.

Click canvas to cycle: add → double → inverse

The Group Law — Algebraic Formulas

Mathematics

For the curve y² = x³ + ax + b with P = (x₁, y₁) and Q = (x₂, y₂):

Addition (P ≠ Q)

λ = (y₂ − y₁) / (x₂ − x₁)
x₃ = λ² − x₁ − x₂
y₃ = λ(x₁ − x₃) − y₁

Slope of secant line through P and Q

Doubling (P = Q)

λ = (3x₁² + a) / (2y₁)
x₃ = λ² − 2x₁
y₃ = λ(x₁ − x₃) − y₁

Slope of tangent at P (implicit differentiation)

Special Cases

  • P + 𝒪 = P (identity)
  • P + (−P) = 𝒪 where −P = (x₁, −y₁) (inverse)
  • If y₁ = 0 when doubling → 2P = 𝒪 (point of order 2)

Cost: addition ≈ 1I + 2M + 1S in affine coordinates (I = field inversion, M = multiplication, S = squaring)

Elliptic Curves over Finite Fields

Mathematics

Replace ℝ with the prime field 𝔽p = {0, 1, …, p−1}. All arithmetic is mod p.

y² ≡ x³ + ax + b  (mod p)

The set E(𝔽p) consists of all pairs (x, y) ∈ 𝔽p × 𝔽p satisfying the curve equation, plus 𝒪.

Hasse's Theorem: The number of points on E(𝔽p) satisfies
|#E(𝔽p) − (p + 1)| ≤ 2√p

The same algebraic formulas apply — just replace division with modular inverse (via extended Euclidean algorithm or Fermat's little theorem: a−1 ≡ ap−2 mod p).

Key insight: Over 𝔽p, the points form a finite abelian group. The geometric "line-intersect-reflect" still applies algebraically even though the visual geometry is lost.

y² ≡ x³ + 2x + 3 (mod 97) — 100 points shown

The Elliptic Curve Discrete Logarithm Problem

Mathematics

Definition (ECDLP): Given an elliptic curve E over 𝔽p, a generator point G of order n, and a point Q = kG, find the scalar k.

The Trapdoor

  • Easy (forward): Given k and G, compute Q = kG in O(log k) group operations using double-and-add
  • Hard (reverse): Given Q and G, find k — best known generic algorithm is O(√n) with Pollard's ρ

Security Levels

Curve SizeECDLP Work≈ Security
160 bits28080-bit
256 bits2128128-bit
384 bits2192192-bit
521 bits2260256-bit

No subexponential classical algorithm exists for properly chosen curves. Compare: RSA-3072 has security ~2128 but uses the GNFS which is subexponential — this is why ECC keys are so much shorter.

Scalar Multiplication Algorithms

Mathematics Hardware

Computing Q = kG efficiently is the critical operation in every ECC protocol.

Double-and-Add (Binary Method)

def scalar_mult(k, P):
    """Left-to-right binary method"""
    R = INFINITY
    for bit in bin(k)[2:]:  # MSB to LSB
        R = point_double(R)
        if bit == '1':
            R = point_add(R, P)
    return R

Cost: ≈ log₂(k) doublings + (HW(k)−1) additions, where HW = Hamming weight

NAF (Non-Adjacent Form)

Recode k using digits {−1, 0, 1} with no two consecutive non-zero digits. Average density: 1/3 vs 1/2 for binary.

Example: k = 7
Binary: 111 → 3 adds
NAF: 100̄1 (= 8 − 1) → 1 add + 1 sub

Windowed Methods

w-NAF: precompute {G, 3G, 5G, …, (2w−1)G}, then scan k in windows of w bits. Reduces additions to ≈ n/(w+1).

The Montgomery Ladder

Hardware Constant-Time

Peter Montgomery (1987): a constant-time scalar multiplication using only x-coordinates.

def montgomery_ladder(k, P):
    """Constant-time scalar mult"""
    R0 = INFINITY
    R1 = P
    for bit in bin(k)[2:]:  # MSB first
        if bit == '0':
            R1 = point_add(R0, R1)
            R0 = point_double(R0)
        else:
            R0 = point_add(R0, R1)
            R1 = point_double(R1)
    return R0

Key Properties

  • Constant-time: always performs 1 add + 1 double per bit — no branching on secret data
  • x-only: works with projective X:Z coordinates on Montgomery curves — no y needed until recovery
  • Differential: maintains R1 − R0 = P at every step
  • Side-channel resistant: uniform execution pattern defeats SPA

Cost on Montgomery curve: ≈ 5M + 4S + 1×(a24) per bit, where a24 = (A−2)/4. For Curve25519, a24 = 121665.

Curve Forms: Short Weierstrass

Mathematics
y² = x³ + ax + b   over 𝔽p,   char ≠ 2, 3

Properties

  • Most general standard form
  • Used by NIST P-curves and secp256k1
  • Can represent any elliptic curve (up to isomorphism) when char(K) ≠ 2, 3
  • Prime-order groups possible (cofactor h = 1)

Coordinate Systems

CoordinatesAdditionDoubling
Affine1I+2M+1S1I+2M+2S
Projective12M+2S5M+6S
Jacobian11M+5S1M+8S
Mixed Jacobian7M+4S

NIST P-256 Optimization: a = −3

All NIST prime curves use a = −3, which saves one multiplication in the doubling formula:

λ = 3(x₁ − Z₁²)(x₁ + Z₁²)
instead of 3x₁² + a·Z₁⁴

Limitation: Addition formulas have exceptional cases — P = Q, P = −Q, P = 𝒪 — requiring conditional branches, which leak via side-channels.

Complete addition formulas for prime-order Weierstrass curves exist (Renes, Costello, Batina 2016) but cost more: 12M + 2S + 6a + 1b.

Curve Forms: Montgomery

Mathematics
Bv² = u³ + Au² + u   A ≠ ±2,   B ≠ 0

Properties

  • Introduced by Peter Montgomery (1987) for ECM factorization
  • Group order always divisible by 4 (cofactor h ≥ 4)
  • x-only arithmetic: differential addition uses only X:Z projective coords
  • Naturally constant-time via the Montgomery ladder

Differential Addition (x-only)

Given x(P), x(Q), x(P−Q):
X₊ = Z₋ · (X_P·X_Q − Z_P·Z_Q)²
Z₊ = X₋ · (X_P·Z_Q − Z_P·X_Q)²

Cost: 4M + 2S per step. Doubling: 2M + 2S + 1×a24

Key Curves

CurveApSecurity
Curve255194866622²⁵⁵ − 19~128 bit
Curve448390812⁴⁴⁸ − 2²²⁴ − 1~224 bit

Why these primes? Pseudo-Mersenne form enables fast modular reduction with a few additions/shifts instead of expensive division.

Curve25519: p = 2²⁵⁵ − 19 → reduction costs ~1 multiplication by 19.
Curve448: Goldilocks prime → Karatsuba-friendly 2×224-bit limb decomposition.

Curve Forms: (Twisted) Edwards

Mathematics
ax² + y² = 1 + dx²y²   (twisted Edwards, a ≠ d)

Addition Law

(x₁, y₁) + (x₂, y₂) =
 ( (x₁y₂ + y₁x₂) / (1 + dx₁x₂y₁y₂),
   (y₁y₂ − ax₁x₂) / (1 − dx₁x₂y₁y₂) )
Complete addition law: When d is not a square in 𝔽p, the denominators are never zero for any inputs on the curve. No exceptional cases — the same formula handles addition, doubling, identity, and inverses.

Neutral element: (0, 1)
Inverse: −(x, y) = (−x, y)

Why Edwards Curves Matter

  • Complete: No branches on secret data → inherently side-channel resistant
  • Unified: Same formula for add and double → simplifies hardware
  • Fast: Extended coordinates: 8M for add, 4M+4S for double

edwards25519

−x² + y² = 1 + d·x²y²
d = −121665/121666 mod p
Birationally equivalent to Curve25519
Used by Ed25519 (EdDSA signatures)

Birational Equivalences

Mathematics

The three curve forms are related by rational maps that preserve the group structure.

Montgomery
Bv² = u³ + Au² + u
Twisted Edwards
ax² + y² = 1 + dx²y²
Weierstrass
Y² = X³ + aX + b

Montgomery ↔ Twisted Edwards

(u, v) → (x, y) = (u/v, (u−1)/(u+1))
a = (A+2)/B,   d = (A−2)/B

Exceptions: points where v = 0 or u = −1

Montgomery → Weierstrass

(u, v) → (X, Y) = (u/B + A/(3B), v/B)
aW = (3 − A²)/(3B²)
bW = (2A³ − 9A)/(27B³)

Practical Implications

X25519 (ECDH): Uses Montgomery x-only ladder on Curve25519
Ed25519 (EdDSA): Uses twisted Edwards addition on edwards25519
Same underlying curve — keys can be converted between the two forms

Coordinate Systems in Practice

Hardware

Avoiding field inversions is crucial — 1 inversion ≈ 80–100 multiplications.

SystemRepresentationAdd CostDouble CostNotes
Affine(x, y)1I+2M+1S1I+2M+2SInversion per operation
Projective(X:Y:Z), x=X/Z, y=Y/Z12M+2S5M+6SWeierstrass
Jacobian(X:Y:Z), x=X/Z², y=Y/Z³11M+5S1M+8SFast doubling; NIST P-curves
Extended(X:Y:Z:T), xy=T/Z8M4M+4STwisted Edwards; Ed25519
Montgomery x-only(X:Z), x=X/Z4M+2S*2M+2S+1DDifferential; X25519
Co-ZShared Z coordinate5M+2SNIST P-256 ladder

*Montgomery addition requires the difference point. D = multiplication by curve constant a24.

Strategy: Perform all intermediate operations in projective/Jacobian/extended coordinates, paying zero inversions, then convert back to affine at the very end with a single inversion.

Standard Elliptic Curves

Standards
CurveFormFieldOrder BitsCofactorSecurityStandards
P-256Weierstrass𝔽p, p = 2²⁵⁶ − 2²²⁴ + 2¹⁹² + 2⁹⁶ − 12561128FIPS 186-5, SP 800-186, Suite B
P-384Weierstrass𝔽p, 384-bit NIST prime3841192FIPS 186-5, CNSA 1.0
P-521Weierstrass𝔽p, p = 2⁵²¹ − 15211256FIPS 186-5
Curve25519Montgomery𝔽p, p = 2²⁵⁵ − 192538~128RFC 7748, SP 800-186
Curve448Montgomery𝔽p, p = 2⁴⁴⁸ − 2²²⁴ − 14464~224RFC 7748, SP 800-186
edwards25519Twisted Edwards(same as Curve25519)2538~128RFC 8032 (Ed25519)
edwards448Edwards(same as Curve448)4464~224RFC 8032 (Ed448)
secp256k1Weierstrass𝔽p, p = 2²⁵⁶ − 2³² − 9772561128SEC 2 (Bitcoin)
BrainpoolP256r1Weierstrass256-bit random prime2561128RFC 5639
NIST SP 800-186 (2023): Adds Curve25519/Curve448 and their Edwards forms. Deprecates all binary field curves. Recommends prime-field curves only for new implementations.

Deep Dive: Curve25519 / X25519 / Ed25519

Standards

Design Principles (Bernstein, 2006)

  • Prime shape: p = 2²⁵⁵ − 19 enables fast reduction (multiply high limbs by 19 and add to low limbs)
  • Laddering: Montgomery form → constant-time x-only ladder
  • Cofactor 8: Clamping clears low 3 bits of scalar (multiple of 8) and sets bit 254 — prevents small-subgroup attacks and ensures constant-time
  • Twist security: Both curve and quadratic twist have large prime-order subgroups — protects against invalid-curve faults
  • A = 486662: Smallest A satisfying all security criteria

Performance

OperationTime (Haswell)
X25519 ECDH~145,000 cycles
Ed25519 sign~87,000 cycles
Ed25519 verify~207,000 cycles
P-256 ECDH~260,000 cycles

Deployment

TLS 1.3 (default key exchange), Signal Protocol, WhatsApp, SSH (Ed25519 keys), WireGuard, age encryption, FIDO2/Passkeys, Tor

ECDH: Elliptic Curve Diffie-Hellman

Protocol
AliceChannelBob
Choose private key a ∈ [1, n−1]Choose private key b ∈ [1, n−1]
Compute A = aGA →
← BCompute B = bG
Compute S = aB = abGCompute S = bA = baG
Shared secret: S = abG

X25519 (RFC 7748)

  1. Clamp 32-byte private key (clear bits 0–2, set bit 254, clear bit 255)
  2. Montgomery ladder on u-coordinate only
  3. Output: 32-byte shared secret
  4. Derive symmetric key via HKDF

ECDH on NIST P-256

  1. Validate peer's public key is on curve
  2. Scalar multiply with private key
  3. Output: x-coordinate of shared point
  4. Feed into KDF (SP 800-56C)

Must validate public key — otherwise invalid curve attacks!

ECDSA: Elliptic Curve Digital Signature Algorithm

Protocol

Key Generation

Private key: d ∈ [1, n−1] random
Public key: Q = dG

Signing (message m)

  1. e = HASH(m), truncate to bitlength of n
  2. Choose random k ∈ [1, n−1]
  3. Compute (x₁, y₁) = kG
  4. r = x₁ mod n  (if r = 0, restart)
  5. s = k⁻¹(e + rd) mod n  (if s = 0, restart)
  6. Signature = (r, s)

Verification

  1. Check r, s ∈ [1, n−1]
  2. e = HASH(m)
  3. w = s⁻¹ mod n
  4. u₁ = ew mod n,   u₂ = rw mod n
  5. Compute (x₁, y₁) = u₁G + u₂Q
  6. Accept if x₁ mod n = r
Critical: The nonce k must be truly random and unique per signature. Reusing k leaks the private key directly:
d = (s·k − e) · r⁻¹ mod n

RFC 6979: Deterministic ECDSA — derives k = HMAC-DRBG(d, HASH(m)), eliminating random nonce failures.

EdDSA: Edwards-Curve Digital Signature Algorithm

Protocol

RFC 8032 — deterministic Schnorr-type signatures on twisted Edwards curves.

Ed25519 Signing

  1. Private seed: 32 random bytes → SHA-512 → (a, prefix)
    a = scalar (clamped), prefix = 32 bytes
  2. r = SHA-512(prefix ‖ m) mod ℓ
    Deterministic nonce — no RNG needed!
  3. R = rB (B = base point on edwards25519)
  4. S = (r + SHA-512(R ‖ A ‖ m) · a) mod ℓ
  5. Signature: (R, S) = 64 bytes

Verification

Check: 8·S·B = 8·R + 8·SHA-512(R‖A‖m)·A
(cofactor 8 multiplication clears small-order components)

EdDSA vs ECDSA

PropertyECDSAEdDSA
NonceRandom (or RFC 6979)Deterministic
Curve formWeierstrassTwisted Edwards
Addition lawIncompleteComplete
Signature size~72 bytes (DER)64 bytes (fixed)
Batch verifyLimitedNative support
Malleabilitys ↔ n−sCanonical S
StandardsFIPS 186-5RFC 8032

Attacks on the ECDLP

Attacks
AttackComplexityRequirementsCountermeasure
Baby-Step Giant-StepO(√n) time, O(√n) spaceGeneric groupn ≥ 2²⁵⁶
Pollard's ρO(√(πn/2)) time, O(1) spaceGeneric groupn ≥ 2²⁵⁶
Pohlig-HellmanO(√pmax)n has small prime factorsUse (near-)prime order n
MOV / Frey-RückSubexponential (via pairings)Low embedding degreeEmbedding degree k ≥ 20
Smart's AttackO(n) — linear!Anomalous curve: #E = pEnsure #E ≠ p
Invalid CurveVariesNo point validationValidate public keys
Small SubgroupO(√h)Large cofactor + no cofactor multCofactor validation / clamping
Shor's (Quantum)PolynomialCRQC (~2330 qubits for 256-bit)Post-quantum migration

SafeCurves Criteria

Daniel Bernstein & Tanja Lange's SafeCurves project evaluates curves against: ECDLP security, twist security, completeness, indistinguishability, and implementation safety. Curve25519 and Curve448 pass all criteria. Several NIST curves do not pass all (e.g., twist security, rigidity).

Real-World ECC Failures

Case Studies

Sony PS3 ECDSA Nonce Reuse (2010)

Sony used a static value of k for all ECDSA signatures on PS3 firmware. fail0verflow recovered the private key with basic algebra: d = (s·k − e)·r⁻¹ mod n. Complete compromise of PS3 code signing.

Android Bitcoin Wallet RNG Failure (2013)

Android's SecureRandom produced duplicate ECDSA nonces across different transactions. Attackers extracted private keys from the blockchain, stealing ~55 BTC.

Minerva Timing Attack (2019)

Timing leakage in scalar multiplication on smart cards allowed recovery of ECDSA private keys via lattice-based analysis of biased nonces. Affected Athena IDProtect and other JavaCard implementations.

Dual_EC_DRBG NSA Backdoor (2013)

The NIST-standardized PRNG used two EC points P and Q. If Q = eP for a secret e known to NSA, the output could predict future random numbers, compromising any ECC keys generated with it.

ECC Hardware Accelerator Architecture

Hardware

Three-level hierarchy for scalar multiplication Q = kP:

Level 3: Scalar Multiplication
Montgomery Ladder / Double-and-Add / w-NAF
Level 2: Point Operations
Point Addition · Point Doubling · Mixed Addition
Level 1: Modular Arithmetic
Mod Multiply · Mod Square · Mod Add/Sub · Mod Inverse
Level 0: Multi-Precision Arithmetic
Big-integer multiply (Karatsuba / Schoolbook) · Reduction (Barrett / Montgomery)

Performance Bottleneck

Modular multiplication dominates: ~90% of total computation. One 256-bit scalar multiplication requires ~3000-4000 modular multiplications.

Parallelism Opportunity

Point add needs 11-16 sequential field multiplications. Pipelining & parallel multiplier banks (2× or 4×) can cut latency significantly.

Montgomery Modular Multiplication

Hardware

The workhorse of ECC hardware — avoids expensive trial division.

Algorithm

Work in "Montgomery domain": ā = a·R mod p, where R = 2n.

MonPro(ā, b̄) = ā·b̄·R⁻¹ mod p

1. t = ā · b̄
2. u = (t · (−p⁻¹ mod R)) mod R
3. result = (t + u·p) / R
4. if result ≥ p: result −= p

The division by R is a simple right-shift since R = 2n

Hardware Variants

ArchitectureLatencyArea
Bit-serialO(n²)O(n)
Word-serial (w bits)O(n²/w)O(n·w)
Systolic arrayO(n)O(n²)
Full parallelO(1)*O(n²)

*Pipelined — one result per clock after filling

For Curve25519: the prime p = 2²⁵⁵ − 19 allows multiplier-less reduction — multiply overflow by 19 (shift + add).

Implementation Results: FPGA & ASIC

Hardware
DesignCurvePlatformAreaFreqLatencyThroughput
Hossain 2016P-256Virtex-55,710 slices256 MHz5.26 ms48.7 kbps
Marzouqi 2019P-256Virtex-73,200 LUTs285 MHz1.39 ms184 kbps
Amiet 2021P-256Virtex-723.7k LUTs225 MHz0.56 ms457 kbps
Scientific Reports 2025Ed25519Virtex-5117.8 MHz1.4 ms183 kbps
Unified Curve25519+Curve448Both28nm ASIC1096 kGE100 MHz0.43 ms (Curve25519)
Low-power ASICCurve2551965nm49 kGE65 MHz3.9 ms
ANSSI IPECCAny GF(p)Zynq-7000~8k LUTs + DSPs100 MHz~3 msFlexible

Design Trade-offs

Area-optimized: Single multiplier, sequential scheduling. Good for IoT/smartcards (~50 kGE).

Performance-optimized: 4× parallel multiplier banks, pipelined. Good for TLS offload (~1000 kGE).

Side-Channel Countermeasures for ECC

Hardware Defense
AttackCountermeasureCost
SPA (Simple Power Analysis)Montgomery ladder / unified formulas / regular execution patternsMinimal
DPA (Differential Power Analysis)Scalar blinding: k′ = k + r·n (random r)+1 big multiply
DPAPoint blinding: add random point, subtract at end+2 point adds
DPAProjective coordinate randomization: (X:Y:Z) → (λX:λY:λZ)+3 field mults
TimingConstant-time field operations (no branches on secret data)Design discipline
Fault injectionPoint-on-curve checks, redundant computation, error detection~2× area
EM emanationDual-rail logic (WDDL/SABL), shuffle, random delays2-3× area
Invalid curveValidate input points: check y² = x³+ax+b, check nQ = 𝒪1 scalar mult

Edwards Curves: Inherent Advantages

Complete addition law → no exceptional cases → no conditional branches → SPA-resistant by construction. Combined with projective coordinate randomization and scalar blinding → strong multi-countermeasure profile with lower overhead than Weierstrass implementations.

RTL: Montgomery Modular Multiplier

SystemVerilog
module mont_mul #(
    parameter N = 256   // Bit width
)(
    input  logic         clk, rst, start,
    input  logic [N-1:0] a, b, p,  // a, b in Montgomery domain
    input  logic [N-1:0] p_inv,    // -p^(-1) mod 2^N
    output logic [N-1:0] result,
    output logic         done
);
    logic [2*N:0] t, u_full;
    logic [N:0]   res;
    
    typedef enum logic [1:0] {IDLE, COMPUTE, REDUCE, DONE} state_t;
    state_t state;
    
    always_ff @(posedge clk or posedge rst) begin
        if (rst) begin
            state  <= IDLE;
            done   <= 0;
        end else case (state)
            IDLE: if (start) begin
                t     <= a * b;                    // Step 1: full product
                state <= COMPUTE;
                done  <= 0;
            end
            COMPUTE: begin
                u_full <= (t[N-1:0] * p_inv);      // Step 2: u = t * (-p^-1) mod R
                state  <= REDUCE;
            end
            REDUCE: begin
                res   <= (t + u_full[N-1:0] * p) >> N; // Step 3: (t + u*p) / R
                state <= DONE;
            end
            DONE: begin
                result <= (res >= p) ? res - p : res[N-1:0];
                done   <= 1;
                state  <= IDLE;
            end
        endcase
    end
endmodule

Simplified 3-cycle design. Production versions use word-serial or systolic architectures for area efficiency.

RTL: Jacobian Point Addition Scheduler

SystemVerilog
// Jacobian point addition: P3 = P1 + P2
// P1 = (X1:Y1:Z1), P2 = (X2:Y2:1) [mixed add]
// Cost: 7M + 4S  (a = -3 optimization)
// 
// Schedule (7 rounds with single multiplier):
//   R1: U2 = X2*Z1^2         S2 = Y2*Z1^3
//   R2: H  = U2 - X1         R  = S2 - Y1
//   R3: H2 = H^2             H3 = H*H2
//   R4: X1H2 = X1*H2
//   R5: X3 = R^2 - H3 - 2*X1H2
//   R6: Y3 = R*(X1H2 - X3) - Y1*H3
//   R7: Z3 = Z1 * H

module ecc_point_add_ctrl (
    input  logic clk, rst, start,
    input  logic mul_done,          // from Montgomery multiplier
    output logic [2:0] mul_sel_a,   // register file address for operand A
    output logic [2:0] mul_sel_b,   // register file address for operand B  
    output logic [2:0] result_dst,  // where to store result
    output logic mul_start,
    output logic done
);
    // FSM with 11 states for 7M + 4S operations
    // Parallelism: dual multipliers can overlap rounds
    // ... (full FSM omitted for clarity)
endmodule

Production ECC accelerators schedule 11M+5S (Jacobian) or 8M (extended Edwards) across 2-4 parallel multiplier units, achieving 4-6 rounds of computation per point operation.

Python: ECDH on a Toy Curve

Code
"""Elliptic Curve Diffie-Hellman on y^2 = x^3 + 2x + 3 mod 97"""
p = 97; a = 2; b = 3

def modinv(x, m=p):
    return pow(x, m - 2, m)  # Fermat's little theorem

def point_add(P, Q):
    if P is None: return Q
    if Q is None: return P
    x1, y1 = P; x2, y2 = Q
    if x1 == x2 and y1 != y2: return None  # P + (-P)
    if P == Q:
        lam = (3 * x1 * x1 + a) * modinv(2 * y1) % p
    else:
        lam = (y2 - y1) * modinv(x2 - x1) % p
    x3 = (lam * lam - x1 - x2) % p
    y3 = (lam * (x1 - x3) - y1) % p
    return (x3, y3)

def scalar_mult(k, P):
    R = None
    for bit in bin(k)[2:]:
        R = point_add(R, R)
        if bit == '1': R = point_add(R, P)
    return R

G = (3, 6)     # Generator point
n = 5           # Order of G (toy example)

# Alice & Bob
a_priv = 3;  A_pub = scalar_mult(a_priv, G)  # (80, 87)
b_priv = 4;  B_pub = scalar_mult(b_priv, G)  # (80, 10)

shared_A = scalar_mult(a_priv, B_pub)  # Alice computes
shared_B = scalar_mult(b_priv, A_pub)  # Bob computes
assert shared_A == shared_B            # Same point!
print(f"Shared secret: {shared_A}")    # (3, 91)

Interactive: Point Addition on E(𝔽p)

Interactive

Curve: y² ≡ x³ + 2x + 3 (mod 97)

Click a point on the canvas to select P, then click again for Q. The sum P + Q will be shown.

All 100 points on E(𝔽₉₇):

Interactive: Scalar Multiplication kG

Interactive

Curve: y² ≡ x³ + 2x + 3 (mod 97)

Generator G: (3, 6)

1

ECC in TLS 1.3

Deployment

TLS 1.3 mandates ECC — X25519 is the most common key exchange.

TLS 1.3 Handshake (1-RTT)

Client Server
Generate ephemeral X25519 keypair → ClientHello
+ key_share(X25519 public)
← ServerHello
+ key_share(X25519 public)
Generate ephemeral X25519 keypair
Both compute: shared_secret = X25519(my_private, peer_public)
→ HKDF-Expand → handshake keys → application keys
← {Certificate, CertificateVerify}
Ed25519 or ECDSA-P256 signature
Sign transcript with long-term key

Key exchange groups in TLS 1.3: x25519 (0x001D, most popular), secp256r1 (0x0017), x448 (0x001E), secp384r1 (0x0018), secp521r1 (0x0019). Post-quantum hybrids (X25519+ML-KEM-768) are being deployed.

ECC Deployment Landscape

Deployment
ApplicationCurve(s)ProtocolUse
TLS 1.3X25519, P-256ECDHE + ECDSA/EdDSAKey exchange + server auth
SSHEd25519, P-256EdDSA, ECDSAUser/host authentication
Bitcoinsecp256k1ECDSATransaction signing
Ethereumsecp256k1ECDSA (+ BLS on beacon)Transaction + consensus
Signal / WhatsAppCurve25519X3DH + Double RatchetE2E encrypted messaging
WireGuard VPNCurve25519Noise_IK(X25519)Tunnel establishment
FIDO2 / PasskeysP-256, Ed25519WebAuthn ECDSA/EdDSAPasswordless auth
Apple iMessageP-256ECDH + ECDSAE2E encryption + signing
TorCurve25519 (ntor)X25519Circuit establishment
Code SigningP-256, Ed25519ECDSA, EdDSASoftware integrity
IoT / EmbeddedP-256, Curve25519ECDH, ECDSAConstrained-device TLS
Smart CardsP-256, BrainpoolP256ECDSA, ECDHEMV, ePassport, PIV

ECC vs RSA: Performance Comparison

Analysis

At the 128-bit security level (P-256 / Curve25519 vs RSA-3072):

OperationRSA-3072ECDSA P-256Ed25519X25519
Key generation~50 ms~0.1 ms~0.05 ms~0.05 ms
Public key size384 bytes33 bytes32 bytes32 bytes
Private key size~1.5 KB32 bytes32 bytes32 bytes
Signature size384 bytes64 bytes64 bytes
Sign~1.5 ms~0.15 ms~0.08 ms
Verify~0.04 ms~0.4 ms~0.2 ms
Key exchange~1.5 ms~0.3 ms~0.14 ms

Approximate timings on modern x86-64 (Haswell/Skylake). RSA verify is fast due to small public exponent e = 65537.

Bandwidth matters: In TLS, ECC's 32-byte keys vs RSA's 384-byte keys compound across billions of connections per day. This is why TLS 1.3 favors X25519+Ed25519.

The Quantum Threat to ECC

Quantum

Shor's Algorithm

A quantum computer running Shor's algorithm can solve ECDLP in polynomial time.

Quantum resources for 256-bit ECC:

  • ~2,330 logical qubits
  • ~126 billion Toffoli gates
  • Far exceeds current quantum hardware

Compare: RSA-2048 needs ~4,098 qubits — ECC is actually easier for quantum computers than RSA at equivalent classical security.

Migration Timeline

MilestoneDate
NIST PQC standards finalizedAug 2024
CNSA 2.0: PQ for firmware/software signing2025
CNSA 2.0: PQ for web/cloud2025-2030
NIST deprecate ECC-only2030
NIST disallow ECC-only2035

Hybrid Approach

X25519 + ML-KEM-768 in TLS 1.3 — combine classical and post-quantum security. Already deployed in Chrome, Firefox, Cloudflare.

Curve Form Comparison Summary

PropertyShort WeierstrassMontgomeryTwisted Edwards
Equationy² = x³ + ax + bBv² = u³ + Au² + uax² + y² = 1 + dx²y²
Prime order possible?YesNo (h ≥ 4)No (h ≥ 4)
Complete addition law?Yes (Renes 2016, costly)N/A (x-only)Yes (when d non-square)
Best add cost7M + 4S (mixed Jacobian)4M + 2S (diff.)8M (extended)
Best double cost1M + 8S (Jacobian, a=-3)2M + 2S + 1D4M + 4S (extended)
Constant-time easeRequires careNatural (ladder)Natural (complete)
Key curvesP-256, P-384, secp256k1Curve25519, Curve448edwards25519, edwards448
Primary useECDSA, ECDH (NIST)X25519/X448 (ECDH)Ed25519/Ed448 (EdDSA)
FIPS approved?YesYes (SP 800-186)Yes (SP 800-186)

Modern best practice: Use X25519 for key exchange (Montgomery ladder, constant-time) and Ed25519 for signatures (complete Edwards addition, deterministic nonces). Fall back to P-256 for FIPS compliance where Curve25519 is not yet approved in your specific protocol.

References

Standards: FIPS 186-5 (DSS) · NIST SP 800-186 (ECC Curves, 2023) · RFC 7748 (X25519/X448) · RFC 8032 (EdDSA) · RFC 6979 (Deterministic ECDSA) · SEC 2 (SECG Curves) · RFC 5639 (Brainpool) · NIST IR 8547 (PQC Transition)

Foundations: Koblitz, "Elliptic Curve Cryptosystems" (1987) · Miller, "Use of Elliptic Curves in Cryptography" (CRYPTO 1985) · Montgomery, "Speeding the Pollard and EC Methods of Factorization" (1987) · Edwards, "A Normal Form for Elliptic Curves" (2007) · Bernstein & Lange, "Faster Addition and Doubling on Elliptic Curves" (ASIACRYPT 2007)

Curves: Bernstein, "Curve25519: New Diffie-Hellman Speed Records" (2006) · Hamburg, "Ed448-Goldilocks" (2015) · Bernstein & Lange, SafeCurves (safecurves.cr.yp.to)

Implementation: Renes, Costello, Batina, "Complete Addition Formulas for Prime Order Elliptic Curves" (2016) · Bernstein & Lange, "Montgomery Curves and the Montgomery Ladder" (2017) · Hankerson, Menezes, Vanstone, "Guide to Elliptic Curve Cryptography" (Springer, 2004)

Hardware: High-Performance Curve25519 & Curve448 Unified Accelerator (2025) · ANSSI IPECC VHDL IP · Low-Latency Twisted Edwards FPGA (Scientific Reports, 2025) · Area-Efficient ECSM over Prime Field (Microprocessors & Microsystems, 2023)

Attacks: Pohlig & Hellman (1978) · Pollard, "Monte Carlo Methods for Index Computation" (1978) · Smart, "The Discrete Logarithm Problem on Elliptic Curves of Trace One" (1999) · Menezes, Okamoto, Vanstone, "Reducing ECDLP to DLP over Extension Fields" (1993)

Modern Cryptography Series — 02

Elliptic Curve
Cryptography

38 slides · Mathematics · Standards · Hardware · Code · Attacks · Interactive

Key Takeaways:

  • ECC provides equivalent security to RSA with 10-30× smaller keys
  • Edwards curves offer complete addition laws — inherently side-channel resistant
  • X25519 + Ed25519 represent the modern state of the art
  • Quantum computers will break ECC — hybrid PQ migration is underway

Back to Series · Next: 03 — AES Design & Implementation