COMPUTER SCIENCE FUNDAMENTALS SERIES

Hash Tables

Hash functions · Collision resolution · Open addressing ·
Cuckoo hashing · Load factor · Concurrent maps
Mid-level software engineer track  ·  20 slides
02

Dictionary / Map ADT

The abstract interface

A dictionary (also called a map or associative array) supports three core operations on key-value pairs:

  • INSERT(k, v) — associate value v with key k
  • SEARCH(k) — return the value for key k, or report absent
  • DELETE(k) — remove the pair with key k

Hash tables achieve amortised constant time by trading deterministic guarantees for expected-case performance. The key insight: convert the key to an array index via a hash function.

Naive implementations

Data structure    INSERT      SEARCH      DELETE
Unsorted array    O(1)        O(n)        O(n)
Sorted array      O(n)        O(log n)    O(n)
Balanced BST      O(log n)    O(log n)    O(log n)
Hash table        O(1) exp.   O(1) exp.   O(1) exp.
03

Direct Addressing

When keys are small integers

If every possible key is in {0, 1, ..., U-1} and U is small, store values in a plain array T[0..U-1].

T[k] = v       // INSERT
T[k]           // SEARCH
T[k] = NIL     // DELETE

All operations are O(1) worst case.

Direct addressing is the conceptual foundation. A hash table generalises it by compressing the key space via a hash function.
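The three operations above can be sketched in a few lines of Python (the universe size U and the value names are illustrative):

```python
# Direct-address table: a plain array indexed by the key itself.
# Works only when keys are small non-negative integers below U.
U = 10
T = [None] * U

def da_insert(k, v): T[k] = v      # INSERT — O(1) worst case
def da_search(k):    return T[k]   # SEARCH — O(1) worst case
def da_delete(k):    T[k] = None   # DELETE — O(1) worst case

da_insert(2, "val_A")
da_insert(5, "val_B")
assert da_search(2) == "val_A"
da_delete(2)
assert da_search(2) is None
```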

The problem

  • When U is large (e.g. 64-bit integers, strings), allocating an array of size U is infeasible
  • When only n << U keys are used, most of the array is wasted
  • Solution: hash the key into a smaller index range [0, m-1] where m = O(n)
[Figure: direct-address table T[0..9] — keys 2, 5, 8 hold val_A, val_B, val_C; only 3 of 10 slots used]
04

Hash Functions

Requirements

A hash function h(k) maps a key from universe U to an index in {0, 1, ..., m-1}.

  • Deterministic — same key always produces the same index
  • Uniform distribution — keys spread evenly across slots
  • Fast to compute — ideally O(1) or O(|key|)

Division method

h(k) = k mod m

Simple but sensitive to choice of m. Avoid powers of 2. Choose m as a prime not close to a power of 2.

Multiplication method

h(k) = floor(m * (k * A mod 1))
// where 0 < A < 1
// Knuth: A = (sqrt(5) - 1) / 2 ≈ 0.618
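Both methods fit in a few lines of Python (m = 13 is an arbitrary illustrative prime):

```python
import math

m = 13                        # a prime not close to a power of 2
A = (math.sqrt(5) - 1) / 2    # Knuth's suggested multiplier ≈ 0.618

def h_division(k):
    return k % m

def h_multiplication(k):
    return math.floor(m * ((k * A) % 1))   # fractional part of k*A, scaled by m

# Both land in {0, ..., m-1} for any integer key:
for k in (7, 123456, 2**40 + 3):
    assert 0 <= h_division(k) < m
    assert 0 <= h_multiplication(k) < m
```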

Practical hash functions

Function              Use case
MurmurHash3           General-purpose, fast, good distribution
xxHash                Extremely fast, widely used in checksums
SipHash               Hash-flooding resistant — default in Python, Rust
FNV-1a                Simple, byte-at-a-time, good for short keys
CityHash / FarmHash   Google's fast hashes for strings

Cryptographic hashes (SHA-256) provide stronger guarantees but are much slower. Use them for security, not hash tables.

05

Universal Hashing

The adversary problem

Any single deterministic hash function has a worst case: an adversary can craft n keys that all hash to the same slot, degrading every operation to O(n).

Universal hash families

A family H is universal if for any two distinct keys x ≠ y:

Pr[h(x) = h(y)] ≤ 1/m
// where h chosen randomly from H

Carter-Wegman construction

For prime p ≥ |U|:

h_{a,b}(k) = ((a * k + b) mod p) mod m
// a ∈ {1, ..., p-1},  b ∈ {0, ..., p-1}
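The construction is easy to sketch in Python — the prime p and slot count m here are arbitrary illustrative choices; keys must lie in {0, ..., p-1}:

```python
import random

p = 2_147_483_647    # a prime bounding the key universe (2^31 - 1)
m = 64               # number of slots

def random_universal_hash():
    a = random.randrange(1, p)    # a ∈ {1, ..., p-1}
    b = random.randrange(0, p)    # b ∈ {0, ..., p-1}
    return lambda k: ((a * k + b) % p) % m

# Each table instance draws one h at construction time:
h = random_universal_hash()
assert all(0 <= h(k) < m for k in range(1000))
```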

Properties

Expected chain length is n/m regardless of key distribution

No adversary can force worst-case behaviour without knowing a, b

Choosing a new random h at construction time is sufficient — no per-query randomness needed

Stronger families

  • k-independent — any k distinct keys hash independently
  • 2-independent ≈ universal; 5-independent suffices for linear probing to achieve O(1) expected
  • Tabulation hashing — 3-independent, fast, practical; suffices for chaining and cuckoo hashing
06

Separate Chaining

Collision resolution with linked lists

Each slot T[i] points to a linked list of all elements whose keys hash to i.

Operations

  • INSERT — compute h(k), prepend to list at T[h(k)] — O(1)
  • SEARCH — compute h(k), traverse list at T[h(k)] — O(chain length)
  • DELETE — search then unlink — O(chain length)

Performance

With load factor α = n/m and a good hash function:

  • Expected chain length = α
  • Expected search time = O(1 + α)
  • If n = O(m), all operations are O(1) expected

Chaining is simple and tolerates α > 1 gracefully. Downside: pointer chasing is cache-unfriendly.
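A minimal chaining table in Python — lists stand in for linked lists, and the class and method names are illustrative:

```python
class ChainedHashTable:
    """Separate chaining: each slot holds a list of (key, value) pairs."""
    def __init__(self, m=8):
        self.m = m
        self.slots = [[] for _ in range(m)]

    def insert(self, k, v):
        chain = self.slots[hash(k) % self.m]
        for i, (ck, _) in enumerate(chain):
            if ck == k:
                chain[i] = (k, v)     # key already present: update in place
                return
        chain.append((k, v))          # O(1) append on a miss

    def search(self, k):
        for ck, cv in self.slots[hash(k) % self.m]:
            if ck == k:
                return cv
        return None                   # absent

    def delete(self, k):
        i = hash(k) % self.m
        self.slots[i] = [(ck, cv) for ck, cv in self.slots[i] if ck != k]

t = ChainedHashTable()
t.insert("a", 1)
t.insert("b", 2)
assert t.search("a") == 1
t.delete("a")
assert t.search("a") is None
```

Because chains absorb collisions, the table keeps working even at α > 1 — only the chain lengths grow.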

[Figure: separate chaining — table T[0..7], each slot heading a linked list of colliding keys; empty slots point to NIL]
07

Open Addressing — Linear & Quadratic Probing

Open addressing principle

All elements stored directly in the table array. No pointers, no linked lists. On collision, probe a sequence of alternative slots until an empty one is found.

Linear probing

h(k, i) = (h'(k) + i) mod m
// for i = 0, 1, 2, ...

Slots checked sequentially. Cache-friendly but suffers from primary clustering — runs of occupied slots grow and merge.
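A minimal linear-probing sketch in Python (deletion is omitted — it needs tombstones or backward shifting; names are illustrative):

```python
class LinearProbingTable:
    def __init__(self, m=16):
        self.m = m
        self.keys = [None] * m
        self.vals = [None] * m

    def _probe(self, k):
        # h(k, i) = (h'(k) + i) mod m — walk until we find k or an empty slot
        i = hash(k) % self.m
        while self.keys[i] is not None and self.keys[i] != k:
            i = (i + 1) % self.m
        return i

    def insert(self, k, v):
        i = self._probe(k)
        self.keys[i], self.vals[i] = k, v

    def search(self, k):
        i = self._probe(k)
        return self.vals[i] if self.keys[i] == k else None

t = LinearProbingTable()
t.insert(1, "one")
t.insert(17, "clash")        # 17 mod 16 = 1 — collides, probes on to slot 2
assert t.search(17) == "clash"
```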

Quadratic probing

h(k, i) = (h'(k) + c1*i + c2*i²) mod m

Spreads probes more than linear. Avoids primary clustering but can cause secondary clustering.

Comparison

Method       Cache        Clustering    Coverage
Linear       Excellent    Primary       All slots
Quadratic    Good         Secondary     May miss slots if m not prime

Linear probing is used in practice more than theory suggests — modern CPUs love sequential memory access. Swiss Table proves this decisively.

Primary clustering — occupied runs grow:

_ k1 k4 k7 k2 k9 _ k3 _ _

↑ cluster of 5 — new inserts extend it further

08

Double Hashing

Two independent hash functions

h(k, i) = (h1(k) + i * h2(k)) mod m
// h2(k) must never return 0
// Common: h2(k) = 1 + (k mod (m - 1)), m prime

Why it works

  • Each key generates a distinct probe sequence — no clustering
  • Approximates the ideal of uniform probing
  • Expected probes for unsuccessful search: 1 / (1 - α)
  • Expected probes for successful search: (1/α) * ln(1 / (1 - α))
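The divergence of probe sequences is easy to check numerically (m = 11 is an arbitrary small prime; integer keys so the arithmetic is visible):

```python
m = 11   # prime table size

def h1(k): return k % m
def h2(k): return 1 + (k % (m - 1))   # never 0, so every slot stays reachable

def probe_sequence(k, length=4):
    # h(k, i) = (h1(k) + i * h2(k)) mod m
    return [(h1(k) + i * h2(k)) % m for i in range(length)]

# Keys 3 and 14 collide on h1, but their sequences immediately diverge
# because h2(3) = 4 while h2(14) = 5:
assert h1(3) == h1(14) == 3
assert probe_sequence(3) == [3, 7, 0, 4]
assert probe_sequence(14) == [3, 8, 2, 7]
```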

As α → 1, performance degrades sharply. Open-addressing tables should resize well before reaching α = 1 — typically at α ∈ [0.5, 0.75].

Probe count at various load factors

Load α    Unsuccessful    Successful
0.50      2.0 probes      1.4 probes
0.75      4.0 probes      1.8 probes
0.90      10.0 probes     2.6 probes
0.95      20.0 probes     3.2 probes

Probing comparison

Linear — fast cache, primary clustering

Quadratic — better spread, secondary clustering

Double — no clustering, two hash computations

09

Load Factor & Performance

Load factor

α = n / m
// n = stored elements, m = table capacity
  • Chaining: α can exceed 1.0. Performance degrades linearly: O(1 + α)
  • Open addressing: α must stay below 1.0. Performance degrades non-linearly

Amortised O(1) argument

Maintain α ≤ c for constant c < 1 by doubling when threshold exceeded:

  • Each insertion is O(1) expected
  • Resizing costs O(n) but happens only every O(n) insertions
  • Amortised cost per insertion: O(1)
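The argument can be checked empirically — a toy Python count of element moves under doubling (the initial size and threshold are illustrative):

```python
def total_moves(n, threshold=0.75):
    """Count how many element moves n insertions cause with doubling."""
    m, size, moves = 8, 0, 0
    for _ in range(n):
        size += 1
        if size / m > threshold:
            moves += size      # rehash every element into the new table
            m *= 2
    return moves

# Total rehash work stays within a small constant factor of n,
# so the amortised cost per insertion is O(1):
assert total_moves(10_000) < 3 * 10_000
```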

Space-time trade-off

Load target    Memory overhead      Probe perf.
α ≤ 0.50       2x actual data       Excellent
α ≤ 0.75       1.33x actual data    Good
α ≤ 0.90       1.11x actual data    Acceptable / poor

Most real-world hash tables target α ∈ [0.5, 0.75]. Python's dict resizes at α = 2/3.

Expected longest chain (chaining)

For n keys in m slots with universal hashing, the expected longest chain is Θ(log n / log log n).

10

Resizing & Rehashing

When to resize

Typically when α exceeds a threshold (e.g. 0.75). Some implementations also shrink when α drops below a lower bound (e.g. 0.25).

The rehashing process

  1. Allocate new table of size m' = 2m (or next prime > 2m)
  2. For every element, compute h(k) mod m' and insert into the new table
  3. Free the old table

Cost: O(n) — must rehash every element because the modulus changed.

Incremental rehashing (Redis approach)

  • Maintain two tables simultaneously
  • On each operation, migrate a few keys from old to new
  • Lookups check both tables during the transition
  • After all keys migrated, free the old table
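The four steps above can be sketched in Python — plain dicts stand in for the two hash tables, and the batch size and class name are illustrative:

```python
class IncrementalMap:
    """Redis-style incremental rehash: migrate a few keys per operation."""
    def __init__(self):
        self.old, self.new = {}, None    # new is None outside a migration

    def start_resize(self):
        self.new = {}

    def _migrate(self, batch=2):
        for _ in range(min(batch, len(self.old))):
            k, v = self.old.popitem()
            self.new[k] = v
        if not self.old:                 # migration finished
            self.old, self.new = self.new, None

    def put(self, k, v):
        if self.new is not None:
            self._migrate()
        if self.new is not None:
            self.new[k] = v
            self.old.pop(k, None)        # the new table now owns this key
        else:
            self.old[k] = v

    def get(self, k):
        if self.new is not None:
            self._migrate()
        if self.new is not None and k in self.new:
            return self.new[k]
        return self.old.get(k)           # lookups check both tables
```

Every operation pays a small constant migration tax, so there is never an O(n) pause.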

Growth strategies

Strategy          New size           Amortised
Doubling          2m                 O(1)
1.5x growth       1.5m               O(1), lower constant
Prime stepping    next prime > 2m    O(1), better distribution

Shrinking is often omitted in practice — the memory cost of an oversized table is usually acceptable, and shrink-grow oscillation near the threshold is expensive.

[Figure: full rehash — all n keys move from the old table (m) to the new table (2m) in one O(n) pause; incremental rehash migrates k keys per operation, with no latency spike]
11

Robin Hood Hashing

Equalising probe distances

A variant of open addressing (typically linear probing) in which the probe distance of each element is tracked.

The Robin Hood rule

During insertion, if the new element's probe distance exceeds the occupant's, swap them — "steal from the rich, give to the poor."

Insert key K with probe distance d:
  While slot is occupied:
    If d > occupant's probe distance:
      Swap K with occupant
      Continue inserting displaced element
    Advance to next slot, d++
  Place K in empty slot
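The pseudocode above translates to a short runnable Python sketch (fixed-size table, no resizing, duplicate keys unhandled):

```python
M = 16
table = [None] * M           # slots hold (key, value, probe_distance)

def rh_insert(key, value):
    i = hash(key) % M
    entry = (key, value, 0)
    while table[i] is not None:
        if entry[2] > table[i][2]:            # occupant is "richer": swap
            entry, table[i] = table[i], entry # continue inserting the evictee
        i = (i + 1) % M
        entry = (entry[0], entry[1], entry[2] + 1)
    table[i] = entry

def rh_search(key):
    i, d = hash(key) % M, 0
    # Early termination: once d exceeds the occupant's distance, the key
    # cannot be further along, so it is absent.
    while table[i] is not None and d <= table[i][2]:
        if table[i][0] == key:
            return table[i][1]
        i, d = (i + 1) % M, d + 1
    return None

rh_insert(1, "a")
rh_insert(17, "b")           # collides with 1 at slot 1, shifts to slot 2
assert rh_search(17) == "b"
```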

Benefits

Reduced variance — all elements have similar probe lengths

Expected max probe — O(log log n), vs O(log n) for standard linear probing

Early termination — lookup stops if current probe distance exceeds occupant's

Simpler deletion — backward-shift instead of tombstones

Robin Hood hashing makes linear probing practical at higher load factors. Rust's HashMap used Robin Hood hashing before switching to hashbrown (Swiss Table).

12

Cuckoo Hashing

Worst-case O(1) lookup

Uses two hash functions h1 and h2 and two tables T1, T2.

  • SEARCH: check T1[h1(k)] and T2[h2(k)] — exactly 2 lookups, always
  • DELETE: check both locations, remove — O(1) worst case

Insertion algorithm

Insert key k:
  If T1[h1(k)] empty -> place k, done
  Evict x from T1[h1(k)], place k
  If T2[h2(x)] empty -> place x, done
  Evict y from T2[h2(x)], place x
  ... repeat (max iteration limit)
  If cycle -> rehash with new functions
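A runnable sketch of this loop in Python — two toy hash functions over small tables; a production version would use independent hashes and rehash on failure:

```python
M = 11
T1, T2 = [None] * M, [None] * M

def h1(k): return hash(k) % M
def h2(k): return (hash(k) // M) % M     # toy "second" hash, not independent

def cuckoo_search(k):
    for T, h in ((T1, h1), (T2, h2)):    # exactly two probes, always
        e = T[h(k)]
        if e is not None and e[0] == k:
            return e[1]
    return None

def cuckoo_insert(k, v, max_kicks=32):
    entry = (k, v)
    for _ in range(max_kicks):
        for T, h in ((T1, h1), (T2, h2)):
            i = h(entry[0])
            if T[i] is None:
                T[i] = entry
                return True
            entry, T[i] = T[i], entry    # evict the occupant and continue
    return False                          # probable cycle — caller rehashes

for k, v in ((1, "a"), (12, "b"), (23, "c")):   # all three collide on h1
    assert cuckoo_insert(k, v)
assert cuckoo_search(12) == "b"
```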

Properties

Property          Value
Lookup            O(1) worst case
Deletion          O(1) worst case
Insertion         O(1) amortised (with rehash)
Max load factor   ~50% (two tables) or higher with buckets
Rehash trigger    Insertion cycle detected

Cuckoo hashing guarantees constant-time lookups — ideal for network routers, hardware implementations, and read-heavy workloads.

[Figure: two cuckoo tables — T1: k1 _ k3 k5; T2: k2 k4 _ _]
13

Hopscotch Hashing

Bounded neighbourhood

Each slot has a "neighbourhood" of H consecutive slots (typically H = 32). An element hashing to slot i must reside within i to i + H - 1.

Lookup

Check all H slots in the neighbourhood — a single cache line or two. Uses a bitmap per slot to record which neighbours are occupied by elements hashing to that slot.

Insertion

  1. Find an empty slot (may be beyond the neighbourhood)
  2. Repeatedly swap elements to move the empty slot closer, into the neighbourhood
  3. If no swap chain can bring the empty slot within range, resize
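The bitmap-guided lookup can be sketched in Python — a toy insert that only succeeds when the neighbourhood has a free slot (the hole-displacement step is omitted; H and the table size are illustrative):

```python
H = 4
N = 10
keys = [None] * N
vals = [None] * N
hop  = [0] * N   # hop[i] bit j set => slot i+j holds a key that hashed to i

def hop_insert_nearby(k, v):
    i = hash(k) % N
    for j in range(H):
        s = (i + j) % N
        if keys[s] is None:
            keys[s], vals[s] = k, v
            hop[i] |= 1 << j
            return True
    return False   # real hopscotch would now swap a hole into range, or resize

def hop_search(k):
    i = hash(k) % N
    for j in range(H):                   # bounded: at most H probes
        if hop[i] & (1 << j):
            s = (i + j) % N
            if keys[s] == k:
                return vals[s]
    return None

assert hop_insert_nearby(2, "a")         # lands in slot 2
assert hop_insert_nearby(12, "b")        # collides at 2, lands in slot 3
assert hop_search(12) == "b"
```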

Advantages over linear probing

Bounded worst-case probe — at most H slots

Cache-friendly — neighbourhood fits in 1–2 cache lines

Concurrency-friendly — lock only the neighbourhood, not the whole table

No tombstones — just clear the bitmap bit and the slot

Hopscotch hashing combines the cache-friendliness of linear probing with a bounded probe guarantee. It was introduced by Herlihy, Shavit, and Tzafrir (2008), with concurrent implementations as a primary goal.

Neighbourhood of slot 3 (H = 4) spans slots 3–6.
Bitmap 1101 — elements hashing to slot 3 reside at slots 3, 4, and 6.

14

Swiss Table

Google's flat_hash_map

Open-addressing with linear probing, but with SIMD-accelerated metadata lookups. Used in Abseil (absl::flat_hash_map) and Rust's hashbrown.

Architecture

  • Table divided into groups of 16 slots
  • Each group has a 16-byte control byte array (one byte per slot)
  • Control byte: top 7 bits of hash (H2) + 1 bit for empty/deleted/full
  • Probe sequence walks over groups, not individual slots

SIMD-powered lookup

1. h1 = hash(key)
2. group = h1 mod num_groups
3. H2 = top 7 bits of hash
4. Load 16 ctrl bytes → SIMD register
5. Compare all 16 vs H2 in ONE instr
   (SSE2 _mm_cmpeq_epi8)
6. Bitmask → check only matches
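The control-byte scan can be simulated scalar-wise in Python (the real implementation compares all 16 bytes in a single SSE2 instruction; the control bytes below mirror the example group on this slide):

```python
EMPTY, DELETED = 0xFF, 0xFE

def group_match(ctrl: bytes, h2: int):
    """Return slot indices within a 16-byte group whose control byte
    equals the 7-bit hash fragment h2 (the SIMD step, done scalar-wise)."""
    return [i for i, b in enumerate(ctrl) if b == h2]

ctrl = bytes([0x4A, 0x7F, EMPTY, 0x1B, 0x4A, EMPTY, DELETED, 0x33,
              EMPTY, 0x5C, 0x0D, EMPTY, 0x72, 0x4A, EMPTY, 0x19])

# Only the matching slots need a full key comparison:
assert group_match(ctrl, 0x4A) == [0, 4, 13]
```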

Performance characteristics

Property         Value
Probe unit       16 slots at once (SIMD)
Empty check      Single SIMD comparison
Load factor      Up to ~87.5% efficiently
Deletion         "deleted" sentinel — no tombstone clustering
Memory layout    Keys/values stored flat, no pointers

Swiss Table is the state of the art for general-purpose hash tables. It proves that linear probing + SIMD beats more complex schemes in practice.

One group (16 control bytes):

4A 7F FF 1B 4A FF FE 33 FF 5C 0D FF 72 4A FF 19

4A = H2 match · FF = empty · FE = deleted

15

Perfect Hashing

Zero collisions by construction

A perfect hash function maps n keys to m slots with no collisions. A minimal perfect hash uses m = n.

FKS two-level scheme

Level 1: Hash n keys into m primary slots using a universal hash function.

Level 2: For each primary slot j with n_j keys, build a secondary table of size n_j² with its own universal hash function.

Level 1:  h1(k) → primary slot j
Level 2:  h_j(k) → position in
          secondary table for slot j
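The two-level scheme can be sketched in Python for a static key set — level-2 collisions are resolved by redrawing a fresh random hash, which succeeds in O(1) expected attempts because a table of size n_j² makes a collision less likely than 1/2 (the prime P and helper names are illustrative):

```python
import random

P = 2_147_483_647    # a prime larger than any key

def make_hash(m):
    a, b = random.randrange(1, P), random.randrange(P)
    return lambda k: ((a * k + b) % P) % m

def build_fks(keys):
    n = len(keys)
    level1 = make_hash(n)                  # level 1: n primary slots
    buckets = [[] for _ in range(n)]
    for k in keys:
        buckets[level1(k)].append(k)
    secondary = []
    for bucket in buckets:                 # level 2: size n_j² per slot
        m2 = len(bucket) ** 2
        while True:                        # redraw until collision-free
            h2 = make_hash(m2) if m2 else (lambda k: 0)
            slots = [None] * m2
            ok = True
            for k in bucket:
                i = h2(k)
                if slots[i] is not None:
                    ok = False
                    break
                slots[i] = k
            if ok:
                secondary.append((h2, slots))
                break
    return level1, secondary

def fks_lookup(level1, secondary, k):
    h2, slots = secondary[level1(k)]
    return bool(slots) and slots[h2(k)] == k   # 2 hashes, 2 lookups

keys = [3, 17, 42, 99, 256]
level1, secondary = build_fks(keys)
assert all(fks_lookup(level1, secondary, k) for k in keys)
```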

Space and time

Property       Value
Lookup         O(1) worst case — 2 hashes, 2 lookups
Total space    O(n) expected
Construction   O(n) expected
Static only    Keys must be known in advance

Practical tools

gperf · CMPH · BBHash · CHD

Perfect hashing is optimal for static dictionaries — compiler keyword tables, routing tables with fixed prefix sets. For dynamic sets, cuckoo hashing is the closest analogue.

16

Hash Tables in Practice — Python & Java

Python dict (CPython 3.6+)

  • Open addressing with compact layout
  • Separate hash array (indices) and dense entries array (preserves insertion order)
  • Hash function: SipHash (hash-flooding resistant)
  • Resize at α = 2/3; growth ~3x small, ~2x large
  • Compact dict saves ~25% memory vs pre-3.6

Java HashMap

  • Separate chaining with linked list per bucket
  • At α = 0.75, resize to 2 * capacity
  • Chain > 8 elements → treeified (red-black tree, O(log n) per bucket)
  • Chain < 6 → reverts to linked list
  • Hash: (h = key.hashCode()) ^ (h >>> 16)

Comparison

Feature               Python dict         Java HashMap
Strategy              Open addressing     Separate chaining
Order preserved       Yes (insertion)     No (use LinkedHashMap)
Treeification         No                  Yes (chains > 8)
Hash function         SipHash             hashCode() XOR-shifted
Default load factor   0.67                0.75
17

Hash Tables in Practice — Go & C++

Go map

  • Separate chaining with buckets — each bucket holds 8 key-value pairs
  • Overflow buckets linked when full
  • Incremental rehashing — grows and evacuates lazily
  • Hash: architecture-dependent (AES on amd64)
  • Not thread-safe — concurrent read/write triggers a fatal runtime error (unrecoverable, unlike a panic)
  • Iteration order randomised by design

C++ std::unordered_map

  • Separate chaining with linked list per bucket (mandated by iterator stability rules)
  • Default max_load_factor = 1.0
  • Poor cache locality — pointer-based chains
  • Widely considered a performance anti-pattern

Why std::unordered_map is slow

Issue                Cause
Pointer chasing      Each node is a separate heap allocation
Cache misses         Nodes scattered across memory
High load factor     Default 1.0 means long chains
Iterator stability   Standard requires references to stay valid — prevents open addressing

For performance-critical C++ code, use absl::flat_hash_map (Swiss Table), robin_map (Tessil), or ankerl::unordered_dense.

18

Thread-Safe Hash Maps

The problem

Concurrent reads and writes require synchronisation. A global lock serialises all operations and destroys throughput.

Lock striping (Java ConcurrentHashMap)

  • Partition the table into segments (e.g. 16)
  • Each segment has its own lock
  • Operations on different segments proceed in parallel
  • Java 8+: replaced segments with per-bucket CAS + synchronized on bucket head
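A minimal lock-striped map in Python — one lock per stripe, chosen by hash; illustrative only, since real implementations use much finer-grained synchronisation:

```python
import threading

class StripedMap:
    def __init__(self, stripes=16):
        self.locks = [threading.Lock() for _ in range(stripes)]
        self.maps = [{} for _ in range(stripes)]   # one partition per stripe

    def _stripe(self, k):
        return hash(k) % len(self.locks)

    def put(self, k, v):
        s = self._stripe(k)
        with self.locks[s]:        # only this stripe is locked
            self.maps[s][k] = v

    def get(self, k, default=None):
        s = self._stripe(k)
        with self.locks[s]:
            return self.maps[s].get(k, default)

m = StripedMap()
m.put("x", 1)
assert m.get("x") == 1
```

Operations that hash to different stripes never contend, so throughput scales with the stripe count under uniform access.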

Lock-free approaches

  • RCU — readers lock-free; writers copy, modify, swap pointer
  • Harris-style — CAS-based insertion/deletion on chains
  • Cliff Click's NonBlockingHashMap — lock-free open addressing via CAS

Comparison

Implementation    Read         Write          Resize
Global lock       Serial       Serial         Blocking
Lock striping     Parallel     Parallel       Per-segment
CAS per bucket    Lock-free    Lock-free      Cooperative
RCU               Wait-free    Locked write   Copy-on-write

For read-heavy workloads, RCU or lock-free maps offer the best throughput. For mixed workloads, CAS-based per-bucket locking (Java 8 ConcurrentHashMap) is the industry standard.

19

Applications & Limitations

Applications

Symbol tables — compilers, interpreters, debuggers

Caching — memoisation, LRU caches (hash map + doubly-linked list)

Databases — hash indexes, hash joins, in-memory tables

Networking — connection tracking, NAT tables, routing tables

Deduplication — seen-set for crawlers, stream processing

Counting — frequency tables, word counts, histograms

Sets — membership testing (hash set = hash map with no values)

Limitations

No ordering — no range queries, min/max, successor (use BST / B-tree)

Hash quality matters — poor functions cause clustering and O(n)

Hash-flooding attacks — adversarial inputs force worst case (mitigate with SipHash / random seeds)

Memory overhead — open addressing wastes empty slots; chaining wastes pointers

Resizing latency — rehashing n elements pauses the system (mitigate with incremental rehash)

Not scan-optimal — iterating all elements slower than a flat array

When you need sorted iteration or range queries, use a balanced BST (std::map, TreeMap) or B-tree.

20

Summary & Further Reading

Key takeaways

  • Hash tables provide O(1) expected time for insert, search, delete — the fastest dictionary for point queries
  • Hash function choice determines distribution quality and security against adversarial inputs
  • Separate chaining is simple but cache-unfriendly; open addressing keeps data contiguous
  • Robin Hood hashing equalises probe distances; cuckoo hashing guarantees O(1) worst-case lookup
  • Swiss Table (SIMD + linear probing) is the current state of the art
  • Perfect hashing achieves O(1) worst case for static key sets
  • Load factor management and amortised resizing keep operations constant-time
  • Thread safety requires lock striping, CAS, or RCU — never a single global lock

Recommended reading

Source             Description
CLRS               Introduction to Algorithms, chapters 11–12
Knuth              TAOCP Vol. 3 — Sorting & Searching, Section 6.4
Herlihy & Shavit   The Art of Multiprocessor Programming
Abseil             Swiss Table design notes
Pagh & Rodler      "Cuckoo Hashing" — J. Algorithms, 2004
Celis et al.       "Robin Hood Hashing" — FOCS 1985