COMPUTER SCIENCE FUNDAMENTALS SERIES

Disjoint Sets
(Union-Find)

Union by rank · Path compression · Inverse Ackermann ·
Kruskal's MST · Connected components · Amortised O(α(n))
Mid-level software engineer track  ·  20 slides
02

The Disjoint Set ADT

What is a disjoint set?

A collection of non-overlapping (disjoint) sets. Every element belongs to exactly one set at any time. Each set has a designated representative element.

  • Partition — a family of disjoint subsets whose union equals the full universe
  • Representative — a canonical element that uniquely identifies the set
  • Equivalence — two elements are equivalent iff they share the same representative

Formal definition

Given universe U = {x₁, x₂, ..., xₙ}:

  • Sets S₁, S₂, ..., Sₖ such that Sᵢ ∩ Sⱼ = ∅ for i ≠ j
  • S₁ ∪ S₂ ∪ ... ∪ Sₖ = U

The disjoint set data structure supports a dynamic partition that starts with singletons and merges sets over time — but never splits them.

03

Core Operations — Make-Set, Find, Union

The three operations

Operation    Description                                         Effect
Make-Set(x)  Create a new set containing only x                  x becomes its own representative
Find(x)      Return the representative of the set containing x   Read-only (with optimisations)
Union(x, y)  Merge the sets containing x and y                   One representative chosen for the merged set

Typical usage pattern

Make-Set(0), Make-Set(1), ..., Make-Set(n-1)
Union(0, 1)
Union(2, 3)
Find(1) == Find(0)   # true  -- same set
Find(1) == Find(2)   # false -- different sets
Union(1, 3)
Find(0) == Find(2)   # true  -- now merged

Find answers the connectivity question: are two elements in the same set? Union establishes new connections. Neither operation ever separates elements.
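The session above can be run against a minimal dict-based implementation — a sketch with no heuristics; the efficient forest version appears on later slides:

```python
# Minimal union-find: a dict of parent pointers, no balancing heuristics.
parent = {}

def make_set(x):
    parent[x] = x            # x starts as its own representative

def find(x):
    while parent[x] != x:    # walk up to the root
        x = parent[x]
    return x

def union(x, y):
    rx, ry = find(x), find(y)
    if rx != ry:
        parent[ry] = rx      # attach one root under the other

for i in range(4):
    make_set(i)
union(0, 1)
union(2, 3)
print(find(1) == find(0))    # True  -- same set
print(find(1) == find(2))    # False -- different sets
union(1, 3)
print(find(0) == find(2))    # True  -- now merged
```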

04

Naive Linked-List Implementation

Structure

Each set is a linked list. Every node stores a pointer to the head (representative). Union appends one list to the other and updates the head pointer of every node in the appended list.

Complexity

Operation   Cost
Make-Set    O(1)
Find        O(1) — follow pointer to head
Union       O(n) worst case

The problem

A sequence of n - 1 Union operations on n singletons can cost O(n²) total if we always append the longer list to the shorter.

Using the weighted-union heuristic (always append shorter to longer), the total cost drops to O(m + n log n).

The linked-list approach is simple but the O(n) Union cost motivates the far superior forest representation.
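The list representation with the weighted-union heuristic can be sketched as follows (names like `head` are illustrative, not from a library):

```python
# Each set is a Python list; head[x] maps an element to the list that
# contains it. The list's first element serves as the representative.
head = {}

def make_set(x):
    head[x] = [x]

def find(x):
    return head[x][0]          # representative = first element

def union(x, y):
    lx, ly = head[x], head[y]
    if lx is ly:
        return
    if len(lx) < len(ly):      # weighted heuristic:
        lx, ly = ly, lx        # always append shorter onto longer
    lx.extend(ly)
    for z in ly:               # O(len of shorter list) pointer updates
        head[z] = lx

make_set("a")
make_set("b")
make_set("c")
union("a", "b")
print(find("b") == find("a"))  # True
print(find("c") == find("a"))  # False
```

Because an element's head pointer is only rewritten when it sits in the smaller list, each element moves at most O(log n) times — the source of the O(m + n log n) bound.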

05

Forest Representation (Parent Pointers)

Idea

Represent each set as a rooted tree. Each element stores a single parent pointer. The root points to itself and is the representative.

def make_set(x):
    parent[x] = x

def find(x):
    while parent[x] != x:
        x = parent[x]
    return x

def union(x, y):
    rx, ry = find(x), find(y)
    if rx != ry:
        parent[ry] = rx

Example

parent[]:  0  0  0  3  3
index:     0  1  2  3  4

Tree 1:    0       Tree 2:  3
          / \                |
         1   2               4

Complexity (naive forest)

Operation   Cost
Make-Set    O(1)
Find        O(n) worst case
Union       O(n) worst case

Without heuristics, n Unions can build a degenerate chain of depth n - 1. Two optimisations fix this.

06

Union by Rank

Concept

Maintain a rank for each root — an upper bound on the height of its subtree. When merging, attach the shorter tree under the taller one.

def make_set(x):
    parent[x] = x
    rank[x] = 0

def union(x, y):
    rx, ry = find(x), find(y)
    if rx == ry:
        return
    if rank[rx] < rank[ry]:
        rx, ry = ry, rx
    parent[ry] = rx
    if rank[rx] == rank[ry]:
        rank[rx] += 1

Why it works

  • A tree whose root has rank r contains at least 2^r nodes (by induction on Union)
  • Maximum rank is ⌊log₂ n⌋
  • Find is therefore O(log n) worst case

Key property

Rank only increases when two trees of equal rank merge. A node's rank never changes once it becomes a non-root.

Union by rank alone reduces worst-case Find from O(n) to O(log n). Combined with path compression it approaches O(1).

07

Union by Size

Track subtree sizes instead of rank

Store the actual count of nodes in each root's subtree. Always attach the smaller tree under the larger root.

def make_set(x):
    parent[x] = x
    size[x] = 1

def union(x, y):
    rx, ry = find(x), find(y)
    if rx == ry:
        return
    if size[rx] < size[ry]:
        rx, ry = ry, rx
    parent[ry] = rx
    size[rx] += size[ry]

Comparison with union by rank

Property           By rank              By size
Storage            Small int            Up to n
Height bound       ⌊log₂ n⌋             ⌊log₂ n⌋
Practical speed    Slightly faster      Slightly more memory
Path compression   Rank becomes stale   Size remains exact

Both achieve the same asymptotic bound. Union by rank is more common in textbooks; union by size is useful when you need the actual set cardinality.

08

Path Compression

The idea

During Find(x), make every node on the path from x to the root point directly to the root. Flattens the tree on every query.

def find(x):
    if parent[x] != x:
        parent[x] = find(parent[x])
    return parent[x]

Complexity

Path compression alone (without union by rank) gives amortised O(log n). Combined with union by rank: O(α(n)) amortised.

Before and after Find(4)

Before:           After:

     0                 0
     |               / | \
     1              1  2  4
     |              |
     2              3
     |
     3
     |
     4

Path compression is a retrospective optimisation — it does not prevent tall trees from forming, but flattens them after traversal.

09

Path Splitting & Path Halving

Path splitting

Every node on the find path points to its grandparent. Single-pass, iterative.

def find(x):
    while parent[x] != x:
        nxt = parent[x]           # avoid shadowing the builtin next()
        parent[x] = parent[nxt]   # point at the grandparent...
        x = nxt                   # ...then step to the old parent
    return x

Path halving

Every other node on the find path points to its grandparent.

def find(x):
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

Comparison

Technique          Nodes updated   Amortised   Stack
Full compression   All on path     O(α(n))     O(log n)
Path splitting     All on path     O(α(n))     O(1)
Path halving       ~Half           O(α(n))     O(1)

All three achieve the same amortised bound with union by rank. Path splitting and halving are preferred in practice — iterative and stack-free.

10

Union by Rank + Path Compression

The optimal combination

  • Trees stay shallow (rank bounds height)
  • Paths flatten on every Find (compression keeps future Finds fast)
  • Rank becomes a stale upper bound after compression, but the algorithm remains correct

Amortised complexity

For m operations on n elements:

Total: O(m · α(n))

Practical performance

n          α(n)
1          0
2          1
16         2
65 536     3
2^65536    4

For any conceivable input size, α(n) ≤ 4. Union-find operations are effectively constant time in practice.
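The two heuristics combine into one small class — a sketch consolidating the snippets from slides 06 and 08:

```python
class DisjointSet:
    """Union by rank with full (recursive) path compression."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, x):
        if self.parent[x] != x:
            # compress: every node on the path ends up pointing at the root
            self.parent[x] = self.find(self.parent[x])
        return self.parent[x]

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return False                 # already in the same set
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx              # rx is the higher-rank root
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1
        return True                      # a merge happened

ds = DisjointSet(5)
ds.union(0, 1)
ds.union(3, 4)
ds.union(1, 4)
print(ds.find(0) == ds.find(3))  # True
print(ds.find(0) == ds.find(2))  # False
```

Returning a bool from `union` is a common convenience: callers (e.g. Kruskal's, component counting) often need to know whether a merge actually happened.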

11

Amortised Analysis — Inverse Ackermann

The Ackermann function

A rapidly growing function defined recursively:

  • A(0, j) = j + 1
  • A(i, 0) = A(i - 1, 1)   for i > 0
  • A(i, j) = A(i - 1, A(i, j - 1))   for i, j > 0

The inverse α(n)

α(n) = min { k ≥ 1 : A(k, k) ≥ n }

(Exact argument conventions vary by author; the variants differ only by small additive constants.)

Since A grows astronomically fast, α grows unimaginably slowly.
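The recurrence transcribes directly into code; only tiny arguments terminate in reasonable time, which is the point — its inverse is correspondingly flat:

```python
import sys
from functools import lru_cache

sys.setrecursionlimit(100_000)   # A(3, ·) already recurses deeply

@lru_cache(maxsize=None)
def ackermann(i, j):
    if i == 0:
        return j + 1
    if j == 0:
        return ackermann(i - 1, 1)
    return ackermann(i - 1, ackermann(i, j - 1))

print(ackermann(2, 3))  # 9
print(ackermann(3, 3))  # 61
```

Already A(4, 2) has 19 729 decimal digits; A(4, 4) dwarfs any physically meaningful quantity, which is why α(n) never exceeds 4 in practice.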

Proof sketch (Tarjan, 1975)

  • Assign each non-root node a potential based on how far its rank is from its parent's rank
  • Each Find reduces the total potential
  • The potential drop per operation is bounded by α(n)

Lower bound

Fredman & Saks (1989) proved Ω(m · α(n)) is a lower bound in the cell-probe model. Union by rank + path compression is asymptotically optimal.

12

Weighted Quick-Union

Sedgewick's formulation

A simplified union-find using union by size with a flat array. Popular in algorithms courses (Princeton, Coursera).

public class WeightedQuickUnionUF {
  private int[] parent;
  private int[] sz;
  private int count;

  public WeightedQuickUnionUF(int n) {
    parent = new int[n];
    sz = new int[n];
    count = n;
    for (int i = 0; i < n; i++) {
      parent[i] = i;  sz[i] = 1;
    }
  }

  public int find(int p) {
    while (p != parent[p]) p = parent[p];
    return p;
  }

  public void union(int p, int q) {
    int rP = find(p), rQ = find(q);
    if (rP == rQ) return;
    if (sz[rP] < sz[rQ]) {
      parent[rP] = rQ; sz[rQ] += sz[rP];
    } else {
      parent[rQ] = rP; sz[rP] += sz[rQ];
    }
    count--;
  }
}

Complexity

Operation            Cost
Find                 O(log n)
Union                O(log n)
+ path compression   O(α(n)) amortised

Weighted quick-union is the go-to teaching implementation. Simple, efficient, and easy to extend with path compression.

13

Persistent Union-Find

Motivation

Standard union-find is destructive — path compression and union modify the structure in place. Sometimes we need to query past versions.

Approaches

Fat nodes

Each node stores a list of (timestamp, parent) pairs. Space: O(n + m), Find: O(log n · log m).

Path copying

Copy the path from changed node to root on each modification. Space: O(m log n).

Persistent arrays

Sleator-Tarjan persistent arrays backing parent/rank. O(log n) amortised per version.

Conchon-Filliâtre (2007)

A practical persistent union-find using a semi-persistent array (backtrackable). Used in SMT solvers and proof assistants (OCaml).

  • Current version: nearly O(α(n)) operations
  • Past versions: O(log n) per access with rerooting

Persistence sacrifices the O(α(n)) amortised bound but enables version queries — essential in functional programming and backtracking search.

14

Rollback / Offline Union-Find

Rollback union-find

Support Union and Find as usual, plus Rollback to undo the last Union. Key constraint: do not use path compression (it makes rollback impossible without persistence).

stack = []

def union(x, y):
    rx, ry = find(x), find(y)      # find must NOT compress paths
    if rx == ry:
        return                     # no-op: nothing pushed, nothing to undo
    if rank[rx] < rank[ry]:
        rx, ry = ry, rx
    stack.append((ry, parent[ry], rank[rx]))   # save state before mutating
    parent[ry] = rx
    if rank[rx] == rank[ry]:
        rank[rx] += 1

def rollback():
    ry, old_parent, old_rank = stack.pop()
    rx = parent[ry]                # the root ry was attached to
    rank[rx] = old_rank
    parent[ry] = old_parent

Offline union-find

When all queries are known in advance, process them in an optimal order (e.g., divide-and-conquer on a timeline). Used in competitive programming for "dynamic connectivity" problems.

Trade-off

Rollback union-find is O(log n) per operation (no path compression). You gain undo capability at the cost of the α(n) speedup.

Typical pattern

Divide the timeline into segments. At each node of the segment tree, apply the unions active during that segment, recurse, then rollback.

15

Application — Kruskal's MST

Algorithm

  1. Sort all edges by weight
  2. For each edge (u, v, w) in sorted order:
    — If Find(u) ≠ Find(v), add edge to MST and Union(u, v)
    — Otherwise skip (would create a cycle)
  3. Stop when MST has n - 1 edges

Why union-find is ideal

  • Need to test connectivity between two nodes efficiently
  • Need to merge components as edges are added
  • No need to split components
  • Exactly the operations disjoint sets provide
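The steps above can be sketched as follows (the edge format — a list of (w, u, v) tuples — is illustrative):

```python
def kruskal(n, edges):
    """edges: list of (weight, u, v) tuples; returns (total_weight, mst_edges)."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    mst, total = [], 0
    for w, u, v in sorted(edges):          # 1. sort edges by weight
        ru, rv = find(u), find(v)
        if ru != rv:                       # 2. endpoints in different sets?
            parent[rv] = ru                #    then take the edge and merge
            mst.append((u, v))
            total += w
            if len(mst) == n - 1:          # 3. MST complete, stop early
                break
    return total, mst

print(kruskal(3, [(1, 0, 1), (2, 1, 2), (3, 0, 2)]))  # (3, [(0, 1), (1, 2)])
```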

Complexity

Step                     Cost
Sort edges               O(E log E)
n Make-Set               O(n)
2E Find + (n-1) Union    O(E · α(n))
Total                    O(E log E)

Without union-find, Kruskal's would need an O(V)-time connectivity check (e.g., a traversal) per edge, giving O(VE) total. Union-find makes each check near-constant, leaving the O(E log E) sort as the bottleneck.

16

Application — Connected Components

Static graph

for v in vertices:
    make_set(v)

for (u, v) in edges:
    union(u, v)

# Connected iff find(u) == find(v)

Total: O(V + E · α(V)) — effectively linear.

Dynamic connectivity (incremental)

As edges arrive online, union-find maintains connected components without recomputation. Each new edge triggers at most one Union.

Counting components

Track a counter initialised to V. Decrement on each successful Union. The counter always holds the current number of connected components.
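The counter pattern just described, as a runnable sketch:

```python
def count_components(n, edges):
    """Number of connected components of an n-vertex graph."""
    parent = list(range(n))
    count = n                      # start with n singleton components

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[rv] = ru
            count -= 1             # each successful Union merges two components
    return count

print(count_components(5, [(0, 1), (1, 2), (3, 4)]))  # 2
```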

Comparison with BFS/DFS

Method       Time             Online insert
BFS/DFS      O(V + E)         Must rerun
Union-Find   O(V + E·α(V))    O(α(V)) per edge

For incremental connectivity (edge insertions only), union-find is strictly superior to rerunning a traversal; supporting edge deletions requires more elaborate dynamic-connectivity structures.

17

Application — Image Segmentation

Pixel-level segmentation

Treat each pixel as a node. Connect adjacent pixels whose intensity difference is below a threshold.

for each pixel p:
    make_set(p)

for each adjacent pair (p, q):
    if |intensity(p) - intensity(q)| < T:
        union(p, q)
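The loops above can be made concrete on a toy grayscale grid (the pixel values and threshold T are illustrative; pixel (r, c) is indexed as r * W + c):

```python
# Toy 3x3 grayscale image: dark corner, bright right edge, mid bottom-left.
img = [[10, 12, 90],
       [11, 13, 92],
       [80, 85, 91]]
H, W, T = 3, 3, 6

parent = list(range(H * W))

def find(x):
    while parent[x] != x:
        parent[x] = parent[parent[x]]   # path halving
        x = parent[x]
    return x

def union(x, y):
    rx, ry = find(x), find(y)
    if rx != ry:
        parent[ry] = rx

for r in range(H):
    for c in range(W):
        for dr, dc in ((0, 1), (1, 0)):        # right and down neighbours
            nr, nc = r + dr, c + dc
            if nr < H and nc < W and abs(img[r][c] - img[nr][nc]) < T:
                union(r * W + c, nr * W + nc)

regions = len({find(i) for i in range(H * W)})
print(regions)  # 3
```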

Performance

  • O(n log n) for n pixels when edges are sorted by weight (as in Felzenszwalb-Huttenlocher below)
  • Near-linear in practice with path compression
  • Produces perceptually meaningful regions

Felzenszwalb-Huttenlocher (2004)

A graph-based segmentation algorithm using union-find at its core:

Algorithm

  • Sort edges (pixel pairs) by intensity difference
  • Merge components if edge weight is small relative to internal variation
  • "Internal difference" = max edge weight in the component's MST

Union-find gives image segmentation algorithms their speed — merging millions of pixel regions in near-linear time.

18

Application — Percolation

The percolation model

An n × n grid of sites, each independently open with probability p. The system percolates if there is a connected path of open sites from top to bottom.

Union-find approach

  • Create n² + 2 elements: one per site + virtual top + virtual bottom
  • Connect virtual top to all top-row sites
  • Connect virtual bottom to all bottom-row sites
  • When a site opens, Union with adjacent open sites
  • Percolates when Find(top) == Find(bottom)
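The virtual-site construction above, as a sketch (the opening order is supplied by the caller; in a Monte Carlo run it would be a random permutation of the sites):

```python
def percolates(n, open_order):
    """open_order: sequence of (row, col) sites to open, in order.
    Returns the number of openings before the system first percolates,
    or None if it never does."""
    TOP, BOTTOM = n * n, n * n + 1       # two virtual sites
    parent = list(range(n * n + 2))
    is_open = [[False] * n for _ in range(n)]

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(x, y):
        rx, ry = find(x), find(y)
        if rx != ry:
            parent[ry] = rx

    for step, (r, c) in enumerate(open_order, 1):
        is_open[r][c] = True
        site = r * n + c
        if r == 0:
            union(site, TOP)             # top row touches virtual top
        if r == n - 1:
            union(site, BOTTOM)          # bottom row touches virtual bottom
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < n and 0 <= nc < n and is_open[nr][nc]:
                union(site, nr * n + nc)
        if find(TOP) == find(BOTTOM):
            return step
    return None

print(percolates(3, [(0, 0), (1, 0), (2, 0)]))  # 3
```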

Monte Carlo simulation

Randomly open sites one at a time. Track when percolation first occurs. Over many trials, estimate the percolation threshold p*.

For a square lattice: p* ≈ 0.5927 (site percolation)

Why union-find?

Each site opening requires up to 4 Union operations. Testing for percolation is a single Find comparison. Total cost per simulation: O(n² · α(n²)).

Percolation is a classic application from statistical physics. Union-find enables Monte Carlo estimation with millions of trials.

19

Equivalence Classes

Mathematical connection

A disjoint set maintains equivalence classes under a relation that is:

  • Reflexive — x ~ x (Make-Set ensures every element is equivalent to itself)
  • Symmetric — x ~ y ⇒ y ~ x (Union is symmetric)
  • Transitive — x ~ y, y ~ z ⇒ x ~ z (maintained through shared representatives)

Congruence closure

Extend union-find to handle function symbols: if a ~ b then f(a) ~ f(b). Used in SMT solvers for the theory of equality with uninterpreted functions.

Applications in compilers & type systems

Type unification

In Hindley-Milner type inference, union-find merges type variables when constraints are discovered.
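A toy illustration of that idea — not a real inference engine; the naming convention (strings starting with "'" are type variables, everything else is concrete) is hypothetical:

```python
# Toy unification over type terms via union-find.
parent = {}

def find(t):
    parent.setdefault(t, t)
    while parent[t] != t:
        parent[t] = parent[parent[t]]   # path halving
        t = parent[t]
    return t

def is_var(t):
    return t.startswith("'")            # e.g. "'a" is a type variable

def unify(a, b):
    ra, rb = find(a), find(b)
    if ra == rb:
        return
    if is_var(ra):                      # a variable joins the other class
        parent[ra] = rb
    elif is_var(rb):
        parent[rb] = ra
    else:                               # two distinct concrete types
        raise TypeError(f"cannot unify {ra} with {rb}")

unify("'a", "'b")        # constraint: a = b
unify("'b", "int")       # constraint: b = int
print(find("'a"))        # int
```

Preferring a concrete type as representative means a class "resolves" to a concrete type as soon as one constraint mentions it.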

Register coalescing

Merge registers that must hold the same value during register allocation.

Common subexpression elimination

Identify equivalent expressions to avoid redundant computation.

20

Summary & Further Reading

Key takeaways

  • The disjoint set ADT supports Make-Set, Find, and Union — a dynamic partition that only merges
  • Naive implementations cost O(n) per op; the forest representation is the foundation for all efficient variants
  • Union by rank keeps trees shallow — O(log n) worst case
  • Path compression flattens trees during Find — combined with rank: O(α(n)) amortised
  • α(n) ≤ 4 for all practical n — effectively constant
  • Persistent and rollback variants trade O(α(n)) for version access or undo
  • Applications: Kruskal's MST, connectivity, percolation, image segmentation, type inference

Recommended reading

Source                               Description
Tarjan (1975)                        Original inverse Ackermann analysis
CLRS, Ch. 21                         Data Structures for Disjoint Sets
Sedgewick & Wayne                    Algorithms, 4th ed. — Union-Find case study
Fredman & Saks (1989)                The Ω(m · α(n)) cell-probe lower bound
Conchon & Filliâtre (2007)           Persistent Union-Find
Felzenszwalb & Huttenlocher (2004)   Graph-Based Image Segmentation