Every divide-and-conquer algorithm follows the same recursive blueprint: divide the problem into smaller subproblems, conquer each one recursively, and combine the sub-results into the final answer.
D&C vs Dynamic Programming: both decompose problems. D&C subproblems are independent (no overlap); DP subproblems overlap and require memoisation.
A D&C algorithm with a subproblems of size n/b and O(n^d) combine work has the recurrence:

T(n) = a · T(n/b) + O(n^d)

| Algorithm | Recurrence | Solution |
|---|---|---|
| Binary search | T(n) = T(n/2) + O(1) | O(log n) |
| Merge sort | T(n) = 2T(n/2) + O(n) | O(n log n) |
| Karatsuba | T(n) = 3T(n/2) + O(n) | O(n^1.585) |
| Strassen | T(n) = 7T(n/2) + O(n^2) | O(n^2.807) |
| Naive multiply | T(n) = 4T(n/2) + O(n) | O(n^2) |
For the recurrence T(n) = a · T(n/b) + O(n^d) where a ≥ 1, b > 1, d ≥ 0:

| Case | Condition | Solution | Intuition |
|---|---|---|---|
| 1 | d < log_b(a) | T(n) = O(n^(log_b a)) | Recursion branches faster than work shrinks |
| 2 | d = log_b(a) | T(n) = O(n^d · log n) | Each level does equal work; log n levels total |
| 3 | d > log_b(a) | T(n) = O(n^d) | Combine cost dwarfs recursive cost |
| Algorithm | a, b, d | log_b(a) | Case | Result |
|---|---|---|---|---|
| Merge sort | 2, 2, 1 | 1 | Case 2 | O(n log n) |
| Binary search | 1, 2, 0 | 0 | Case 2 | O(log n) |
| Strassen | 7, 2, 2 | 2.807 | Case 1 | O(n^2.807) |
| Karatsuba | 3, 2, 1 | 1.585 | Case 1 | O(n^1.585) |
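As a sanity check on the rows above, a small sketch (a hypothetical helper, not from the source) that classifies a recurrence by the three cases:

```python
import math

def master_theorem(a: int, b: int, d: float) -> str:
    """Classify T(n) = a*T(n/b) + O(n^d) by the Master Theorem."""
    crit = math.log(a, b)  # critical exponent log_b(a)
    if math.isclose(d, crit):
        return f"Case 2: O(n^{d} log n)"
    if d < crit:
        return f"Case 1: O(n^{crit:.3f})"
    return f"Case 3: O(n^{d})"

print(master_theorem(2, 2, 1))  # merge sort -> Case 2
print(master_theorem(3, 2, 1))  # Karatsuba  -> Case 1: O(n^1.585)
```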
The Master Theorem does not cover all recurrences. Akra–Bazzi generalises it to unequal subproblem sizes.
Merge sort: O(n log n) in the worst, average, and best case; O(n) auxiliary space (merge buffer). The classic pseudocode, made runnable in Python:

```python
def merge_sort(A, lo, hi):
    """Sort A[lo:hi] in place (hi is exclusive)."""
    if hi - lo <= 1:
        return
    mid = (lo + hi) // 2
    merge_sort(A, lo, mid)   # divide: sort the left half
    merge_sort(A, mid, hi)   # divide: sort the right half
    merge(A, lo, mid, hi)    # combine: merge the sorted halves

def merge(A, lo, mid, hi):
    L, R = A[lo:mid], A[mid:hi]
    i = j = 0
    k = lo
    while i < len(L) and j < len(R):
        if L[i] <= R[j]:
            A[k] = L[i]; i += 1
        else:
            A[k] = R[j]; j += 1
        k += 1
    A[k:hi] = L[i:] + R[j:]   # copy remaining L or R into A
```
The recursion tree does n work at each of its log n levels:

- Level 0: n work (1 merge of size n)
- Level 1: n work (2 merges of size n/2)
- Level 2: n work (4 merges of size n/4)
- ...
- Level log n: n work (n merges of size 1)

Total: n · log n.
Merge sort can count inversions during the merge step. When a right-side element is chosen before left-side elements, it crosses over all remaining left elements — add that count.
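A sketch of that counting trick folded into the merge (helper name is mine, not from the source):

```python
def count_inversions(A):
    """Return (sorted copy of A, number of inversions), via merge sort."""
    if len(A) <= 1:
        return A, 0
    mid = len(A) // 2
    left, inv_l = count_inversions(A[:mid])
    right, inv_r = count_inversions(A[mid:])
    merged, i, j, inv = [], 0, 0, inv_l + inv_r
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            inv += len(left) - i   # right[j] crosses all remaining left elements
            merged.append(right[j]); j += 1
    merged.extend(left[i:]); merged.extend(right[j:])
    return merged, inv

print(count_inversions([2, 4, 1, 3, 5])[1])  # 3
```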
Timsort, Python and Java's hybrid: natural merge sort plus insertion sort for small runs. Exploits partially-sorted input for an O(n) best case.
Bottom-up merge sort, the iterative variant: merge runs of size 1, then 2, then 4, and so on. Better cache locality and no recursion overhead; see the sketch below.
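A minimal sketch, reusing the merge routine from the code above:

```python
def merge_sort_bottom_up(A):
    """Iterative merge sort: merge runs of width 1, 2, 4, ..."""
    n, width = len(A), 1
    while width < n:
        for lo in range(0, n - width, 2 * width):
            mid = lo + width
            hi = min(lo + 2 * width, n)
            merge(A, lo, mid, hi)   # in-place merge from the sketch above
        width *= 2
    return A
```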
External merge sort: the standard algorithm for sorting data that does not fit in RAM. Split the input into sorted runs, then merge the runs from disk using a priority queue.
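Python's standard library exposes exactly this heap-backed k-way merge; a sketch with in-memory runs standing in for disk files:

```python
import heapq

runs = [[1, 4, 9], [2, 3, 8], [5, 6, 7]]  # sorted runs, e.g. one per disk chunk
merged = heapq.merge(*runs)               # lazy k-way merge backed by a heap
print(list(merged))                       # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```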
Merge sort is the comparison sort baseline — guaranteed O(n log n) regardless of input distribution.
Quicksort's running time depends on partition balance:

| Case | Time | When |
|---|---|---|
| Best | O(n log n) | Balanced partitions |
| Average | O(n log n) | Random pivot |
| Worst | O(n^2) | Sorted input + bad pivot |
```python
def quicksort(A, lo, hi):
    """Sort A[lo..hi] (inclusive) in place."""
    if lo < hi:
        p = partition(A, lo, hi)   # Lomuto partition, defined below
        quicksort(A, lo, p - 1)
        quicksort(A, p + 1, hi)
```
Space: O(log n) expected stack depth, versus O(n) auxiliary for merge sort.

Lomuto partition (pivot = last element):

```python
def partition(A, lo, hi):
    pivot = A[hi]
    i = lo
    for j in range(lo, hi):
        if A[j] <= pivot:
            A[i], A[j] = A[j], A[i]
            i += 1
    A[i], A[hi] = A[hi], A[i]   # move the pivot into its final slot
    return i
```
Hoare partition (pivot = first element; fewer swaps on average):

```python
def hoare_partition(A, lo, hi):
    pivot = A[lo]
    i, j = lo - 1, hi + 1
    while True:
        i += 1
        while A[i] < pivot:
            i += 1
        j -= 1
        while A[j] > pivot:
            j -= 1
        if i >= j:
            return j   # caller recurses on [lo, j] and [j+1, hi]
        A[i], A[j] = A[j], A[i]
```
Three-way (Dutch national flag) partitioning splits the array into < pivot, = pivot, and > pivot regions. Essential when many duplicates are present. With lo, mid, hi pointers, scan left to right (see the sketch below):

- element < pivot: swap into the lo region; advance lo and mid
- element = pivot: advance mid
- element > pivot: swap into the hi region; shrink hi
Used by std::sort variants and pdqsort.
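A sketch of that scan, assuming the pivot value is passed in (function name is mine, not from the source):

```python
def partition3(A, lo, hi, pivot):
    """Rearrange A[lo..hi] into < pivot, == pivot, > pivot regions."""
    mid = lo
    while mid <= hi:
        if A[mid] < pivot:
            A[lo], A[mid] = A[mid], A[lo]
            lo += 1
            mid += 1
        elif A[mid] == pivot:
            mid += 1
        else:
            A[mid], A[hi] = A[hi], A[mid]
            hi -= 1
    return lo, hi   # the == pivot region is A[lo..hi]
```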
```python
def binary_search(A, target):
    lo, hi = 0, len(A) - 1
    while lo <= hi:
        mid = lo + (hi - lo) // 2   # overflow-safe form; see the note below
        if A[mid] == target:
            return mid
        elif A[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1
```
- Time: O(log n); recurrence T(n) = T(n/2) + O(1), Master Case 2
- Space: O(1) iterative, O(log n) recursive
Write lo + (hi - lo) / 2 rather than (lo + hi) / 2: in languages with fixed-width integers the sum can overflow.

| Variant | Description |
|---|---|
| Lower bound | First index where A[i] ≥ target |
| Upper bound | First index where A[i] > target |
| Exponential | Doubling to find range, then binary search — O(log k) |
| Interpolation | Estimate by value distribution — O(log log n) uniform |
| Fractional cascading | Search multiple sorted lists — amortised O(log n + k) |
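Lower and upper bound ship with Python's standard library:

```python
import bisect

A = [1, 3, 3, 3, 7]
print(bisect.bisect_left(A, 3))   # 1: first index with A[i] >= 3 (lower bound)
print(bisect.bisect_right(A, 3))  # 4: first index with A[i] > 3 (upper bound)
```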
Divide each n×n matrix into four n/2 × n/2 submatrices. Naive block multiply needs 8 recursive multiplications. Strassen reduces this to 7.
M1 = (A11 + A22)(B11 + B22)
M2 = (A21 + A22) B11
M3 = A11 (B12 - B22)
M4 = A22 (B21 - B11)
M5 = (A11 + A12) B22
M6 = (A21 - A11)(B11 + B12)
M7 = (A12 - A22)(B21 + B22)
Result blocks computed from M1..M7 using only addition/subtraction.
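A minimal numpy sketch, assuming n is a power of two; the result blocks use the standard combinations C11 = M1 + M4 - M5 + M7, C12 = M3 + M5, C21 = M2 + M4, C22 = M1 - M2 + M3 + M6:

```python
import numpy as np

def strassen(A, B):
    """Multiply square matrices whose size is a power of two."""
    n = A.shape[0]
    if n <= 64:                  # below the crossover, naive wins
        return A @ B
    h = n // 2
    A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
    B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]
    M1 = strassen(A11 + A22, B11 + B22)
    M2 = strassen(A21 + A22, B11)
    M3 = strassen(A11, B12 - B22)
    M4 = strassen(A22, B21 - B11)
    M5 = strassen(A11 + A12, B22)
    M6 = strassen(A21 - A11, B11 + B12)
    M7 = strassen(A12 - A22, B21 + B22)
    return np.block([[M1 + M4 - M5 + M7, M3 + M5],
                     [M2 + M4,           M1 - M2 + M3 + M6]])
```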
| Method | Multiplications | Time |
|---|---|---|
| Naive | 8 per level | O(n^3) |
| Strassen | 7 per level | O(n^2.807) |
| CW variants | — | O(n^2.372) (galactic) |
Strassen is practical for n ≥ 64 or so. Below that threshold, the constant factor from extra additions makes naive faster. Libraries like BLAS switch strategies based on matrix size.
Given n points in the plane, find the pair with minimum Euclidean distance. Brute force: O(n^2).
Runs in O(n log n): recursively solve the two halves to get d_L and d_R, set d = min(d_L, d_R), then check pairs straddling the dividing line within a strip of width 2d. Recurrence: T(n) = 2T(n/2) + O(n) → O(n log n).
For each point in the strip, compare to at most 7 subsequent points (sorted by y). Strip processing is O(n), not O(n2).
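A compact sketch, assuming the input is pre-sorted by x (re-sorting the strip by y inside each call makes this the simpler O(n log^2 n) variant):

```python
import math

def closest_pair(pts):
    """Minimum distance among points; pts must be sorted by x."""
    n = len(pts)
    if n <= 3:
        return min(math.dist(p, q) for i, p in enumerate(pts)
                   for q in pts[i + 1:])
    mid = n // 2
    x_mid = pts[mid][0]
    d = min(closest_pair(pts[:mid]), closest_pair(pts[mid:]))
    # strip of width 2d around the dividing line, sorted by y
    strip = sorted((p for p in pts if abs(p[0] - x_mid) < d),
                   key=lambda p: p[1])
    for i, p in enumerate(strip):
        for q in strip[i + 1:i + 8]:   # at most 7 neighbours matter
            d = min(d, math.dist(p, q))
    return d

pts = sorted([(0, 0), (3, 4), (1, 1), (5, 2)])
print(closest_pair(pts))  # 1.414... for (0,0)-(1,1)
```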
Multiply two n-digit integers. School algorithm: O(n^2).
Split each number into high and low halves:
x = x_H · B^m + x_L
y = y_H · B^m + y_L
Naive: 4 multiplications of n/2-digit numbers. Karatsuba uses 3:
z0 = x_L · y_L
z2 = x_H · y_H
z1 = (x_L + x_H)(y_L + y_H) - z0 - z2
x·y = z2·B^2m + z1·B^m + z0
Recurrence: T(n) = 3T(n/2) + O(n)
Master Case 1: d = 1 < log_2(3) ≈ 1.585, so T(n) = O(n^1.585)
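A sketch on Python ints with base B = 10 (production implementations split on machine words instead):

```python
def karatsuba(x, y):
    """Multiply non-negative integers with 3 recursive products."""
    if x < 10 or y < 10:
        return x * y
    m = max(len(str(x)), len(str(y))) // 2
    B = 10 ** m
    x_hi, x_lo = divmod(x, B)
    y_hi, y_lo = divmod(y, B)
    z0 = karatsuba(x_lo, y_lo)
    z2 = karatsuba(x_hi, y_hi)
    z1 = karatsuba(x_lo + x_hi, y_lo + y_hi) - z0 - z2
    return z2 * B * B + z1 * B + z0

print(karatsuba(1234, 5678))  # 7006652
```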
| Algorithm | Complexity | Crossover |
|---|---|---|
| Schoolbook | O(n^2) | n < 20–80 |
| Karatsuba | O(n^1.585) | 20–80 digits |
| Toom-Cook | O(n^1.465) | ~100+ digits |
| FFT-based | O(n log n log log n) | ~10,000+ digits |
Python's int and GMP chain these algorithms as n grows.
Maximum subarray: find the contiguous slice with the largest sum. The D&C solution splits at the midpoint, recurses on each half, and computes the best midpoint-crossing sum in O(n) during the combine step: T(n) = 2T(n/2) + O(n) → O(n log n).
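A sketch of that recursion (the two linear scans form the crossing check):

```python
def max_subarray(A, lo, hi):
    """Largest sum of a contiguous slice of A[lo..hi] (inclusive)."""
    if lo == hi:
        return A[lo]
    mid = (lo + hi) // 2
    # combine: best sum that crosses the midpoint
    s, best_left = 0, A[mid]
    for i in range(mid, lo - 1, -1):
        s += A[i]
        best_left = max(best_left, s)
    s, best_right = 0, A[mid + 1]
    for i in range(mid + 1, hi + 1):
        s += A[i]
        best_right = max(best_right, s)
    return max(max_subarray(A, lo, mid),
               max_subarray(A, mid + 1, hi),
               best_left + best_right)
```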
Kadane's algorithm solves the same problem in O(n):

```python
def kadane(A):
    """Maximum sum over non-empty contiguous subarrays of A."""
    max_here = max_so_far = A[0]
    for x in A[1:]:
        max_here = max(x, max_here + x)        # extend the run or restart at x
        max_so_far = max(max_so_far, max_here)
    return max_so_far
```
Single pass, O(1) space. Essentially dynamic programming.
| Metric | D&C | Kadane |
|---|---|---|
| Time | O(n log n) | O(n) |
| Parallelism | Naturally parallel | Sequential |
| Teaching value | Demonstrates combine step | Demonstrates DP |
The maximum subarray is a case where D&C is instructive but not optimal. Kadane wins for serial execution; D&C wins for parallel.
Find the k-th smallest element in an unsorted array.
Quickselect: partition around a random pivot, then recurse only into the side containing index k. Expected O(n), worst case O(n^2).
Median of medians guarantees O(n) in the worst case: the pivot is guaranteed to fall between the 30th and 70th percentiles, so each recursive call eliminates at least 30% of the elements:
T(n) = T(n/5) + T(7n/10) + O(n)
This solves to O(n) because 1/5 + 7/10 = 9/10 < 1.
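A sketch of the randomised version (iterative, Lomuto partition, 0-indexed k):

```python
import random

def quickselect(A, k):
    """Return the k-th smallest element of A (k is 0-indexed)."""
    lo, hi = 0, len(A) - 1
    while True:
        if lo == hi:
            return A[lo]
        p = random.randint(lo, hi)   # random pivot defeats adversarial input
        A[p], A[hi] = A[hi], A[p]
        i = lo
        for j in range(lo, hi):      # Lomuto partition around A[hi]
            if A[j] <= A[hi]:
                A[i], A[j] = A[j], A[i]
                i += 1
        A[i], A[hi] = A[hi], A[i]
        if k == i:
            return A[i]
        elif k < i:                  # recurse only into the side holding k
            hi = i - 1
        else:
            lo = i + 1

print(quickselect([7, 2, 9, 4, 1], 2))  # 4 (the median)
```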
std::nth_element in C++ uses Introselect: Quickselect with median-of-medians fallback.
The constant factor is large — Quickselect with random pivot is faster in practice. Median of medians is primarily of theoretical importance: it proves linear selection is possible.
Given a sequence of n complex numbers, the DFT computes n frequency components. Naive evaluation: O(n^2).
Divide the DFT into two half-size DFTs on even-indexed and odd-indexed elements:
X[k] = E[k] + ω^k · O[k]
X[k + n/2] = E[k] − ω^k · O[k]

where ω = e^{−2πi/n}, E is the DFT of the even-indexed elements, and O is the DFT of the odd-indexed elements.
T(n) = 2T(n/2) + O(n) → O(n log n)
Master theorem Case 2. Reduces the DFT from O(n^2) to O(n log n).
The butterfly: one addition and one subtraction of a term scaled by a twiddle factor. Maps naturally to hardware and SIMD.
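A textbook radix-2 sketch, assuming n is a power of two (production FFTs are iterative and in-place):

```python
import cmath

def fft(x):
    """Recursive radix-2 Cooley-Tukey DFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return x
    E, O = fft(x[0::2]), fft(x[1::2])   # even- and odd-indexed halves
    out = [0] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * O[k]   # twiddle factor ω^k
        out[k] = E[k] + t               # butterfly: one add...
        out[k + n // 2] = E[k] - t      # ...and one subtract
    return out

print([round(abs(v), 3) for v in fft([1, 1, 1, 1])])  # [4.0, 0.0, 0.0, 0.0]
```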
FFT-based polynomial multiplication runs in O(n log n) instead of O(n^2); FFT-based integer multiplication in O(n log n log log n).

| Transform | Domain | Use |
|---|---|---|
| FFT | Complex numbers | General signal processing |
| NTT | Integers mod p | Exact polynomial arithmetic |
| DCT | Real numbers | Compression (JPEG, MP3) |
| Walsh–Hadamard | Binary/Boolean | Subset-sum convolutions |
FFT over finite fields (integers mod prime p) instead of complex numbers. Avoids floating-point errors entirely. Used in competitive programming and cryptographic polynomial multiplication.
D&C subproblems are independent — they can run on separate cores, threads, or machines without synchronisation until the combine step.
Fork-join quicksort (pseudocode):

```
fork_join_quicksort(A, lo, hi):
    if hi - lo < THRESHOLD:
        insertion_sort(A, lo, hi)   # small ranges: sequential insertion sort
        return
    p = partition(A, lo, hi)
    fork: fork_join_quicksort(A, lo, p-1)
    fork: fork_join_quicksort(A, p+1, hi)
    join                            # wait for both forked subtasks
```
| Metric | Definition | Merge sort |
|---|---|---|
| Work W(n) | Total operations | O(n log n) |
| Span S(n) | Longest dependency chain | O(n) |
| Parallelism | W(n) / S(n) | O(log n) |
| Framework | Primitives |
|---|---|
| Java Fork/Join | RecursiveTask, work-stealing thread pool |
| C++ (Intel TBB) | parallel_for, parallel_reduce |
| Cilk | cilk_spawn, cilk_sync |
| Rust (Rayon) | par_iter, work-stealing |
| Python (multiprocessing) | Process pools (GIL workaround) |
Amdahl's Law: speedup is limited by the sequential fraction. In merge sort, the final merge is sequential and dominates at high core counts.
Two classic pitfalls: a fixed pivot on sorted input drives quicksort to O(n^2), so always randomise or use median-of-three; and (lo + hi) / 2 overflows for large arrays in fixed-width languages, so use lo + (hi - lo) / 2.

| Technique | Applies to | Benefit |
|---|---|---|
| Hybrid cutoff | All D&C sorts | Insertion sort for n < 16 |
| Tail recursion | Quick Sort | O(log n) stack depth |
| Introsort | Quick Sort | Heap sort fallback at 2 log n depth |
| Bottom-up merge | Merge Sort | Better cache locality |
| Bit-reversal | FFT | In-place iterative FFT |
| Language | Algorithm | Notes |
|---|---|---|
| C++ | Introsort | QS + Heap + Insertion |
| Python | Timsort | Merge + Insertion; exploits runs |
| Java | Dual-pivot QS / Timsort | QS for primitives, Timsort for objects |
| Rust | pdqsort | Pattern-defeating QS |
| Go | pdqsort (1.19+) | Replaced introsort |
| Domain | Algorithm | D&C idea |
|---|---|---|
| Geometry | Convex hull | Merge two convex hulls |
| Linear algebra | Strassen, FFT solvers | Block decomposition |
| Databases | External merge sort | Divide across disk pages |
| Distributed | MapReduce | Map = divide, Reduce = combine |
| Graphics | BSP trees, k-d trees | Spatial subdivision |
If your problem decomposes into independent subproblems of the same type, D&C is likely the right paradigm.
Merge sort sets the O(n log n) comparison-sort baseline; Quick Sort wins in practice.