A priority queue is a container where each element has an associated priority. Core operations: insert, find-min (or find-max), extract-min (or extract-max), and, in richer variants, decrease-key and merge.
Sorting gives all elements in order but costs O(n log n) up front and requires all elements known. A priority queue supports dynamic insertion and extraction — elements arrive over time.
| Implementation | insert | extract | dec-key | merge |
|---|---|---|---|---|
| Unsorted array | O(1) | O(n) | O(1) | O(1) |
| Sorted array | O(n) | O(1) | O(n) | O(n) |
| Binary heap | O(log n) | O(log n) | O(log n) | O(n) |
| Binomial heap | O(1)* | O(log n) | O(log n) | O(log n) |
| Fibonacci heap | O(1)* | O(log n)* | O(1)* | O(1)* |
| Pairing heap | O(1) | O(log n)* | O(log log n)* | O(1) |
* = amortised
A binary heap is a complete binary tree — every level is fully filled except possibly the last, which is filled left to right.
The height of a complete binary tree on n nodes is ⌊log₂ n⌋, which guarantees worst-case O(log n) operation bounds. A binary heap is not a binary search tree: there is no ordering between left and right children, only between parent and child.
Min-heap: every node's key is less than or equal to both of its children's keys. The root holds the minimum.
Max-heap: every node's key is greater than or equal to both of its children's keys. The root holds the maximum.
The heap property is weaker than the BST property. Siblings have no ordering relative to each other. This weaker invariant is what makes heap operations faster for priority queue use cases.
A complete binary tree maps perfectly to a flat array with zero overhead — no pointers, no node allocations, no fragmented memory.
| Relationship | Formula |
|---|---|
| Parent of node i | (i - 1) / 2 (integer division) |
| Left child of i | 2i + 1 |
| Right child of i | 2i + 2 |
| Is leaf? | i >= n / 2 |
| Last internal node | n/2 - 1 |
The array representation is why binary heaps dominate in practice despite theoretically inferior bounds to Fibonacci heaps.
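As a sanity check, the index formulas can be written directly (the helper names are illustrative, not from the text):

```python
def parent(i):
    return (i - 1) // 2  # floor division


def left(i):
    return 2 * i + 1


def right(i):
    return 2 * i + 2


# The child formulas invert the parent formula:
assert parent(left(4)) == 4 and parent(right(4)) == 4
```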
def insert(heap, key):
    heap.append(key)
    i = len(heap) - 1
    # Sift up
    while i > 0:
        parent = (i - 1) // 2
        if heap[i] < heap[parent]:
            heap[i], heap[parent] = heap[parent], heap[i]
            i = parent
        else:
            break
Time: O(log n) — at most one swap per level from leaf to root.
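Python's `heapq.heappush` performs the same append-then-sift-up, so repeated pushes reproduce this behaviour:

```python
import heapq

heap = []
for key in [5, 3, 8, 1]:
    heapq.heappush(heap, key)  # append, then sift up

print(heap)  # → [1, 3, 8, 5]: a valid min-heap, not a sorted list
```

Note the result is heap-ordered, not sorted: only the parent/child relationships are constrained.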
def extract_min(heap):
    minimum = heap[0]
    last = heap.pop()      # remove the last element
    if heap:               # guard against the single-element case
        heap[0] = last     # move last to root
        sift_down(heap, 0)
    return minimum
def sift_down(heap, i, n=None):
    if n is None:
        n = len(heap)  # heap sort and build-heap pass an explicit bound
    while 2 * i + 1 < n:
        child = 2 * i + 1
        if child + 1 < n and heap[child + 1] < heap[child]:
            child += 1
        if heap[i] <= heap[child]:
            break
        heap[i], heap[child] = heap[child], heap[i]
        i = child
Extract-min from [2, 5, 3, 8, 7, 9, 4]: save the root 2, move the last element 4 to the root, then sift down — 4 swaps with its smaller child 3 — leaving [3, 5, 4, 8, 7, 9].
Time: O(log n) — at most one swap per level. Sift-down is also the core subroutine of build-heap and heap sort.
Naive (top-down): insert elements one by one — n insertions × O(log n) each = O(n log n).
Floyd's bottom-up: start from the last internal node and sift down each node, working backwards to the root.
def build_heap(arr):
    n = len(arr)
    # Start from the last internal node, work backwards to the root
    for i in range(n // 2 - 1, -1, -1):
        sift_down(arr, i, n)
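For reference, a self-contained run of Floyd's build-heap, with `sift_down` repeated so the snippet executes on its own:

```python
def sift_down(arr, i, n):
    # Restore the min-heap property for position i within arr[:n]
    while 2 * i + 1 < n:
        child = 2 * i + 1
        if child + 1 < n and arr[child + 1] < arr[child]:
            child += 1
        if arr[i] <= arr[child]:
            break
        arr[i], arr[child] = arr[child], arr[i]
        i = child


def build_heap(arr):
    n = len(arr)
    for i in range(n // 2 - 1, -1, -1):  # last internal node down to root
        sift_down(arr, i, n)


data = [9, 4, 7, 1, 2, 5, 3]
build_heap(data)
print(data)  # → [1, 2, 3, 4, 9, 5, 7]
```

Every parent is ≤ both children, even though the array is not sorted.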
Key insight: most nodes are near the bottom and do almost no work. This makes the total O(n), not O(n log n).
| Level (from bottom) | Nodes | Sift cost | Total |
|---|---|---|---|
| 0 (leaves) | n/2 | 0 | 0 |
| 1 | n/4 | 1 | n/4 |
| 2 | n/8 | 2 | n/4 |
| 3 | n/16 | 3 | 3n/16 |
| k | n/2^(k+1) | k | k·n/2^(k+1) |
Half the nodes are leaves (zero work), a quarter do one swap, an eighth do two — the geometric series converges to O(n).
T(n) = Σ (k=0 to h) ⌊n / 2^(k+1)⌋ · k
     ≤ n · Σ (k=0 to ∞) k / 2^(k+1)
     = (n/2) · Σ (k=0 to ∞) k · x^k   where x = 1/2
Using the power series identity Σ k·x^k = x / (1-x)² for |x| < 1:
Σ (k=0 to ∞) k / 2^k = (1/2) / (1 - 1/2)²
                     = (1/2) / (1/4)
                     = 2
Therefore: T(n) ≤ (n/2) · 2 = n, so build-heap is O(n).
This is a tight bound — you cannot build a heap faster than O(n) because you must at least read every element.
Intuition: The sum converges because the number of nodes decreases exponentially as the sift-down cost increases linearly. The geometric decrease dominates.
Contrast with top-down: Inserting one-by-one is O(n log n) because sift-up does the most work at the bottom where most nodes live. Floyd's algorithm uses sift-down, which does the least work at the bottom.
Interview note: This proof is frequently tested. Know the series identity and be able to explain why bottom-up is linear.
def heap_sort(arr):
    # Note: heap sort needs the *max-heap* variant of sift_down --
    # pick the larger child and stop when arr[i] >= arr[child].
    n = len(arr)
    # Phase 1: build max-heap -- O(n)
    for i in range(n // 2 - 1, -1, -1):
        sift_down(arr, i, n)
    # Phase 2: swap max to the end, shrink the heap, restore -- O(n log n)
    for end in range(n - 1, 0, -1):
        arr[0], arr[end] = arr[end], arr[0]
        sift_down(arr, 0, end)
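A self-contained, runnable sketch with the max-heap sift-down written out (comparisons flipped relative to the min-heap version):

```python
def sift_down_max(arr, i, n):
    # Max-heap variant: the larger child rises
    while 2 * i + 1 < n:
        child = 2 * i + 1
        if child + 1 < n and arr[child + 1] > arr[child]:
            child += 1
        if arr[i] >= arr[child]:
            break
        arr[i], arr[child] = arr[child], arr[i]
        i = child


def heap_sort(arr):
    n = len(arr)
    for i in range(n // 2 - 1, -1, -1):   # build max-heap, O(n)
        sift_down_max(arr, i, n)
    for end in range(n - 1, 0, -1):       # repeatedly move max to the end
        arr[0], arr[end] = arr[end], arr[0]
        sift_down_max(arr, 0, end)


data = [5, 2, 9, 1, 7]
heap_sort(data)
print(data)  # → [1, 2, 5, 7, 9]
```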
| Property | Value |
|---|---|
| Time (worst / avg / best) | O(n log n) |
| Space | O(1) — in-place |
| Stable? | No |
| Adaptive? | No — always O(n log n) |
Advantage over quicksort: guaranteed O(n log n) worst case. Quicksort degrades to O(n²) on adversarial input.
Disadvantage: worse cache locality than quicksort. In practice, quicksort wins on average due to sequential access patterns and smaller constant factors.
| Property | Min-heap | Max-heap |
|---|---|---|
| Root contains | Smallest | Largest |
| Parent vs child | parent ≤ child | parent ≥ child |
| extract returns | Minimum | Maximum |
| Primary use | Priority queues | Heap sort, top-K |
Negate all keys: a min-heap on -key behaves as a max-heap on key.
# Max-heap via Python's heapq (a min-heap) -- negate on push and pop:
import heapq

max_heap = []
for value in [3, 1, 4]:
    heapq.heappush(max_heap, -value)
largest = -heapq.heappop(max_heap)  # 4
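For items with separate priorities, the standard heapq idiom is (priority, item) tuples, which compare lexicographically:

```python
import heapq

tasks = []
heapq.heappush(tasks, (2, "write report"))
heapq.heappush(tasks, (1, "fix outage"))
heapq.heappush(tasks, (3, "refill coffee"))

priority, task = heapq.heappop(tasks)  # lowest priority value first
print(task)  # → fix outage
```

When priorities tie, tuple comparison falls through to the items, which must then be comparable; the common fix is a (priority, sequence_number, item) triple.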
| Language | Default | Module / Class |
|---|---|---|
| Python | Min-heap | heapq |
| Java | Min-heap | PriorityQueue |
| C++ | Max-heap | priority_queue |
| Go | Min-heap | container/heap |
| Rust | Max-heap | BinaryHeap |
| C# | Min-heap | PriorityQueue |
Common pitfall: C++ priority_queue is a max-heap by default — the opposite of most other languages. Use greater<T> comparator for min-heap behaviour.
A d-ary heap is a complete d-ary tree satisfying the heap property. A binary heap is the special case d = 2.
| Relationship | Formula |
|---|---|
| Parent of node i | (i - 1) / d (integer division) |
| j-th child of i (j = 0 … d-1) | d·i + j + 1 |
| Operation | Binary (d=2) | D-ary |
|---|---|---|
| insert (sift-up) | O(log₂ n) | O(log_d n) faster |
| extract (sift-down) | O(log₂ n) | O(d · log_d n) slower |
| decrease-key | O(log₂ n) | O(log_d n) faster |
Practical tip: for Dijkstra's algorithm, where decrease-key dominates, a 4-ary heap often outperforms a binary heap. The shallower tree also improves cache performance.
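The d-ary index arithmetic, sketched for d = 4 (helper names are illustrative):

```python
D = 4  # branching factor


def parent(i):
    return (i - 1) // D


def child(i, j):
    # j-th child of node i, for j in range(D)
    return D * i + j + 1


# Node 0's children are 1..4; node 1's children are 5..8
assert [child(0, j) for j in range(D)] == [1, 2, 3, 4]
assert all(parent(c) == 1 for c in (5, 6, 7, 8))
```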
A binomial tree B_k is defined recursively:
- B_0 is a single node
- B_k is formed by linking two B_(k-1) trees: one becomes the leftmost child of the other's root
- B_k has exactly 2^k nodes and height k

A binomial heap is a collection of binomial trees where:
- each tree satisfies the min-heap property
- there is at most one tree of each order k
| Operation | Time |
|---|---|
| insert | O(log n), amortised O(1) |
| extract-min | O(log n) |
| merge | O(log n) |
| decrease-key | O(log n) |
Key advantage: O(log n) merge. Binary heaps require O(n) to merge two heaps. Merging binomial heaps works like binary addition — carry when two trees of the same order collide.
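The binary-addition analogy is visible in the tree counts: an n-element binomial heap contains one B_k for each set bit of n. A quick illustration (structure only, not a full heap implementation):

```python
def tree_orders(n):
    # Orders k of the binomial trees B_k present in an n-element heap
    return [k for k in range(n.bit_length()) if n >> k & 1]


print(tree_orders(13))  # → [0, 2, 3]: 13 = 0b1101 -> B_0, B_2, B_3
# Inserting one element is adding 1 in binary: 13 + 1 = 14 = 0b1110,
# so the B_0 "carry" merges upward and the tree set changes accordingly
print(tree_orders(14))  # → [1, 2, 3]
```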
A Fibonacci heap is a collection of heap-ordered trees with a root list, a global pointer to the minimum root, and a mark bit on each node; no structural invariant is enforced on insert or merge. Insert, merge, and decrease-key run in O(1) amortised; extract-min in O(log n) amortised.
The magic: defer cleanup — consolidating trees of equal degree — to extract-min. This laziness is what makes insert, merge, and decrease-key all O(1) amortised.
Define Φ = t + 2m where t = number of trees in root list, m = number of marked nodes.
| Operation | Actual | ΔΦ | Amortised |
|---|---|---|---|
| insert | O(1) | +1 | O(1) |
| merge | O(1) | 0 | O(1) |
| extract-min | O(t + log n) | -(t - log n) | O(log n) |
| decrease-key | O(c) cuts | 4 - c | O(1) |
Cascading cuts: when a node loses a second child, it is cut from its parent and added to the root list. The mark bit tracks whether a node has lost a child since becoming a child itself.
Why the degree bound holds: a node of degree k has at least F_(k+2) descendants; F_(k+2) ≥ φ^k where φ = (1+√5)/2 ≈ 1.618; therefore k ≤ log_φ(n) = O(log n). The Fibonacci connection: the name comes from the fact that the minimum subtree sizes follow the Fibonacci sequence, which is what bounds the maximum degree at O(log n).
In practice: Fibonacci heaps achieve theoretically optimal amortised bounds, but large constant factors and pointer-heavy structure make them slower than binary heaps for most real workloads. They shine only for very dense graphs with many decrease-key operations.
Pairing heaps achieve similar practical performance to Fibonacci heaps with a much simpler implementation. Each node stores a key, a pointer to its first child, and a pointer to its next sibling (the left-child, right-sibling representation).
| Operation | Time |
|---|---|
| insert | O(1) |
| find-min | O(1) |
| merge | O(1) — link roots |
| extract-min | O(log n) amortised |
| decrease-key | O(log log n) amortised* |
* conjectured O(1), proven O(log log n) by Pettie 2005
In practice: pairing heaps often outperform Fibonacci heaps due to simpler structure and better cache behaviour. They are the go-to "advanced" heap when binary heaps are insufficient.
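A minimal sketch of a pairing heap with two-pass extract-min. Class and function names are illustrative (children kept in a Python list rather than sibling pointers, for brevity); this is not a production implementation:

```python
class PNode:
    def __init__(self, key):
        self.key = key
        self.children = []


def meld(a, b):
    # Link two roots: the larger key becomes a child of the smaller -- O(1)
    if a is None:
        return b
    if b is None:
        return a
    if b.key < a.key:
        a, b = b, a
    a.children.append(b)
    return a


def insert(root, key):
    return meld(root, PNode(key))


def extract_min(root):
    # Two-pass pairing: pair children left to right, then meld right to left
    minimum = root.key
    kids = root.children
    pairs = [meld(kids[i], kids[i + 1]) for i in range(0, len(kids) - 1, 2)]
    if len(kids) % 2:
        pairs.append(kids[-1])
    new_root = None
    for tree in reversed(pairs):
        new_root = meld(new_root, tree)
    return minimum, new_root


root = None
for key in [5, 3, 8, 1, 4]:
    root = insert(root, key)

out = []
while root is not None:
    m, root = extract_min(root)
    out.append(m)
print(out)  # → [1, 3, 4, 5, 8]
```

The two-pass order is what gives the amortised bounds; melding children in a single pass breaks the analysis.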
A standard binary heap cannot locate an arbitrary element — you can only access the root. Graph algorithms need to update priorities of elements already in the queue.
Augment a binary heap with an index table tracking where each element lives in the heap array:
Heap array: [A:2, C:5, B:3, D:8, E:7]
Position map: { A→0, B→2, C→1, D→3, E→4 }
Inverse map: { 0→A, 1→C, 2→B, 3→D, 4→E }
Every swap in sift-up/sift-down must also update both maps — constant overhead per swap.
| Operation | Time | How |
|---|---|---|
| insert(id, p) | O(log n) | Add, sift-up, update maps |
| extract-min() | O(log n) | Swap, sift-down, update maps |
| decrease-key(id, p) | O(log n) | O(1) lookup + sift-up |
| contains(id) | O(1) | Check position map |
Standard implementation for Dijkstra's and Prim's algorithms. Java's PriorityQueue does not support decrease-key — you must implement your own or use lazy deletion.
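One possible shape for such an indexed heap, sketched with illustrative names — a keys array, a parallel ids array, and a position map, where every swap updates both mappings:

```python
class IndexedMinHeap:
    def __init__(self):
        self.keys = []  # heap-ordered priorities
        self.ids = []   # ids[i] = element id at heap position i
        self.pos = {}   # pos[id] = heap position of id

    def is_empty(self):
        return not self.keys

    def contains(self, ident):
        return ident in self.pos

    def _swap(self, i, j):
        self.keys[i], self.keys[j] = self.keys[j], self.keys[i]
        self.ids[i], self.ids[j] = self.ids[j], self.ids[i]
        self.pos[self.ids[i]] = i  # keep the position map in sync
        self.pos[self.ids[j]] = j

    def _sift_up(self, i):
        while i > 0:
            parent = (i - 1) // 2
            if self.keys[i] >= self.keys[parent]:
                break
            self._swap(i, parent)
            i = parent

    def _sift_down(self, i):
        n = len(self.keys)
        while 2 * i + 1 < n:
            child = 2 * i + 1
            if child + 1 < n and self.keys[child + 1] < self.keys[child]:
                child += 1
            if self.keys[i] <= self.keys[child]:
                break
            self._swap(i, child)
            i = child

    def insert(self, ident, key):
        self.keys.append(key)
        self.ids.append(ident)
        self.pos[ident] = len(self.keys) - 1
        self._sift_up(len(self.keys) - 1)

    def decrease_key(self, ident, key):
        i = self.pos[ident]  # O(1) lookup via the position map
        self.keys[i] = key
        self._sift_up(i)

    def extract_min(self):
        ident = self.ids[0]
        self._swap(0, len(self.keys) - 1)
        self.keys.pop()
        self.ids.pop()
        del self.pos[ident]
        self._sift_down(0)
        return ident


pq = IndexedMinHeap()
pq.insert('a', 5)
pq.insert('b', 3)
pq.insert('c', 8)
pq.decrease_key('c', 1)
print(pq.extract_min())  # → c
```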
Lazy deletion alternative: instead of decrease-key, insert a duplicate with the new priority. Ignore stale entries on extraction. Simpler but uses more memory.
def dijkstra(graph, source):
    dist = {v: float('inf') for v in graph}
    dist[source] = 0
    pq = MinHeap()  # indexed min-heap supporting decrease-key
    pq.insert(source, 0)
    while not pq.is_empty():
        u = pq.extract_min()
        for v, weight in graph[u]:
            new_dist = dist[u] + weight
            if new_dist < dist[v]:
                dist[v] = new_dist
                if pq.contains(v):
                    pq.decrease_key(v, new_dist)
                else:
                    pq.insert(v, new_dist)
    return dist
The decrease_key call is the critical operation — it runs once per edge relaxation.
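The lazy-deletion variant mentioned earlier is runnable with plain heapq. The graph encoding here is an assumption: an adjacency dict mapping each vertex to a list of (neighbour, weight) pairs:

```python
import heapq


def dijkstra_lazy(graph, source):
    dist = {v: float('inf') for v in graph}
    dist[source] = 0
    pq = [(0, source)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]:
            continue  # stale entry: a shorter path was already found
        for v, weight in graph[u]:
            new_dist = d + weight
            if new_dist < dist[v]:
                dist[v] = new_dist
                heapq.heappush(pq, (new_dist, v))  # duplicate, not decrease-key
    return dist


g = {'a': [('b', 1), ('c', 4)], 'b': [('c', 2)], 'c': []}
print(dijkstra_lazy(g, 'a'))  # → {'a': 0, 'b': 1, 'c': 3}
```

The stale-entry check replaces decrease-key at the cost of up to O(E) heap entries instead of O(V).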
| Heap type | Dijkstra's total |
|---|---|
| Unsorted array | O(V²) |
| Binary heap | O((V + E) log V) |
| D-ary heap (d = E/V) | O(E · log_(E/V) V) |
| Fibonacci heap | O(V log V + E) |
Sparse graphs (E ~ V): binary heap gives O(V log V). Fibonacci heap also O(V log V).
Dense graphs (E ~ V²): Fibonacci heap gives O(V² + V log V) ≈ O(V²), matching the unsorted-array approach with better constant. Binary heap gives O(V² log V) — worse.
Build an optimal prefix-free code by repeatedly extracting the two lowest-frequency symbols and merging.
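The merge loop sketched with heapq, tracking only frequencies (tree construction omitted); the accumulated merge costs equal the total encoded length:

```python
import heapq


def huffman_cost(freqs):
    # Total weighted path length: sum of each merge's combined frequency
    heap = list(freqs)
    heapq.heapify(heap)
    cost = 0
    while len(heap) > 1:
        a = heapq.heappop(heap)  # two lowest-frequency symbols
        b = heapq.heappop(heap)
        cost += a + b
        heapq.heappush(heap, a + b)  # merged node re-enters the queue
    return cost


print(huffman_cost([5, 9, 12, 13, 16, 45]))  # → 224
```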
Maintain the median of a data stream using two heaps:
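A sketch of the two-heap scheme: a max-heap (via negation) holds the lower half, a min-heap the upper half, kept within one element of each other. The class name is illustrative:

```python
import heapq


class RunningMedian:
    def __init__(self):
        self.lo = []  # max-heap via negated keys: lower half
        self.hi = []  # min-heap: upper half

    def add(self, x):
        heapq.heappush(self.lo, -x)
        # Route the largest of the lower half to the upper half
        heapq.heappush(self.hi, -heapq.heappop(self.lo))
        # Rebalance so lo has equal size or one extra element
        if len(self.hi) > len(self.lo):
            heapq.heappush(self.lo, -heapq.heappop(self.hi))

    def median(self):
        if len(self.lo) > len(self.hi):
            return -self.lo[0]
        return (-self.lo[0] + self.hi[0]) / 2


rm = RunningMedian()
for x in [5, 2, 8, 1]:
    rm.add(x)
print(rm.median())  # → 3.5  (median of [1, 2, 5, 8])
```

Each insertion costs O(log n); reading the median is O(1) from the two roots.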
Priority queues power: operating-system schedulers, event-driven simulation, graph algorithms (Dijkstra's, Prim's, A*), Huffman coding, top-K queries, k-way merging of sorted streams, and timer queues.
Pattern: whenever you repeatedly need "the next most important element" from a dynamic collection, reach for a priority queue. The heap is the engine; the priority queue is the interface.
| Source | Description |
|---|---|
| CLRS | Ch. 6 (heapsort), Ch. 19 (Fibonacci heaps) — the standard reference |
| Sedgewick | Algorithms — clear binary heap and indexed PQ implementations |
| Tarjan | Data Structures and Network Algorithms — original Fibonacci heap analysis |
| Fredman & Tarjan | "Fibonacci Heaps and Their Uses" — JACM 1987 |
| Fredman et al. | "The Pairing Heap" — Algorithmica 1986 |