A priority queue is a container where each element has an associated priority. Core operations: insert, find-min (or find-max), extract-min (or extract-max), and, in richer variants, decrease-key and merge.
Sorting gives all elements in order but costs O(n log n) up front and requires all elements known. A priority queue supports dynamic insertion and extraction — elements arrive over time.
| Implementation | insert | extract | dec-key | merge |
|---|---|---|---|---|
| Unsorted array | O(1) | O(n) | O(1) | O(1) |
| Sorted array | O(n) | O(1) | O(n) | O(n) |
| Binary heap | O(log n) | O(log n) | O(log n) | O(n) |
| Binomial heap | O(1)* | O(log n) | O(log n) | O(log n) |
| Fibonacci heap | O(1)* | O(log n)* | O(1)* | O(1)* |
| Pairing heap | O(1) | O(log n)* | O(log log n)* | O(1) |
* = amortised
A binary heap is a complete binary tree — every level is fully filled except possibly the last, which is filled left to right.
The height of a complete binary tree on n nodes is ⌊log₂ n⌋, which guarantees worst-case O(log n) operation bounds. A binary heap is not a binary search tree: there is no ordering between left and right children, only between parent and child.
Min-heap: every node's key is less than or equal to both of its children's keys. The root holds the minimum.
Max-heap: every node's key is greater than or equal to both of its children's keys. The root holds the maximum.
The heap property is weaker than the BST property. Siblings have no ordering relative to each other. This weaker invariant is what makes heap operations faster for priority queue use cases.
A complete binary tree maps perfectly to a flat array with zero overhead — no pointers, no node allocations, no fragmented memory.
| Relationship | Formula |
|---|---|
| Parent of node i | (i - 1) / 2 (integer division) |
| Left child of i | 2i + 1 |
| Right child of i | 2i + 2 |
| Is leaf? | i >= n / 2 |
| Last internal node | n/2 - 1 |
The array representation is why binary heaps dominate in practice despite theoretically inferior bounds to Fibonacci heaps.
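As a sanity check, the index formulas can be written directly (the helper names are illustrative, not from the text):

```python
def parent(i):
    return (i - 1) // 2  # floor division


def left(i):
    return 2 * i + 1


def right(i):
    return 2 * i + 2


# The child formulas invert the parent formula:
assert parent(left(4)) == 4 and parent(right(4)) == 4
```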
def insert(heap, key):
    heap.append(key)
    i = len(heap) - 1
    # Sift up
    while i > 0:
        parent = (i - 1) // 2
        if heap[i] < heap[parent]:
            heap[i], heap[parent] = heap[parent], heap[i]
            i = parent
        else:
            break
Time: O(log n) — at most one swap per level from leaf to root.
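Python's `heapq.heappush` performs the same append-then-sift-up, so repeated pushes reproduce this behaviour:

```python
import heapq

heap = []
for key in [5, 3, 8, 1]:
    heapq.heappush(heap, key)  # append, then sift up

print(heap)  # → [1, 3, 8, 5]: a valid min-heap, not a sorted list
```

Note the result is heap-ordered, not sorted: only the parent/child relationships are constrained.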
def extract_min(heap):
    minimum = heap[0]
    last = heap.pop()      # remove the last element
    if heap:               # guard against the single-element case
        heap[0] = last     # move last to root
        sift_down(heap, 0)
    return minimum
def sift_down(heap, i, n=None):
    if n is None:
        n = len(heap)  # heap sort and build-heap pass an explicit bound
    while 2 * i + 1 < n:
        child = 2 * i + 1
        if child + 1 < n and heap[child + 1] < heap[child]:
            child += 1
        if heap[i] <= heap[child]:
            break
        heap[i], heap[child] = heap[child], heap[i]
        i = child
Extract-min from [2, 5, 3, 8, 7, 9, 4]: save the root 2, move the last element 4 to the root, then sift down — 4 swaps with its smaller child 3 — leaving [3, 5, 4, 8, 7, 9].
Time: O(log n) — at most one swap per level. Sift-down is also the core subroutine of build-heap and heap sort.
Naive (top-down): insert elements one by one — n insertions × O(log n) each = O(n log n).
Floyd's bottom-up: start from the last internal node and sift down each node, working backwards to the root.
def build_heap(arr):
    n = len(arr)
    # Start from the last internal node, work backwards to the root
    for i in range(n // 2 - 1, -1, -1):
        sift_down(arr, i, n)
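For reference, a self-contained run of Floyd's build-heap, with `sift_down` repeated so the snippet executes on its own:

```python
def sift_down(arr, i, n):
    # Restore the min-heap property for position i within arr[:n]
    while 2 * i + 1 < n:
        child = 2 * i + 1
        if child + 1 < n and arr[child + 1] < arr[child]:
            child += 1
        if arr[i] <= arr[child]:
            break
        arr[i], arr[child] = arr[child], arr[i]
        i = child


def build_heap(arr):
    n = len(arr)
    for i in range(n // 2 - 1, -1, -1):  # last internal node down to root
        sift_down(arr, i, n)


data = [9, 4, 7, 1, 2, 5, 3]
build_heap(data)
print(data)  # → [1, 2, 3, 4, 9, 5, 7]
```

Every parent is ≤ both children, even though the array is not sorted.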
Key insight: most nodes are near the bottom and do almost no work. This makes the total O(n), not O(n log n).
| Level (from bottom) | Nodes | Sift cost | Total |
|---|---|---|---|
| 0 (leaves) | n/2 | 0 | 0 |
| 1 | n/4 | 1 | n/4 |
| 2 | n/8 | 2 | n/4 |
| 3 | n/16 | 3 | 3n/16 |
| k | n/2^(k+1) | k | k·n/2^(k+1) |
Half the nodes are leaves (zero work), a quarter do one swap, an eighth do two — the geometric series converges to O(n).
T(n) = Σ (k=0 to h) ⌊n / 2^(k+1)⌋ · k
     ≤ n · Σ (k=0 to ∞) k / 2^(k+1)
     = (n/2) · Σ (k=0 to ∞) k · x^k   where x = 1/2
Using the power series identity Σ k·x^k = x / (1-x)² for |x| < 1:
Σ (k=0 to ∞) k / 2^k = (1/2) / (1 - 1/2)²
                     = (1/2) / (1/4)
                     = 2
Therefore: T(n) ≤ (n/2) · 2 = n, so build-heap is O(n).
This is a tight bound — you cannot build a heap faster than O(n) because you must at least read every element.
Intuition: The sum converges because the number of nodes decreases exponentially as the sift-down cost increases linearly. The geometric decrease dominates.
Contrast with top-down: Inserting one-by-one is O(n log n) because sift-up does the most work at the bottom where most nodes live. Floyd's algorithm uses sift-down, which does the least work at the bottom.
Interview note: This proof is frequently tested. Know the series identity and be able to explain why bottom-up is linear.
def heap_sort(arr):
    # Note: heap sort needs the *max-heap* variant of sift_down --
    # pick the larger child and stop when arr[i] >= arr[child].
    n = len(arr)
    # Phase 1: build max-heap -- O(n)
    for i in range(n // 2 - 1, -1, -1):
        sift_down(arr, i, n)
    # Phase 2: swap max to the end, shrink the heap, restore -- O(n log n)
    for end in range(n - 1, 0, -1):
        arr[0], arr[end] = arr[end], arr[0]
        sift_down(arr, 0, end)
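A self-contained, runnable sketch with the max-heap sift-down written out (comparisons flipped relative to the min-heap version):

```python
def sift_down_max(arr, i, n):
    # Max-heap variant: the larger child rises
    while 2 * i + 1 < n:
        child = 2 * i + 1
        if child + 1 < n and arr[child + 1] > arr[child]:
            child += 1
        if arr[i] >= arr[child]:
            break
        arr[i], arr[child] = arr[child], arr[i]
        i = child


def heap_sort(arr):
    n = len(arr)
    for i in range(n // 2 - 1, -1, -1):   # build max-heap, O(n)
        sift_down_max(arr, i, n)
    for end in range(n - 1, 0, -1):       # repeatedly move max to the end
        arr[0], arr[end] = arr[end], arr[0]
        sift_down_max(arr, 0, end)


data = [5, 2, 9, 1, 7]
heap_sort(data)
print(data)  # → [1, 2, 5, 7, 9]
```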
| Property | Value |
|---|---|
| Time (worst / avg / best) | O(n log n) |
| Space | O(1) — in-place |
| Stable? | No |
| Adaptive? | No — always O(n log n) |
Advantage over quicksort: guaranteed O(n log n) worst case. Quicksort degrades to O(n²) on adversarial input.
Disadvantage: worse cache locality than quicksort. In practice, quicksort wins on average due to sequential access patterns and smaller constant factors.
| Property | Min-heap | Max-heap |
|---|---|---|
| Root contains | Smallest | Largest |
| Parent vs child | parent ≤ child | parent ≥ child |
| extract returns | Minimum | Maximum |
| Primary use | Priority queues | Heap sort, top-K |
Negate all keys: a min-heap on -key behaves as a max-heap on key.
# Max-heap via Python's heapq (a min-heap) -- negate on push and pop:
import heapq

max_heap = []
for value in [3, 1, 4]:
    heapq.heappush(max_heap, -value)
largest = -heapq.heappop(max_heap)  # 4
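For items with separate priorities, the standard heapq idiom is (priority, item) tuples, which compare lexicographically:

```python
import heapq

tasks = []
heapq.heappush(tasks, (2, "write report"))
heapq.heappush(tasks, (1, "fix outage"))
heapq.heappush(tasks, (3, "refill coffee"))

priority, task = heapq.heappop(tasks)  # lowest priority value first
print(task)  # → fix outage
```

When priorities tie, tuple comparison falls through to the items, which must then be comparable; the common fix is a (priority, sequence_number, item) triple.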
| Language | Default | Module / Class |
|---|---|---|
| Python | Min-heap | heapq |
| Java | Min-heap | PriorityQueue |
| C++ | Max-heap | priority_queue |
| Go | Min-heap | container/heap |
| Rust | Max-heap | BinaryHeap |
| C# | Min-heap | PriorityQueue |
Common pitfall: C++ priority_queue is a max-heap by default — the opposite of most other languages. Use greater<T> comparator for min-heap behaviour.
A d-ary heap is a complete d-ary tree satisfying the heap property. A binary heap is the special case d = 2.
| Relationship | Formula |
|---|---|
| Parent of node i | (i - 1) / d (integer division) |
| j-th child of i (j = 0 … d-1) | d·i + j + 1 |
| Operation | Binary (d=2) | D-ary |
|---|---|---|
| insert (sift-up) | O(log₂ n) | O(log_d n) faster |
| extract (sift-down) | O(log₂ n) | O(d · log_d n) slower |
| decrease-key | O(log₂ n) | O(log_d n) faster |
Practical tip: for Dijkstra's algorithm, where decrease-key dominates, a 4-ary heap often outperforms a binary heap. The shallower tree also improves cache performance.
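The d-ary index arithmetic, sketched for d = 4 (helper names are illustrative):

```python
D = 4  # branching factor


def parent(i):
    return (i - 1) // D


def child(i, j):
    # j-th child of node i, for j in range(D)
    return D * i + j + 1


# Node 0's children are 1..4; node 1's children are 5..8
assert [child(0, j) for j in range(D)] == [1, 2, 3, 4]
assert all(parent(c) == 1 for c in (5, 6, 7, 8))
```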
A binomial tree B_k is defined recursively:
- B_0 is a single node
- B_k is formed by linking two B_(k-1) trees: one becomes the leftmost child of the other's root
- B_k has exactly 2^k nodes and height k

A binomial heap is a collection of binomial trees where:
- each tree satisfies the min-heap property
- there is at most one tree of each order k
| Operation | Time |
|---|---|
| insert | O(log n), amortised O(1) |
| extract-min | O(log n) |
| merge | O(log n) |
| decrease-key | O(log n) |
Key advantage: O(log n) merge. Binary heaps require O(n) to merge two heaps. Merging binomial heaps works like binary addition — carry when two trees of the same order collide.
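The binary-addition analogy is visible in the tree counts: an n-element binomial heap contains one B_k for each set bit of n. A quick illustration (structure only, not a full heap implementation):

```python
def tree_orders(n):
    # Orders k of the binomial trees B_k present in an n-element heap
    return [k for k in range(n.bit_length()) if n >> k & 1]


print(tree_orders(13))  # → [0, 2, 3]: 13 = 0b1101 -> B_0, B_2, B_3
# Inserting one element is adding 1 in binary: 13 + 1 = 14 = 0b1110,
# so the B_0 "carry" merges upward and the tree set changes accordingly
print(tree_orders(14))  # → [1, 2, 3]
```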
A Fibonacci heap is a collection of heap-ordered trees with a root list, a global pointer to the minimum root, and a mark bit on each node; no structural invariant is enforced on insert or merge. Insert, merge, and decrease-key run in O(1) amortised; extract-min in O(log n) amortised.
The magic: defer cleanup — consolidating trees of equal degree — to extract-min. This laziness is what makes insert, merge, and decrease-key all O(1) amortised.
Define Φ = t + 2m where t = number of trees in root list, m = number of marked nodes.
| Operation | Actual | ΔΦ | Amortised |
|---|---|---|---|
| insert | O(1) | +1 | O(1) |
| merge | O(1) | 0 | O(1) |
| extract-min | O(t + log n) | -(t - log n) | O(log n) |
| decrease-key | O(c) cuts | 4 - c | O(1) |
Cascading cuts: when a node loses a second child, it is cut from its parent and added to the root list. The mark bit tracks whether a node has lost a child since becoming a child itself.
Why the degree bound holds: a node of degree k has at least F_(k+2) descendants; F_(k+2) ≥ φ^k where φ = (1+√5)/2 ≈ 1.618; therefore k ≤ log_φ(n) = O(log n). The Fibonacci connection: the name comes from the fact that the minimum subtree sizes follow the Fibonacci sequence, which is what bounds the maximum degree at O(log n).
In practice: Fibonacci heaps achieve theoretically optimal amortised bounds, but large constant factors and pointer-heavy structure make them slower than binary heaps for most real workloads. They shine only for very dense graphs with many decrease-key operations.
Pairing heaps achieve similar practical performance to Fibonacci heaps with a much simpler implementation. Each node stores a key, a pointer to its first child, and a pointer to its next sibling (the left-child, right-sibling representation).
| Operation | Time |
|---|---|
| insert | O(1) |
| find-min | O(1) |
| merge | O(1) — link roots |
| extract-min | O(log n) amortised |
| decrease-key | O(log log n) amortised* |
* conjectured O(1), proven O(log log n) by Pettie 2005
In practice: pairing heaps often outperform Fibonacci heaps due to simpler structure and better cache behaviour. They are the go-to "advanced" heap when binary heaps are insufficient.
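A minimal sketch of a pairing heap with two-pass extract-min. Class and function names are illustrative (children kept in a Python list rather than sibling pointers, for brevity); this is not a production implementation:

```python
class PNode:
    def __init__(self, key):
        self.key = key
        self.children = []


def meld(a, b):
    # Link two roots: the larger key becomes a child of the smaller -- O(1)
    if a is None:
        return b
    if b is None:
        return a
    if b.key < a.key:
        a, b = b, a
    a.children.append(b)
    return a


def insert(root, key):
    return meld(root, PNode(key))


def extract_min(root):
    # Two-pass pairing: pair children left to right, then meld right to left
    minimum = root.key
    kids = root.children
    pairs = [meld(kids[i], kids[i + 1]) for i in range(0, len(kids) - 1, 2)]
    if len(kids) % 2:
        pairs.append(kids[-1])
    new_root = None
    for tree in reversed(pairs):
        new_root = meld(new_root, tree)
    return minimum, new_root


root = None
for key in [5, 3, 8, 1, 4]:
    root = insert(root, key)

out = []
while root is not None:
    m, root = extract_min(root)
    out.append(m)
print(out)  # → [1, 3, 4, 5, 8]
```

The two-pass order is what gives the amortised bounds; melding children in a single pass breaks the analysis.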
A standard binary heap cannot locate an arbitrary element — you can only access the root. Graph algorithms need to update priorities of elements already in the queue.
Augment a binary heap with an index table tracking where each element lives in the heap array:
Heap array: [A:2, C:5, B:3, D:8, E:7]
Position map: { A→0, B→2, C→1, D→3, E→4 }
Inverse map: { 0→A, 1→C, 2→B, 3→D, 4→E }
Every swap in sift-up/sift-down must also update both maps — constant overhead per swap.
| Operation | Time | How |
|---|---|---|
| insert(id, p) | O(log n) | Add, sift-up, update maps |
| extract-min() | O(log n) | Swap, sift-down, update maps |
| decrease-key(id, p) | O(log n) | O(1) lookup + sift-up |
| contains(id) | O(1) | Check position map |
Standard implementation for Dijkstra's and Prim's algorithms. Java's PriorityQueue does not support decrease-key — you must implement your own or use lazy deletion.
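One possible shape for such an indexed heap, sketched with illustrative names — a keys array, a parallel ids array, and a position map, where every swap updates both mappings:

```python
class IndexedMinHeap:
    def __init__(self):
        self.keys = []  # heap-ordered priorities
        self.ids = []   # ids[i] = element id at heap position i
        self.pos = {}   # pos[id] = heap position of id

    def is_empty(self):
        return not self.keys

    def contains(self, ident):
        return ident in self.pos

    def _swap(self, i, j):
        self.keys[i], self.keys[j] = self.keys[j], self.keys[i]
        self.ids[i], self.ids[j] = self.ids[j], self.ids[i]
        self.pos[self.ids[i]] = i  # keep the position map in sync
        self.pos[self.ids[j]] = j

    def _sift_up(self, i):
        while i > 0:
            parent = (i - 1) // 2
            if self.keys[i] >= self.keys[parent]:
                break
            self._swap(i, parent)
            i = parent

    def _sift_down(self, i):
        n = len(self.keys)
        while 2 * i + 1 < n:
            child = 2 * i + 1
            if child + 1 < n and self.keys[child + 1] < self.keys[child]:
                child += 1
            if self.keys[i] <= self.keys[child]:
                break
            self._swap(i, child)
            i = child

    def insert(self, ident, key):
        self.keys.append(key)
        self.ids.append(ident)
        self.pos[ident] = len(self.keys) - 1
        self._sift_up(len(self.keys) - 1)

    def decrease_key(self, ident, key):
        i = self.pos[ident]  # O(1) lookup via the position map
        self.keys[i] = key
        self._sift_up(i)

    def extract_min(self):
        ident = self.ids[0]
        self._swap(0, len(self.keys) - 1)
        self.keys.pop()
        self.ids.pop()
        del self.pos[ident]
        self._sift_down(0)
        return ident


pq = IndexedMinHeap()
pq.insert('a', 5)
pq.insert('b', 3)
pq.insert('c', 8)
pq.decrease_key('c', 1)
print(pq.extract_min())  # → c
```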
Lazy deletion alternative: instead of decrease-key, insert a duplicate with the new priority. Ignore stale entries on extraction. Simpler but uses more memory.
def dijkstra(graph, source):
    dist = {v: float('inf') for v in graph}
    dist[source] = 0
    pq = MinHeap()  # indexed min-heap supporting decrease-key
    pq.insert(source, 0)
    while not pq.is_empty():
        u = pq.extract_min()
        for v, weight in graph[u]:
            new_dist = dist[u] + weight
            if new_dist < dist[v]:
                dist[v] = new_dist
                if pq.contains(v):
                    pq.decrease_key(v, new_dist)
                else:
                    pq.insert(v, new_dist)
    return dist
The decrease_key call is the critical operation — it runs once per edge relaxation.
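The lazy-deletion variant mentioned earlier is runnable with plain heapq. The graph encoding here is an assumption: an adjacency dict mapping each vertex to a list of (neighbour, weight) pairs:

```python
import heapq


def dijkstra_lazy(graph, source):
    dist = {v: float('inf') for v in graph}
    dist[source] = 0
    pq = [(0, source)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]:
            continue  # stale entry: a shorter path was already found
        for v, weight in graph[u]:
            new_dist = d + weight
            if new_dist < dist[v]:
                dist[v] = new_dist
                heapq.heappush(pq, (new_dist, v))  # duplicate, not decrease-key
    return dist


g = {'a': [('b', 1), ('c', 4)], 'b': [('c', 2)], 'c': []}
print(dijkstra_lazy(g, 'a'))  # → {'a': 0, 'b': 1, 'c': 3}
```

The stale-entry check replaces decrease-key at the cost of up to O(E) heap entries instead of O(V).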
| Heap type | Dijkstra's total |
|---|---|
| Unsorted array | O(V²) |
| Binary heap | O((V + E) log V) |
| D-ary heap (d = E/V) | O(E · log_(E/V) V) |
| Fibonacci heap | O(V log V + E) |
Sparse graphs (E ~ V): binary heap gives O(V log V). Fibonacci heap also O(V log V).
Dense graphs (E ~ V²): Fibonacci heap gives O(V² + V log V) ≈ O(V²), matching the unsorted-array approach with better constant. Binary heap gives O(V² log V) — worse.
Build an optimal prefix-free code by repeatedly extracting the two lowest-frequency symbols and merging.
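The merge loop sketched with heapq, tracking only frequencies (tree construction omitted); the accumulated merge costs equal the total encoded length:

```python
import heapq


def huffman_cost(freqs):
    # Total weighted path length: sum of each merge's combined frequency
    heap = list(freqs)
    heapq.heapify(heap)
    cost = 0
    while len(heap) > 1:
        a = heapq.heappop(heap)  # two lowest-frequency symbols
        b = heapq.heappop(heap)
        cost += a + b
        heapq.heappush(heap, a + b)  # merged node re-enters the queue
    return cost


print(huffman_cost([5, 9, 12, 13, 16, 45]))  # → 224
```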
Maintain the median of a data stream using two heaps:
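A sketch of the two-heap scheme: a max-heap (via negation) holds the lower half, a min-heap the upper half, kept within one element of each other. The class name is illustrative:

```python
import heapq


class RunningMedian:
    def __init__(self):
        self.lo = []  # max-heap via negated keys: lower half
        self.hi = []  # min-heap: upper half

    def add(self, x):
        heapq.heappush(self.lo, -x)
        # Route the largest of the lower half to the upper half
        heapq.heappush(self.hi, -heapq.heappop(self.lo))
        # Rebalance so lo has equal size or one extra element
        if len(self.hi) > len(self.lo):
            heapq.heappush(self.lo, -heapq.heappop(self.hi))

    def median(self):
        if len(self.lo) > len(self.hi):
            return -self.lo[0]
        return (-self.lo[0] + self.hi[0]) / 2


rm = RunningMedian()
for x in [5, 2, 8, 1]:
    rm.add(x)
print(rm.median())  # → 3.5  (median of [1, 2, 5, 8])
```

Each insertion costs O(log n); reading the median is O(1) from the two roots.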
Priority queues power: operating-system schedulers, event-driven simulation, graph algorithms (Dijkstra's, Prim's, A*), Huffman coding, top-K queries, k-way merging of sorted streams, and timer queues.
Pattern: whenever you repeatedly need "the next most important element" from a dynamic collection, reach for a priority queue. The heap is the engine; the priority queue is the interface.
| Source | Description |
|---|---|
| CLRS | Ch. 6 (heapsort), Ch. 19 (Fibonacci heaps) — the standard reference |
| Sedgewick | Algorithms — clear binary heap and indexed PQ implementations |
| Tarjan | Data Structures and Network Algorithms — original Fibonacci heap analysis |
| Fredman & Tarjan | "Fibonacci Heaps and Their Uses" — JACM 1987 |
| Fredman et al. | "The Pairing Heap" — Algorithmica 1986 |