Given a weighted directed graph G = (V, E) with weight function w: E → R and a source vertex s, find the shortest-weight path from s to every reachable vertex.
Path weight: w(p) = Σ w(vᵢ, vᵢ₊₁), summed over the edges of p.
Shortest-path distance: δ(s, v) = min { w(p) : p is a path s ⤳ v }, or +∞ if v is unreachable from s.
Optimal substructure: any sub-path of a shortest path is itself a shortest path. This property is the foundation for every algorithm in this deck.
| Variant | Description |
|---|---|
| Single-source | One source to all vertices — Dijkstra, Bellman-Ford |
| Single-destination | All vertices to one target — reverse edges, run single-source |
| Single-pair | One source to one target — A*, bidirectional search |
| All-pairs | Every vertex to every vertex — Floyd-Warshall, Johnson's |
BFS solves single-source shortest paths when every edge has weight 1. It processes vertices in order of increasing distance from the source.
from collections import deque

def bfs_shortest(graph, source):
    dist = {v: float('inf') for v in graph}
    dist[source] = 0
    prev = {v: None for v in graph}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if dist[v] == float('inf'):
                dist[v] = dist[u] + 1
                prev[v] = u
                queue.append(v)
    return dist, prev
Time: O(V + E) — optimal for unweighted graphs.
Space: O(V) for the queue and distance array.
Invariant: vertices are discovered in non-decreasing distance order; once dequeued, a distance is final.
BFS is the baseline. When edges have varying weights, we need more sophisticated approaches — but for unit-weight graphs, nothing beats BFS.
Dijkstra's algorithm (1959) solves single-source shortest paths for graphs with non-negative edge weights. Greedily extract the vertex with the smallest tentative distance — its distance is then final. Relax all outgoing edges.
def dijkstra(graph, source):
    dist = {v: float('inf') for v in graph}
    prev = {v: None for v in graph}
    dist[source] = 0
    Q = set(graph.keys())
    while Q:
        u = min(Q, key=lambda v: dist[v])
        Q.remove(u)
        for v, w in graph[u]:
            if v in Q and dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                prev[v] = u
    return dist, prev
Array-based version shown for clarity — O(V²). Priority queue version on next slide.
The fundamental operation. For edge (u, v) with weight w:
if dist[u] + w < dist[v]:
    dist[v] = dist[u] + w
    prev[v] = u
All edge weights must be non-negative: w(u, v) ≥ 0. Negative weights break the greedy guarantee — a settled vertex could later be improved via a negative edge.
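A minimal counterexample makes this concrete (a hypothetical three-vertex graph, not from the deck): the array-based Dijkstra settles a at distance 1 before the negative edge b → a is ever relaxed, so the cheaper route through b is never found.

```python
def dijkstra_array(graph, source):
    # Array-based Dijkstra, as in this deck: a removed vertex is settled
    dist = {v: float('inf') for v in graph}
    dist[source] = 0
    Q = set(graph)
    while Q:
        u = min(Q, key=lambda v: dist[v])
        Q.remove(u)  # dist[u] is now treated as final
        for v, w in graph[u]:
            if v in Q and dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    return dist

# True distance s -> a is 0 (via s -> b -> a = 2 - 2), but a is settled
# at distance 1 before b's outgoing edge is relaxed.
graph = {'s': [('a', 1), ('b', 2)], 'b': [('a', -2)], 'a': []}
dist = dijkstra_array(graph, 's')  # dist['a'] == 1, not the true 0
```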
The choice of priority queue determines Dijkstra's time complexity:
| Priority Queue | extract-min | decrease-key | Total Complexity |
|---|---|---|---|
| Array (unsorted) | O(V) | O(1) | O(V²) |
| Binary heap | O(log V) | O(log V) | O((V + E) log V) |
| Fibonacci heap | O(log V) amortised | O(1) amortised | O(V log V + E) |
import heapq

def dijkstra(graph, source):
    dist = {v: float('inf') for v in graph}
    dist[source] = 0
    pq = [(0, source)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]:
            continue  # stale entry
        for v, w in graph[u]:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                heapq.heappush(pq, (dist[v], v))
    return dist
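For small integer weights in [0, C], the heap can be replaced entirely by an array of buckets indexed by tentative distance. A sketch of Dial's algorithm under that assumption (bucket count C·(V−1)+1 bounds every finite distance):

```python
from collections import deque

def dial(graph, source, C):
    # Dial's algorithm: buckets 0 .. C*(V-1), scanned in increasing order.
    # Assumes integer edge weights in [0, C].
    n = len(graph)
    dist = {v: float('inf') for v in graph}
    dist[source] = 0
    buckets = [deque() for _ in range(C * (n - 1) + 1)]
    buckets[0].append(source)
    for d in range(len(buckets)):
        while buckets[d]:
            u = buckets[d].popleft()
            if d > dist[u]:
                continue  # stale entry, as in the heap version
            for v, w in graph[u]:
                if d + w < dist[v]:
                    dist[v] = d + w
                    buckets[d + w].append(v)
    return dist
```

Scanning the buckets in order plays the role of extract-min, giving O(V + E + C·V) overall, which collapses to O(V + E + C) when expressed with the maximum-distance bound.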
Practical notes:
- The Fibonacci heap bound wins asymptotically on dense graphs (E >> V) but has high constant factors; binary heaps are usually faster in practice.
- Python's heapq has no decrease-key: push duplicate (dist, v) pairs and skip stale entries on extraction.
- For integer weights in [0, C], Dial's algorithm achieves O(V + E + C).

Correctness theorem: when Dijkstra extracts vertex u from the priority queue, dist[u] = δ(s, u).
1. Suppose u is the first vertex extracted with dist[u] > δ(s, u).
2. Let p be a true shortest path s ⤳ u. Let (x, y) be the first edge on p crossing from the settled set S to the unsettled set V − S.
3. Since x ∈ S, dist[x] = δ(s, x) (by assumption, u is the first violation).
4. After relaxing (x, y): dist[y] ≤ dist[x] + w(x, y) = δ(s, y).
5. Since all weights ≥ 0: δ(s, y) ≤ δ(s, u) < dist[u].
6. Then dist[y] < dist[u], so the priority queue would extract y before u — contradiction.
The non-negative weight assumption is essential in step 5. With negative edges, δ(s, y) could exceed δ(s, u) via a later negative edge, and the proof breaks.
Bellman-Ford handles graphs with negative edge weights (but no negative cycles reachable from s).
def bellman_ford(vertices, edges, source):
    dist = {v: float('inf') for v in vertices}
    dist[source] = 0
    # V-1 relaxation passes
    for _ in range(len(vertices) - 1):
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    # Negative cycle check
    for u, v, w in edges:
        if dist[u] + w < dist[v]:
            raise ValueError("Negative cycle")
    return dist
A shortest path in a graph with |V| vertices has at most |V| - 1 edges. After pass i, all shortest paths using ≤ i edges are found.
| Property | Value |
|---|---|
| Time complexity | O(V · E) |
| Space complexity | O(V) |
| Negative weights | Yes |
| Negative cycle detection | Yes |
A cycle whose total edge weight is negative. If reachable from s, shortest-path distances are -∞ for all vertices reachable from the cycle.
Cycle weight: 2 + (-5) + 1 = -2
After |V| - 1 relaxation passes, all shortest paths are finalized (if no negative cycle). A |V|-th pass that still finds improvements proves a negative cycle exists.
Extracting the cycle:
- On the |V|-th pass, find any vertex v that improves.
- Follow prev pointers from v for |V| steps to guarantee you have landed on the cycle.
- Then follow prev pointers until you revisit a vertex — that is the cycle.

Applications: currency arbitrage — model exchange rates as -log(rate) edge weights; a negative cycle means profit. Detect routing loops in distance-vector protocols.
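The extraction steps above can be sketched as a small helper (hypothetical, not from the deck; it reuses the edge-list format of bellman_ford and starts all distances at 0 so any reachable negative cycle is found regardless of source):

```python
def find_negative_cycle(vertices, edges):
    dist = {v: 0 for v in vertices}   # 0-init: every vertex acts as a source
    prev = {v: None for v in vertices}
    x = None
    for _ in range(len(vertices)):    # |V| passes; the last one is the check
        x = None
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                prev[v] = u
                x = v
    if x is None:
        return None                   # no improvement on the last pass
    for _ in range(len(vertices)):    # walk |V| steps to land on the cycle
        x = prev[x]
    cycle, cur = [x], prev[x]
    while cur != x:                   # follow prev until x reappears
        cycle.append(cur)
        cur = prev[cur]
    return cycle[::-1]
```

On the example cycle above (weights 2, −5, 1, total −2) it returns the three cycle vertices.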
SPFA is a queue-based optimisation of Bellman-Ford. Instead of relaxing all edges in every pass, it only re-relaxes vertices whose distances have changed.
from collections import deque

def spfa(graph, source):
    dist = {v: float('inf') for v in graph}
    dist[source] = 0
    in_queue = {v: False for v in graph}
    in_queue[source] = True
    queue = deque([source])
    while queue:
        u = queue.popleft()
        in_queue[u] = False
        for v, w in graph[u]:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                if not in_queue[v]:
                    queue.append(v)
                    in_queue[v] = True
    return dist
| Property | Value |
|---|---|
| Average-case | O(E) on random graphs |
| Worst-case | O(V · E) — same as Bellman-Ford |
| Negative weights | Yes |
| Negative cycle detection | Count relaxations per vertex; ≥ V times = cycle |
SPFA is popular in competitive programming for its simplicity and good average-case performance. In production systems, Dijkstra with a binary heap is preferred for non-negative weights.
For directed acyclic graphs, we can find shortest paths in O(V + E) by processing vertices in topological order — even with negative edge weights.
from collections import deque

def dag_shortest(graph, source):
    # Kahn's topological sort
    in_deg = {v: 0 for v in graph}
    for u in graph:
        for v, w in graph[u]:
            in_deg[v] += 1
    queue = deque(v for v in graph if in_deg[v] == 0)
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v, w in graph[u]:
            in_deg[v] -= 1
            if in_deg[v] == 0:
                queue.append(v)
    # Relax in topological order
    dist = {v: float('inf') for v in graph}
    dist[source] = 0
    for u in order:
        if dist[u] < float('inf'):
            for v, w in graph[u]:
                if dist[u] + w < dist[v]:
                    dist[v] = dist[u] + w
    return dist
In topological order, all predecessors of v are processed before v. When we relax edges from u, dist[u] is already optimal.
Complexity: O(V + E) — linear, hence optimal; no priority queue needed.
Negative weights? Yes — there are no cycles at all, so negative cycles are impossible.
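One payoff of tolerating negative weights: longest paths (critical paths) on a DAG, intractable on general graphs, reduce to shortest paths with negated weights. A sketch in the same adjacency-list format as dag_shortest (the function name dag_longest is an assumption of this example):

```python
from collections import deque

def dag_longest(graph, source):
    # Longest path on a DAG = shortest path after negating every weight.
    neg = {u: [(v, -w) for v, w in graph[u]] for u in graph}
    # Kahn's topological sort, as in dag_shortest
    in_deg = {v: 0 for v in neg}
    for u in neg:
        for v, _ in neg[u]:
            in_deg[v] += 1
    queue = deque(v for v in neg if in_deg[v] == 0)
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v, _ in neg[u]:
            in_deg[v] -= 1
            if in_deg[v] == 0:
                queue.append(v)
    # Relax negated weights in topological order
    dist = {v: float('inf') for v in neg}
    dist[source] = 0
    for u in order:
        if dist[u] < float('inf'):
            for v, w in neg[u]:
                if dist[u] + w < dist[v]:
                    dist[v] = dist[u] + w
    # Negate back; unreachable vertices report -inf
    return {v: -d if d < float('inf') else float('-inf') for v, d in dist.items()}
```

This is exactly how critical-path analysis in project scheduling works: tasks are vertices, dependencies are edges, and the longest path sets the minimum completion time.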
Floyd-Warshall computes shortest paths between every pair of vertices using dynamic programming over intermediate vertices.
def floyd_warshall(n, edges):
    INF = float('inf')
    dist = [[INF] * n for _ in range(n)]
    for i in range(n):
        dist[i][i] = 0
    for u, v, w in edges:
        dist[u][v] = min(dist[u][v], w)  # keep the lightest parallel edge
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist
distₖ(i, j) = min( distₖ₋₁(i, j), distₖ₋₁(i, k) + distₖ₋₁(k, j) )
At step k, consider whether routing through vertex k improves the path from i to j.
| Property | Value |
|---|---|
| Time | O(V³) |
| Space | O(V²) |
| Negative weights | Yes (no negative cycles) |
| Negative cycle detection | dist[v][v] < 0 for some v |
Floyd-Warshall is ideal for dense graphs with small V. For sparse graphs, running Dijkstra from every vertex is faster: O(V · E log V).
Johnson's computes all-pairs shortest paths efficiently on sparse graphs by combining Bellman-Ford and Dijkstra.
Add vertex q with zero-weight edges to all vertices. Run Bellman-Ford from q to get potential h(v).
w'(u,v) = w(u,v) + h(u) - h(v)
Guaranteed non-negative, since Bellman-Ford yields h(v) ≤ h(u) + w(u, v) for every edge. Path ordering is preserved because the h terms telescope: every s ⤳ t path shifts by the same amount, h(s) − h(t).
Remove q. Run Dijkstra from every vertex using w'. Recover: δ(u,v) = δ'(u,v) - h(u) + h(v).
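The three steps can be sketched end-to-end (a compact illustration, assuming the edge-list format used earlier; initialising h to zero for every vertex stands in for the explicit virtual vertex q):

```python
import heapq

def johnson(vertices, edges):
    INF = float('inf')
    # Step 1: Bellman-Ford from the virtual source q.
    # h = 0 everywhere mimics q's zero-weight edge to every vertex.
    h = {v: 0 for v in vertices}
    for _ in range(len(vertices) - 1):
        for u, v, w in edges:
            if h[u] + w < h[v]:
                h[v] = h[u] + w
    for u, v, w in edges:
        if h[u] + w < h[v]:
            raise ValueError("Negative cycle")
    # Step 2: reweight w'(u,v) = w(u,v) + h(u) - h(v); all non-negative.
    adj = {v: [] for v in vertices}
    for u, v, w in edges:
        adj[u].append((v, w + h[u] - h[v]))
    # Step 3: Dijkstra from every vertex, then undo the reweighting.
    dist = {}
    for s in vertices:
        d = {v: INF for v in vertices}
        d[s] = 0
        pq = [(0, s)]
        while pq:
            du, u = heapq.heappop(pq)
            if du > d[u]:
                continue  # stale entry
            for v, w in adj[u]:
                if d[u] + w < d[v]:
                    d[v] = d[u] + w
                    heapq.heappush(pq, (d[v], v))
        dist[s] = {v: (d[v] - h[s] + h[v] if d[v] < INF else INF)
                   for v in vertices}
    return dist
```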
| Property | Value |
|---|---|
| Time | O(V · E log V) with binary heap Dijkstra |
| Space | O(V²) for the distance matrix |
| Better than Floyd-Warshall when | E = o(V²) — sparse graphs |
Johnson's is the standard choice for all-pairs on sparse graphs. Floyd-Warshall wins on dense graphs due to simpler constant factors and cache behaviour.
A* is a best-first search for single-pair shortest paths that uses a heuristic to focus exploration toward the goal, dramatically reducing the search space.
Prioritise vertices by f(v) = g(v) + h(v):
- g(v) — actual cost from source to v
- h(v) — estimated cost from v to goal (the heuristic)
- f(v) — estimated total cost of the cheapest path through v

import heapq

def a_star(graph, source, goal, h):
    g = {source: 0}
    prev = {}
    open_set = [(h(source), source)]
    while open_set:
        f_val, u = heapq.heappop(open_set)
        if u == goal:
            return reconstruct(prev, goal), g[goal]
        for v, w in graph[u]:
            tentative = g[u] + w
            if tentative < g.get(v, float('inf')):
                g[v] = tentative
                prev[v] = u
                heapq.heappush(open_set, (tentative + h(v), v))
    return None, float('inf')

def reconstruct(prev, goal):
    # Walk prev pointers back from the goal, then reverse
    path = [goal]
    while path[-1] in prev:
        path.append(prev[path[-1]])
    return path[::-1]
When h(v) = 0 for all v, A* degenerates to Dijkstra's algorithm. The heuristic biases the search toward the goal.
A* explores a focused cone toward the goal; Dijkstra explores a full circle.
A heuristic h is admissible if it never overestimates: h(v) ≤ δ(v, goal) for all v. Admissibility guarantees A* finds the optimal path.
h is consistent if h(u) ≤ w(u,v) + h(v) for every edge (u,v). Consistency implies admissibility and ensures each vertex is expanded at most once.
Heuristic strength spectrum:
- h = 0 — Dijkstra; explores everything; always optimal
- h = δ — perfect heuristic; A* goes straight to the goal
- h > δ — inadmissible; faster but may return sub-optimal paths

| Heuristic | Formula | Grid type |
|---|---|---|
| Manhattan | \|x1-x2\| + \|y1-y2\| | 4-directional |
| Euclidean | √((x1-x2)² + (y1-y2)²) | Any-angle |
| Chebyshev | max(\|x1-x2\|, \|y1-y2\|) | 8-directional |
| Octile | max(dx,dy) + (√2-1)·min(dx,dy) | 8-dir, exact |
The heuristic must be admissible for optimality. For grid pathfinding, Manhattan distance (4-dir) and octile distance (8-dir) are the standard choices. Euclidean is always admissible but slightly weaker.
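The table transcribes directly into code; a sketch with coordinates as (x, y) tuples (the function names are this example's own):

```python
from math import hypot, sqrt

def manhattan(a, b):
    # 4-directional grids: sum of axis distances
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def euclidean(a, b):
    # Straight-line distance; admissible for any movement model
    return hypot(a[0] - b[0], a[1] - b[1])

def chebyshev(a, b):
    # 8-directional grids where diagonals cost 1
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))

def octile(a, b):
    # 8-directional grids where diagonals cost sqrt(2): exact on open grids
    dx, dy = abs(a[0] - b[0]), abs(a[1] - b[1])
    return max(dx, dy) + (sqrt(2) - 1) * min(dx, dy)
```

Note the ordering on any pair of points: euclidean ≤ manhattan, and chebyshev ≤ octile — a tighter admissible heuristic prunes more of the search.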
Run two simultaneous searches — one forward from s, one backward from t — and terminate when they meet.
Termination: track μ = min(distf[v] + distb[v]) over vertices reached by both searches; once the frontiers have met, the shortest s–t path has weight μ.

| Metric | Unidirectional | Bidirectional |
|---|---|---|
| BFS explored | O(b^d) | O(b^(d/2)) |
| Speedup | baseline | ~ b^(d/2) |
Where b = branching factor, d = shortest path length.
Requires knowing the target and being able to traverse edges in reverse.
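For the unweighted case the idea is easy to sketch: two BFS frontiers expanded layer by layer, always growing the smaller one (rgraph holds the reversed edges the note above requires; this is an illustrative sketch, not from the deck):

```python
from collections import deque

def bidirectional_bfs(graph, rgraph, s, t):
    if s == t:
        return 0
    df, db = {s: 0}, {t: 0}          # distances from s / to t
    qf, qb = deque([s]), deque([t])  # forward / backward frontiers
    best = float('inf')
    while qf and qb:
        # Expand whichever frontier is currently smaller
        if len(qf) <= len(qb):
            q, dist, other, g = qf, df, db, graph
        else:
            q, dist, other, g = qb, db, df, rgraph
        for _ in range(len(q)):      # one full BFS layer
            u = q.popleft()
            for v in g[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    if v in other:   # frontiers meet at v
                        best = min(best, dist[v] + other[v])
                    q.append(v)
        if best < float('inf'):
            # Finishing the layer before returning guarantees the minimum μ
            return best
    return float('inf')
```

Finishing the current layer before stopping matters: the first meeting vertex found mid-layer need not lie on a shortest path, but the layer's minimum df[v] + db[v] does.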
Contraction hierarchies (CH) are a preprocessing-based speed-up technique for shortest-path queries on static graphs — the backbone of modern road routing engines.
Preprocessing: contract vertices in order of importance, inserting shortcut edges that preserve shortest-path distances.
Query: bidirectional Dijkstra restricted to upward edges in the hierarchy — forward from s; backward from t.
Google Maps, OSRM, and GraphHopper all use contraction hierarchies or related techniques (ALT, transit nodes). CH is the de facto standard for static road network routing.
GPS navigation: preprocess the road graph with contraction hierarchies or A* with landmark heuristics. Real-time traffic requires dynamic edge weights and partial re-computation.
| Protocol | Algorithm | Type |
|---|---|---|
| OSPF | Dijkstra | Link-state |
| IS-IS | Dijkstra | Link-state |
| RIP | Bellman-Ford | Distance-vector |
| BGP | Path-vector | Policy-based |
The Internet literally runs on shortest-path algorithms. Every packet traverses routes computed by Dijkstra (OSPF) or Bellman-Ford (RIP) at the router level.
| Algorithm | Time | Space | Neg. wts | Neg. cycle | Use case |
|---|---|---|---|---|---|
| BFS | O(V+E) | O(V) | No | N/A | Unweighted graphs |
| Dijkstra (bin. heap) | O((V+E) log V) | O(V) | No | No | Non-negative weights (default) |
| Dijkstra (Fibonacci) | O(V log V + E) | O(V) | No | No | Dense, non-negative |
| Bellman-Ford | O(V · E) | O(V) | Yes | Yes | Negative weights, cycle detection |
| SPFA | O(E) avg | O(V) | Yes | Yes | Competitive programming |
| DAG relaxation | O(V+E) | O(V) | Yes | N/A | DAGs, critical path |
| Floyd-Warshall | O(V³) | O(V²) | Yes | Yes | All-pairs, dense graphs |
| Johnson's | O(V·E log V) | O(V²) | Yes | Yes | All-pairs, sparse graphs |
| A* | O(b^d) worst | O(b^d) | No | No | Single-pair with heuristic |
Unweighted? BFS. DAG? Topological sort. Non-negative weights? Dijkstra. Negative weights? Bellman-Ford. All-pairs dense? Floyd-Warshall. All-pairs sparse? Johnson's. Single-pair with domain knowledge? A*.
Key takeaways: Bellman-Ford runs in O(V · E); its extra pass detects negative cycles. Floyd-Warshall is O(V³) — ideal for dense graphs with small vertex counts.

| Source | Description |
|---|---|
| CLRS | Introduction to Algorithms — chapters 22–25 |
| Kleinberg & Tardos | Algorithm Design — shortest paths and network flow |
| Hart, Nilsson, Raphael | Original A* paper (1968) |
| Geisberger et al. | Contraction Hierarchies paper (2008) |
| Sedgewick & Wayne | Algorithms 4th ed. — Java implementations |
| Jeff Erickson | Algorithms — free online textbook |