Backtracking is a systematic method for exploring all potential solutions by building candidates incrementally and abandoning a candidate as soon as it cannot lead to a valid solution.
Key insight: backtracking turns exponential brute force into something practical by cutting off invalid branches early. The earlier you prune, the faster the search.
A state space tree is a rooted tree representing all possible states of a backtracking algorithm. Each node is a partial solution; edges represent decisions.
Without pruning: tree has b^d leaves (b = branching factor, d = depth). With pruning: many subtrees are never explored.
```python
def backtrack(state, decisions):
    if is_solution(state):
        record_solution(state)
        return
    for choice in get_choices(state, decisions):
        if is_valid(state, choice):      # prune
            apply(state, choice)         # choose
            backtrack(state, decisions)  # recurse
            undo(state, choice)          # unchoose
```
Pattern recognition: if a problem asks "find all configurations satisfying constraints" or "does a valid arrangement exist", backtracking is the first technique to consider.
Rejects partial solutions that cannot lead to a valid complete solution. The earlier this fires, the more work is saved.
State mutation and reversal. The candidate is modified in place and restored after recursion — no copying needed.
Recognises when a complete, valid solution has been built. Records or returns the result.
Recursive backtracking uses the call stack to store state. Iterative backtracking uses an explicit stack — identical complexity, avoids stack overflow for very deep searches.
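As a sketch of the iterative form — the function name and frame layout are illustrative, not a canonical API — subset enumeration can be driven by an explicit stack of (next index, partial subset) frames:

```python
def iterative_subsets(nums):
    """Enumerate all subsets of nums with an explicit stack
    instead of the call stack (a sketch)."""
    results = []
    # each frame: (index of the next element to decide on, subset so far)
    stack = [(0, [])]
    while stack:
        index, subset = stack.pop()
        if index == len(nums):
            results.append(subset)
            continue
        # two branches: exclude or include nums[index]
        stack.append((index + 1, subset))
        stack.append((index + 1, subset + [nums[index]]))
    return results
```

The complexity is identical to the recursive version; the only difference is that the pending decisions live on the heap rather than the call stack.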
Place N queens on an N x N chessboard so that no two queens attack each other — no shared row, column, or diagonal.
| N | Solutions | Unique (symmetry) |
|---|---|---|
| 4 | 2 | 1 |
| 8 | 92 | 12 |
| 12 | 14,200 | 1,787 |
| 14 | 365,596 | 45,752 |
```python
def is_safe(board, row, col):
    for prev_row in range(row):
        prev_col = board[prev_row]
        # same column
        if prev_col == col:
            return False
        # same diagonal
        if abs(prev_col - col) == row - prev_row:
            return False
    return True
```
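To turn the check into a full solver, a minimal driver might look like this (a sketch: `board[r]` holds the column of the queen in row r, and `is_safe` is restated so the snippet stands alone):

```python
def is_safe(board, row, col):
    # restated from above: column and diagonal conflicts
    for prev_row in range(row):
        prev_col = board[prev_row]
        if prev_col == col or abs(prev_col - col) == row - prev_row:
            return False
    return True

def solve_n_queens(n):
    """Collect every solution as a list of column indices, one per row."""
    solutions, board = [], [0] * n

    def place(row):
        if row == n:
            solutions.append(board[:])
            return
        for col in range(n):
            if is_safe(board, row, col):
                board[row] = col  # choose
                place(row + 1)    # recurse
        # no explicit undo needed: board[row] is simply overwritten,
        # and is_safe never looks at rows >= the current one

    place(0)
    return solutions
```

For N = 4 this yields exactly the two solutions in the table above, one of which is the `[1, 3, 0, 2]` found by the trace below.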
Row 0: col 0 ✓
Row 1: col 0 ✗ (column)
Row 1: col 1 ✗ (diagonal)
Row 1: col 2 ✓
Row 2: col 0 ✗ (column)
Row 2: col 1 ✗ (diagonal with row 1)
Row 2: col 2 ✗ (col conflict row 1)
Row 2: col 3 ✗ (diagonal)
← backtrack to row 1
Row 1: col 3 ✓
Row 2: col 0 ✗ (column)
Row 2: col 1 ✓
Row 3: col 0 ✗ col 1 ✗ col 2 ✗ col 3 ✗
← backtrack
Row 2: col 2 ✗ ...
← backtrack to row 0
Row 0: col 1 ✓
Row 1: col 3 ✓
Row 2: col 0 ✓
Row 3: col 2 ✓ ★ SOLUTION [1,3,0,2]
Check all N^N arrangements — 4^4 = 256 for N=4
Typically explores only a small fraction of the tree
| Approach | 8-Queens nodes |
|---|---|
| Brute force (8^8) | 16,777,216 |
| Backtracking | ~114 |
| Speedup | ~147,000x |
The power of pruning: each invalid placement at row r eliminates an entire subtree of N^(N-r-1) nodes.
Fill a 9x9 grid so every row, column, and 3x3 box contains digits 1–9 exactly once.
```python
def solve_sudoku(board):
    cell = find_empty(board)
    if cell is None:
        return True  # solved
    row, col = cell
    for num in range(1, 10):
        if is_valid(board, row, col, num):
            board[row][col] = num
            if solve_sudoku(board):
                return True
            board[row][col] = 0  # undo
    return False  # trigger backtracking
```
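`solve_sudoku` assumes two helpers that are not shown. Minimal sketches, with 0 marking an empty cell:

```python
def find_empty(board):
    """Return (row, col) of the first empty cell, or None if full."""
    for r in range(9):
        for c in range(9):
            if board[r][c] == 0:
                return (r, c)
    return None

def is_valid(board, row, col, num):
    """Check that num appears nowhere in the row, column, or 3x3 box."""
    if num in board[row]:
        return False
    if any(board[r][col] == num for r in range(9)):
        return False
    br, bc = 3 * (row // 3), 3 * (col // 3)
    return all(board[r][c] != num
               for r in range(br, br + 3)
               for c in range(bc, bc + 3))
```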
If only one number is valid for a cell, place it immediately — no branching needed
If a number can only go in one cell in a row/col/box, place it
Reduce domains before branching — eliminates options proactively
Fill the cell with fewest remaining options first — fail faster
Sudoku with constraint propagation solves most newspaper puzzles without any backtracking at all. Hard puzzles (17-clue minimum) still require search.
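The first rule above ("naked singles") can be sketched as a propagation pass — the helper name is illustrative, and this is deliberately not a full solver:

```python
def naked_singles(board):
    """Repeatedly fill cells whose candidate set has exactly one value.
    A sketch of the simplest propagation rule; 0 marks an empty cell."""
    progress = True
    while progress:
        progress = False
        for r in range(9):
            for c in range(9):
                if board[r][c] != 0:
                    continue
                br, bc = 3 * (r // 3), 3 * (c // 3)
                used = (set(board[r])
                        | {board[i][c] for i in range(9)}
                        | {board[i][j]
                           for i in range(br, br + 3)
                           for j in range(bc, bc + 3)})
                candidates = set(range(1, 10)) - used
                if len(candidates) == 1:
                    board[r][c] = candidates.pop()
                    progress = True
```

Running passes like this before (and between) branching steps is what lets easy puzzles fall without any search at all.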
Given a set S = {s1, s2, ..., sn} and a target T, find all subsets of S that sum to T.
```python
def subset_sum(nums, target, start, curr, res):
    if target == 0:
        res.append(curr[:])
        return
    if target < 0:
        return  # prune: overshot
    for i in range(start, len(nums)):
        curr.append(nums[i])
        subset_sum(nums, target - nums[i], i + 1, curr, res)
        curr.pop()  # backtrack
```
If nums[i] > remaining target, skip all subsequent elements — they are larger
If nums[i] == nums[i-1] at same recursion level, skip to avoid duplicate subsets
Maintain sum of unused elements; if remaining_sum < target, prune — impossible to reach target
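The first two rules can be folded into the basic routine. A sketch that assumes the input is sorted first:

```python
def subset_sum_pruned(nums, target):
    """Subset sum with sorted-input pruning: stop once an element
    exceeds the remaining target, and skip duplicate values that
    recur at the same recursion level."""
    nums = sorted(nums)
    res = []

    def search(start, remaining, curr):
        if remaining == 0:
            res.append(curr[:])
            return
        for i in range(start, len(nums)):
            if nums[i] > remaining:
                break      # sorted: every later element is larger too
            if i > start and nums[i] == nums[i - 1]:
                continue   # duplicate value at this level
            curr.append(nums[i])
            search(i + 1, remaining - nums[i], curr)
            curr.pop()     # backtrack

    search(0, target, [])
    return res
```

Sorting costs O(n log n) once and turns the "skip larger elements" rule into a single `break`, cutting off the rest of the level in one step.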
Subset sum is NP-complete. Backtracking with pruning is practical for moderate inputs (n < ~40). For larger inputs, dynamic programming or meet-in-the-middle may be preferable.
```python
def permute(nums, start, result):
    if start == len(nums):
        result.append(nums[:])
        return
    for i in range(start, len(nums)):
        nums[start], nums[i] = nums[i], nums[start]
        permute(nums, start + 1, result)
        nums[start], nums[i] = nums[i], nums[start]
```
Plain permutations admit no pruning — all n! arrangements are valid. When constraints are added (e.g., "no two adjacent elements differ by more than 2"), pruning becomes effective.
For inputs with repeated elements (e.g., [1, 1, 2]):
Skip index i if nums[i] == nums[i-1] and index i-1 was not used at this level.

```python
def combine(n, k, start, current, result):
    if len(current) == k:
        result.append(current[:])
        return
    # pruning: we still need (k - len(current)) more values,
    # so only iterate while enough candidates remain
    for i in range(start, n - (k - len(current)) + 2):
        current.append(i)
        combine(n, k, i + 1, current, result)
        current.pop()
```
Pruning the search space: at each level, if not enough remaining elements to fill the combination, stop immediately.
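To see the size bound pay off, a small instrumented sketch (an illustrative helper, not part of any library) can count recursive calls with and without it:

```python
def count_calls(n, k, prune):
    """Count recursive calls made while enumerating C(n, k)
    combinations of 1..n, with or without the size-based bound."""
    calls = 0

    def go(start, chosen):
        nonlocal calls
        calls += 1
        if chosen == k:
            return
        # with pruning, stop once too few candidates remain
        stop = n - (k - chosen) + 2 if prune else n + 1
        for i in range(start, stop):
            go(i + 1, chosen + 1)

    go(1, 0)
    return calls
```

Both variants produce the same C(n, k) combinations; the pruned one simply never enters a prefix that cannot be completed.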
| Property | Permutations | Combinations |
|---|---|---|
| Order matters | Yes | No |
| Count | n! / (n-k)! | n! / (k!(n-k)!) |
| Start index | Fixed (swap) | Advances |
| Typical pruning | Constraint | Size + constraint |
Combinations are a strict subset of the permutation search space. Using the start index avoids generating [1,2] and [2,1] separately.
Without pruning: explore all 2^n subsets, filter by size k. With pruning: only generate the C(n,k) valid combinations directly.
Assign colours to vertices of an undirected graph such that no two adjacent vertices share the same colour, using at most k colours.
```python
def graph_colour(graph, k, colours, vertex):
    if vertex == len(graph):
        return True  # all coloured
    for c in range(1, k + 1):
        if is_safe(graph, colours, vertex, c):
            colours[vertex] = c
            if graph_colour(graph, k, colours, vertex + 1):
                return True
            colours[vertex] = 0  # backtrack
    return False
```
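`graph_colour` relies on an `is_safe` check that is not shown. A minimal sketch, assuming `graph` is an adjacency matrix and `colours[u] == 0` means vertex u is still uncoloured:

```python
def is_safe(graph, colours, vertex, c):
    """Colour c is safe for vertex if no already-coloured
    neighbour uses it (uncoloured vertices hold 0, never a colour)."""
    return all(not graph[vertex][u] or colours[u] != c
               for u in range(len(graph)))
```

With this check, a triangle is found to be 3-colourable but not 2-colourable, as expected.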
Four colour theorem guarantees k=4 suffices for planar graphs
Compilers assign variables to CPU registers (graph = interference graph)
Exams, tasks, or frequencies with conflict constraints
Courses sharing students cannot be in the same slot
The minimum k for which a valid colouring exists is the chromatic number χ(G). Finding χ(G) is NP-hard.
```python
def hamiltonian(graph, path, visited):
    if len(path) == len(graph):
        if graph[path[-1]][path[0]]:
            return True  # cycle found
        return False
    for v in range(len(graph)):
        if not visited[v] and graph[path[-1]][v]:
            visited[v] = True
            path.append(v)
            if hamiltonian(graph, path, visited):
                return True
            path.pop()
            visited[v] = False  # backtrack
    return False
```
If an unvisited vertex has no unvisited neighbours, prune immediately
If removing the current vertex disconnects the unvisited subgraph, prune
Choose the vertex with fewest unvisited neighbours next — reduces branching factor
Hamiltonian cycle is NP-complete. No polynomial algorithm is known. Backtracking is the standard exact solver, but impractical for graphs with hundreds of vertices.
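One detail the listing leaves implicit is the seeding: `path` must already contain a start vertex and `visited` must mark it. A self-contained usage sketch (the solver is restated so the snippet runs on its own):

```python
def hamiltonian(graph, path, visited):
    # restated from the listing above
    if len(path) == len(graph):
        return bool(graph[path[-1]][path[0]])  # can we close the cycle?
    for v in range(len(graph)):
        if not visited[v] and graph[path[-1]][v]:
            visited[v] = True
            path.append(v)
            if hamiltonian(graph, path, visited):
                return True
            path.pop()
            visited[v] = False  # backtrack
    return False

# a 4-cycle has a Hamiltonian cycle; a star graph does not
cycle4 = [[0, 1, 0, 1],
          [1, 0, 1, 0],
          [0, 1, 0, 1],
          [1, 0, 1, 0]]
path, visited = [0], [True, False, False, False]
found = hamiltonian(cycle4, path, visited)
```

On success, `path` holds the cycle's vertex order starting from the seed vertex.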
A CSP is defined by three components:
Variables X1, X2, ..., Xn — the unknowns to assign
Domains D1, D2, ..., Dn — possible values for each variable
Constraints — restrictions on which combinations of values are allowed
Why CSP framing matters: provides a uniform representation — one algorithm solves many problems. Separates problem modelling from search strategy.
| Problem | Variables | Domains |
|---|---|---|
| N-Queens | Queen per row | {1..N} |
| Sudoku | Each cell | {1..9} |
| Graph colour | Vertex colours | {1..k} |
| Map colour | Region colours | {R,G,B,Y} |
| Scheduling | Task slots | Available slots |
Constraints: N-Queens — no shared col/diag. Sudoku — row/col/box uniqueness. Graph colouring — adjacent vertices differ. Scheduling — no resource conflicts.
| Strategy | Effect |
|---|---|
| Plain backtracking | Explores many dead-end branches |
| + MRV | Detects failures earlier — fail-first |
| + Forward checking | Prunes domains proactively after each assignment |
| + MAC (AC-3) | Near-optimal pruning; solves most CSPs efficiently |
Without pruning, backtracking degenerates to brute force. Effective pruning is the difference between practical and intractable.
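The first two rows of the table can be combined in a compact solver. A sketch of MRV plus forward checking — the function names and the `conflicts` interface are illustrative, not a standard API:

```python
def solve_csp(domains, conflicts):
    """Backtracking with MRV and forward checking (a sketch).
    domains: {var: set of values}.
    conflicts(v1, x1, v2, x2) -> True if assigning x1 to v1
    forbids value x2 for variable v2."""

    def search(domains, assignment):
        unassigned = [v for v in domains if v not in assignment]
        if not unassigned:
            return assignment
        # MRV: branch on the variable with the fewest options left
        var = min(unassigned, key=lambda v: len(domains[v]))
        for val in sorted(domains[var]):
            # forward checking: drop values that conflict with val
            new_domains = {
                v: ({val} if v == var
                    else {x for x in domains[v]
                          if not conflicts(var, val, v, x)})
                for v in domains
            }
            # dead end if any variable loses its whole domain
            if any(not new_domains[v] for v in unassigned):
                continue
            result = search(new_domains, {**assignment, var: val})
            if result is not None:
                return result
        return None

    return search(domains, {})
```

Because forward checking removes conflicting values the moment a variable is assigned, no later assignment can clash with an earlier one — the explicit consistency check of plain backtracking disappears.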
Reject partial solutions that already violate a constraint
In optimisation, reject branches that cannot improve the best known solution
Avoid exploring configurations that are rotations/reflections of already-explored solutions
Skip a partial solution if another is provably at least as good
Proactively reduce variable domains rather than waiting for conflicts
Rule of thumb: the cost of the pruning check must be less than the cost of exploring the pruned subtree. Cheap checks that eliminate large subtrees are the sweet spot.
For problems with small state spaces (N-Queens with N ≤ 30), represent sets as bitmasks for O(1) constraint checks.
```python
def solve(row, cols, diag1, diag2, n):
    if row == n:
        return 1
    count = 0
    available = ((1 << n) - 1) & ~(cols | diag1 | diag2)
    while available:
        bit = available & (-available)  # lowest set bit
        count += solve(row + 1,
                       cols | bit,
                       (diag1 | bit) << 1,
                       (diag2 | bit) >> 1, n)
        available ^= bit
    return count
```
Combine DFS with depth limits. Useful when solution depth is unknown and memory is constrained.
For hard CSPs, restart from a random initial state if the search stalls. Avoids unproductive subtrees.
Distribute independent subtrees across threads. Work stealing for load balancing. Near-linear speedup for many-branch problems.
Branch and bound adds a bounding function that estimates the best possible solution achievable from a partial solution.
| Aspect | Backtracking | Branch & Bound |
|---|---|---|
| Goal | Feasible solutions | Optimal solution |
| Pruning | Constraint violations | Bound vs best-so-far |
| Bounding fn | Not required | Essential |
| Problems | CSPs, enumeration | TSP, knapsack, scheduling |
Branch and bound is the standard approach for integer linear programming (ILP) solvers like CPLEX and Gurobi. The quality of the bounding function determines solver performance.
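At a much smaller scale, the same idea can be sketched for 0/1 knapsack. The fractional (LP-relaxation) bound used here is one common choice; names are illustrative, and weights are assumed positive:

```python
def knapsack_bb(items, capacity):
    """0/1 knapsack by branch and bound (a sketch).
    items: list of (value, weight) pairs with positive weights."""
    # sort by value density so the fractional bound is the greedy optimum
    items = sorted(items, key=lambda it: it[0] / it[1], reverse=True)
    best = 0

    def bound(i, value, room):
        # optimistic estimate: fill the remaining room fractionally
        for v, w in items[i:]:
            if w <= room:
                value += v
                room -= w
            else:
                return value + v * room / w
        return value

    def search(i, value, room):
        nonlocal best
        best = max(best, value)
        if i == len(items) or room == 0:
            return
        if bound(i, value, room) <= best:
            return  # prune: this branch cannot beat the best-so-far
        v, w = items[i]
        if w <= room:
            search(i + 1, value + v, room - w)  # take item i
        search(i + 1, value, room)              # skip item i

    search(0, 0, capacity)
    return best
```

The bounding function is exactly what distinguishes this from plain backtracking: branches are cut not because they violate a constraint, but because their optimistic estimate cannot beat the incumbent.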
Generate all possible candidates, then check each for validity. Always explores the full search space. Time: O(b^d).
Generate candidates incrementally, pruning invalid branches early. Explores only a fraction of the space. Worst case same; best case dramatically faster.
| Problem | Brute Force | Backtracking | Speedup |
|---|---|---|---|
| 8-Queens | 16,777,216 | ~114 nodes | ~147,000x |
| Sudoku (easy) | 9^51 | ~100 nodes | astronomical |
| Graph colour (sparse) | k^n | ~O(k·n) | exponential |
Backtracking is not a different complexity class — it is a constant-factor (often enormous) improvement within the same exponential class. For polynomial-time solutions, you need a fundamentally different algorithm.
| Problem | Worst Case | Typical with Pruning | Notes |
|---|---|---|---|
| N-Queens | O(N!) | Much less in practice | One queen per row cuts the naive space to N^N; column checks cut it to N! |
| Sudoku | O(9^81) | Near-instant for most | Constraint propagation dominates |
| Subset Sum | O(2^n) | Depends on target | DP may be better for small targets |
| Permutations | O(n!) | O(n!) unconstrained | No pruning for plain case |
| Combinations C(n,k) | O(C(n,k)) | O(C(n,k)) | Size pruning only |
| Graph Colouring | O(k^n) | Problem-dependent | Sparse graphs prune well |
| Hamiltonian Cycle | O(n!) | Problem-dependent | NP-complete |
| TSP (B&B) | O(n!) | Manageable for n<25 | Bounding fn quality is key |
Recursive backtracking: O(d) stack space where d = max recursion depth. Storing all solutions: O(S * d) where S = number of solutions.
Backtracking complexity is problem-instance dependent. Analyse the constraint structure, not just the worst case, to predict real-world performance.
| Source | Description |
|---|---|
| Cormen et al. | Introduction to Algorithms (CLRS) — backtracking and branch-and-bound |
| Skiena | The Algorithm Design Manual — practical strategies and war stories |
| Russell & Norvig | AI: A Modern Approach — CSP chapter with AC-3, MRV, MAC |
| Knuth | TAOCP Vol. 4 — exhaustive enumeration and Dancing Links |
| LeetCode | Backtracking tag — curated practice problems |
| Sedgewick | Algorithms — permutations, combinations, constraint search |