We need a machine-independent way to describe how an algorithm's resource usage grows as input size n increases. Constants and lower-order terms are irrelevant at scale.
| Notation | Meaning | Intuition |
|---|---|---|
| O(g(n)) | Asymptotic upper bound | f grows no faster than g |
| Ω(g(n)) | Asymptotic lower bound | f grows at least as fast as g |
| Θ(g(n)) | Tight bound | f grows at the same rate as g |
| o(g(n)) | Strict upper bound | f grows strictly slower than g |
| ω(g(n)) | Strict lower bound | f grows strictly faster than g |
Formal definition (Big-O): f(n) = O(g(n)) if there exist constants c > 0 and n0 such that for all n ≥ n0: f(n) ≤ c · g(n).
Big-O describes the worst-case growth rate. It is an upper bound — saying an algorithm is O(n²) does not mean it always takes n² steps.
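The definition can be checked numerically for a concrete pair of functions (the f, c, and n0 below are our illustrative choices, not from the text):

```python
# Claim: f(n) = 3n + 10 is O(n), with witnesses c = 4 and n0 = 10.
def f(n):
    return 3 * n + 10

c, n0 = 4, 10
# Verify f(n) <= c * n for a large range of n >= n0.
assert all(f(n) <= c * n for n in range(n0, 10_000))
```

The witnesses are not unique: any larger c (with a suitable n0) also works, which is why constants are irrelevant asymptotically.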
| Class | Name | Example algorithm |
|---|---|---|
| O(1) | Constant | Hash table lookup |
| O(log n) | Logarithmic | Binary search |
| O(n) | Linear | Linear search, single pass |
| O(n log n) | Linearithmic | Merge sort, heap sort |
| O(n²) | Quadratic | Bubble sort, insertion sort |
| O(n³) | Cubic | Naive matrix multiplication |
| O(2ⁿ) | Exponential | Brute-force subset enumeration |
| O(n!) | Factorial | Brute-force permutations |
A problem is tractable if it admits a polynomial-time algorithm — O(nᵏ) for some constant k. Exponential and factorial problems are intractable for large inputs.
| n | log n | n | n log n | n² | 2ⁿ |
|---|---|---|---|---|---|
| 10 | 3.3 | 10 | 33 | 100 | 1,024 |
| 100 | 6.6 | 100 | 664 | 10,000 | 1.27 × 10³⁰ |
| 1,000 | 10 | 1,000 | 10,000 | 10⁶ | — |
| 10⁶ | 20 | 10⁶ | 2 × 10⁷ | 10¹² | — |
At n = 100, an O(2ⁿ) algorithm needs roughly 10³⁰ operations: tens of thousands of years even at 10¹⁸ operations per second. This is why the polynomial vs exponential distinction matters.
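The table's values are easy to reproduce; a quick sketch (function name `growth` is ours):

```python
import math

def growth(n):
    """Return (log2 n, n log2 n, n^2) for a given input size n."""
    return math.log2(n), n * math.log2(n), n ** 2

for n in (10, 100, 1_000, 10 ** 6):
    lg, nlg, sq = growth(n)
    print(f"n={n:>9,}  log n={lg:6.1f}  n log n={nlg:14,.0f}  n^2={sq:16,.0f}")

# The exponential column explodes immediately:
print(f"2^100 = {2 ** 100:.2e}")
```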
Example: 5n² + 3n + 7 = O(n²), with witnesses c = 15 and n0 = 1, since 5n² + 3n + 7 ≤ 15n² for all n ≥ 1.

Amortised analysis gives the average cost per operation over a worst-case sequence of operations. It is not the same as average-case analysis, which assumes a probability distribution over inputs.
Aggregate method: compute the total cost of n operations and divide by n.
Accounting method: assign an amortised "charge" per operation; overcharging cheap operations builds credit that pays for expensive ones.
Potential method: define a potential function Φ on the data structure; amortised cost = actual cost + ΔΦ.
A dynamic array doubles its capacity when full. Individual push operations are O(1) except when resizing occurs (O(n) copy). But n pushes cost at most 2n copies total.
Amortised cost per push: O(1)
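The doubling argument can be checked by simulation. This sketch (the helper `count_copies` is ours) counts element copies over n pushes and confirms the 2n bound:

```python
def count_copies(n_pushes):
    """Simulate a doubling dynamic array; return total element copies made."""
    capacity, size, copies = 1, 0, 0
    for _ in range(n_pushes):
        if size == capacity:      # array full: double capacity, copy contents
            copies += size
            capacity *= 2
        size += 1
    return copies

# Total copies across n pushes never exceed 2n, so each push is O(1) amortised.
for n in (10, 1_000, 10 ** 6):
    assert count_copies(n) <= 2 * n
```

The copies form a geometric series 1 + 2 + 4 + … < 2n, which is exactly the aggregate method in action.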
Other examples: splay trees — O(log n) amortised • Fibonacci heaps — O(1) amortised decrease-key • binary counter increment — O(1) amortised
A decision problem asks a yes/no question about an input. Complexity theory focuses on decision problems because they are the simplest to classify.
PRIME: "Is n a prime number?" — Yes/No
PATH: "Does graph G have a path from s to t?" — Yes/No
SAT: "Is there an assignment satisfying boolean formula φ?" — Yes/No
A language L over alphabet Σ is a set of strings. A decision problem maps to a language:
L = { x ∈ Σ* : the answer for x is YES }
A Turing machine M decides L if it halts on every input and accepts exactly the strings in L.
Complexity classes are sets of languages — they group decision problems by the resources (time, space) required by the Turing machines that decide them.
P is the class of decision problems solvable by a deterministic Turing machine in polynomial time:
P = ⋃ₖ DTIME(nᵏ)
P captures the intuitive notion of "efficiently solvable" — problems where a fast algorithm exists.
Not all polynomial algorithms are fast in practice: O(n¹⁰⁰) is technically in P but useless. The importance of P is theoretical: it defines the boundary of tractability.
| Problem | Time |
|---|---|
| Sorting | O(n log n) |
| Shortest path | O(E + V log V) |
| Primality (AKS) | O(n⁶), n = bit length |
| Max matching | O(V³) |
| Linear programming | Polynomial |
| 2-SAT | O(n + m) |
| Connectivity | O(V + E) |
NP is the class of decision problems solvable by a nondeterministic Turing machine in polynomial time.
Equivalently: NP is the class of problems for which a YES answer can be verified in polynomial time given a suitable certificate.
A nondeterministic TM can "guess" a solution (explore all branches simultaneously). If any branch accepts in polynomial time, the machine accepts.
NP does NOT mean "not polynomial". It means "nondeterministic polynomial time". Many NP problems are in P.
A language L is in NP if there exists a polynomial-time verifier V such that:
L = { x : ∃ certificate c with |c| = poly(|x|) and V(x, c) accepts }.
| Problem | Certificate | Verification |
|---|---|---|
| SAT | A satisfying assignment | Evaluate formula — O(n) |
| CLIQUE | A set of k vertices | Check all pairs adjacent — O(k²) |
| HAMILTONIAN CYCLE | A permutation of vertices | Check each edge exists — O(n) |
| COMPOSITE | A non-trivial factor | Perform division — O(n²) |
| SUBSET SUM | A subset of numbers | Sum and compare — O(n) |
The certificate is a "proof" that the answer is YES. The verifier checks this proof efficiently. The hard part is finding the certificate, not checking it.
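As an illustration, a minimal SUBSET SUM verifier (the function name and certificate encoding are our choices): the certificate is a list of indices, and checking it takes a single pass.

```python
def verify_subset_sum(numbers, target, certificate):
    """Polynomial-time verifier: certificate is a list of indices into numbers."""
    if len(set(certificate)) != len(certificate):
        return False              # indices must be distinct
    if any(i < 0 or i >= len(numbers) for i in certificate):
        return False              # indices must be in range
    return sum(numbers[i] for i in certificate) == target

# Checking is trivial; finding the certificate is the hard part.
assert verify_subset_sum([3, 34, 4, 12, 5, 2], 9, [2, 4])       # 4 + 5 = 9
assert not verify_subset_sum([3, 34, 4, 12, 5, 2], 9, [0, 1])   # 3 + 34 != 9
```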
A polynomial-time reduction from problem A to problem B is a poly-time computable function f such that:
x ∈ A ⇔ f(x) ∈ B
Written: A ≤ₚ B, read "A reduces to B".
Classic reductions chain together, e.g. SAT ≤ₚ 3-SAT ≤ₚ CLIQUE ≤ₚ VERTEX COVER; each arrow is a polynomial-time reduction. If any of these problems could be solved in poly-time, so could all of them.
A problem L is NP-complete if:
1. L is in NP: solutions can be verified in polynomial time.
2. Every problem in NP reduces to L in polynomial time (i.e. L is NP-hard).
NP-complete problems are the hardest problems in NP. If any NP-complete problem has a polynomial-time algorithm, then P = NP.
Finding a problem NP-complete is practically useful: it tells you to stop looking for an exact efficient algorithm and instead consider approximation, heuristics, or parameterised solutions.
Theorem: SAT is NP-complete. (Cook, 1971; Levin, independently, 1973)
This was the first problem proven NP-complete. The proof shows that any NP computation can be encoded as a boolean satisfiability instance — the universal reduction.
Given any NP language L decided by a nondeterministic TM M in time p(n), the proof builds a boolean formula whose variables describe the p(n) × p(n) tableau of M's computation on input x; clauses enforce that the tableau is a valid, accepting computation. The formula has polynomial size and is satisfiable iff M accepts x.
Once SAT is NP-complete, all other NP-completeness proofs reduce from SAT (or from a problem already known to be NP-complete).
SAT: Given a boolean formula φ, is there an assignment of variables that makes φ true?
3-SAT: SAT restricted to CNF with exactly 3 literals per clause. Still NP-complete (by reduction from SAT). Surprisingly, 2-SAT is in P.
CLIQUE: Given graph G and integer k, does G contain a complete subgraph of size k?
INDEPENDENT SET: Given graph G and integer k, does G contain k pairwise non-adjacent vertices? (Complement of CLIQUE.)
VERTEX COVER: Given graph G and integer k, is there a set of at most k vertices touching every edge?
These three graph problems are tightly related: S is an independent set ⇔ V \ S is a vertex cover. Reductions between them are straightforward.
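The equivalence can be verified exhaustively on a toy graph; a minimal sketch (helper names are ours):

```python
import itertools

def is_independent_set(edges, s):
    """No edge has both endpoints inside s."""
    return not any(u in s and v in s for u, v in edges)

def is_vertex_cover(edges, c):
    """Every edge has at least one endpoint inside c."""
    return all(u in c or v in c for u, v in edges)

# S independent  <=>  V \ S is a vertex cover, checked over every subset
# of a 4-cycle.
vertices = {0, 1, 2, 3}
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
for r in range(len(vertices) + 1):
    for combo in itertools.combinations(vertices, r):
        s = set(combo)
        assert is_independent_set(edges, s) == is_vertex_cover(edges, vertices - s)
```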
HAMILTONIAN CYCLE: Does graph G contain a cycle visiting every vertex exactly once?
TSP (decision version): Given a weighted complete graph and budget B, is there a tour visiting every city with total cost ≤ B?
SUBSET SUM: Given a set of integers S and target t, is there a subset of S summing to exactly t?
GRAPH k-COLOURING: Can vertices be coloured with k colours so no two adjacent vertices share a colour? NP-complete for k ≥ 3; polynomial for k = 2.
SET COVER: Given universe U and a collection of subsets, can you cover U with at most k subsets?
These problems arise constantly in scheduling, logistics, networking, resource allocation, and VLSI design. Their NP-completeness motivates approximation algorithms.
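For contrast with the NP-complete k ≥ 3 case, 2-colouring is just bipartiteness testing, decidable by BFS in O(V + E); a sketch (function name is ours):

```python
from collections import deque

def two_colourable(adj):
    """BFS 2-colouring. adj maps each vertex to its neighbour list."""
    colour = {}
    for start in adj:
        if start in colour:
            continue
        colour[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in colour:
                    colour[v] = 1 - colour[u]   # opposite colour to u
                    queue.append(v)
                elif colour[v] == colour[u]:
                    return False                # odd cycle: not 2-colourable
    return True

assert two_colourable({0: [1], 1: [0, 2], 2: [1]})            # path: bipartite
assert not two_colourable({0: [1, 2], 1: [0, 2], 2: [0, 1]})  # triangle
```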
A problem H is NP-hard if every problem in NP reduces to H in polynomial time. An NP-hard problem does NOT need to be in NP itself.
| Problem | Why not in NP? |
|---|---|
| Halting problem | Undecidable — no TM decides it at all |
| QSAT (QBF) | PSPACE-complete; certificate may be exponential |
| TSP optimisation | Output is a tour (not yes/no); decision version is in NP |
| Counting SAT (#SAT) | Answer is a number, not a yes/no certificate |
The question: Does P = NP? Can every problem whose solution can be verified in polynomial time also be solved in polynomial time?
One of the seven Clay Mathematics Institute Millennium Prize Problems. $1,000,000 for a correct proof either way. Most complexity theorists believe P ≠ NP, but no proof exists.
"If P = NP, then the world would be a profoundly different place. Everyone who could appreciate a symphony would be Mozart." — Scott Aaronson
When exact solutions are NP-hard, we settle for solutions that are provably close to optimal in polynomial time.
| Problem | Technique | Ratio |
|---|---|---|
| Vertex Cover | Greedy edge matching | 2-approx |
| TSP (metric) | Christofides-Serdyukov | 3/2-approx |
| Set Cover | Greedy | O(ln n)-approx |
| MAX-SAT | Randomised rounding | 3/4-approx |
| Knapsack | FPTAS | (1 + ε) for any ε > 0 |
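The 2-approximation for Vertex Cover from the table is short enough to sketch: repeatedly pick an uncovered edge and take both its endpoints (a greedy maximal matching; function name is ours).

```python
def vertex_cover_2approx(edges):
    """Take both endpoints of each edge not yet covered."""
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:
            cover.update((u, v))      # this edge joins the matching
    return cover

# The matched edges are disjoint, and any optimal cover must contain at
# least one endpoint of each, so |cover| <= 2 * OPT.
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
cover = vertex_cover_2approx(edges)
assert all(u in cover or v in cover for u, v in edges)
```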
PTAS (Polynomial-Time Approximation Scheme): a (1 + ε)-approximation for any fixed ε > 0. Running time is polynomial in n but possibly exponential in 1/ε.
FPTAS (Fully Polynomial-Time Approximation Scheme): running time polynomial in both n and 1/ε. The gold standard for approximation.
Not all NP-hard problems can be approximated well. TSP (general, non-metric) cannot be approximated within any constant factor unless P = NP.
Instead of measuring complexity in input size n alone, we also consider a parameter k. A problem is fixed-parameter tractable (FPT) if solvable in time f(k) · nc for some computable f and constant c.
An O(2ᵏ · n) algorithm is practical when k is small, even if n is large. Classical complexity calls this "exponential" — parameterised complexity recognises it as tractable.
| Problem | Parameter k | Technique | Time |
|---|---|---|---|
| Vertex Cover | Cover size | Bounded search tree | O(2ᵏ · n) |
| Feedback Vertex Set | Solution size | Iterative compression | O(f(k) · poly(n)) |
| Longest Path | Path length | Colour coding | O(2^O(k) · poly(n)) |
| Treewidth | Treewidth | Dynamic programming | O(f(k) · n) |
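The bounded search tree for Vertex Cover can be sketched directly: any uncovered edge forces one of its two endpoints into the cover, giving a depth-k binary recursion and hence O(2ᵏ) branches (function name is ours):

```python
def vc_branch(edges, k):
    """Is there a vertex cover of size <= k? Bounded search tree, O(2^k * |E|)."""
    if not edges:
        return True                   # nothing left to cover
    if k == 0:
        return False                  # edges remain but budget is spent
    u, v = edges[0]
    # Any cover must contain u or v: branch on both choices.
    return (vc_branch([e for e in edges if u not in e], k - 1)
            or vc_branch([e for e in edges if v not in e], k - 1))

assert vc_branch([(0, 1), (1, 2), (2, 3), (3, 4)], 2)    # {1, 3} works
assert not vc_branch([(0, 1), (2, 3), (4, 5)], 2)        # 3 disjoint edges need 3
```

The recursion depth is at most k regardless of n, which is exactly the f(k) · poly(n) shape that makes the problem FPT.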
W[0] = FPT ⊆ W[1] ⊆ W[2] ⊆ … — analogous to the P/NP hierarchy but parameterised. k-CLIQUE is W[1]-complete, meaning it is unlikely to be FPT.
| Class | Definition | Key problems |
|---|---|---|
| L | Decidable in O(log n) space (deterministic) | Undirected connectivity |
| NL | Decidable in O(log n) space (nondeterministic) | Directed s-t connectivity (PATH) |
| PSPACE | Decidable in polynomial space | QSAT, generalised chess/Go |
| EXPTIME | Decidable in exponential time | Generalised chess (n×n) |
Savitch's theorem: NSPACE(f(n)) ⊆ DSPACE(f(n)²). Nondeterminism gives at most a quadratic blowup in space. In particular, NL ⊆ DSPACE(log² n).
PATH (directed s-t connectivity) is NL-complete under log-space reductions. It plays the same role in space complexity that SAT plays in time.
PSPACE captures problems where you can reuse space. Many two-player games are PSPACE-complete — the game tree is exponential but the space to evaluate it is polynomial.
| Source | Description |
|---|---|
| Sipser | Introduction to the Theory of Computation — the standard complexity textbook |
| Arora & Barak | Computational Complexity: A Modern Approach — comprehensive graduate reference |
| CLRS | Introduction to Algorithms — Ch. 34 covers NP-completeness |
| Garey & Johnson | Computers and Intractability — the classic NP-completeness catalogue |
| Complexity Zoo | complexityzoo.net — comprehensive class catalogue |
| Aaronson | scottaaronson.blog — accessible P vs NP writing |