We need a machine-independent way to describe how an algorithm's resource usage grows as input size n increases. Constants and lower-order terms are irrelevant at scale.
| Notation | Meaning | Intuition |
|---|---|---|
| O(g(n)) | Asymptotic upper bound | f grows no faster than g |
| Ω(g(n)) | Asymptotic lower bound | f grows at least as fast as g |
| Θ(g(n)) | Tight bound | f grows at the same rate as g |
| o(g(n)) | Strict upper bound | f grows strictly slower than g |
| ω(g(n)) | Strict lower bound | f grows strictly faster than g |
Formal definition (Big-O): f(n) = O(g(n)) if there exist constants c > 0 and n0 such that for all n ≥ n0: f(n) ≤ c · g(n).
Big-O describes the worst-case growth rate. It is an upper bound — saying an algorithm is O(n²) does not mean it always takes n² steps.
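The definition can be checked numerically for a concrete pair of functions (the f, c, and n0 below are our illustrative choices, not from the text):

```python
# Claim: f(n) = 3n + 10 is O(n), with witnesses c = 4 and n0 = 10.
def f(n):
    return 3 * n + 10

c, n0 = 4, 10
# Verify f(n) <= c * n for a large range of n >= n0.
assert all(f(n) <= c * n for n in range(n0, 10_000))
```

The witnesses are not unique: any larger c (with a suitable n0) also works, which is why constants are irrelevant asymptotically.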
| Class | Name | Example algorithm |
|---|---|---|
| O(1) | Constant | Hash table lookup |
| O(log n) | Logarithmic | Binary search |
| O(n) | Linear | Linear search, single pass |
| O(n log n) | Linearithmic | Merge sort, heap sort |
| O(n²) | Quadratic | Bubble sort, insertion sort |
| O(n³) | Cubic | Naive matrix multiplication |
| O(2ⁿ) | Exponential | Brute-force subset enumeration |
| O(n!) | Factorial | Brute-force permutations |
A problem is tractable if it admits a polynomial-time algorithm — O(nᵏ) for some constant k. Exponential and factorial problems are intractable for large inputs.
| n | log n | n | n log n | n² | 2ⁿ |
|---|---|---|---|---|---|
| 10 | 3.3 | 10 | 33 | 100 | 1,024 |
| 100 | 6.6 | 100 | 664 | 10,000 | 1.27 × 10³⁰ |
| 1,000 | 10 | 1,000 | 10,000 | 10⁶ | — |
| 10⁶ | 20 | 10⁶ | 2 × 10⁷ | 10¹² | — |
At n = 100, an O(2ⁿ) algorithm needs roughly 10³⁰ operations: tens of thousands of years even at 10¹⁸ operations per second. This is why the polynomial vs exponential distinction matters.
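The table's values are easy to reproduce; a quick sketch (function name `growth` is ours):

```python
import math

def growth(n):
    """Return (log2 n, n log2 n, n^2) for a given input size n."""
    return math.log2(n), n * math.log2(n), n ** 2

for n in (10, 100, 1_000, 10 ** 6):
    lg, nlg, sq = growth(n)
    print(f"n={n:>9,}  log n={lg:6.1f}  n log n={nlg:14,.0f}  n^2={sq:16,.0f}")

# The exponential column explodes immediately:
print(f"2^100 = {2 ** 100:.2e}")
```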
Example: 5n² + 3n + 7 = O(n²), with witnesses c = 15 and n0 = 1, since 5n² + 3n + 7 ≤ 15n² for all n ≥ 1.

Amortised analysis gives the average cost per operation over a worst-case sequence of operations. It is not the same as average-case analysis, which assumes a probability distribution over inputs.
Aggregate method: compute the total cost of n operations and divide by n.
Accounting method: assign an amortised "charge" per operation; overcharging cheap operations builds credit that pays for expensive ones.
Potential method: define a potential function Φ on the data structure; amortised cost = actual cost + ΔΦ.
A dynamic array doubles its capacity when full. Individual push operations are O(1) except when resizing occurs (O(n) copy). But n pushes cost at most 2n copies total.
Amortised cost per push: O(1)
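The doubling argument can be checked by simulation. This sketch (the helper `count_copies` is ours) counts element copies over n pushes and confirms the 2n bound:

```python
def count_copies(n_pushes):
    """Simulate a doubling dynamic array; return total element copies made."""
    capacity, size, copies = 1, 0, 0
    for _ in range(n_pushes):
        if size == capacity:      # array full: double capacity, copy contents
            copies += size
            capacity *= 2
        size += 1
    return copies

# Total copies across n pushes never exceed 2n, so each push is O(1) amortised.
for n in (10, 1_000, 10 ** 6):
    assert count_copies(n) <= 2 * n
```

The copies form a geometric series 1 + 2 + 4 + … < 2n, which is exactly the aggregate method in action.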
Other examples: splay trees — O(log n) amortised • Fibonacci heaps — O(1) amortised decrease-key • binary counter increment — O(1) amortised
A decision problem asks a yes/no question about an input. Complexity theory focuses on decision problems because they are the simplest to classify.
PRIME: "Is n a prime number?" — Yes/No
PATH: "Does graph G have a path from s to t?" — Yes/No
SAT: "Is there an assignment satisfying boolean formula φ?" — Yes/No
A language L over alphabet Σ is a set of strings. A decision problem maps to a language:
L = { x ∈ Σ* : the answer for x is YES }
A Turing machine M decides L if it halts on every input and accepts exactly the strings in L.
Complexity classes are sets of languages — they group decision problems by the resources (time, space) required by the Turing machines that decide them.
P is the class of decision problems solvable by a deterministic Turing machine in polynomial time:
P = ⋃ₖ DTIME(nᵏ)
P captures the intuitive notion of "efficiently solvable" — problems where a fast algorithm exists.
Not all polynomial algorithms are fast in practice: O(n¹⁰⁰) is technically in P but useless. The importance of P is theoretical: it defines the boundary of tractability.
| Problem | Time |
|---|---|
| Sorting | O(n log n) |
| Shortest path | O(E + V log V) |
| Primality (AKS) | O(n⁶), n = bit length |
| Max matching | O(V³) |
| Linear programming | Polynomial |
| 2-SAT | O(n + m) |
| Connectivity | O(V + E) |
NP is the class of decision problems solvable by a nondeterministic Turing machine in polynomial time.
Equivalently: NP is the class of problems for which a YES answer can be verified in polynomial time given a suitable certificate.
A nondeterministic TM can "guess" a solution (explore all branches simultaneously). If any branch accepts in polynomial time, the machine accepts.
NP does NOT mean "not polynomial". It means "nondeterministic polynomial time". Many NP problems are in P.
A language L is in NP if there exists a polynomial-time verifier V such that:
L = { x : ∃ certificate c with |c| = poly(|x|) and V(x, c) accepts }.
| Problem | Certificate | Verification |
|---|---|---|
| SAT | A satisfying assignment | Evaluate formula — O(n) |
| CLIQUE | A set of k vertices | Check all pairs adjacent — O(k²) |
| HAMILTONIAN CYCLE | A permutation of vertices | Check each edge exists — O(n) |
| COMPOSITE | A non-trivial factor | Perform division — O(n²) |
| SUBSET SUM | A subset of numbers | Sum and compare — O(n) |
The certificate is a "proof" that the answer is YES. The verifier checks this proof efficiently. The hard part is finding the certificate, not checking it.
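As an illustration, a minimal SUBSET SUM verifier (the function name and certificate encoding are our choices): the certificate is a list of indices, and checking it takes a single pass.

```python
def verify_subset_sum(numbers, target, certificate):
    """Polynomial-time verifier: certificate is a list of indices into numbers."""
    if len(set(certificate)) != len(certificate):
        return False              # indices must be distinct
    if any(i < 0 or i >= len(numbers) for i in certificate):
        return False              # indices must be in range
    return sum(numbers[i] for i in certificate) == target

# Checking is trivial; finding the certificate is the hard part.
assert verify_subset_sum([3, 34, 4, 12, 5, 2], 9, [2, 4])       # 4 + 5 = 9
assert not verify_subset_sum([3, 34, 4, 12, 5, 2], 9, [0, 1])   # 3 + 34 != 9
```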
A polynomial-time reduction from problem A to problem B is a poly-time computable function f such that:
x ∈ A ⇔ f(x) ∈ B
Written: A ≤ₚ B, read "A reduces to B".
Classic reductions chain together, e.g. SAT ≤ₚ 3-SAT ≤ₚ CLIQUE ≤ₚ VERTEX COVER; each arrow is a polynomial-time reduction. If any of these problems could be solved in poly-time, so could all of them.
A problem L is NP-complete if:
1. L is in NP: solutions can be verified in polynomial time.
2. Every problem in NP reduces to L in polynomial time (i.e. L is NP-hard).
NP-complete problems are the hardest problems in NP. If any NP-complete problem has a polynomial-time algorithm, then P = NP.
Finding a problem NP-complete is practically useful: it tells you to stop looking for an exact efficient algorithm and instead consider approximation, heuristics, or parameterised solutions.
Theorem: SAT is NP-complete. (Cook, 1971; Levin, independently, 1973)
This was the first problem proven NP-complete. The proof shows that any NP computation can be encoded as a boolean satisfiability instance — the universal reduction.
Given any NP language L decided by a nondeterministic TM M in time p(n), the proof builds a boolean formula whose variables describe the p(n) × p(n) tableau of M's computation on input x; clauses enforce that the tableau is a valid, accepting computation. The formula has polynomial size and is satisfiable iff M accepts x.
Once SAT is NP-complete, all other NP-completeness proofs reduce from SAT (or from a problem already known to be NP-complete).
SAT: Given a boolean formula φ, is there an assignment of variables that makes φ true?
3-SAT: SAT restricted to CNF with exactly 3 literals per clause. Still NP-complete (by reduction from SAT). Surprisingly, 2-SAT is in P.
CLIQUE: Given graph G and integer k, does G contain a complete subgraph of size k?
INDEPENDENT SET: Given graph G and integer k, does G contain k pairwise non-adjacent vertices? (Complement of CLIQUE.)
VERTEX COVER: Given graph G and integer k, is there a set of at most k vertices touching every edge?
These three graph problems are tightly related: S is an independent set ⇔ V \ S is a vertex cover. Reductions between them are straightforward.
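The equivalence can be verified exhaustively on a toy graph; a minimal sketch (helper names are ours):

```python
import itertools

def is_independent_set(edges, s):
    """No edge has both endpoints inside s."""
    return not any(u in s and v in s for u, v in edges)

def is_vertex_cover(edges, c):
    """Every edge has at least one endpoint inside c."""
    return all(u in c or v in c for u, v in edges)

# S independent  <=>  V \ S is a vertex cover, checked over every subset
# of a 4-cycle.
vertices = {0, 1, 2, 3}
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
for r in range(len(vertices) + 1):
    for combo in itertools.combinations(vertices, r):
        s = set(combo)
        assert is_independent_set(edges, s) == is_vertex_cover(edges, vertices - s)
```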
HAMILTONIAN CYCLE: Does graph G contain a cycle visiting every vertex exactly once?
TSP (decision version): Given a weighted complete graph and budget B, is there a tour visiting every city with total cost ≤ B?
SUBSET SUM: Given a set of integers S and target t, is there a subset of S summing to exactly t?
GRAPH k-COLOURING: Can vertices be coloured with k colours so no two adjacent vertices share a colour? NP-complete for k ≥ 3; polynomial for k = 2.
SET COVER: Given universe U and a collection of subsets, can you cover U with at most k subsets?
These problems arise constantly in scheduling, logistics, networking, resource allocation, and VLSI design. Their NP-completeness motivates approximation algorithms.
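For contrast with the NP-complete k ≥ 3 case, 2-colouring is just bipartiteness testing, decidable by BFS in O(V + E); a sketch (function name is ours):

```python
from collections import deque

def two_colourable(adj):
    """BFS 2-colouring. adj maps each vertex to its neighbour list."""
    colour = {}
    for start in adj:
        if start in colour:
            continue
        colour[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in colour:
                    colour[v] = 1 - colour[u]   # opposite colour to u
                    queue.append(v)
                elif colour[v] == colour[u]:
                    return False                # odd cycle: not 2-colourable
    return True

assert two_colourable({0: [1], 1: [0, 2], 2: [1]})            # path: bipartite
assert not two_colourable({0: [1, 2], 1: [0, 2], 2: [0, 1]})  # triangle
```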
A problem H is NP-hard if every problem in NP reduces to H in polynomial time. An NP-hard problem does NOT need to be in NP itself.
| Problem | Why not in NP? |
|---|---|
| Halting problem | Undecidable — no TM decides it at all |
| QSAT (QBF) | PSPACE-complete; certificate may be exponential |
| TSP optimisation | Output is a tour (not yes/no); decision version is in NP |
| Counting SAT (#SAT) | Answer is a number, not a yes/no certificate |
The question: Does P = NP? Can every problem whose solution can be verified in polynomial time also be solved in polynomial time?
One of the seven Clay Mathematics Institute Millennium Prize Problems. $1,000,000 for a correct proof either way. Most complexity theorists believe P ≠ NP, but no proof exists.
"If P = NP, then the world would be a profoundly different place. Everyone who could appreciate a symphony would be Mozart." — Scott Aaronson
When exact solutions are NP-hard, we settle for solutions that are provably close to optimal in polynomial time.
| Problem | Technique | Ratio |
|---|---|---|
| Vertex Cover | Greedy edge matching | 2-approx |
| TSP (metric) | Christofides-Serdyukov | 3/2-approx |
| Set Cover | Greedy | O(ln n)-approx |
| MAX-SAT | Randomised rounding | 3/4-approx |
| Knapsack | FPTAS | (1 + ε) for any ε > 0 |
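The 2-approximation for Vertex Cover from the table is short enough to sketch: repeatedly pick an uncovered edge and take both its endpoints (a greedy maximal matching; function name is ours).

```python
def vertex_cover_2approx(edges):
    """Take both endpoints of each edge not yet covered."""
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:
            cover.update((u, v))      # this edge joins the matching
    return cover

# The matched edges are disjoint, and any optimal cover must contain at
# least one endpoint of each, so |cover| <= 2 * OPT.
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
cover = vertex_cover_2approx(edges)
assert all(u in cover or v in cover for u, v in edges)
```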
PTAS (Polynomial-Time Approximation Scheme): a (1 + ε)-approximation for any fixed ε > 0. Running time is polynomial in n but possibly exponential in 1/ε.
FPTAS (Fully Polynomial-Time Approximation Scheme): running time polynomial in both n and 1/ε. The gold standard for approximation.
Not all NP-hard problems can be approximated well. TSP (general, non-metric) cannot be approximated within any constant factor unless P = NP.
Instead of measuring complexity in input size n alone, we also consider a parameter k. A problem is fixed-parameter tractable (FPT) if solvable in time f(k) · nc for some computable f and constant c.
An O(2ᵏ · n) algorithm is practical when k is small, even if n is large. Classical complexity calls this "exponential" — parameterised complexity recognises it as tractable.
| Problem | Parameter k | Technique | Time |
|---|---|---|---|
| Vertex Cover | Cover size | Bounded search tree | O(2ᵏ · n) |
| Feedback Vertex Set | Solution size | Iterative compression | O(f(k) · poly(n)) |
| Longest Path | Path length | Colour coding | O(2^O(k) · poly(n)) |
| Treewidth | Treewidth | Dynamic programming | O(f(k) · n) |
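The bounded search tree for Vertex Cover can be sketched directly: any uncovered edge forces one of its two endpoints into the cover, giving a depth-k binary recursion and hence O(2ᵏ) branches (function name is ours):

```python
def vc_branch(edges, k):
    """Is there a vertex cover of size <= k? Bounded search tree, O(2^k * |E|)."""
    if not edges:
        return True                   # nothing left to cover
    if k == 0:
        return False                  # edges remain but budget is spent
    u, v = edges[0]
    # Any cover must contain u or v: branch on both choices.
    return (vc_branch([e for e in edges if u not in e], k - 1)
            or vc_branch([e for e in edges if v not in e], k - 1))

assert vc_branch([(0, 1), (1, 2), (2, 3), (3, 4)], 2)    # {1, 3} works
assert not vc_branch([(0, 1), (2, 3), (4, 5)], 2)        # 3 disjoint edges need 3
```

The recursion depth is at most k regardless of n, which is exactly the f(k) · poly(n) shape that makes the problem FPT.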
W[0] = FPT ⊆ W[1] ⊆ W[2] ⊆ … — analogous to the P/NP hierarchy but parameterised. k-CLIQUE is W[1]-complete, meaning it is unlikely to be FPT.
| Class | Definition | Key problems |
|---|---|---|
| L | Decidable in O(log n) space (deterministic) | Undirected connectivity |
| NL | Decidable in O(log n) space (nondeterministic) | Directed s-t connectivity (PATH) |
| PSPACE | Decidable in polynomial space | QSAT, generalised chess/Go |
| EXPTIME | Decidable in exponential time | Generalised chess (n×n) |
Savitch's theorem: NSPACE(f(n)) ⊆ DSPACE(f(n)²). Nondeterminism gives at most a quadratic blowup in space. In particular, NL ⊆ DSPACE(log² n).
PATH (directed s-t connectivity) is NL-complete under log-space reductions. It plays the same role in space complexity that SAT plays in time.
PSPACE captures problems where you can reuse space. Many two-player games are PSPACE-complete — the game tree is exponential but the space to evaluate it is polynomial.
| Source | Description |
|---|---|
| Sipser | Introduction to the Theory of Computation — the standard complexity textbook |
| Arora & Barak | Computational Complexity: A Modern Approach — comprehensive graduate reference |
| CLRS | Introduction to Algorithms — Ch. 34 covers NP-completeness |
| Garey & Johnson | Computers and Intractability — the classic NP-completeness catalogue |
| Complexity Zoo | complexityzoo.net — comprehensive class catalogue |
| Aaronson | scottaaronson.blog — accessible P vs NP writing |