COMPUTER SCIENCE FUNDAMENTALS SERIES

Computational
Complexity

Big-O · P vs NP · NP-completeness ·
Reductions · Approximation · Amortised analysis
Mid-level software engineer track  ·  20 slides
02

Asymptotic Notation — Big-O and Friends

We need a machine-independent way to describe how an algorithm's resource usage grows as input size n increases. Constants and lower-order terms are irrelevant at scale.

Notation | Meaning | Intuition
O(g(n)) | Asymptotic upper bound | f grows no faster than g
Ω(g(n)) | Asymptotic lower bound | f grows at least as fast as g
Θ(g(n)) | Tight bound | f grows at the same rate as g
o(g(n)) | Strict upper bound | f grows strictly slower than g
ω(g(n)) | Strict lower bound | f grows strictly faster than g

Formal definition (Big-O): f(n) = O(g(n)) if there exist constants c > 0 and n0 such that for all n ≥ n0: f(n) ≤ c · g(n).

Big-O describes the worst-case growth rate. It is an upper bound — saying an algorithm is O(n²) does not mean it always takes n² steps.
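As a sanity check, the formal definition can be exercised numerically. A minimal Python sketch (the function names f, g and the chosen constants are illustrative, not from any library) exhibiting witnesses c and n0 for 5n² + 3n + 7 = O(n²):

```python
# Illustrative check of the Big-O definition: find constants c and n0
# such that f(n) <= c * g(n) for all n >= n0.

def f(n):
    return 5 * n * n + 3 * n + 7   # the function being bounded

def g(n):
    return n * n                   # the bounding function

# c = 6, n0 = 5 works: 5n^2 + 3n + 7 <= 6n^2 once n^2 >= 3n + 7.
c, n0 = 6, 5
assert all(f(n) <= c * g(n) for n in range(n0, 10_000))
```

The witnesses are not unique — any larger c (with a suitable n0) also works, which is why constants are irrelevant to the classification.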

03

Common Complexity Classes

Class | Name | Example algorithm
O(1) | Constant | Hash table lookup
O(log n) | Logarithmic | Binary search
O(n) | Linear | Linear search, single pass
O(n log n) | Linearithmic | Merge sort, heap sort
O(n²) | Quadratic | Bubble sort, insertion sort
O(n³) | Cubic | Naive matrix multiplication
O(2ⁿ) | Exponential | Brute-force subset enumeration
O(n!) | Factorial | Brute-force permutations

A problem is tractable if it admits a polynomial-time algorithm — O(nᵏ) for some constant k. Exponential and factorial problems are intractable for large inputs.

04

Comparing Growth Rates

Concrete numbers at scale

n | log n | n | n log n | n² | 2ⁿ
10 | 3.3 | 10 | 33 | 100 | 1,024
100 | 6.6 | 100 | 664 | 10,000 | 1.27 × 10³⁰
1,000 | 10 | 1,000 | 10,000 | 10⁶ | ≈ 1.07 × 10³⁰¹
10⁶ | 20 | 10⁶ | 2 × 10⁷ | 10¹² | ≈ 10³⁰¹⁰³⁰

At n = 100, an O(2ⁿ) algorithm needs about 10³⁰ operations — at a billion operations per second, that is roughly 4 × 10¹³ years, far longer than the age of the universe. This is why polynomial vs exponential matters.

Rules of thumb

  • Drop constants and lower-order terms: 5n² + 3n + 7 = O(n²)
  • Nested loops multiply: O(n) loop inside O(n) loop = O(n²)
  • Sequential steps add: O(n) + O(n log n) = O(n log n)
  • Logarithm bases don't matter: log₂(n) = Θ(log₁₀(n))
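The multiply and add rules can be verified by counting operations directly. A small illustrative sketch (the function names are made up for this example):

```python
# Count basic operations to illustrate the composition rules.

def nested(n):
    ops = 0
    for _ in range(n):
        for _ in range(n):
            ops += 1          # inner body executes n * n times
    return ops                # => O(n^2): nested loops multiply

def sequential(n):
    ops = 0
    for _ in range(n):        # first O(n) pass
        ops += 1
    for _ in range(n):        # second O(n) pass
        ops += 1
    return ops                # 2n total => still O(n): steps add

assert nested(100) == 100 ** 2
assert sequential(100) == 2 * 100
```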
05

Amortised Analysis

Amortised analysis gives the average cost per operation over a worst-case sequence of operations. Not the same as average-case analysis (which assumes a probability distribution).

Aggregate Method

Compute total cost of n operations, divide by n.

Accounting Method

Assign an amortised "charge" per operation; overcharges build credit for expensive ops.

Potential Method

Define a potential function Φ on the data structure; amortised cost = actual cost + ΔΦ.

Classic example: dynamic array

A dynamic array doubles its capacity when full. Individual push operations are O(1) except when resizing occurs (O(n) copy). But n pushes cost at most 2n copies total.

Amortised cost per push: O(1)
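The "at most 2n copies" claim can be checked empirically. A minimal sketch of a doubling array that counts element copies (the class and attribute names are illustrative):

```python
# Doubling dynamic array instrumented to count copies made by resizes.
class DynArray:
    def __init__(self):
        self.cap = 1
        self.size = 0
        self.data = [None]
        self.copies = 0            # total elements moved during resizes

    def push(self, x):
        if self.size == self.cap:  # full: double capacity and copy
            self.cap *= 2
            new = [None] * self.cap
            for i in range(self.size):
                new[i] = self.data[i]
                self.copies += 1
            self.data = new
        self.data[self.size] = x
        self.size += 1

a = DynArray()
n = 1000
for i in range(n):
    a.push(i)
assert a.copies <= 2 * n           # 1 + 2 + 4 + ... + 512 = 1023 copies
```

The resize costs form a geometric series dominated by the last term, which is why the total stays linear — the aggregate method in action.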

Other examples: splay trees — O(log n) amortised • Fibonacci heaps — O(1) amortised decrease-key • binary counter increment — O(1) amortised

06

Decision Problems and Languages

Decision problems

A decision problem asks a yes/no question about an input. Complexity theory focuses on decision problems because they are the simplest to classify.

PRIME: "Is n a prime number?" — Yes/No

PATH: "Does graph G have a path from s to t?" — Yes/No

SAT: "Is there an assignment satisfying boolean formula φ?" — Yes/No

Languages

A language L over alphabet Σ is a set of strings. A decision problem maps to a language:

L = { x ∈ Σ* : the answer for x is YES }

A Turing machine M decides L if it halts on every input and accepts exactly the strings in L.

Complexity classes are sets of languages — they group decision problems by the resources (time, space) required by the Turing machines that decide them.

07

The Class P

Definition

P is the class of decision problems solvable by a deterministic Turing machine in polynomial time:

P = ⋃ₖ DTIME(nᵏ)

P captures the intuitive notion of "efficiently solvable" — problems where a fast algorithm exists.

Not all polynomial algorithms are fast in practice — O(n¹⁰⁰) is technically in P but useless. The importance of P is theoretical: it defines the boundary of tractability.

Problems in P

Problem | Time
Sorting | O(n log n)
Shortest path | O(E + V log V)
Primality (AKS) | O(n⁶)
Max matching | O(V³)
Linear programming | Polynomial
2-SAT | O(n + m)
Connectivity | O(V + E)
08

The Class NP

Definition

NP is the class of decision problems solvable by a nondeterministic Turing machine in polynomial time.

Equivalently: NP is the class of problems for which a YES answer can be verified in polynomial time given a suitable certificate.

Nondeterminism

A nondeterministic TM can "guess" a solution (explore all branches simultaneously). If any branch accepts in polynomial time, the machine accepts.

Key relationships

  • P ⊆ NP (a deterministic TM is a special case of a nondeterministic one)
  • coNP = { L : the complement of L is in NP }
  • Open questions: P = NP?   NP = coNP?

NP does NOT mean "not polynomial". It means "nondeterministic polynomial time". Many NP problems are in P.

09

NP — Verifiers and Certificates

A language L is in NP if there exists a polynomial-time verifier V such that:
L = { x : ∃ certificate c with |c| ≤ poly(|x|) and V(x, c) accepts }.

Problem | Certificate | Verification
SAT | A satisfying assignment | Evaluate formula — O(n)
CLIQUE | A set of k vertices | Check all pairs adjacent — O(k²)
HAMILTONIAN CYCLE | A permutation of vertices | Check each edge exists — O(n)
COMPOSITE | A non-trivial factor | Perform division — O(n²)
SUBSET SUM | A subset of numbers | Sum and compare — O(n)

The certificate is a "proof" that the answer is YES. The verifier checks this proof efficiently. The hard part is finding the certificate, not checking it.
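The CLIQUE row translates directly into code. A minimal sketch of a polynomial-time verifier (the adjacency-set representation and function names are assumptions for this illustration):

```python
from itertools import combinations

def verify_clique(adj, k, cert):
    """Check in O(k^2) adjacency tests that cert is a k-clique in adj."""
    if len(cert) != k:
        return False
    # Every pair of certificate vertices must be adjacent.
    return all(v in adj[u] for u, v in combinations(cert, 2))

# Triangle {0, 1, 2} plus a pendant vertex 3.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
assert verify_clique(adj, 3, [0, 1, 2])
assert not verify_clique(adj, 3, [1, 2, 3])   # 1 and 3 are not adjacent
```

Note the asymmetry: the verifier is a few lines, while finding a k-clique in the first place is NP-complete.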

10

Polynomial-Time Reductions

What is a reduction?

A polynomial-time reduction from problem A to problem B is a poly-time computable function f such that:

x ∈ A ⇔ f(x) ∈ B

Written: A ≤P B — "A reduces to B"

What reductions tell us

  • If A ≤P B and B ∈ P, then A ∈ P
  • If A ≤P B and A ∉ P, then B ∉ P
  • Reductions transfer hardness: "B is at least as hard as A"

Classic reduction chain

SAT → 3-SAT
3-SAT → CLIQUE
CLIQUE → VERTEX COVER
VERTEX COVER → HAM. CYCLE
HAM. CYCLE → TSP

Each arrow is a polynomial-time reduction. If any of these could be solved in poly-time, so could all of them.

11

NP-Completeness

Definition

A problem L is NP-complete if:

1. L ∈ NP

Solutions can be verified in polynomial time.

2. L is NP-hard

Every problem in NP reduces to L in polynomial time.

NP-complete problems are the hardest problems in NP. If any NP-complete problem has a polynomial-time algorithm, then P = NP.

Significance

  • NP-complete problems form the "boundary" of tractability
  • Thousands of natural problems are NP-complete
  • No polynomial-time algorithm is known for any of them
  • Strong evidence (but no proof) that none exists

Finding a problem NP-complete is practically useful: it tells you to stop looking for an exact efficient algorithm and instead consider approximation, heuristics, or parameterised solutions.

12

The Cook-Levin Theorem

Theorem: SAT is NP-complete. (Cook, 1971; Levin, independently, 1973)

This was the first problem proven NP-complete. The proof shows that any NP computation can be encoded as a boolean satisfiability instance — the universal reduction.

Proof sketch

Given any NP language L decided by a nondeterministic TM M in time p(n):

1 Encode M's computation tableau as boolean variables
2 Build clauses: valid start state, valid transitions, accepting final state
3 Formula is satisfiable ⇔ M accepts the input
4 Reduction runs in polynomial time — tableau size is poly(n)

Once SAT is NP-complete, all other NP-completeness proofs reduce from SAT (or from a problem already known to be NP-complete).

13

Classic NP-Complete Problems I

SAT (Boolean Satisfiability)

Given a boolean formula φ, is there an assignment of variables that makes φ true?

3-SAT

SAT restricted to CNF with exactly 3 literals per clause. Still NP-complete (by reduction from SAT). Surprisingly, 2-SAT is in P.

CLIQUE

Given graph G and integer k, does G contain a complete subgraph of size k?

INDEPENDENT SET

Given graph G and integer k, does G contain k pairwise non-adjacent vertices? (Complement of CLIQUE.)

VERTEX COVER

Given graph G and integer k, is there a set of at most k vertices touching every edge?

These three graph problems are tightly related: S is an independent set ⇔ V \ S is a vertex cover. Reductions between them are straightforward.
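The independent-set/vertex-cover correspondence can be checked directly. An illustrative sketch (the edge-list representation and function names are assumptions):

```python
def is_independent_set(edges, s):
    """No edge has both endpoints inside s."""
    return all(not (u in s and v in s) for u, v in edges)

def is_vertex_cover(edges, c):
    """Every edge has at least one endpoint inside c."""
    return all(u in c or v in c for u, v in edges)

vertices = {0, 1, 2, 3}
edges = [(0, 1), (1, 2), (2, 3)]   # a path on 4 vertices

s = {0, 2}                          # an independent set...
assert is_independent_set(edges, s)
assert is_vertex_cover(edges, vertices - s)   # ...whose complement covers
```

The reduction between the two decision problems is just this complementation, which is clearly computable in polynomial time.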

14

Classic NP-Complete Problems II

HAMILTONIAN CYCLE

Does graph G contain a cycle visiting every vertex exactly once?

TSP (decision version)

Given a weighted complete graph and budget B, is there a tour visiting every city with total cost ≤ B?

SUBSET SUM

Given a set of integers S and target t, is there a subset of S summing to exactly t?
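SUBSET SUM admits a pseudo-polynomial dynamic program running in O(n · t) time. This does not contradict NP-completeness: t can be exponential in the bit length of the input. A minimal sketch:

```python
def subset_sum(nums, t):
    """Return True iff some subset of nums sums to exactly t (t >= 0)."""
    reachable = [False] * (t + 1)
    reachable[0] = True                  # the empty subset sums to 0
    for x in nums:
        # Sweep backwards so each number is used at most once.
        for v in range(t, x - 1, -1):
            if reachable[v - x]:
                reachable[v] = True
    return reachable[t]

assert subset_sum([3, 34, 4, 12, 5, 2], 9)       # 4 + 5 = 9
assert not subset_sum([3, 34, 4, 12, 5, 2], 30)
```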

GRAPH COLOURING (k ≥ 3)

Can vertices be coloured with k colours so no two adjacent vertices share a colour? NP-complete for k ≥ 3; polynomial for k = 2.

SET COVER

Given universe U and a collection of subsets, can you cover U with at most k subsets?

These problems arise constantly in scheduling, logistics, networking, resource allocation, and VLSI design. Their NP-completeness motivates approximation algorithms.

15

NP-Hard Problems

Definition

A problem H is NP-hard if every problem in NP reduces to H in polynomial time. An NP-hard problem does NOT need to be in NP itself.

NP-hard but not in NP

Problem | Why not in NP?
Halting problem | Undecidable — no TM decides it at all
QSAT (QBF) | PSPACE-complete; certificates may need to be exponentially large
TSP optimisation | Output is a tour (not yes/no); the decision version is in NP
Counting SAT (#SAT) | Answer is a number, not a yes/no certificate

Relationship diagram

[Euler diagram: P sits inside NP; NP-complete = NP-hard ∩ NP; the Halting problem and QSAT are NP-hard but lie outside NP]
16

P vs NP — The Open Question

The question: Does P = NP? Can every problem whose solution can be verified in polynomial time also be solved in polynomial time?

One of the seven Clay Mathematics Institute Millennium Prize Problems. $1,000,000 for a correct proof either way. Most complexity theorists believe P ≠ NP, but no proof exists.

If P = NP

  • All NP-complete problems solvable efficiently
  • Cryptography based on factoring/discrete log breaks
  • Optimisation, scheduling, logistics become trivially solvable
  • Mathematical proof discovery becomes automatable

If P ≠ NP (proven)

  • Confirms inherently hard problems exist
  • Validates foundations of modern cryptography
  • Justifies study of approximation, heuristics, parameterised algorithms

"If P = NP, then the world would be a profoundly different place. Everyone who could appreciate a symphony would be Mozart." — Scott Aaronson

17

Approximation Algorithms

When exact solutions are NP-hard, we settle for solutions that are provably close to optimal in polynomial time.

Problem | Technique | Ratio
Vertex Cover | Greedy edge matching | 2-approx
TSP (metric) | Christofides-Serdyukov | 3/2-approx
Set Cover | Greedy | O(ln n)-approx
MAX-SAT | Randomised rounding | 3/4-approx
Knapsack | FPTAS | (1 + ε) for any ε > 0

PTAS

Polynomial-Time Approximation Scheme: (1+ε)-approx for any ε. Time polynomial in n but possibly exponential in 1/ε.

FPTAS

Fully polynomial-time approximation scheme: time polynomial in both n and 1/ε. The gold standard for approximation.

Not all NP-hard problems can be approximated well. TSP (general, non-metric) cannot be approximated within any constant factor unless P = NP.
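The greedy 2-approximation for Vertex Cover from the table can be sketched in a few lines (the edge-list representation and names are illustrative): repeatedly pick an uncovered edge and add both endpoints. The picked edges are vertex-disjoint, and any optimal cover must contain at least one endpoint of each, so the result is at most twice optimal.

```python
def vertex_cover_2approx(edges):
    """Greedy maximal matching: cover size is at most 2 * OPT."""
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:   # edge still uncovered
            cover.add(u)
            cover.add(v)
    return cover

edges = [(0, 1), (1, 2), (2, 3), (3, 0)]        # 4-cycle, OPT = 2
c = vertex_cover_2approx(edges)
assert all(u in c or v in c for u, v in edges)  # valid cover
assert len(c) <= 2 * 2                          # within factor 2 of OPT
```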

18

Parameterised Complexity

Instead of measuring complexity in input size n alone, we also consider a parameter k. A problem is fixed-parameter tractable (FPT) if solvable in time f(k) · nᶜ for some computable f and constant c.

An O(2ᵏ · n) algorithm is practical when k is small, even if n is large. Classical complexity calls this "exponential" — parameterised complexity recognises it as tractable.

Problem | Parameter k | Technique | Time
Vertex Cover | Cover size | Bounded search tree | O(2ᵏ · n)
Longest Path | Path length | Colour coding | O(2ᵏ · n²)
Problems on bounded-treewidth graphs | Treewidth | Dynamic programming | O(f(k) · n)
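The bounded search tree for Vertex Cover can be sketched as a short recursion (a simplified version for illustration; the edge-list representation and names are assumptions). Any cover must contain an endpoint of every edge, so pick an edge and branch on its two endpoints, spending one unit of budget each time. The recursion depth is at most k, giving at most 2ᵏ leaves.

```python
def vc_fpt(edges, k):
    """Return True iff the graph has a vertex cover of size <= k."""
    if not edges:
        return True                       # no edges left: covered
    if k == 0:
        return False                      # budget spent, edges remain
    u, v = edges[0]                       # some endpoint must be chosen
    rest_u = [e for e in edges if u not in e]   # take u into the cover
    rest_v = [e for e in edges if v not in e]   # take v into the cover
    return vc_fpt(rest_u, k - 1) or vc_fpt(rest_v, k - 1)

edges = [(0, 1), (1, 2), (2, 3), (3, 0)]  # 4-cycle: minimum cover is 2
assert vc_fpt(edges, 2)
assert not vc_fpt(edges, 1)
```

The exponential blowup is confined entirely to k — exactly the FPT pattern f(k) · poly(n).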

W-hierarchy

W[0] = FPT ⊆ W[1] ⊆ W[2] ⊆ … — analogous to the P/NP hierarchy but parameterised. k-CLIQUE is W[1]-complete, meaning it is unlikely to be FPT.

19

Space Complexity — PSPACE, L, NL

Class | Definition | Key problems
L | Decidable in O(log n) space (deterministic) | Undirected connectivity
NL | Decidable in O(log n) space (nondeterministic) | Directed s-t connectivity (PATH)
PSPACE | Decidable in polynomial space | QSAT, generalised Hex and Geography
EXPTIME | Decidable in exponential time | Generalised chess (n × n)

Known inclusions

L ⊆ NL ⊆ P ⊆ NP ⊆ PSPACE ⊆ EXPTIME

Savitch's Theorem

NSPACE(f(n)) ⊆ DSPACE(f(n)²) for f(n) ≥ log n. Nondeterminism gives at most a quadratic blowup in space. In particular, NL ⊆ DSPACE(log² n).

NL-completeness

PATH (directed s-t connectivity) is NL-complete under log-space reductions. It plays the same role in space complexity that SAT plays in time.

PSPACE captures problems where you can reuse space. Many two-player games are PSPACE-complete — the game tree is exponential but the space to evaluate it is polynomial.

20

Summary & Further Reading

Key takeaways

  • Asymptotic notation (O, Ω, Θ) provides machine-independent growth-rate descriptions
  • P captures efficiently solvable problems; NP captures efficiently verifiable problems
  • NP-completeness (Cook-Levin, reductions) identifies the hardest problems in NP
  • If any NP-complete problem is in P, then P = NP — the most important open question in CS
  • Approximation algorithms provide provably good polynomial-time solutions when exact solutions are intractable
  • Parameterised complexity isolates the source of exponential blowup
  • Space complexity (L, NL, PSPACE) adds another dimension to classification

Recommended reading

Source | Description
Sipser | Introduction to the Theory of Computation — the standard complexity textbook
Arora & Barak | Computational Complexity: A Modern Approach — comprehensive graduate reference
CLRS | Introduction to Algorithms — Ch. 34 covers NP-completeness
Garey & Johnson | Computers and Intractability — the classic NP-completeness catalogue
Complexity Zoo | complexityzoo.net — comprehensive class catalogue
Aaronson | scottaaronson.blog — accessible P vs NP writing