Pedagogical arc
01 — The book's map
The four pillars (regression, dimensionality reduction, density estimation, classification) and what each chapter contributes to them.
02 – 07 Part I — Foundations
Linear algebra, analytic geometry, matrix decompositions, vector calculus, probability and continuous optimisation — the toolkit Part II will use.
08 – 12 Part II — Four canonical problems
Linear regression as projection, PCA as eigenvectors of the covariance, GMMs trained with EM, and the kernel SVM — built directly from the Part I tools.
Presentations in this series
Part I — Mathematical Foundations
- 01 Introduction and Motivation → The four pillars of ML (regression, dimensionality reduction, density estimation, classification) and the four mathematical foundations (linear algebra, analytic geometry, vector calculus, probability). Interactive map of the book and the dependency graph between chapters.
- 02 Linear Algebra → Systems of linear equations, matrices, Gaussian elimination, vector spaces, basis & rank, linear & affine maps. Interactive Gaussian-elimination row stepper that solves $A\mathbf{x}=\mathbf{b}$ live.
- 03 Analytic Geometry → Norms, inner products, lengths & distances, angles, orthonormal bases, orthogonal complements, orthogonal projections, rotations. Interactive projection demo with the ⊥ residual drawn live.
- 04 Matrix Decompositions → Determinant & trace, eigendecomposition, Cholesky, the singular value decomposition, Eckart–Young low-rank approximation, the matrix-decomposition taxonomy. Interactive SVD image compressor.
- 05 Vector Calculus → Partial derivatives, the gradient, the Jacobian, gradients of matrix expressions, the chain rule, backpropagation as the chain rule on a DAG, automatic differentiation, multivariate Taylor series. Interactive gradient-descent visualiser on a 2D loss surface.
- 06 Probability and Distributions → Sample spaces, sum & product rules, Bayes' theorem, summary statistics, the Gaussian (with marginalisation and conditioning), conjugacy, the exponential family, change of variables. Interactive Bayes' theorem demo and bivariate Gaussian explorer.
- 07 Continuous Optimisation → Gradient descent (with and without momentum), constrained optimisation & Lagrange multipliers, convex sets & functions, linear & quadratic programming, the Lagrange dual. Interactive descent paths on a 2D loss surface with adjustable learning rate and momentum.
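As a flavour of the "live numerical check" idea, the Eckart–Young statement listed under Matrix Decompositions can be verified in a few lines. This is a minimal sketch that assumes NumPy is available; it is not taken from the slides, and the matrix `A`, the seed and the rank `k` are arbitrary illustrative choices.

```python
import numpy as np

# Arbitrary test matrix (illustrative only, not data from the book or the slides).
rng = np.random.default_rng(0)
A = rng.standard_normal((6, 4))

# Thin SVD: A = U @ diag(s) @ Vt, singular values in s sorted in descending order.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Rank-k truncation: keep only the k largest singular values.
k = 2
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Eckart-Young: the spectral-norm error equals the first discarded singular value,
# and the Frobenius-norm error equals the root sum of squares of all discarded ones.
spectral_error = np.linalg.norm(A - A_k, ord=2)
frobenius_error = np.linalg.norm(A - A_k, ord="fro")

print(np.isclose(spectral_error, s[k]))
print(np.isclose(frobenius_error, np.sqrt(np.sum(s[k:] ** 2))))
```

Both checks should agree up to floating-point tolerance. The same truncation is the idea behind an SVD image compressor, with `A` replaced by a matrix of pixel intensities.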
Part II — Central Machine Learning Problems
- 08 When Models Meet Data → Data & features, empirical risk minimisation, the bias-variance trade-off, regularisation, cross-validation, MLE vs MAP, directed graphical models. Interactive polynomial-fitting demo with live bias/variance and learning-curve readouts.
- 09 Linear Regression → Least squares as orthogonal projection onto the column space, ridge / MAP regularisation, Bayesian linear regression with posterior predictive distribution, feature maps. Interactive regression playground that updates the conjugate Gaussian posterior point-by-point.
- 10 Dimensionality Reduction with PCA → Variance-maximisation view, reconstruction-error view, eigenvectors of the covariance, low-rank approximation, PCA in high dimensions, probabilistic PCA, the latent-variable perspective. Interactive 2D → 1D PCA visualiser with live principal-axis fitting.
- 11 Density Estimation with GMMs → Mixture model likelihood, the EM algorithm derived from the latent-variable view, responsibilities, the lower bound, soft vs hard clustering, model-order selection. Interactive 2D EM animator with step-by-step E and M passes.
- 12 Classification with SVMs → Separating hyperplanes & the maximum-margin idea, primal hard-margin SVM, soft-margin slack, the Lagrange dual, support vectors, the kernel trick (linear, polynomial, RBF). Interactive 2D kernel SVM demo where you place points and watch the decision boundary change as you switch kernels.
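To illustrate how a Part II method traces back to a Part I tool, the "least squares as orthogonal projection" view from the Linear Regression entry can be checked numerically. This is a minimal sketch that assumes NumPy; the design matrix `X` and targets `y` are random placeholders, not data from the book or the slides.

```python
import numpy as np

# Random design matrix and targets (illustrative placeholders only).
rng = np.random.default_rng(1)
X = rng.standard_normal((20, 3))
y = rng.standard_normal(20)

# Least-squares fit: theta minimises ||y - X @ theta||^2.
theta, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ theta
residual = y - y_hat

# Projection matrix onto the column space of X: P = X (X^T X)^{-1} X^T.
P = X @ np.linalg.solve(X.T @ X, X.T)

print(np.allclose(P @ y, y_hat))       # fitted values are the projection of y
print(np.allclose(X.T @ residual, 0))  # residual is orthogonal to every column of X
print(np.allclose(P @ P, P))           # a projection is idempotent
```

The explicit matrix `P` is formed here only to make the geometry visible; `lstsq` solves the problem without building it.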
Why this companion? Deisenroth, Faisal & Ong's book sits at a rare intersection: it is honest mathematics — with axioms, theorems and proofs — written explicitly for ML readers. Most companions either re-derive the maths (and add nothing) or paraphrase the ML parts (and lose the rigour). This series instead visualises what the book proves: every abstract definition gets a draggable picture, every theorem gets a live numerical check, and every algorithm of Part II is traced back, slide-by-slide, to a theorem in Part I.
Read the book and the deck side by side — chapter and section numbers in the slides correspond directly to the book, so you can dive deeper into any proof at any time.