Calculus for Machine Learning
Introduction
This calculus course is structured around the mathematical foundations most relevant to machine learning, drawing from MIT 18.S096 (Matrix Calculus for ML), Stanford MATH51, and the Mathematics for Machine Learning textbook by Deisenroth, Faisal, and Ong. The material is organized into three sections, progressing from single-variable and multivariate differential calculus, through matrix calculus and automatic differentiation, to integral calculus and optimization.
The Three Sections
Section 1: Differential Calculus
Derivatives, gradients, and higher-order analysis in multiple dimensions.
Topics:
- Partial derivatives, gradient vectors, and directional derivatives
- The chain rule in single and multiple variables
- Jacobian and Hessian matrices, second-order conditions
- Taylor approximation: linearization and quadratic approximation
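As a concrete illustration of the first two topics, here is a minimal sketch (the function f and the test point are our own choices, not from the course) that compares an analytic gradient against a central-difference approximation and computes a directional derivative as a dot product:

```python
import numpy as np

# Example function f: R^2 -> R (an assumption for illustration).
def f(x):
    return x[0] ** 2 + 3.0 * x[0] * x[1]

# Analytic gradient: [2*x0 + 3*x1, 3*x0].
def grad_f(x):
    return np.array([2.0 * x[0] + 3.0 * x[1], 3.0 * x[0]])

# Central-difference approximation, one coordinate at a time.
def numerical_grad(f, x, h=1e-6):
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2.0 * h)
    return g

x = np.array([1.0, 2.0])
print(grad_f(x))             # [8. 3.]
print(numerical_grad(f, x))  # numerically close to [8. 3.]

# Directional derivative along a unit vector u is grad(f) . u.
u = np.array([1.0, 0.0])
print(grad_f(x) @ u)         # 8.0
```

This kind of finite-difference check is a standard way to validate hand-derived gradients before trusting them in an optimizer.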
Section 2: Matrix Calculus and Automatic Differentiation
Differentiating matrix expressions and computing gradients algorithmically.
Topics:
- Derivatives of trace, determinant, and inverse; layout conventions; common identities
- Forward-mode and reverse-mode automatic differentiation
- Computational graphs, JVP vs VJP, backpropagation
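The reverse-mode ideas above can be sketched with a tiny scalar autodiff class (our illustration, not a reference implementation from any of the cited courses). Each operation records a local backward rule on the computational graph; `backward()` then applies the chain rule in reverse topological order, which is exactly backpropagation:

```python
# Minimal reverse-mode automatic differentiation for scalars.
class Value:
    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents
        self._backward = lambda: None  # local backward rule, set by each op

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))
        def _backward():
            self.grad += out.grad   # d(a+b)/da = 1
            other.grad += out.grad  # d(a+b)/db = 1
        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))
        def _backward():
            self.grad += other.data * out.grad  # d(a*b)/da = b
            other.grad += self.data * out.grad  # d(a*b)/db = a
        out._backward = _backward
        return out

    def backward(self):
        # Topologically sort the graph, then run backward rules in reverse.
        order, seen = [], set()
        def visit(v):
            if v not in seen:
                seen.add(v)
                for p in v._parents:
                    visit(p)
                order.append(v)
        visit(self)
        self.grad = 1.0
        for v in reversed(order):
            v._backward()

x, y = Value(3.0), Value(4.0)
z = x * y + x          # dz/dx = y + 1 = 5, dz/dy = x = 3
z.backward()
print(x.grad, y.grad)  # 5.0 3.0
```

One reverse pass computes the gradient with respect to all inputs at once, which is why reverse mode (a VJP) is preferred when a function has many inputs and one scalar output, as in ML loss functions.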
Section 3: Integral Calculus and Optimization
Integration for probabilistic ML and calculus-based optimization theory.
Topics:
- Computing expectations, marginalization, normalizing constants, and Monte Carlo integration
- First- and second-order optimality conditions; convexity via the Hessian
- Gradient descent convergence, Newton's method, Lagrange multipliers
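Two of these topics fit in a few lines each. Below is a sketch (the integrand, sample size, step size, and objective are all assumptions chosen for illustration) of Monte Carlo estimation of an expectation and of gradient descent on a simple quadratic:

```python
import numpy as np

# Monte Carlo integration: estimate E[f(X)] under X ~ N(0, 1) by
# averaging f over samples. Here f(x) = x^2, so the true value is
# Var(X) = 1.
rng = np.random.default_rng(0)
samples = rng.standard_normal(200_000)
estimate = np.mean(samples ** 2)
print(estimate)  # close to 1.0

# Gradient descent on f(x) = (x - 3)^2, whose gradient is 2(x - 3).
x, lr = 0.0, 0.1
for _ in range(100):
    x -= lr * 2.0 * (x - 3.0)
print(x)  # converges to the minimizer x* = 3
```

On this quadratic the iteration contracts the error by a constant factor (0.8 per step with this step size), a simple concrete case of the linear convergence rates studied in the gradient descent analysis.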