Integration for ML
The Central Question: How Do We Compute the Sums and Averages That Probability Requires?
Probability and statistics are built on integration: expectations are integrals, marginalizations require integrating out variables, and normalizing constants ensure densities integrate to one. Many of these integrals have no closed form, making approximation methods essential.
Consider these scenarios:
- The posterior in Bayesian inference is $p(\theta \mid \mathcal{D}) = p(\mathcal{D} \mid \theta)\,p(\theta) / p(\mathcal{D})$. The denominator $p(\mathcal{D}) = \int p(\mathcal{D} \mid \theta)\,p(\theta)\,d\theta$ is often an intractable integral.
- Computing $\mathbb{E}[f(X)]$ for a complex distribution requires evaluating $\int f(x)\,p(x)\,dx$. Monte Carlo approximation replaces this integral with a sample average.
- In normalizing flows, the change-of-variables formula transforms integrals via the Jacobian determinant, connecting integration to linear algebra.
Integration is the bridge between probability models and computable quantities.
Topics to Cover
Computing Expectations
- $\mathbb{E}[f(X)] = \int f(x)\,p(x)\,dx$ (continuous) or $\mathbb{E}[f(X)] = \sum_x f(x)\,p(x)$ (discrete)
- Linearity of expectation
- Expectations of common distributions
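As a quick numerical sketch of these ideas (the standard normal and the functions $f(x) = x^2$ and $2x + 3$ are illustrative choices, not from the text), a sample average approximates the expectation integral, and linearity can be checked directly:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, scale=1.0, size=200_000)  # draws from X ~ N(0, 1)

# E[X^2] = Var(X) + E[X]^2 = 1 for a standard normal
mc_second_moment = np.mean(x**2)

# Linearity of expectation: E[2X + 3] = 2*E[X] + 3 = 3
mc_linear = np.mean(2 * x + 3)

print(mc_second_moment)  # close to 1
print(mc_linear)         # close to 3
```

Both estimates are sample averages standing in for integrals against the density, which is exactly the Monte Carlo idea developed later in this section.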
Marginalization
- $p(x) = \int p(x, z)\,dz$: integrating out variables
- Applications in latent variable models
- Connection to the sum rule of probability
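A minimal sketch of marginalization on a grid, assuming an illustrative latent-variable model $z \sim N(0,1)$, $x \mid z \sim N(z, 1)$ (chosen here because the marginal is known in closed form, $x \sim N(0, 2)$, so the numerical integral can be checked):

```python
import numpy as np

def normal_pdf(v, mean, var):
    return np.exp(-(v - mean) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

z = np.linspace(-10, 10, 4001)   # grid over the latent variable
x0 = 0.7                          # query point for the marginal

# Joint p(x0, z) = p(z) * p(x0 | z), evaluated along the grid
joint = normal_pdf(z, 0.0, 1.0) * normal_pdf(x0, z, 1.0)

# p(x0) = ∫ p(x0, z) dz, approximated by a Riemann sum
p_x0 = np.sum(joint) * (z[1] - z[0])

exact = normal_pdf(x0, 0.0, 2.0)  # known marginal: N(0, 2)
print(p_x0, exact)
```

Grid-based marginalization like this only scales to a handful of latent dimensions, which is one reason Monte Carlo methods (below) matter.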
Normalizing Constants
- $Z = \int \tilde{p}(x)\,dx$, where $p(x) = \tilde{p}(x)/Z$
- Partition functions in exponential families and energy-based models
- When $Z$ is tractable vs. intractable
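As a sketch of computing a normalizer numerically (the unnormalized density $\tilde{p}(x) = e^{-x^4}$ is an illustrative choice; its exact normalizer $Z = 2\,\Gamma(5/4)$ gives a reference value):

```python
import numpy as np
from math import gamma

# Unnormalized density p̃(x) = exp(-x^4); exact normalizer is Z = 2·Γ(5/4)
x = np.linspace(-5, 5, 100_001)
p_tilde = np.exp(-x**4)

Z = np.sum(p_tilde) * (x[1] - x[0])  # numerical ∫ p̃(x) dx
p = p_tilde / Z                       # normalized density

print(Z, 2 * gamma(1.25))             # both ≈ 1.8128
print(np.sum(p) * (x[1] - x[0]))      # ≈ 1 after normalizing
```

In one dimension $Z$ is a cheap quadrature; for energy-based models over high-dimensional $x$, the same integral becomes the intractable partition function mentioned above.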
Monte Carlo Integration
- Basic Monte Carlo: $\mathbb{E}[f(X)] \approx \frac{1}{N}\sum_{i=1}^{N} f(x_i)$, where $x_i \sim p(x)$
- Convergence rate: $O(1/\sqrt{N})$ regardless of dimension
- Importance sampling: $\mathbb{E}_p[f(X)] = \mathbb{E}_q\!\left[f(X)\,\tfrac{p(X)}{q(X)}\right] \approx \frac{1}{N}\sum_{i=1}^{N} f(x_i)\,\frac{p(x_i)}{q(x_i)}$, where $x_i \sim q(x)$
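The two estimators above can be sketched side by side. This is a minimal illustration, assuming target $p = N(0,1)$, proposal $q = N(0,4)$, and $f(x) = x^2$ (so the true value $\mathbb{E}_p[X^2] = 1$ is known):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 100_000

def p_pdf(x):  # target density: N(0, 1)
    return np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)

def q_pdf(x):  # proposal density: N(0, 4)
    return np.exp(-x**2 / 8) / np.sqrt(8 * np.pi)

f = lambda x: x**2  # E_p[X^2] = 1

# Basic Monte Carlo: sample from p directly, average f
xs = rng.normal(0, 1, N)
basic = np.mean(f(xs))

# Importance sampling: sample from q, reweight each sample by p/q
ys = rng.normal(0, 2, N)
weights = p_pdf(ys) / q_pdf(ys)
importance = np.mean(f(ys) * weights)

print(basic, importance)  # both close to 1
```

The proposal here is deliberately wider than the target; a proposal with lighter tails than $p$ can make the weights $p/q$ blow up and the estimator's variance explode.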
Change of Variables with the Jacobian
- Connection to determinants: for an invertible map $y = g(x)$, $p_Y(y) = p_X(g^{-1}(y))\,\left|\det J_{g^{-1}}(y)\right|$, where the Jacobian determinant measures local volume change
- Application to probability density transformation
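A small sketch of the change-of-variables formula in one dimension, using the illustrative transform $Y = \exp(X)$ with $X \sim N(0,1)$ (so $Y$ is lognormal and the Jacobian of the inverse is simply $1/y$):

```python
import numpy as np

def p_x(x):  # density of X ~ N(0, 1)
    return np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)

# Transform Y = exp(X). Inverse: x = log(y); Jacobian of the inverse: dx/dy = 1/y.
def p_y(y):
    return p_x(np.log(y)) * (1.0 / y)  # p_Y(y) = p_X(g^{-1}(y)) · |d g^{-1}/dy|

# Sanity check: the transformed density still integrates to (approximately) 1
y = np.linspace(1e-6, 60, 600_001)
total = np.sum(p_y(y)) * (y[1] - y[0])
print(total)  # ≈ 1
```

This is the same mechanics a normalizing flow uses, just with the scalar derivative replaced by a Jacobian determinant in higher dimensions.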
Summary
Answering the Central Question: Integration is how we compute expectations ($\mathbb{E}[f(X)] = \int f(x)\,p(x)\,dx$), marginalize out variables ($p(x) = \int p(x, z)\,dz$), and normalize densities ($Z = \int \tilde{p}(x)\,dx$). When these integrals are intractable, Monte Carlo methods approximate them with sample averages that converge at $O(1/\sqrt{N})$. The change-of-variables formula with the Jacobian determinant allows us to transform integrals between coordinate systems, which is fundamental to normalizing flows and density estimation.
Applications in Data Science and Machine Learning
- Bayesian inference: Computing posterior distributions requires marginalizing over parameters (intractable integrals motivate MCMC and variational inference)
- Expectation-maximization (EM): The E-step computes expected sufficient statistics, requiring integration over latent variables
- Normalizing flows: The change-of-variables formula with Jacobian determinants enables tractable density estimation
- Monte Carlo methods: MCMC, importance sampling, and particle filters approximate intractable integrals
- Variational inference: The ELBO is an expectation that lower-bounds the log-evidence
Guided Problems
References
- Deisenroth, Faisal, and Ong - Mathematics for Machine Learning, Chapter 6.3
- Bishop, Christopher - Pattern Recognition and Machine Learning, Chapter 11 (Sampling Methods)
- Murphy, Kevin - Machine Learning: A Probabilistic Perspective, Chapter 24 (Monte Carlo Methods)
- Papamakarios et al. - Normalizing Flows for Probabilistic Modeling and Inference (2021)