Positive Definite Matrices
The Central Question: When Does a Quadratic Form Always Curve Upward?
The quadratic form $f(x) = x^T A x$ generalizes the idea of $ax^2$ to multiple dimensions. When is this bowl-shaped (always positive for $x \neq 0$), guaranteeing a unique minimum? Positive definiteness answers this, connecting eigenvalue signs, pivot signs, Cholesky factorization, and the curvature of loss functions in optimization.
Topics to Cover
Quadratic Forms and Geometry
- The quadratic form $f(x) = x^T A x$ and its graph
- Positive definite = bowl (minimum), negative definite = dome (maximum), indefinite = saddle
- Connection to second-derivative test: Hessian matrix
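The bowl/dome/saddle trichotomy can be checked numerically from eigenvalue signs. A minimal NumPy sketch (the function name and example matrices are illustrative, not from the source):

```python
import numpy as np

def classify_quadratic_form(A):
    """Classify symmetric A (and hence f(x) = x^T A x) by eigenvalue signs."""
    eigvals = np.linalg.eigvalsh(A)  # real eigenvalues, ascending order
    if np.all(eigvals > 0):
        return "positive definite (bowl: unique minimum)"
    if np.all(eigvals < 0):
        return "negative definite (dome: unique maximum)"
    if eigvals[0] < 0 < eigvals[-1]:
        return "indefinite (saddle)"
    return "semi-definite (flat directions)"

bowl   = np.array([[2.0, 0.0], [0.0, 3.0]])   # eigenvalues 2, 3
saddle = np.array([[1.0, 0.0], [0.0, -1.0]])  # eigenvalues 1, -1
print(classify_quadratic_form(bowl))    # positive definite (bowl: unique minimum)
print(classify_quadratic_form(saddle))  # indefinite (saddle)
```

For a twice-differentiable loss, applying this to the Hessian at a critical point is exactly the multivariable second-derivative test.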
Tests for Positive Definiteness
- Five equivalent conditions (for symmetric $A$):
  - All eigenvalues $\lambda_i > 0$
  - All upper-left determinants (leading principal minors) $> 0$
  - All pivots $> 0$
  - $x^T A x > 0$ for all $x \neq 0$ (energy test)
  - $A = R^T R$ for some matrix $R$ with independent columns (Cholesky)
- Proving the equivalence chain
- Positive semi-definite: $x^T A x \geq 0$ everywhere (allows zero eigenvalues)
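The equivalence of the tests can be spot-checked in NumPy. A sketch under my own naming (`pd_tests` is not from the source); a Cholesky attempt stands in for the pivot test, since the factorization succeeds exactly when all pivots are positive:

```python
import numpy as np

def pd_tests(A):
    """Run equivalent positive-definiteness tests on a symmetric matrix A."""
    n = A.shape[0]
    results = {}
    results["eigenvalues > 0"] = bool(np.all(np.linalg.eigvalsh(A) > 0))
    results["leading minors > 0"] = all(
        np.linalg.det(A[:k, :k]) > 0 for k in range(1, n + 1))
    try:
        np.linalg.cholesky(A)           # succeeds iff A is positive definite,
        results["cholesky exists"] = True   # which also certifies positive pivots
    except np.linalg.LinAlgError:
        results["cholesky exists"] = False
    rng = np.random.default_rng(0)      # sampled energy test x^T A x > 0
    xs = rng.standard_normal((100, n))
    energies = np.einsum("ij,jk,ik->i", xs, A, xs)
    results["energy > 0 (sampled)"] = bool(np.all(energies > 0))
    return results

A = np.array([[2.0, -1.0], [-1.0, 2.0]])   # positive definite
print(pd_tests(A))                          # every test returns True
```

On a symmetric input all entries of the returned dict agree, which is the content of the equivalence chain.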
Cholesky Decomposition (Deeper Treatment)
- $A = LL^T$ with $L$ lower triangular: the "square root" of a positive definite matrix
- Why it exists (positive pivots guarantee no zero divisions)
- Cost: about $\frac{1}{3}n^3$ flops, half the cost of LU
- Numerical stability: no pivoting needed
- Cross-reference to Matrix Operations for the introductory treatment
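A sketch of Cholesky in practice: factor once, then solve $Ax = b$ with two cheap triangular solves. The example matrix is my own; `np.linalg.cholesky` returns the lower-triangular factor:

```python
import numpy as np

A = np.array([[4.0, 2.0], [2.0, 3.0]])   # symmetric positive definite
b = np.array([1.0, 2.0])

L = np.linalg.cholesky(A)                # lower triangular with positive diagonal
assert np.allclose(L @ L.T, A)           # A = L L^T

y = np.linalg.solve(L, b)                # forward substitution:  L y = b
x = np.linalg.solve(L.T, y)              # back substitution:     L^T x = y
assert np.allclose(A @ x, b)
print(x)                                 # [-0.125  0.75]
```

In production code `scipy.linalg.cho_factor`/`cho_solve` would exploit the triangular structure directly; the NumPy version above just makes the two-solve pattern explicit.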
The Gram Matrix $A^T A$
- Always positive semi-definite (proof via energy test: $x^T (A^T A) x = \|Ax\|^2 \geq 0$)
- Positive definite iff $A$ has independent columns (trivial nullspace)
- Central object: normal equations, covariance matrices, kernel matrices
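Both halves of the Gram-matrix claim can be demonstrated numerically. A sketch with my own example data (a random Gaussian matrix has independent columns with probability 1):

```python
import numpy as np

rng = np.random.default_rng(1)

# Independent columns: the Gram matrix A^T A is positive definite.
A = rng.standard_normal((5, 3))
G = A.T @ A
assert np.all(np.linalg.eigvalsh(G) > 0)

# Duplicate a column so the nullspace is nontrivial: only semi-definite.
B = np.column_stack([A[:, 0], A[:, 0], A[:, 1]])
eigs = np.linalg.eigvalsh(B.T @ B)       # ascending order
assert np.isclose(eigs[0], 0.0)          # zero eigenvalue from the dependence
assert np.all(eigs >= -1e-12)            # still no negative eigenvalues
print(eigs)
```

The zero eigenvalue of $B^T B$ corresponds exactly to the nullspace vector $(1, -1, 0)$ created by the repeated column.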
Rayleigh Quotient and Min-Max Principles
- Rayleigh quotient: $R(x) = \frac{x^T A x}{x^T x}$
- $\lambda_{\min} \leq R(x) \leq \lambda_{\max}$ for all $x \neq 0$
- Min-max (Courant-Fischer): variational characterization of every eigenvalue
- Interlacing theorem (eigenvalues of submatrices)
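The Rayleigh-quotient bounds are easy to verify by sampling, and the extremes are attained at the eigenvectors. A sketch with an illustrative $2 \times 2$ matrix of my choosing:

```python
import numpy as np

def rayleigh(A, x):
    """Rayleigh quotient R(x) = (x^T A x) / (x^T x)."""
    return (x @ A @ x) / (x @ x)

A = np.array([[2.0, 1.0], [1.0, 2.0]])   # eigenvalues 1 and 3
lam = np.linalg.eigvalsh(A)

rng = np.random.default_rng(0)
for _ in range(100):                      # R(x) stays inside [lambda_min, lambda_max]
    x = rng.standard_normal(2)
    r = rayleigh(A, x)
    assert lam[0] - 1e-12 <= r <= lam[-1] + 1e-12

# The bounds are attained at the eigenvectors.
print(rayleigh(A, np.array([1.0, -1.0])))  # 1.0  (lambda_min)
print(rayleigh(A, np.array([1.0, 1.0])))   # 3.0  (lambda_max)
```

This variational viewpoint is what Courant-Fischer generalizes to the intermediate eigenvalues.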
Ellipsoids and Principal Axes
- $x^T A x = 1$ defines an ellipsoid
- Eigenvectors = axis directions, $1/\sqrt{\lambda_i}$ = axis lengths
- Condition number $\kappa = \lambda_{\max}/\lambda_{\min}$ measures the elongation of the ellipsoid
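The axis geometry can be read directly off the eigendecomposition. A sketch with an example matrix of my own whose ellipse is deliberately elongated:

```python
import numpy as np

# Level set x^T A x = 1: axes come from the eigendecomposition of A.
A = np.array([[5.0, 4.0], [4.0, 5.0]])   # eigenvalues 1 and 9
lam, V = np.linalg.eigh(A)               # ascending eigenvalues, orthonormal eigenvectors

axis_lengths = 1.0 / np.sqrt(lam)        # semi-axis along eigenvector v_i is 1/sqrt(lambda_i)
kappa = lam[-1] / lam[0]                 # condition number lambda_max / lambda_min

print(axis_lengths)                      # [1.0, 0.333...]: long axis 3x the short axis
print(kappa)                             # 9.0
```

For a symmetric positive definite matrix this $\kappa$ agrees with `np.linalg.cond(A)`, and a large $\kappa$ is precisely the long narrow valley that slows gradient descent.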
Summary
Answering the Central Question: A quadratic form $f(x) = x^T A x$ always curves upward (is always positive for $x \neq 0$) exactly when $A$ is positive definite. Five equivalent tests characterize this: all eigenvalues positive, all pivots positive, all leading principal minors positive, $A = R^T R$ for some $R$ with independent columns, and $x^T A x > 0$ for all nonzero $x$. Cholesky factorization ($A = LL^T$) is the computational signature of positive definiteness.
Applications in Data Science and Machine Learning
- Optimization: Hessian positive definite ⇔ strict local minimum; condition number controls convergence speed of gradient descent
- Covariance matrices: always PSD; eigenvalues = variance along principal axes
- Kernel methods: kernel matrix must be PSD (Mercer's condition)
- Gaussian processes: covariance matrix must be PSD; Cholesky used for sampling and log-likelihood
- Regularization: $A^T A + \lambda I$ is always positive definite for $\lambda > 0$ (Ridge regression makes the bowl rounder)
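The regularization point can be sketched directly: a rank-deficient design matrix makes $A^T A$ merely semi-definite, and the Ridge shift $\lambda I$ restores strict positive definiteness. Example data and the value of $\lambda$ are my own:

```python
import numpy as np

rng = np.random.default_rng(0)

# Rank-deficient design matrix: A^T A is singular, hence only PSD.
A = rng.standard_normal((10, 3))
A[:, 2] = A[:, 0] + A[:, 1]              # dependent third column
G = A.T @ A
assert np.isclose(np.linalg.eigvalsh(G)[0], 0.0)   # zero eigenvalue

# Ridge shift: eigenvalues become sigma_i^2 + lambda > 0, so A^T A + lambda*I
# is positive definite and the normal equations have a unique solution.
lam = 0.1
G_ridge = G + lam * np.eye(3)
assert np.all(np.linalg.eigvalsh(G_ridge) > 0)
np.linalg.cholesky(G_ridge)              # succeeds: positive definiteness certified
print("ridge system is positive definite")
```

The same shift is why Ridge regression has a closed-form solution even when ordinary least squares does not.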
Guided Problems
References
- Strang, Introduction to Linear Algebra, Chapter 6 (6.1–6.2)
- Strang, Linear Algebra and Its Applications, Chapter 6