Orthogonality and Projections
The Central Question: What Is the Closest Point in a Subspace?
Given a vector $b$ that does not lie in a subspace $S$, what point in $S$ is nearest to $b$? The answer is the orthogonal projection: drop a perpendicular from $b$ onto $S$. This geometric idea underlies least squares regression (project $b$ onto the column space of $A$), Gram-Schmidt orthogonalization, and the orthogonal complement relationships among the four fundamental subspaces.
Topics to Cover
Orthogonal Vectors and Subspaces
- Definition: $v$ and $w$ are orthogonal when $v^T w = 0$
- Orthogonal complements: $V^\perp$ = all vectors perpendicular to $V$
- Orthogonality of the four fundamental subspaces:
- Row space $\perp$ Nullspace (both in $\mathbb{R}^n$)
- Column space $\perp$ Left nullspace (both in $\mathbb{R}^m$)
- Why this orthogonality is the geometric backbone of least squares
- Cross-reference to The Four Fundamental Subspaces
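The row space ⊥ nullspace relationship can be checked numerically. A minimal sketch with a hypothetical rank-2 matrix (the matrix and nullspace vector are chosen for illustration, not taken from the source):

```python
import numpy as np

# Hypothetical 3x3 matrix of rank 2 (rows 1 and 2 are dependent).
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [1.0, 0.0, 1.0]])

# A vector in the nullspace: A @ n = 0.
n = np.array([1.0, 1.0, -1.0])

# Each entry of A @ n is a (row of A) . n dot product,
# so A @ n = 0 says every row is perpendicular to n:
# the row space is orthogonal to the nullspace.
assert np.allclose(A @ n, 0)
```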
Orthogonal Bases and Orthonormal Bases
- Orthogonal basis: mutually perpendicular, any length
- Orthonormal basis: mutually perpendicular, unit length
- Why orthonormal bases are computationally ideal: $b = \sum_i (q_i^T b)\, q_i$ (no system to solve)
- Orthogonal matrices: $Q^T Q = I$, $Q^{-1} = Q^T$
- Cross-reference to Special Matrices: Orthogonal
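The "no system to solve" point can be seen directly: with orthonormal columns, expansion coefficients are plain dot products $q_i^T b$. A short sketch with an orthonormal basis chosen for illustration:

```python
import numpy as np

# Orthonormal basis for a plane in R^3 (columns of Q), chosen for illustration.
Q = np.array([[1.0,  1.0],
              [1.0, -1.0],
              [0.0,  0.0]]) / np.sqrt(2)

assert np.allclose(Q.T @ Q, np.eye(2))   # Q^T Q = I

# A vector lying in the plane spanned by the columns of Q:
b = np.array([3.0, 4.0, 0.0])

# Coefficients are just dot products c_i = q_i^T b -- no linear system.
coeffs = Q.T @ b
assert np.allclose(Q @ coeffs, b)        # b is exactly recovered
```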
Projection onto a Line
- Projection of $b$ onto the line through $a$: $p = \dfrac{a^T b}{a^T a}\, a$
- Projection matrix: $P = \dfrac{a a^T}{a^T a}$
- Error $e = b - p$ is perpendicular to $a$ (the key idea)
- Geometric picture: dropping a perpendicular
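The line-projection formula and the perpendicularity of the error are easy to verify numerically. A minimal sketch (the vectors $a$ and $b$ are illustrative choices):

```python
import numpy as np

# Project b onto the line through a.
a = np.array([1.0, 2.0, 2.0])
b = np.array([3.0, 0.0, 3.0])

p = (a @ b) / (a @ a) * a        # p = (a^T b / a^T a) a
e = b - p                        # error vector

assert np.isclose(a @ e, 0)      # e is perpendicular to a: the key idea

# The rank-one projection matrix P = a a^T / a^T a gives the same point:
P = np.outer(a, a) / (a @ a)
assert np.allclose(P @ b, p)
```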
Projection onto a Subspace
- Projection of $b$ onto the column space of $A$: $p = A\hat{x}$
- Derivation from $e = b - A\hat{x} \perp C(A)$, i.e., $A^T(b - A\hat{x}) = 0$
- Normal equations: $A^T A \hat{x} = A^T b$
- Projection matrix: $P = A(A^T A)^{-1} A^T$
- Properties: $P^2 = P$ (idempotent), $P^T = P$ (symmetric)
- $I - P$ projects onto the orthogonal complement
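The whole chain above, from the normal equations through the properties of $P$, can be checked in a few lines. A sketch using a small matrix with independent columns (an illustrative choice, not from the source):

```python
import numpy as np

# A has independent columns; project b onto C(A).
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([6.0, 0.0, 0.0])

# Normal equations: A^T A x_hat = A^T b
x_hat = np.linalg.solve(A.T @ A, A.T @ b)
p = A @ x_hat                          # p = A x_hat, the projection of b

# Projection matrix P = A (A^T A)^{-1} A^T and its defining properties:
P = A @ np.linalg.inv(A.T @ A) @ A.T
assert np.allclose(P @ P, P)           # idempotent: P^2 = P
assert np.allclose(P.T, P)             # symmetric: P^T = P
assert np.allclose(P @ b, p)

# Error is perpendicular to the column space: A^T (b - p) = 0,
# and (I - P) b = b - p lands in the orthogonal complement.
assert np.allclose(A.T @ (b - p), 0)
```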
Summary
Answering the Central Question: The closest point in a subspace $S$ to a vector $b$ is the orthogonal projection $p = Pb$, where $P = A(A^T A)^{-1}A^T$ is the projection matrix built from any matrix $A$ whose columns span $S$. The error $e = b - p$ is perpendicular to $S$, making $\|e\|$ the minimum possible distance. This is equivalent to solving the normal equations $A^T A \hat{x} = A^T b$, which is exactly the least squares problem.
Applications in Data Science and Machine Learning
- Linear regression as projection: $\hat{y} = X\hat{\beta}$ projects the response vector $y$ onto the column space of the feature matrix $X$
- Residuals: $e = y - \hat{y}$ lives in the left nullspace of $X$, the part of $y$ unexplained by the features
- Dimensionality reduction: projection onto the top-$k$ principal subspace
- Signal vs. noise decomposition: projecting data onto a signal subspace; the residual is noise
- Feature orthogonalization: removing collinearity by projecting out shared directions
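The regression-as-projection view can be sketched with synthetic data: the fitted values are the projection of $y$ onto $C(X)$, and the residual satisfies $X^T e = 0$ (i.e., it lives in the left nullspace). The data below is invented for illustration:

```python
import numpy as np

# Synthetic regression data: intercept column plus one feature.
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(20), rng.normal(size=20)])
y = 2.0 + 3.0 * X[:, 1] + rng.normal(scale=0.5, size=20)

# Least squares fit; y_hat is the projection of y onto C(X).
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ beta
e = y - y_hat                    # residual

# The residual is orthogonal to every column of X (left nullspace):
# this is the part of y the features cannot explain.
assert np.allclose(X.T @ e, 0)
```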
Guided Problems
References
- Strang, Introduction to Linear Algebra, Chapter 4 (4.1–4.2)