Linear Algebra – Orthogonality and least squares – Orthogonal projections
Theorem: If \(W\) is a subspace of \(\mathbb{R}^n\), then each vector \(\mathbf{y}\in\mathbb{R}^n\) can be written uniquely in the form \(\mathbf{y}=\hat{\mathbf{y}}+\mathbf{z}\), where \(\hat{\mathbf{y}}\in W\) and \(\mathbf{z}\in W^{\perp}\). If \(\{\mathbf{v}_1,\ldots,\mathbf{v}_p\}\) is an orthogonal basis of \(W\), then we have:
\[\hat{\mathbf{y}}=\text{proj}_{W}\mathbf{y}=\left(\frac{\mathbf{y}\cdot\mathbf{v}_1}{\mathbf{v}_1\cdot\mathbf{v}_1}\right)\mathbf{v}_1+\cdots +\left(\frac{\mathbf{y}\cdot\mathbf{v}_p}{\mathbf{v}_p\cdot\mathbf{v}_p}\right)\mathbf{v}_p.\]This is the (orthogonal) projection of \(\mathbf{y}\) onto \(W\).
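As a minimal numerical sketch of this formula (the vectors \(\mathbf{v}_1\), \(\mathbf{v}_2\), and \(\mathbf{y}\) below are chosen purely for illustration), the projection can be computed term by term with NumPy:

```python
import numpy as np

def proj_onto(y, basis):
    """Orthogonal projection of y onto span(basis); the basis vectors
    are assumed to be mutually orthogonal and nonzero."""
    return sum((y @ v) / (v @ v) * v for v in basis)

# Illustrative subspace W = span{v1, v2} in R^3 with v1 ⊥ v2
v1 = np.array([1.0, 1.0, 0.0])
v2 = np.array([1.0, -1.0, 0.0])
y = np.array([2.0, 3.0, 5.0])

y_hat = proj_onto(y, [v1, v2])
z = y - y_hat
# y_hat is [2. 3. 0.]; z = y - y_hat is orthogonal to both basis vectors
print(y_hat, z @ v1, z @ v2)
```

Here \(\mathbf{z}=\mathbf{y}-\hat{\mathbf{y}}\) lies in \(W^{\perp}\), exactly as the theorem asserts.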
Proof: Suppose that \(\{\mathbf{v}_1,\ldots,\mathbf{v}_p\}\) is any orthogonal basis of \(W\) and that \(\hat{\mathbf{y}}\) is defined as in the theorem, then we have: \(\hat{\mathbf{y}}\in W\). Now let \(\mathbf{z}=\mathbf{y}-\hat{\mathbf{y}}\), then we have:
\[\mathbf{z}\cdot\mathbf{v}_i=\left(\mathbf{y}-\hat{\mathbf{y}}\right)\cdot\mathbf{v}_i=\mathbf{y}\cdot\mathbf{v}_i -\left(\frac{\mathbf{y}\cdot\mathbf{v}_i}{\mathbf{v}_i\cdot\mathbf{v}_i}\right)\mathbf{v}_i\cdot\mathbf{v}_i =\mathbf{y}\cdot\mathbf{v}_i-\mathbf{y}\cdot\mathbf{v}_i=0,\quad i=1,2,\ldots,p.\]This implies that \(\mathbf{z}\perp\mathbf{v}_i\) for all \(i=1,2,\ldots,p\) and therefore: \(\mathbf{z}\in W^{\perp}\).
In order to show that the decomposition is unique, suppose that also \(\mathbf{y}=\tilde{\mathbf{y}}+\mathbf{w}\) with \(\tilde{\mathbf{y}}\in W\) and \(\mathbf{w}\in W^{\perp}\). Then we have: \(\hat{\mathbf{y}}+\mathbf{z}=\tilde{\mathbf{y}}+\mathbf{w}\) and therefore \(\hat{\mathbf{y}}-\tilde{\mathbf{y}}=\mathbf{w}-\mathbf{z}\). The left-hand side is clearly a vector in \(W\), while the right-hand side is a vector in \(W^{\perp}\). Since the zero vector \(\mathbf{0}\) is the only vector that is both in \(W\) and in \(W^{\perp}\), this implies that \(\hat{\mathbf{y}}-\tilde{\mathbf{y}}=\mathbf{0}\) and \(\mathbf{w}-\mathbf{z}=\mathbf{0}\). Hence: \(\hat{\mathbf{y}}=\tilde{\mathbf{y}}\) and \(\mathbf{w}=\mathbf{z}\).
Theorem: Let \(W\) be a subspace of \(\mathbb{R}^n\) and \(\mathbf{y}\) a vector in \(\mathbb{R}^n\). Suppose that \(\hat{\mathbf{y}}\) is the orthogonal projection of \(\mathbf{y}\) onto \(W\). Then we have: \(\hat{\mathbf{y}}\) is the vector in \(W\) that is closest to \(\mathbf{y}\), which means that \(||\mathbf{y}-\hat{\mathbf{y}}|| < ||\mathbf{y}-\mathbf{w}||\) for all \(\mathbf{w}\in W\) unequal to \(\hat{\mathbf{y}}\).
The vector \(\hat{\mathbf{y}}\) is called the best approximation of \(\mathbf{y}\) by elements of \(W\).
Proof: Let \(\mathbf{w}\) be a vector in \(W\) unequal to \(\hat{\mathbf{y}}\). Then we have: \(\hat{\mathbf{y}}-\mathbf{w}\in W\). Note that \(\mathbf{y}-\hat{\mathbf{y}}\in W^{\perp}\) and so also: \(\mathbf{y}-\hat{\mathbf{y}}\perp\hat{\mathbf{y}}-\mathbf{w}\). Now we have: \(\mathbf{y}-\mathbf{w}=(\mathbf{y}-\hat{\mathbf{y}})+(\hat{\mathbf{y}}-\mathbf{w})\). However then Pythagoras' theorem implies that \(||\mathbf{y}-\mathbf{w}||^2=||\mathbf{y}-\hat{\mathbf{y}}||^2+||\hat{\mathbf{y}}-\mathbf{w}||^2\). Since \(\mathbf{w}\) is unequal to \(\hat{\mathbf{y}}\), this implies that \(||\mathbf{y}-\hat{\mathbf{y}}|| < ||\mathbf{y}-\mathbf{w}||\).
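The best-approximation property can be checked numerically; in this sketch (subspace, basis, and test point chosen for illustration) the projection is compared against many random vectors in \(W\):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative subspace W = span{v1, v2} with an orthogonal basis
v1 = np.array([1.0, 0.0, -1.0])
v2 = np.array([1.0, -2.0, 1.0])
y = np.array([1.0, 2.0, 3.0])

# Orthogonal projection of y onto W
y_hat = (y @ v1) / (v1 @ v1) * v1 + (y @ v2) / (v2 @ v2) * v2

# No other vector w in W comes closer to y than y_hat does
for _ in range(1000):
    c1, c2 = rng.normal(size=2)
    w = c1 * v1 + c2 * v2
    assert np.linalg.norm(y - y_hat) <= np.linalg.norm(y - w) + 1e-12
```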
This can be used to determine the distance of a point to a line or a plane in \(\mathbb{R}^3\) by finding the (orthogonal) projection first and then the distance to that (orthogonal) projection.
Examples:
1) Consider the point \(P=(1,-2,1)\) and the plane \(V\) given by the equation \(x_1+3x_2-2x_3=0\) in \(\mathbb{R}^3\). Note that \(V^{\perp}\) is the line spanned by the vector \(\mathbf{v}=\begin{pmatrix}1\\3\\-2\end{pmatrix}\). The (orthogonal) projection of \(\mathbf{y}=\begin{pmatrix}1\\-2\\1\end{pmatrix}\) onto \(V^{\perp}\) is:
\[\text{proj}_{V^{\perp}}\mathbf{y}=\left(\frac{\mathbf{y}\cdot\mathbf{v}}{\mathbf{v}\cdot\mathbf{v}}\right)\mathbf{v} =\frac{1-6-2}{1+9+4}\mathbf{v}=-\frac{7}{14}\begin{pmatrix}1\\3\\-2\end{pmatrix}=-\frac{1}{2}\begin{pmatrix}1\\3\\-2\end{pmatrix}.\]Then the distance of \(P\) to the plane \(V\) is \(\frac{1}{2}||\begin{pmatrix}1\\3\\-2\end{pmatrix}||=\frac{1}{2}\sqrt{1+9+4}=\frac{1}{2}\sqrt{14}\).
2) Consider the point \(P=(1,-2,1)\) and the line \(\ell\) spanned by the vector \(\mathbf{v}=\begin{pmatrix}1\\3\\-2\end{pmatrix}\). The (orthogonal) projection of \(\mathbf{y}=\begin{pmatrix}1\\-2\\1\end{pmatrix}\) onto the line \(\ell\) is then \(\hat{\mathbf{y}}=-\frac{1}{2}\begin{pmatrix}1\\3\\-2\end{pmatrix}\). This implies: \(\mathbf{y}-\hat{\mathbf{y}}=\frac{1}{2}\begin{pmatrix}3\\-1\\0\end{pmatrix}\). Then the distance of \(P\) to the line \(\ell\) is \(||\mathbf{y}-\hat{\mathbf{y}}||=\frac{1}{2}||\begin{pmatrix}3\\-1\\0\end{pmatrix}||=\frac{1}{2}\sqrt{9+1+0}=\frac{1}{2}\sqrt{10}\).
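Both distance computations above can be reproduced in a few lines of NumPy, as a sanity check:

```python
import numpy as np

y = np.array([1.0, -2.0, 1.0])   # the point P
v = np.array([1.0, 3.0, -2.0])   # normal of the plane V, direction of the line ℓ

# Distance from P to the plane V: length of the projection of y onto v
proj_on_v = (y @ v) / (v @ v) * v
print(np.linalg.norm(proj_on_v))       # sqrt(14)/2 ≈ 1.8708

# Distance from P to the line ℓ: length of y minus its projection onto v
print(np.linalg.norm(y - proj_on_v))   # sqrt(10)/2 ≈ 1.5811
```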
Theorem: If \(\{\mathbf{u}_1,\ldots,\mathbf{u}_p\}\) is an orthonormal basis of a subspace \(W\) of \(\mathbb{R}^n\) and \(\mathbf{y}\) is a vector in \(\mathbb{R}^n\), then we have:
\[\text{proj}_W\mathbf{y}=(\mathbf{y}\cdot\mathbf{u}_1)\mathbf{u}_1+\cdots+(\mathbf{y}\cdot\mathbf{u}_p)\mathbf{u}_p.\]If \(U=\Bigg(\mathbf{u}_1\;\ldots\;\mathbf{u}_p\Bigg)\), then we have:
\[\text{proj}_W\mathbf{y}=UU^T\mathbf{y}\quad\text{for all}\quad\mathbf{y}\in\mathbb{R}^n.\]Proof: The first part immediately follows from the first theorem since \(\mathbf{u}_i\cdot\mathbf{u}_i=1\) for \(i=1,2,\ldots,p\). For the second part we have:
\[\text{proj}_W\mathbf{y}=(\mathbf{y}\cdot\mathbf{u}_1)\mathbf{u}_1+\cdots+(\mathbf{y}\cdot\mathbf{u}_p)\mathbf{u}_p =(\mathbf{u}_1\cdot\mathbf{y})\mathbf{u}_1+\cdots+(\mathbf{u}_p\cdot\mathbf{y})\mathbf{u}_p =(\mathbf{u}_1^T\mathbf{y})\mathbf{u}_1+\cdots+(\mathbf{u}_p^T\mathbf{y})\mathbf{u}_p.\]This is a linear combination of the columns of \(U\) and the weights are the elements of the vector \(U^T\mathbf{y}\).
The matrix \(UU^T\) is called a projection matrix. This is symmetric, since: \((UU^T)^T=(U^T)^TU^T=UU^T\). Moreover we have that \(U^TU=I\).
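These properties of the projection matrix can be illustrated numerically. In the sketch below, an orthonormal basis of a subspace of \(\mathbb{R}^4\) is obtained from the QR factorization of two arbitrarily chosen spanning vectors (the matrix \(A\) is illustrative):

```python
import numpy as np

# Two independent spanning vectors of a 2-dimensional subspace W of R^4
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 1.0],
              [1.0, 2.0]])
U, _ = np.linalg.qr(A)      # columns of U form an orthonormal basis of W

P = U @ U.T                 # projection matrix onto W

print(np.allclose(U.T @ U, np.eye(2)))   # U^T U = I
print(np.allclose(P, P.T))               # P is symmetric
print(np.allclose(P @ P, P))             # projecting twice changes nothing
```

All three checks print `True`; the last one reflects the fact that \(\hat{\mathbf{y}}\) already lies in \(W\), so projecting it again leaves it unchanged.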
Example: Consider the projection onto the plane \(V\) given by the equation \(x_1+x_2+x_3=0\). The orthogonal complement of \(V\) is the line spanned by the vector \(\mathbf{v}=\begin{pmatrix}1\\1\\1\end{pmatrix}\). The (orthogonal) projection of a vector \(\mathbf{y}=\begin{pmatrix}y_1\\y_2\\y_3\end{pmatrix}\) onto that line is then
\[\left(\frac{\mathbf{y}\cdot\mathbf{v}}{\mathbf{v}\cdot\mathbf{v}}\right)\mathbf{v}=\frac{y_1+y_2+y_3}{3}\begin{pmatrix}1\\1\\1\end{pmatrix}.\]This implies that the (orthogonal) projection of \(\mathbf{y}\) onto the plane \(V\) is equal to
\[\text{proj}_V\mathbf{y}=\mathbf{y}-\left(\frac{\mathbf{y}\cdot\mathbf{v}}{\mathbf{v}\cdot\mathbf{v}}\right)\mathbf{v} =\begin{pmatrix}y_1\\y_2\\y_3\end{pmatrix}-\frac{y_1+y_2+y_3}{3}\begin{pmatrix}1\\1\\1\end{pmatrix} =\frac{1}{3}\begin{pmatrix}2y_1-y_2-y_3\\-y_1+2y_2-y_3\\-y_1-y_2+2y_3\end{pmatrix} =\frac{1}{3}\begin{pmatrix}2&-1&-1\\-1&2&-1\\-1&-1&2\end{pmatrix}\mathbf{y}.\]Note that \(\{\mathbf{u}_1,\mathbf{u}_2\}\) with \(\mathbf{u}_1=\dfrac{1}{\sqrt{2}}\begin{pmatrix}1\\0\\-1\end{pmatrix}\) and \(\mathbf{u}_2=\dfrac{1}{\sqrt{6}}\begin{pmatrix}1\\-2\\1\end{pmatrix}\) is an orthonormal basis of \(V\). Now suppose that \(U=\Bigg(\mathbf{u}_1\;\mathbf{u}_2\Bigg)\), then we have:
\[U=\frac{1}{\sqrt{6}}\begin{pmatrix}\sqrt{3}&1\\0&-2\\-\sqrt{3}&1\end{pmatrix}\quad\Longrightarrow\quad UU^T=\frac{1}{6}\begin{pmatrix}\sqrt{3}&1\\0&-2\\-\sqrt{3}&1\end{pmatrix}\begin{pmatrix}\sqrt{3}&0&-\sqrt{3}\\1&-2&1\end{pmatrix} =\frac{1}{6}\begin{pmatrix}4&-2&-2\\-2&4&-2\\-2&-2&4\end{pmatrix}=\frac{1}{3}\begin{pmatrix}2&-1&-1\\-1&2&-1\\-1&-1&2\end{pmatrix}.\]However \(\{\mathbf{u}_1,\mathbf{u}_2\}\) with \(\mathbf{u}_1=\dfrac{1}{\sqrt{2}}\begin{pmatrix}0\\-1\\1\end{pmatrix}\) and \(\mathbf{u}_2=\dfrac{1}{\sqrt{6}}\begin{pmatrix}-2\\1\\1\end{pmatrix}\) is also an orthonormal basis of \(V\). Now suppose that \(U=\Bigg(\mathbf{u}_1\;\mathbf{u}_2\Bigg)\), then we have:
\[U=\frac{1}{\sqrt{6}}\begin{pmatrix}0&-2\\-\sqrt{3}&1\\\sqrt{3}&1\end{pmatrix}\quad\Longrightarrow\quad UU^T=\frac{1}{6}\begin{pmatrix}0&-2\\-\sqrt{3}&1\\\sqrt{3}&1\end{pmatrix}\begin{pmatrix}0&-\sqrt{3}&\sqrt{3}\\-2&1&1\end{pmatrix} =\frac{1}{6}\begin{pmatrix}4&-2&-2\\-2&4&-2\\-2&-2&4\end{pmatrix}=\frac{1}{3}\begin{pmatrix}2&-1&-1\\-1&2&-1\\-1&-1&2\end{pmatrix}.\]So both orthonormal bases yield the same projection matrix \(UU^T\).
Last modified on May 1, 2021