Linear Algebra – Orthogonality and least squares

Definition: If \(\mathbf{u}=\begin{pmatrix}u_1\\u_2\\\vdots\\u_n\end{pmatrix}\) and \(\mathbf{v}=\begin{pmatrix}v_1\\v_2\\\vdots\\v_n\end{pmatrix}\) are vectors in \(\mathbb{R}^n\), then

\[\mathbf{u}\cdot\mathbf{v}=\mathbf{u}^T\mathbf{v}=u_1v_1+u_2v_2+\cdots+u_nv_n\]

is called the inner product of the vectors \(\mathbf{u}\) and \(\mathbf{v}\).

Remark: the inner product of two vectors is therefore a scalar.
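
As a minimal sketch of this definition, assuming vectors are stored as plain Python lists of equal length (the function name `dot` is chosen here for illustration and does not come from the notes), the inner product can be computed componentwise:

```python
def dot(u, v):
    """Inner product u . v = u_1*v_1 + ... + u_n*v_n of two vectors in R^n."""
    if len(u) != len(v):
        raise ValueError("both vectors must have the same number of components")
    return sum(ui * vi for ui, vi in zip(u, v))


u = [1, 2, 3]
v = [4, -5, 6]
print(dot(u, v))  # 1*4 + 2*(-5) + 3*6 = 12
```

The result is a single number, in line with the remark above.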

The following rules of calculation follow directly from the definition:

Theorem: If \(\mathbf{u}\), \(\mathbf{v}\) and \(\mathbf{w}\) are vectors in \(\mathbb{R}^n\) and \(c\in\mathbb{R}\), then we have:

  1. \(\mathbf{u}\cdot\mathbf{v}=\mathbf{v}\cdot\mathbf{u}\),

  2. \((\mathbf{u}+\mathbf{v})\cdot\mathbf{w}=\mathbf{u}\cdot\mathbf{w}+\mathbf{v}\cdot\mathbf{w}\),

  3. \((c\mathbf{u})\cdot\mathbf{v}=c(\mathbf{u}\cdot\mathbf{v})=\mathbf{u}\cdot(c\mathbf{v})\),

  4. \(\mathbf{u}\cdot\mathbf{u}\geq0\) and \(\mathbf{u}\cdot\mathbf{u}=0\;\Longleftrightarrow\;\mathbf{u}=\mathbf{0}\).
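
The four rules can be spot-checked numerically. The sketch below uses integer vectors so that all comparisons are exact; the helper names `dot`, `add` and `scale` are illustrative assumptions, and a check on one example is of course not a proof.

```python
def dot(u, v):
    """Inner product of two vectors given as lists of equal length."""
    return sum(ui * vi for ui, vi in zip(u, v))


def add(u, v):
    """Componentwise sum u + v."""
    return [ui + vi for ui, vi in zip(u, v)]


def scale(c, u):
    """Scalar multiple c * u."""
    return [c * ui for ui in u]


u, v, w, c = [1, -2, 3], [4, 0, -1], [2, 5, 7], -3
zero = [0, 0, 0]

rule1 = dot(u, v) == dot(v, u)
rule2 = dot(add(u, v), w) == dot(u, w) + dot(v, w)
rule3 = dot(scale(c, u), v) == c * dot(u, v) == dot(u, scale(c, v))
rule4 = dot(u, u) >= 0 and dot(zero, zero) == 0

print(rule1, rule2, rule3, rule4)  # True True True True
```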

The last rule of calculation makes it possible to define the notion of length or norm:

Definition: If \(\mathbf{v}=\begin{pmatrix}v_1\\v_2\\\vdots\\v_n\end{pmatrix}\in\mathbb{R}^n\), then

\[||\mathbf{v}||=\sqrt{\mathbf{v}\cdot\mathbf{v}}=\sqrt{v_1^2+v_2^2+\cdots+v_n^2}\]

is called the length or norm of the vector \(\mathbf{v}\).

Remark: it follows that \(||\mathbf{v}||^2=\mathbf{v}\cdot\mathbf{v}\).

The rules of calculation imply that \(||c\mathbf{v}||=|c|\,||\mathbf{v}||\) for all \(c\in\mathbb{R}\). Indeed, \(||c\mathbf{v}||^2=(c\mathbf{v})\cdot(c\mathbf{v})=c^2(\mathbf{v}\cdot\mathbf{v})=c^2||\mathbf{v}||^2\), and taking the nonnegative square root of both sides gives \(||c\mathbf{v}||=\sqrt{c^2}\,||\mathbf{v}||=|c|\,||\mathbf{v}||\).
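
To illustrate the norm and the scaling rule, here is a small sketch using only the Python standard library. The function name `norm` and the example values are assumptions made for this illustration.

```python
import math


def norm(v):
    """Length of v: the square root of the inner product of v with itself."""
    return math.sqrt(sum(vi * vi for vi in v))


v = [3, 4]
c = -2
cv = [c * vi for vi in v]  # the scaled vector c*v = [-6, -8]

print(norm(v))           # 5.0
print(norm(cv))          # 10.0
print(abs(c) * norm(v))  # 10.0
```

Note that the factor in front is \(|c|\) and not \(c\): scaling by a negative number reverses the direction but the length stays nonnegative.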

Definition: A vector with length \(1\) is called a unit vector.

The unit vector in the direction of a vector \(\mathbf{v}\neq\mathbf{0}\) is: \(\displaystyle\frac{1}{||\mathbf{v}||}\mathbf{v}\).
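
A possible way to compute this unit vector is sketched below. The function name `normalize` is hypothetical, and the explicit error for the zero vector reflects the condition \(\mathbf{v}\neq\mathbf{0}\) stated above.

```python
import math


def norm(v):
    """Length of v."""
    return math.sqrt(sum(vi * vi for vi in v))


def normalize(v):
    """Unit vector in the direction of v, i.e. (1/||v||) * v."""
    length = norm(v)
    if length == 0:
        raise ValueError("the zero vector has no direction")
    return [vi / length for vi in v]


unit = normalize([3, 4])
print(unit)                           # [0.6, 0.8]
print(math.isclose(norm(unit), 1.0))  # True
```

The length of the result is compared with `math.isclose` rather than `==`, since floating-point rounding can make it differ from 1 in the last digits.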

Now we are able to define a notion of distance:

Definition: If \(\mathbf{u}=\begin{pmatrix}u_1\\u_2\\\vdots\\u_n\end{pmatrix}\) and \(\mathbf{v}=\begin{pmatrix}v_1\\v_2\\\vdots\\v_n\end{pmatrix}\) are vectors in \(\mathbb{R}^n\), then

\[\text{dist}(\mathbf{u},\mathbf{v})=||\mathbf{u}-\mathbf{v}||=\sqrt{(u_1-v_1)^2+(u_2-v_2)^2+\cdots+(u_n-v_n)^2}\]

is called the distance between the vectors \(\mathbf{u}\) and \(\mathbf{v}\).
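
The distance formula translates directly into code. The following is a minimal sketch with an illustrative function name `dist`; Python 3.8 and later also provide `math.dist`, which computes the same quantity.

```python
import math


def dist(u, v):
    """Distance between u and v: the length of the difference u - v."""
    if len(u) != len(v):
        raise ValueError("both vectors must have the same number of components")
    return math.sqrt(sum((ui - vi) ** 2 for ui, vi in zip(u, v)))


u = [1, 2, 3]
v = [4, 6, 3]
print(dist(u, v))  # sqrt(9 + 16 + 0) = 5.0
print(dist(v, u))  # 5.0, the distance is symmetric
```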

This can be used to define orthogonality. Two vectors \(\mathbf{u}\) and \(\mathbf{v}\) are called orthogonal if \(\text{dist}(\mathbf{u},\mathbf{v})=\text{dist}(\mathbf{u},-\mathbf{v})\), that is, if \(\mathbf{u}\) is equally far from \(\mathbf{v}\) and from \(-\mathbf{v}\). Now we have:

\[\{\text{dist}(\mathbf{u},\mathbf{v})\}^2=||\mathbf{u}-\mathbf{v}||^2=(\mathbf{u}-\mathbf{v})\cdot(\mathbf{u}-\mathbf{v}) =\mathbf{u}\cdot\mathbf{u}-\mathbf{u}\cdot\mathbf{v}-\mathbf{v}\cdot\mathbf{u}+\mathbf{v}\cdot\mathbf{v}=||\mathbf{u}||^2-2(\mathbf{u}\cdot\mathbf{v})+||\mathbf{v}||^2\]

and similarly

\[\{\text{dist}(\mathbf{u},-\mathbf{v})\}^2=||\mathbf{u}+\mathbf{v}||^2=(\mathbf{u}+\mathbf{v})\cdot(\mathbf{u}+\mathbf{v}) =\mathbf{u}\cdot\mathbf{u}+\mathbf{u}\cdot\mathbf{v}+\mathbf{v}\cdot\mathbf{u}+\mathbf{v}\cdot\mathbf{v}=||\mathbf{u}||^2+2(\mathbf{u}\cdot\mathbf{v})+||\mathbf{v}||^2.\]

Since distances are nonnegative, they are equal exactly when their squares are equal. This implies:

\[\text{dist}(\mathbf{u},\mathbf{v})=\text{dist}(\mathbf{u},-\mathbf{v}) \quad\Longleftrightarrow\quad-(\mathbf{u}\cdot\mathbf{v})=\mathbf{u}\cdot\mathbf{v}\quad\Longleftrightarrow\quad\mathbf{u}\cdot\mathbf{v}=0.\]

This leads to:

Definition: Two vectors \(\mathbf{u}\) and \(\mathbf{v}\) in \(\mathbb{R}^n\) are called orthogonal if \(\mathbf{u}\cdot\mathbf{v}=0\). Notation: \(\mathbf{u}\perp\mathbf{v}\).

Remark: this definition implies that the zero vector is "perpendicular" to every other vector and even to itself.
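
The equivalence between the distance condition and \(\mathbf{u}\cdot\mathbf{v}=0\) can be illustrated as follows. This is a sketch under the assumption of integer components, so that exact comparison with `==` is safe; with floating-point data a tolerance would be needed instead. All function names are chosen for illustration.

```python
def dot(u, v):
    """Inner product of two vectors given as lists of equal length."""
    return sum(ui * vi for ui, vi in zip(u, v))


def dist_squared(u, v):
    """Squared distance between u and v (avoids the square root)."""
    return sum((ui - vi) ** 2 for ui, vi in zip(u, v))


def negate(v):
    """The vector -v."""
    return [-vi for vi in v]


def is_orthogonal(u, v):
    """True if u . v = 0 (exact test, intended for integer components)."""
    return dot(u, v) == 0


u, v, w = [1, 2], [-2, 1], [1, 1]

# u and v are orthogonal, so u is equally far from v and from -v.
print(is_orthogonal(u, v), dist_squared(u, v), dist_squared(u, negate(v)))  # True 10 10

# u and w are not orthogonal, and the two squared distances differ.
print(is_orthogonal(u, w), dist_squared(u, w), dist_squared(u, negate(w)))  # False 1 13

# The zero vector is orthogonal to every vector, including itself.
print(is_orthogonal([0, 0], u), is_orthogonal([0, 0], [0, 0]))  # True True
```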

Now we are able to prove a generalization of Pythagoras' theorem:

Theorem: If \(\mathbf{u}\) and \(\mathbf{v}\) are vectors in \(\mathbb{R}^n\), then we have: \[\mathbf{u}\perp\mathbf{v}\quad\Longleftrightarrow\quad||\mathbf{u}+\mathbf{v}||^2=||\mathbf{u}||^2+||\mathbf{v}||^2.\]

Proof: We have:

\[||\mathbf{u}+\mathbf{v}||^2=(\mathbf{u}+\mathbf{v})\cdot(\mathbf{u}+\mathbf{v})=\mathbf{u}\cdot\mathbf{u}+\mathbf{u}\cdot\mathbf{v} +\mathbf{v}\cdot\mathbf{u}+\mathbf{v}\cdot\mathbf{v}=||\mathbf{u}||^2+2(\mathbf{u}\cdot\mathbf{v})+||\mathbf{v}||^2.\]

This implies: \(||\mathbf{u}+\mathbf{v}||^2=||\mathbf{u}||^2+||\mathbf{v}||^2\;\;\Longleftrightarrow\;\;\mathbf{u}\cdot\mathbf{v}=0\;\;\Longleftrightarrow\;\;\mathbf{u}\perp\mathbf{v}\).
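
A numerical illustration of the theorem, again with integer vectors so that squared norms are exact. The helper names are assumptions made for this sketch, and the example also shows the cross term \(2(\mathbf{u}\cdot\mathbf{v})\) from the proof in a non-orthogonal case.

```python
def dot(u, v):
    """Inner product of two vectors given as lists of equal length."""
    return sum(ui * vi for ui, vi in zip(u, v))


def norm_squared(v):
    """Squared length of v, equal to v . v."""
    return dot(v, v)


def add(u, v):
    """Componentwise sum u + v."""
    return [ui + vi for ui, vi in zip(u, v)]


u = [1, 2, 2]
v = [2, -1, 0]  # u . v = 0, so u and v are orthogonal
w = [1, 0, 0]   # u . w = 1, so u and w are not orthogonal

print(norm_squared(add(u, v)), norm_squared(u) + norm_squared(v))  # 14 14
print(norm_squared(add(u, w)), norm_squared(u) + norm_squared(w))  # 12 10

# In the second case the difference is exactly the cross term 2 * (u . w).
print(norm_squared(add(u, w)) - norm_squared(u) - norm_squared(w) == 2 * dot(u, w))  # True
```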

Finally, we also define the angle between two vectors:

Definition: If \(\mathbf{u}\) and \(\mathbf{v}\) are nonzero vectors in \(\mathbb{R}^n\), then the angle \(\theta\in[0,\pi]\) between \(\mathbf{u}\) and \(\mathbf{v}\) is defined by \(\mathbf{u}\cdot\mathbf{v}=||\mathbf{u}||\,||\mathbf{v}||\,\cos(\theta)\).

Remark: for nonzero vectors \(\mathbf{u}\) and \(\mathbf{v}\) we have \(\mathbf{u}\perp\mathbf{v}\;\;\Longleftrightarrow\;\;\cos(\theta)=0\;\;\Longleftrightarrow\;\;\theta=\frac{1}{2}\pi\).
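
Solving the defining relation for \(\theta\) gives \(\theta=\arccos\left(\frac{\mathbf{u}\cdot\mathbf{v}}{||\mathbf{u}||\,||\mathbf{v}||}\right)\) for nonzero vectors. The sketch below computes this with the standard library. The function name `angle` and the clamping step are assumptions of this illustration; the clamp only guards against rounding errors that could push the quotient slightly outside \([-1,1]\).

```python
import math


def dot(u, v):
    """Inner product of two vectors given as lists of equal length."""
    return sum(ui * vi for ui, vi in zip(u, v))


def norm(v):
    """Length of v."""
    return math.sqrt(dot(v, v))


def angle(u, v):
    """Angle theta in [0, pi] between two nonzero vectors u and v."""
    nu, nv = norm(u), norm(v)
    if nu == 0 or nv == 0:
        raise ValueError("the angle is not defined for the zero vector")
    cos_theta = dot(u, v) / (nu * nv)
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against rounding errors
    return math.acos(cos_theta)


# Orthogonal vectors enclose an angle of pi/2.
print(math.isclose(angle([1, 0], [0, 1]), math.pi / 2))  # True

# The vectors (1, 0) and (1, 1) enclose an angle of pi/4.
print(math.isclose(angle([1, 0], [1, 1]), math.pi / 4))  # True

# Opposite vectors enclose an angle of pi.
print(math.isclose(angle([2, 0], [-3, 0]), math.pi))  # True
```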


Last modified on May 2, 2021
© Roelof Koekoek
