Today we finish 5.1 and start 5.2.
Continue **reading** Section 5.2 for Friday, and start reading 5.3.
Work through recommended homework questions.

**Tutorials:** Quiz 6 covers Sections 4.4 and
the Markov Chains part of 4.6.

**Office hour:** Wednesday, 12:30-1:30, MC103B.

**Help Centers:** Monday-Friday 2:30-6:30 in MC 106.

**T/F:** A matrix with orthogonal columns is called an orthogonal matrix.

**T/F:** An orthogonal matrix must be square.

**Question:** Why are orthonormal bases great? Are orthogonal
bases great too?

**Definition:**
A set of vectors $\{ \vv_1, \vv_2, \ldots, \vv_k \}$ in $\R^n$ is
an **orthogonal set** if $\vv_i \cdot \vv_j = 0$ for $i \neq j$.

**Theorem 5.1:** An orthogonal set of nonzero vectors is always linearly independent.

**Definition:** An **orthogonal basis** for a subspace $W$ of
$\R^n$ is a basis of $W$ that is an orthogonal set.

To check that an orthogonal set of nonzero vectors in $W$ is a basis of $W$, you only need to check that it spans $W$, since by Theorem 5.1 it is automatically linearly independent.

**Fact:** We'll show in Section 5.3 that every subspace
*has* an orthogonal basis.

Recall that if $\{ \vv_1, \vv_2, \ldots, \vv_k \}$ is any basis of a subspace $W$, then any $\vw$ in $W$ can be written uniquely as a linear combination of the vectors in the basis. In general, finding the coefficients involves solving a linear system. For an orthogonal basis, it is much easier:

**Theorem 5.2:** If $\{ \vv_1, \vv_2, \ldots, \vv_k \}$ is an orthogonal basis of a subspace $W$,
and $\vw$ is in $W$, then
$$
\vw = c_1 \vv_1 + \cdots + c_k \vv_k \qtext{where}
c_i = \frac{\vw \cdot \vv_i}{\vv_i \cdot \vv_i}
$$
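
As a quick illustration of Theorem 5.2, here is a minimal Python sketch. The basis `v1, v2, v3` and the vector `w` are made-up examples (not from the text), and the helper names are hypothetical. It computes each coefficient with a single dot-product quotient instead of solving a linear system, then rebuilds $\vw$ to confirm.

```python
from fractions import Fraction

def dot(u, v):
    """Dot product of two vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))

def coords_in_orthogonal_basis(w, basis):
    """Coefficients c_i = (w . v_i) / (v_i . v_i) from Theorem 5.2.

    Assumes `basis` is an orthogonal set of nonzero vectors and w lies in its span.
    """
    return [Fraction(dot(w, v), dot(v, v)) for v in basis]

# Illustrative orthogonal basis of R^3 (an assumption for this example).
v1, v2, v3 = [1, 1, 0], [1, -1, 0], [0, 0, 2]
basis = [v1, v2, v3]
w = [3, 1, 4]

c = coords_in_orthogonal_basis(w, basis)

# Rebuild w as c_1 v_1 + c_2 v_2 + c_3 v_3, entry by entry.
rebuilt = [sum(ci * v[k] for ci, v in zip(c, basis)) for k in range(len(w))]

print(c)             # the three coefficients
print(rebuilt == w)  # True when the formula reproduces w
```

Exact fractions are used here so that the comparison `rebuilt == w` is not affected by rounding.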

**Definition:** An **orthonormal set** is an orthogonal set of unit vectors.
An **orthonormal basis** for a subspace $W$ is a basis for $W$ that is an orthonormal set.

The condition of being orthonormal can be expressed as $$ \vv_i \cdot \vv_j = \begin{cases} 0 & \text{if } i \neq j \\ 1 & \text{if } i = j \end{cases} $$

**Question:** How many orthonormal bases are there for $\R^3$?

Note that an orthogonal basis can be converted to an orthonormal basis by dividing each vector by its length.
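
The normalization step can be sketched in a few lines of Python. The two vectors below are an illustrative orthogonal basis of $\R^2$ chosen for this example, and `normalize` is a hypothetical helper name. The check at the end tests the orthonormality condition $\vv_i \cdot \vv_j \in \{0, 1\}$ up to floating-point tolerance.

```python
import math

def dot(u, v):
    """Dot product of two vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    """Divide a nonzero vector by its length to get a unit vector."""
    length = math.sqrt(dot(v, v))
    return [a / length for a in v]

# Illustrative orthogonal (but not orthonormal) basis of R^2.
orthogonal_basis = [[3.0, 4.0], [-4.0, 3.0]]
orthonormal_basis = [normalize(v) for v in orthogonal_basis]

q1, q2 = orthonormal_basis
is_orthonormal = (
    math.isclose(dot(q1, q1), 1.0)
    and math.isclose(dot(q2, q2), 1.0)
    and math.isclose(dot(q1, q2), 0.0, abs_tol=1e-12)
)
print(orthonormal_basis, is_orthonormal)
```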

**Theorem 5.3:** If $\{ \vq_1, \vq_2, \ldots, \vq_k \}$ is an orthonormal basis of a subspace $W$,
and $\vw$ is in $W$, then
$$
\vw = (\vw \cdot \vq_1) \vq_1 + \cdots + (\vw \cdot \vq_k) \vq_k
$$

**Definition:** A square matrix $Q$ with real entries
whose columns form an orthonormal set is called an **orthogonal** matrix!

**Note:** In $\R^2$ and $\R^3$, orthogonal matrices correspond
exactly to the rotations and reflections. This is an important geometric
reason to study them. Another reason is that we will see in Section 5.4 that
they are related to diagonalization of symmetric matrices.

**Theorems 5.4 and 5.5:** $Q$ is orthogonal if and only if $Q^T Q = I$, i.e.
if and only if $Q$ is invertible and $Q^{-1} = Q^T$.

**Examples:** $A = \bmat{rrr} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \emat$
and $B = \bmat{rr} \cos \theta & -\sin \theta \\ \sin \theta & \cos \theta \emat$.
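
One way to check examples like these numerically is to test $Q^T Q = I$ directly. The sketch below uses plain Python lists; the helper names (`transpose`, `matmul`, `is_orthogonal`, `rotation`) and the tolerance are choices made for this illustration. It also includes two non-examples: a matrix with orthogonal but non-unit columns, and a non-square matrix with orthonormal columns.

```python
import math

def transpose(M):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]

def matmul(X, Y):
    """Matrix product of X and Y (lists of rows)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def is_orthogonal(Q, tol=1e-12):
    """True if Q is square and Q^T Q equals I up to the tolerance."""
    n = len(Q)
    if any(len(row) != n for row in Q):
        return False  # by definition, an orthogonal matrix is square
    P = matmul(transpose(Q), Q)
    return all(
        abs(P[i][j] - (1.0 if i == j else 0.0)) <= tol
        for i in range(n)
        for j in range(n)
    )

def rotation(theta):
    """The matrix B from the examples, for a given angle theta."""
    return [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta), math.cos(theta)]]

A = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]

print(is_orthogonal(A))                         # permutation matrix
print(is_orthogonal(rotation(0.7)))             # rotation by 0.7 radians
print(is_orthogonal([[2, 0], [0, 2]]))          # orthogonal columns, not unit length
print(is_orthogonal([[1, 0], [0, 1], [0, 0]]))  # orthonormal columns, not square
```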

**Theorem 5.6:** Let $Q$ be an $n \times n$ matrix. Then the following
statements are equivalent:

a. $Q$ is orthogonal.

b. $\|Q \vx\| = \|\vx\|$ for every $\vx$ in $\R^n$.

c. $Q\vx \cdot Q\vy = \vx \cdot \vy$ for every $\vx$ and $\vy$ in $\R^n$.

(a) $\implies$ (c): Suppose $Q$ is orthogonal, so $Q^T Q = I$. Then $$ \kern-6ex Q\vx \cdot Q\vy = (Q \vx)^T (Q \vy) = \vx^T Q^T Q \vy = \vx^T \vy = \vx \cdot \vy $$

(c) $\implies$ (a): Suppose (c) holds. Take $\vx = \ve_i$ and $\vy = \ve_j$. Then $Q\vx = Q\ve_i$ is the $i$th column of $Q$ and $Q\vy = Q\ve_j$ is the $j$th column of $Q$. (c) says that the dot product of these columns is $$ Q\ve_i \cdot Q\ve_j = \ve_i \cdot \ve_j = \begin{cases} 0 & \text{if } i \neq j \\ 1 & \text{if } i = j \end{cases} $$ So the columns of $Q$ are orthonormal, which means $Q$ is orthogonal.

(c) $\implies$ (b) is clear, by taking $\vx = \vy$ in (c).

(b) $\implies$ (c): see text. $\quad\Box$
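
To see statements (b) and (c) in action, here is a small numerical sketch using a rotation matrix. The angle and the vectors `x` and `y` are arbitrary choices for illustration. This is a spot check on one example, not a proof.

```python
import math

def dot(u, v):
    """Dot product of two vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    """Euclidean length of a vector."""
    return math.sqrt(dot(u, u))

def matvec(Q, x):
    """Matrix-vector product Qx, with Q stored as a list of rows."""
    return [dot(row, x) for row in Q]

theta = 1.1  # arbitrary angle, in radians
Q = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta), math.cos(theta)]]

x, y = [3.0, -4.0], [1.0, 2.0]
Qx, Qy = matvec(Q, x), matvec(Q, y)

length_preserved = math.isclose(norm(Qx), norm(x))      # statement (b)
dot_preserved = math.isclose(dot(Qx, Qy), dot(x, y))    # statement (c)
print(length_preserved, dot_preserved)
```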

**Theorem 5.7:** If $Q$ is orthogonal, then its **rows** form
an orthonormal set too.

**Proof:** Since $Q^T Q = I$, we must also have $Q Q^T = I$.
But the last equation says exactly that the rows of $Q$ are orthonormal.$\quad\Box$

Another way to put it is that $Q^T$ is also an orthogonal matrix.

**Theorem 5.8:** Let $Q$ be an orthogonal matrix. Then:

a. $Q^{-1}$ is orthogonal.

b. $\det Q = \pm 1$

c. If $\lambda$ is an eigenvalue of $Q$, then $|\lambda| = 1$.

d. If $Q_1$ and $Q_2$ are orthogonal matrices of the same size,
then $Q_1 Q_2$ is orthogonal.

**Proof:**

(a) is Theorem 5.7, since $Q^{-1} = Q^T$.

(b): Since $I = Q^T Q$, we have $$1 = \det I = \det(Q^T Q) = \det(Q^T) \det(Q) = \det(Q)^2.$$ Therefore $\det(Q) = \pm 1$.

(c) If $Q\vv = \lambda \vv$, then $$ \|\vv\| = \|Q\vv\| = \|\lambda \vv\| = |\lambda| \|\vv\| $$ so $|\lambda| = 1$, since $\|\vv\| \neq 0$.

(d) Exercise, using properties of transpose.$\quad\Box$
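
Parts (b) and (d) can be spot-checked on $2 \times 2$ examples. In the sketch below, the rotation angle and the choice of reflection (across the line $y = x$) are assumptions made for illustration, and the helper names are hypothetical.

```python
import math

def transpose(M):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]

def matmul(X, Y):
    """Matrix product of X and Y (lists of rows)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def is_orthogonal(Q, tol=1e-12):
    """True if Q is square and Q^T Q equals I up to the tolerance."""
    n = len(Q)
    if any(len(row) != n for row in Q):
        return False
    P = matmul(transpose(Q), Q)
    return all(abs(P[i][j] - (1.0 if i == j else 0.0)) <= tol
               for i in range(n) for j in range(n))

def det2(M):
    """Determinant of a 2x2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

theta = 0.4  # arbitrary angle, in radians
rot = [[math.cos(theta), -math.sin(theta)],
       [math.sin(theta), math.cos(theta)]]
refl = [[0.0, 1.0], [1.0, 0.0]]  # reflection across the line y = x

product = matmul(rot, refl)

print(det2(rot), det2(refl))                   # close to 1 and exactly -1
print(is_orthogonal(product), det2(product))   # part (d), and det is close to -1
```

In this example the rotation has determinant $1$ and the reflection has determinant $-1$, so both signs in (b) occur.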

**Definition:** Let $W$ be a subspace of $\R^n$.
A vector $\vv$ is **orthogonal** to $W$ if $\vv$ is orthogonal
to every vector in $W$.
The **orthogonal complement** of $W$ is the set of all vectors
orthogonal to $W$ and is denoted $W^\perp$. So
$$
\kern-4ex
W^\perp = \{ \vv \in \R^n : \vv \cdot \vw = 0 \text{ for all } \vw \text{ in } W \}
$$

In the example above, if we write $\ell = \span(\vn)$ for the line perpendicular to $W$, then $\ell = W^\perp$ and $W = \ell^\perp$.

**Theorem 5.9:** Let $W$ be a subspace of $\R^n$. Then:

a. $W^\perp$ is a subspace of $\R^n$.

b. $(W^\perp)^\perp = W$

c. $W \cap W^\perp = \{ \vec 0 \}$

d. If $W = \span(\vw_1, \ldots, \vw_k)$, then $\vv$ is in $W^\perp$
if and only if $\vv \cdot \vw_i = 0$ for all $i$.

Explain (a), (c), (d) on whiteboard. (b) will be Corollary 5.12.

**Theorem 5.10:** Let $A$ be an $m \times n$ matrix. Then
$$
\kern-4ex
(\row(A))^\perp = \null(A) \qtext{and} (\col(A))^\perp = \null(A^T)
$$

The first two are in $\R^n$ and the last two are in $\R^m$.
These are the **four fundamental subspaces** of $A$.

Let's see why $(\row(A))^\perp = \null(A)$. A vector is in $\null(A)$ exactly when it is orthogonal to the rows of $A$. But the rows of $A$ span $\row(A)$, so the vectors in $\null(A)$ are exactly those which are orthogonal to $\row(A)$, by 5.9(d).

The fact that $(\col(A))^\perp = \null(A^T)$ follows by replacing $A$ with $A^T$.

**Example:** Let $W$ be the subspace spanned by
$\vv_1 = [ 1, 2, 3 ]$ and $\vv_2 = [ 2, 5, 7 ]$. Find a basis for $W^\perp$.

**Solution:** Let $A$ be the matrix with $\vv_1$ and $\vv_2$ as rows.
Then $W = \row(A)$, so $W^\perp = \null(A)$. Continue on whiteboard.
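
One way the remaining computation might go (a sketch; the parameter name $t$ is a choice made here): row reduce $A$ to find $\null(A)$.
$$
A = \bmat{rrr} 1 & 2 & 3 \\ 2 & 5 & 7 \emat
\longrightarrow \bmat{rrr} 1 & 2 & 3 \\ 0 & 1 & 1 \emat
\longrightarrow \bmat{rrr} 1 & 0 & 1 \\ 0 & 1 & 1 \emat
$$
So $x_3 = t$ is free, $x_1 = -t$ and $x_2 = -t$, which gives $W^\perp = \null(A) = \span([-1, -1, 1])$. As a check using Theorem 5.9(d): $[-1, -1, 1] \cdot \vv_1 = -1 - 2 + 3 = 0$ and $[-1, -1, 1] \cdot \vv_2 = -2 - 5 + 7 = 0$.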

We didn't name it then, but we also noticed that $\vv - \proj_{\vu}(\vv)$ is orthogonal to $\vu$. Let's call this $\Perp_{\vu}(\vv)$.

So if we write $W = \span(\vu)$, then $\vw = \proj_{\vu}(\vv)$ is in $W$, $\vw^\perp = \Perp_{\vu}(\vv)$ is in $W^\perp$, and $\vv = \vw + \vw^\perp$. We can do this more generally (sketch):

**Definition:** Let $W$ be a subspace of $\R^n$
and let $\{ \vu_1, \ldots, \vu_k \}$ be an orthogonal basis for $W$.
For $\vv$ in $\R^n$, the **orthogonal projection** of $\vv$ onto $W$
is the vector
$$
\proj_W(\vv) = \proj_{\vu_1}(\vv) + \cdots + \proj_{\vu_k}(\vv)
$$
The **component of $\vv$ orthogonal to $W$** is the vector
$$
\Perp_W(\vv) = \vv - \proj_W(\vv)
$$

We will show soon that $\Perp_W(\vv)$ is in $W^\perp$.
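
Here is a minimal Python sketch of the definition on one made-up example. The vectors `u1`, `u2`, `v` and the helper names are assumptions for illustration. It computes $\proj_W(\vv)$ as a sum of projections onto the orthogonal basis vectors and then confirms that, in this example, $\Perp_W(\vv)$ is orthogonal to each basis vector of $W$.

```python
from fractions import Fraction

def dot(u, v):
    """Dot product of two vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))

def proj_onto_vector(u, v):
    """Projection of v onto the nonzero vector u: ((u . v) / (u . u)) u."""
    scale = Fraction(dot(u, v), dot(u, u))
    return [scale * a for a in u]

def proj_onto_subspace(orthogonal_basis, v):
    """Sum of the projections of v onto each vector of an orthogonal basis of W."""
    total = [Fraction(0)] * len(v)
    for u in orthogonal_basis:
        p = proj_onto_vector(u, v)
        total = [t + a for t, a in zip(total, p)]
    return total

# Illustrative orthogonal basis of a plane W in R^3 (u1 . u2 = 0).
u1, u2 = [1, 1, 0], [1, -1, 1]
v = [1, 2, 3]

proj = proj_onto_subspace([u1, u2], v)
perp = [a - p for a, p in zip(v, proj)]

print(proj)                           # proj_W(v)
print(perp)                           # Perp_W(v) = v - proj_W(v)
print(dot(perp, u1), dot(perp, u2))   # both zero, so perp is in W-perp by 5.9(d)
```

Note that the formula relies on the basis being orthogonal; for a non-orthogonal basis, summing the individual projections does not generally give the orthogonal projection onto $W$.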

Note that multiplying $\vu$ by a scalar in the earlier example doesn't change $W$, $\vw$ or $\vw^\perp$. We'll see later that the general definition also doesn't depend on the choice of orthogonal basis.