Today we finish 4.6 and start Section 5.1.
Continue reading **Section 5.1** for next class,
and start reading **Section 5.2**.
Work through recommended homework questions.

**Help Centers:** Monday-Friday 2:30-6:30 in MC 106.

**Theorem 4.30:** Every stochastic matrix has a steady state vector,
i.e. it has $\lambda = 1$ as an eigenvalue.

If in addition the entries of $P$ are all positive, then all eigenvalues besides $\lambda = 1$ have $|\lambda| < 1$.

**Theorem 4.33:**
Let $P$ be an $n \times n$ stochastic matrix all of whose entries are positive.
Then as $k \to \infty$, $P^k \to L$, a matrix all of whose columns are equal
to the same vector $\vx$ which is a steady state probability vector for $P$.

**Proof of 4.33:**
We'll assume $P$ is diagonalizable: $Q^{-1} P Q = D$.
So $P^k = Q D^k Q^{-1}$. As $k \to \infty$, each diagonal entry $\lambda^k$ of $D^k$
tends to $0$ when $|\lambda| < 1$ and stays at $1$ when $\lambda = 1$,
so $D^k$ approaches a matrix $D^*$
with $1$'s and $0$'s on the diagonal (by Theorem 4.31), which means that $P^k$ approaches
$L = Q D^* Q^{-1}$.

Now that we know that $P^k$ has *some* limit $L$, we can deduce
something about it.
Since $\lim_{k \to \infty} P^k = L$, we have
$$
PL = P\lim_{k \to \infty} P^k = \lim_{k \to \infty} P^{k+1} = L
$$
This means that the columns of $L$ must be steady-state vectors for $P$.
Since the columns of $P^k$ are probability vectors, the same must be true
of the columns of $L$.
It's not hard to show that $P$ has a unique steady-state probability vector $\vx$,
so $L = [\, \vx \, \vx \, \cdots \, \vx \, ]$, as required.$\quad\Box$
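The convergence $P^k \to L$ is easy to check numerically. Here is a quick NumPy sketch using the $2 \times 2$ stochastic matrix that appears in the Question later in these notes; its steady-state probability vector works out to $(2/5, 3/5)$.

```python
import numpy as np

# A 2x2 stochastic matrix with positive entries (columns sum to 1).
P = np.array([[1/2, 1/3],
              [1/2, 2/3]])

# Raise P to a high power; by Theorem 4.33, P^k approaches a matrix L
# whose columns all equal the steady-state probability vector of P.
L = np.linalg.matrix_power(P, 50)
print(L)   # both columns are approximately [0.4, 0.6]
```
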

Finally, we can deduce that Markov chains tend to their steady states:

**Theorem 4.34:**
Let $P$ be an $n \times n$ stochastic matrix all of whose entries are positive,
and let $\vx_0$ be any initial probability vector.
Then as $k \to \infty$, $\vx_k \to \vx$, where
$\vx$ is the steady state probability vector for $P$.

**Proof:**
Suppose that $\vx_0$ has components $x_1, x_2, \ldots, x_n$. Then
$$
\begin{aligned}
\lim_{k \to \infty} \vx_k &= \lim_{k \to \infty} P^k \vx_0 = L \vx_0 \\
&= [\, \vx \, \vx \, \cdots \, \vx \, ] \vx_0 \\
&= x_1 \vx + x_2 \vx + \cdots + x_n \vx \\
&= (x_1 + x_2 + \cdots + x_n ) \vx = \vx \quad\Box
\end{aligned}
$$

This result works both ways: if you compute the eigenvector with eigenvalue $1$, that tells you the steady-state vector that the state vectors $\vx_k$ approach as $k \to \infty$. Conversely, if you don't know the steady-state vector, you can approximate it by starting with any probability vector $\vx_0$ and computing $P^k \vx_0$ for large $k$!

The latter is what Google does to compute the page rank eigenvector.
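That iterative approach can be sketched in a few lines: repeatedly apply $P$ to an initial probability vector until it stops changing. (The $3 \times 3$ matrix below is a made-up example of a 3-state chain, not one from the text.)

```python
import numpy as np

# A hypothetical 3-state transition matrix: columns sum to 1, all entries positive.
P = np.array([[0.7, 0.2, 0.3],
              [0.2, 0.5, 0.3],
              [0.1, 0.3, 0.4]])

x = np.array([1.0, 0.0, 0.0])   # any initial probability vector works

for _ in range(100):
    x = P @ x                    # after k steps, x = P^k x_0

print(x)        # approximately the steady-state vector
print(P @ x)    # applying P again changes (essentially) nothing
```
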

**Note:** A transition matrix $P$ is **regular** if some
power $P^k$ of it has positive entries. This means that there
is a nonzero probability of getting from any starting state to any
ending state after some number of steps.
The text proves the above results for regular matrices, which
is not hard once you know them for matrices with positive entries.
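Regularity is straightforward to test by machine: just look for a power of $P$ with all positive entries. A small sketch (the cutoff `max_power` is an arbitrary choice for illustration):

```python
import numpy as np

def is_regular(P, max_power=20):
    """Check whether some power P^k (k <= max_power) has all positive entries."""
    Q = P.copy()
    for _ in range(max_power):
        if np.all(Q > 0):
            return True
        Q = Q @ P
    return False

# This matrix has a zero entry, but its square is all positive, so it is regular.
P = np.array([[0.0, 0.5],
              [1.0, 0.5]])
print(is_regular(P))   # True

# A permutation matrix just swaps the two states forever: no power is all positive.
S = np.array([[0.0, 1.0],
              [1.0, 0.0]])
print(is_regular(S))   # False
```
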

**Question:** Let $P = \bmat{rr} 1/2 & 1/3 \\ 1/2 & 2/3 \emat$.
**(a)**
Find a clever way to figure out the eigenvalues of $P$ and determine
whether it is diagonalizable.

Recall that vectors $\vu$ and $\vv$ are **orthogonal** if $\vu \cdot \vv = 0$.

**Definition:** A set $\{ \vv_1, \vv_2, \ldots, \vv_k \}$ of vectors in $\R^n$
is an **orthogonal set** if all pairs of distinct vectors in the set are
orthogonal, i.e. $\vv_i \cdot \vv_j = 0$ whenever $i \neq j$.

**Example 5.1:** The following vectors form an orthogonal set:
$$
\vv_1 = \colll 2 1 {-1} ,\ \vv_2 = \colll 0 1 1,\ \vv_3 = \colll 1 {-1} 1
$$

**Theorem 5.1:** If $\{ \vv_1, \vv_2, \ldots, \vv_k \}$ is an orthogonal
set of nonzero vectors in $\R^n$, then these vectors are linearly independent.

**Definition:** An **orthogonal basis** for a subspace $W$ of $\R^n$
is a basis that is also orthogonal.

For example, the vectors in Example 5.1 automatically form a basis for $\R^3$. Checking orthogonality is much easier than checking linear independence!
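Indeed, for the vectors of Example 5.1, checking orthogonality is just three dot products:

```python
import numpy as np

# The vectors from Example 5.1.
v1 = np.array([2, 1, -1])
v2 = np.array([0, 1, 1])
v3 = np.array([1, -1, 1])

# Each pair of distinct vectors has dot product zero, so the set is orthogonal.
print(v1 @ v2, v1 @ v3, v2 @ v3)   # 0 0 0
```
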

**Fact:** We'll show in Section 5.3 that every subspace
*has* an orthogonal basis.

It is very easy to figure out coordinates with respect to an orthogonal basis. Suppose $\{ \vv_1, \vv_2, \ldots, \vv_k \}$ is an orthogonal basis and $$ \vw = c_1 \vv_1 + \cdots + c_k \vv_k $$ Then if we dot both sides with $\vv_1$, we find $$ \vw \cdot \vv_1 = c_1 (\vv_1 \cdot \vv_1) $$ and so $$ c_1 = \frac{\vw \cdot \vv_1}{\vv_1 \cdot \vv_1} $$

**Theorem 5.2:**
Let $\{ \vv_1, \vv_2, \ldots, \vv_k \}$ be an orthogonal basis for
a subspace $W$ of $\R^n$, and let $\vw$ be any vector in $W$.
Then the unique scalars $c_1, c_2, \ldots, c_k$ such that
$$
\vw = c_1 \vv_1 + \cdots + c_k \vv_k
$$
are given by the formula
$$
c_i = \frac{\vw \cdot \vv_i}{\vv_i \cdot \vv_i}
$$

**Example:**
Find the coordinates of $\vw = \colll 1 2 3$ with respect to the basis
$$
\vv_1 = \colll 2 1 {-1} ,\ \vv_2 = \colll 0 1 1,\ \vv_3 = \colll 1 {-1} 1
$$
**Solution:** The coordinates are
$$
\kern-9ex
c_1 = \frac{\vw \cdot \vv_1}{\vv_1 \cdot \vv_1} = \frac{1}{6}, \quad
c_2 = \frac{\vw \cdot \vv_2}{\vv_2 \cdot \vv_2} = \frac{5}{2}, \quad
c_3 = \frac{\vw \cdot \vv_3}{\vv_3 \cdot \vv_3} = \frac{2}{3}
$$
So
$$
\vw = \frac{1}{6} \vv_1 + \frac{5}{2} \vv_2 + \frac{2}{3} \vv_3
$$
Normally you have to solve a system to figure out the coordinates!
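The formula from Theorem 5.2 makes this a one-liner; here is a numerical check of the example above:

```python
import numpy as np

# Coordinates with respect to an orthogonal basis, via Theorem 5.2:
# c_i = (w . v_i) / (v_i . v_i); no linear system needs to be solved.
v1 = np.array([2, 1, -1])
v2 = np.array([0, 1, 1])
v3 = np.array([1, -1, 1])
w = np.array([1, 2, 3])

coords = [(w @ v) / (v @ v) for v in (v1, v2, v3)]
print(coords)   # approximately [1/6, 5/2, 2/3], matching the example

# Sanity check: the linear combination reconstructs w.
print(coords[0]*v1 + coords[1]*v2 + coords[2]*v3)
```
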

**Definition:** An **orthonormal set** is an orthogonal set of *unit* vectors.
An **orthonormal basis** for a subspace $W$ is a basis for $W$ that is an orthonormal set.

The condition of being orthonormal can be expressed as $$ \vv_i \cdot \vv_j = \begin{cases} 0 & \text{if } i \neq j \\ 1 & \text{if } i = j \end{cases} $$

**Example:** The standard basis $\{ \ve_1, \ldots, \ve_n \}$ is an
orthonormal basis for $\R^n$.

For an orthonormal basis, the formula for the coordinates simplifies:

**Theorem 5.3:** If $\{ \vq_1, \vq_2, \ldots, \vq_k \}$ is an orthonormal basis of a subspace $W$,
and $\vw$ is in $W$, then
$$
\vw = (\vw \cdot \vq_1) \vq_1 + \cdots + (\vw \cdot \vq_k) \vq_k
$$

Note that any orthogonal basis can be converted to an orthonormal basis by dividing each vector by its length. E.g. $$ \kern-6ex \vq_1 = \frac{1}{\sqrt{6}} \colll 2 1 {-1},\quad \vq_2 = \frac{1}{\sqrt{2}} \colll 0 1 1,\quad \vq_3 = \frac{1}{\sqrt{3}} \colll 1 {-1} 1 $$ is an orthonormal basis for $\R^3$.
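The normalization step is mechanical, so it is easy to do by machine as well:

```python
import numpy as np

# Convert the orthogonal basis from Example 5.1 into an orthonormal one
# by dividing each vector by its length.
vs = [np.array([2, 1, -1]), np.array([0, 1, 1]), np.array([1, -1, 1])]
qs = [v / np.linalg.norm(v) for v in vs]

for q in qs:
    print(np.linalg.norm(q))   # each length is 1.0
```
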

**Question:** How many orthonormal bases are there for $\R^3$?

**Definition:** A square matrix $Q$ with real entries
whose columns form an orthonormal set is called an **orthogonal** matrix.
(Strange name!)

**Examples:** $A = \bmat{rrr} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \emat$
and $B = \bmat{rr} \cos \theta & -\sin \theta \\ \sin \theta & \cos \theta \emat$
and $C = [ \, \vq_1 \, \vq_2 \, \vq_3 \, ] =
\bmat{rrr}
\frac{2}{\!\sqrt{6}} & 0 \,\, & \frac{1}{\!\sqrt{3}} \\
\frac{1}{\!\sqrt{6}} & \ph\frac{1}{\!\sqrt{2}} & -\frac{1}{\!\sqrt{3}} \\
-\frac{1}{\!\sqrt{6}} & \frac{1}{\!\sqrt{2}} & \frac{1}{\!\sqrt{3}} \\
\emat^\strut$.

**Non-example:** $D = \bmat{rr} 1 & 2 \\ 3 & 4 \emat$ is not orthogonal.

**Note:** In $\R^2$ and $\R^3$, orthogonal matrices correspond
exactly to rotations and reflections. This is an important geometric
reason to study them. Another reason is that we will see in Section 5.4 that
they are related to diagonalization of symmetric matrices.

**Theorems 5.4 and 5.5:** $Q$ is orthogonal if and only if $Q^T Q = I$, i.e.
if and only if $Q$ is invertible and $Q^{-1} = Q^T$.

Note that $Q^T$ is much easier to calculate than $Q^{-1}$!

**Theorem 5.7:** If $Q$ is orthogonal, then its **rows** form
an orthonormal set too.

**Proof:** Since $Q^T Q = I$, we must also have $Q \, Q^T = I$.
But the last equation says exactly that the rows of $Q$ are orthonormal.$\quad\Box$

Another way to put it is that $Q^T$ is also an orthogonal matrix. Look at the examples again.
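For instance, here is a numerical check of both facts for the rotation matrix $B$ from the examples (with an arbitrarily chosen angle):

```python
import numpy as np

# The rotation matrix B from the examples, with an arbitrary angle theta.
theta = 0.7
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

# Q^T Q = I says the columns are orthonormal (Theorems 5.4 and 5.5);
# Q Q^T = I says the rows are orthonormal too (Theorem 5.7).
print(np.allclose(Q.T @ Q, np.eye(2)))   # True
print(np.allclose(Q @ Q.T, np.eye(2)))   # True
```
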

**Theorem 5.6:** Let $Q$ be an $n \times n$ matrix. Then the following
statements are equivalent:

a. $Q$ is orthogonal.

b. $\|Q \vx\| = \|\vx\|$ for every $\vx$ in $\R^n$.

c. $Q\vx \cdot Q\vy = \vx \cdot \vy$ for every $\vx$ and $\vy$ in $\R^n$.

For example, $T_B$ is a rotation, which preserves lengths and angles.
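Properties (b) and (c) are also easy to check numerically for the rotation matrix $B$, using randomly chosen vectors:

```python
import numpy as np

# An orthogonal matrix (the rotation B from the examples, arbitrary angle)
# preserves lengths and dot products, as in Theorem 5.6.
theta = 1.2
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

rng = np.random.default_rng(0)
x, y = rng.standard_normal(2), rng.standard_normal(2)

print(np.isclose(np.linalg.norm(Q @ x), np.linalg.norm(x)))   # True: ||Qx|| = ||x||
print(np.isclose((Q @ x) @ (Q @ y), x @ y))                   # True: Qx . Qy = x . y
```
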