Math 1600 Lecture 32, Section 2, 21 Nov 2014

$ \newcommand{\bdmat}[1]{\left|\begin{array}{#1}} \newcommand{\edmat}{\end{array}\right|} \newcommand{\bmat}[1]{\left[\begin{array}{#1}} \newcommand{\emat}{\end{array}\right]} \newcommand{\coll}[2]{\bmat{r} #1 \\ #2 \emat} \newcommand{\ccoll}[2]{\bmat{c} #1 \\ #2 \emat} \newcommand{\colll}[3]{\bmat{r} #1 \\ #2 \\ #3 \emat} %\def\colll #1 #2 #3{\bmat{r} #1 \\ #2 \\ #3 \emat} \newcommand{\ccolll}[3]{\bmat{c} #1 \\ #2 \\ #3 \emat} \newcommand{\collll}[4]{\bmat{r} #1 \\ #2 \\ #3 \\ #4 \emat} \newcommand{\ccollll}[4]{\bmat{c} #1 \\ #2 \\ #3 \\ #4 \emat} \newcommand{\colllll}[5]{\bmat{r} #1 \\ #2 \\ #3 \\ #4 \\ #5 \emat} \newcommand{\ccolllll}[5]{\bmat{c} #1 \\ #2 \\ #3 \\ #4 \\ #5 \emat} \newcommand{\red}[1]{{\color{red}#1}} \newcommand{\lra}[1]{\mbox{$\xrightarrow{#1}$}} \newcommand{\rank}{\textrm{rank}} \newcommand{\row}{\textrm{row}} \newcommand{\col}{\textrm{col}} \newcommand{\null}{\textrm{null}} \newcommand{\nullity}{\textrm{nullity}} \renewcommand{\Re}{\operatorname{Re}} \renewcommand{\Im}{\operatorname{Im}} \renewcommand{\Arg}{\operatorname{Arg}} \renewcommand{\arg}{\operatorname{arg}} \newcommand{\adj}{\textrm{adj}} \newcommand{\mystack}[2]{\genfrac{}{}{0}{0}{#1}{#2}} \newcommand{\mystackthree}[3]{\mystack{\mystack{#1}{#2}}{#3}} \newcommand{\qimplies}{\quad\implies\quad} \newcommand{\qtext}[1]{\quad\text{#1}\quad} \newcommand{\qqtext}[1]{\qquad\text{#1}\qquad} \newcommand{\smalltext}[1]{{\small\text{#1}}} \newcommand{\svec}[1]{\,\vec{#1}} \newcommand{\querytext}[1]{\toggle{\text{?}\vphantom{\text{#1}}}{\text{#1}}\endtoggle} \newcommand{\query}[1]{\toggle{\text{?}\vphantom{#1}}{#1}\endtoggle} \newcommand{\smallquery}[1]{\toggle{\text{?}}{#1}\endtoggle} \newcommand{\bv}{\mathbf{v}} %\require{AMScd} $


Today we finish 4.6 and start Section 5.1. Continue reading Section 5.1 for next class, and start reading Section 5.2. Work through recommended homework questions.

Help Centers: Monday-Friday 2:30-6:30 in MC 106.

Review: Section 4.6: Markov chains:

Theorem 4.30: Every stochastic matrix has a steady state vector, i.e. it has $\lambda = 1$ as an eigenvalue.

Theorem 4.31: Let $P$ be an $n \times n$ stochastic matrix. Then every eigenvalue $\lambda$ has $|\lambda| \leqslant 1$.

If in addition the entries of $P$ are all positive, then all eigenvalues besides $\lambda = 1$ have $|\lambda| < 1$.

Theorem 4.33: Let $P$ be an $n \times n$ stochastic matrix all of whose entries are positive. Then as $k \to \infty$, $P^k \to L$, a matrix all of whose columns are equal to the same vector $\vx$ which is a steady state probability vector for $P$.

New material:

Proof of 4.33: We'll assume $P$ is diagonalizable: $Q^{-1} P Q = D$. So $P^k = Q D^k Q^{-1}$. As $k \to \infty$, $D^k$ approaches a matrix $D^*$ with $1$'s and $0$'s on the diagonal (by Theorem 4.31), which means that $P^k$ approaches $L = Q D^* Q^{-1}$.

Now that we know that $P^k$ has some limit $L$, we can deduce something about it. Since $\lim_{k \to \infty} P^k = L$, we have $$ PL = P\lim_{k \to \infty} P^k = \lim_{k \to \infty} P^{k+1} = L $$ This means that the columns of $L$ must be steady-state vectors for $P$. Since the columns of $P^k$ are probability vectors, the same must be true of the columns of $L$. It's not hard to show that $P$ has a unique steady-state probability vector $\vx$, so $L = [\, \vx \, \vx \, \cdots \, \vx \, ]$, as required.$\quad\Box$

Finally, we can deduce that Markov chains tend to their steady states:

Theorem 4.34: Let $P$ be an $n \times n$ stochastic matrix all of whose entries are positive, and let $\vx_0$ be any initial probability vector. Then as $k \to \infty$, $\vx_k \to \vx$, where $\vx$ is the steady state probability vector for $P$.

Proof: Suppose that $\vx_0$ has components $x_1, x_2, \ldots, x_n$. Then $$ \begin{aligned} \lim_{k \to \infty} \vx_k &= \lim_{k \to \infty} P^k \vx_0 = L \vx_0 \\ &= [\, \vx \, \vx \, \cdots \, \vx \, ] \vx_0 \\ &= x_1 \vx + x_2 \vx + \cdots + x_n \vx \\ &= (x_1 + x_2 + \cdots + x_n ) \vx = \vx \quad\Box \end{aligned} $$

This result works both ways: if you compute the eigenvector with eigenvalue 1, that tells you the steady-state vector that other states go to as $k \to \infty$. But it also means that if you don't know the steady-state vector, you can approximate it by starting with any vector $\vx_0$ and computing $P^k \vx_0$ for large $k$!

The latter is what Google does to compute the page rank eigenvector.

Note: A transition matrix $P$ is regular if some power $P^k$ of it has positive entries. This means that there is a nonzero probably to get from any starting state to any ending state after some number of steps. The text proves the above results for regular matrices, which is not hard once you know them for matrices with positive entries.

Question: Let $P = \bmat{rr} 1/2 & 1/3 \\ 1/2 & 2/3 \emat$. (a) Find a clever way to figure out the eigenvalues of $P$ and determine whether it is diagonalizable.

(b) Compute $P^{10} \vx_0$, where $\vx_0 = \coll {1/2} {1/2}$.

Section 5.1: Orthogonality in $\R^n$:

Chapter 5 is all about orthogonality. We'll learn about (1) orthogonal projections onto planes and other subspaces; (2) how easy it is to work with an orthogonal basis; (3) how to orthogonally diagonalize matrices; etc.

Recall that vectors $\vu$ and $\vv$ are orthogonal if $\vu \cdot \vv = 0$.

Definition: A set $\{ \vv_1, \vv_2, \ldots, \vv_k \}$ of vectors in $\R^n$ is an orthogonal set if all pairs of distinct vectors in the set are orthogonal, i.e. $\vv_i \cdot \vv_j = 0$ whenever $i \neq j$.

Example 5.1: The following vectors form an orthogonal set: $$ \vv_1 = \colll 2 1 {-1} ,\ \vv_2 = \colll 0 1 1,\ \vv_3 = \colll 1 {-1} 1 $$

Theorem 5.1: If $\{ \vv_1, \vv_2, \ldots, \vv_k \}$ is an orthogonal set of nonzero vectors in $\R^n$, then these vectors are linearly independent.


Definition: An orthogonal basis for a subspace $W$ of $\R^n$ is a basis that is also orthogonal.

For example, the vectors in Example 5.1 automatically form a basis for $\R^3$. Checking orthogonality is much easier than checking linear independence!

Fact: We'll show in Section 5.3 that every subspace has an orthogonal basis.

It is very easy to figure out coordinates with respect to an orthogonal basis. Suppose $\{ \vv_1, \vv_2, \ldots, \vv_k \}$ is an orthogonal basis and $$ \vw = c_1 \vv_1 + \cdots + c_k \vv_k $$ Then if we dot both sides with $\vv_1$, we find $$ \vw \cdot \vv_1 = c_1 (\vv_1 \cdot \vv_1) $$ and so $$ c_1 = \frac{\vw \cdot \vv_1}{\vv_1 \cdot \vv_1} $$

Theorem 5.2: Let $\{ \vv_1, \vv_2, \ldots, \vv_k \}$ be an orthogonal basis for a subspace $W$ of $\R^n$, and let $\vw$ be any vector in $W$. Then the unique scalars $c_1, c_2, \ldots, c_k$ such that $$ \vw = c_1 \vv_1 + \cdots + c_k \vv_k $$ are given by the formula $$ c_i = \frac{\vw \cdot \vv_i}{\vv_i \cdot \vv_i} $$

Example: Find the coordinates of $\vw = \colll 1 2 3$ with respect to the basis $$ \vv_1 = \colll 2 1 {-1} ,\ \vv_2 = \colll 0 1 1,\ \vv_3 = \colll 1 {-1} 1 $$ Solution: The coordinates are $$ \kern-9ex c_1 = \frac{\vw \cdot \vv_1}{\vv_1 \cdot \vv_1} = \frac{1}{6}, \quad c_2 = \frac{\vw \cdot \vv_2}{\vv_2 \cdot \vv_2} = \frac{5}{2}, \quad c_3 = \frac{\vw \cdot \vv_3}{\vv_3 \cdot \vv_3} = \frac{2}{3} $$ So $$ \vw = \frac{1}{6} \vv_1 + \frac{5}{2} \vv_2 + \frac{2}{3} \vv_3 $$ Normally you have to solve a system to figure out the coordinates!

Definition: An orthonormal set is an orthogonal set of unit vectors. An orthonormal basis for a subspace $W$ is a basis for $W$ that is an orthonormal set.

The condition of being orthonormal can be expressed as $$ \vv_i \cdot \vv_j = \begin{cases} 0 & \text{if } i \neq j \\ 1 & \text{if } i = j \end{cases} $$

Example: The standard basis $\{ \ve_1, \ldots, \ve_n \}$ is an orthonormal basis for $\R^n$.

For an orthonormal basis, the formula for the coordinates simplifies:

Theorem 5.3: If $\{ \vq_1, \vq_2, \ldots, \vq_k \}$ is an orthonormal basis of a subspace $W$, and $\vw$ is in $W$, then $$ \vw = (\vw \cdot \vq_1) \vq_1 + \cdots + (\vw \cdot \vq_k) \vq_k $$

Note that any orthogonal basis can be converted to an orthonormal basis by dividing each vector by its length. E.g. $$ \kern-6ex \vq_1 = \frac{1}{\sqrt{6}} \colll 2 1 {-1},\quad \vq_2 = \frac{1}{\sqrt{2}} \colll 0 1 1,\quad \vq_3 = \frac{1}{\sqrt{3}} \colll 1 {-1} 1 $$ is an orthonormal basis for $\R^3$.

Question: How many orthonormal bases are there for $\R^3$?

Orthogonal Matrices

Definition: A square matrix $Q$ with real entries whose columns form an orthonormal set is called an orthogonal matrix. (Strange name!)

Examples: $A = \bmat{rrr} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \emat$ and $B = \bmat{rr} \cos \theta & -\sin \theta \\ \sin \theta & \cos \theta \emat$ and $C = [ \, \vq_1 \, \vq_2 \, \vq_3 \, ] = \bmat{rrr} \frac{2}{\!\sqrt{6}} & 0 \,\, & \frac{1}{\!\sqrt{3}} \\ \frac{1}{\!\sqrt{6}} & \ph\frac{1}{\!\sqrt{2}} & -\frac{1}{\!\sqrt{3}} \\ -\frac{1}{\!\sqrt{6}} & \frac{1}{\!\sqrt{2}} & \frac{1}{\!\sqrt{3}} \\ \emat^\strut$.

Non-example: $D = \bmat{rr} 1 & 2 \\ 3 & 4 \emat$ is not orthogonal.

Note: In $\R^2$ and $\R^3$, orthogonal matrices correspond exactly to rotations and reflections. This is an important geometric reason to study them. Another reason is that we will see in Section 5.4 that they are related to diagonalization of symmetric matrices.

Theorems 5.4 and 5.5: $Q$ is orthogonal if and only if $Q^T Q = I$, i.e. if and only if $Q$ is invertible and $Q^{-1} = Q^T$.


Note that $Q^T$ is much easier to calculate than $Q^{-1}$!

Theorem 5.7: If $Q$ is orthogonal, then its rows form an orthonormal set too.

Proof: Since $Q^T Q = I$, we must also have $Q \, Q^T = I$. But the last equation says exactly that the rows of $Q$ are orthonormal.$\quad\Box$

Another way to put it is that $Q^T$ is also an orthogonal matrix. Look at the examples again.

Theorem 5.6: Let $Q$ be an $n \times n$ matrix. Then the following statements are equivalent:
a. $Q$ is orthogonal.
b. $\|Q \vx\| = \|\vx\|$ for every $\vx$ in $\R^n$.
c. $Q\vx \cdot Q\vy = \vx \cdot \vy$ for every $\vx$ and $\vy$ in $\R^n$.

For example, $T_B$ is rotation, which preserves lengths and angles.

Partial proof of 5.6: