## Announcements:

Read Appendix C (complex numbers) for next Monday. Work through recommended homework questions.

Midterm 2: next Thursday evening, 7-8:30 pm. Send me an e-mail message today if you have a conflict (even if you told me before midterm 1). Midterm 2 covers from Section 2.3 until the end of Chapter 3 (today), but builds on the earlier material as well. A practice exam is available from the course home page. Last name A-Q must write in NS1, R-Z in NS7. See the missed exam section of the course web page for policies, including for illness.

Tutorials: No tutorials this week! Review in tutorials next week.

Office hour: today, 12:30-1:30, MC103B.
Help Centers: Monday-Friday 2:30-6:30 in MC 106. (But probably not Thursday or Friday this week.)

Extra Linear Algebra Review Session: Tuesday, Nov 5, 4:30-6:30pm, MC110.

## Review of last lecture: Section 3.6: Linear Transformations

Given an $m \times n$ matrix $A$, we can use $A$ to transform a column vector in $\R^n$ into a column vector in $\R^m$. We write: $$T_A(\vx) = A \vx \quad\text{for \vx in \R^n}$$

Any rule $T$ that assigns to each $\vx$ in $\R^n$ a unique vector $T(\vx)$ in $\R^m$ is called a transformation from $\R^n$ to $\R^m$ and is written $T : \R^n \to \R^m$.

Definition: A transformation $T : \R^n \to \R^m$ is called a linear transformation if:
1. $T(\vu + \vv) = T(\vu) + T(\vv)$ for all $\vu$ and $\vv$ in $\R^n$, and
2. $T(c \vu) = c \, T(\vu)$ for all $\vu$ in $\R^n$ and all scalars $c$.

Theorem 3.30: Let $A$ be an $m \times n$ matrix. Then $T_A : \R^n \to \R^m$ is a linear transformation.

Theorem 3.31: Let $T : \R^n \to \R^m$ be a linear transformation. Then $T = T_A$, where $$A = [\, T(\ve_1) \mid T(\ve_2) \mid \cdots \mid T(\ve_n) \,]$$

The matrix $A$ is called the standard matrix of $T$ and is written $[T]$.

Example 3.58: Let $R_\theta : \R^2 \to \R^2$ be rotation by an angle $\theta$ counterclockwise about the origin. Show that $R_\theta$ is linear and find its standard matrix.

Solution: A geometric argument shows that $R_\theta$ is linear.

To find the standard matrix, we note that $$R_\theta \coll 1 0 = \coll {\cos \theta} {\sin \theta} \qqtext{and} R_\theta \coll 0 1 = \coll {-\sin \theta} {\cos \theta}$$ Therefore, the standard matrix of $R_\theta$ is $\bmat{rr} \cos \theta & -\sin \theta \\ \sin \theta & \cos \theta \emat$.

### New linear transformations from old

If $T : \R^m \to \R^\red{n}$ and $S : \R^\red{n} \to \R^p$, then $S(T(\vx))$ makes sense for $\vx$ in $\R^m$. The composition of $S$ and $T$ is the transformation $S \circ T : \R^m \to \R^p$ defined by $$(S \circ T)(\vx) = S(T(\vx)) .$$ If $S$ and $T$ are linear, it is easy to check that this new transformation $S \circ T$ is automatically linear.

Theorem 3.32: $[S \circ T] = [S][T]$.

Example: It is geometrically clear that $R_\theta \circ R_\phi = R_{\theta+\phi}$.

## New material

Note that $R_0$ is rotation by zero degrees, so $R_0(\vx) = \vx$. We say that $R_0$ is the identity transformation, which we write $I : \R^2 \to \R^2$. Similarly, $R_{360} = I$.

Since $R_{120} \circ R_{120} \circ R_{120} = R_{360} = I$, we must have $[R_{120}]^3 = [I] = I$. Thus $[R_{120}] = \bmat{rr} -1/2 & -\sqrt{3}/2 \\ \sqrt{3}/2 & -1/2 \emat$ is an answer to the challenge problem from Lecture 16.

Our new point of view about matrix multiplication gives us a geometrical way to understand it!

### Inverses of Linear Transformations

Since $R_{60} \circ R_{-60} = R_0 = I$, it follows that $[R_{60}] [R_{-60}] = I$. So $[R_{-60}] = [R_{60}]^{-1}$. See Example 3.62 for details.

Definition: Let $S$ and $T$ be linear transformations from $\R^\red{n}$ to $\R^\red{n}$. Then $S$ and $T$ are inverse transformations if $S \circ T = I$ and $T \circ S = I$. When this is the case, we say that $S$ and $T$ are invertible and are inverses.

The same argument as for matrices shows that an inverse is unique when it exists, so we write $S = T^{-1}$ and $T = S^{-1}$.

Theorem 3.33: Let $T : \R^n \to \R^n$ be a linear transformation. Then $T$ is invertible if and only if $[T]$ is an invertible matrix. In this case, $[T^{-1}] = [T]^{-1}$.

The argument is easy and is essentially what we did for $R_{60}$.

Question: Is projection onto the $x$-axis invertible?

Question: Is reflection in the $x$-axis invertible?

Question: Is translation a linear transformation?

### Section 3.7: Markov Chains

Example 3.64: 200 people are testing two brands of toothpaste, Brand A and Brand B. Each month they are allowed to switch brands. The research firm observes the following:
• Of those using Brand A in a given month, 70% continue in the following month and 30% switch to B.
• Of those using Brand B in a given month, 80% continue in the following month and 20% switch to A.

This is called a Markov chain. There are definite states, and from each state there is a transition probability for moving to another state and each time step. These probabilities are constant and depend only on the current state.

Suppose at the start that 120 people use Brand A and 80 people use Brand B. Then, in the next month, $$0.70(120) + 0.20(80) = 100 \qtext{will use Brand A}$$ and $$0.30(120) + 0.80(80) = 100 \qtext{will use Brand B}$$ This is a matrix equation: $$\bmat{ll} 0.70 & 0.20 \\ 0.30 & 0.80 \emat \coll {120} {80} = \coll {100} {100}$$ Write $P$ for the transition matrix and $\vx_k$ for the state vector after $k$ months have gone by. Then $\vx_{k+1} = P \vx_k$. So $$\vx_2 = P \vx_1 = \bmat{ll} 0.70 & 0.20 \\ 0.30 & 0.80 \emat \coll {100} {100} = \coll {90} {110}$$ and we see that there are 90 people using Brand A and 110 using Brand B after 2 months.

We can also work with the percentage of people using each brand. Then $\vx_0 = \coll {120/200} {80/200} = \coll {0.60} {0.40}$ and $P\vx_0 = \coll {0.50} {0.50}$. Vectors with non-negative components that sum to 1 are called probability vectors

Note that $P$ is a stochastic matrix: this means that it is square and that each column is a probability vector.

The columns of $P$ correspond to the current state and the rows correspond to the next state. The entry $P_{ij}$ is the probability that you transition from state $j$ to state $i$ in one time step, where we now label the states with numbers.

Multiple steps: Can we compute the probability that we go from state $j$ to state $i$ in two steps? Well, $x_{k+2} = P x_{k+1} = P^2 x_k$, so the matrix $P^2$ computes this transition:

$\ P^2 = \bmat{ll} 0.7 & 0.2 \\ 0.3 & 0.8 \emat \bmat{ll} 0.7 & 0.2 \\ 0.3 & 0.8 \emat = \bmat{ll} 0.55 & 0.30 \\ 0.45 & 0.70 \emat$

So the probability of going from Brand A to Brand B after two steps is $(P^2)_{21} = 0.45 = 0.21+0.24$.

More generally, $(P^k)_{ij}$ is the probability of going from state $\red{j}$ to state $\red{i}$ in $k$ steps.

Long-term behaviour: By multiplying by $P$, you can show that the state evolves as follows: \kern-5ex \begin{aligned} &\coll {0.60} {0.40}, \coll {0.50} {0.50}, \coll {0.45} {0.55}, \coll {0.425} {0.575}, \coll {0.412} {0.588}, \coll {0.406} {0.594},\\ &\coll {0.403} {0.597}, \coll {0.402} {0.598}, \coll {0.401} {0.599}, \coll {0.400} {0.600}, \coll {0.400} {0.600}, \dots \end{aligned} with 40% of the people using Brand A in the long run. Since $$\bmat{ll} 0.70 & 0.20 \\ 0.30 & 0.80 \emat \coll {0.4} {0.6} = \coll {0.4} {0.6}$$ once we reach this state, we don't leave. A state $\vx$ with $P \vx = \vx$ is called a steady state vector. We'll prove below that every Markov chain has a steady state vector!

Here's how to find it. We want to find $\vx$ such that $(I - P)\vx = \vec 0$. The augmented system is $$[I - P \mid \vec 0] = \bmat{rr|r} 0.30 & -0.20 & 0 \\ -0.30 & 0.20 & 0 \emat$$ which reduces to $$\bmat{rr|r} 1 & -2/3 & 0 \\ 0 & 0 & 0 \emat$$ The solution is $$x_1 = \frac{2}{3} t,\quad x_2 = t$$ We'd like a probability vector, so $\frac{2}{3} t + t = 1$ which means that $t = 3/5$. This gives $\vx = \coll {0.4} {0.6}$ as we found above.

Theorem: Every Markov chain has a steady state vector.

Proof: Let $P$ be the transition matrix. We want to find a non-trivial solution to $(I - P)\vx = \vec 0$. By the fundamental theorem of invertible matrices and the fact that $\rank (I - P) = \rank ((I - P)^T)$, this is equivalent to $(I - P)^T \vx = \vec 0$ having a non-trivial solution. That is, finding a non-trivial $\vx$ such that $$P^T \vx = \vx \qtext{(since I^T = I).}$$ But since $P$ is a stochastic matrix, we always have $$P^T \colll 1 {\vdots} 1 = \colll 1 {\vdots} 1$$ So therefore $P \vx = \vx$ also has a (different) non-trivial solution.$\qquad\Box$

Note: A Markov chain can have more than two states. Example 3.65 in the text is a good example of a Markov chain with three states. On whiteboard.

In Chapter 4 we'll study Markov chains again.

I have time to answer questions after class, and my office hour is 12:30-1:30 in MC103B.