Announcements:

Today we finish Section 5.4 and finish the course material. Read the whole book for Monday. Work through recommended homework questions and more.

Tutorials: This week: review. Bring questions.
Office hour: Wednesday, 12:30-1:30, MC103B.
Help Centers: Monday-Friday 2:30-6:30 in MC 106.
Review Session: Friday in class; bring questions. Also, Sunday, Dec 8, 10am-11am, in MC110.

Final exam: Covers whole course, with an emphasis on the material in Chapters 4 and 5 (after the second midterm). It does not cover $\Z_m$, code vectors, Markov chains or network analysis. Everything else we covered in class is considered exam material. Questions are similar to textbook questions, midterm questions and quiz questions.

Section 5.4: Orthogonal Diagonalization of Symmetric Matrices

In Section 4.4 we learned all about diagonalizing a square matrix $A$. One of the difficulties that arose is that a matrix with real entries can have complex eigenvalues. In this section, we focus on the case where $A$ is a symmetric matrix, and we will show that the eigenvalues of $A$ are always real and that $A$ is always diagonalizable!

Symmetric matrices are important in applications. For example, in quantum theory, they correspond to observable quantities.

Recall that a square matrix $A$ is symmetric if $A^T = A$.

Examples: $\bmat{rr} 1 & 2 \\ 2 & 3 \emat$, $\bmat{rr} 3 & 2 \\ 2 & 3 \emat$, $\bmat{rr} 1 & 0 \\ 0 & 3 \emat$, $\bmat{rrr} 1 & 2 & 3 \\ 2 & 4 & 5 \\ 3 & 5 & 6 \emat$.

Non-examples: $\bmat{rr} 3 & -2 \\ 2 & 3 \emat$, $\bmat{rrr} 1 & 2 & 3 \\ 5 & 4 & 5 \\ 3 & 2 & 6 \emat$.

New material

Example 5.16: If possible, diagonalize $A = \bmat{rr} 1 & 2 \\ 2 & -2 \emat$. On whiteboard.

Definition: A square matrix $A$ is orthogonally diagonalizable if there exists an orthogonal matrix $Q$ such that $Q^T A Q$ is a diagonal matrix $D$.

Notice that if $A$ is orthogonally diagonalizable, then $Q^T A Q = D$, so $A = Q D Q^T$. Therefore $$A^T = (Q D Q^T)^T = (Q^T)^T D^T Q^T = Q D Q^T = A.$$ We have proven:

Theorem 5.17: If $A$ is orthogonally diagonalizable, then $A$ is symmetric.

The rest of this section is working towards proving that every symmetric matrix $A$ is orthogonally diagonalizable. I'll organize this a bit more efficiently than the textbook.

Theorem 5.19: If $A$ is a symmetric matrix, then eigenvectors corresponding to distinct eigenvalues of $A$ are orthogonal.

In non-symmetric examples we've seen earlier, the eigenvectors were not orthogonal.

Proof: Suppose $\vv_1$ and $\vv_2$ are eigenvectors corresponding to distinct eigenvalues $\lambda_1$ and $\lambda_2$. Then we have \kern-6ex \begin{aligned} \lambda_1 (\vv_1 \cdot \vv_2) &= (\lambda_1 \vv_1) \cdot \vv_2 = (A \vv_1) \cdot \vv_2 = (A \vv_1)^T \vv_2 \\ &= \vv_1^T A^T \vv_2 = \vv_1^T (A \vv_2) = \vv_1^T \lambda_2 \vv_2 = \lambda_2 (\vv_1 \cdot \vv_2) \end{aligned} So $(\lambda_1 - \lambda_2) (\vv_1 \cdot \vv_2) = 0$, which implies that $\vv_1 \cdot \vv_2 = 0$.$\quad\Box$

Theorem 5.18: If $A$ is a real symmetric matrix, then the eigenvalues of $A$ are real.

To prove this, we have to recall some facts about complex numbers. If $z = a + bi$, then its complex conjugate is $\bar{z} = a - bi$, which is the reflection in the real axis. So $z$ is real if and only if $z = \bar{z}$.

Proof: Suppose that $\lambda$ is an eigenvalue of $A$ with eigenvector $\bv$. Then the complex conjugate $\bar{\bv}$ is an eigenvector with eigenvalue $\bar{\lambda}$, since $$A \bar{\bv} = \bar{A} \bar{\bv} = \overline{A \bv} = \overline{\lambda \bv} = \bar{\lambda} \bar{\bv} .$$ If $\lambda \neq \bar{\lambda}$, then Theorem 5.19 shows that $\bv \cdot \bar{\bv} = 0$.

But if $\bv = \ccolll {z_1} {\vdots} {z_n}$ then $\bar{\bv} = \ccolll {\bar{z}_1} {\vdots} {\bar{z}_n}$ and so $$\bv \cdot \bar{\bv} = z_1 \bar{z}_1 + \cdots + z_n \bar{z}_n = |z_1|^2 + \cdots + |z_n|^2 \neq 0$$ since $\bv \neq \vec 0$. Therefore, $\lambda = \bar{\lambda}$, so $\lambda$ is real. $\quad\Box$

Example 5.17 and 5.18: The eigenvalues of $A = \bmat{rrr} 2 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 2 \emat$ are $4$ and $1$, with eigenspaces $$E_4 = \span(\colll 1 1 1) \qtext{and} E_1 = \span(\colll {-1} 0 1, \colll {-1} 1 0)$$ We see that every vector in $E_1$ is orthogonal to every vector in $E_4$. (In fact, $E_1 = E_4^\perp$.)

But notice that the vectors in $E_1$ aren't necessarily orthogonal to each other. However, we can apply Gram-Schmidt to get an orthogonal basis for $E_1$: \begin{aligned} \vv_1 &= \vx_1 = \colll {-1} 0 1 \\ \vv_2 &= \vx_2 - \frac{\vv_1 \cdot \vx_2}{\vv_1 \cdot \vv_1} \vv_1 \\ &= \colll {-1} 1 0 - \frac{1}{2} \colll {-1} 0 1 = \ccolll {-1/2} 1 {-1/2} \end{aligned} We normalize the three basis eigenvectors and put them in the columns of a matrix $Q = \bmat{ccc} 1/\sqrt{3} & -1/\sqrt{2} & -1/\sqrt{6} \\ 1/\sqrt{3} & 0 & \ph 2/\sqrt{6} \\ 1/\sqrt{3} & -1/\sqrt{2} & -1/\sqrt{6} \\ \emat .$ Then $Q^T A Q = \bmat{rrr} 4 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \emat$, so $A$ is orthogonally diagonalizable.

The spectral theorem

The set of eigenvalues of a matrix are called its spectrum because the spectral lines you see when light from an atom is sent through a prism correspond to the eigenvalues of a certain matrix.

Theorem 5.20 (The spectral theorem): Let $A$ be an $n \times n$ real matrix. Then $A$ is symmetric if and only if $A$ is orthogonally diagonalizable.

Proof: We have seen that every orthogonally diagonalizable matrix is symmetric.

We also know that if $A$ is symmetric, then it's eigenvectors for distinct eigenvalues are orthogonal. So, by using Gram-Schmidt on the eigenvectors with the same eigenvalue, we get an orthogonal set of eigenvectors.

The only thing that isn't clear is that we get $n$ eigenvectors. The argument here is a bit complicated. See the text. $\quad\Box$.

Method for orthogonally diagonalizing a real symmetric $n \times n$ matrix A:
1. Find all eigenvalues. They will all be real, and the algebraic multiplicities will add up to $n$.
2. Find a basis for each eigenspace.
3. If an eigenspace has dimension greater than one, use Gram-Schmidt to create an orthogonal basis of that eigenspace.
4. Normalize all basis vectors. Put them in the columns of $Q$, and make the eigenvalues (in the same order) the diagonal entries of a diagonal matrix $D$.
5. Then $Q^T A Q = D$.

Note that $A$ can be expressed in terms of its eigenvectors $\vq_1, \ldots, \vq_n$ and eigenvalues $\lambda_1, \ldots, \lambda_n$ (repeated according to their multiplicity) as \kern-7ex \begin{aligned} A &= Q D Q^T = [\, \vq_1 \cdots \vq_n \, ] \bmat{ccc} \lambda_1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \lambda_n \emat \ccolll {\vq_1^T} {\vdots} {\vq_n^T} \\ &= [\, \lambda_1 \vq_1 \cdots \lambda_n \vq_n \, ] \ccolll {\vq_1^T} {\vdots} {\vq_n^T} \\ &= \lambda_1 \vq_1 \vq_1^T + \lambda_2 \vq_2 \vq_2^T + \cdots + \lambda_n \vq_n \vq_n^T \end{aligned} This is called the spectral decomposition of $A$.

Note that the $n \times n$ matrix $\vq_1 \vq_1^T$ sends a vector $\vx$ to $\vq_1 \vq_1^T \vx = (\vq_1 \cdot \vx) \vq_1 = \proj_{\vq_1}(\vx)$, so it is orthogonal projection onto $\span(\vq_1)$. Thus you can compute $A \vx$ by projecting $\vx$ onto each $\vq_i$, multiplying by $\lambda_i$, and adding the results.

Example 5.20: Find a $2 \times 2$ matrix with eigenvalues 3 and -2 and corresponding eigenvectors $\coll 3 4$ and $\coll {-4} 3$.

Method 1: Let $P = \bmat{rr} 3 & -4 \\ 4 & 3 \emat$ and $D = \bmat{rr} 3 & 0 \\ 0 & -2 \emat$. Then \begin{aligned} A &= P D P^{-1} = \bmat{rr} 3 & -4 \\ 4 & 3 \emat \bmat{rr} 3 & 0 \\ 0 & -2 \emat \bmat{rr} 3 & -4 \\ 4 & 3 \emat^{-1} \\ &= \bmat{rr} 9 & 8 \\ 12 & -6 \emat \frac{1}{25} \bmat{rr} 3 & 4 \\ -4 & 3 \emat \\ &= \frac{1}{25} \bmat{rr} -5 & 60 \\ 60 & -30 \emat = \bmat{rr} -1/5 & 12/5 \\ 12/5 & -6/5 \emat \end{aligned} This didn't use anything from this section and works for any diagonalizable matrix.

Method 2: First normalize the eigenvectors to have length 1. Then use the spectral decomposition: \kern-8ex \begin{aligned} A &= \lambda_1 \vq_1 \vq_1^T + \lambda_2 \vq_2 \vq_2^T \\ &= 3 \coll {3/5} {4/5} \bmat{rr} 3/5 & 4/5 \emat -2 \coll {-4/5} {3/5} \bmat{rr} -4/5 & 3/5 \emat \\ &= 3 \bmat{rr} 9/25 & 12/25 \\ 12/25 & 16/25 \emat -2 \bmat{rr} 16/25 & -12/25 \\ -12/25 & 9/25 \emat = \bmat{rr} -1/5 & 12/5 \\ 12/5 & -6/5 \emat \end{aligned} This method only works because the given vectors are orthogonal.

See Example 5.19 in the text for another example.