Section 27 Orthogonal Diagonalization
Focus Questions
By the end of this section, you should be able to give precise and thorough answers to the questions listed below. You may want to keep these questions in mind to focus your thoughts as you complete the section.
What does it mean for a matrix to be orthogonally diagonalizable and why is this concept important?
What is a symmetric matrix and what important property related to diagonalization does a symmetric matrix have?
What is the spectrum of a matrix?
Subsection Application: The Multivariable Second Derivative Test
In single variable calculus, we learn that the second derivative can be used to classify a critical point (a point where the derivative of a function is 0) as a local maximum or minimum.
Theorem 27.1. The Second Derivative Test for Single-Variable Functions.
If $c$ is a critical number of a function $f$ so that $f'(c) = 0$ and if $f''(c)$ exists, then
if $f''(c) < 0$, then $f(c)$ is a local maximum value of $f$;
if $f''(c) > 0$, then $f(c)$ is a local minimum value of $f$; and
if $f''(c) = 0$, this test yields no information.
In the two-variable case we have an analogous test, which is usually seen in a multivariable calculus course.
Theorem 27.2. The Second Derivative Test for Functions of Two Variables.
Suppose $(a,b)$ is a critical point of the function $f$ for which $f_x(a,b) = 0$ and $f_y(a,b) = 0$. Let $D$ be the quantity defined by
$$D = f_{xx}(a,b)\,f_{yy}(a,b) - f_{xy}(a,b)^2.$$
If $D > 0$ and $f_{xx}(a,b) < 0$, then $f$ has a local maximum at $(a,b)$.
If $D > 0$ and $f_{xx}(a,b) > 0$, then $f$ has a local minimum at $(a,b)$.
If $D < 0$, then $f$ has a saddle point at $(a,b)$.
If $D = 0$, then this test yields no information about what happens at $(a,b)$.
A proof of this test for two-variable functions is based on Taylor polynomials and relies on symmetric matrices, eigenvalues, and quadratic forms. The steps of a proof can be found in the project at the end of this section.
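Although the proof is deferred, the test itself is easy to apply mechanically. The following sketch (our own illustration, not part of the text's development) encodes Theorem 27.2 in Python; the example function $f(x,y) = x^3 - 3x + y^2$ is our own choice.

```python
def classify_critical_point(fxx, fyy, fxy):
    """Apply Theorem 27.2 given the second partials of f at a critical point."""
    D = fxx * fyy - fxy**2
    if D > 0 and fxx < 0:
        return "local maximum"
    if D > 0 and fxx > 0:
        return "local minimum"
    if D < 0:
        return "saddle point"
    return "no information"  # D == 0

# For f(x, y) = x**3 - 3*x + y**2 the critical points are (1, 0) and (-1, 0),
# with f_xx = 6x, f_yy = 2, and f_xy = 0.
print(classify_critical_point(6, 2, 0))   # at (1, 0): local minimum
print(classify_critical_point(-6, 2, 0))  # at (-1, 0): saddle point
```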
Subsection Introduction
We have seen how to diagonalize a matrix: if we can find $n$ linearly independent eigenvectors of an $n \times n$ matrix $A$ and let $P$ be the matrix whose columns are those eigenvectors, then $P^{-1}AP$ is a diagonal matrix with the eigenvalues down the diagonal in the same order as the corresponding eigenvectors appear as columns of $P$. We will see that in certain cases we can take this one step further and create an orthogonal matrix with eigenvectors as columns to diagonalize a matrix. This is called orthogonal diagonalization. Orthogonal diagonalizability is useful in that it allows us to find a “convenient” coordinate system in which to interpret the results of certain matrix transformations. An orthonormal basis of eigenvectors of an orthogonally diagonalizable matrix $A$ is called a set of principal axes for $A$. Orthogonal diagonalization will also play a crucial role in the singular value decomposition of a matrix, a decomposition that has been described by some as the “pinnacle” of linear algebra.
Definition 27.3.
An $n \times n$ matrix $A$ is orthogonally diagonalizable if there is an orthogonal matrix $P$ such that
$$P^{T}AP$$
is a diagonal matrix. We say that the matrix $P$ orthogonally diagonalizes the matrix $A$.
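As a computational aside (our own addition, assuming NumPy is available), the routine numpy.linalg.eigh produces exactly the ingredients of Definition 27.3 for a symmetric matrix: real eigenvalues and an orthogonal matrix $P$ of eigenvectors. The matrix $A$ below is an illustrative choice.

```python
import numpy as np

# An illustrative symmetric matrix (our own example).
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# eigh is designed for symmetric matrices: it returns real eigenvalues and
# a matrix P whose columns are orthonormal eigenvectors.
eigenvalues, P = np.linalg.eigh(A)

print(np.allclose(P.T @ P, np.eye(2)))  # True: P is orthogonal
print(np.round(P.T @ A @ P, 10))        # diagonal, eigenvalues on the diagonal
```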
Preview Activity 27.1.
(a)
For each matrix $A$ whose eigenvalues and corresponding eigenvectors are given, find a matrix $P$ such that $P^{-1}AP$ is a diagonal matrix.
(i)
with eigenvalues and 3 and corresponding eigenvectors and
(ii)
with eigenvalues and and corresponding eigenvectors and
(iii)
with eigenvalues and and corresponding eigenvectors and
(b)
Which matrices in part (a) seem to satisfy the orthogonal diagonalization requirement? Do you notice any common traits among these matrices?
Subsection Symmetric Matrices
As we saw in Preview Activity 27.1, matrices that are not symmetric need not be orthogonally diagonalizable, but the symmetric matrix examples are orthogonally diagonalizable. We explore that idea in this section.
If $P$ is a matrix that orthogonally diagonalizes the matrix $A$, then $P^{T}AP = D$, where $D$ is a diagonal matrix, and so $A = PDP^{T}$. Since $P^{T} = P^{-1}$ and $D^{T} = D$, we have
$$A^{T} = \left(PDP^{T}\right)^{T} = \left(P^{T}\right)^{T} D^{T} P^{T} = PDP^{T} = A.$$
Therefore, $A^{T} = A$, and matrices with this property are the only matrices that can be orthogonally diagonalized. Recall that any matrix $A$ satisfying $A^{T} = A$ is a symmetric matrix.
While we have just shown that the only matrices that can be orthogonally diagonalized are the symmetric matrices, the amazing thing about symmetric matrices is that every symmetric matrix can be orthogonally diagonalized. We will prove this shortly.
Symmetric matrices have useful properties, a few of which are given in the following activity (we will use some of these properties later in this section).
Activity 27.2.
Let $A$ be a symmetric $n \times n$ matrix and let $\mathbf{x}$ and $\mathbf{y}$ be vectors in $\mathbb{R}^n$.
(a)
Show that $\mathbf{x}^{T}(A\mathbf{y}) = (A\mathbf{x})^{T}\mathbf{y}$.
(b)
Show that $(A\mathbf{x}) \cdot \mathbf{y} = \mathbf{x} \cdot (A\mathbf{y})$.
(c)
Show that the eigenvalues of a $2 \times 2$ symmetric matrix $A = \begin{bmatrix} a & b \\ b & c \end{bmatrix}$ are real.
Activity 27.2 (c) shows that a $2 \times 2$ symmetric matrix has real eigenvalues. This is in fact a general result about real symmetric matrices.
Theorem 27.4.
Let $A$ be an $n \times n$ symmetric matrix with real entries. Then the eigenvalues of $A$ are real.
Proof.
Let $A$ be an $n \times n$ symmetric matrix with real entries and let $\lambda$ be an eigenvalue of $A$ with eigenvector $\mathbf{v}$. To show that $\lambda$ is real, we will show that $\overline{\lambda} = \lambda$. We know
$$A\mathbf{v} = \lambda\mathbf{v}. \tag{27.1}$$
Since $A$ has real entries, we also know that $\overline{\lambda}$ is an eigenvalue for $A$ with eigenvector $\overline{\mathbf{v}}$. Multiply both sides of (27.1) on the left by $\overline{\mathbf{v}}^{T}$ to obtain
$$\overline{\mathbf{v}}^{T} A \mathbf{v} = \overline{\mathbf{v}}^{T}(\lambda\mathbf{v}) = \lambda\left(\overline{\mathbf{v}}^{T}\mathbf{v}\right). \tag{27.2}$$
Now
$$\overline{\mathbf{v}}^{T} A \mathbf{v} = \left(A^{T}\overline{\mathbf{v}}\right)^{T}\mathbf{v} = \left(A\overline{\mathbf{v}}\right)^{T}\mathbf{v} = \left(\overline{\lambda}\,\overline{\mathbf{v}}\right)^{T}\mathbf{v} = \overline{\lambda}\left(\overline{\mathbf{v}}^{T}\mathbf{v}\right),$$
and equation (27.2) becomes
$$\overline{\lambda}\left(\overline{\mathbf{v}}^{T}\mathbf{v}\right) = \lambda\left(\overline{\mathbf{v}}^{T}\mathbf{v}\right).$$
Since $\overline{\mathbf{v}}^{T}\mathbf{v} \neq 0$ (this quantity is the sum of the squares of the magnitudes of the entries of $\mathbf{v}$, and $\mathbf{v} \neq \mathbf{0}$), this implies that $\overline{\lambda} = \lambda$ and $\lambda$ is real.
To orthogonally diagonalize a matrix, it must be the case that eigenvectors corresponding to different eigenvalues are orthogonal. This is an important property and it would be useful to know when it happens.
Activity 27.3.
Let $A$ be a real symmetric matrix with eigenvalues $\lambda_1$ and $\lambda_2$ and corresponding eigenvectors $\mathbf{v}_1$ and $\mathbf{v}_2$, respectively.
(a)
Use Activity 27.2 (b) to show that $\lambda_1 (\mathbf{v}_1 \cdot \mathbf{v}_2) = \lambda_2 (\mathbf{v}_1 \cdot \mathbf{v}_2)$.
(b)
Explain why the result of part (a) shows that $\mathbf{v}_1$ and $\mathbf{v}_2$ are orthogonal if $\lambda_1 \neq \lambda_2$.
Activity 27.3 proves the following theorem.
Theorem 27.5.
If $A$ is a real symmetric matrix, then eigenvectors of $A$ corresponding to distinct eigenvalues are orthogonal.
Recall that the only matrices that can be orthogonally diagonalized are the symmetric matrices. Now we show that every real symmetric matrix can be orthogonally diagonalized, which completely characterizes the matrices that are orthogonally diagonalizable. The proof of the following theorem proceeds by induction. A reader who has not yet encountered this technique of proof can safely skip the proof of this theorem without loss of continuity.
Theorem 27.6.
Let $A$ be a real $n \times n$ symmetric matrix. Then $A$ is orthogonally diagonalizable.
Proof.
Let $A$ be a real $n \times n$ symmetric matrix. The proof proceeds by induction on $n$. If $n = 1$, then $A$ is diagonal and is orthogonally diagonalized by the identity matrix. So assume that every real $(n-1) \times (n-1)$ symmetric matrix is orthogonally diagonalizable, and let $A$ be a real $n \times n$ symmetric matrix. By Theorem 27.4, the eigenvalues of $A$ are real. Let $\lambda_1$ be a real eigenvalue of $A$ with corresponding unit eigenvector $\mathbf{p}_1$. We can use the Gram-Schmidt process to extend $\{\mathbf{p}_1\}$ to an orthonormal basis $\{\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_n\}$ for $\mathbb{R}^n$. Let $P_1 = [\mathbf{p}_1 \ \mathbf{p}_2 \ \cdots \ \mathbf{p}_n]$. Then $P_1$ is an orthogonal matrix. Also,
$$P_1^{T} A P_1 = \begin{bmatrix} \lambda_1 & \mathbf{x}^{T} \\ \mathbf{0} & A_1 \end{bmatrix},$$
where $\mathbf{x}$ is a vector in $\mathbb{R}^{n-1}$, $\mathbf{0}$ is the zero vector in $\mathbb{R}^{n-1}$, and $A_1$ is an $(n-1) \times (n-1)$ matrix. Letting $R = P_1^{T} A P_1$, we have that
$$R^{T} = \left(P_1^{T} A P_1\right)^{T} = P_1^{T} A^{T} P_1 = P_1^{T} A P_1 = R,$$
so $R$ is a symmetric matrix. Therefore, $\mathbf{x} = \mathbf{0}$ and $A_1$ is an $(n-1) \times (n-1)$ symmetric matrix. By our induction hypothesis, $A_1$ is orthogonally diagonalizable. That is, there exists an $(n-1) \times (n-1)$ orthogonal matrix $Q$ such that $Q^{T} A_1 Q = D_1$, where $D_1$ is a diagonal matrix. Now define $P_2$ by
$$P_2 = \begin{bmatrix} 1 & \mathbf{0}^{T} \\ \mathbf{0} & Q \end{bmatrix},$$
where $\mathbf{0}$ is the zero vector in $\mathbb{R}^{n-1}$. By construction, the columns of $P_2$ are orthonormal, so $P_2$ is an orthogonal matrix. Since $P_1$ is also an orthogonal matrix, the product $P = P_1 P_2$ satisfies
$$P^{T} P = \left(P_1 P_2\right)^{T}\left(P_1 P_2\right) = P_2^{T} P_1^{T} P_1 P_2 = P_2^{T} P_2 = I_n,$$
and $P$ is an orthogonal matrix. Finally,
$$P^{T} A P = P_2^{T}\left(P_1^{T} A P_1\right) P_2 = \begin{bmatrix} 1 & \mathbf{0}^{T} \\ \mathbf{0} & Q^{T} \end{bmatrix} \begin{bmatrix} \lambda_1 & \mathbf{0}^{T} \\ \mathbf{0} & A_1 \end{bmatrix} \begin{bmatrix} 1 & \mathbf{0}^{T} \\ \mathbf{0} & Q \end{bmatrix} = \begin{bmatrix} \lambda_1 & \mathbf{0}^{T} \\ \mathbf{0} & Q^{T} A_1 Q \end{bmatrix} = \begin{bmatrix} \lambda_1 & \mathbf{0}^{T} \\ \mathbf{0} & D_1 \end{bmatrix}.$$
Therefore, $P^{T} A P$ is a diagonal matrix, and $P$ orthogonally diagonalizes $A$. This completes our proof.
The set of eigenvalues of a matrix $A$ is called the spectrum of $A$, and we have just proved the following theorem.
Theorem 27.7. The Spectral Theorem for Real Symmetric Matrices.
Let $A$ be an $n \times n$ symmetric matrix with real entries. Then
$A$ has $n$ real eigenvalues (counting multiplicities);
the dimension of each eigenspace of $A$ is the multiplicity of the corresponding eigenvalue as a root of the characteristic polynomial;
eigenvectors corresponding to different eigenvalues are orthogonal;
$A$ is orthogonally diagonalizable.
So any real symmetric matrix is orthogonally diagonalizable. We have seen examples of the orthogonal diagonalization of real symmetric matrices with distinct eigenvalues, but how do we orthogonally diagonalize a symmetric matrix having eigenvalues of multiplicity greater than 1? The next activity shows us the process.
Activity 27.4.
Let $A$ be a $3 \times 3$ symmetric matrix whose eigenvalues are 2 and 8, with eigenspaces of dimension 2 and dimension 1, respectively.
(a)
Explain why $A$ can be orthogonally diagonalized.
(b)
Suppose $\mathbf{v}_1$ and $\mathbf{v}_2$ are two linearly independent eigenvectors for $A$ corresponding to the eigenvalue 2, and that $\mathbf{v}_1$ and $\mathbf{v}_2$ are not orthogonal. Then $\mathbf{v}_1$ and $\mathbf{v}_2$ cannot both be in an orthogonal basis of $\mathbb{R}^3$ consisting of eigenvectors of $A$. So find a set $\{\mathbf{w}_1, \mathbf{w}_2\}$ of orthogonal eigenvectors of $A$ so that $\operatorname{Span}\{\mathbf{w}_1, \mathbf{w}_2\} = \operatorname{Span}\{\mathbf{v}_1, \mathbf{v}_2\}$.
(c)
Let $\mathbf{v}_3$ be an eigenvector for $A$ corresponding to the eigenvalue 8. What can you say about the orthogonality relationship between $\mathbf{v}_3$ and the vectors $\mathbf{w}_1$ and $\mathbf{w}_2$?
(d)
Find a matrix $P$ that orthogonally diagonalizes $A$. Verify your work.
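The following sketch (our own illustration) mirrors the process of Activity 27.4 numerically. The symmetric matrix below is our own choice of a matrix with eigenvalues 2, 2, and 8; it is not necessarily the matrix of the activity.

```python
import numpy as np

# An illustrative symmetric matrix with eigenvalues 2, 2, and 8.
A = np.array([[4.0, 2.0, 2.0],
              [2.0, 4.0, 2.0],
              [2.0, 2.0, 4.0]])

# Two (non-orthogonal) eigenvectors for the eigenvalue 2,
# and one eigenvector for the eigenvalue 8.
v1 = np.array([1.0, -1.0, 0.0])
v2 = np.array([0.0, 1.0, -1.0])
v3 = np.array([1.0, 1.0, 1.0])

# Gram-Schmidt within the eigenspace for 2: replace v2 with the component
# of v2 orthogonal to v1.  The result is still in the same eigenspace.
w1 = v1
w2 = v2 - (np.dot(v2, w1) / np.dot(w1, w1)) * w1

# Normalize all three vectors and use them as the columns of P.
P = np.column_stack([w1 / np.linalg.norm(w1),
                     w2 / np.linalg.norm(w2),
                     v3 / np.linalg.norm(v3)])

print(np.round(P.T @ A @ P, 10))  # diag(2, 2, 8)
```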
Subsection The Spectral Decomposition of a Symmetric Matrix
Let $A$ be an $n \times n$ symmetric matrix with real entries. The Spectral Theorem tells us we can find an orthonormal basis $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n\}$ of $\mathbb{R}^n$ consisting of eigenvectors of $A$. Let $A\mathbf{u}_i = \lambda_i \mathbf{u}_i$ for each $1 \le i \le n$. If $P = [\mathbf{u}_1 \ \mathbf{u}_2 \ \cdots \ \mathbf{u}_n]$, then we know that
$$P^{T} A P = D,$$
where $D$ is the diagonal matrix
$$D = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix}.$$
Since $A = PDP^{T}$, we see that
$$A = \lambda_1 \mathbf{u}_1\mathbf{u}_1^{T} + \lambda_2 \mathbf{u}_2\mathbf{u}_2^{T} + \cdots + \lambda_n \mathbf{u}_n\mathbf{u}_n^{T}, \tag{27.3}$$
where the last product follows from Exercise 4. The expression in (27.3) is called a spectral decomposition of the matrix $A$. Let $P_i = \mathbf{u}_i\mathbf{u}_i^{T}$ for each $1 \le i \le n$. The matrices $P_i$ satisfy several special conditions given in the next theorem. The proofs are left to the exercises.
Theorem 27.8.
Let $A$ be an $n \times n$ symmetric matrix with real entries, and let $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n\}$ be an orthonormal basis of eigenvectors of $A$ with $A\mathbf{u}_i = \lambda_i \mathbf{u}_i$ for each $i$. For each $i$, let $P_i = \mathbf{u}_i \mathbf{u}_i^{T}$. Then
$P_i$ is a symmetric matrix for each $i$;
$P_i$ is a rank 1 matrix for each $i$;
$P_i^2 = P_i$ for each $i$;
$P_i P_j = 0$ if $i \neq j$;
$P_i \mathbf{u}_i = \mathbf{u}_i$ for each $i$;
$P_i \mathbf{u}_j = \mathbf{0}$ if $i \neq j$;
for any vector $\mathbf{v}$ in $\mathbb{R}^n$, $P_i \mathbf{v} = \operatorname{proj}_{\operatorname{Span}\{\mathbf{u}_i\}} \mathbf{v} = (\mathbf{v} \cdot \mathbf{u}_i)\,\mathbf{u}_i$.
The consequence of Theorem 27.8 is that any symmetric matrix can be written as the sum of symmetric, rank 1 matrices. As we will see later, this kind of decomposition contains much information about the matrix product $A^{T}A$ for any matrix $A$.
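Here is a brief numerical sketch of a spectral decomposition (our own addition; the matrix is again an illustrative choice). We build each rank 1 matrix $P_i = \mathbf{u}_i\mathbf{u}_i^{T}$ from the orthonormal eigenvectors and check that the weighted sum reproduces $A$, along with two of the properties in Theorem 27.8.

```python
import numpy as np

A = np.array([[4.0, 2.0, 2.0],
              [2.0, 4.0, 2.0],
              [2.0, 2.0, 4.0]])   # illustrative symmetric matrix

eigenvalues, U = np.linalg.eigh(A)  # columns of U: orthonormal eigenvectors

# Rank 1 projections P_i = u_i u_i^T and the spectral decomposition.
projections = [np.outer(U[:, i], U[:, i]) for i in range(3)]
A_rebuilt = sum(lam * P_i for lam, P_i in zip(eigenvalues, projections))

print(np.allclose(A, A_rebuilt))                                      # True
print(np.allclose(projections[0] @ projections[0], projections[0]))   # P_i^2 = P_i
print(np.round(projections[0] @ projections[1], 10))                  # P_i P_j = 0
```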
Activity 27.5.
Let $A$ be a symmetric matrix with eigenvalues $\lambda_1 = 8$, $\lambda_2 = 2$, and $\lambda_3 = 2$. Suppose a basis for the eigenspace of $A$ corresponding to the eigenvalue 8 is $\{\mathbf{v}_1\}$, and a basis for the eigenspace of $A$ corresponding to the eigenvalue 2 is $\{\mathbf{v}_2, \mathbf{v}_3\}$. (Compare to Activity 27.4.)
(a)
Find orthonormal eigenvectors $\mathbf{u}_1$, $\mathbf{u}_2$, and $\mathbf{u}_3$ of $A$ corresponding to $\lambda_1$, $\lambda_2$, and $\lambda_3$, respectively.
(b)
Compute $\lambda_1 \mathbf{u}_1 \mathbf{u}_1^{T}$.
(c)
Compute $\lambda_2 \mathbf{u}_2 \mathbf{u}_2^{T}$.
(d)
Compute $\lambda_3 \mathbf{u}_3 \mathbf{u}_3^{T}$.
(e)
Verify that $A = \lambda_1 \mathbf{u}_1\mathbf{u}_1^{T} + \lambda_2 \mathbf{u}_2\mathbf{u}_2^{T} + \lambda_3 \mathbf{u}_3\mathbf{u}_3^{T}$.
Subsection Examples
What follows are worked examples that use the concepts from this section.
Example 27.9.
For each of the following matrices $A$, determine if $A$ is diagonalizable. If $A$ is not diagonalizable, explain why. If $A$ is diagonalizable, find a matrix $P$ so that $P^{-1}AP$ is a diagonal matrix. If the matrix is diagonalizable, is it orthogonally diagonalizable? If it is orthogonally diagonalizable, find an orthogonal matrix that diagonalizes $A$. Use appropriate technology to find eigenvalues and eigenvectors.
(a)
Solution.
Recall that an $n \times n$ matrix $A$ is diagonalizable if and only if $A$ has $n$ linearly independent eigenvectors, and $A$ is orthogonally diagonalizable if and only if $A$ is symmetric. Since $A$ is not symmetric, $A$ is not orthogonally diagonalizable. Technology shows that the eigenvalues of $A$ are $\lambda_1$ and $\lambda_2$, with bases $\{\mathbf{v}_1\}$ and $\{\mathbf{v}_2\}$ for the corresponding eigenspaces. So $A$ is diagonalizable, and if $P = [\mathbf{v}_1 \ \mathbf{v}_2]$, then $P^{-1}AP$ is the diagonal matrix with $\lambda_1$ and $\lambda_2$ down the diagonal.
(b)
Solution.
Since $A$ is not symmetric, $A$ is not orthogonally diagonalizable. Technology shows that the eigenvalues of $A$ are $\lambda_1$ and $\lambda_2$, and the bases for the corresponding eigenspaces together contain fewer than $n$ vectors. We cannot create a basis of $\mathbb{R}^n$ consisting of eigenvectors of $A$, so $A$ is not diagonalizable.
(c)
Solution.
Since $A$ is symmetric, $A$ is orthogonally diagonalizable. Technology shows that the eigenvalues of $A$ are $\lambda_1$ and $\lambda_2$, with bases $\{\mathbf{v}_1\}$ and $\{\mathbf{v}_2, \mathbf{v}_3\}$ for the corresponding eigenspaces, respectively. To find an orthogonal matrix that diagonalizes $A$, we must find an orthonormal basis of $\mathbb{R}^3$ consisting of eigenvectors of $A$. To do that, we use the Gram-Schmidt process to obtain an orthogonal basis for the eigenspace of $A$ corresponding to the eigenvalue $\lambda_2$. Doing so gives an orthogonal basis $\{\mathbf{w}_2, \mathbf{w}_3\}$, where $\mathbf{w}_2 = \mathbf{v}_2$ and
$$\mathbf{w}_3 = \mathbf{v}_3 - \frac{\mathbf{v}_3 \cdot \mathbf{w}_2}{\mathbf{w}_2 \cdot \mathbf{w}_2}\,\mathbf{w}_2.$$
So an orthonormal basis for $\mathbb{R}^3$ of eigenvectors of $A$ is
$$\left\{ \frac{\mathbf{v}_1}{\lVert \mathbf{v}_1 \rVert},\ \frac{\mathbf{w}_2}{\lVert \mathbf{w}_2 \rVert},\ \frac{\mathbf{w}_3}{\lVert \mathbf{w}_3 \rVert} \right\}.$$
Therefore, $A$ is orthogonally diagonalizable, and if $P$ is the matrix whose columns are these orthonormal eigenvectors, then $P^{T}AP$ is a diagonal matrix with the eigenvalues of $A$ down the diagonal.
Example 27.10.
Let $A$ be the given $3 \times 3$ symmetric matrix. Find an orthonormal basis for $\mathbb{R}^3$ consisting of eigenvectors of $A$.
Solution.
Since $A$ is symmetric, there is an orthogonal matrix $P$ such that $P^{T}AP$ is diagonal. The columns of $P$ will form an orthonormal basis for $\mathbb{R}^3$. Using a cofactor expansion along the first row shows that the characteristic polynomial of $A$ factors, so the eigenvalues of $A$ are $\lambda_1$ and $\lambda_2$. The reduced row echelon forms of $A - \lambda_1 I_3$ and $A - \lambda_2 I_3$ give, respectively, a basis $\{\mathbf{v}_1\}$ for the eigenspace of $A$ corresponding to $\lambda_1$ and a basis $\{\mathbf{v}_2, \mathbf{v}_3\}$ for the eigenspace of $A$ corresponding to $\lambda_2$. The set $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ is an orthogonal set, so an orthonormal basis for $\mathbb{R}^3$ consisting of eigenvectors of $A$ is
$$\left\{ \frac{\mathbf{v}_1}{\lVert \mathbf{v}_1 \rVert},\ \frac{\mathbf{v}_2}{\lVert \mathbf{v}_2 \rVert},\ \frac{\mathbf{v}_3}{\lVert \mathbf{v}_3 \rVert} \right\}.$$
Subsection Summary
An $n \times n$ matrix $A$ is orthogonally diagonalizable if there is an orthogonal matrix $P$ such that $P^{T}AP$ is a diagonal matrix. Orthogonal diagonalizability is useful in that it allows us to find a “convenient” coordinate system in which to interpret the results of certain matrix transformations. Orthogonal diagonalization also plays a crucial role in the singular value decomposition of a matrix.
An $n \times n$ matrix $A$ is symmetric if $A^{T} = A$. The symmetric matrices are exactly the matrices that can be orthogonally diagonalized.
The spectrum of a matrix is the set of eigenvalues of the matrix.
Exercises Exercises
1.
For each of the following matrices $A$, find an orthogonal matrix $P$ so that $P^{T}AP$ is a diagonal matrix, or explain why no such matrix exists.
(a)
(b)
(c)
2.
For each of the following matrices $A$, find an orthonormal basis of eigenvectors of $A$. Then find a spectral decomposition of $A$.
(a)
(b)
(c)
(d)
3.
Find a non-diagonal matrix with eigenvalues 2, 3, and 6 that can be orthogonally diagonalized.
4.
Let $A$ be an $m \times n$ matrix with columns $\mathbf{c}_1$, $\mathbf{c}_2$, $\ldots$, $\mathbf{c}_n$, and let $B$ be an $n \times p$ matrix with rows $\mathbf{r}_1$, $\mathbf{r}_2$, $\ldots$, $\mathbf{r}_n$. Show that $AB = \mathbf{c}_1\mathbf{r}_1 + \mathbf{c}_2\mathbf{r}_2 + \cdots + \mathbf{c}_n\mathbf{r}_n$.
5.
Let $A$ be an $n \times n$ symmetric matrix with real entries and let $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n\}$ be an orthonormal basis of eigenvectors of $A$. For each $i$, let $P_i = \mathbf{u}_i\mathbf{u}_i^{T}$. Prove Theorem 27.8; that is, verify each of the following statements.
(a)
For each $i$, $P_i$ is a symmetric matrix.
(b)
For each $i$, $P_i$ is a rank 1 matrix.
(c)
For each $i$, $P_i^2 = P_i$.
(d)
If $i \neq j$, then $P_iP_j = 0$.
(e)
For each $i$, $P_i\mathbf{u}_i = \mathbf{u}_i$.
(f)
If $i \neq j$, then $P_i\mathbf{u}_j = \mathbf{0}$.
(g)
If $\mathbf{v}$ is in $\mathbb{R}^n$, show that $P_i\mathbf{v} = (\mathbf{v} \cdot \mathbf{u}_i)\,\mathbf{u}_i = \operatorname{proj}_{\operatorname{Span}\{\mathbf{u}_i\}}\mathbf{v}$.
For this reason we call $P_i$ an orthogonal projection matrix.
6.
Show that if $A$ is an $n \times n$ matrix and $(A\mathbf{x}) \cdot \mathbf{y} = \mathbf{x} \cdot (A\mathbf{y})$ for every $\mathbf{x}$ and $\mathbf{y}$ in $\mathbb{R}^n$, then $A$ is a symmetric matrix.
7.
Let $A$ be an $n \times n$ symmetric matrix and assume that $A$ has an orthonormal basis $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n\}$ of eigenvectors of $A$ so that $A\mathbf{u}_i = \lambda_i\mathbf{u}_i$ for each $i$. Let $P_i = \mathbf{u}_i\mathbf{u}_i^{T}$ for each $i$. It is possible that not all of the eigenvalues of $A$ are distinct. In this case, some of the eigenvalues will be repeated in the spectral decomposition of $A$. If we want only distinct eigenvalues to appear, we might do the following. Let $\mu_1, \mu_2, \ldots, \mu_k$ be the distinct eigenvalues of $A$. For each $j$ between 1 and $k$, let $Q_j$ be the sum of all of the $P_i$ that have $\mu_j$ as eigenvalue.
(a)
The eigenvalues of the given matrix $A$ are $\mu_1$ and $\mu_2$. Find a basis for each eigenspace and determine each $P_i$. Then find each $Q_j$.
(b)
Show in general (not just for the specific example in part (a)) that the matrices $Q_j$ satisfy the same properties as the $P_i$. That is, verify the following.
(i)
$A = \mu_1 Q_1 + \mu_2 Q_2 + \cdots + \mu_k Q_k$.
Hint. Collect matrices with the same eigenvalues.
(ii)
$Q_j$ is a symmetric matrix for each $j$.
Hint. Use the fact that each $P_i$ is a symmetric matrix.
(iii)
$Q_j^2 = Q_j$ for each $j$.
(iv)
$Q_j Q_\ell = 0$ when $j \neq \ell$.
(v)
if $E_{\mu_j}$ is the eigenspace of $A$ corresponding to the eigenvalue $\mu_j$, and if $\mathbf{v}$ is in $\mathbb{R}^n$, then $Q_j\mathbf{v} = \operatorname{proj}_{E_{\mu_j}}\mathbf{v}$.
Hint. Explain why the set of vectors $\mathbf{u}_i$ satisfying $A\mathbf{u}_i = \mu_j\mathbf{u}_i$ is an orthonormal basis for $E_{\mu_j}$.
(c)
What is the rank of each $Q_j$? Verify your answer.
8.
Label each of the following statements as True or False. Provide justification for your response.
(a) True/False.
Every real symmetric matrix is diagonalizable.
(b) True/False.
If $P$ is a matrix whose columns are eigenvectors of a symmetric matrix, then the columns of $P$ are orthogonal.
(c) True/False.
If $A$ is a symmetric matrix, then eigenvectors of $A$ corresponding to distinct eigenvalues are orthogonal.
(d) True/False.
If $\mathbf{v}_1$ and $\mathbf{v}_2$ are distinct eigenvectors of a symmetric matrix $A$, then $\mathbf{v}_1$ and $\mathbf{v}_2$ are orthogonal.
(e) True/False.
Any symmetric matrix can be written as a sum of symmetric rank 1 matrices.
(f) True/False.
If $A$ is a matrix satisfying $A^{T} = A$, and $\mathbf{u}$ and $\mathbf{v}$ are vectors satisfying $A\mathbf{u} = \lambda\mathbf{u}$ and $A\mathbf{v} = \mu\mathbf{v}$ with $\lambda \neq \mu$, then $\mathbf{u} \cdot \mathbf{v} = 0$.
(g) True/False.
If an $n \times n$ matrix $A$ has $n$ orthogonal eigenvectors, then $A$ is a symmetric matrix.
(h) True/False.
If an $n \times n$ matrix $A$ has $n$ real eigenvalues (counted with multiplicity), then $A$ is a symmetric matrix.
(i) True/False.
For each eigenvalue of a symmetric matrix, the algebraic multiplicity equals the geometric multiplicity.
(j) True/False.
If $A$ is invertible and orthogonally diagonalizable, then so is $A^{-1}$.
(k) True/False.
If $A$ and $B$ are orthogonally diagonalizable $n \times n$ matrices, then so is $AB$.
Subsection Project: The Second Derivative Test for Functions of Two Variables
In this project we will verify the Second Derivative Test for functions of two variables. This test will involve Taylor polynomials and linear algebra. As a quick review, recall that the second order Taylor polynomial for a function $f$ of a single variable at $x = a$ is
$$P_2(x) = f(a) + f'(a)(x-a) + \frac{f''(a)}{2}(x-a)^2.$$
As with the linearization of a function, the second order Taylor polynomial is a good approximation to $f$ around $a$; that is, $f(x) \approx P_2(x)$ for $x$ close to $a$. If $a$ is a critical number for $f$ with $f'(a) = 0$, then
$$P_2(x) = f(a) + \frac{f''(a)}{2}(x-a)^2.$$
In this situation, if $f''(a) < 0$, then $\frac{f''(a)}{2}(x-a)^2 \le 0$ for $x$ close to $a$, which makes $P_2(x) \le f(a)$. This implies that $f(x) \approx P_2(x) \le f(a)$ for $x$ close to $a$, which makes $f(a)$ a relative maximum value for $f$. Similarly, if $f''(a) > 0$, then $f(a)$ is a relative minimum.
We now need a Taylor polynomial for a function of two variables. The complication of the additional independent variable in the two variable case means that the Taylor polynomials will need to contain all of the possible monomials of the indicated degrees. Recall that the linearization (or tangent plane) to a function $f = f(x,y)$ at a point $(a,b)$ is given by
$$P_1(x,y) = f(a,b) + f_x(a,b)(x-a) + f_y(a,b)(y-b).$$
Note that $P_1(a,b) = f(a,b)$, $\frac{\partial P_1}{\partial x}(a,b) = f_x(a,b)$, and $\frac{\partial P_1}{\partial y}(a,b) = f_y(a,b)$. This makes $P_1(x,y)$ the best linear approximation to $f$ near the point $(a,b)$. The polynomial $P_1(x,y)$ is the first order Taylor polynomial for $f$ at $(a,b)$.
Similarly, the second order Taylor polynomial $P_2(x,y)$ centered at the point $(a,b)$ for the function $f$ is
$$P_2(x,y) = f(a,b) + f_x(a,b)(x-a) + f_y(a,b)(y-b) + \frac{f_{xx}(a,b)}{2}(x-a)^2 + f_{xy}(a,b)(x-a)(y-b) + \frac{f_{yy}(a,b)}{2}(y-b)^2.$$
Project Activity 27.6.
To see that $P_2(x,y)$ is the best approximation for $f$ near $(a,b)$, we need to know that the first and second order partial derivatives of $P_2$ agree with the corresponding partial derivatives of $f$ at the point $(a,b)$. Verify that this is true.
We can rewrite this second order Taylor polynomial using matrices and vectors so that we can apply techniques from linear algebra to analyze it. Note that
$$P_2(x,y) = f(a,b) + \nabla f(a,b)^{T}\begin{bmatrix} x-a \\ y-b \end{bmatrix} + \frac{1}{2}\begin{bmatrix} x-a & y-b \end{bmatrix} H(a,b) \begin{bmatrix} x-a \\ y-b \end{bmatrix}, \tag{27.5}$$
where $\nabla f(x,y) = \begin{bmatrix} f_x(x,y) \\ f_y(x,y) \end{bmatrix}$ is the gradient of $f$ and $H = H(x,y)$ is the Hessian of $f$, where
$$H(x,y) = \begin{bmatrix} f_{xx}(x,y) & f_{xy}(x,y) \\ f_{yx}(x,y) & f_{yy}(x,y) \end{bmatrix}.$$
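A computer algebra system can assemble the gradient and Hessian symbolically. The sketch below (our own addition, assuming SymPy is available) uses the illustrative function $f(x,y) = x^3 - 3x + y^2$, which is our own choice.

```python
import sympy as sp

x, y = sp.symbols('x y')
f = x**3 - 3*x + y**2  # an illustrative function (our own choice)

grad = sp.Matrix([sp.diff(f, x), sp.diff(f, y)])  # gradient of f
H = sp.hessian(f, (x, y))                         # Hessian of f

print(grad)                  # Matrix([[3*x**2 - 3], [2*y]])
print(H)                     # Matrix([[6*x, 0], [0, 2]])
print(H.subs({x: 1, y: 0}))  # Hessian at the critical point (1, 0)
```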
Project Activity 27.7.
Use Equation (27.5) to compute $P_2(x,y)$ for the given function $f$ at the given point $(a,b)$.
The important idea for us is that if $(a,b)$ is a point at which $f_x$ and $f_y$ are zero, then $\nabla f(a,b)$ is the zero vector and Equation (27.5) reduces to
$$P_2(x,y) = f(a,b) + \frac{1}{2}\begin{bmatrix} x-a & y-b \end{bmatrix} H(a,b) \begin{bmatrix} x-a \\ y-b \end{bmatrix}.$$
To make the connection between the multivariable second derivative test and the properties of the Hessian $H(a,b)$ at a critical point $(a,b)$ of a function $f$ at which $\nabla f(a,b) = \mathbf{0}$, we will need to connect the eigenvalues of a matrix to its determinant and trace.
Let $A$ be an $n \times n$ matrix with eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$ (not necessarily distinct). Exercise 1 in Section 18 shows that
$$\det(A) = \lambda_1 \lambda_2 \cdots \lambda_n. \tag{27.7}$$
In other words, the determinant of a matrix is equal to the product of the eigenvalues of the matrix. In addition, Exercise 9 in Section 19 shows that
$$\operatorname{trace}(A) = \lambda_1 + \lambda_2 + \cdots + \lambda_n \tag{27.8}$$
for a diagonalizable matrix, where $\operatorname{trace}(A)$ is the sum of the diagonal entries of $A$. Equation (27.8) is true for any square matrix, but we don't need the more general result for this project.
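A quick numerical sanity check of Equations (27.7) and (27.8) (our own addition; the matrix is an arbitrary symmetric example):

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])  # an arbitrary symmetric example

eigenvalues = np.linalg.eigvalsh(A)
print(np.isclose(np.prod(eigenvalues), np.linalg.det(A)))  # determinant = product
print(np.isclose(np.sum(eigenvalues), np.trace(A)))        # trace = sum
```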
The fact that the Hessian is a symmetric matrix makes it orthogonally diagonalizable. We denote the eigenvalues of $H(a,b)$ as $\lambda_1$ and $\lambda_2$. Thus there exists an orthogonal matrix $P$ and a diagonal matrix $D = \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix}$ such that $P^{T} H(a,b) P = D$, or $H(a,b) = PDP^{T}$. Equations (27.7) and (27.8) show that
$$\det(H(a,b)) = \lambda_1 \lambda_2 = f_{xx}(a,b)\,f_{yy}(a,b) - f_{xy}(a,b)^2$$
and
$$\operatorname{trace}(H(a,b)) = \lambda_1 + \lambda_2 = f_{xx}(a,b) + f_{yy}(a,b).$$
Now we have the machinery to verify the Second Derivative Test for Two-Variable Functions. We assume $(a,b)$ is a point in the domain of a function $f$ so that $\nabla f(a,b) = \mathbf{0}$. First we consider the case where $f_{xx}(a,b)\,f_{yy}(a,b) - f_{xy}(a,b)^2 < 0$.
Project Activity 27.8.
Explain why, if $\det(H(a,b)) = \lambda_1\lambda_2 < 0$, then the quadratic form
$$\begin{bmatrix} x-a & y-b \end{bmatrix} H(a,b) \begin{bmatrix} x-a \\ y-b \end{bmatrix}$$
is indefinite; that is, it takes on both positive and negative values near $(a,b)$. Explain why this implies that $f$ is “saddle-shaped” near $(a,b)$.
Hint. Substitute $\begin{bmatrix} u \\ v \end{bmatrix} = P^{T}\begin{bmatrix} x-a \\ y-b \end{bmatrix}$. What does the graph of $P_2$ look like in the $u$ and $v$ directions?
Now we examine the situation when $\det(H(a,b)) > 0$.
Project Activity 27.9.
(a)
Explain why either both $\lambda_1$ and $\lambda_2$ are positive or both are negative.
(b)
If $\det(H(a,b)) > 0$ and $f_{xx}(a,b) > 0$, explain why $\lambda_1$ and $\lambda_2$ must be positive.
(c)
Explain why, if $\det(H(a,b)) > 0$ and $f_{xx}(a,b) > 0$, then $f(a,b)$ is a local minimum value for $f$.
When $\det(H(a,b)) > 0$ and either $f_{xx}(a,b)$ or $f_{yy}(a,b)$ is negative, a slight modification of the preceding argument leads to the fact that $f$ has a local maximum at $(a,b)$ (the details are left to the reader). Therefore, we have proved the Second Derivative Test for functions of two variables!
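The eigenvalue point of view also yields a direct computational version of the test: classify a critical point by the signs of the eigenvalues of the Hessian there. The sketch below is our own illustration; the Hessians come from the example function $f(x,y) = x^3 - 3x + y^2$ used earlier.

```python
import numpy as np

def classify(H):
    """Classify a critical point from the eigenvalues of its (symmetric) Hessian."""
    lam = np.linalg.eigvalsh(H)  # eigenvalues in ascending order
    if np.all(lam > 0):
        return "local minimum"
    if np.all(lam < 0):
        return "local maximum"
    if lam[0] * lam[-1] < 0:     # mixed signs: indefinite Hessian
        return "saddle point"
    return "no information"      # some eigenvalue is zero

# For f(x, y) = x**3 - 3*x + y**2 the Hessian is [[6x, 0], [0, 2]].
print(classify(np.array([[6.0, 0.0], [0.0, 2.0]])))   # at (1, 0): local minimum
print(classify(np.array([[-6.0, 0.0], [0.0, 2.0]])))  # at (-1, 0): saddle point
```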
Project Activity 27.10.
Use the Hessian to classify the local maxima, minima, and saddle points of the given function $f$. Draw a graph of $f$ to illustrate.
Many thanks to Professor Paul Fishback for sharing his activity on this topic. Much of this project comes from his activity.
Note that under reasonable conditions (e.g., that $f$ has continuous second order mixed partial derivatives in some open neighborhood containing $(a,b)$) we have that $f_{xy}(a,b) = f_{yx}(a,b)$, and so the Hessian $H(a,b)$ is a symmetric matrix. We will only consider functions that satisfy these reasonable conditions.