
Section 21 Complex Eigenvalues

Subsection Application: The Gershgorin Disk Theorem

We have now seen different methods for calculating or approximating the eigenvalues of a matrix. The algebraic method using the characteristic polynomial can provide exact values, but only when the matrix is small. Methods like the power method allow us to approximate eigenvalues in many, but not all, cases. These approximation techniques can be made more efficient if we have some idea of where the eigenvalues are. The Gershgorin Disk Theorem is a useful tool that can quickly provide bounds on the location of eigenvalues using elementary calculations. For example, using the Gershgorin Disk Theorem we can quickly tell that the real parts of the eigenvalues of the matrix

$$\begin{bmatrix} 3 & 1 & -1 \\ 0 & -1+i & i \\ 2 & 1 & -2i \end{bmatrix}$$

lie between $-4$ and $5$ and the imaginary parts lie between $-5$ and $2$. Even more, we can say that the eigenvalues lie in the disks (called Gershgorin disks) shown in Figure 21.1.

Figure 21.1. Gershgorin disks.

We will learn more details about the Gershgorin Disk Theorem at the end of this section.

Subsection Introduction

So far we have worked with real matrices whose eigenvalues are all real. However, the characteristic polynomial of a matrix with real entries can have complex roots. In this section we investigate the properties of these complex roots and their corresponding eigenvectors, how these complex eigenvectors are found, and the geometric interpretation of the transformations defined by matrices with complex eigenvalues. Although we can consider matrices that have complex numbers as entries, we will restrict ourselves to matrices with real entries.

Preview Activity 21.1.

Let $A = \begin{bmatrix} 2 & 4 \\ -2 & 2 \end{bmatrix}$.

(a)

Find the characteristic polynomial of A.

(b)

Find the eigenvalues of A. You should get two complex numbers. How are these complex numbers related?

(c)

Find an eigenvector corresponding to each eigenvalue of A. You should obtain vectors with complex entries.

Subsection Complex Eigenvalues

As you noticed in Preview Activity 21.1, the complex roots of the characteristic equation of a real matrix $A$ come in complex conjugate pairs. This should come as no surprise since we know through our use of the quadratic formula that complex roots of (real) quadratic polynomials come in complex conjugate pairs. More generally, if $p(x) = a_0 + a_1x + a_2x^2 + \cdots + a_nx^n$ is a polynomial with real coefficients and $z$ is a root of this polynomial, meaning $p(z) = 0$, then

$$0 = \overline{p(z)} = \overline{a_0 + a_1z + a_2z^2 + \cdots + a_nz^n} = a_0 + a_1\overline{z} + a_2\overline{z}^2 + \cdots + a_n\overline{z}^n = p(\overline{z}).$$

Therefore, $\overline{z}$ is also a root of $p(x)$.
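Numerically, this pairing is easy to see. Here is a minimal sketch using NumPy (the cubic is our own example, chosen to have one real root and one conjugate pair):

```python
import numpy as np

# Roots of a real-coefficient polynomial come in conjugate pairs.
# p(x) = x^3 - 4x^2 + 9x - 10 = (x - 2)(x^2 - 2x + 5)
coeffs = [1, -4, 9, -10]
print(np.roots(coeffs))  # approximately [2, 1+2j, 1-2j] (order may vary)
```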

Activity 21.2.

Let $A = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}$.

(a)

The matrix transformation $T:\mathbb{R}^2 \to \mathbb{R}^2$ defined by $T(\mathbf{x}) = A\mathbf{x}$ is a rotation transformation. What is the angle of rotation?

(b)

Find the eigenvalues of A. For each eigenvalue, find an eigenvector.

In Preview Activity 21.1 and in Activity 21.2, you found that if $\mathbf{v}$ is an eigenvector of $A$ corresponding to $\lambda$, then $\overline{\mathbf{v}}$, obtained by taking the complex conjugate of each entry in $\mathbf{v}$, is an eigenvector of $A$ corresponding to $\overline{\lambda}$. Specifically, if $\mathbf{v} = \mathbf{u} + i\mathbf{w}$, where both $\mathbf{u}$ and $\mathbf{w}$ are real vectors, is an eigenvector of $A$, then so is $\overline{\mathbf{v}} = \mathbf{u} - i\mathbf{w}$. We can justify this property using matrix algebra as follows:

$$A\overline{\mathbf{v}} = \overline{A}\,\overline{\mathbf{v}} = \overline{A\mathbf{v}} = \overline{\lambda\mathbf{v}} = \overline{\lambda}\,\overline{\mathbf{v}}.$$

In the first equality, we used the fact that $A$ is a real matrix, so $\overline{A} = A$. In all the other equalities, we used the properties of the conjugation operation on complex numbers.
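We can check this property numerically. Below is a short sketch using NumPy and the matrix from Preview Activity 21.1:

```python
import numpy as np

A = np.array([[2.0, 4.0],
              [-2.0, 2.0]])   # matrix from Preview Activity 21.1

lam, V = np.linalg.eig(A)
v = V[:, 0]                   # eigenvector for the first eigenvalue
print(lam)                    # the eigenvalues form a conjugate pair

# The conjugate of an eigenvector is an eigenvector for the conjugate eigenvalue:
print(np.allclose(A @ np.conj(v), np.conj(lam[0]) * np.conj(v)))  # True
```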

Subsection Rotation and Scaling Matrices

Recall that a rotation matrix is of the form

$$R_\theta = \begin{bmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{bmatrix}$$

where the rotation is counterclockwise about the origin by an angle of $\theta$ radians. In Activity 21.2, we considered the rotation matrix with angle $\pi/2$ in the counterclockwise direction. We will soon see that rotation matrices play an important role in the geometry of a matrix transformation for a matrix that has complex eigenvalues. In what follows, we will restrict ourselves to the $2\times 2$ case, but similar arguments can be made in higher dimensions.

Activity 21.3.

Let $A = \begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix}$.

(a)

Explain why A is not a rotation matrix.

(b)

Although $A$ is not a rotation matrix, there is a rotation matrix $B$ inside $A$. To find the matrix $B$, factor out $\sqrt{2}$ from all entries of $A$. In other words, write $A$ as a product of two matrices in the form

$$A = \begin{bmatrix} \sqrt{2} & 0 \\ 0 & \sqrt{2} \end{bmatrix} B.$$
(c)

The matrix $B$ is a rotation matrix with an appropriate angle $\theta$. Find this $\theta$.

(d)

If we think about the product of two matrices as applying one transformation after another, describe the effect of the matrix transformation defined by A geometrically.

More generally, if we have a matrix $A$ of the form $A = \begin{bmatrix} a & -b \\ b & a \end{bmatrix}$, then

$$A = \begin{bmatrix} \sqrt{a^2+b^2} & 0 \\ 0 & \sqrt{a^2+b^2} \end{bmatrix} \begin{bmatrix} \frac{a}{\sqrt{a^2+b^2}} & -\frac{b}{\sqrt{a^2+b^2}} \\ \frac{b}{\sqrt{a^2+b^2}} & \frac{a}{\sqrt{a^2+b^2}} \end{bmatrix}.$$

The first matrix in the decomposition is a scaling matrix with scaling factor $s = \sqrt{a^2+b^2}$. So if $s > 1$, the transformation stretches vectors, and if $s < 1$, the transformation shrinks vectors. The second matrix in the decomposition is a rotation matrix with angle $\theta$ such that $\cos(\theta) = \frac{a}{\sqrt{a^2+b^2}}$ and $\sin(\theta) = \frac{b}{\sqrt{a^2+b^2}}$. This angle is also the angle between the positive $x$-axis and the vector $\mathbf{v} = \begin{bmatrix} a \\ b \end{bmatrix}$. We will refer to matrices of the form $\begin{bmatrix} a & -b \\ b & a \end{bmatrix}$ as rotation-scaling matrices.
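This decomposition is easy to compute numerically. The sketch below (the helper name `rotation_scaling_parts` is ours, not from the text) recovers the scaling factor and rotation angle of a rotation-scaling matrix:

```python
import numpy as np

def rotation_scaling_parts(a, b):
    """Split [[a, -b], [b, a]] into a scaling factor s and a rotation angle theta."""
    s = np.hypot(a, b)          # sqrt(a^2 + b^2)
    theta = np.arctan2(b, a)    # angle with cos(theta) = a/s and sin(theta) = b/s
    return s, theta

s, theta = rotation_scaling_parts(1, 1)   # the matrix from Activity 21.3
print(s, np.degrees(theta))               # 1.4142... 45.0
```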

Subsection Matrices with Complex Eigenvalues

Now we will investigate how a general 2×2 matrix with complex eigenvalues can be seen to be similar (both in a linear algebra and a colloquial meaning) to a rotation-scaling matrix.

Activity 21.4.

Let $B = \begin{bmatrix} 1 & 5 \\ -2 & 3 \end{bmatrix}$. The eigenvalues of $B$ are $2 \pm 3i$. An eigenvector for the eigenvalue $2-3i$ is $\mathbf{v} = \begin{bmatrix} 5 \\ 1-3i \end{bmatrix}$. We will use this eigenvector to show that $B$ is similar to a rotation-scaling matrix.

(a)

Any complex vector $\mathbf{v}$ can be written as $\mathbf{v} = \mathbf{u} + i\mathbf{w}$ where both $\mathbf{u}$ and $\mathbf{w}$ are real vectors. What are these real vectors $\mathbf{u}$ and $\mathbf{w}$ for the eigenvector $\mathbf{v}$ above?

(b)

Let $P = [\mathbf{u} \ \mathbf{w}]$ be the matrix whose first column is the real part of $\mathbf{v}$ and whose second column is the imaginary part of $\mathbf{v}$ (without the $i$). Find $R = P^{-1}BP$.

(c)

Express R as a product of a rotation and a scaling matrix. What is the factor of scaling? What is the rotation angle?

In Activity 21.4, we saw that the matrix $B$ with complex eigenvalues $2 \pm 3i$ is similar to a rotation-scaling matrix. Specifically, $R = P^{-1}BP$, where the columns of $P$ are the real and imaginary parts of an eigenvector of $B$, is the rotation-scaling matrix with scaling factor $\sqrt{2^2+3^2} = \sqrt{13}$ and rotation angle $\theta = \arccos\left(\frac{2}{\sqrt{2^2+3^2}}\right)$.
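A quick numerical check of this similarity, using the matrices from Activity 21.4:

```python
import numpy as np

B = np.array([[1.0, 5.0],
              [-2.0, 3.0]])
# An eigenvector for 2 - 3i is (5, 1-3i): real part u = (5, 1), imaginary part w = (0, -3).
P = np.array([[5.0, 0.0],
              [1.0, -3.0]])

R = np.linalg.inv(P) @ B @ P
print(np.round(R, 10))               # [[ 2. -3.]  [ 3.  2.]]
print(np.hypot(2, 3))                # scaling factor sqrt(13) ≈ 3.6056
print(np.degrees(np.arctan2(3, 2)))  # rotation angle ≈ 56.31 degrees
```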

Does a similar decomposition result hold for a general 2×2 matrix with complex eigenvalues? We investigate this question in the next activity.

Activity 21.5.

Let $A$ be a $2\times 2$ matrix with complex eigenvalue $\lambda = a - bi$, $b \neq 0$, and corresponding complex eigenvector $\mathbf{v} = \mathbf{u} + i\mathbf{w}$.

(a)

Explain why $A\mathbf{v} = A\mathbf{u} + iA\mathbf{w}$.

(b)

Explain why $\lambda\mathbf{v} = (a\mathbf{u} + b\mathbf{w}) + i(a\mathbf{w} - b\mathbf{u})$.

(c)

Use the previous two results to explain why

  • $A\mathbf{u} = a\mathbf{u} + b\mathbf{w}$ and

  • $A\mathbf{w} = a\mathbf{w} - b\mathbf{u}$.

(d)

Let $P = [\mathbf{u} \ \mathbf{w}]$. We will now show that $AP = PR$ where $R = \begin{bmatrix} a & -b \\ b & a \end{bmatrix}$.

(i)

Without any calculation, explain why

$$AP = [A\mathbf{u} \ \ A\mathbf{w}].$$
(ii)

Recall that if $M$ is an $m\times n$ matrix and $\mathbf{x}$ is an $n\times 1$ vector, then the matrix product $M\mathbf{x}$ is a linear combination of the columns of $M$ with weights the corresponding entries of the vector $\mathbf{x}$. Use this idea to show that

$$PR = [\,a\mathbf{u}+b\mathbf{w} \ \ \ {-b\mathbf{u}+a\mathbf{w}}\,].$$
(iii)

Now explain why $AP = PR$.

(iv)

Assume for the moment that $P$ is an invertible matrix. Show that $A = PRP^{-1}$.

Your work in Activity 21.5 shows that any $2\times 2$ real matrix with complex eigenvalue $a - bi$, $b \neq 0$, is similar to a rotation-scaling matrix with scaling factor $\sqrt{a^2+b^2}$ and rotation angle $\theta = \arccos\left(\frac{a}{\sqrt{a^2+b^2}}\right)$ if $b \geq 0$, and $\theta = -\arccos\left(\frac{a}{\sqrt{a^2+b^2}}\right)$ if $b < 0$. Geometrically, this means that every $2\times 2$ real matrix with complex eigenvalues is just a scaled rotation ($R$) with respect to the basis $\mathcal{B}$ formed by $\mathbf{u}$ and $\mathbf{w}$ from the complex eigenvector $\mathbf{v}$. Multiplying by $P^{-1}$ and $P$ simply provides the change of basis from the standard basis to the basis $\mathcal{B}$, as we will see in detail when we learn about linear transformations.
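The construction in Activity 21.5 translates directly into code. The following sketch (our own helper, assuming a real $2\times 2$ input with non-real eigenvalues) builds $P$ and $R$ from the output of `np.linalg.eig`:

```python
import numpy as np

def rotation_scaling_form(A):
    """Return (P, R) with A = P @ R @ inv(P) and R = [[a, -b], [b, a]].

    A sketch: assumes A is a real 2x2 matrix with non-real eigenvalues.
    """
    lam, V = np.linalg.eig(A)
    k = np.argmin(lam.imag)          # choose the eigenvalue a - bi with b > 0
    a, b = lam[k].real, -lam[k].imag
    v = V[:, k]                      # eigenvector v = u + iw
    P = np.column_stack((v.real, v.imag))
    R = np.array([[a, -b], [b, a]])
    return P, R

A = np.array([[1.0, -2.0], [1.0, 3.0]])          # matrix from Example 21.4
P, R = rotation_scaling_form(A)
print(np.allclose(A, P @ R @ np.linalg.inv(P)))  # True
```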

The one fact that we have not yet addressed is why the matrix $P = [\mathbf{u} \ \mathbf{w}]$ is invertible. We do that now to complete the argument.

Let $A$ be a real $2\times 2$ matrix with $A\mathbf{v} = \lambda\mathbf{v}$, where $\lambda = a - bi$, $b \neq 0$, and $\mathbf{v} = \mathbf{u} + i\mathbf{w}$ (where $\mathbf{u}$ and $\mathbf{w}$ are in $\mathbb{R}^2$) with $\mathbf{w} \neq \mathbf{0}$. From Activity 21.5 we know that

$$A\mathbf{u} = a\mathbf{u} + b\mathbf{w} \quad\text{and}\quad A\mathbf{w} = a\mathbf{w} - b\mathbf{u}.$$

To show that $\mathbf{u}$ and $\mathbf{w}$ are linearly independent, we need to show that no nontrivial linear combination of $\mathbf{u}$ and $\mathbf{w}$ can be the zero vector. Suppose

$$x_1\mathbf{u} + x_2\mathbf{w} = \mathbf{0}$$

for some scalars $x_1$ and $x_2$. We will show that $x_1 = x_2 = 0$. Assume to the contrary that one of $x_1$, $x_2$ is not zero. First, assume $x_1 \neq 0$. Then $\mathbf{u} = -\frac{x_2}{x_1}\mathbf{w}$. Let $c = -\frac{x_2}{x_1}$, so that $\mathbf{u} = c\mathbf{w}$. From this we have

$$\begin{align*} A\mathbf{u} &= A(c\mathbf{w})\\ A\mathbf{u} &= cA\mathbf{w}\\ a\mathbf{u} + b\mathbf{w} &= c(a\mathbf{w} - b\mathbf{u})\\ (a + cb)\mathbf{u} &= (ca - b)\mathbf{w}\\ (a + cb)(c\mathbf{w}) &= (ca - b)\mathbf{w}. \end{align*}$$

Since $\mathbf{w} \neq \mathbf{0}$ we must have $(a + cb)c = ca - b$. A little algebra shows that $(c^2+1)b = 0$. Since $b \neq 0$, we conclude that $c^2 + 1 = 0$, which is impossible for a real constant $c$. Therefore, we cannot have $x_1 \neq 0$. A similar argument (left to the reader) shows that $x_2 = 0$. Thus we can conclude that $\mathbf{u}$ and $\mathbf{w}$ are linearly independent.

Subsection Examples

What follows are worked examples that use the concepts from this section.

Example 21.3.

Let $A = \begin{bmatrix} 0 & 1 & 0 \\ -1 & 0 & 1 \\ -1 & -1 & 1 \end{bmatrix}$.

(a)

Without doing any computations, explain why not all of the eigenvalues of A can be complex.

Solution.

Since complex eigenvalues occur in conjugate pairs, the eigenvalues with nonzero imaginary parts come in pairs. Since $A$ is $3\times 3$, it has three eigenvalues counted with multiplicity, so at most two of them can have nonzero imaginary parts. So at least one eigenvalue of $A$ must be real.

(b)

Find all of the eigenvalues of A.

Solution.

For this matrix $A$ we have $A - \lambda I_3 = \begin{bmatrix} -\lambda & 1 & 0 \\ -1 & -\lambda & 1 \\ -1 & -1 & 1-\lambda \end{bmatrix}$. Using a cofactor expansion along the first row gives us

$$\begin{align*} \det(A - \lambda I_3) &= (-\lambda)\bigl((-\lambda)(1-\lambda) + 1\bigr) - \bigl((-1)(1-\lambda) + 1\bigr)\\ &= -\lambda^3 + \lambda^2 - \lambda - \lambda\\ &= -\lambda^3 + \lambda^2 - 2\lambda\\ &= -\lambda\bigl(\lambda^2 - \lambda + 2\bigr). \end{align*}$$

The roots of the characteristic polynomial are $\lambda = 0$ and

$$\lambda = \frac{1 \pm \sqrt{1 - 4(2)}}{2} = \frac{1}{2}\bigl(1 \pm \sqrt{7}\,i\bigr).$$
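As a numerical sanity check (a sketch using NumPy; note that $\sqrt{7}/2 \approx 1.3229$):

```python
import numpy as np

A = np.array([[0.0, 1.0, 0.0],
              [-1.0, 0.0, 1.0],
              [-1.0, -1.0, 1.0]])
print(np.linalg.eig(A)[0])
# approximately [0.5+1.3229j, 0.5-1.3229j, 0j]
```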

Example 21.4.

Let $A = \begin{bmatrix} 1 & -2 \\ 1 & 3 \end{bmatrix}$. Find a rotation-scaling matrix $R$ that is similar to $A$. Identify the rotation angle and the scaling factor.

Solution.

The eigenvalues of A are the roots of the characteristic polynomial

$$p(\lambda) = \det(A - \lambda I_2) = \det\begin{bmatrix} 1-\lambda & -2 \\ 1 & 3-\lambda \end{bmatrix} = (1-\lambda)(3-\lambda) + 2 = \lambda^2 - 4\lambda + 5.$$

The quadratic formula shows that the roots of p(λ) are

$$\lambda = \frac{4 \pm \sqrt{-4}}{2} = 2 \pm i.$$

To find an eigenvector for $A$ with eigenvalue $2-i$, we row reduce

$$A - (2-i)I_2 = \begin{bmatrix} -1+i & -2 \\ 1 & 1+i \end{bmatrix}$$

to

$$\begin{bmatrix} 1 & 1+i \\ 0 & 0 \end{bmatrix}.$$

An eigenvector for $A$ with eigenvalue $2-i$ is then

$$\mathbf{v} = [-1-i \ \ 1]^{\mathsf{T}} = [-1 \ \ 1]^{\mathsf{T}} + i\,[-1 \ \ 0]^{\mathsf{T}}.$$

Letting $P = \begin{bmatrix} -1 & -1 \\ 1 & 0 \end{bmatrix}$, we have

$$R = P^{-1}AP = \begin{bmatrix} 2 & -1 \\ 1 & 2 \end{bmatrix}.$$

The scaling factor is $\sqrt{\det(R)} = \sqrt{5}$, and the angle of rotation $\theta$ satisfies $\cos(\theta) = \frac{2}{\sqrt{5}}$ and $\sin(\theta) = \frac{1}{\sqrt{5}}$. This makes $\theta \approx 0.4636$ radians, or approximately $26.57^\circ$ counterclockwise.
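A quick numerical check of this example:

```python
import numpy as np

A = np.array([[1.0, -2.0],
              [1.0, 3.0]])
P = np.array([[-1.0, -1.0],
              [1.0, 0.0]])

R = np.linalg.inv(P) @ A @ P
print(np.round(R, 10))               # [[ 2. -1.]  [ 1.  2.]]
print(np.sqrt(np.linalg.det(R)))     # scaling factor sqrt(5) ≈ 2.2361
print(np.degrees(np.arctan2(1, 2)))  # rotation angle ≈ 26.565 degrees
```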

Subsection Summary

  • For a real matrix, complex eigenvalues appear in conjugate pairs. Specifically, if $\lambda = a + ib$ is an eigenvalue of a real matrix $A$, then $\overline{\lambda} = a - ib$ is also an eigenvalue of $A$.

  • For a real matrix, if $\mathbf{v}$ is an eigenvector corresponding to $\lambda$, then the vector $\overline{\mathbf{v}}$ obtained by taking the complex conjugate of each entry in $\mathbf{v}$ is an eigenvector corresponding to $\overline{\lambda}$.

  • The rotation-scaling matrix $A = \begin{bmatrix} a & -b \\ b & a \end{bmatrix}$ can be written as

    $$\begin{bmatrix} \sqrt{a^2+b^2} & 0 \\ 0 & \sqrt{a^2+b^2} \end{bmatrix} \begin{bmatrix} \frac{a}{\sqrt{a^2+b^2}} & -\frac{b}{\sqrt{a^2+b^2}} \\ \frac{b}{\sqrt{a^2+b^2}} & \frac{a}{\sqrt{a^2+b^2}} \end{bmatrix}.$$

    This decomposition geometrically means that the transformation corresponding to $A$ can be viewed as a rotation by angle $\theta = \arccos\left(\frac{a}{\sqrt{a^2+b^2}}\right)$ if $b \geq 0$, or $\theta = -\arccos\left(\frac{a}{\sqrt{a^2+b^2}}\right)$ if $b < 0$, followed by a scaling by factor $\sqrt{a^2+b^2}$.

  • If $A$ is a real $2\times 2$ matrix with complex eigenvalue $a - bi$, $b \neq 0$, and corresponding eigenvector $\mathbf{v} = \mathbf{u} + i\mathbf{w}$, then $A$ is similar to the rotation-scaling matrix $R = \begin{bmatrix} a & -b \\ b & a \end{bmatrix}$. More specifically,

    $A = PRP^{-1}$, where $P = [\mathbf{u} \ \mathbf{w}]$.

Exercises

1.

Find eigenvalues and eigenvectors of each of the following matrices.

(c)

$\begin{bmatrix} 1 & -2 \\ 4 & 3 \end{bmatrix}$

2.

Find a rotation-scaling matrix whose rotation angle is $\theta = 3\pi/4$ and whose scaling factor is less than 1.

3.

Determine which rotation-scaling matrices have determinant equal to 1. Be as specific as possible.

4.

Determine the rotation-scaling matrix inside the matrix $\begin{bmatrix} 2 & 4 \\ -2 & 2 \end{bmatrix}$.

5.

Find a real 2×2 matrix with eigenvalue 1+2i.

6.

Find a real 2×2 matrix which is not a rotation-scaling matrix with eigenvalue 1+2i.

7.

We have seen how to find the characteristic polynomial of an n×n matrix. In this exercise we consider the reverse question. That is, given a polynomial p(λ) of degree n, can we find an n×n matrix whose characteristic polynomial is p(λ)?

(a)

Find the characteristic polynomial of the $2\times 2$ matrix $C = \begin{bmatrix} 0 & -a_0 \\ 1 & -a_1 \end{bmatrix}$. Use this result to find a real-valued matrix whose eigenvalues are $1+i$ and $1-i$.

(b)

Repeat part (a) by showing that $p(\lambda) = -\bigl(\lambda^3 + a_2\lambda^2 + a_1\lambda + a_0\bigr)$ is the characteristic polynomial of the $3\times 3$ matrix $C = \begin{bmatrix} 0 & 0 & -a_0 \\ 1 & 0 & -a_1 \\ 0 & 1 & -a_2 \end{bmatrix}$.

(c)

We can generalize this argument. Prove, using mathematical induction, that the polynomial

$$p(\lambda) = (-1)^n\bigl(\lambda^n + a_{n-1}\lambda^{n-1} + a_{n-2}\lambda^{n-2} + \cdots + a_1\lambda + a_0\bigr)$$

is the characteristic polynomial of the matrix

$$C = \begin{bmatrix} 0 & 0 & 0 & \cdots & 0 & -a_0 \\ 1 & 0 & 0 & \cdots & 0 & -a_1 \\ 0 & 1 & 0 & \cdots & 0 & -a_2 \\ \vdots & \vdots & \ddots & & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & 1 & -a_{n-1} \end{bmatrix}.$$

The matrix C is called the companion matrix for p(λ).
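To experiment with this exercise, here is a short sketch (the helper `companion` is ours) that builds the companion matrix from the coefficients of a monic polynomial and confirms that its eigenvalues are the polynomial's roots:

```python
import numpy as np

def companion(coeffs):
    """Companion matrix of x^n + a_{n-1}x^{n-1} + ... + a_1 x + a_0.

    coeffs = [a_0, a_1, ..., a_{n-1}] in ascending order.
    """
    n = len(coeffs)
    C = np.zeros((n, n))
    C[1:, :-1] = np.eye(n - 1)       # ones on the subdiagonal
    C[:, -1] = -np.array(coeffs)     # last column holds -a_0, ..., -a_{n-1}
    return C

# lambda^2 - 2*lambda + 2 has roots 1 + i and 1 - i, as in part (a).
C = companion([2.0, -2.0])
print(np.linalg.eig(C)[0])           # [1.+1.j  1.-1.j]
```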

8.

Label each of the following statements as True or False. Provide justification for your response.

(a) True/False.

If $3-4i$ is an eigenvalue of a real matrix, then so is $3+4i$.

(b) True/False.

If 2+3i is an eigenvalue of a 3×3 real matrix A, then A has three distinct eigenvalues.

(c) True/False.

Every 2×2 real matrix with complex eigenvalues is a rotation-scaling matrix.

(d) True/False.

Every square matrix with real entries has real number eigenvalues.

(e) True/False.

If A is a 2×2 matrix with complex eigenvalues similar to a rotation-scaling matrix R, the eigenvalues of A and R are the same.

(f) True/False.

If A is a real matrix with complex eigenvalues, all eigenvectors of A must be non-real.

Subsection Project: Understanding the Gershgorin Disk Theorem

To understand the Gershgorin Disk Theorem, we need to recall how to visualize a complex number in the plane. Recall that a complex number $z$ is a number of the form $z = a + bi$ where $a$ and $b$ are real numbers and $i^2 = -1$. The number $a$ is the real part of $z$, denoted $\text{Re}(z)$, and $b$ is the imaginary part of $z$, denoted $\text{Im}(z)$. The set of all complex numbers is denoted $\mathbb{C}$. We define addition and multiplication on $\mathbb{C}$ as follows. For $a+bi, c+di \in \mathbb{C}$,

$$(a+bi) + (c+di) = (a+c) + (b+d)i \quad\text{and}\quad (a+bi)(c+di) = (ac-bd) + (ad+bc)i.$$

Note that the product is what we would expect if we "expanded" the product in the normal way and used the fact that $i^2 = -1$. The set of complex numbers forms a field; that is, $\mathbb{C}$ satisfies all of the same properties as $\mathbb{R}$ as stated in Theorem 4.3.

We can visualize the complex number $a+bi$ in the plane as the point $(a,b)$. Here we are viewing the horizontal axis as the real axis and the vertical axis as the imaginary axis. The length (or magnitude) of the complex number $z = a+bi$, which we denote $|z|$, is the distance from the origin to $z$. So by the Pythagorean Theorem we have $|a+bi| = \sqrt{a^2+b^2}$. Note that the magnitude of $z = a+bi$ can be written as a complex product:

$$|z| = \sqrt{(a+bi)(a-bi)}.$$

The complex number $a - bi$ is called the complex conjugate of $z = a+bi$ and is denoted $\overline{z}$. A few important properties of complex numbers and their conjugates are the following. Let $z = a+bi$ and $w = c+di$ be complex numbers. Then

  • $\overline{z+w} = \overline{(a+c) + (b+d)i} = (a+c) - (b+d)i = (a-bi) + (c-di) = \overline{z} + \overline{w}$,

  • $\overline{zw} = \overline{(ac-bd) + (ad+bc)i} = (ac-bd) - (ad+bc)i = (a-bi)(c-di) = \overline{z}\,\overline{w}$,

  • $\overline{\overline{z}} = z$,

  • $|z| = \sqrt{a^2+b^2} \geq \sqrt{a^2} = |a| = |\text{Re}(z)|$,

  • $|z| = \sqrt{a^2+b^2} \geq \sqrt{b^2} = |b| = |\text{Im}(z)|$,

  • $|\overline{z}| = |z|$,

  • $|z| = 0$ if and only if $z = 0$,

  • If $p(x)$ is a polynomial with real coefficients and the complex number $z$ satisfies $p(z) = 0$, then $p(\overline{z}) = 0$ as well.

Using these facts we can show that the triangle inequality holds for complex numbers. That is,

$$|z+w| \leq |z| + |w|.$$

To see why, notice that

$$\begin{align*} |z+w|^2 &= (z+w)\overline{(z+w)}\\ &= (z+w)(\overline{z}+\overline{w})\\ &= z\overline{z} + z\overline{w} + w\overline{z} + w\overline{w}\\ &= z\overline{z} + z\overline{w} + \overline{z\overline{w}} + w\overline{w}\\ &= |z|^2 + 2\,\text{Re}(z\overline{w}) + |w|^2\\ &\leq |z|^2 + 2|z\overline{w}| + |w|^2\\ &= |z|^2 + 2|z||w| + |w|^2\\ &= (|z| + |w|)^2. \end{align*}$$

Since $|z+w|$, $|z|$, and $|w|$ are all non-negative, taking square roots of both sides gives us $|z+w| \leq |z| + |w|$, as desired. We can extend this triangle inequality to any number of complex numbers. That is, if $z_1, z_2, \ldots, z_k$ are complex numbers, then

$$|z_1 + z_2 + \cdots + z_k| \leq |z_1| + |z_2| + \cdots + |z_k|. \tag{21.1}$$

We can prove Equation (21.1) by mathematical induction. We have already done the $k=2$ case, and so we assume that Equation (21.1) is true for any sum of $k$ complex numbers. Now suppose that $z_1, z_2, \ldots, z_k, z_{k+1}$ are complex numbers. Then

$$\begin{align*} |z_1 + z_2 + \cdots + z_k + z_{k+1}| &= |(z_1 + z_2 + \cdots + z_k) + z_{k+1}|\\ &\leq |z_1 + z_2 + \cdots + z_k| + |z_{k+1}|\\ &\leq (|z_1| + |z_2| + \cdots + |z_k|) + |z_{k+1}|\\ &= |z_1| + |z_2| + \cdots + |z_k| + |z_{k+1}|. \end{align*}$$
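Python's built-in complex type makes these facts easy to check on examples (the particular numbers below are our own):

```python
z, w = 3 + 4j, 1 - 2j

print(abs(z))                          # 5.0, i.e. sqrt(3^2 + 4^2)
print(z * z.conjugate())               # (25+0j), so |z| = sqrt(z * conj(z))
print(abs(z + w) <= abs(z) + abs(w))   # True: the triangle inequality
```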

To prove the Gershgorin Disk Theorem, we will use the Levy-Desplanques Theorem, which gives conditions that guarantee that a matrix is invertible. We illustrate with an example in the following activity.

Project Activity 21.6.

Let $A = \begin{bmatrix} 3 & 2 \\ 1 & 4 \end{bmatrix}$. Since $\det(A) \neq 0$, we know that $A$ is an invertible matrix. Let us assume for a moment that we don't know that $A$ is invertible and try to determine whether 0 is an eigenvalue of $A$. In other words, we want to know if there is a nonzero vector $\mathbf{v}$ so that $A\mathbf{v} = \mathbf{0}$. Assuming the existence of such a vector $\mathbf{v} = [v_1 \ v_2]^{\mathsf{T}}$, for $A\mathbf{v}$ to be $\mathbf{0}$ it must be the case that

$$3v_1 + 2v_2 = 0 \quad\text{and}\quad v_1 + 4v_2 = 0.$$

Since the vector $\mathbf{v}$ is not the zero vector, at least one of $v_1$, $v_2$ is not zero. Note that if one of $v_1$, $v_2$ is zero, then so is the other. So we can assume that $v_1$ and $v_2$ are both nonzero.

(a)

Use the fact that $3v_1 + 2v_2 = 0$ to show that $|v_2| > |v_1|$.

(b)

Use the fact that $v_1 + 4v_2 = 0$ to show that $|v_1| > |v_2|$. What conclusion can we draw about whether 0 is an eigenvalue of $A$? Why does this mean that $A$ is invertible?

What makes the arguments work in Project Activity 21.6 is that |3|>|2| and |4|>|1|. This argument can be extended to larger matrices, as described in the following theorem.

Theorem (The Levy-Desplanques Theorem).

If $A = [a_{ij}]$ is an $n\times n$ matrix satisfying $|a_{ii}| > \sum_{j\neq i} |a_{ij}|$ for every $i$, then $A$ is invertible.

Proof.

Let $A = [a_{ij}]$ be an $n\times n$ matrix satisfying $|a_{ii}| > \sum_{j\neq i} |a_{ij}|$ for all $i$. Let us assume that $A$ is not invertible, that is, that there is a vector $\mathbf{v} \neq \mathbf{0}$ such that $A\mathbf{v} = \mathbf{0}$. Let $\mathbf{v} = [v_1 \ v_2 \ \cdots \ v_n]^{\mathsf{T}}$, and let $t$ be between 1 and $n$ so that $|v_t| \geq |v_i|$ for all $i$. That is, choose $v_t$ to be the component of $\mathbf{v}$ with the largest absolute value.

Expanding the product Av using the row-column product along the tth row shows that

$$a_{t1}v_1 + a_{t2}v_2 + \cdots + a_{tn}v_n = 0.$$

Solving for the att term gives us

$$a_{tt}v_t = -\bigl(a_{t1}v_1 + a_{t2}v_2 + \cdots + a_{t(t-1)}v_{t-1} + a_{t(t+1)}v_{t+1} + \cdots + a_{tn}v_n\bigr).$$

Then

$$\begin{align*} |a_{tt}||v_t| &= \left|-\bigl(a_{t1}v_1 + a_{t2}v_2 + \cdots + a_{t(t-1)}v_{t-1} + a_{t(t+1)}v_{t+1} + \cdots + a_{tn}v_n\bigr)\right|\\ &= \left|a_{t1}v_1 + a_{t2}v_2 + \cdots + a_{t(t-1)}v_{t-1} + a_{t(t+1)}v_{t+1} + \cdots + a_{tn}v_n\right|\\ &\leq |a_{t1}||v_1| + |a_{t2}||v_2| + \cdots + |a_{t(t-1)}||v_{t-1}| + |a_{t(t+1)}||v_{t+1}| + \cdots + |a_{tn}||v_n|\\ &\leq |a_{t1}||v_t| + |a_{t2}||v_t| + \cdots + |a_{t(t-1)}||v_t| + |a_{t(t+1)}||v_t| + \cdots + |a_{tn}||v_t|\\ &= \bigl(|a_{t1}| + |a_{t2}| + \cdots + |a_{t(t-1)}| + |a_{t(t+1)}| + \cdots + |a_{tn}|\bigr)|v_t|. \end{align*}$$

Since $|v_t| \neq 0$, we can cancel the $|v_t|$ terms to conclude that

$$|a_{tt}| \leq |a_{t1}| + |a_{t2}| + \cdots + |a_{t(t-1)}| + |a_{t(t+1)}| + \cdots + |a_{tn}|.$$

But this contradicts the condition that $|a_{ii}| > \sum_{j\neq i} |a_{ij}|$ for all $i$. We conclude that 0 is not an eigenvalue of $A$, and so $A$ is invertible.

Any matrix A=[aij] satisfying the condition of the Levy-Desplanques Theorem is given a special name.

Definition 21.6.

A square matrix $A = [a_{ij}]$ is strictly diagonally dominant if $|a_{ii}| > \sum_{j\neq i} |a_{ij}|$ for all $i$.

So any strictly diagonally dominant matrix is invertible. A quick glance can show that a matrix is strictly diagonally dominant. For example, since $|3| > |1| + |-1|$, $|12| > |5| + |6|$, and $|8| > |-2| + |4|$, the matrix

$$A = \begin{bmatrix} 3 & 1 & -1 \\ 5 & 12 & 6 \\ -2 & 4 & 8 \end{bmatrix}$$

is strictly diagonally dominant and therefore invertible. However, just because a matrix is not strictly diagonally dominant, it does not follow that the matrix is non-invertible. For example, the matrix $B = \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix}$ is invertible, but not strictly diagonally dominant.
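Checking strict diagonal dominance is a one-pass computation over the rows, as in this sketch (the helper name is ours):

```python
import numpy as np

def is_strictly_diagonally_dominant(A):
    """True if |a_ii| > sum of |a_ij| over j != i, for every row i."""
    A = np.asarray(A)
    diag = np.abs(np.diag(A))
    off_diag_sums = np.abs(A).sum(axis=1) - diag
    return bool(np.all(diag > off_diag_sums))

print(is_strictly_diagonally_dominant([[3, 1, -1], [5, 12, 6], [-2, 4, 8]]))  # True
print(is_strictly_diagonally_dominant([[1, 2], [0, 1]]))                      # False
```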

Now we can address the Gershgorin Disk Theorem.

Project Activity 21.7.

Let A be an arbitrary n×n matrix and assume that λ is an eigenvalue of A.

(a)

Explain why the matrix $A - \lambda I$ is singular.

(b)

What does the Levy-Desplanques Theorem tell us about the matrix $A - \lambda I$?

(c)

Explain how we can conclude the Gershgorin Disk Theorem.

Based on this theorem, we define a Gershgorin disk to be $D(a_{ii}, r_i)$, where $r_i = \sum_{j\neq i} |a_{ij}|$.

(d)

Use the Gershgorin Disk Theorem to give estimates on the locations of the eigenvalues of the matrix $A = \begin{bmatrix} 1 & 2 \\ 3 & 2 \end{bmatrix}$.
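The disks themselves are elementary to compute. The sketch below (our own helper) lists the Gershgorin disks and the eigenvalues of the complex matrix from the start of this section, whose disks appear in Figure 21.1:

```python
import numpy as np

def gershgorin_disks(A):
    """Return (center, radius) pairs, one per row of the square matrix A."""
    A = np.asarray(A)
    radii = np.abs(A).sum(axis=1) - np.abs(np.diag(A))
    return list(zip(np.diag(A), radii))

A = np.array([[3, 1, -1],
              [0, -1 + 1j, 1j],
              [2, 1, -2j]])

for center, radius in gershgorin_disks(A):
    print(f"disk centered at {center} with radius {radius}")
print(np.linalg.eig(A)[0])   # every eigenvalue lies in some disk
```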

The Gershgorin Disk Theorem has a consequence that gives additional information about the eigenvalues if some of the Gershgorin disks do not overlap.

Theorem.

Let $A$ be an $n\times n$ matrix, and let $S$ be a union of $m$ Gershgorin disks of $A$ that does not intersect any of the remaining Gershgorin disks of $A$. Then $S$ contains exactly $m$ eigenvalues of $A$, counted with multiplicity.

Proof.

Most proofs of this theorem require some results from topology. For that reason, we will not present a completely rigorous proof but rather give the highlights. Let $A = [a_{ij}]$ be an $n\times n$ matrix. Let $D_i$, for $1 \leq i \leq m$, be a collection of Gershgorin disks of $A$ such that $S = \bigcup_{1 \leq i \leq m} D_i$ does not intersect any other Gershgorin disk of $A$, and let $S'$ be the union of the Gershgorin disks of $A$ that are different from the $D_i$. Note that $S \cap S' = \emptyset$. Let $C$ be the matrix whose $i$th column is $a_{ii}\mathbf{e}_i$; that is, $C$ is the diagonal matrix whose diagonal entries are the corresponding diagonal entries of $A$. Note that the eigenvalues of $C$ are the $a_{ii}$ and the Gershgorin disks of $C$ are just the points $a_{ii}$. So our theorem is true for this matrix $C$. To prove the result, we build a continuum of matrices from $C$ to $A$ as follows: let $B = A - C$ (so that $B$ is the matrix whose off-diagonal entries are those of $A$ and whose diagonal entries are 0), and let $A(t) = tB + C$ for $t$ in the interval $[0,1]$. Note that $A(1) = A$. Since the diagonal entries of $A(t)$ are the same as those of $A$, the Gershgorin disks of $A(t)$ have the same centers as the corresponding Gershgorin disks of $A$, while the radii of the Gershgorin disks of $A(t)$ are those of $A$ scaled by $t$. So the Gershgorin disks of $A(t)$ grow from the points $a_{ii}$ to the Gershgorin disks of $A$ as $t$ increases from 0 to 1. While the centers of the disks all remain fixed, it is important to recognize that the eigenvalues of $A(t)$ move as $t$ changes. An illustration of this is shown in Figure 21.9, with the eigenvalues as the black points and the changing Gershgorin disks dashed in magenta, using the matrix $\begin{bmatrix} i & \frac{1}{2} \\ 1 & 2+i \end{bmatrix}$. We can learn about how the eigenvalues move through the characteristic polynomial.

Figure 21.9. How eigenvalues move.

Let $p(t,x)$ be the characteristic polynomial of $A(t)$. Note that these characteristic polynomials are functions of both $t$ and $x$. Since polynomials are continuous functions, their roots (the eigenvalues of $A(t)$) are continuous for $t \in [0,1]$ as well. Let $\lambda(t)$ be an eigenvalue of $A(t)$. Note that $\lambda(1)$ is an eigenvalue of $A$, and $\lambda(0)$ is one of the $a_{ii}$. Suppose $\lambda(0)$ is a center of one of the disks $D_i$, so that $\lambda(0)$ is in $S$. We will argue that $\lambda(t)$ is in $S$ for every value of $t$ in $[0,1]$. Let $r_i$ be the radius of $D_i$ and let $D_i(t)$ be the Gershgorin disk of $A(t)$ with the same center as $D_i$ and radius $r_i(t) = t r_i$. Let $S(t) = \bigcup_{1 \leq i \leq m} D_i(t)$. Since $r_i(t) \leq r_i$, it follows that $D_i(t) \subseteq D_i$, and so $S(t) \cap S' = \emptyset$ as well. From topology, we know that since the disks $D_i$ are closed, the union $S$ of these disks is also closed. Similarly, $S(t)$ and $S'$ are closed. By the Gershgorin Disk Theorem, $\lambda(t)$ lies in the union of the Gershgorin disks of $A(t)$, which is contained in $S \cup S'$. Since $\lambda(t)$ is continuous, begins in $S$ when $t = 0$, and $S$ and $S'$ are disjoint closed sets, $\lambda(t)$ cannot pass from $S$ to $S'$. Thus, $\lambda(t)$ is in $S$ for every value of $t$ in $[0,1]$.