Approximating Eigenvalues and Eigenvectors

Section 20 Approximating Eigenvalues and Eigenvectors

Focus Questions

By the end of this section, you should be able to give precise and thorough answers to the questions listed below. You may want to keep these questions in mind to focus your thoughts as you complete the section.

What is the power method for?
How does the power method work?
How can we use the inverse power method to approximate any eigenvalue/eigenvector pair?

Subsection Application: Leslie Matrices and Population Modeling

The Leslie Matrix (also called the Leslie Model) is a powerful model for describing an age distributed growth of a population that is closed to migration. In a Leslie model, it is usually the case that only one gender (most often female) is considered. As an example, we will later consider a population of sheep that is being grown commercially. A natural question that we will address is how we can harvest the population to build a sustainable environment.

When working with populations, the matrices we use are often large. For large matrices, using the characteristic polynomial to calculate eigenvalues is too time and resource consuming to be practical, and we generally cannot find the exact values of the eigenvalues. As a result, approximation techniques are very important. In this section we will explore a method for approximating eigenvalues. The eigenvalues of a Leslie matrix are important because they describe the limiting or steady-state behavior of a population. The matrix and model were introduced by Patrick H. Leslie in “On the Use of Matrices in Certain Population Mathematics”, Leslie, P.H., Biometrika, Volume XXXIII, November 1945, pp. 183-212.

Subsection Introduction

We have used the characteristic polynomial to find the eigenvalues of a matrix, and for each eigenvalue row reduced a corresponding matrix to find the eigenvectors This method is only practical for small matrices — for more realistic applications approximation techniques are used. We investigate one such technique in this section, the power method.

Preview Activity 20.1.

Let \(A = \left[ \begin{array}{cc} 2\amp 6 \\ 5\amp 3 \end{array} \right]\text{.}\) Our goal is to find a scalar \(\lambda\) and a nonzero vector \(\vv\) so that \(A \vv = \lambda \vv\text{.}\)

(a)

If we have no prior knowledge of the eigenvalues and eigenvectors of this matrix, we might just begin with a guess. Let \(\vx_0 = [1 \ 0]^{\tr}\) be such a guess for an eigenvector. Calculate \(A \vx_0\text{.}\) Is \(\vx_0\) an eigenvector of \(A\text{?}\) Explain.

(b)

If \(\vx_0\) is not a good approximation to an eigenvector of \(A\text{,}\) then we need to make a better guess. We have little to work with other than just random guessing, but we can use \(\vx_1 = A\vx_0\) as another guess. We calculated \(\vx_1\) in part 1. Is \(\vx_1\) an eigenvector for \(A\text{?}\) Explain.

(c)

In parts (a) and (b) you might have noticed that in some sense \(\vx_1\) is closer to being an eigenvector of \(A\) than \(\vx_0\) was. So maybe continuing this process will get us closer to an eigenvector of \(A\text{.}\) In other words, for each positive integer \(k\) we define \(\vx_k\) as \(A \vx_{k-1}\text{.}\) Before we proceed, however, we should note that as we calculate the vectors \(\vx_1\text{,}\) \(\vx_2\text{,}\) \(\vx_3\text{,}\) \(\ldots\text{,}\) the entries in the vectors get large very quickly. So it will be useful to scale the entries so that they stay at a reasonable size, which makes it easier to interpret the output. One way to do this is to divide each vector \(\vx_i\) by its largest component in absolute value so that all of the entries stay between \(-1\) and \(1\text{.}\)³⁷ So in our example we have \(\vx_0 = [1 \ 0]^{\tr}\text{,}\) \(\vx_1 = [2/5 \ 1]^{\tr}\text{,}\) and \(\vx_2 = [1 \ 25/34]^{\tr}\text{.}\) Explain why scaling our vectors will not affect our search for an eigenvector.

(d)

Use an appropriate technological tool to find the vectors \(\vx_k\) up to \(k=10\text{.}\) What do you think the limiting vector \(\lim_{k \to \infty} \vx_k\) is? Is this limiting vector an eigenvector of \(A\text{?}\) If so, what is the corresponding eigenvalue?

Subsection The Power Method

While the examples we present in this text are small in order to highlight the concepts, matrices that appear in real life applications are often enormous. For example, in Google's PageRank algorithm that is used to determine relative rankings of the importance of web pages, matrices of staggering size are used (most entries in the matrices are zero, but the size of the matrices is still huge). Finding eigenvalues of such large matrices through the characteristic polynomial is impractical. In fact, finding the roots of all but the smallest degree characteristic polynomials is a very difficult problem. As a result, using the characteristic polynomial to find eigenvalues and then finding eigenvectors is not very practical in general, and it is often a better option to use a numeric approximation method. We will consider one such method in this section, the power method.

In Preview Activity 20.1, we saw an example of a matrix \(A = \left[ \begin{array}{cc} 2\amp 6 \\ 5\amp 3 \end{array} \right]\) so that the sequence \(\{\vx_k\}\text{,}\) where \(\vx_k = A \vx_{k-1}\text{,}\) converged to a dominant eigenvector of \(A\) for an initial guess vector \(\vx_0 = [1 \ 0]^{\tr}\text{.}\) The vectors \(\vx_i\) for \(i\) from \(1\) to \(6\) (with scaling) are approximately

\begin{align*} \vx_1 \amp = \left[ \begin{array}{c} 0.4 \\ 1 \end{array} \right] \amp \vx_2 \amp = \left[ \begin{array}{c} 1 \\ 0.7353 \end{array} \right] \amp \vx_3 \amp = \left[ \begin{array}{c} 0.8898 \\ 1 \end{array} \right]\\ \vx_4 \amp = \left[ \begin{array}{c} 1 \\ 0.9575 \end{array} \right] \amp \vx_5 \amp = \left[ \begin{array}{c} 0.9838 \\ 1 \end{array} \right] \amp \vx_6 \amp = \left[ \begin{array}{c} 1 \\ 0.9939 \end{array} \right]\text{.} \end{align*}

Numerically we can see that the sequence \(\{\vx_k\}\) approaches the vector \([1 \ 1]^{\tr}\text{,}\) and Figure 20.1 illustrates this geometrically as well.

This method of successive approximations \(\vx_k = A \vx_{k-1}\) is called the power method (since we could write \(\vx_k\) as \(A^k \vx_0\)). Our task now is to show that this method works in general. In the next activity we restrict our argument to the \(2 \times 2\) case, and then discuss the general case afterwards.

Let \(A\) be an arbitrary \(2 \times 2\) matrix with two linearly independent eigenvectors \(\vv_1\) and \(\vv_2\) and corresponding eigenvalues \(\lambda_1\) and \(\lambda_2\text{,}\) respectively. We will also assume \(|\lambda_1| > |\lambda_2|\text{.}\) An eigenvalue whose absolute value is larger than that of any other eigenvalue is called a dominant eigenvalue. Any eigenvector for a dominant eigenvalue is called a dominant eigenvector. Before we show that our method can be used to approximate a dominant eigenvector, we recall that since \(\vv_1\) and \(\vv_2\) are eigenvectors corresponding to distinct eigenvalues, then \(\vv_1\) and \(\vv_2\) are linearly independent. So there exist scalars \(a_1\) and \(a_2\) such that

\begin{equation*} \vx_0 = a_1 \vv_1 + a_2 \vv_2\text{.} \end{equation*}

We have seen that for each positive integer \(k\) we can write \(\vx_n\) as

\begin{equation} \vx_k = a_1 \lambda_1^k \vv_1 + a_2 \lambda_2^k \vv_2\text{.}\tag{20.1} \end{equation}

With this representation of \(\vx_0\) we can now see why the power method approximates a dominant eigenvector of \(A\text{.}\)

Activity 20.2.

Assume as above that \(A\) is an arbitrary \(2 \times 2\) matrix with two linearly independent eigenvectors \(\vv_1\) and \(\vv_2\) and corresponding eigenvalues \(\lambda_1\) and \(\lambda_2\text{,}\) respectively. (We are assuming that we don't know these eigenvectors, but we can assume that they exist.) Assume that \(\lambda_1\) is the dominant eigenvalue for \(A\text{,}\) \(\vx_0\) is some initial guess to an eigenvector for \(A\text{,}\) that \(\vx_0 = a_1 \vv_1 + a_2 \vv_2\text{,}\) and that \(\vx_k = A\vx_{k-1}\) for \(k \geq 1\text{.}\)

(a)

We divide both sides of equation (20.1) by \(\lambda_1^k\) (since \(\lambda_1\) is the dominant eigenvalue, we know that \(\lambda_1\) is not \(0\)) to obtain

\begin{equation} \frac{1}{\lambda_1^k}\vx_k = a_1 \vv_1 + a_2 \left(\frac{\lambda_2}{\lambda_1}\right)^k \vv_2\text{.}\tag{20.2} \end{equation}

Recall that \(\lambda_1\) is the dominant eigenvalue for \(A\text{.}\) What happens to \(\left( \frac{\lambda_2}{\lambda_1} \right)^k\) as \(k \to \infty\text{?}\) Explain what happens to the right hand side of equation (20.2) as \(k \to \infty\text{.}\)

(b)

Explain why the previous result tells us that the vectors \(\vx_k\) are approaching a vector in the direction of \(\vv_1\) or \(-\vv_1\) as \(k \to \infty\text{,}\) assuming \(a_1 \neq 0\text{.}\) (Why do we need \(a_1 \neq 0\text{?}\) What happens if \(a_1=0\text{?}\))

(c)

What does all of this tell us about the sequence \(\{\vx_k\}\) as \(k \to \infty\text{?}\)

The power method is straightforward to implement, but it is not without its drawbacks. We began by assuming that we had a basis of eigenvectors of a matrix \(A\text{.}\) So we are also assuming that \(A\) is diagonalizable. We also assumed that \(A\) had a dominant eigenvalue \(\lambda_1\text{.}\) That is, if \(A\) is \(n \times n\) we assume that \(A\) has eigenvalues \(\lambda_1\text{,}\) \(\lambda_2\text{,}\) \(\ldots\text{,}\) \(\lambda_n\text{,}\) not necessarily distinct, with

\begin{equation*} |\lambda_1| > |\lambda_2| \geq |\lambda_3| \geq \cdots \geq |\lambda_n| \end{equation*}

and with \(\vv_i\) an eigenvector of \(A\) with eigenvalue \(\lambda_i\text{.}\) We could then write any initial guess \(\vx_0\) in the form

\begin{equation*} \vx_0 = a_1 \vv_1 + a_2\vv_2 + \cdots + a_n \vv_n\text{.} \end{equation*}

The initial guess is also called a seed.

Then

\begin{equation*} \vx_k = a_1 \lambda_1^k \vv_1 + a_2 \lambda_2^k \vv_2 + \cdots + a_n \lambda_n^k \vv_n \end{equation*}

and

\begin{equation} \frac{1}{\lambda_1^k} \vx_k = a_1 \vv_1 + a_2 \left(\frac{\lambda_2}{\lambda_1}\right)^k \vv_2 + \cdots + a_n \left(\frac{\lambda_n}{\lambda_1}\right)^k \vv_n\text{.}\tag{20.3} \end{equation}

Notice that we are not actually calculating the vectors \(\vx_k\) here — this is a theoretical argument and we don't know \(\lambda_1\) and are not performing any scaling like we did in Preview Activity 20.1. We are assuming that \(\lambda_1\) is the dominant eigenvalue of \(A\text{,}\) though, so for each \(i\) the terms \(\left(\frac{\lambda_i}{\lambda_1}\right)^k\) converge to \(0\) as \(k\) goes to infinity. Thus,

\begin{equation*} \vx_k \approx \lambda_1^k a_1 \vv_1 \end{equation*}

for large values of \(k\text{,}\) which makes the sequence \(\{\vx_k\}\) converge to a vector in the direction of a dominant eigenvector \(\vv_1\) provided \(a_1 \neq 0\text{.}\) So we need to be careful enough to choose a seed that has a nonzero component in the direction of \(\vv_1\text{.}\) Of course, we generally don't know that our matrix is diagonalizable before we make these calculations, but for many matrices the sequence \(\{\vx_k\}\) will approach a dominant eigenvector.

The power method approximates a dominant eigenvector, and there are ways that we can approximate the dominant eigenvalue. Exercise 8 presents one way — by keeping track of the components of the \(\vx_k\) that have the largest absolute values, and the next activity shows another.

Activity 20.3.

Let \(A\) be an \(n \times n\) matrix with eigenvalue \(\lambda\) and corresponding eigenvector \(\vv\text{.}\)

(a)

Explain why \(\lambda = \frac{\lambda (\vv \cdot \vv)}{\vv \cdot \vv}\text{.}\)

(b)

Use the result of part (a) to explain why \(\lambda = \frac{(A\vv) \cdot \vv}{\vv \cdot \vv}\text{.}\)

The result of Activity 20.3 is that, when the vectors in the sequence \(\{\vx_k\}\) approximate a dominant eigenvector of a matrix \(A\text{,}\) the quotients

\begin{equation} \frac{(A\vx_k) \cdot \vx_k}{\vx_k \cdot \vx_k} = \frac{\vx_k^{\tr}A\vx_k}{\vx_k^{\tr}\vx_k}\tag{20.4} \end{equation}

approximate the dominant eigenvalue of \(A\text{.}\) The quotients in (20.4) are called Rayleigh quotients.

To summarize, the procedure for applying the power method for approximating a dominant eigenvector and dominant eigenvalue of a matrix \(A\) is as follows.

Step 1: Select an arbitrary nonzero vector \(\vx_0\) as an initial guess to a dominant eigenvector.
Step 2: Let \(\vx_1 = A\vx_0\text{.}\) Let \(k = 1\text{.}\)
Step 3: To avoid having the magnitudes of successive approximations become excessively large, scale this approximation \(\vx_k\text{.}\) That is, find the entry \(\alpha_k\) of \(\vx_k\) that is largest in absolute value. Then replace \(\vx_k\) by \(\frac{1}{\alpha_k} \vx_k\text{.}\)
Step 4: Calculate the Rayleigh quotient \(r_k = \frac{(A\vx_{k}) \cdot \vx_k}{\vx_k \cdot \vx_k}\text{.}\)
Step 5: Let let \(\vx_{k+1} = A \vx_k\text{.}\) Increase \(k\) by \(1\) and repeat Steps 3 through 5.

If the sequence \(\{\vx_k\}\) converges to a dominant eigenvector of \(A\text{,}\) then the sequence \(\{r_k\}\) converges to the dominant eigenvalue of \(A\text{.}\)

The power method can be useful for approximating a dominant eigenvector as long as the successive multiplications by \(A\) are fairly simple — for example, if many entries of \(A\) are zero.³⁸ The rate of convergence of the sequence \(\{\vx_k\}\) depends on the ratio \(\frac{\lambda_2}{\lambda_1}\text{.}\) If this ratio is close to \(1\text{,}\) then it can take many iterations before the power \(\left(\frac{\lambda_2}{\lambda_1}\right)^k\) makes the \(\vv_2\) term negligible. There are other methods for approximating eigenvalues and eigenvectors, e.g., the QR factorization, that we will not discuss at this point.

Subsection The Inverse Power Method

The power method only allows us to approximate the dominant eigenvalue and a dominant eigenvector for a matrix \(A\text{.}\) It is possible to modify this method to approximate other eigenvectors and eigenvalues under certain conditions. We consider an example in the next activity to motivate the general situation.

Activity 20.4.

Let \(A = \left[ \begin{array}{cc} 2\amp 6 \\ 5\amp 3 \end{array} \right]\) be the matrix from Preview Activity 20.1. Recall that \(8\) is an eigenvalue for \(A\text{,}\) and a quick calculation can show that \(-3\) is the other eigenvalue of \(A\text{.}\) Consider the matrix \(B = (A - (-2)I_2)^{-1} = \frac{1}{10}\left[ \begin{array}{rr} -5\amp 6 \\ 5\amp -4 \end{array} \right]\text{.}\)

(a)

Show that \(\frac{1}{8-(-2)}\) and \(\frac{1}{-3-(-2)}\) are the eigenvalues of \(B\text{.}\)

(b)

Recall that \(\vv_1 = [1 \ 1]^{\tr}\) is an eigenvector of \(A\) corresponding to the eigenvalue \(8\) and assume that \(\vv_2 = [-6 \ 5]^{\tr}\) is an eigenvector for \(A\) corresponding to the eigenvalue \(-3\text{.}\) Calculate the products \(B \vv_1\) and \(B \vv_2\text{.}\) How do the products relate to the results of part (a)?

Activity 20.4 provides evidence that we can translate the matrix \(A\) having a dominant eigenvalue to a different matrix \(B\) with the same eigenvectors as \(A\) and with a dominant eigenvalue of our choosing. To see why, let \(A\) be an \(n \times n\) matrix with eigenvalues \(\lambda_1\text{,}\) \(\lambda_2\text{,}\) \(\ldots\text{,}\) \(\lambda_n\text{,}\) and let \(\alpha\) be any real number distinct from the eigenvalues. Let \(B = (A - \alpha I_n)^{-1}\text{.}\) In our example in Activity 20.4 the numbers

\begin{equation*} \frac{1}{\lambda_1 - \alpha}, \ \frac{1}{\lambda_2 - \alpha}, \ \frac{1}{\lambda_3 - \alpha}, \ \ldots, \frac{1}{\lambda_n - \alpha} \end{equation*}

were the eigenvalues of \(B\text{,}\) and that if \(\vv_i\) is an eigenvector for \(A\) corresponding to the eigenvalue \(\lambda_i\text{,}\) then \(\vv_i\) is an eigenvector of \(B\) corresponding to the eigenvalue \(\frac{1}{\lambda_i - \alpha}\text{.}\) To see why, let \(\lambda\) be an eigenvalue of an \(n \times n\) matrix \(A\) with corresponding eigenvector \(\vv\text{.}\) Let \(\alpha\) be a scalar that is not an eigenvalue of \(A\text{,}\) and let \(B = (A - \alpha I_n)^{-1}\text{.}\) Now

\begin{align*} A \vv \amp = \lambda \vv\\ A \vv - \alpha \vv \amp = \lambda \vv - \alpha \vv\\ (A-\alpha I_n) \vv \amp = (\lambda - \alpha) \vv\\ \frac{1}{\lambda - \alpha} \vv \amp = (A-\alpha I_n)^{-1} \vv\text{.} \end{align*}

So \(\frac{1}{\lambda - \alpha}\) is an eigenvalue of \(B\) with eigenvector \(\vv\text{.}\)

Now suppose that \(A\) is an \(n \times n\) matrix with eigenvalues \(\lambda_1\text{,}\) \(\lambda_2\text{,}\) \(\ldots\text{,}\) \(\lambda_n\text{,}\) and that we want to approximate an eigenvector and corresponding eigenvalue \(\lambda_i\) of \(A\text{.}\) If we can somehow find a value of \(\alpha\) so that \(|\lambda_i - \alpha| \lt |\lambda_j - \alpha|\) for all \(j \neq i\text{,}\) then \(\left| \frac{1}{\lambda_i - \alpha} \right| > \left| \frac{1}{\lambda_j - \alpha} \right|\) for any \(j \neq i\text{.}\) Thus, the matrix \(B = (A - \alpha I_n)^{-1}\) has \(\frac{1}{\lambda_i - \alpha}\) as its dominant eigenvalue and we can use the power method to approximate an eigenvector and the Rayleigh quotient to approximate the eigenvalue \(\frac{1}{\lambda_i - \alpha}\text{,}\) and hence approximate \(\lambda_i\text{.}\)

Activity 20.5.

Let \(A = \frac{1}{8}\left[ \begin{array}{crr} 7\amp 3\amp 3 \\ 30\amp 22\amp -10 \\ 15\amp -21\amp 11 \end{array} \right]\text{.}\)

(a)

Apply the power method to the matrix \(B = (A - I_3)^{-1}\) with initial vector \(\vx_0 = [1 \ 0 \ 0]^{\tr}\) to fill in Table 20.2 (to four decimal places). Use this information to estimate an eigenvalue for \(A\) and a corresponding eigenvector.

Table 20.2. Applying the power method to \((A - I_3)^{-1}\)

\(k\)	\(10\)	\(15\)	\(20\)
\(\vx_k\)
\(\frac{\vx_k^{\tr}A\vx_k}{\vx_k^{\tr}\vx_k}\)

(b)

Applying the power method to the matrix \(B = (A - 0I_3)^{-1}\) with initial vector \(\vx_0 = [1 \ 0 \ 0]^{\tr}\) yields the information in Table 20.3 (to four decimal places). Use this information to estimate an eigenvalue for \(A\) and a corresponding eigenvector.

Table 20.3. Applying the power method to \((A - 0I_3)^{-1}\)

\(k\)	\(10\)	\(15\)	\(20\)
\(\vx_k\)	\(\left[ \begin{array}{r} -0.3344 \\ 0.6677 \\ 1.0000 \end{array} \right]\)	\(\left[\begin{array}{r} -0.3333 \\ 0.6666 \\ 1.0000 \end{array} \right]\)	\(\left[ \begin{array}{r} -0.3333 \\ 0.6666 \\ 1.0000 \end{array} \right]\)
\(\frac{\vx_k^{\tr}A\vx_k}{\vx_k^{\tr}\vx_k}\)	\(-1.0014\)	\(-1.0000\)	\(-1.0000\)

(c)

Applying the power method to the matrix \(B = (A - 5I_3)^{-1}\) with initial vector \(\vx_0 = [1 \ 0 \ 0]^{\tr}\) yields the information in Table 20.4 (to four decimal places). Use this information to estimate an eigenvalue for \(A\) and a corresponding eigenvector.

Table 20.4. Applying the power method to \((A - 5I_3)^{-1}\)

\(k\)	\(10\)	\(15\)	\(20\)
\(\vx_k\)	\(\left[ \begin{array}{r} 0.0000 \\ 1.0000 \\ -1.0000 \end{array} \right]\)	\(\left[ \begin{array}{r} 0.0000 \\ 1.0000 \\ -1.0000 \end{array} \right]\)	\(\left[ \begin{array}{r} 0.0000 \\ 1.0000 \\ -1.0000 \end{array} \right]\)
\(\frac{\vx_k^{\tr}A\vx_k}{\vx_k^{\tr}\vx_k}\)	\(-1.0000\)	\(-1.0000\)	\(-1.0000\)

Subsection Examples

What follows are worked examples that use the concepts from this section.

Example 20.5.

Let \(A = \left[ \begin{array}{ccc} 1\amp 2\amp 3\\4\amp 5\amp 6\\7\amp 8\amp 9 \end{array} \right]\text{.}\)

(a)

Approximate the dominant eigenvalue of \(A\) accurate to two decimal places using the power method. Use technology as appropriate.

Solution.

We use technology to calculate the scaled vectors \(A^k \vx_0\) for values of \(k\) until the components don't change in the second decimal place. We start with the seed \(\vx_0 = [1 \ 1 \ 1]^{\tr}\text{.}\) For example, to two decimal places we have \(\vx_k = [0.28 \ 0.64 \ 1.00]^{\tr}\) for \(k \geq 20\text{.}\) So we suspect that \(\left[ 0.28 \ 0.64 \ 1.00 \right]^{\tr}\) is close to a dominant eigenvector for \(A\text{.}\) For the dominant eigenvalue, we can calculate the Rayleigh quotients \(\frac{(A\vx_k) \cdot \vx_k}{\vx_k \cdot \vx_k}\) until they do not change to two decimal places. For \(k \geq 4\text{,}\) our Rayleigh quotients are all (to two decimal places) equal to \(16.12\text{.}\) So we expect that the dominant eigenvalue of \(A\) is close to \(16.12\text{.}\) Notice that

\begin{equation*} A [0.28 \ 0.64 \ 1.00]^{\tr} = [4.56 \ 10.32 \ 16.08]^{\tr}\text{,} \end{equation*}

which is not far off from \(16.12 [0.28 \ 0.64 \ 1.00]^{\tr}\text{.}\)

(b)

Find the characteristic polynomial \(p(\lambda)\) of \(A\text{.}\) Then find the the root of \(p(\lambda)\) farthest from the origin. Compare to the result of part (a). Use technology as appropriate.

Solution.

The characteristic polynomial of \(A\) is

\begin{equation*} p(\lambda) =-\lambda^3 + 15 \lambda^2 + 18 \lambda = -\lambda(\lambda^2-15\lambda-18)\text{.} \end{equation*}

The quadratic formula gives the nonzero roots of \(p(\lambda)\) as

\begin{equation*} \frac{15 \pm \sqrt{15^2 + 4(18)}}{2} = \frac{15 \pm 3\sqrt{33}}{2}\text{.} \end{equation*}

The roots farthest from the origin is approximately \(16.12\text{,}\) as was also calculated in part (a).

Example 20.6.

Let \(A = \left[ \begin{array}{ccc} 2\amp 1\amp 0 \\ 1\amp 3\amp 1 \\ 0\amp 1\amp 2 \end{array} \right]\text{.}\)

(a)

Use the power method to approximate the dominant eigenvalue and a corresponding eigenvector (using scaling) accurate to two decimal places. Use \(\vx_0 = [1 \ 1 \ 1]^{\tr}\) as the seed.

Solution.

We use technology to calculate the scaled vectors \(A^k \vx_0\) for values of \(k\) until the components don't change in the second decimal place. For example, to two decimal places we have \(\vx_k = [0.50 \ 1.00 \ 0.50]^{\tr}\) for \(k \geq 4\text{.}\) So we suspect that \(\left[ \frac{1}{2} \ 1 \ \frac{1}{2} \right]^{\tr}\) is a dominant eigenvector for \(A\text{.}\) For the dominant eigenvalue, we can calculate the Rayleigh quotients \(\frac{(A\vx_k) \cdot \vx_k}{\vx_k \cdot \vx_k}\) until they do not change to two decimal places. For \(k \geq 2\text{,}\) our Rayleigh quotients are all (to two decimal places) equal to \(4\text{.}\) So we expect that the dominant eigenvalue of \(A\) is \(4\text{.}\) We could also use the fact that

\begin{equation*} A \left[ \frac{1}{2} \ 1 \ \frac{1}{2} \right]^{\tr} = [2 \ 4 \ 2]^{\tr} = 4\left[ \frac{1}{2} \ 1 \ \frac{1}{2} \right]^{\tr} \end{equation*}

to see that \(\left[ \frac{1}{2} \ 1 \ \frac{1}{2} \right]^{\tr}\) is a dominant eigenvector for \(A\) with eigenvalue \(4\text{.}\)

(b)

Determine the exact value of the dominant eigenvalue of \(A\) and compare to your result from part (a).

Solution.

Technology shows that the characteristic polynomial of \(A - \lambda I_3\) is

\begin{equation*} p(\lambda) = -\lambda^3 + 7\lambda^2 - 14 \lambda + 8 = -(\lambda-1)(\lambda-2)(\lambda-4)\text{.} \end{equation*}

We can see from the characteristic polynomial that \(4\) is the dominant eigenvalue of \(A\text{.}\)

(c)

Approximate the remaining eigenvalues of \(A\) using the inverse power method.

Hint.

Try \(\alpha = 0.5\) and \(\alpha = 1.8\text{.}\)

Solution.

Applying the power method to \(B = (A-0.5I_3)^{-1}\) with seed \(\vx_0 = [1 \ 1 \ 1]^{\tr}\) gives \(\vx_k \approx [ 0.50 \ 1.00 \ 0.50]^{\tr}\) for \(k \geq 5\text{,}\) with Rayleigh quotients of \(2\) (to several decimal places). So \(2\) is the dominant eigenvalue of \(B\text{.}\) But \(\frac{1}{\lambda-0.5}\) is also the dominant eigenvalue of \(B\text{,}\) where \(\lambda\) is the corresponding eigenvalue of \(A\text{.}\) . So to find \(\lambda\text{,}\) we note that \(\frac{1}{\lambda-0.5} = 2\) implies that \(\lambda = 1\) is an eigenvalue of \(A\text{.}\) Now applying the power method to \(B = (A-1.8I_3)^{-1}\) with seed \(\vx_0 = [1 \ 1 \ 1]^{\tr}\) gives \(\vx_k \approx [ 1.00 \ -1.00 \ 1.00]^{\tr}\) for large enough \(k\text{,}\) with Rayleigh quotients of \(5\) (to several decimal places). To find the corresponding eigenvalue \(\lambda\) for \(A\text{,}\) we note that \(\frac{1}{\lambda-1.8} = 5\text{,}\) or \(\lambda = 2\) is an eigenvalue of \(A\text{.}\) Admittedly, this method is very limited. Finding good choices for \(\alpha\) often depends on having some information about the eigenvalues of \(A\text{.}\) Choosing \(\alpha\) close to an eigenvalue provides the best chance of obtaining that eigenvalue.

Subsection Summary

The power method is an iterative method that can be used to approximate the dominant eigenvalue of an \(n \times n\) matrix \(A\) that has \(n\) linearly independent eigenvectors and a dominant eigenvalue.
To use the power method we start with a seed \(\vx_0\) and then calculate the sequence \(\{\vx_k\}\) of vectors, where \(\vx_k = A \vx_{k-1}\text{.}\) If \(\vx_0\) is chosen well, then the sequence \(\{\vx_k\}\) converges to a dominant eigenvector of \(A\text{.}\)
If \(A\) is an \(n \times n\) matrix with eigenvalues \(\lambda_1\text{,}\) \(\lambda_2\text{,}\) \(\ldots\text{,}\) \(\lambda_n\text{,}\) to approximate an eigenvector of \(A\) corresponding to the eigenvalue \(\lambda_i\text{,}\) we apply the power method to the matrix \(B = (A - \alpha I_n)^{-1}\text{,}\) where \(\alpha\) is not a eigenvalue of \(A\) and \(\left| \frac{1}{\lambda_i - \alpha} \right| > \left| \frac{1}{\lambda_j - \alpha} \right|\) for any \(j \neq i\text{.}\)

Exercises Exercises

1.

Let \(A = \left[ \begin{array}{cc} 1\amp 2\\2\amp 1 \end{array} \right]\text{.}\) Let \(\vx_0 = [1 \ 0]^{\tr}\text{.}\)

(a)

Find the eigenvalues and corresponding eigenvectors for \(A\text{.}\)

(b)

Use appropriate technology to calculate \(\vx_k = A^k \vx_0\) for \(k\) up to 10. Compare to a dominant eigenvector for \(A\text{.}\)

(c)

Use the eigenvectors from part (b) to approximate the dominant eigenvalue for \(A\text{.}\) Compare to the exact value of the dominant eigenvalue of \(A\text{.}\)

(d)

Assume that the other eigenvalue for \(A\) is close to \(0\text{.}\) Apply the inverse power method and compare the results to the remaining eigenvalue and eigenvectors for \(A\text{.}\)

2.

Let \(A = \left[ \begin{array}{rcc} 1\amp 2\amp 0\\-2\amp 1\amp 2 \\ 1\amp 3\amp 1 \end{array} \right]\text{.}\) Use the power method to approximate a dominant eigenvector for \(A\text{.}\) Use \(\vx_0 = [1 \ 1 \ 1]^{\tr}\) as the seed. Then approximate the dominant eigenvalue of \(A\text{.}\)

3.

Let \(A = \left[ \begin{array}{rr} 3\amp -1\\-1\amp 3 \end{array} \right]\text{.}\) Use the power method starting with \(\vx_0 = [1 \ 1]^{\tr}\text{.}\) Explain why the method fails in this case to approximate a dominant eigenvector, and how you could adjust the seed to make the process work.

4.

Let \(A = \left[ \begin{array}{cc} 0\amp 1\\1\amp 0 \end{array} \right]\text{.}\)

(a)

Find the eigenvalues and an eigenvector for each eigenvalue.

(b)

Apply the power method with an initial starting vector \(\vx_0 = [0 \ 1]^{\tr}\text{.}\) What is the resulting sequence?

(c)

Use equation (20.3) to explain the sequence you found in part (b).

5.

Let \(A = \left[ \begin{array}{cc} 2\amp 6 \\ 5\amp 3 \end{array} \right]\text{.}\) Fill in the entries in Table 20.7, where \(\vx_k\) is the \(k\)th approximation to a dominant eigenvector using the power method, starting with the seed \(\vx_0 = [1 \ 0]^{\tr}\text{.}\) Compare the results of this table to the eigenvalues of \(A\) and \(\lim_{k \to \infty} \frac{\vx_{k+1}\cdot \vx_k}{\vx_k \cdot \vx_k}\text{.}\) What do you notice?

Table 20.7. Values of the Rayleigh quotient

\(\vv\)	\(\vx_0\)	\(\vx_1\)	\(\vx_2\)	\(\vx_3\)	\(\vx_4\)	\(\vx_5\)
\(\frac{\vv^{\tr}A\vv}{\vv^{\tr}\vv}\)
\(\vv\)	\(\vx_6\)	\(\vx_7\)	\(\vx_8\)	\(\vx_9\)	\(\vx_{10}\)	\(\vx_{11}\)
\(\frac{\vv^{\tr}A\vv}{\vv^{\tr}\vv}\)

6.

Let \(A = \left[ \begin{array}{cr} 4\amp -5 \\ 2\amp 15 \end{array} \right]\text{.}\) The power method will approximate the dominant eigenvalue \(\lambda = 14\text{.}\) In this exercise we explore what happens if we apply the power method to \(A^{-1}\text{.}\)

(a)

Apply the power method to \(A^{-1}\) to approximate the dominant eigenvalue of \(A^{-1}\text{.}\) Use \([1 \ 1]^{\tr}\) as the seed. How is this eigenvalue related to an eigenvalue of \(A\text{?}\)

(b)

Explain in general why applying the power method to the inverse of an invertible matrix \(B\) might give an approximation to an eigenvalue of \(B\) of smallest magnitude. When might this not work?

7.

There are other algebraic methods that do not rely on the determinant of a matrix that can be used to find eigenvalues of a matrix. We examine one such method in this exercise. Let \(A\) be any \(n \times n\) matrix, and let \(\vv\) be any vector in \(\R^n\text{.}\)

(a)

Explain why the vectors

\begin{equation*} \vv, \ A\vv, \ A^2\vv, \ \ldots, \ A^n\vv \end{equation*}

are linearly dependent.

(b)

Let \(c_0\text{,}\) \(c_1\text{,}\) \(\ldots\text{,}\) \(c_n\) be scalars, not all 0, so that

\begin{equation*} c_0 \vv + c_1 A\vv + c_2 A^2 \vv + \cdots + c_n A^n \vv = \vzero\text{.} \end{equation*}

Explain why there must be a smallest positive integer \(k\) so that there are scalars \(a_0\text{,}\) \(a_1\text{,}\) \(\ldots\text{,}\) \(a_k\) with \(a_k \neq 0\text{.}\) such that

\begin{equation*} a_0 \vv + a_1 A\vv + a_2 A^2 \vv + \cdots + a_k A^k \vv = \vzero\text{.} \end{equation*}

Hint.

Proceed down the list \(c_{n-1}\text{,}\) \(c_{n-2}\text{,}\) etc., until you reach a weight that is non-zero.

(c)

Let

\begin{equation*} q(t) = a_0 + a_1t + a_2t^2 + \cdots + a_kt^k\text{.} \end{equation*}

Then

\begin{equation*} q(A) = a_0 + a_1A + a_2A^2 + \cdots + a_kA^k \end{equation*}

and

\begin{align*} q(A)\vv \amp = (a_0 + a_1A + a_2A^2 + \cdots + a_kA^k)\vv\\ \amp = a_0 \vv + a_1 A\vv + a_2 A^2 \vv + \cdots + a_k A^k \vv\\ \amp = \vzero\text{.} \end{align*}

Suppose the polynomial \(q(t)\) has a linear factor, say \(q(t) = (t-\lambda)Q(t)\) for some degree \(k-1\) polynomial \(Q(t)\text{.}\) Explain why, if \(Q(A) \vv\) is non-zero, \(\lambda\) is an eigenvalue of \(A\) with eigenvector \(Q(A) \vv\text{.}\)

(d)

This method allows us to find certain eigenvalues and eigenvectors, the roots of the polynomial \(q(t)\text{.}\) Any other eigenvector must lie outside the eigenspaces we have already found, so repeating the process with a vector \(\vv\) not in any of the known eigenspaces will produce different eigenvalues and eigenvectors. Let \(A = \left[ \begin{array}{ccr} 2\amp 2\amp -1 \\ 2\amp 2\amp 2 \\ 0\amp 0\amp 6 \end{array} \right]\text{.}\)

(i)

Find the polynomial \(q(t)\text{.}\) Use \(\vv = [1 \ 1 \ 1]^{\tr}\text{.}\)

(ii)

Find all of the roots of \(q(t)\text{.}\)

(iii)

For each root \(\lambda\) of \(q(t)\text{,}\) find the polynomial \(Q(t)\) and use this polynomial to determine an eigenvector of \(A\text{.}\) Verify your work.

8.

We have seen that the Rayleigh quotients approximate the dominant eigenvalue of a matrix \(A\text{.}\) As an alternative to using Rayleigh quotients, we can keep track of the scaling factors. Recall that the scaling in the power method can be used to make the magnitudes of the successive approximations smaller and easier to work with. Let \(A\) be an \(n \times n\) matrix and begin with a non-zero seed \(\vv_0\text{.}\) We now want to keep track of the scaling factors, so let \(\alpha_0\) be the component of \(\vv_0\) with largest absolute value and let \(\vx_0 = \frac{1}{\alpha_0}\vv_0\text{.}\) For \(k \geq 0\text{,}\) let \(\vv_k = A\vx_{k-1}\text{,}\) let \(\alpha_k\) be the component of \(\vv_k\) with largest absolute value and let \(\vx_k = \frac{1}{\alpha_k}\vv_k\text{.}\)

(a)

Let \(A = \left[ \begin{array}{rc} 0\amp 1 \\ -8\amp 6 \end{array} \right]\text{.}\) Use \(\vx_0 = [1 \ 1]^{\tr}\) as the seed and calculate \(\alpha_k\) for \(k\) from \(1\) to \(10\text{.}\) Compare to the dominant eigenvalue of \(A\text{.}\)

(b)

Assume that for large \(k\) the vectors \(\vx_k\) approach a dominant eigenvector with dominant eigenvalue \(\lambda\text{.}\) Show now in general that the sequence of scaling factors \(\alpha_k\) approaches \(\lambda\text{.}\)

9.

Let \(A\) be an \(n \times n\) matrix and let \(\alpha\) be a scalar that is not an eigenvalue of \(A\text{.}\) Suppose that \(\vx\) is an eigenvector of \(B = (A-\alpha I_n)^{-1}\) with eigenvalue \(\beta\text{.}\) Find an eigenvalue of \(A\) in terms of \(\beta\) and \(\alpha\) with corresponding eigenvector \(\vx\text{.}\)

10.

Label each of the following statements as True or False. Provide justification for your response.

(a) True/False.

The largest eigenvalue of a matrix is a dominant eigenvalue.

(b) True/False.

If an \(n \times n\) matrix \(A\) has \(n\) linearly independent eigenvectors and a dominant eigenvalue, then the sequence \(\{A^k \vx_0\}\) converges to a dominant eigenvector of \(A\) for any initial vector \(\vx_0\text{.}\)

(c) True/False.

If \(\lambda\) is an eigenvalue of an \(n \times n\) matrix \(A\) and \(\alpha\) is not an eigenvalue of \(A\text{,}\) then \(\lambda - \alpha\) is an eigenvalue of \(A - \alpha I_n\text{.}\)

(d) True/False.

Every square matrix has a dominant eigenvalue.

Subsection Project: Managing a Sheep Herd

Sheep farming is a significant industry in New Zealand. New Zealand is reported to have the highest density of sheep in the world. Sheep can begin to reproduce after one year, and give birth only once per year. Table 20.8 gives Birth and Survival Rates for Female New Zealand Sheep (from G. Caughley, “Parameters for Seasonally Breeding Populations,” Ecology, 48, (1967), 834-839). Since sheep hardly ever live past 12 years, we will only consider the population through 12 years.

Table 20.8. New Zealand female sheep data by age group

Age (years)	Birth Rate	Survival Rate
0-1	0.000	0.845
1-2	0.045	0.975
2-3	0.391	0.965
3-4	0.472	0.950
4-5	0.484	0.926
5-6	0.546	0.895
6-7	0.543	0.850
7-8	0.502	0.786
8-9	0.468	0.691
9-10	0.459	0.561
10-11	0.433	0.370
11-12	0.421	0.000

As sheep reproduce, they add to the 0-1 sheep (lamb) population. The potential to produce offspring is called fecundity (derived from the word fecund which generally refers to reproductive ability) and determines how many lamb are added to the population. Let \(F_k\) (the fecundity rate) be the rate at which females in age class \(k\) give birth to female offspring. Not all members of a given age group survive to the next age groups, so let \(s_k\) be the fraction of individuals that survives from age group \(k\) to age group \(k+1\text{.}\) With these ideas in mind, we can create a life cycle chart as in Figure 20.9 that illustrates how the population of sheep changes on a farm (for the sake of space, we illustrate with four age classes).

Figure 20.9. Life cycle with four age classes.

To model the sheep population, we need a few variables. Let \(n_1^{(0)}\) be the number of sheep in age group 0-1, \(n_2^{(0)}\) the number in age group 1-2, \(n_3\) the number in age group 2-3 and, in general, \(n_k^{(0)}\) the number of sheep in age group \((k-1)\)-\(k\) at some initial time (time \(0\)), and let

\begin{equation*} \vx_0 = \left[ n_1^{(0)} \ n_2^{(0)} \ n_3^{(0)} \ \cdots \ n_{12}^{(0)} \right]^{\tr}\text{.} \end{equation*}

We wish to determine the populations in the different groups after one year. Let

\begin{equation*} \vx_1 = \left[ n_1^{(1)} \ n_2^{(1)} \ n_3^{(1)} \ \cdots \ n_{12}^{(1)} \right]^{\tr}\text{,} \end{equation*}

where \(n_1^{(1)}\) denotes the number of sheep in age group 0-1, \(n_2^{(1)}\) the number of sheep in age group 1-2 and, in general, \(n_{k}^{(1)}\) the number of tilapia in age group \((k-1)\)-\(k\) after one year.

Project Activity 20.6.

Table 20.8 shows that, on average, each female in age group 1-2 produces \(0.045\) female offspring in a year. Since there are \(n_2\) females in age group 1-2, the lamb population increases by \(0.045n_2\) in a year.

(a)

Continue this analysis to explain why

\begin{align*} n_1^{(1)} \amp = 0.045n_2+0.391 n_3+0.472n_4+0.484 n_5+0.546n_6+0.543n_7\\ \amp \qquad+0.502 n_8+0.468 n_9+0.459n_{10}+0.433 n_{11}+0.421 n_{12}\text{.} \end{align*}

(b)

Explain why \(n_2^{(1)} = 0.845n_1\text{.}\)

(c)

Now explain why

\begin{equation} \vx_1 = L\vx_0\text{,}\tag{20.5} \end{equation}

where \(L\) is the matrix

\begin{equation} {\scriptsize \left[ \begin{array}{cccccccccccc} 0 \amp 0.045 \amp 0.391 \amp 0.472 \amp 0.484 \amp 0.546 \amp 0.543 \amp 0.502 \amp 0.468 \amp 0.459 \amp 0.433 \amp 0.421 \\ 0.845 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \\ 0 \amp 0.975 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \\ 0 \amp 0 \amp 0.965 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \\ 0 \amp 0 \amp 0 \amp 0.950 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \\ 0 \amp 0 \amp 0 \amp 0 \amp 0.926 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \\ 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0.895 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \\ 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0.850 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \\ 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0.786 \amp 0 \amp 0 \amp 0 \amp 0 \\ 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0.691 \amp 0 \amp 0 \amp 0 \\ 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0.561 \amp 0 \amp 0 \\ 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0.370 \amp 0 \end{array} \right]}\text{.}\tag{20.6} \end{equation}

Notice that our matrix \(L\) has the form

\begin{equation*} \left[ \begin{array}{cccccc} F_1\amp F_2\amp F_3\amp \cdots\amp F_{n-1}\amp F_n \\ s_1\amp 0\amp 0\amp \cdots\amp 0\amp 0 \\ 0\amp s_2\amp 0\amp \cdots\amp 0\amp 0 \\ 0\amp 0\amp s_3\amp \cdots\amp 0\amp 0 \\ \amp \amp \amp \ddots\amp \amp \\ 0\amp 0\amp 0\amp \cdots\amp s_{n-1}\amp 0 \end{array} \right]\text{.} \end{equation*}

Such a matrix is called a Leslie matrix.

Leslie matrices have certain useful properties, and one eigenvalue of a Leslie matrix can tell us a lot about the long-term behavior of the situation being modeled. You can take these properties as fact unless otherwise directed.

A Leslie matrix \(L\) has a unique positive eigenvalue \(\lambda_1\) with a corresponding eigenvector \(\vv_1\) whose entries are all positive.
If \(\lambda_i\) (\(i > 1\)) is any other eigenvalue (real or complex) of \(L\text{,}\) then \(|\lambda_i| \leq \lambda_1\text{.}\) If \(\lambda_1\) is the largest magnitude eigenvalue of a matrix \(L\text{,}\) we call \(\lambda_1\) a dominant eigenvalue of \(L\text{.}\)
If any two successive entries in the first row of \(L\) are both positive, then \(|\lambda_i| \lt \lambda_1\) for every \(i > 1\text{.}\) In this case we say that \(\lambda_1\) is a strictly dominant eigenvalue of \(L\text{.}\) In a Leslie model, this happens when the females in two successive age classes are fertile, which is almost always the case.
If \(\lambda_1\) is a strictly dominant eigenvalue, then \(\vx_k\) is approximately a scalar multiple of \(\vv_1\) for large values of \(k\text{,}\) regardless of the initial state \(\vx_0\text{.}\) In other words, large state vectors are close to eigenvectors for \(\lambda_1\text{.}\)

We can use these properties to determine the long-term behavior of the sheep herd.

Project Activity 20.7.

Assume that \(L\) is defined by (20.6), and let

\begin{equation*} \vx_m = \left[ n_1^{(m)} \ n_2^{(m)} \ n_3^{(m)} \ \cdots \ n_{12}^{(m)} \right]^{\tr}\text{,} \end{equation*}

where \(n_1^{(m)}\) denotes the number of sheep in age group 0-1, \(n_2^{(m)}\) the number of sheep in age group 1-2 and, in general, \(n_k^{(m)}\) the number of sheep in age group \((k-1)\)-\(k\) after \(k\) years.

(a)

Assume that \(\vx_0 = \left[100 \ 100 \ 100 \ \cdots \ 100 \right]^{\tr}\text{.}\) Use appropriate technology to calculate \(\vx_{22}\text{,}\) \(\vx_{23}\text{,}\) \(\vx_{24}\text{,}\) and \(\vx_{25}\text{.}\) Round to the nearest whole number. What do you notice about the sheep population? You may use the GeoGebra applet at geogebra.org/m/yqss88xq.

We can use the third and fourth properties of Leslie matrices to better understand the long-term behavior of the sheep population. Since successive entries in the first row of the Leslie matrix in (20.6) are positive, our Leslie matrix has a strictly dominant eigenvalue \(\lambda_1\text{.}\) Given the dimensions of our Leslie matrix, finding this dominant eigenvalue through algebraic means is not feasible. Use the power method to approximate the dominant eigenvalue \(\lambda_1\) of the Leslie matrix in (20.6) to five decimal places. Explain your process. Then explain how this dominant eigenvalue tells us that, unchecked, the sheep population grows at a rate that is roughly exponential. What is the growth rate of this exponential growth? You may use the GeoGebra applet at geogebra.org/m/yqss88xq.

Project Activity 20.7 indicates that, unchecked, the sheep population will grow without bound, roughly exponentially with ratio equal to the dominant eigenvalue of our Leslie matrix \(L\text{.}\) Of course, a sheep farmer cannot provide the physical environment or the resources to support an unlimited population of sheep. In addition, most sheep farmers cannot support themselves only by shearing sheep for the wool. Consequently, some harvesting of the sheep population each year for meat and skin is necessary. A sustainable harvesting policy allows for the regular harvesting of some sheep while maintaining the population at a stable level. It is necessary for the farmer to find an optimal harvesting rate to attain this stable population and the following activity leads us through an analysis of how such a harvesting rate can be determined.

Project Activity 20.8.

The Leslie model can be modified to consider harvesting. It is possible to harvest different age groups at different rates, and to harvest only some age groups and not others. In the case of sheep, it might make sense to only harvest from the youngest population since lamb is more desirable than mutton and the lamb population grows the fastest. Assume that this is our harvesting strategy and that we harvest our sheep from only the youngest age group at the start of each year. Let \(h\) be the fraction of sheep we harvest from the youngest age group each year after considering growth.

(a)

If we begin with an initial population \(\vx_0\text{,}\) then the state vector after births and expected deaths is \(L\vx_0\text{.}\) Now we harvest. Explain why if we harvest a fraction \(h\) from the youngest age group after considering growth, then the state vector after 1 year will be

\begin{equation*} \vx_1 = L\vx_0 - HL\vx_0\text{,} \end{equation*}

where

\begin{equation*} H = \left[ \begin{array}{c c c c c c c c c c c c} h \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \\ 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \\ 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \\ 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \\ 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \\ 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \\ 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \\ 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \\ 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \\ 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \\ 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \\ 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \amp 0 \end{array} \right]\text{.} \end{equation*}

(b)

Our goal is to find a harvesting rate that will lead to a steady state in which the sheep population remains the same each year. In other words, we want to find a value of \(h\text{,}\) if one exists, that satisfies

\begin{equation} \vx = L\vx - HL\vx\text{.}\tag{20.7} \end{equation}

Show that (20.7) is equivalent to the matrix equation

\begin{equation} \vx = (I_{12}-H)L\vx\text{.}\tag{20.8} \end{equation}

(c)

Use appropriate technology to experiment numerically with different values of \(h\) to find the value you think gives the best uniform harvest rate. Explain your reasoning. You may use the GeoGebra applet at geogebra.org/m/yqss88xq.

(d)

Now we will use some algebra to find an equation that explicitly gives us the harvest rate in the general setting. This will take a bit of work, but none of it is too difficult. To simplify our work but yet illustrate the overall idea, let us consider the general \(4 \times 4\) case with arbitrary Leslie matrix

\begin{equation*} L = \left[ \begin{array}{cccc} F_1 \amp F_2 \amp F_3 \amp F_4 \\ s_1 \amp 0 \amp 0 \amp 0 \\ 0 \amp s_2 \amp 0 \amp 0 \\ 0 \amp 0 \amp s_3 \amp 0 \end{array} \right]\text{.} \end{equation*}

Recall that we want to find a value of \(h\) that satisfies (20.8) with \(H = \left[ \begin{array}{cccc} h \amp 0 \amp 0 \amp 0 \\ 0 \amp 0 \amp 0 \amp 0 \\ 0 \amp 0 \amp 0 \amp 0 \\ 0 \amp 0 \amp 0 \amp 0 \end{array} \right]\text{.}\) Let \(\vx = [x_1 \ x_2 \ x_3 \ x_4]^{\tr}\text{.}\)

(i)

Calculate the matrix product \((I_{4}-H)L\text{.}\) Explain why this product is again a Leslie matrix and why \((I_{4}-H)L\) will have a dominant eigenvalue of 1.

(ii)

Now calculate \((I_{4}-H)L\vx\) and set it equal to \(\vx\text{.}\) Write down the resulting system of 4 equations that must be satisfied. Be sure that your first equation is

\begin{equation} x_1 = (1-h)F_1x_1 + (1-h)F_2x_2 + (1-h)F_3x_3 + (1-h)F_4x_4\text{.}\tag{20.9} \end{equation}

(iii)

Equation (20.9) as written depends on the entries of the vector \(\vx\text{,}\) but we should be able to arrive at a result that is independent of \(\vx\text{.}\) To see how we do this, we assume the population of the youngest group is never 0, so we can divide both sides of (20.9) by \(x_1\) to obtain

\begin{equation} 1 = (1-h)F_1 + (1-h)F_2 \frac{x_2}{x_1} + (1-h)F_3 \frac{x_3}{x_1} + (1-h)F_4 \frac{x_4}{x_1}\text{.}\tag{20.10} \end{equation}

Now we need to write the fractions \(\frac{x_2}{x_1}\text{,}\) \(\frac{x_3}{x_1}\text{,}\) and \(\frac{x_4}{x_1}\) so that they do not involve the \(x_i\text{.}\) Use the remaining equations in your system to show that

\begin{align*} \frac{x_2}{x_1} \amp = s_1\\ \frac{x_3}{x_1} \amp = s_1s_2\\ \frac{x_4}{x_1} \amp = s_1s_2s_3\text{.} \end{align*}

(iv)

Now conclude that the harvesting value \(h\) must satisfy the equation

\begin{equation} 1 = (1-h) [F_1 + F_2 s_1 + F_3 s_1s_2 + F_4 s_1s_2s_3]\text{.}\tag{20.11} \end{equation}

The value \(R = F_1 + F_2 s_1 + F_3 s_1s_2 + F_4 s_1s_2s_3\) is called the net reproduction rate of the population and turns out to be the average number of daughters born to a female in her expected lifetime.

(e)

Extend (20.11) to the 12 age group case of the sheep herd. Calculate the value of \(R\) for this sheep herd and then find the value of \(h\text{.}\) Compare this \(h\) to the value you obtained through experimentation earlier. Find the fraction of the lambs that should be harvested each year and explain what the stable population state vector \(\vx\) tells us about the sheep population for this harvesting policy.

There are several other ways to scale, but we won't consider them here.

A matrix in which most entries are zero is called a sparse matrix.