By the end of this section, you should be able to give precise and thorough answers to the questions listed below. You may want to keep these questions in mind to focus your thoughts as you complete the section.
What is the power method for?
How does the power method work?
How can we use the inverse power method to approximate any eigenvalue/eigenvector pair?
The Leslie Matrix (also called the Leslie Model) is a powerful model for describing the age-distributed growth of a population that is closed to migration. In a Leslie model, it is usually the case that only one gender (most often female) is considered. As an example, we will later consider a population of sheep that is being grown commercially. A natural question that we will address is how we can harvest the population while maintaining a sustainable environment.
When working with populations, the matrices we use are often large. For large matrices, using the characteristic polynomial to calculate eigenvalues is too time- and resource-consuming to be practical, and we generally cannot find the exact values of the eigenvalues. As a result, approximation techniques are very important. In this section we will explore a method for approximating eigenvalues. The eigenvalues of a Leslie matrix are important because they describe the limiting or steady-state behavior of a population. The matrix and model were introduced by Patrick H. Leslie in "On the Use of Matrices in Certain Population Mathematics", Biometrika, Volume XXXIII, November 1945, pp. 183-212.
We have used the characteristic polynomial to find the eigenvalues of a matrix, and for each eigenvalue we row reduced a corresponding matrix to find the eigenvectors. This method is only practical for small matrices; for more realistic applications, approximation techniques are used. We investigate one such technique in this section, the power method.
If we have no prior knowledge of the eigenvalues and eigenvectors of this matrix, we might just begin with a guess. Let $\mathbf{x}_0$ be such a guess for an eigenvector. Calculate $\mathbf{x}_1 = A\mathbf{x}_0$. Is $\mathbf{x}_1$ an eigenvector of $A$? Explain.
If $\mathbf{x}_1$ is not a good approximation to an eigenvector of $A$, then we need to make a better guess. We have little to work with other than just random guessing, but we can use $\mathbf{x}_1$ as another guess. We calculated $\mathbf{x}_1 = A\mathbf{x}_0$ in part 1. Now calculate $\mathbf{x}_2 = A\mathbf{x}_1$. Is $\mathbf{x}_2$ an eigenvector for $A$? Explain.
In parts (a) and (b) you might have noticed that in some sense $\mathbf{x}_2$ is closer to being an eigenvector of $A$ than $\mathbf{x}_1$ was. So maybe continuing this process will get us closer to an eigenvector of $A$. In other words, for each positive integer $k$ we define $\mathbf{x}_k$ as $A\mathbf{x}_{k-1}$. Before we proceed, however, we should note that as we calculate the vectors $\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3, \ldots$, the entries in the vectors get large very quickly. So it will be useful to scale the entries so that they stay at a reasonable size, which makes it easier to interpret the output. One way to do this is to divide each vector by its largest component in absolute value so that all of the entries stay between $-1$ and $1$. [37] So in our example we scale each of the vectors $\mathbf{x}_1$, $\mathbf{x}_2$, and $\mathbf{x}_3$ in this way. Explain why scaling our vectors will not affect our search for an eigenvector.
Use an appropriate technological tool to continue this process and find the scaled vectors $\mathbf{x}_k$ for larger values of $k$. What do you think the limiting vector is? Is this limiting vector an eigenvector of $A$? If so, what is the corresponding eigenvalue?
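If you want to experiment with this process beyond a few hand calculations, the following Python/NumPy sketch carries out the iterate-and-scale procedure described above. The matrix A and seed x below are hypothetical placeholders, not the matrix from the activity; substitute your own values.

import numpy as np

A = np.array([[2.0, 6.0],
              [5.0, 3.0]])               # hypothetical matrix; replace with the activity's matrix
x = np.array([1.0, 0.0])                 # initial guess x_0 (any nonzero vector)

for k in range(1, 11):
    x = A @ x                            # x_k = A x_{k-1}
    x = x / x[np.argmax(np.abs(x))]      # divide by the entry largest in absolute value
    print(k, x)                          # all entries now lie between -1 and 1

Printing the scaled vectors makes it easy to watch the sequence settle down to a limiting vector.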
While the examples we present in this text are small in order to highlight the concepts, matrices that appear in real life applications are often enormous. For example, in Google's PageRank algorithm that is used to determine relative rankings of the importance of web pages, matrices of staggering size are used (most entries in the matrices are zero, but the size of the matrices is still huge). Finding eigenvalues of such large matrices through the characteristic polynomial is impractical. In fact, finding the roots of all but the smallest degree characteristic polynomials is a very difficult problem. As a result, using the characteristic polynomial to find eigenvalues and then finding eigenvectors is not very practical in general, and it is often a better option to use a numeric approximation method. We will consider one such method in this section, the power method.
In Preview Activity 20.1, we saw an example of a matrix $A$ for which the sequence $\{\mathbf{x}_k\}$, where $\mathbf{x}_k = A\mathbf{x}_{k-1}$, converged to a dominant eigenvector of $A$ for an initial guess vector $\mathbf{x}_0$. The first several vectors $\mathbf{x}_k$ (with scaling) approach this limiting vector quite quickly.
This method of successive approximations is called the power method (since we could write $\mathbf{x}_k$ as $A^k\mathbf{x}_0$). Our task now is to show that this method works in general. In the next activity we restrict our argument to the $2 \times 2$ case, and then discuss the general case afterwards.
Let $A$ be an arbitrary $2 \times 2$ matrix with two linearly independent eigenvectors $\mathbf{v}_1$ and $\mathbf{v}_2$ and corresponding eigenvalues $\lambda_1$ and $\lambda_2$, respectively. We will also assume $|\lambda_1| > |\lambda_2|$. An eigenvalue whose absolute value is larger than that of any other eigenvalue is called a dominant eigenvalue. Any eigenvector for a dominant eigenvalue is called a dominant eigenvector. Before we show that our method can be used to approximate a dominant eigenvector, we recall that since $\mathbf{v}_1$ and $\mathbf{v}_2$ are eigenvectors corresponding to distinct eigenvalues, $\mathbf{v}_1$ and $\mathbf{v}_2$ are linearly independent. So there exist scalars $a_1$ and $a_2$ such that
$$\mathbf{x}_0 = a_1\mathbf{v}_1 + a_2\mathbf{v}_2.$$
Assume as above that $A$ is an arbitrary $2 \times 2$ matrix with two linearly independent eigenvectors $\mathbf{v}_1$ and $\mathbf{v}_2$ and corresponding eigenvalues $\lambda_1$ and $\lambda_2$, respectively. (We are assuming that we don't know these eigenvectors, but we can assume that they exist.) Assume that $\lambda_1$ is the dominant eigenvalue for $A$, that $\mathbf{x}_0 = a_1\mathbf{v}_1 + a_2\mathbf{v}_2$ is some initial guess to an eigenvector for $A$, and that $\mathbf{x}_k = A\mathbf{x}_{k-1}$ for $k \geq 1$.
Explain why the previous result tells us that the vectors $\mathbf{x}_k$ are approaching a vector in the direction of $\mathbf{v}_1$ or $-\mathbf{v}_1$ as $k \to \infty$, assuming $a_1 \neq 0$. (Why do we need $a_1 \neq 0$? What happens if $a_1 = 0$?)
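For reference, here is the standard computation behind this activity, in the notation above (assuming $\mathbf{x}_0 = a_1\mathbf{v}_1 + a_2\mathbf{v}_2$ with $|\lambda_1| > |\lambda_2|$):
$$\mathbf{x}_k = A^k\mathbf{x}_0 = a_1\lambda_1^k\mathbf{v}_1 + a_2\lambda_2^k\mathbf{v}_2 = \lambda_1^k\left(a_1\mathbf{v}_1 + a_2\left(\frac{\lambda_2}{\lambda_1}\right)^k\mathbf{v}_2\right).$$
Since $\left|\frac{\lambda_2}{\lambda_1}\right| < 1$, the factor $\left(\frac{\lambda_2}{\lambda_1}\right)^k$ converges to $0$, and so $\frac{1}{\lambda_1^k}\mathbf{x}_k \to a_1\mathbf{v}_1$ as $k \to \infty$.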
The power method is straightforward to implement, but it is not without its drawbacks. We began by assuming that we had a basis of eigenvectors of a matrix $A$, so we are also assuming that $A$ is diagonalizable. We also assumed that $A$ has a dominant eigenvalue $\lambda_1$. That is, if $A$ is $n \times n$, we assume that $A$ has eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$, not necessarily distinct, with
$$|\lambda_1| > |\lambda_2| \geq |\lambda_3| \geq \cdots \geq |\lambda_n|.$$
Notice that we are not actually calculating the vectors $\mathbf{x}_k$ here; this is a theoretical argument, and we don't know $\lambda_1$ and are not performing any scaling like we did in Preview Activity 20.1. We are assuming that $\lambda_1$ is the dominant eigenvalue of $A$, though, so for each $i > 1$ the terms $\left(\frac{\lambda_i}{\lambda_1}\right)^k$ converge to $0$ as $k$ goes to infinity. Thus,
$$A^k\mathbf{x}_0 = \lambda_1^k\left(a_1\mathbf{v}_1 + a_2\left(\frac{\lambda_2}{\lambda_1}\right)^k\mathbf{v}_2 + \cdots + a_n\left(\frac{\lambda_n}{\lambda_1}\right)^k\mathbf{v}_n\right) \approx \lambda_1^k a_1\mathbf{v}_1$$
for large values of $k$, which makes the sequence $\left\{A^k\mathbf{x}_0\right\}$ converge to a vector in the direction of a dominant eigenvector provided $a_1 \neq 0$. So we need to be careful enough to choose a seed $\mathbf{x}_0$ that has a nonzero component in the direction of the dominant eigenvector $\mathbf{v}_1$. Of course, we generally don't know that our matrix is diagonalizable before we make these calculations, but for many matrices the sequence $\left\{A^k\mathbf{x}_0\right\}$ will approach a dominant eigenvector.
The power method approximates a dominant eigenvector, and there are ways that we can approximate the dominant eigenvalue. Exercise 8 presents one way, by keeping track of the components of the $\mathbf{x}_k$ that have the largest absolute values, and the next activity shows another.
To summarize, the procedure for applying the power method for approximating a dominant eigenvector and dominant eigenvalue of a matrix $A$ is as follows (a short code sketch of these steps appears after the list).
Step 1
Select an arbitrary nonzero vector $\mathbf{x}_0$ as an initial guess to a dominant eigenvector.
Step 2
Let $\mathbf{x}_1 = A\mathbf{x}_0$ and let $k = 1$.
Step 3
To avoid having the magnitudes of successive approximations become excessively large, scale this approximation $\mathbf{x}_k$. That is, find the entry $\alpha_k$ of $\mathbf{x}_k$ that is largest in absolute value. Then replace $\mathbf{x}_k$ by $\frac{1}{\alpha_k}\mathbf{x}_k$.
Step 4
Calculate the Rayleigh quotient $r_k = \frac{\left(A\mathbf{x}_k\right) \cdot \mathbf{x}_k}{\mathbf{x}_k \cdot \mathbf{x}_k}$.
Step 5
Let $\mathbf{x}_{k+1} = A\mathbf{x}_k$. Increase $k$ by $1$ and repeat Steps 3 through 5.
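The steps above translate directly into code. Here is a minimal Python/NumPy sketch of Steps 1 through 5; the matrix and seed are hypothetical placeholders, and a fixed number of iterations stands in for a proper convergence test.

import numpy as np

def power_method(A, x0, num_iters=50):
    x = np.asarray(x0, dtype=float)          # Step 1: arbitrary nonzero initial guess
    for _ in range(num_iters):
        x = A @ x                            # Steps 2 and 5: multiply by A
        x = x / x[np.argmax(np.abs(x))]      # Step 3: scale by the entry largest in absolute value
        r = (A @ x) @ x / (x @ x)            # Step 4: Rayleigh quotient
    return x, r

A = np.array([[2.0, 6.0],
              [5.0, 3.0]])                   # hypothetical matrix; its dominant eigenvalue is 8
v, lam = power_method(A, [1.0, 0.0])
print(v, lam)

In practice one stops iterating when successive Rayleigh quotients (or scaled vectors) agree to the desired number of decimal places rather than after a fixed number of steps.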
The power method can be useful for approximating a dominant eigenvector as long as the successive multiplications by $A$ are fairly simple, for example, if many entries of $A$ are zero. [38] The rate of convergence of the sequence $\{\mathbf{x}_k\}$ depends on the ratio $\frac{|\lambda_2|}{|\lambda_1|}$. If this ratio is close to $1$, then it can take many iterations before the power $\left(\frac{\lambda_2}{\lambda_1}\right)^k$ makes the corresponding term negligible. There are other methods for approximating eigenvalues and eigenvectors, e.g., using the QR factorization, that we will not discuss at this point.
The power method only allows us to approximate the dominant eigenvalue and a dominant eigenvector for a matrix $A$. It is possible to modify this method to approximate other eigenvectors and eigenvalues under certain conditions. We consider an example in the next activity to motivate the general situation.
Let $A$ be the matrix from Preview Activity 20.1. Recall that we found one eigenvalue $\lambda_1$ of $A$ in that activity, and a quick calculation can show the other eigenvalue $\lambda_2$ of $A$. Consider the matrix $B = A - \alpha I_2$ for a suitably chosen scalar $\alpha$.
Recall that $\mathbf{v}_1$ is an eigenvector of $A$ corresponding to the eigenvalue $\lambda_1$, and assume that $\mathbf{v}_2$ is an eigenvector for $A$ corresponding to the eigenvalue $\lambda_2$. Calculate the products $B\mathbf{v}_1$ and $B\mathbf{v}_2$. How do these products relate to the results of part (a)?
Activity 20.4 provides evidence that we can translate a matrix $A$ having a dominant eigenvalue to a different matrix $B$ with the same eigenvectors as $A$ and with a dominant eigenvalue of our choosing. To see why, let $A$ be an $n \times n$ matrix with eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$, and let $\alpha$ be any real number distinct from the eigenvalues. Let $B = A - \alpha I_n$. In our example in Activity 20.4 the numbers $\lambda_1 - \alpha$ and $\lambda_2 - \alpha$ were the eigenvalues of $B$, and if $\mathbf{v}$ is an eigenvector for $A$ corresponding to the eigenvalue $\lambda$, then $\mathbf{v}$ is an eigenvector of $B$ corresponding to the eigenvalue $\lambda - \alpha$. To verify this, let $\lambda$ be an eigenvalue of an $n \times n$ matrix $A$ with corresponding eigenvector $\mathbf{v}$. Let $\alpha$ be a scalar that is not an eigenvalue of $A$, and let $B = A - \alpha I_n$. Now
$$B\mathbf{v} = (A - \alpha I_n)\mathbf{v} = A\mathbf{v} - \alpha\mathbf{v} = \lambda\mathbf{v} - \alpha\mathbf{v} = (\lambda - \alpha)\mathbf{v}.$$
Now suppose that $A$ is an $n \times n$ matrix with eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$, and that we want to approximate an eigenvector and the corresponding eigenvalue $\lambda_i$ of $A$. If we can somehow find a value of $\alpha$ so that $|\lambda_i - \alpha| > |\lambda_j - \alpha|$ for all $j \neq i$, then the matrix $B = A - \alpha I_n$ has $\lambda_i - \alpha$ as its dominant eigenvalue. We can then use the power method to approximate an eigenvector of $B$ (which is also an eigenvector of $A$) and the Rayleigh quotient to approximate the eigenvalue $\lambda_i - \alpha$, and hence approximate $\lambda_i$.
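In code, this shifted version is a small wrapper around the power_method sketch above; the shift alpha is assumed to be chosen so that the inequality above holds.

import numpy as np

def shifted_power_method(A, alpha, x0, num_iters=50):
    B = A - alpha * np.eye(A.shape[0])       # B has eigenvalues lambda_j - alpha, same eigenvectors as A
    v, mu = power_method(B, x0, num_iters)   # mu approximates lambda_i - alpha
    return v, mu + alpha                     # undo the shift to recover lambda_i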
Apply the power method to the matrix with initial vector to fill in Table 20.2 (to four decimal places). Use this information to estimate an eigenvalue for and a corresponding eigenvector.
Applying the power method to the matrix with initial vector yields the information in Table 20.3 (to four decimal places). Use this information to estimate an eigenvalue for and a corresponding eigenvector.
Applying the power method to the matrix with initial vector yields the information in Table 20.4 (to four decimal places). Use this information to estimate an eigenvalue for and a corresponding eigenvector.
Approximate the dominant eigenvalue of $A$ accurate to two decimal places using the power method. Use technology as appropriate.
Solution.
We use technology to calculate the scaled vectors $\mathbf{x}_k$, starting with the given seed $\mathbf{x}_0$, for values of $k$ until the components don't change in the second decimal place. The scaled vectors stabilize to two decimal places after several iterations, so we suspect their common value is close to a dominant eigenvector for $A$. For the dominant eigenvalue, we can calculate the Rayleigh quotients until they do not change to two decimal places; once they agree, that common value is our approximation to the dominant eigenvalue of $A$. As a check, notice that multiplying our approximate eigenvector by $A$ returns approximately the approximate eigenvalue times that vector.
Find the characteristic polynomial $p(\lambda)$ of $A$. Then find the root of $p(\lambda)$ farthest from the origin. Compare to the result of part (a). Use technology as appropriate.
Solution.
The characteristic polynomial of $A$ can be computed directly, and the quadratic formula then gives its nonzero roots. The root farthest from the origin is approximately the same value as the dominant eigenvalue calculated in part (a).
Use the power method to approximate the dominant eigenvalue of $A$ and a corresponding eigenvector (using scaling) accurate to two decimal places. Use the given vector $\mathbf{x}_0$ as the seed.
Solution.
We use technology to calculate the scaled vectors $\mathbf{x}_k$ for values of $k$ until the components don't change in the second decimal place. The scaled vectors stabilize to two decimal places, so we suspect the limiting vector is a dominant eigenvector for $A$. For the dominant eigenvalue, we can calculate the Rayleigh quotients until they do not change to two decimal places; their common value is our estimate of the dominant eigenvalue of $A$. We could also multiply the limiting vector by $A$ directly and observe that the product is the eigenvalue times the vector, which confirms that we have found a dominant eigenvector and its eigenvalue.
Approximate the remaining eigenvalues of $A$ using the inverse power method.
Hint.
Try values of $\alpha$ close to where you expect the remaining eigenvalues to be.
Solution.
Applying the power method to $B = (A - \alpha I)^{-1}$ for a suitable choice of $\alpha$ with seed $\mathbf{x}_0$ gives scaled vectors that stabilize after enough iterations, with Rayleigh quotients that agree to several decimal places. Their common value $\mu$ is the dominant eigenvalue of $(A - \alpha I)^{-1}$. But $\mu = \frac{1}{\lambda - \alpha}$, where $\lambda$ is the corresponding eigenvalue of $A$. So to find $\lambda$, we note that $\mu = \frac{1}{\lambda - \alpha}$ implies that $\lambda = \alpha + \frac{1}{\mu}$ is an eigenvalue of $A$. Applying the power method to $(A - \alpha I)^{-1}$ for a second suitable choice of $\alpha$ produces another eigenvalue in the same way. Admittedly, this method is very limited. Finding good choices for $\alpha$ often depends on having some information about the eigenvalues of $A$. Choosing $\alpha$ close to an eigenvalue provides the best chance of obtaining that eigenvalue.
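The inverse power method used in this solution can be sketched the same way, reusing the earlier power_method sketch. For simplicity the inverse is formed explicitly here; in practice one solves a linear system at each step instead of inverting the matrix.

import numpy as np

def inverse_power_method(A, alpha, x0, num_iters=50):
    B = np.linalg.inv(A - alpha * np.eye(A.shape[0]))   # alpha must not be an eigenvalue of A
    v, mu = power_method(B, x0, num_iters)              # mu approximates 1/(lambda - alpha)
    return v, alpha + 1.0 / mu                          # eigenvalue of A closest to alpha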
The power method is an iterative method that can be used to approximate the dominant eigenvalue of an $n \times n$ matrix $A$ that has $n$ linearly independent eigenvectors and a dominant eigenvalue.
To use the power method we start with a seed $\mathbf{x}_0$ and then calculate the sequence $\{\mathbf{x}_k\}$ of vectors, where $\mathbf{x}_k = A\mathbf{x}_{k-1}$. If $\mathbf{x}_0$ is chosen well, then the sequence $\{\mathbf{x}_k\}$ converges to a dominant eigenvector of $A$.
If $A$ is an $n \times n$ matrix with eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$, then to approximate an eigenvector of $A$ corresponding to the eigenvalue $\lambda_i$, we apply the power method to the matrix $B = A - \alpha I_n$, where $\alpha$ is not an eigenvalue of $A$ and $|\lambda_i - \alpha| > |\lambda_j - \alpha|$ for any $j \neq i$.
Assume that the other eigenvalue for $A$ is close to the value indicated. Apply the inverse power method and compare the results to the remaining eigenvalue and eigenvectors for $A$.
Consider the given matrix $A$ and use the power method starting with the given seed $\mathbf{x}_0$. Explain why the method fails in this case to approximate a dominant eigenvector, and how you could adjust the seed to make the process work.
Consider the given matrix $A$. Fill in the entries in Table 20.7, where $\mathbf{x}_k$ is the $k$th approximation to a dominant eigenvector using the power method, starting with the given seed $\mathbf{x}_0$. Compare the results of this table to the eigenvalues of $A$ and the corresponding eigenvectors. What do you notice?
Explain in general why applying the power method to the inverse of an invertible matrix $A$ might give an approximation to an eigenvalue of $A$ of smallest magnitude. When might this not work?
There are other algebraic methods, ones that do not rely on the determinant of a matrix, that can be used to find eigenvalues of a matrix. We examine one such method in this exercise. Let $A$ be any $n \times n$ matrix, and let $\mathbf{v}$ be any vector in $\mathbb{R}^n$.
This method allows us to find certain eigenvalues and eigenvectors: those corresponding to the roots of the polynomial from the earlier parts of this exercise. Any other eigenvector must lie outside the eigenspaces we have already found, so repeating the process with a vector not in any of the known eigenspaces will produce different eigenvalues and eigenvectors.
We have seen that the Rayleigh quotients approximate the dominant eigenvalue of a matrix $A$. As an alternative to using Rayleigh quotients, we can keep track of the scaling factors. Recall that the scaling in the power method can be used to make the magnitudes of the successive approximations smaller and easier to work with. Let $A$ be an $n \times n$ matrix and begin with a non-zero seed $\mathbf{x}_0$. We now want to keep track of the scaling factors, so let $\beta_1$ be the component of $A\mathbf{x}_0$ with largest absolute value and let $\mathbf{x}_1 = \frac{1}{\beta_1}A\mathbf{x}_0$. For $k \geq 2$, let $\mathbf{y}_k = A\mathbf{x}_{k-1}$, let $\beta_k$ be the component of $\mathbf{y}_k$ with largest absolute value, and let $\mathbf{x}_k = \frac{1}{\beta_k}\mathbf{y}_k$.
Assume that for large values of $k$ the vectors $\mathbf{x}_k$ approach a dominant eigenvector with dominant eigenvalue $\lambda_1$. Show now in general that the sequence of scaling factors $\{\beta_k\}$ approaches $\lambda_1$.
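Here is a quick numerical illustration (not a proof) of this claim, using a hypothetical matrix whose dominant eigenvalue is 8: the printed scaling factors approach the dominant eigenvalue.

import numpy as np

A = np.array([[2.0, 6.0],
              [5.0, 3.0]])              # hypothetical matrix with dominant eigenvalue 8
x = np.array([1.0, 0.0])                # nonzero seed x_0

for k in range(1, 16):
    y = A @ x                           # y_k = A x_{k-1}
    beta = y[np.argmax(np.abs(y))]      # beta_k: component of largest absolute value
    x = y / beta                        # x_k = (1/beta_k) y_k
    print(k, beta)                      # beta_k approaches lambda_1 = 8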
Let $A$ be an $n \times n$ matrix and let $\alpha$ be a scalar that is not an eigenvalue of $A$. Suppose that $\mathbf{x}$ is an eigenvector of $A$ with eigenvalue $\lambda$. Find an eigenvalue of $B = (A - \alpha I_n)^{-1}$ in terms of $\lambda$ and $\alpha$ with corresponding eigenvector $\mathbf{x}$.
If an $n \times n$ matrix $A$ has $n$ linearly independent eigenvectors and a dominant eigenvalue, then the sequence $\{\mathbf{x}_k\}$ defined by $\mathbf{x}_k = A\mathbf{x}_{k-1}$ converges to a dominant eigenvector of $A$ for any initial vector $\mathbf{x}_0$.
Sheep farming is a significant industry in New Zealand. New Zealand is reported to have the highest density of sheep in the world. Sheep can begin to reproduce after one year, and give birth only once per year. Table 20.8 gives Birth and Survival Rates for Female New Zealand Sheep (from G. Caughley, "Parameters for Seasonally Breeding Populations," Ecology, 48 (1967), 834-839). Since sheep hardly ever live past 12 years, we will only consider the population through 12 years.
As sheep reproduce, they add to the 0-1 sheep (lamb) population. The potential to produce offspring is called fecundity (derived from the word fecund, which generally refers to reproductive ability) and determines how many lambs are added to the population. Let $F_i$ (the fecundity rate) be the rate at which females in age class $i$ give birth to female offspring. Not all members of a given age group survive to the next age group, so let $s_i$ be the fraction of individuals that survives from age group $i$ to age group $i+1$. With these ideas in mind, we can create a life cycle chart as in Figure 20.9 that illustrates how the population of sheep changes on a farm (for the sake of space, we illustrate with four age classes).
To model the sheep population, we need a few variables. Let $n_1$ be the number of sheep in age group 0-1, $n_2$ the number in age group 1-2, $n_3$ the number in age group 2-3 and, in general, $n_i$ the number of sheep in age group $(i-1)$-$i$ at some initial time (time $0$), and let
$$\mathbf{x}_0 = [n_1 \ n_2 \ n_3 \ \cdots \ n_{12}]^{\mathsf{T}},$$
and let
$$\mathbf{x}_1 = [m_1 \ m_2 \ m_3 \ \cdots \ m_{12}]^{\mathsf{T}},$$
where $m_1$ denotes the number of sheep in age group 0-1, $m_2$ the number of sheep in age group 1-2 and, in general, $m_i$ the number of sheep in age group $(i-1)$-$i$ after one year.
Table 20.8 shows that, on average, each female in age group 1-2 produces $F_2$ female offspring in a year. Since there are $n_2$ females in age group 1-2, the lamb population increases by $F_2 n_2$ in a year.
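Carrying this reasoning through every age group leads to the standard Leslie form. With four age classes, as in Figure 20.9, and the $n_i$, $m_i$, $F_i$, and $s_i$ defined above, the model reads
$$m_1 = F_1n_1 + F_2n_2 + F_3n_3 + F_4n_4, \qquad m_2 = s_1n_1, \qquad m_3 = s_2n_2, \qquad m_4 = s_3n_3,$$
or in matrix form $\mathbf{x}_1 = L\mathbf{x}_0$, where
$$L = \begin{bmatrix} F_1 & F_2 & F_3 & F_4 \\ s_1 & 0 & 0 & 0 \\ 0 & s_2 & 0 & 0 \\ 0 & 0 & s_3 & 0 \end{bmatrix}.$$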
Leslie matrices have certain useful properties, and one eigenvalue of a Leslie matrix can tell us a lot about the long-term behavior of the situation being modeled. You can take these properties as fact unless otherwise directed.
A Leslie matrix $L$ has a unique positive eigenvalue $\lambda_1$ with a corresponding eigenvector $\mathbf{v}_1$ whose entries are all positive.
If $\lambda_i$ ($i \neq 1$) is any other eigenvalue (real or complex) of $L$, then $|\lambda_i| \leq \lambda_1$. If $\lambda_1$ is the largest-magnitude eigenvalue of a matrix $L$, we call $\lambda_1$ a dominant eigenvalue of $L$.
If any two successive entries in the first row of $L$ are both positive, then $|\lambda_i| < \lambda_1$ for every $i \neq 1$. In this case we say that $\lambda_1$ is a strictly dominant eigenvalue of $L$. In a Leslie model, this happens when the females in two successive age classes are fertile, which is almost always the case.
If $\lambda_1$ is a strictly dominant eigenvalue, then $\mathbf{x}_t$ is approximately a scalar multiple of $\mathbf{v}_1$ for large values of $t$, regardless of the initial state $\mathbf{x}_0$. In other words, for large $t$ the state vectors are close to eigenvectors for $\lambda_1$.
In general, $\mathbf{x}_t = L\mathbf{x}_{t-1} = L^t\mathbf{x}_0$, where the first entry of $\mathbf{x}_t$ denotes the number of sheep in age group 0-1, the second the number of sheep in age group 1-2 and, in general, the $i$th entry the number of sheep in age group $(i-1)$-$i$ after $t$ years.
Assume the given initial population vector $\mathbf{x}_0$. Use appropriate technology to calculate the state vectors $\mathbf{x}_t$ for the first several values of $t$. Round to the nearest whole number. What do you notice about the sheep population? You may use the GeoGebra applet at geogebra.org/m/yqss88xq.
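A Python/NumPy sketch of this calculation follows. The fecundity and survival rates below are illustrative placeholders only; replace them with the entries of Table 20.8, and replace the initial population with the one given in the activity.

import numpy as np

# Illustrative placeholder rates; replace with the values in Table 20.8.
F = np.array([0.0, 0.4, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.45, 0.45, 0.4])
s = np.array([0.85, 0.95, 0.95, 0.95, 0.9, 0.9, 0.85, 0.8, 0.7, 0.55, 0.4])

L = np.zeros((12, 12))
L[0, :] = F                                  # first row: fecundity rates
L[np.arange(1, 12), np.arange(11)] = s       # subdiagonal: survival rates

x = np.full(12, 100.0)                       # hypothetical initial population x_0
for t in range(1, 6):
    x = L @ x                                # x_t = L x_{t-1}
    print(t, np.round(x))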
We can use the third and fourth properties of Leslie matrices to better understand the long-term behavior of the sheep population. Since successive entries in the first row of the Leslie matrix in (20.6) are positive, our Leslie matrix has a strictly dominant eigenvalue $\lambda_1$. Given the dimensions of our Leslie matrix, finding this dominant eigenvalue through algebraic means is not feasible. Use the power method to approximate the dominant eigenvalue of the Leslie matrix in (20.6) to five decimal places. Explain your process. Then explain how this dominant eigenvalue tells us that, unchecked, the sheep population grows at a rate that is roughly exponential. What is the growth rate of this exponential growth? You may use the GeoGebra applet at geogebra.org/m/yqss88xq.
Project Activity 20.7 indicates that, unchecked, the sheep population will grow without bound, roughly exponentially with ratio equal to the dominant eigenvalue of our Leslie matrix $L$. Of course, a sheep farmer cannot provide the physical environment or the resources to support an unlimited population of sheep. In addition, most sheep farmers cannot support themselves only by shearing sheep for wool, so some harvesting of the sheep population each year for meat and skin is necessary. A sustainable harvesting policy allows for the regular harvesting of some sheep while maintaining the population at a stable level. The farmer therefore needs to find an optimal harvesting rate to attain this stable population, and the following activity leads us through an analysis of how such a harvesting rate can be determined.
The Leslie model can be modified to consider harvesting. It is possible to harvest different age groups at different rates, and to harvest only some age groups and not others. In the case of sheep, it might make sense to harvest only from the youngest population, since lamb is more desirable than mutton and the lamb population grows the fastest. Assume that this is our harvesting strategy and that we harvest our sheep from only the youngest age group at the start of each year. Let $h$ be the fraction of sheep we harvest from the youngest age group each year after considering growth.
If we begin with an initial population $\mathbf{x}_0$, then the state vector after births and expected deaths is $L\mathbf{x}_0$. Now we harvest. Explain why, if we harvest a fraction $h$ from the youngest age group after considering growth, the state vector after 1 year will be
$$\mathbf{x}_1 = \begin{bmatrix} (1-h)\left(L\mathbf{x}_0\right)_1 \\ \left(L\mathbf{x}_0\right)_2 \\ \vdots \\ \left(L\mathbf{x}_0\right)_{12} \end{bmatrix},$$
where $\left(L\mathbf{x}_0\right)_i$ denotes the $i$th component of $L\mathbf{x}_0$.
Our goal is to find a harvesting rate $h$ that will lead to a steady state in which the sheep population remains the same each year. In other words, we want to find a value of $h$, if one exists, that satisfies
$$\begin{bmatrix} (1-h)\left(L\mathbf{x}_0\right)_1 \\ \left(L\mathbf{x}_0\right)_2 \\ \vdots \\ \left(L\mathbf{x}_0\right)_{12} \end{bmatrix} = \mathbf{x}_0. \tag{20.7}$$
Show that (20.7) is equivalent to the matrix equation
$$(I_{12} - H)L\mathbf{x}_0 = \mathbf{x}_0, \tag{20.8}$$
where $H$ is the $12 \times 12$ matrix whose only nonzero entry is $h$ in the $(1,1)$ position.
Use appropriate technology to experiment numerically with different values of $h$ to find the value you think gives the best uniform harvest rate. Explain your reasoning. You may use the GeoGebra applet at geogebra.org/m/yqss88xq.
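One way to run this experiment in code, reusing the Leslie matrix L from the earlier sketch; the candidate harvest fraction h and the number of years are the only inputs.

import numpy as np

def simulate_harvest(L, x0, h, years=100):
    # Each year: grow by the Leslie matrix, then harvest a fraction h of the
    # youngest age group.
    x = np.asarray(x0, dtype=float)
    for _ in range(years):
        x = L @ x
        x[0] *= (1.0 - h)
    return x

# For a sustainable h, one more year of growth and harvest should leave the
# state vector essentially unchanged, so compare simulate_harvest(L, x0, h)
# with simulate_harvest(L, x0, h, years=101) for each candidate h.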
Now we will use some algebra to find an equation that explicitly gives us the harvest rate $h$ in the general setting. This will take a bit of work, but none of it is too difficult. To simplify our work and yet illustrate the overall idea, let us consider the general case with arbitrary $4 \times 4$ Leslie matrix
$$L = \begin{bmatrix} F_1 & F_2 & F_3 & F_4 \\ s_1 & 0 & 0 & 0 \\ 0 & s_2 & 0 & 0 \\ 0 & 0 & s_3 & 0 \end{bmatrix}.$$
Recall that we want to find a value of $h$ that satisfies (20.8) with $\mathbf{x}_0 \neq \mathbf{0}$. Let $\mathbf{x}_0 = [x_1 \ x_2 \ x_3 \ x_4]^{\mathsf{T}}$.
Equation (20.9) as written depends on the entries of the vector $\mathbf{x}_0$, but we should be able to arrive at a result that is independent of $\mathbf{x}_0$. To see how we do this, we assume the population of the youngest group is never $0$, so we can divide both sides of (20.9) by $x_1$ to obtain
$$(1-h)\left(F_1 + F_2\frac{x_2}{x_1} + F_3\frac{x_3}{x_1} + F_4\frac{x_4}{x_1}\right) = 1. \tag{20.10}$$
Now we need to write the fractions $\frac{x_2}{x_1}$, $\frac{x_3}{x_1}$, and $\frac{x_4}{x_1}$ so that they do not involve the $x_i$. Use the remaining equations in your system to show that
$$\frac{x_2}{x_1} = s_1, \qquad \frac{x_3}{x_1} = s_1s_2, \qquad \text{and} \qquad \frac{x_4}{x_1} = s_1s_2s_3.$$
Now conclude that the harvesting value $h$ must satisfy the equation
$$(1-h)\left(F_1 + F_2s_1 + F_3s_1s_2 + F_4s_1s_2s_3\right) = 1. \tag{20.11}$$
The value $R = F_1 + F_2s_1 + F_3s_1s_2 + F_4s_1s_2s_3$ is called the net reproduction rate of the population and turns out to be the average number of daughters born to a female during her expected lifetime.
Extend (20.11) to the 12 age group case of the sheep herd. Calculate the value of $R$ for this sheep herd and then find the value of $h$. Compare this to the value you obtained through experimentation earlier. Find the fraction of the lambs that should be harvested each year and explain what the stable population state vector tells us about the sheep population for this harvesting policy.
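A sketch of this final computation: extending (20.11) to 12 age classes gives $(1-h)R = 1$ with $R = F_1 + F_2s_1 + F_3s_1s_2 + \cdots + F_{12}s_1s_2\cdots s_{11}$, so $h = 1 - \frac{1}{R}$. The rates below are again illustrative placeholders for the Table 20.8 values.

import numpy as np

# Illustrative placeholder rates; replace with the values in Table 20.8.
F = np.array([0.0, 0.4, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.45, 0.45, 0.4])
s = np.array([0.85, 0.95, 0.95, 0.95, 0.9, 0.9, 0.85, 0.8, 0.7, 0.55, 0.4])

products = np.concatenate(([1.0], np.cumprod(s)))   # 1, s_1, s_1 s_2, ...
R = float(np.sum(F * products))                     # net reproduction rate
h = 1.0 - 1.0 / R                                   # harvest fraction from (1-h)R = 1
print("R =", R, "h =", h)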
[37] There are several other ways to scale, but we won't consider them here.
[38] A matrix in which most entries are zero is called a sparse matrix.