Section 26 Least Squares Approximations
Focus Questions
By the end of this section, you should be able to give precise and thorough answers to the questions listed below. You may want to keep these questions in mind to focus your thoughts as you complete the section.
How, in general, can we find a least squares approximation to a system $A\mathbf{x} = \mathbf{b}$?
If the columns of $A$ are linearly independent, how can we find a least squares approximation to $A\mathbf{x} = \mathbf{b}$ using just matrix operations?
Why are these approximations called “least squares” approximations?
Subsection Application: Fitting Functions to Data
Data is all around us. Data is collected on almost everything and it is important to be able to use data to make predictions. However, data is rarely well-behaved and so we generally need to use approximation techniques to estimate from the data. One technique for this is least squares approximations. As we will see, we can use linear algebra to fit a variety of different types of curves to data.
Subsection Introduction
In this section our focus is on fitting linear and polynomial functions to data sets.
Preview Activity 26.1.
NBC was awarded the U.S. television broadcast rights to the 2016 and 2020 summer Olympic games. Table 26.1 lists the amounts paid (in millions of dollars) by NBC Sports for the 2008 and 2012 summer Olympics, plus the winning bids for the 2016 and 2020 Olympics, where year 0 is the year 2008. (We will assume a simple model here, ignoring variations such as the value of money due to inflation, viewership data which might affect NBC's expected revenue, etc.) Figure 26.2 shows a plot of the data. Our goal in this activity is to find a linear function
Year (since 2008) | Amount (millions of dollars) |
0 | 894 |
4 | 1180 |
8 | 1226 |
12 | 1418 |
If the data were actually linear, then the data would satisfy the system
The vector form of this equation is
This equation does not have a solution, so we seek the best approximation to a solution we can find. That is, we want to find
Letting
To make a best fit, we will minimize the square of the distance between
Rephrasing this in terms of projections, we are looking for the vector in
(a)
Find an orthogonal basis
(b)
Use the basis
(c)
Find the values of
(d)
Draw a picture of your line from the previous part on the axes with the data set. How well do you think your line approximates the data? Explain.
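The projection steps in this activity can be sketched numerically. The following is a minimal sketch, assuming the line has the form $a_0 + a_1 x$ so that the coefficient matrix has a column of ones and a column of years; the names `v1`, `v2`, `w1`, `w2`, and `proj` are illustrative and do not come from the text.

```python
import numpy as np

# Olympics data from Table 26.1 (year 0 = 2008, amounts in millions of dollars).
years = np.array([0.0, 4.0, 8.0, 12.0])
amounts = np.array([894.0, 1180.0, 1226.0, 1418.0])

# Columns of the (assumed) coefficient matrix for a line a0 + a1*x.
v1 = np.ones_like(years)
v2 = years

# Gram-Schmidt: subtract from v2 its projection onto v1.
w1 = v1
w2 = v2 - (v2 @ w1) / (w1 @ w1) * w1

# Orthogonal projection of the data vector onto Span{w1, w2}.
proj = (amounts @ w1) / (w1 @ w1) * w1 + (amounts @ w2) / (w2 @ w2) * w2

# The projection lies in the column space, so A c = proj is consistent;
# lstsq returns its solution (up to rounding).
A = np.column_stack([v1, v2])
coeffs, *_ = np.linalg.lstsq(A, proj, rcond=None)
a0, a1 = coeffs

print("orthogonal basis:", w1, w2)
print("projection:", proj)
print("line: %.2f + %.2f x" % (a0, a1))
```

The key point is that the original system has no solution, but the system with the projected right-hand side does, and its solution gives the coefficients of the best-fit line.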
Subsection Least Squares Approximations
In Section 25 we saw that the projection of a vector
Activity 26.2.
Let
(a)
Explain why
(b)
Let
(c)
From the previous part, show that
Theorem 26.5.
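The least squares solutions to the system $A\mathbf{x} = \mathbf{b}$ are the solutions to the corresponding system of normal equations $A^{\mathsf{T}}A\mathbf{x} = A^{\mathsf{T}}\mathbf{b}$.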
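As an illustration, the normal equations $A^{\mathsf{T}}A\mathbf{x} = A^{\mathsf{T}}\mathbf{b}$ can be solved directly with software. This sketch reuses the Olympics data from Table 26.1 and assumes a linear model; the helper `sse` and the perturbations tried are hypothetical choices made only to show that moving away from the solution increases the sum of squared errors.

```python
import numpy as np

x = np.array([0.0, 4.0, 8.0, 12.0])
b = np.array([894.0, 1180.0, 1226.0, 1418.0])
A = np.column_stack([np.ones_like(x), x])

# Solve the normal equations A^T A x = A^T b.
x_hat = np.linalg.solve(A.T @ A, A.T @ b)

# The residual b - A x_hat is orthogonal to every column of A.
residual = b - A @ x_hat
print("A^T (b - A x_hat) =", A.T @ residual)


def sse(coeffs):
    """Sum of squared errors ||b - A coeffs||^2."""
    return float(np.sum((b - A @ coeffs) ** 2))


# Nudging the coefficients in any direction should not decrease the error.
perturbations = [np.array([1.0, 0.0]), np.array([0.0, 0.1]), np.array([-2.0, 0.3])]
for d in perturbations:
    print(d, sse(x_hat + d) >= sse(x_hat))
```

The orthogonality of the residual to the columns of $A$ is exactly what the normal equations express.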
Activity 26.3.
Now use the least squares method to find the best polynomial approximations (in the least squares sense) of degrees 2 and 3 for the Olympics data set in Table 26.1. Which polynomial seems to give the “best” fit? Explain why. Include a discussion of the errors in your approximations. Use your “best” least squares approximation to estimate how much NBC might pay for the television rights to the 2024 Olympic games. Use technology as appropriate.
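One way to set up this activity with technology is sketched below. It assumes polynomials written with the constant term first and uses NumPy's `np.vander` to build the coefficient matrix; the helper names `poly_least_squares` and `evaluate` are illustrative, not from the text.

```python
import numpy as np

x = np.array([0.0, 4.0, 8.0, 12.0])
y = np.array([894.0, 1180.0, 1226.0, 1418.0])


def poly_least_squares(x, y, degree):
    """Coefficients (constant term first) of the least squares polynomial."""
    A = np.vander(x, degree + 1, increasing=True)  # columns 1, x, x^2, ...
    coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coeffs


def evaluate(coeffs, t):
    """Evaluate the polynomial with the given coefficients at t."""
    return sum(c * t**k for k, c in enumerate(coeffs))


fits = {}
for degree in (1, 2, 3):
    c = poly_least_squares(x, y, degree)
    err = float(np.sum((y - evaluate(c, x)) ** 2))
    fits[degree] = (c, err)
    # x = 16 corresponds to the year 2024.
    print(degree, np.round(c, 3), round(err, 3), round(float(evaluate(c, 16.0)), 1))
```

With four data points, the cubic passes through every point, so its sum of squared errors is zero up to rounding. A zero error on the data does not by itself make the cubic the most trustworthy model for predicting 2024, which is part of what the activity asks you to discuss.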
Activity 26.4.
Let
(a)
What must be the relationship between
(b)
We know that an
(i)
Show that
What is
(ii)
What does
(iii)
What is
Theorem 26.6.
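If the columns of $A$ are linearly independent, then the least squares solution to $A\mathbf{x} = \mathbf{b}$ is $\mathbf{x} = (A^{\mathsf{T}}A)^{-1}A^{\mathsf{T}}\mathbf{b}$.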
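The formula $(A^{\mathsf{T}}A)^{-1}A^{\mathsf{T}}\mathbf{b}$ requires $A^{\mathsf{T}}A$ to be invertible, which ties back to the rank comparison in Activity 26.4. The sketch below uses two small hypothetical matrices (not from the text), one with independent columns and one without; `least_squares_by_formula` is an illustrative helper name.

```python
import numpy as np


def least_squares_by_formula(A, b):
    """Return (A^T A)^{-1} A^T b; requires linearly independent columns."""
    if np.linalg.matrix_rank(A) < A.shape[1]:
        raise ValueError("columns of A are linearly dependent; A^T A is not invertible")
    return np.linalg.inv(A.T @ A) @ A.T @ b


# Hypothetical example with linearly independent columns.
A = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 2.0, 2.0])
x_hat = least_squares_by_formula(A, b)
print("x_hat =", x_hat)
print("rank A =", np.linalg.matrix_rank(A), " rank A^T A =", np.linalg.matrix_rank(A.T @ A))

# Hypothetical example with dependent columns (second column = 2 * first).
B = np.array([[1.0, 2.0], [1.0, 2.0], [1.0, 2.0]])
print("rank B =", np.linalg.matrix_rank(B), " rank B^T B =", np.linalg.matrix_rank(B.T @ B))
try:
    least_squares_by_formula(B, b)
except ValueError as err:
    print("formula not applicable:", err)
```

In numerical practice, forming an explicit inverse is usually avoided in favor of solving the normal equations or using a QR factorization; the inverse is used here only to mirror the formula.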
Subsection Examples
What follows are worked examples that use the concepts from this section.
Example 26.7.
According to the Centers for Disease Control and Prevention [45], the average length (in centimeters) of a male infant in the US at ages from 1.5 to 8.5 months is given in Table 26.8.
Age (months) | 1.5 | 2.5 | 3.5 | 4.5 | 5.5 | 6.5 | 7.5 | 8.5 |
Average Length (cm) | 56.6 | 59.6 | 62.1 | 64.2 | 66.1 | 67.9 | 69.5 | 70.9 |
In this problem we will find the line and the quadratic of best fit in the least squares sense to this data. We treat age in months as the independent variable and length in centimeters as the dependent variable.
(a)
Find a line that is the best fit to the data in the least squares sense. Draw a picture of your least squares solution against a scatterplot of the data.
Solution.
We assume that a line of the form
Letting
we can write this system in the matrix form
Technology shows that (with entries rounded to 3 decimal places),
and
So the least squares linear function to the data is
(b)
Now find the least squares quadratic of the form
Solution.
The first data point would satisfy
Letting
we can write this system in the matrix form
Technology shows that (with entries rounded to 3 decimal places)
and
So the least squares quadratic function to the data is
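The computations in this example can be reproduced with a short script. This is a sketch under the assumption that the line has the form $a_0 + a_1 x$ and the quadratic has the form $a_0 + a_1 x + a_2 x^2$; the helper `fit` and the variable names are illustrative.

```python
import numpy as np

# Data from Table 26.8.
age = np.array([1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5, 8.5])
length = np.array([56.6, 59.6, 62.1, 64.2, 66.1, 67.9, 69.5, 70.9])


def fit(A, b):
    """Least squares coefficients and the sum of squared errors."""
    coeffs, *_ = np.linalg.lstsq(A, b, rcond=None)
    sse = float(np.sum((b - A @ coeffs) ** 2))
    return coeffs, sse


# Coefficient matrices: columns 1, x for the line and 1, x, x^2 for the quadratic.
A_line = np.column_stack([np.ones_like(age), age])
A_quad = np.column_stack([np.ones_like(age), age, age**2])

line_coeffs, line_sse = fit(A_line, length)
quad_coeffs, quad_sse = fit(A_quad, length)

print("line:     ", np.round(line_coeffs, 3), " SSE:", round(line_sse, 3))
print("quadratic:", np.round(quad_coeffs, 3), " SSE:", round(quad_sse, 3))
```

Comparing the two sums of squared errors gives one numeric way to judge which curve follows the data more closely.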
Example 26.10.
Least squares solutions can be found through a QR factorization, as we explore in this example. Let
(a)
Replace
Hint.
Use the fact that
Solution.
Replacing
So if
(b)
Consider the data set in Table 26.11, which shows the average life expectancy in years in the US for selected years from 1950 to 2010.
Year | 1950 | 1965 | 1980 | 1995 | 2010 |
Life expectancy (years) | 68.14 | 70.21 | 73.70 | 75.98 | 78.49 |
(i)
Use (26.9) to find the least squares linear fit to the data set.
Solution.
A linear fit to the data will be provided by the least squares solution to
Technology shows that
(ii)
Use appropriate technology to find the QR factorization of an appropriate matrix
Solution.
Technology shows that
Then we have that
just as in part (i).
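The QR approach in this example can be sketched as follows. If $A = QR$ with $Q$ having orthonormal columns and $R$ invertible and upper triangular, then the normal equations $A^{\mathsf{T}}A\mathbf{x} = A^{\mathsf{T}}\mathbf{b}$ reduce to $R\mathbf{x} = Q^{\mathsf{T}}\mathbf{b}$. The code assumes a linear model with a column of ones and a column of years; the variable names are illustrative.

```python
import numpy as np

# Data from Table 26.11.
years = np.array([1950.0, 1965.0, 1980.0, 1995.0, 2010.0])
life = np.array([68.14, 70.21, 73.70, 75.98, 78.49])
A = np.column_stack([np.ones_like(years), years])

# Reduced QR factorization: Q is 5x2 with orthonormal columns, R is 2x2 upper triangular.
Q, R = np.linalg.qr(A)

# Least squares solution from R x = Q^T b.
x_qr = np.linalg.solve(R, Q.T @ life)

# Same solution from the normal equations, for comparison.
x_ne = np.linalg.solve(A.T @ A, A.T @ life)

print("QR:              ", x_qr)
print("normal equations:", x_ne)
```

Both routes give the same line up to rounding. Because the year values are large, $A^{\mathsf{T}}A$ is poorly conditioned here, which is one practical reason software often prefers the QR route.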
Subsection Summary
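A least squares approximation to $A\mathbf{x} = \mathbf{b}$ is found by orthogonally projecting $\mathbf{b}$ onto $\operatorname{Col} A$.
If the columns of $A$ are linearly independent, then the least squares approximation to $A\mathbf{x} = \mathbf{b}$ is $(A^{\mathsf{T}}A)^{-1}A^{\mathsf{T}}\mathbf{b}$.
The least squares solution to $A\mathbf{x} = \mathbf{b}$ minimizes the distance $\|A\mathbf{x} - \mathbf{b}\|$, where $\|A\mathbf{x} - \mathbf{b}\|^2 = \sum_{i} \left((A\mathbf{x})_i - b_i\right)^2$. So the least squares solution minimizes a sum of squares.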
Exercises
1.
The University of Denver Infant Study Center investigated whether babies take longer to learn to crawl in cold months, when they are often bundled in clothes that restrict their movement, than in warmer months. The study sought a relationship between babies' first crawling age and the average temperature during the month they first try to crawl (about 6 months after birth). Some of the data from the study is in Table 26.12. Let
Average temperature | 33 | 37 | 48 | 57 |
First crawling age | 33.83 | 33.35 | 33.38 | 32.32 |
(a)
Find the least squares line to fit this data. Plot the data and your line on the same set of axes. (We aren't concerned about whether a linear fit is really a good choice outside of this data set; we just fit a line to it to see what happens.)
(b)
Use your least squares line to predict the average crawling age when the temperature is 65.
2.
The cost, in cents, of a first class postage stamp in years from 1981 to 1995 is shown in Table 26.13.
Year | 1981 | 1985 | 1988 | 1991 | 1995 |
Cost | 20 | 22 | 25 | 29 | 32 |
(a)
Find the least squares line to fit this data. Plot the data and your line on the same set of axes.
(b)
Now find the least squares quadratic approximation to this data. Plot the quadratic function on the same axes as your linear function.
(c)
Use your least squares line and quadratic to predict the cost of a postage stamp this year. Look up the cost of a stamp today and determine how accurate your prediction is. Which function gives a better approximation? Provide reasons for any discrepancies.
3.
According to The Song of Insects by G.W. Pierce (Harvard College Press, 1948) the sound of striped ground crickets chirping, in number of chirps per second, is related to the temperature. So the number of chirps per second could be a predictor of temperature. The data Pierce collected is shown in the table and scatterplot below, where
Chirps per second | Temperature |
20.0 | 88.6 |
16.0 | 71.6 |
19.8 | 93.3 |
18.4 | 84.3 |
17.1 | 80.6 |
15.5 | 75.2 |
14.7 | 69.7 |
17.1 | 82.0 |
15.4 | 69.4 |
16.2 | 83.3 |
15.0 | 79.6 |
17.2 | 82.6 |
16.0 | 80.6 |
17.0 | 83.5 |
14.4 | 76.3 |
The relationship between
4.
We showed that if the columns of
5.
Consider the small data set of points
(a)
Find a linear system
(b)
Explain what happens when we attempt to find the least squares solution
(c)
Does the system
(d)
Fit a linear function of the form
6.
Let
(a)
Show that
(b)
Show that
For part, see Exercise 12 in Section 15.
(c)
Show that
7.
We have seen that if the columns of a matrix
is a least squares solution to
(a)
Explain why it is enough to show that the rank of the augmented matrix
(b)
Explain why
See Exercise 6.
(c)
Explain why
Use the definition of the matrix product.
(d)
Explain why
See Exercise 6.
(e)
Finally, explain why
Combine parts (b) and (d).
8.
If
(a)
Show that
(b)
In general, we define projection matrices as follows.
Definition 26.14.
A square matrix
(c)
Notice that the projection matrix from part (b) is not an orthogonal matrix.
Definition 26.15.
A square matrix
(d)
If
(e)
Recall the projection
9.
Label each of the following statements as True or False. Provide justification for your response.
(a) True/False.
If the columns of
(b) True/False.
Let
(c) True/False.
The least squares line to the data points
(d) True/False.
If the columns of a matrix
(e) True/False.
Every matrix equation of the form
(f) True/False.
If the columns of
Subsection Project: Other Least Squares Approximations
In this section we learned how to fit a polynomial function to a set of data in the least squares sense. But data takes on many forms, so it is important to be able to fit other types of functions to data sets. We investigate three different types of regression problems in this project.
Project Activity 26.5.
The length of a species of fish is to be represented as a function of the age and water temperature, as shown in the table below [47]. The fish are kept in tanks at 25, 27, 29 and 31 degrees Celsius. After birth, a test specimen is chosen at random every 14 days and its length measured. The data include:
the index; the age of the fish in days; the water temperature in degrees Celsius; the length of the fish.
Since there are three variables in the data, we cannot perform a simple linear regression. Instead, we seek a model of the form
to fit the data, where
(a)
As we did when we fit polynomials to data, we start by considering what would happen if all of our data points satisfied our model function. In this case our data points have the form
(b)
Write the system from (a) in the form
(c)
The same derivation as with the polynomial regression models shows that the vector
Use this to find the least squares fit of the form
(d)
Provide a numeric measure of how well this model function fits the data. Explain.
Index | Age (days) | Temp (°C) | Length |
1 | 14 | 25 | 620 |
2 | 28 | 25 | 1315 |
3 | 41 | 25 | 2120 |
4 | 55 | 25 | 2600 |
5 | 69 | 25 | 3110 |
6 | 83 | 25 | 3535 |
7 | 97 | 25 | 3935 |
8 | 111 | 25 | 4465 |
9 | 125 | 25 | 4530 |
10 | 139 | 25 | 4570 |
11 | 153 | 25 | 4600 |
12 | 14 | 27 | 625 |
13 | 28 | 27 | 1215 |
14 | 41 | 27 | 2110 |
15 | 55 | 27 | 2805 |
16 | 69 | 27 | 3255 |
17 | 83 | 27 | 4015 |
18 | 97 | 27 | 4315 |
19 | 111 | 27 | 4495 |
20 | 125 | 27 | 4535 |
21 | 139 | 27 | 4600 |
22 | 153 | 27 | 4600 |
23 | 14 | 29 | 590 |
24 | 28 | 29 | 1305 |
25 | 41 | 29 | 2140 |
26 | 55 | 29 | 2890 |
27 | 69 | 29 | 3920 |
28 | 83 | 29 | 3920 |
29 | 97 | 29 | 4515 |
30 | 111 | 29 | 4520 |
31 | 125 | 29 | 4525 |
32 | 139 | 29 | 4565 |
33 | 153 | 29 | 4566 |
34 | 14 | 31 | 590 |
35 | 28 | 31 | 1205 |
36 | 41 | 31 | 1915 |
37 | 55 | 31 | 2140 |
38 | 69 | 31 | 2710 |
39 | 83 | 31 | 3020 |
40 | 97 | 31 | 3030 |
41 | 111 | 31 | 3040 |
42 | 125 | 31 | 3180 |
43 | 139 | 31 | 3257 |
44 | 153 | 31 | 3214 |
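A two-variable fit of this kind can be set up the same way as a polynomial fit. The sketch below assumes a model of the form $\beta_0 + \beta_1 \cdot \text{age} + \beta_2 \cdot \text{temperature}$, which may differ in notation from the model intended in the activity, and it uses the coefficient of determination $R^2$ as one possible numeric measure of fit; both are assumptions, not the text's prescribed choices.

```python
import numpy as np

# Table data: 11 ages repeated for each of the 4 tank temperatures.
ages = np.tile([14, 28, 41, 55, 69, 83, 97, 111, 125, 139, 153], 4).astype(float)
temps = np.repeat([25, 27, 29, 31], 11).astype(float)
lengths = np.array([
    620, 1315, 2120, 2600, 3110, 3535, 3935, 4465, 4530, 4570, 4600,   # 25 degrees
    625, 1215, 2110, 2805, 3255, 4015, 4315, 4495, 4535, 4600, 4600,   # 27 degrees
    590, 1305, 2140, 2890, 3920, 3920, 4515, 4520, 4525, 4565, 4566,   # 29 degrees
    590, 1205, 1915, 2140, 2710, 3020, 3030, 3040, 3180, 3257, 3214,   # 31 degrees
], dtype=float)

# Assumed model: length ~ beta0 + beta1 * age + beta2 * temp.
A = np.column_stack([np.ones_like(ages), ages, temps])
beta, *_ = np.linalg.lstsq(A, lengths, rcond=None)

# One possible numeric measure of fit: R^2 = 1 - SS_res / SS_tot.
fitted = A @ beta
ss_res = float(np.sum((lengths - fitted) ** 2))
ss_tot = float(np.sum((lengths - lengths.mean()) ** 2))
r_squared = 1.0 - ss_res / ss_tot

print("coefficients:", np.round(beta, 2))
print("R^2:", round(r_squared, 3))
```

Each row of the coefficient matrix corresponds to one fish, with a 1 for the constant term followed by that fish's age and tank temperature.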

Project Activity 26.6.
Population growth is typically not well modeled by polynomial functions. Populations tend to grow at rates proportional to the population, which implies exponential growth. For example, Table 26.16 shows the approximate population of the United States in years between 1920 and 2000, with the population measured in millions.
Year | 1920 | 1930 | 1940 | 1950 | 1960 | 1970 | 1980 | 1990 | 2000 |
Population | 106 | 123 | 142 | 161 | 189 | 213 | 237 | 259 | 291 |
If we assume the population grows exponentially, we would want to find the best fit function
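One common way to bring an exponential model into the least squares framework is to take logarithms. The sketch below assumes a model of the form $P(t) = a e^{kt}$ with $t$ measured in years since 1920, so that $\ln P = \ln a + kt$ is linear in the unknowns $\ln a$ and $k$. This linearization is an assumption about the intended approach, and it minimizes squared error in $\ln P$ rather than in $P$ itself.

```python
import numpy as np

# Data from Table 26.16 (population in millions).
years = np.arange(1920, 2001, 10, dtype=float)
population = np.array([106, 123, 142, 161, 189, 213, 237, 259, 291], dtype=float)

# Assumed model P(t) = a * exp(k t), with t = years since 1920.
t = years - 1920.0
A = np.column_stack([np.ones_like(t), t])

# Fit ln(P) = ln(a) + k t by least squares.
coeffs, *_ = np.linalg.lstsq(A, np.log(population), rcond=None)
ln_a, k = coeffs
a = float(np.exp(ln_a))


def model(year):
    """Population (millions) predicted by the fitted exponential."""
    return a * np.exp(k * (year - 1920.0))


print("a = %.2f, k = %.5f" % (a, k))
print("fitted values:", np.round(model(years), 1))
```

Shifting the years so that $t = 0$ corresponds to 1920 keeps the numbers in the coefficient matrix small and makes $a$ interpretable as the fitted 1920 population.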
Project Activity 26.7.
Carl Friedrich Gauss is often credited with inventing the method of least squares. He used the method to find a best-fit ellipse which allowed him to correctly predict the orbit of the asteroid Ceres as it passed behind the sun in 1801. (Adrien-Marie Legendre appears to be the first to publish the method, though.) Here we examine the problem of fitting an ellipse to data.
An ellipse is described by a quadratic equation that can be written in the form
for constants
A picture of the best fit ellipse is shown in Figure 26.17.
(a)
Find the system of linear equations that would result if the ellipse (26.10) were to exactly pass through the given points.
(b)
Write the linear system from part (a) in the form
(c)
Find the least squares ellipse to this set of points. Make sure your method is clear. (Note that we are really fitting a surface of the form
cdc.gov/growthcharts/html_charts/lenageinf.htm
macrotrends.net/countries/USA/united-states/life-expectancy