Section 2 The Matrix Representation of a Linear System

Subsection Application: Approximating Area Under a Curve

We know from basic geometry how to find areas of circles and triangles. However, it is much more difficult to find areas of other geometric objects. In fact, it is generally an impossible problem to determine the exact area bounded by a complicated curve. For this reason, approximation methods are used. One such method involves approximating curves using quadratic functions.

Unless you have learned some calculus, you have probably never calculated the area under a parabola. In the ancient work Quadrature of the Parabola (3rd century BC), Archimedes determined a method for finding the area of a region bounded by a parabola, first by using mechanics and then by geometric methods. Simpson's Rule builds on this idea: it uses parabolas to approximate a function, and then approximates the area under the graph of the function by the areas under the parabolas. In order to use Simpson's Rule, we need to know how to fit a quadratic function exactly to three points. More details about this process can be found at the end of this section.

This idea of fitting a polynomial to a set of data points has uses in other areas as well. For example, two common applications of Bézier curves are font design and drawing tools. When fitting a polynomial to a large set of data points, our systems of equations can become quite large and difficult to solve by hand. In this section we will see how to use matrices to represent systems of equations of any size more conveniently. We also consider how the elimination process works on the matrix representation of a linear system and how we can determine the existence and the form of solutions of a linear system.

Subsection Introduction

When working with a linear system, the labels for the variables are irrelevant to the solution — the only thing that matters is the coefficients of the variables in the equations and the constants on the other side of the equations. For example, given a linear system of the form

\begin{alignat}{3} a_2 \amp {}-{} \amp a_1 \amp {}+{} \amp a_0 \amp = 2\notag\\ a_2 \amp {}+{} \amp a_1 \amp {}+{} \amp a_0 \amp = 6\tag{2.1}\\ 4a_2 \amp {}+{} \amp 2a_1 \amp {}+{} \amp a_0 \amp = 5\text{,}\notag \end{alignat}

the important information in the system can be represented as

\begin{align*} 1 \amp \amp -1 \amp \amp 1 \amp \amp 2\\ 1 \amp \amp 1 \amp \amp 1 \amp \amp 6\\ 4 \amp \amp 2 \amp \amp 1 \amp \amp 5 \end{align*}

where we interpret the first three numbers in each horizontal row to represent the coefficients of the variables \(a_2\text{,}\) \(a_1\text{,}\) and \(a_0\text{,}\) respectively, and the last number to be the constant on the right hand side of the equation. This tells us that we can record all the necessary information about our system in a rectangular array of numbers. Such an array is called a matrix.

Definition 2.1.

A matrix is a rectangular array of quantities or expressions.

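We usually delineate a matrix by enclosing its entries in square brackets \([ * ]\text{.}\) For the system in (2.1), there are two corresponding matrices:

\begin{equation*} \left[ \begin{array}{crc} 1 \amp -1 \amp 1 \\ 1 \amp 1 \amp 1 \\ 4 \amp 2 \amp 1 \end{array} \right] \ \ \text{ and } \ \ \left[ \begin{array}{crc|c} 1 \amp -1 \amp 1 \amp 2 \\ 1 \amp 1 \amp 1 \amp 6 \\ 4 \amp 2 \amp 1 \amp 5 \end{array} \right]\text{.} \end{equation*}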

The matrix on the left is the matrix of the coefficients of the system, and is called the coefficient matrix of the system. The matrix on the right is the matrix of coefficients and the constants, and is called the augmented matrix of the system (where we say we augment the coefficient matrix with the additional column of constants). We will separate the augmented column from the coefficient matrix with a vertical line to keep it clear that the last column is an augmented column of constants and not a column of coefficients.

Terminology.

There is some important terminology related to matrices.

  • Any number in a matrix is called an entry of the matrix.

  • The collection of entries in an augmented matrix that corresponds to a given equation (that is reading the entries from left to right, or a horizontal set of entries) is called a row of the matrix. We number the rows from top to bottom in a matrix. For example, \(\left[ \begin{array}{crc} 1\amp -1\amp 1 \end{array} \right]\) is the first row and \(\left[ \begin{array}{ccc} 1\amp 1\amp 1 \end{array} \right]\) is the second row of the coefficient matrix of the system (2.1).

  • The set of entries as we read from top to bottom (or a vertical set of entries that correspond to one fixed variable or the constants on the right hand sides of the equations) is called a column of the matrix. We number the columns from left to right in a matrix. For example, \(\left[ \begin{array}{c} 1 \\ 1 \\ 4 \end{array} \right]\) is the first column and \(\left[ \begin{array}{c} 1 \\ 1 \\ 1 \end{array} \right]\) is the third column of the coefficient matrix of the system (2.1).

  • The size of a matrix is given as \(m\times n\) where \(m\) is the number of rows and \(n\) is the number of columns. The coefficient matrix above is a \(3\times 3\) matrix since it has 3 rows and 3 columns, while the augmented matrix is a \(3\times 4\) matrix as it has 4 columns.
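To make the terminology concrete, here is a minimal Python sketch that stores the coefficient and augmented matrices of system (2.1) as lists of rows. The names `coefficients`, `constants`, `augmented`, and `size` are illustrative choices, not notation from this text. Note that Python indexes from 0, so the first row of the matrix is `coefficients[0]`.

```python
# Coefficient matrix of system (2.1): one inner list per equation (row).
coefficients = [
    [1, -1, 1],
    [1, 1, 1],
    [4, 2, 1],
]
# Constants from the right hand sides of the equations.
constants = [2, 6, 5]

# Augment each row of coefficients with the matching constant.
augmented = [row + [b] for row, b in zip(coefficients, constants)]


def size(matrix):
    """Return (m, n): the number of rows and the number of columns."""
    return len(matrix), len(matrix[0])


first_row = coefficients[0]                      # [1, -1, 1]
third_column = [row[2] for row in coefficients]  # [1, 1, 1]

print(size(coefficients))  # (3, 3)
print(size(augmented))     # (3, 4)
```

A list of rows is only one possible representation. It is convenient here because each row corresponds to one equation.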

Preview Activity 2.1.

(a)

Write the augmented matrix for the following linear system. If needed, rearrange an equation to ensure that the variables appear in the same order on the left side in each equation with the constants being on the right hand side of each equation.

\begin{align} -x_3 + 3 + 2x_2\amp = -x_1\notag\\ -3 + 2x_3 \amp = -x_2\tag{2.2}\\ -2x_2 + x_1 \amp = 3x_3-7\notag \end{align}
(b)

Write the linear system in variables \(x_1, x_2\) and \(x_3\text{,}\) appearing in the natural order that corresponds to the following augmented matrix. Then solve the linear system using the elimination method.

\begin{equation*} \left[ \begin{array}{rrr|r} 1 \amp 1 \amp -1 \amp 4 \\ 1 \amp 2 \amp 2 \amp 3 \\ 2 \amp 3 \amp -3 \amp 11 \end{array} \right] \end{equation*}
(c)

Consider the three types of elementary operations on systems of equations introduced in Section 1. Each row of an augmented matrix of a system corresponds to an equation, so each elementary operation on equations corresponds to an operation on rows (called row operations).

(i)

Describe the row operation that corresponds to interchanging two equations.

(ii)

Describe the row operation that corresponds to multiplying an equation by a nonzero scalar.

(iii)

Describe the row operation that corresponds to replacing one equation by the sum of that equation and a scalar multiple of another equation.

Subsection Simplifying Linear Systems Represented in Matrix Form

Once we have stored the information about a linear system in an augmented matrix, we can perform the elementary operations directly on the augmented matrix.

Recall that the allowable operations on a system of equations are the following:

  1. Replacing one equation by the sum of that equation and a scalar multiple of another equation.

  2. Interchanging the positions of two equations.

  3. Replacing an equation by a nonzero scalar multiple of itself.

Recall that we use these elementary operations to transform a system, with the ultimate goal of finding a simpler, equivalent system that we can solve. Since each row of an augmented matrix corresponds to an equation, we can translate these operations on equations to corresponding operations on rows (called row operations or elementary row operations):

  1. Replacing one row by the sum of that row and a scalar multiple of another row.

  2. Interchanging two rows.

  3. Replacing a row by a nonzero scalar multiple of itself.
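The three row operations can be sketched as small Python functions acting on a matrix stored as a list of rows. The function names below are hypothetical, chosen only for this illustration. Exact fractions are used so that no rounding occurs, and rows are indexed from 0.

```python
from fractions import Fraction


def replace_row(matrix, target, source, scalar):
    """Operation 1: R_target + scalar * R_source -> R_target."""
    matrix[target] = [t + scalar * s
                      for t, s in zip(matrix[target], matrix[source])]


def swap_rows(matrix, i, j):
    """Operation 2: interchange rows i and j."""
    matrix[i], matrix[j] = matrix[j], matrix[i]


def scale_row(matrix, i, scalar):
    """Operation 3: replace row i by a nonzero scalar multiple of itself."""
    if scalar == 0:
        raise ValueError("the scalar must be nonzero")
    matrix[i] = [scalar * entry for entry in matrix[i]]


# Augmented matrix of system (2.1); Python index 0 is the first row.
M = [[Fraction(x) for x in row]
     for row in ([1, -1, 1, 2], [1, 1, 1, 6], [4, 2, 1, 5])]

replace_row(M, 1, 0, -1)          # R2 - R1 -> R2
scale_row(M, 1, Fraction(1, 2))   # (1/2) R2 -> R2
swap_rows(M, 1, 2)                # R2 <-> R3
```

The scalar in the third operation must be nonzero because multiplying an equation by 0 discards its information, and the resulting system need not be equivalent to the original one.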

Activity 2.2.

Consider the system

\begin{alignat*}{3} {}a_2 \amp {}-{} \amp {}a_1 \amp {}+{} \amp {}a_0 \amp = 2\\ {}a_2 \amp {}+{} \amp {}a_1 \amp {}+{} \amp {}a_0 \amp = 6\\ 4a_2 \amp {}+{} \amp 2a_1 \amp {}+{} \amp {}a_0 \amp = 5 \end{alignat*}

with corresponding augmented matrix

\begin{equation*} \left[ \begin{array}{crc|c} 1 \amp -1 \amp 1 \amp 2 \\ 1 \amp 1 \amp 1 \amp 6 \\ 4 \amp 2 \amp 1 \amp 5 \end{array} \right] \end{equation*}
(a)

As a first step in solving our system, we might eliminate \(a_2\) from the second equation. This means that the corresponding entry in the second row and first column of the augmented matrix will become 0. Find a row operation that adds a multiple of the first row to the second row to achieve this goal. Then write the system of equations that corresponds to this new augmented matrix.

(b)

Now that we have eliminated the \(a_2\) term from the second equation, we eliminate the \(a_2\) term from the third equation. Find an appropriate row operation that does that, and write the system of linear equations that corresponds to the new augmented matrix.

(c)

Now you should have a system in which the last two rows correspond to a system of 2 linear equations in two unknowns. Use a row operation that adds a multiple of the second row to the third row to turn the coefficient of \(a_1\) in the third row to 0. Then write the corresponding system of linear equations.

(d)

Your simplified system and its augmented matrix are in row echelon form and this system is solvable using back-substitution (substituting the known variable values into the previous equation to find the value of another variable). Solve the system.

Reflection 2.2.

Do you see how this standard elimination process can be generalized to any linear system with any number of variables to produce a simplified system? Do you see why the process does not change the solutions of the system? If needed, can you modify the standard elimination process to obtain a simplified system in which the last equation contains only the variable \(a_2\text{,}\) the next to last equation contains only the variables \(a_1, a_2\text{,}\) etc.? Understanding the standard process will enable you to modify it, if needed, in a problem.

Activity 2.2 illustrates how we can perform all of the operations on equations with operations on the rows of augmented matrices to reduce a system to a solvable form. Each time we perform an operation on the system of equations (or on the rows of an augmented matrix) we obtain an equivalent system (or an augmented matrix corresponding to an equivalent system). For completeness, we list below the operations on equations and the corresponding row operations that can be used to solve our polynomial fitting system. Throughout the process we will let \(E_1\text{,}\) \(E_2\text{,}\) and \(E_3\) be the first, second, and third equations in the system and \(R_1\text{,}\) \(R_2\text{,}\) and \(R_3\) the first, second, and third rows of the augmented matrices. The notation \(E_2-E_1\to E_2\) means that we replace the second equation in the system with the second equation minus the first equation, and similarly for rows. We start with the system

\begin{alignat*}{3} {}a_2 \amp {}-{} \amp {}a_1 \amp {}+{} \amp {}a_0 \amp = 2\\ {}a_2 \amp {}+{} \amp {}a_1 \amp {}+{} \amp {}a_0 \amp = 6\\ 4a_2 \amp {}+{} \amp 2a_1 \amp {}+{} \amp {}a_0 \amp = 5 \end{alignat*}

On the left we demonstrate the operations on equations and on the right the corresponding operations on rows of the augmented matrix.

\(\begin{array}{c} \text{ } \\ E_2-E_1\to E_2 \\ \text{ } \end{array}\)

\begin{alignat*}{3} {}a_2 \amp {}-{} \amp {}a_1 \amp {}+{} \amp {}a_0 \amp {}= 2\\ {} \amp {} \amp 2a_1 \amp {} \amp {} \amp {}= 4\\ 4a_2 \amp {}+{} \amp 2a_1 \amp {}+{} \amp {}a_0 \amp {}= 5 \end{alignat*}

\(R_2-R_1\to R_2\)

\begin{equation*} \left[ \begin{array}{rrr|r} 1 \amp -1 \amp 1 \amp 2 \\ 0 \amp 2 \amp 0 \amp 4 \\ 4 \amp 2\amp 1 \amp 5 \end{array} \right] \end{equation*}

\(\begin{array}{c} \text{ } \\ \text{ } \\ E_3-4E_1\to E_3 \end{array}\)

\begin{alignat*}{4} {}a_2 \amp {}-{} \amp {}a_1 \amp {}+{} \amp {}a_0 \amp {}= \amp \ {}\amp 2\\ {} \amp {} \amp 2a_1 \amp {} \amp {} \amp {}= \amp \ {}\amp 4\\ {} \amp {} \amp 6a_1 \amp {}-{} \amp 3a_0 \amp {}= \amp \ {-}\amp 3 \end{alignat*}

\(\begin{array}{c} \text{ } \\ \text{ } \\ R_3-4R_1\to R_3 \end{array}\)

\begin{equation*} \left[ \begin{array}{rrr|r} 1 \amp -1 \amp 1 \amp 2 \\ 0 \amp 2 \amp 0 \amp 4 \\ 0 \amp 6 \amp -3 \amp -3 \end{array} \right] \end{equation*}

\(\begin{array}{c} \text{ } \\ \text{ } \\ E_3-3E_2\to E_3 \end{array}\)

\begin{alignat*}{5} {}a_2 \amp {}-{} \amp {}a_1 \amp {}+{} \amp {}a_0 \amp {}= \amp \ {}\amp \amp {}2\\ {} \amp {} \amp 2a_1 \amp {} \amp {} \amp {}= \amp \ {}\amp \amp {}4\\ {} \amp {} \amp {} \amp {}-{} \amp 3a_0 \amp {}= \amp \ {-}\amp \amp {15} \end{alignat*}

\(\begin{array}{c} \text{ } \\ \text{ } \\ R_3-3R_2\to R_3 \end{array}\)

\begin{equation*} \left[ \begin{array}{rrr|r} 1 \amp -1 \amp 1 \amp 2 \\ 0 \amp 2 \amp 0 \amp 4 \\ 0 \amp 0 \amp -3 \amp -15 \end{array} \right] \end{equation*}

Now we can solve the last equation for \(a_0\) to find that \(a_0=5\text{.}\) The second equation gives us \(a_1 = 2\text{.}\) Finally, using the first equation with the already determined values of \(a_0\) and \(a_1\) gives us \(a_2=-1\text{.}\) Thus we have found the solution to the polynomial fitting system to be \(a_2=-1\text{,}\) \(a_1=2\text{,}\) and \(a_0=5\text{.}\)
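As a quick check, which is not part of the elimination itself, we can substitute \(a_2=-1\text{,}\) \(a_1=2\text{,}\) and \(a_0=5\) back into the three original equations:

\begin{align*} (-1) - (2) + 5 \amp = 2\\ (-1) + (2) + 5 \amp = 6\\ 4(-1) + 2(2) + 5 \amp = 5\text{.} \end{align*}

All three equations are satisfied.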

We summarize the steps of the (partial) elimination on matrices we used above to solve a general linear system in the variables \(x_1\text{,}\) \(x_2\text{,}\) \(\ldots\text{,}\) \(x_n\text{.}\)

  1. Interchange equations if needed to ensure that the coefficient of \(x_1\) (or, more generally, of the first variable that has a non-zero coefficient in some equation) in the first equation is non-zero.

  2. Use the first equation to eliminate \(x_1\) (or that first variable) from the other equations by adding a suitable multiple of the first equation to each of them.

  3. After \(x_1\) is eliminated from all equations but the first equation, focus on the rest of the equations. Repeat the process of elimination on these equations to eliminate \(x_2\) (or the next variable with a non-zero coefficient) from all of these equations except the second.

  4. Once the process of eliminating variables recursively is finished, solve for the variables in a backwards fashion starting with the last equation and substituting known values in the equations above as they become known.

This elimination method where the variables are eliminated from lower equations is called the forward elimination phase as it eliminates variables in the forward direction. Solving for variables using substitution into upper equations is called back substitution. The matrix representation of a linear system after the forward elimination process is said to be in row echelon form. We will define this form and the elimination process on the matrices more precisely in the next section.
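The four steps summarized above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the function names are hypothetical, exact fractions are used to avoid rounding, and `back_substitute` assumes a square system whose echelon form has a pivot in every column (so there is a unique solution).

```python
from fractions import Fraction


def forward_eliminate(augmented):
    """Return a row echelon form of the augmented matrix (as a new list)."""
    rows = [[Fraction(x) for x in row] for row in augmented]
    m, n = len(rows), len(rows[0]) - 1    # n = number of variables
    pivot_row = 0
    for col in range(n):
        # Step 1: find a row at or below pivot_row with a non-zero entry.
        candidates = [r for r in range(pivot_row, m) if rows[r][col] != 0]
        if not candidates:
            continue                       # no pivot in this column
        r = candidates[0]
        rows[pivot_row], rows[r] = rows[r], rows[pivot_row]
        # Steps 2 and 3: eliminate this variable from the rows below.
        for r in range(pivot_row + 1, m):
            factor = rows[r][col] / rows[pivot_row][col]
            rows[r] = [a - factor * b
                       for a, b in zip(rows[r], rows[pivot_row])]
        pivot_row += 1
        if pivot_row == m:
            break
    return rows


def back_substitute(echelon):
    """Step 4, assuming a square system with a pivot in every column."""
    n = len(echelon[0]) - 1
    x = [Fraction(0)] * n
    for i in reversed(range(n)):
        known = sum(echelon[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (echelon[i][n] - known) / echelon[i][i]
    return x


system = [[1, -1, 1, 2], [1, 1, 1, 6], [4, 2, 1, 5]]
echelon = forward_eliminate(system)
solution = back_substitute(echelon)   # values of a_2, a_1, a_0, in that order
```

Applied to the polynomial fitting system, `echelon` matches the final augmented matrix obtained above and `solution` holds the values \(-1\text{,}\) \(2\text{,}\) and \(5\text{.}\)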

Subsection Linear Systems with Infinitely Many Solutions

Each of the systems that we solved so far has had a unique (exactly one) solution. The geometric representation of linear systems with two equations in two variables shows that this does not always have to be the case. We also have linear systems with no solution and systems with infinitely many solutions. We now consider the problem of how to represent the set of solutions of a linear system that has infinitely many solutions. (Systems with infinitely many solutions will also be of special interest to us a bit later when we study eigenspaces of a matrix.)

Activity 2.3.

Consider the system

\begin{alignat*}{4} {}x_1 \amp {}+{} \amp {2}x_2 \amp {}-{} \amp {}x_3 \amp {}= \amp \ 1\amp {}\\ {}x_1 \amp {}+{} \amp {}x_2 \amp {}-{} \amp {3}x_3 \amp {}= \amp \ 0\amp {}\\ {2}x_1 \amp {}+{} \amp {3}x_2 \amp {}-{} \amp {4}x_3 \amp {}= \amp \ 1\amp {.} \end{alignat*}
(a)

Without explicitly solving the system, check that \((-1,1,0)\) and \((4,-1,1)\) are solutions to this system.

(b)

Without explicitly solving the system, show that \(x_1 = -1+5t\text{,}\) \(x_2 = 1-2t\text{,}\) and \(x_3 = t\) is a solution to this system for any value of \(t\text{.}\) What values of \(t\) yield the solutions \((-1,1,0)\) and \((4,-1,1)\) from part (a)? The equations \(x_1 = -1+5t\text{,}\) \(x_2 = 1-2t\text{,}\) and \(x_3 = t\) form what is called a parametric solution to the system with parameter \(t\).

(c)

Part (b) shows that our system has infinitely many solutions. We were given solutions in part (b) — but how do we find these solutions and how do we know that these are all of the solutions? We address those questions now. If we apply row operations to the augmented matrix

\begin{equation*} \left[ \begin{array}{ccr|c} 1\amp 2\amp -1\amp 1 \\ 1\amp 1\amp -3\amp 0 \\ 2\amp 3\amp -4\amp 1 \end{array} \right] \end{equation*}

of this system, we can reduce this system to one with augmented matrix

\begin{equation*} \left[ \begin{array}{ccr|c} 1\amp 2\amp -1\amp 1 \\ 0\amp 1\amp 2\amp 1 \\ 0\amp 0\amp 0\amp 0 \end{array} \right]\text{.} \end{equation*}
(i)

What is it about this reduced form of the augmented matrix that indicates that the system has infinitely many solutions?

(ii)

Since the system has infinitely many solutions, we will not be able to explicitly determine values for each of the variables. Instead, at least one of the variables can be chosen arbitrarily. What is it about the reduced form of the augmented matrix that indicates that \(x_3\) is convenient to choose as the arbitrary variable?

(iii)

Letting \(x_3\) be arbitrary (we call \(x_3\) a free variable), use the second row to show that \(x_2 = 1-2x_3\) (so that we can write \(x_2\) in terms of the arbitrary variable \(x_3\)).

(iv)

Use the first row to show that \(x_1 = 5x_3-1\) (and we can write \(x_1\) in terms of the arbitrary variable \(x_3\)). Compare this to the solutions from part (b).

After using the elimination method, the first non-zero coefficient (from the left) of each equation in the linear system is in a different position. We call each such coefficient a pivot and a variable corresponding to a pivot a basic variable. In the system

\begin{alignat*}{5} {}a_2 \amp {}-{} \amp {}a_1 \amp {}+{} \amp {}a_0 \amp {}= \amp \ {}\amp \amp {}2\\ {} \amp {} \amp {2}a_1 \amp {} \amp {} \amp {}= \amp \ {}\amp \amp {}4\\ {} \amp {} \amp {} \amp {}-{} \amp {3}a_0 \amp {}= \amp \ {-}\amp \amp {15} \end{alignat*}

the basic variables are \(a_2, a_1, a_0\) for the first, second, and third equations, respectively. In the system,

\begin{alignat*}{4} {}x_1 \amp {}+{} \amp {2}x_2 \amp {}-{} \amp {}x_3 \amp {}= \amp \ 1\\ {} \amp {} \amp {}x_2 \amp {}+{} \amp {2}x_3 \amp {}= \amp \ 1\\ {} \amp {} \amp {} \amp {} \amp {}0 \amp {}= \amp 0 \end{alignat*}

the basic variables are \(x_1\) and \(x_2\) for the first and second equations, respectively, while the third equation does not have a basic variable. Through back-substitution, we can solve for each variable in a unique way if each appears as the basic variable in an equation. If, however, a variable is free, meaning that it is not the basic variable of an equation, we cannot solve for that variable explicitly. We instead assign a distinct parameter to each such free variable and solve for the basic variables in terms of these parameters.

Definition 2.3.

The first non-zero coefficient (from the left) in an equation in a linear system after elimination is called a pivot. A variable corresponding to a pivot is a basic variable, while a variable not corresponding to a pivot is a free variable.
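The definition translates directly into a short procedure: scan each row of the echelon form from the left and record the column of the first non-zero coefficient. The sketch below assumes the matrix is already in row echelon form with the constants in the last column. The name `classify_variables` is hypothetical, and variables are indexed from 0.

```python
def classify_variables(echelon):
    """Split variable indices into basic and free for a row echelon matrix.

    The last column is assumed to hold the constants, so it is skipped.
    Indices start at 0, so index 0 is the first variable.
    """
    n = len(echelon[0]) - 1
    basic = []
    for row in echelon:
        for col in range(n):
            if row[col] != 0:        # first non-zero coefficient: a pivot
                basic.append(col)
                break
    free = [col for col in range(n) if col not in basic]
    return basic, free


# Echelon form of the polynomial fitting system: every variable is basic.
fitting = [[1, -1, 1, 2], [0, 2, 0, 4], [0, 0, -3, -15]]
print(classify_variables(fitting))    # ([0, 1, 2], [])

# Echelon form of the system from Activity 2.3: x_3 (index 2) is free.
activity = [[1, 2, -1, 1], [0, 1, 2, 1], [0, 0, 0, 0]]
print(classify_variables(activity))   # ([0, 1], [2])
```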

Activity 2.4.

Each matrix is an augmented matrix for a linear system after elimination. Identify the basic variables (if any) and free variables (if any). Then write the general solution (if there is a solution) expressing all variables in terms of the free variables. Use any symbols you like for the variables.

(a)

\(\left[ \begin{array}{ccc|c} 1\amp 0\amp 2\amp 1 \\ 0\amp 3\amp 1\amp 0 \\ 0\amp 0\amp 0\amp 0 \end{array} \right]\)

(b)

\(\left[ \begin{array}{ccr|c} 1\amp 0\amp -1\amp 1 \\ 0\amp 0\amp 1\amp 2 \\ 0\amp 0\amp 0\amp 0 \end{array} \right]\)

(c)

\(\left[ \begin{array}{ccrc|c} 1\amp 2\amp -1\amp 1\amp 1 \\ 0\amp 1\amp 0\amp 2\amp 1 \\ 0\amp 0\amp 0\amp 0\amp 0 \\ 0\amp 0\amp 0\amp 0\amp 0 \end{array} \right]\)

Reflection 2.4.

Does the existence of a row of 0's always mean a free variable? Can you think of an example where there is a row of 0's but none of the variables is free? How do the numbers of equations and the variables compare in that case?

Subsection Linear Systems with No Solutions

We saw in the previous section that geometrically two parallel and distinct lines represent a linear system with two equations in two unknowns which has no solution. Similarly, two parallel and distinct planes in three dimensions represent a linear system with two equations in three unknowns which has no solution. We can have at least four different geometric configurations of three planes in three dimensions representing a system with no solution. But how do these geometrical configurations manifest themselves algebraically?

Activity 2.5.

Consider the linear system

\begin{alignat*}{4} x_1 \amp {}-{} \amp x_2 \amp {}+{} \amp {}x_3 \amp {}={} \amp \ 2 \amp {}\\ x_1 \amp {}+{} \amp x_2 \amp {}-{} \amp {3}x_3 \amp {}={} \amp \ 1 \amp {}\\ {3}x_1 \amp {}-{} \amp x_2 \amp {}-{} \amp {}x_3 \amp {}={} \amp \ 6 \amp {.} \end{alignat*}
(a)

Apply the elimination process to the augmented matrix of this system. Write the system of equations that corresponds to the final reduced matrix.

(b)

Discuss which feature in the final simplified system makes it easy to determine that the system has no solution. Similarly, what feature in the matrix representation makes it easy to see that the system has no solution?

We summarize our observations about when a system has a solution, and which of those cases has a unique solution.

Subsection Examples

What follows are worked examples that use the concepts from this section.

Example 2.6.

Consider the linear system

\begin{alignat*}{5} {}x_1 \amp {}-{} \amp { }x_2 \amp {}{} \amp {} \amp {}+{} \amp {2}x_4 \amp {}= \amp \ 1\amp {}\\ {2}x_1 \amp {}+{} \amp {3}x_2 \amp {}-{} \amp {2}x_3 \amp {}+{} \amp {5}x_4 \amp {}= \amp \ 4\amp {}\\ {}x_1 \amp {}-{} \amp {}x_2 \amp {}+{} \amp {}x_3 \amp {}-{} \amp { }x_4 \amp {}= \amp \ 0\amp {}\\ {4}x_1 \amp {}+{} \amp {}x_2 \amp {}-{} \amp {}x_3 \amp {}+{} \amp {6}x_4 \amp {}= \amp \ 5\amp {.} \end{alignat*}
(a)

Set up the augmented matrix for this linear system.

Solution.

The augmented matrix for this system is

\begin{equation*} \left[ \begin{array}{crrr|c} 1\amp -1\amp 0\amp 2\amp 1 \\ 2\amp 3\amp -2\amp 5\amp 4 \\ 1\amp -1\amp 1\amp -1\amp 0 \\ 4\amp 1\amp -1\amp 6\amp 5 \end{array} \right]\text{.} \end{equation*}
(b)

Find all solutions to the system using forward elimination.

Solution.

We apply forward elimination, first making the entries below the 1 in the upper left all 0. We do this by replacing row two with row two minus 2 times row one, row three with row three minus row one, and row four with row four minus 4 times row one. This produces the augmented matrix

\begin{equation*} \left[ \begin{array}{crrr|r} 1\amp -1\amp 0\amp 2\amp 1 \\ 0\amp 5\amp -2\amp 1\amp 2 \\ 0\amp 0\amp 1\amp -3\amp -1 \\ 0\amp 5\amp -1\amp -2\amp 1 \end{array} \right]\text{.} \end{equation*}

Now we eliminate the leading 5 in the fourth row by replacing row four with row four minus row two to obtain the augmented matrix

\begin{equation*} \left[ \begin{array}{crrr|r} 1\amp -1\amp 0\amp 2\amp 1 \\ 0\amp 5\amp -2\amp 1\amp 2 \\ 0\amp 0\amp 1\amp -3\amp -1 \\ 0\amp 0\amp 1\amp -3\amp -1 \end{array} \right]\text{.} \end{equation*}

When we replace row four with row four minus row three, we wind up with a row of zeros:

\begin{equation*} \left[ \begin{array}{crrr|r} 1\amp -1\amp 0\amp 2\amp 1 \\ 0\amp 5\amp -2\amp 1\amp 2 \\ 0\amp 0\amp 1\amp -3\amp -1 \\ 0\amp 0\amp 0\amp 0\amp 0 \end{array} \right]\text{.} \end{equation*}

We see that there is no pivot in column four, so \(x_4\) is a free variable. We can solve for the other variables in terms of \(x_4\text{.}\) The third row shows us that

\begin{align*} x_3 - 3x_4 \amp = -1\\ x_3 \amp = 3x_4 - 1\text{.} \end{align*}

The second row tells us that

\begin{alignat*}{1} 5x_2 - 2x_3 + x_4 \amp = 2\\ 5x_2 \amp = 2x_3 - x_4 + 2\\ 5x_2 \amp = 2(3x_4-1) - x_4 + 2\\ 5x_2 \amp = 5x_4\\ x_2 \amp = x_4\text{.} \end{alignat*}

Finally, the first row gives us

\begin{alignat*}{1} x_1-x_2+2x_4 \amp = 1\\ x_1 \amp = x_2 - 2x_4 + 1\\ x_1 \amp = x_4 - 2x_4 + 1\\ x_1 \amp = -x_4 + 1\text{.} \end{alignat*}

So this system has infinitely many solutions, with \(x_1 = -x_4 + 1\text{,}\) \(x_2 = x_4\text{,}\) \(x_3 = 3x_4 - 1\text{,}\) and \(x_4\) is arbitrary. As a check, notice that

\begin{equation*} (-x_4+1) - x_4 + 2x_4 = 1 \end{equation*}

and so this solution satisfies the first equation in our system. You should check to verify that it also satisfies the other three equations.
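One way to carry out the remaining checks is to substitute the solution into every row of the augmented matrix from part (a) for several values of the free variable. The Python sketch below does this with exact fractions. The helper names are hypothetical, and testing a handful of values is a spot check rather than a proof.

```python
from fractions import Fraction

# Augmented matrix of the system, as set up in part (a).
augmented = [
    [1, -1, 0, 2, 1],
    [2, 3, -2, 5, 4],
    [1, -1, 1, -1, 0],
    [4, 1, -1, 6, 5],
]


def solution(x4):
    """The solution found above, with the free variable x_4 as input."""
    return [-x4 + 1, x4, 3 * x4 - 1, x4]


def satisfies(augmented, x):
    """True when x satisfies every equation encoded in the augmented matrix."""
    return all(
        sum(c * v for c, v in zip(row[:-1], x)) == row[-1]
        for row in augmented
    )


# Try several values of the free variable, including a non-integer one.
samples = [Fraction(0), Fraction(1), Fraction(-3), Fraction(7, 2)]
checks = [satisfies(augmented, solution(t)) for t in samples]
print(checks)   # [True, True, True, True]
```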

(c)

Suppose, after forward elimination, the augmented matrix of the system

\begin{alignat*}{5} {}x_1 \amp {}-{} \amp { }x_2 \amp {}{} \amp {} \amp {}+{} \amp {2}x_4 \amp {}= \amp \ 1\amp {}\\ {2}x_1 \amp {}+{} \amp {3}x_2 \amp {}-{} \amp {2}x_3 \amp {}+{} \amp {5}x_4 \amp {}= \amp \ 4\amp {}\\ {}x_1 \amp {}-{} \amp {}x_2 \amp {}+{} \amp {}x_3 \amp {}-{} \amp { }x_4 \amp {}= \amp \ 0\amp {}\\ {4}x_1 \amp {}+{} \amp {}x_2 \amp {}-{} \amp {}x_3 \amp {}+{} \amp {6}x_4 \amp {}= \amp \ h\amp {.} \end{alignat*}

has the form

\begin{equation*} \left[ \begin{array}{crrr|c} 1\amp -1\amp 0\amp 2\amp 1 \\ 0\amp 5\amp -2\amp 1\amp 2 \\ 0\amp 0\amp 1\amp -3\amp -1 \\ 0\amp 0\amp 0\amp 0\amp h-5 \end{array} \right]\text{.} \end{equation*}

For which values of \(h\) does this system have:

(i)

No solutions?

Solution.

The system has no solutions when there is an equation of the form \(0 = b\) for some nonzero number \(b\text{.}\) The last row will correspond to an equation of the form \(0 = h-5\text{.}\) So our system will have no solutions when \(h \neq 5\text{.}\)

(ii)

A unique solution? Find the solution.

Solution.

When \(h \neq 5\text{,}\) the system has no solutions. When \(h = 5\text{,}\) the variable \(x_4\) is a free variable and the system has infinitely many solutions. So there are no values of \(h\) for which the system has exactly one solution.

(iii)

Infinitely many solutions? Determine all solutions.

Solution.

When \(h = 5\text{,}\) the variable \(x_4\) is a free variable and the system has infinitely many solutions. The solutions were already found in part (b).
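The reasoning in part (c) follows a general pattern: look for a row of the form \(0 = b\) with \(b\) nonzero, and otherwise compare the number of pivots with the number of variables. The Python sketch below illustrates that pattern. It assumes the input is already in row echelon form, and the names `count_solutions` and `echelon_for` are hypothetical.

```python
def count_solutions(echelon):
    """Classify a system from its row echelon augmented matrix.

    Returns "none", "unique", or "infinitely many".
    """
    n = len(echelon[0]) - 1             # number of variables
    pivots = 0
    for row in echelon:
        if any(entry != 0 for entry in row[:-1]):
            pivots += 1                 # the row has a pivot
        elif row[-1] != 0:
            return "none"               # a row of the form 0 = b, b nonzero
    return "unique" if pivots == n else "infinitely many"


def echelon_for(h):
    """The echelon form given above, with last entry h - 5."""
    return [
        [1, -1, 0, 2, 1],
        [0, 5, -2, 1, 2],
        [0, 0, 1, -3, -1],
        [0, 0, 0, 0, h - 5],
    ]


print(count_solutions(echelon_for(5)))   # infinitely many
print(count_solutions(echelon_for(7)))   # none
```

No value of \(h\) produces "unique" here, because the echelon form has at most three pivots for four variables.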

Example 2.7.

After applying row operations to the augmented matrix of a system of linear equations, each of which describes a plane in 3-space, the following augmented matrix was obtained:

\begin{equation*} \left[ \begin{array}{ccc|r} 1\amp a\amp 0\amp 2 \\ 0\amp 2-2a\amp b\amp -4 \\ 0\amp 0\amp 3-\frac{1}{2}b\amp 1 \end{array} \right]\text{.} \end{equation*}
(a)

Describe, algebraically and geometrically, all solutions (if any), to this system when \(a=0\) and \(b=2\text{.}\)

Solution.

Throughout, we will let the variables \(x\text{,}\) \(y\text{,}\) and \(z\) correspond to the first, second, and third columns, respectively, of our augmented matrix.

When \(a=0\) and \(b=2\) our augmented matrix has the form

\begin{equation*} \left[ \begin{array}{ccc|r} 1\amp 0\amp 0\amp 2 \\ 0\amp 2\amp 2\amp -4 \\ 0\amp 0\amp 2\amp 1 \end{array} \right]\text{.} \end{equation*}

This matrix corresponds to the system

\begin{alignat*}{4} {}x \amp {}{} \amp {} \amp {}{} \amp {} \amp {}= \amp \ 2\amp {}\\ {} \amp {}{} \amp {2}y \amp {}+{} \amp {2}z \amp {}= \amp \ -4\amp {}\\ {} \amp {}{} \amp {} \amp {}{} \amp {2}z \amp {}= \amp \ 1\amp {.} \end{alignat*}

There are no equations of the form \(0 = b\) for a nonzero constant \(b\text{,}\) so the system is consistent. There are no free variables, so the system has a unique solution. Algebraically, the solution is \(x = 2\text{,}\) \(z = \frac{1}{2}\text{,}\) and \(y = -\frac{5}{2}\text{.}\) Geometrically, this tells us that the three planes given by the original system intersect in a single point.
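The back-substitution behind these values can be written out as follows. The first equation gives \(x = 2\) directly, the third equation gives \(z\text{,}\) and substituting \(z\) into the second equation gives \(y\text{:}\)

\begin{align*} z \amp = \frac{1}{2}(1) = \frac{1}{2}\\ y \amp = \frac{1}{2}(-4 - 2z) = \frac{1}{2}(-4 - 1) = -\frac{5}{2}\text{.} \end{align*}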

(b)

Describe, algebraically and geometrically, all solutions (if any), to this system when \(a=0\) and \(b=6\text{.}\)

Solution.

Throughout, we will let the variables \(x\text{,}\) \(y\text{,}\) and \(z\) correspond to the first, second, and third columns, respectively, of our augmented matrix.

When \(a=0\) and \(b=6\) our augmented matrix has the form

\begin{equation*} \left[ \begin{array}{ccc|r} 1\amp 0\amp 0\amp 2 \\ 0\amp 2\amp 6\amp -4 \\ 0\amp 0\amp 0\amp 1 \end{array} \right]\text{.} \end{equation*}

The last row corresponds to the equation \(0 = 1\text{,}\) so our system is inconsistent and has no solution. Geometrically, this tells us that the three planes given by the original system have no point in common.

(c)

Describe, algebraically and geometrically, all solutions (if any), to this system when \(a=1\) and \(b=12\text{.}\)

Solution.

Throughout, we will let the variables \(x\text{,}\) \(y\text{,}\) and \(z\) correspond to the first, second, and third columns, respectively, of our augmented matrix.

When \(a=1\) and \(b=12\) our augmented matrix reduces to

\begin{equation*} \left[ \begin{array}{ccc|r} 1\amp 1\amp 0\amp 2 \\ 0\amp 0\amp 1\amp -\frac{1}{3} \\ 0\amp 0\amp 0\amp 0 \end{array} \right]\text{.} \end{equation*}

There are no rows that correspond to equations of the form \(0 = c\) for a nonzero constant \(c\text{,}\) so the system is consistent. The variable \(y\) is a free variable, so the system has infinitely many solutions. Algebraically, the solutions are given by \(z = -\frac{1}{3}\) and \(x = 2-y\text{,}\) where \(y\) is free. Geometrically, this tells us that the three planes given by the original system intersect in the line with \(z = -\frac{1}{3}\) and \(x = 2-y\text{.}\)
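One way to write this line explicitly is to use a parameter \(t\) for the free variable \(y\) (the symbol \(t\) is our choice here):

\begin{equation*} (x, y, z) = \left(2 - t,\ t,\ -\frac{1}{3}\right) = \left(2,\ 0,\ -\frac{1}{3}\right) + t\,(-1,\ 1,\ 0)\text{.} \end{equation*}

Setting \(t = 0\) gives the point \(\left(2, 0, -\frac{1}{3}\right)\) on the line, and the line runs in the direction \((-1, 1, 0)\text{.}\)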

Subsection Summary

  • A matrix is just a rectangular array of numbers or objects.

  • Given a system of linear equations, with the variables listed in the same order in each equation, we represent the system by writing the coefficients of the first equation as the first row of a matrix, the coefficients of the second equation as the second row, and so on. This creates the coefficient matrix of the system. We then augment the coefficient matrix with a column of the constants that appear in the equations. This gives us the augmented matrix of the system.

  • The operations that we can perform on equations translate exactly to row operations that we can perform on an augmented matrix:

    1. Replacing one row by the sum of that row and a scalar multiple of another row.

    2. Interchanging two rows.

    3. Replacing a row by a nonzero scalar multiple of itself.

  • The forward elimination phase of the elimination method recursively eliminates the variables in a linear system to reach an equivalent but simplified system.

  • The first non-zero coefficient (from the left) in an equation in a linear system after elimination is called a pivot.

  • A basic variable in a linear system corresponds to a pivot of the system. A free variable is a variable that is not basic.

  • A linear system can be inconsistent (no solutions), have a unique solution (if consistent and every variable is a basic variable), or have infinitely many solutions (if consistent and there is a free variable).

  • A linear system has no solutions if, after elimination, there is an equation of the form \(0=b\) where \(b\) is a nonzero number.

  • A linear system after the elimination method can be solved using back-substitution. The free variables can be chosen arbitrarily and the basic variables can be solved in terms of the free variables through the back-substitution process.

Exercises

1.

Consider the system of linear equations whose augmented matrix is

\begin{equation*} \left[ \begin{array}{cc|r} 1 \amp 3 \amp -1 \\ 2\amp h \amp k \end{array} \right] \end{equation*}

where \(h\) and \(k\) are unknown constants. For which values of \(h\) and \(k\) does this system have

(a)

a unique solution,

(b)

infinitely many solutions,

(c)

no solution?

2.

Consider the following system:

\begin{alignat*}{5} {}x \amp {}-{} \amp {2}y \amp {}+{} \amp {}z \amp {}={} \amp \ {-}\amp 1\amp {}\\ {-}x \amp {}+{} \amp {}y \amp {}-{} \amp {3}z \amp {}={} \amp \ {}\amp 2\amp {}\\ {}x \amp {}+{} \amp {h}y \amp {}-{} \amp {}z \amp {}={} \amp \ {}\amp 0\amp {.} \end{alignat*}

Check that when \(h=-3\) the system has infinitely many solutions, while when \(h\neq -3\) the system has a unique solution.

3.

If possible, find a system of three equations (not in reduced form) in three variables whose solution set consists only of the point \(x_1=2, x_2=-1, x_3=0\text{.}\)

4.

What are the possible geometrical descriptions of the solution set of two linear equations in \(\R^3\text{?}\) (Recall that \(\R^3\) is the three-dimensional \(xyz\)-space — that is, the set of all ordered triples of the form \((x,y,z)\)).

5.

Two students are talking about when a linear system has infinitely many solutions.

Student 1: So, if we have a linear system whose augmented matrix has a row of zeros, then the system has infinitely many solutions, doesn't it?
Student 2: Well, but what if there is a row of the form \([\, 0\, 0\, \ldots\, 0\, |\, b\, ]\) with a non-zero \(b\) right above the row of 0's?
Student 1: OK, maybe I should ask “If we have a consistent linear system whose augmented matrix has a row of zeros, then the system has infinitely many solutions, doesn't it?”
Student 2: I don't know. It still doesn't sound enough to me, but I'm not sure why.
Is Student 1 right? Or is Student 2's hunch correct? Justify your answer with a specific example if possible.

6.

Label each of the following statements as True or False. Provide justification for your response.

(a) True/False.

A system of linear equations in two unknowns can have exactly five solutions.

(b) True/False.

A system of equations with all the right hand sides equal to 0 has at least one solution.

(c) True/False.

A system of equations where there are fewer equations than the number of unknowns (known as an underdetermined system) cannot have a unique solution.

(d) True/False.

A system of equations where there are more equations than the number of unknowns (known as an overdetermined system) cannot have a unique solution.

(e) True/False.

A consistent system of two equations in three unknowns cannot have a unique solution.

(f) True/False.

If a system with three equations and three unknowns has a solution, then the solution is unique.

(g) True/False.

If a system of equations has two different solutions, then it has infinitely many solutions.

(h) True/False.

If there is a row of zeros in the row echelon form of the augmented matrix of a system of equations, the system has infinitely many solutions.

(i) True/False.

If there is a row of zeros in the row echelon form of the augmented matrix of a system of \(n\) equations in \(n\) variables, the system has infinitely many solutions.

(j) True/False.

If a system has no free variables, then the system has a unique solution.

(k) True/False.

If a system has a free variable, then the system has infinitely many solutions.

Subsection Project: Polynomial Interpolation to Approximate the Area Under a Curve

Suppose we want to approximate the area of the region shown in Figure 2.8. As discussed in the introduction, we can approximate the area under a curve by approximating the curve by quadratics. First, we will see how Archimedes approached the problem of finding the area of a quadratic region, then we will see how to find a quadratic function that passes through three given points, and finally we will put it all together to approximate the area under a curve as in Figure 2.8.

Figure 2.8. A region whose area we want to approximate.

Archimedes approached the problem of calculating the area of a quadratic region in the following way. Given a quadratic \(q(x)\) on an interval \([a,b]\text{,}\) Archimedes drew in a base given by the secant line connecting the points \((a,q(a))\) and \((b,q(b))\) as illustrated at left in Figure 2.9. Then he found the point in the interval \([a,b]\) at which the tangent line to the curve is parallel to the secant line. Archimedes gave an argument using mechanics (based on balance points), and then another using geometry (through a method of exhaustion) to show that the area of the parabolic region is \(\frac{4}{3}\) times the area of the triangle determined by the endpoints and the point of tangency as shown at right in Figure 2.9.

Figure 2.9. Archimedes' method.

Although we won't go through the details, a conclusion we can draw from Archimedes' argument, using the formula for the area of a rectangle and the area of a triangle, is that the area between the graph of a quadratic with equation \(q(x) = ax^2+bx+c\) and the \(x\)-axis on an interval \([x_1, x_2]\) as illustrated in Figure 2.10 is

\begin{equation} \frac{a}{3}(x_2^3-x_1^3) + \frac{b}{2}(x_2^2-x_1^2) + c(x_2-x_1)\text{.}\tag{2.3} \end{equation}

Figure 2.10. Region between a parabola and the \(x\)-axis.
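Formula (2.3) is easy to evaluate mechanically. The following is a small Python sketch; the function name `area_under_quadratic` is hypothetical, and exact fractions are used only to avoid rounding. It assumes the quadratic is nonnegative on the interval, so that the value of (2.3) is an area in the usual sense.

```python
from fractions import Fraction


def area_under_quadratic(a, b, c, x1, x2):
    """Evaluate formula (2.3) for q(x) = a*x**2 + b*x + c on [x1, x2]."""
    a, b, c, x1, x2 = (Fraction(v) for v in (a, b, c, x1, x2))
    return a / 3 * (x2**3 - x1**3) + b / 2 * (x2**2 - x1**2) + c * (x2 - x1)


# q(x) = x^2 on [0, 3]: (1/3)(27 - 0) = 9
print(area_under_quadratic(1, 0, 0, 0, 3))  # 9
```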

To approximate the area under the graph of a function, we will approximate the function itself with a collection of quadratics, and then use equation (2.3) repeatedly. To do this, we need to know how to fit a quadratic curve to three points. We consider that question now.

Suppose we are given a collection of three points in the plane with distinct \(x\)-coordinates: \((x_1, y_1), (x_2, y_2)\) and \((x_3, y_3)\text{.}\) There is exactly one polynomial \(p(x)\) of degree at most two which goes through these points, i.e., there is exactly one such \(p(x)\) with \(p(x_i)=y_i\) for each \(x_i\text{.}\) When the three points do not lie on a common line, \(p(x)\) is a quadratic. This is an example of polynomial curve fitting.

As an example, we use the points \((-1, 2)\text{,}\) \((1,6)\text{,}\) \((2,5)\text{.}\) To fit a quadratic to these points, consider a general quadratic of the form \(p(x)=a_2x^2+a_1x+a_0\text{.}\) By substituting the \(x\) value of each of the given points and setting that equal to the \(y\) value of that point, we find three equations

\begin{equation*} (-1)^2a_2-a_1+a_0=2 \; , \; a_2+a_1+a_0=6 \; , \; (2)^2a_2 +2a_1+a_0=5 \end{equation*}

that give us a system of three equations in the three unknowns \(a_2\text{,}\) \(a_1\text{,}\) and \(a_0\text{:}\)

\begin{alignat*}{3} {}a_2 \amp {}-{} \amp {}a_1 \amp {}+{} \amp {}a_0 \amp = 2\\ {}a_2 \amp {}+{} \amp {}a_1 \amp {}+{} \amp {}a_0 \amp = 6\\ {4}a_2 \amp {}+{} \amp {2}a_1 \amp {}+{} \amp {}a_0 \amp = 5\text{.} \end{alignat*}

This system is the example we considered in Preview Activity 2.1, whose solution is \(a_2 = -1\text{,}\) \(a_1=2\text{,}\) and \(a_0 = 5\text{.}\) A graph of \(p(x) = -x^2+2x+5\) along with the three points \((-1, 2)\text{,}\) \((1,6)\text{,}\) \((2,5)\) is shown in Figure 2.11.

Figure 2.11. A quadratic fit to the points \((-1, 2)\text{,}\) \((1,6)\text{,}\) \((2,5)\text{.}\)
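The fitting procedure can be sketched in Python as follows. This is an illustrative sketch, not the text's own implementation: the function name `fit_quadratic` is hypothetical, exact fractions are used to avoid rounding, and the three \(x\)-values are assumed to be distinct. Each point \((x, y)\) contributes the row \([x^2, x, 1 \mid y]\) to the augmented matrix, which is then solved by forward elimination and back-substitution.

```python
from fractions import Fraction


def fit_quadratic(points):
    """Return (a2, a1, a0) for the quadratic through three points.

    Assumes the three x-values are distinct, so every pivot search succeeds.
    """
    # Each point (x, y) contributes the row [x^2, x, 1 | y].
    m = [[Fraction(x) ** 2, Fraction(x), Fraction(1), Fraction(y)] for x, y in points]
    n = 3
    # Forward elimination.
    for col in range(n):
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            m[r] = [x - factor * p for x, p in zip(m[r], m[col])]
    # Back-substitution, from the last unknown up to the first.
    coeffs = [Fraction(0)] * n
    for i in reversed(range(n)):
        known = sum(m[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (m[i][n] - known) / m[i][i]
    return tuple(coeffs)


a2, a1, a0 = fit_quadratic([(-1, 2), (1, 6), (2, 5)])
print(a2, a1, a0)  # -1 2 5
```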

Now that we know how to fit a quadratic to three points, we next approximate a curve with a collection of quadratics. The method we use is to break the interval on which our curve is defined into several subintervals and create quadratics on each subinterval. The basic idea is contained in our first project activity.

Project Activity 2.6.

In this activity we model the function \(f\) defined by \(f(x) = \sin(2x)+2\) on the interval \([a,b]\text{,}\) where \(a = -\frac{\pi}{2}\) and \(b = \pi\text{,}\) with a collection of quadratics. We divide the interval \([a,b]\) using the seven points \(x_0 = -\frac{\pi}{2}\text{,}\) \(x_1 = -\frac{\pi}{4}\text{,}\) \(x_2 = 0\text{,}\) \(x_3 = \frac{\pi}{4}\text{,}\) \(x_4 = \frac{\pi}{2}\text{,}\) \(x_5 = \frac{3 \pi}{4}\text{,}\) and \(x_6 = \pi\text{.}\) We need three points to determine a quadratic, so the three subintervals of \([a,b]\) on which we fit quadratics will be the intervals \([x_0, x_2]\text{,}\) \([x_2, x_4]\text{,}\) and \([x_4,x_6]\text{.}\) An illustration of the process of dividing our interval \([a,b]\) and approximating by quadratics can be found at geogebra.org/m/spd4hhbw. Round all calculations in this activity to the nearest thousandth.

(a)

Set up a system of linear equations to fit a quadratic \(q_1(x) = r_1x^2+s_1x+t_1\) to the three points \((x_0, f(x_0))\text{,}\) \((x_1, f(x_1))\text{,}\) and \((x_2, f(x_2))\text{.}\) (The solution to this system to the nearest thousandth is \(r_1 \approx 1.621\text{,}\) \(s_1 \approx 2.546\text{,}\) and \(t_1 = 2\text{.}\))

(b)

Set up a system of linear equations to fit a quadratic \(q_2(x) = r_2x^2+s_2x+t_2\) to the three points \((x_2, f(x_2))\text{,}\) \((x_3, f(x_3))\text{,}\) and \((x_4, f(x_4))\text{.}\) (The solution to this system to the nearest thousandth is \(r_2 \approx -1.621\text{,}\) \(s_2 \approx 2.546\text{,}\) and \(t_2 = 2\text{.}\))

(c)

Set up a system of linear equations to fit a quadratic \(q_3(x) = r_3x^2+s_3x+t_3\) to the three points \((x_4, f(x_4))\text{,}\) \((x_5, f(x_5))\text{,}\) and \((x_6, f(x_6))\text{.}\) (The solution to this system to the nearest thousandth is \(r_3 \approx 1.621\text{,}\) \(s_3 \approx -7.639\text{,}\) and \(t_3 = 10\text{.}\))

(d)

Use the GeoGebra applet at geogebra.org/m/spd4hhbw to graph the three quadratics on their intervals on the same axes as the graph of \(f\text{.}\) Explain what you see.
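The three systems in this activity can also be set up and solved numerically. The following is a minimal Python sketch under stated assumptions: it takes \(f(x) = \sin(2x)+2\) as defined at the start of the activity, the helper name `solve_linear_system` is hypothetical, and Gauss-Jordan elimination with partial pivoting in floating point is one reasonable choice of solver rather than a method prescribed by the text.

```python
import math


def solve_linear_system(augmented):
    """Solve a square linear system by Gauss-Jordan elimination (floats).

    Assumes the system has a unique solution.
    """
    m = [list(map(float, row)) for row in augmented]
    n = len(m)
    for col in range(n):
        # Partial pivoting: use the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [v / m[col][col] for v in m[col]]  # scale the pivot to 1
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [x - factor * p for x, p in zip(m[r], m[col])]
    return [row[n] for row in m]


def f(x):
    return math.sin(2 * x) + 2


xs = [-math.pi / 2 + k * math.pi / 4 for k in range(7)]  # x_0, ..., x_6
for i in (0, 2, 4):
    # Each point (x, f(x)) gives the row [x^2, x, 1 | f(x)].
    rows = [[x**2, x, 1, f(x)] for x in xs[i:i + 3]]
    r, s, t = solve_linear_system(rows)
    print(f"[x_{i}, x_{i + 2}]: r = {r:.3f}, s = {s:.3f}, t = {t:.3f}")
```

The printed values can be compared against the coefficients quoted in parts (a) through (c).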

Project Activity 2.6 illustrates how we can model a function on an interval using a sequence of quadratic functions. Now we apply this polynomial curve fitting technique to derive the general formula for approximating the area between a graph of a function \(f\) and the \(x\)-axis. We use parabolic arcs to approximate the graph of \(f\) on each subinterval.

We start by dividing the interval \([a,b]\) over which our function is defined into some number of subintervals. We need an even number of subintervals, since each parabola is determined by three consecutive points and so spans two adjacent subintervals. Let \(n = 2m\) be the number of subintervals we use. In order to make the calculations a bit easier, let the subintervals all have the same length, which we denote by \(\Delta x\) (the symbol \(\Delta\) is often used in mathematics to indicate a change in a quantity). Since we have \(n\) subintervals, the length of each subinterval will be \(\Delta x = \frac{b-a}{n}\text{.}\) For each \(k\) we let \(x_k = a+k \Delta x\) and \(y_k = f(x_k)\text{.}\) Note that \(x_0 = a\) and \(x_n=b\text{.}\) This labeling scheme is illustrated in Figure 2.12.

Figure 2.12. Subdividing the interval \([a,b]\text{.}\)

We approximate \(f\) on each subinterval using a quadratic. So we need to find the quadratic \(Q(x) = c_2x^2+c_1x+c_0\) that passes through two consecutive end points as well as the midpoint of a subinterval. That is, we need to find the coefficients of \(Q\) so that \(Q\) passes through the points \((x_k,y_k)\text{,}\) \((x_{k+2}, y_{k+2})\text{,}\) and the midpoint \((x_{k+1},y_{k+1})\) on the interval \([x_k, x_{k+2}]\) (so that we have three points to which to fit a parabola) as shown at left in Figure 2.13. Note that the length of the interval \([x_k, x_{k+2}]\) is \(2\Delta x\text{.}\) To make the calculations easier, we will translate our function so that our leftmost point is \((-r, y_k)\text{.}\) Then the middle point is \((0, y_{k+1})\) and the rightmost point is \((r, y_{k+2})\) as illustrated at right in Figure 2.13, where \(r = \Delta x\text{.}\)

Figure 2.13. Left: Three points. Right: Translated points.

Project Activity 2.7.

(a)

Set up a linear system that will determine the coefficients \(c_2\text{,}\) \(c_1\text{,}\) and \(c_0\) so that the polynomial \(Q(x) = c_2x^2+c_1x+c_0\) passes through the points \((-r, y_k)\text{,}\) \((0, y_{k+1})\text{,}\) and \((r, y_{k+2})\) with \(r \neq 0\text{.}\) Remember that the unknowns in this system are \(c_2\text{,}\) \(c_1\text{,}\) and \(c_0\text{.}\)

(b)

Explain why the coefficient matrix of the system in Task 2.7.a is \(\left[ \begin{array}{crc} r^2\amp -r\amp 1 \\ 0\amp 0\amp 1 \\ r^2\amp r\amp 1 \end{array} \right]\text{.}\) Then explain why row reducing the matrix \(\left[ \begin{array}{crcc} r^2\amp -r\amp 1\amp y_k \\ 0\amp 0\amp 1\amp y_{k+1} \\ r^2\amp r\amp 1\amp y_{k+2} \end{array} \right]\) will find the coefficients we want. Assume that a row echelon form of the matrix \(\left[ \begin{array}{crcc} r^2\amp -r\amp 1\amp y_k \\ 0\amp 0\amp 1\amp y_{k+1} \\ r^2\amp r\amp 1\amp y_{k+2} \end{array} \right]\) is

\begin{equation*} \left[ \begin{array}{crcc} r^2\amp -r\amp 1\amp y_k \\ 0\amp 2r\amp 0\amp y_{k+2}-y_k \\ 0\amp 0\amp 1\amp y_{k+1} \end{array} \right]\text{.} \end{equation*}

Use these matrices to explain why \(c_2=\frac{y_k-2y_{k+1}+y_{k+2}}{2r^2}\text{,}\) \(c_1=\frac{y_{k+2}-y_k}{2r}\text{,}\) and \(c_0 = y_{k+1}\text{.}\)

(c)

Our goal is to ultimately approximate the area under the curve on the interval \([a,b]\) by approximating \(f\) with quadratics on each subinterval. Use the Archimedean formula (2.3) to show that the area under the quadratic \(Q(x)\) from part (a) is

\begin{equation*} \frac{1}{3} \left(y_k+4y_{k+1}+y_{k+2}\right) \Delta x\text{.} \end{equation*}

(d)

Now add up all of the area approximations on each subinterval to show that the approximate area under the graph is given by the formula

\begin{equation} S(n) = \left(y_0 + 4y_1 + 2y_2 + 4y_3 + 2y_4 + \cdots + 2y_{n-2} + 4y_{n-1} + y_n\right) \frac{\Delta x}{3}\text{.}\tag{2.4} \end{equation}
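Formula (2.4) translates directly into a short function. The following Python sketch is illustrative only; the function name `simpson` and its argument order are choices made here, not taken from the text. The weights \(1, 4, 2, 4, \ldots, 2, 4, 1\) are applied to the values \(y_k = f(x_k)\) and the sum is multiplied by \(\frac{\Delta x}{3}\text{.}\)

```python
from fractions import Fraction


def simpson(f, a, b, n):
    """Approximate the area under f on [a, b] with formula (2.4).

    n is the number of equal-width subintervals and must be even.
    """
    if n <= 0 or n % 2 != 0:
        raise ValueError("n must be a positive even integer")
    dx = (b - a) / n
    ys = [f(a + k * dx) for k in range(n + 1)]
    # Weights follow the pattern 1, 4, 2, 4, ..., 2, 4, 1.
    total = ys[0] + ys[n]
    total += 4 * sum(ys[1:n:2])  # odd-indexed points
    total += 2 * sum(ys[2:n:2])  # interior even-indexed points
    return total * dx / 3


# Exact arithmetic with Fractions: x^3 on [0, 2] with n = 4.
print(simpson(lambda x: x**3, Fraction(0), Fraction(2), 4))  # 4
```

In this example the approximation agrees with the exact area, since this rule is known to be exact for polynomials of degree at most three. For other functions the result is only an approximation, and it generally improves as \(n\) grows.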

We conclude with an example.

Project Activity 2.8.

Let \(f(x) = \left(\frac{1}{2}\right)^x\) on the interval \([1,6]\text{.}\) A graph of \(f\) is shown in Figure 2.8. Use our approximation formula with \(n=8\) to approximate the area of the shaded region in Figure 2.8. Show all of your work and round all calculations to the nearest thousandth.

You should note that not every author uses this convention — when they do not, it is important that you be careful to understand if the matrix has an augmented column or not.
If there had been an \(a_0\) term in the second equation, we could have substituted \(a_0=5\) and solved for \(a_1\text{.}\)