
Section 8 Matrix Operations

Subsection Application: Algorithms for Matrix Multiplication

Matrix multiplication is widely used in applications ranging from scientific computing and pattern recognition to counting paths in graphs. As a consequence, much work has gone into developing efficient algorithms for matrix multiplication.

We will see that a matrix product can be calculated through the row-column method. Recall that the product of two 2ร—2 matrices A=[a11a12a21a22] and B=[b11b12b21b22] is given by

AB=[a11b11+a12b21a11b12+a12b22a21b11+a22b21a21b12+a22b22].

This product involves eight scalar multiplications and some scalar additions. As we will see, multiplication is more computationally expensive than addition, so we will focus on multiplication. In 1969, a German mathematician named Volker Strassen showedโ€‰13โ€‰ that the product of two 2ร—2 matrices can be calculated using only seven multiplications. While this is not much of an improvement, the Strassen algorithm can be applied to larger matrices, using matrix partitions (which allow for parallel computation), and its publication led to additional research on faster algorithms for matrix multiplication. More details are provided later in this section.

Subsection Introduction

A vector is a list of numbers in a specified order and a matrix is an ordered array of objects. In fact, a vector can be thought of as a matrix of size nร—1. Vectors and matrices are so alike in this way that it would seem natural that we can define operations on matrices just as we did with vectors.

Recall that a matrix is made of rows and columns โ€” the entries reading from left to right form the rows of the matrix and the entries reading from top to bottom form the columns. The number of rows and columns of a matrix is called the size of the matrix, so an mร—n matrix has m rows and n columns. If we label the entry in the ith row and jth column of a matrix A as aij, then we write A=[aij].

We can generalize the operations of addition and scalar multiplication on vectors to matrices similarly. Given two matrices A=[aij] and B=[bij] of the same size, we define the sum A+B by

A+B=[aij+bij]

In other words, matrix addition is defined entrywise: for matrices of the same size, we add the corresponding entries. For example,

[12โˆ’23]+[1324]=[2507].

We define the scalar multiple of a matrix A=[aij] by scalar c to be the matrix cA defined by

cA=[caij].

This means that we multiply each entry of the matrix A by the scalar c. As an example,

3[12โˆ’23]=[36โˆ’69].
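These entrywise definitions are easy to check with software. As a quick illustration of the two examples above (using Python and NumPy, which the text itself does not use and are assumed here only for checking):

```python
import numpy as np

# Entrywise sum and scalar multiple of the example matrices above.
A = np.array([[1, 2], [-2, 3]])
B = np.array([[1, 3], [2, 4]])

print(A + B)   # [[2, 5], [0, 7]]
print(3 * A)   # [[3, 6], [-6, 9]]
```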

Even though we did not have a multiplication operation on vectors, we had a matrix-vector product, which is a special case of a matrix-matrix product since a vector is a matrix with one column. However, generalizing the matrix-vector product to a matrix-matrix product is not immediate, since it is not obvious what to do with the remaining columns of the second matrix. We will consider this question in this section.

Note that all of the matrix operations can be performed on a calculator. After entering each matrix in the calculator, just use +, โˆ’ and ร— operations to find the result of the matrix operation. (Just for fun, try using รท with matrices to see if it will work.)

Preview Activity 8.1.

(a)

Pick three pairs of matrices A and B, of varying sizes, which can be added. For each pair:

(i)

Calculate the sums A+B and B+A.

(ii)

How are the two matrices A+B and B+A related? What does this tell us about matrix addition?

(b)

Let A=[10โˆ’28], B=[1134], and C=[0โˆ’516]. Determine the entries of the matrix A+2Bโˆ’7C.

(c)

Now we turn to multiplication of matrices. Our first goal is to find out what conditions we need on the sizes of matrices A and B for the matrix-matrix product AB to be defined, and what the size of the resulting product matrix is. We know the condition and the size of the result in the special case of B being a vector, i.e., a matrix with one column. So our conjectures for the general case should match what we know in the special case. In each part of this problem, use any appropriate tool (e.g., your calculator, Maple, Mathematica, Wolfram|Alpha) to determine the matrix product AB, if it exists. If you obtain a product, write it down and explain how its size is related to the sizes of A and B. If you receive an error, write down the error and guess why the error occurred and/or what it means.

(i)

A=[120011]    and    B=[3500โˆ’21]

(ii)

A=[120011]    and    B=[305โˆ’201]

(iii)

A=[1234]    and    B=[111101020]

(iv)

A=[12345678]    and    B=[123โˆ’111]

(v)

Make a guess for the condition on the sizes of two matrices A,B for which the product AB is defined. How is the size of the product matrix related to the sizes of A and B?

(d)

The matrix products in problem 8.1.c, when defined, might at first seem unrelated to the individual matrices. In this problem, we will uncover this relationship using our knowledge of the matrix-vector product. Let A=[3−1−23] and B=[021132].

(ii)

Using the matrix-vector product, calculate Ax where x is the first column of B (i.e., calculate A[01]), then where x is the second column of B (i.e., calculate A[23]), and then where x is the third column of B (i.e., calculate A[12]). Do you notice these output vectors within AB?

(iii)

Describe as best you can a definition of AB using the matrix-vector product.

Subsection Properties of Matrix Addition and Multiplication by Scalars

Just as we were able to define an algebra of vectors with addition and multiplication by scalars, we can define an algebra of matrices. We will see that the properties of these operations on matrices are immediate generalizations of the properties of the operations on vectors. We will then see how the matrix product arises through the connection of matrices to linear transformations. Finally, we define the transpose of a matrix. The transpose of a matrix will be useful in applications such as graph theory and least-squares fitting of curves, as well as in advanced topics such as inner product spaces and the dual space of a vector space.

We learned in Preview Activity 8.1 that we can add two matrices of the same size together by adding corresponding entries and we can multiply any matrix by a scalar by multiplying each entry of the matrix by that scalar. More generally, if A=[aij] and B=[bij] are mร—n matrices and c is any scalar, then

A+B=[aij+bij]   and   cA=[caij].

As we have done each time we have introduced a new operation, we ask what properties the operation has. For example, you determined in Preview Activity 8.1 that addition of matrices is a commutative operation. More specifically, for every two mร—n matrices A and B, A+B=B+A. We can use similar arguments to verify the following properties of matrix addition and multiplication by scalars. Notice that these properties are very similar to the properties of addition and scalar multiplication of vectors we discussed earlier. This should come as no surprise since the n-dimensional vectors are nร—1 matrices. In a strange twist, we will see that matrices themselves can be considered as vectors when we discuss vector spaces in a later section.

Later on, we will see that these properties define the set of all mร—n matrices as a vector space. These properties just say that, regarding addition and multiplication by scalars, we can manipulate matrices just as we do real numbers. Note, however, we have not yet defined an operation of multiplication on matrices. That is the topic for the next section.
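As a numerical illustration (not a proof), a few of these familiar properties can be spot-checked on randomly chosen matrices; the sketch below assumes Python with NumPy, which the text itself does not use.

```python
import numpy as np

# Spot-check some properties of matrix addition and scalar multiplication
# on random 3x4 integer matrices. A check, not a proof.
rng = np.random.default_rng(0)
A, B, C = (rng.integers(-5, 6, size=(3, 4)) for _ in range(3))
c, d = 2, -3

print(np.array_equal(A + B, B + A))                # addition is commutative
print(np.array_equal((A + B) + C, A + (B + C)))    # addition is associative
print(np.array_equal(c * (A + B), c * A + c * B))  # scalars distribute over matrix sums
print(np.array_equal((c + d) * A, c * A + d * A))  # matrices distribute over scalar sums
```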

Subsection A Matrix Product

As we saw in Preview Activity 8.1, a matrix-matrix product can be found in a way which makes use of and also generalizes the matrix-vector product.

Definition 8.2.

The matrix product of a kร—m matrix A and an mร—n matrix B=[b1 b2 โ‹ฏ bn] with columns b1, b2, โ€ฆ, bn is the kร—n matrix

[Ab1 Ab2 โ‹ฏ Abn].
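Read literally, Definition 8.2 says that we build AB one column at a time: column j of AB is A times column j of B. A minimal sketch of that computation (Python/NumPy assumed), using matrices from Preview Activity 8.1:

```python
import numpy as np

# Column-by-column matrix product, following Definition 8.2.
A = np.array([[1, 2, 0], [0, 1, 1]])       # 2x3
B = np.array([[3, 0], [5, -2], [0, 1]])    # 3x2

columns = [A @ B[:, j] for j in range(B.shape[1])]   # A b_1, A b_2
AB = np.column_stack(columns)

print(AB)                          # [[13, -4], [5, -1]]
print(np.array_equal(AB, A @ B))   # True: agrees with the built-in product
```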

We now consider the motivation behind this definition by thinking about the matrix transformations corresponding to each of the matrices A, B and AB. Recall that left multiplication by an m×n matrix B defines a transformation T from Rn to Rm by T(x)=Bx. The domain of T is Rn because the number of components of x has to match the number of entries in each row of B in order for the matrix-vector product Bx to be defined. Similarly, a k×m matrix A defines a transformation from Rm to Rk. Since transformations are functions, we can compose them as long as the output vectors of the inside transformation lie in the domain of the outside transformation. Therefore, if T is the inside transformation and S is the outside transformation, the outputs of T lie in the domain of S and the composition S∘T is defined. So a natural question to ask is if we are given

  • a transformation T from Rn to Rm where T(x)=Bx for an mร—n matrix B and

  • a transformation S from Rm to Rk with S(y)=Ay for some kร—m matrix A,

is there a matrix that represents the transformation Sโˆ˜T defined by (Sโˆ˜T)(x)=S(T(x))? We investigate this question in the next activity in the special case of a 2ร—3 matrix A and a 3ร—2 matrix B.

Activity 8.2.

In this activity, we look for the meaning of the matrix product from a transformation perspective. Let S and T be matrix transformations defined by

S(y)=Ay    and    T(x)=Bx,

where

A=[120011]    and    B=[305โˆ’201].
(a)

What are the domains and codomains of S and T? Why is the composite transformation Sโˆ˜T defined? What is the domain of Sโˆ˜T? What is the codomain of Sโˆ˜T? (Recall that Sโˆ˜T is defined by (Sโˆ˜T)(x)=S(T(x)), i.e., we substitute the output T(x) as the input into the transformation S.)

(b)

Let x=[xy]. Determine the components of T(x).

(c)

Find the components of Sโˆ˜T(x)=S(T(x)).

(d)

Find a matrix C so that S(T(x))=Cx.

(e)

Use the definition of composition of transformations and the definitions of the S and T transformations to explain why it is reasonable to define AB to be the matrix C. Does the matrix C agree with the

AB=[13โˆ’45โˆ’1]

you found in Preview Activity 8.1 using technology?

We now consider this result in the general case of a kร—m matrix A and an mร—n matrix B, where A and B define matrix transformations S and T, respectively. In other words, S and T are matrix transformations defined by S(x)=Ax and T(x)=Bx. The domain of S is Rm and the codomain is Rk. The domain of T is Rn and the codomain is Rm. The composition Sโˆ˜T is defined because the output vectors of T are in Rm and they lie in the domain of S. The domain of Sโˆ˜T is the same as the domain of T since the input vectors first go through the T transformation. The codomain of Sโˆ˜T is the same as the codomain of S since the final output vectors are produced by applying the S transformation.

Let us see how we can obtain the matrix corresponding to the transformation Sโˆ˜T. Let B=[b1 b2 โ‹ฏ bn], where bj is the jth column of B, and let x=[x1x2โ‹ฎxn]. Recall that the matrix vector product Bx is the linear combination of the columns of B with the corresponding weights from x. So

T(x)=Bx=x1b1+x2b2+โ‹ฏ+xnbn.

Note that each of the bj vectors are in Rm since B is an mร—n matrix. Therefore, each of these vectors can be multiplied by matrix A and we can evaluate S(Bx). Therefore, Sโˆ˜T is defined and

(8.1)(Sโˆ˜T)(x)=S(T(x))=A(Bx)=A(x1b1+x2b2+โ‹ฏ+xnbn).

The properties of matrix-vector products show that

(8.2)A(x1b1+x2b2+โ‹ฏ+xnbn)=x1Ab1+x2Ab2+โ‹ฏ+xnAbn.

This expression is a linear combination of Abi's with xi's being the weights. Therefore, if we let C be the matrix with columns Ab1, Ab2, โ€ฆ, Abn, that is

C=[Ab1 Ab2 โ‹ฏ Abn],

then

(8.3)x1Ab1+x2Ab2+โ‹ฏ+xnAbn=Cx

by definition of the matrix-vector product. Combining equations (8.1), (8.2), and (8.3) shows that

(Sโˆ˜T)(x)=Cx

where C=[Ab1 Ab2 โ‹ฏ Abn].

Also note that since T(x)=Bx and S(y)=Ay, we find

(8.4)(S∘T)(x)=S(T(x))=S(Bx)=A(Bx).

Since the matrix representing the transformation Sโˆ˜T is the matrix

[Ab1 Ab2 โ‹ฏ Abn]

where b1, b2, โ€ฆ, bn are the columns of the matrix B, it is natural to define AB to be this matrix in light of equation (8.4).
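In other words, the matrix AB is built exactly so that multiplying by AB does in one step what B and then A do in two. A quick numerical check of this defining property, with the matrices of Activity 8.2 (NumPy assumed):

```python
import numpy as np

# (S o T)(x) = A(Bx) equals (AB)x for every x.
A = np.array([[1, 2, 0], [0, 1, 1]])       # S(y) = Ay
B = np.array([[3, 0], [5, -2], [0, 1]])    # T(x) = Bx

x = np.array([4, -7])
print(A @ (B @ x))      # apply T, then S
print((A @ B) @ x)      # multiply by AB directly; the same vector
```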

Matrix multiplication has some properties that are unfamiliar to us as the next activity illustrates.

Activity 8.3.

Let A = [3โˆ’1โˆ’26], B = [0213], C = [1111], D = [3โˆ’3โˆ’33] and E = [1001].

(a)

Find the indicated products (by hand or using a calculator).

AB, BA, DC, AC, BC, AE, EB
(b)

Is matrix multiplication commutative? Explain.

(c)

Is there an identity element for matrix multiplication? In other words, is there a matrix I for which AI=IA=A for any matrix A? Explain.

(d)

If a and b are real numbers with ab=0, then we know that either a=0 or b=0. Is this same property true with matrix multiplication? Explain.

(e)

If a, b, and c are real numbers with cโ‰ 0 and ac=bc, we know that a=b. Is this same property true with matrix multiplication? Explain.

As we saw in Activity 8.3, there are matrices A,B for which AB≠BA. On the other hand, there are matrices for which AB=BA. For example, this equality always holds when A is a square matrix and B is the identity matrix of the same size. It also holds if A=B. If the equality AB=BA holds, we say that the matrices A and B commute. So the identity matrix commutes with all square matrices of the same size, and every square matrix A commutes with Ak for any positive integer k.

There is an alternative method of calculating a matrix product that we will often use; we illustrate it in the next activity. This alternate version depends on the product of a row matrix with a vector. Suppose A=[a1 a2 ⋯ an] is a 1×n matrix and x=[x1x2⋮xn] is an n×1 vector. Then the product Ax is the 1×1 vector

[a1 a2 โ‹ฏ an][x1x2โ‹ฎxn]=[a1x1+a2x2+โ‹ฏ+anxn].

In this situation, we usually identify the 1ร—1 matrix with its scalar entry and write

(8.5)[a1 a2 โ‹ฏ an]โ‹…[x1x2โ‹ฎxn]=a1x1+a2x2+โ‹ฏ+anxn.

The product โ‹… in (8.5) is called the scalar or dot product of [a1 a2 โ‹ฏ an] with [x1x2โ‹ฎxn].

Activity 8.4.

Let A=[1โˆ’1230โˆ’42โˆ’51] and B=[4โˆ’26013].

Let ai be the ith row of A and bj the jth column of B. For example, a1=[1โˆ’12] and b2=[โˆ’203].

Calculate the entries of the matrix C, where

C=[a1โ‹…b1a1โ‹…b2a2โ‹…b1a2โ‹…b2a3โ‹…b1a3โ‹…b2],

where aiโ‹…bj refers to the scalar product of row i of A with column j of B.โ€‰14โ€‰ Compare your result with the result of AB calculated via the product of A with the columns of B.

Activity 8.4 shows that there is an alternate way to calculate a matrix product. To see how this works in general, let A=[aij] be a k×m matrix and B=[b1 b2 ⋯ bn] an m×n matrix. We know that

AB=[Ab1 Ab2 โ‹ฏ Abn].

Now let r1, r2, โ€ฆ, rk be the rows of A so that A=[r1r2โ‹ฎrk]. First we argue that if x=[x1x2โ‹ฎxm], then

Ax=[r1โ‹…xr2โ‹…xโ‹ฎrkโ‹…x].

This is the scalar product (or dot product) definition of the matrix-vector product.

To show that this definition gives the same result as the linear combination definition of matrix-vector product, we first let A=[c1 c2 โ‹ฏ cm], where c1, c2, โ€ฆ, cm are the columns of A. By our linear combination definition of the matrix-vector product, we obtain

Ax=x1c1+x2c2+โ‹ฏ+xmcm=x1[a11a21โ‹ฎak1]+x2[a12a22โ‹ฎak2]+โ‹ฏ+xm[a1ma2mโ‹ฎakm]=[a11x1+a12x2+โ‹ฏ+a1mxma21x1+a22x2+โ‹ฏ+a2mxmโ‹ฎak1x1+ak2x2+โ‹ฏ+akmxm]=[r1โ‹…xr2โ‹…xโ‹ฎrkโ‹…x].

Therefore, the above work shows that both linear combination and scalar product definitions give the same matrix-vector product.

Applying this to the matrix product AB defined in terms of the matrix-vector product, we see that

Abj=[r1โ‹…bjr2โ‹…bjโ‹ฎrkโ‹…bj].

So the i,jth entry of the matrix product AB is found by taking the scalar product of the ith row of A with the jth column of B. In other words,

(AB)ij=riโ‹…bj

where ri is the ith row of A and bj is the jth column of B.
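The row-column rule is easy to carry out by machine as well; the following sketch (NumPy assumed) computes each entry of AB as a scalar product, using the matrices from Activity 8.4.

```python
import numpy as np

# Entry (i, j) of AB is the scalar product of row i of A with column j of B.
A = np.array([[1, -1, 2], [3, 0, -4], [2, -5, 1]])
B = np.array([[4, -2], [6, 0], [1, 3]])

C = np.array([[np.dot(A[i, :], B[:, j]) for j in range(B.shape[1])]
              for i in range(A.shape[0])])

print(C)
print(np.array_equal(C, A @ B))   # True: same as the column definition of AB
```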

Subsection Properties of Matrix Multiplication

Activity 8.3 shows that we must be very careful not to assume that matrix multiplication behaves like multiplication of real numbers. However, matrix multiplication does satisfy some familiar properties. For example, we now have an addition and multiplication of matrices under certain conditions, so we might ask if matrix multiplication distributes over matrix addition. To answer this question we take two arbitrary kร—m matrices A and B and an arbitrary mร—n matrix C=[c1 c2 โ‹ฏ cn]. Then

(A+B)C=[(A+B)c1 (A+B)c2 โ‹ฏ (A+B)cn]=[Ac1+Bc1 Ac2+Bc2 โ‹ฏ Acn+Bcn]=[Ac1 Ac2 โ‹ฏ Acn]+[Bc1 Bc2 โ‹ฏ Bcn]=AC+BC.

Similar arguments can be used to show the following properties of matrix multiplication.

We verified the second part of this theorem and will assume that all of the properties of this theorem hold. The matrix In introduced in Theorem 8.3 is called the (multiplicative) identity matrix. We usually omit the word multiplicative and refer to In simply as the identity matrix. This does not cause any confusion since we refer to the additive identity matrix simply as the zero matrix.

Definition 8.4.

Let n be a positive integer. The nร—n identity matrix In is the matrix In=[aij], where aii=1 for each i and aij=0 if iโ‰ j.

We also write the matrix In as

In=[1000โ‹ฏ000100โ‹ฏ000010โ‹ฏ00โ‹ฎโ‹ฑโ‹ฎ0000โ‹ฏ100000โ‹ฏ01].

The matrix In has the property that for any nร—n matrix A,

AIn=InA=A,

so In is a multiplicative identity in the set of all nร—n matrices. More generally, for an mร—n matrix A,

AIn=ImA=A.
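For instance (a NumPy illustration, assumed here only for checking):

```python
import numpy as np

# The identity matrix acts as a multiplicative identity.
A = np.array([[1, 2, 0], [0, 1, 1]])          # a 2x3 matrix
print(np.array_equal(A @ np.eye(3), A))       # A I_3 = A
print(np.array_equal(np.eye(2) @ A, A))       # I_2 A = A
```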

Subsection The Transpose of a Matrix

One additional operation on matrices is the transpose. The transpose of a matrix occurs in many useful formulas in linear algebra and in applications of linear algebra.

Definition 8.5.

The transpose of an mร—n matrix A=[aij] is the nร—m matrix AT whose i,jth entry is aji.

Written out, the transpose of the mร—n matrix

A=[a11a12โ‹ฏa1nโˆ’1a1na21a22โ‹ฏa2nโˆ’1a2nโ‹ฎโ‹ฑโ‹ฎam1am2โ‹ฏamnโˆ’1amn]

is the nร—m matrix

AT=[a11a21โ‹ฏamโˆ’11am1a12a22โ‹ฏamโˆ’12am2โ‹ฎโ‹ฑโ‹ฎa1na2nโ‹ฏamโˆ’1namn].

In other words, the transpose of a matrix A is the matrix AT whose rows are the columns of A. Alternatively, the transpose of A is the matrix AT whose columns are the rows of A. We can also view the transpose of A as the reflection of A across its main diagonal, where the diagonal of a matrix A=[aij] consists of the entries of the form aii.
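Numerically, transposing is a single operation in most software; a small sketch (NumPy assumed, not part of the text):

```python
import numpy as np

# Row i of A becomes column i of the transpose.
A = np.array([[1, 2, 0, 1],
              [3, 0, -4, 5],
              [7, 6, -1, 0]])                 # 3x4

print(A.T)                                    # the 4x3 transpose
print(np.array_equal(A.T[2, :], A[:, 2]))     # row 3 of A^T is column 3 of A
```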

Activity 8.5.

(a)

Find the transpose of each of the indicated matrices.

[12345678][1โˆ’10][124โˆ’30โˆ’1]
(b)

Find the transpose of the new matrix for each part above. What can you conjecture based on your results?

(c)

There are certain special types of matrices that are given names.

Definition 8.6.

Let A be a square matrix whose ijth entry is aij.

  1. The matrix A is a diagonal matrix if aij=0 whenever iโ‰ j.

  2. The matrix A is a symmetric matrix if AT=A.

  3. The matrix A is an upper triangular matrix if aij=0 whenever i>j.

  4. The matrix A is a lower triangular matrix if aij=0 whenever i<j.

(i)

Find an example of a diagonal matrix A. What can you say about AT?

(ii)

Find an example of a non-diagonal symmetric matrix B. If BT=B, must B be a square matrix?

(iii)

Find an example of an upper triangular matrix C. What kind of a matrix is CT?

We will see later that diagonal matrices are important in that their powers are easy to calculate. Symmetric matrices arise frequently in applications such as in graph theory as adjacency matrices and in quantum mechanics as observables, and have many useful properties including being diagonalizable and having real eigenvalues, as we will also see later.

Subsection Properties of the Matrix Transpose

As with every other operation, we want to understand what properties the matrix transpose has. Properties of transposes are shown in the following theorem.

The one property that might seem strange is the third one. To understand this property, suppose A is an mร—n matrix and B an nร—k matrix so that the product AB is defined. We will argue that (AB)T=BTAT by comparing the i,jth entry of each side.

  • First notice that the i,jth entry of (AB)T is the j,ith entry of AB. The j,ith entry of AB is found by taking the scalar product of the jth row of A with the ith column of B. Thus, the i,jth entry of (AB)T is the scalar product of the jth row of A with the ith column of B.

  • The i,jth entry of BTAT is the scalar product of the ith row of BT with the jth column of AT. But the ith row of BT is the ith column of B and the jth column of AT is the jth row of A. So the i,jth entry of BTAT is the scalar product of the jth row of A with the ith column of B.

Since the two matrices (AB)T and BTAT have the same size and same corresponding entries, they are the same matrix.
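A quick numerical spot-check of this identity (an illustration only, not a proof; NumPy assumed):

```python
import numpy as np

# (AB)^T = B^T A^T on randomly chosen matrices of compatible sizes.
rng = np.random.default_rng(1)
A = rng.integers(-4, 5, size=(3, 4))
B = rng.integers(-4, 5, size=(4, 2))

print(np.array_equal((A @ B).T, B.T @ A.T))   # True
```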

Subsection Examples

What follows are worked examples that use the concepts from this section.

Example 8.8.

Let

A=[120130−4576−10], B=[−24−351911−2], C=[0−163−25104], D=[10−4528−1], E=[104−35−1], and F=[−21563−810−170−5].

Determine the results of the following operations, if defined. If not defined, explain why.

(a)

AF

Solution.

Since A is a 3×4 matrix and F is a 4×3 matrix, the number of columns of A equals the number of rows of F and the matrix product AF is defined. Recall that if F=[f1 f2 f3], where f1, f2, f3 are the columns of F, then AF=[Af1 Af2 Af3]. Recall also that Af1 is the linear combination of the columns of A with weights from f1, so

Af1=[120130โˆ’4576โˆ’10][โˆ’2617]=(โˆ’2)[137]+(6)[206]+(1)[0โˆ’4โˆ’1]+(7)[150]=[โˆ’2+12+0+7โˆ’6+0โˆ’4+35โˆ’14+36โˆ’1+0]=[172521],
Af2=[120130โˆ’4576โˆ’10][1300]=(1)[137]+(3)[206]+(0)[0โˆ’4โˆ’1]+(0)[150]=[1+6+0+03+0+0+07+18+0+0]=[7325],

and

Af3=[120130โˆ’4576โˆ’10][5โˆ’8โˆ’1โˆ’5]=(5)[137]โˆ’(8)[206]โˆ’(1)[0โˆ’4โˆ’1]โˆ’(5)[150]=[5โˆ’16โˆ’0โˆ’515โˆ’0+4โˆ’2535โˆ’48+1โˆ’0]=[โˆ’16โˆ’6โˆ’12].

So AF=[177−16253−62125−12]. Alternatively, if a1, a2, and a3 are the rows of A, then the matrix product AF is the matrix whose ij entry is ai⋅fj. Using this method we have

AF=[a1โ‹…f1a1โ‹…f2a1โ‹…f3a2โ‹…f1a2โ‹…f2a2โ‹…f3a3โ‹…f1a3โ‹…f2a3โ‹…f3].

Now

a1โ‹…f1=(1)(โˆ’2)+(2)(6)+(0)(1)+(1)(7)=17a1โ‹…f2=(1)(1)+(2)(3)+(0)(0)+(1)(0)=7a1โ‹…f3=(1)(5)+(2)(โˆ’8)+(0)(โˆ’1)+(1)(โˆ’5)=โˆ’16a2โ‹…f1=(3)(โˆ’2)+(0)(6)+(โˆ’4)(1)+(5)(7)=25a2โ‹…f2=(3)(1)+(0)(3)+(โˆ’4)(0)+(5)(0)=3a2โ‹…f3=(3)(5)+(0)(โˆ’8)+(โˆ’4)(โˆ’1)+(5)(โˆ’5)=โˆ’6a3โ‹…f1=(7)(โˆ’2)+(6)(6)+(โˆ’1)(1)+(0)(7)=21a3โ‹…f2=(7)(1)+(6)(3)+(โˆ’1)(0)+(0)(0)=25a3โ‹…f3=(7)(5)+(6)(โˆ’8)+(โˆ’1)(โˆ’1)+(0)(โˆ’5)=โˆ’12,

so AF=[177โˆ’16253โˆ’62125โˆ’12].

(b)

A(BC)

Solution.

Since BC is a 3ร—3 matrix but A is 3ร—4, the number of columns of A is not equal to the number of rows of BC. We conclude that A(BC) is not defined.

(c)

(BC)A

Solution.

Since BC is a 3ร—3 matrix and A is 3ร—4, the number of columns of BC is equal to the number of rows of A. Thus, the quantity (BC)A is defined. First we calculate BC using the dot product of the rows of B with the columns of C. Letting B=[b1b2b3] and C=[c1 c2 c3], where b1, b2, and b3 are the rows of B and c1, c2, and c3 are the columns of C, we have

BC=[b1โ‹…c1b1โ‹…c2b1โ‹…c3b2โ‹…c1b2โ‹…c2b2โ‹…c3b3โ‹…c1b3โ‹…c2b3โ‹…c3].

Now

b1⋅c1=(−2)(0)+(4)(3)+(−3)(1)=9b1⋅c2=(−2)(−1)+(4)(−2)+(−3)(0)=−6b1⋅c3=(−2)(6)+(4)(5)+(−3)(4)=−4b2⋅c1=(5)(0)+(1)(3)+(9)(1)=12b2⋅c2=(5)(−1)+(1)(−2)+(9)(0)=−7b2⋅c3=(5)(6)+(1)(5)+(9)(4)=71b3⋅c1=(1)(0)+(1)(3)+(−2)(1)=1b3⋅c2=(1)(−1)+(1)(−2)+(−2)(0)=−3b3⋅c3=(1)(6)+(1)(5)+(−2)(4)=3,

so BC=[9โˆ’6โˆ’412โˆ’7711โˆ’33]. If BC=[r1r2r3] and A=[s1 s2 s3 s4], where r1, r2, and r3 are the rows of BC and s1, s2, s3, and s4 are the columns of A, then

(BC)A=[r1⋅s1r1⋅s2r1⋅s3r1⋅s4r2⋅s1r2⋅s2r2⋅s3r2⋅s4r3⋅s1r3⋅s2r3⋅s3r3⋅s4].

Now

r1โ‹…s1=(9)(1)+(โˆ’6)(3)+(โˆ’4)(7)=โˆ’37r1โ‹…s2=(9)(2)+(โˆ’6)(0)+(โˆ’4)(6)=โˆ’6r1โ‹…s3=(9)(0)+(โˆ’6)(โˆ’4)+(โˆ’4)(โˆ’1)=28r1โ‹…s4=(9)(1)+(โˆ’6)(5)+(โˆ’4)(0)=โˆ’21r2โ‹…s1=(12)(1)+(โˆ’7)(3)+(71)(7)=488r2โ‹…s2=(12)(2)+(โˆ’7)(0)+(71)(6)=450r2โ‹…s3=(12)(0)+(โˆ’7)(โˆ’4)+(71)(โˆ’1)=โˆ’43r2โ‹…s4=(12)(1)+(โˆ’7)(5)+(71)(0)=โˆ’23r3โ‹…s1=(1)(1)+(โˆ’3)(3)+(3)(7)=13r3โ‹…s2=(1)(2)+(โˆ’3)(0)+(3)(6)=20r3โ‹…s3=(1)(0)+(โˆ’3)(โˆ’4)+(3)(โˆ’1)=9r3โ‹…s4=(1)(1)+(โˆ’3)(5)+(3)(0)=โˆ’14,

so (BC)A=[โˆ’37โˆ’628โˆ’21488450โˆ’43โˆ’2313209โˆ’14].

(d)

(B+C)D

Solution.

Since B and C are both 3ร—3 matrices, their sum is defined and is a 3ร—3 matrix. Because D is 3ร—2 matrix, the number of columns of B+C is equal to the number of rows of D. Thus, the quantity (B+C)D is defined and, using the row-column method of matrix multiplication as earlier,

(B+C)D=([โˆ’24โˆ’351911โˆ’2]+[0โˆ’163โˆ’25104])[10โˆ’4528โˆ’1]=[โˆ’2+04โˆ’1โˆ’3+65+31โˆ’29+51+11+0โˆ’2+4][10โˆ’4528โˆ’1]=[โˆ’2338โˆ’114212][10โˆ’4528โˆ’1]=[1911187โˆ’4841โˆ’8].
(e)

DTE

Solution.

Since DT is a 2ร—3 matrix and E is 3ร—2, the number of columns of DT is equal to the number of rows of E. Thus, DTE is defined and

DTE=[10โˆ’4528โˆ’1]T[104โˆ’35โˆ’1]=[1058โˆ’42โˆ’1][104โˆ’35โˆ’1]=[70โˆ’23โˆ’1โˆ’5].
(f)

(AT+F)T

Solution.

The fact that A is a 3ร—4 matrix means that AT is a 4ร—3 matrix. Since F is also a 4ร—3 matrix, the sum AT+F is defined. The transpose of any matrix is also defined, so (AT+F)T is defined and

(AT+F)T=([120130โˆ’4576โˆ’10]T+[โˆ’21563โˆ’810โˆ’170โˆ’5])T=([1372060โˆ’4โˆ’1150]+[โˆ’21563โˆ’810โˆ’170โˆ’5])T=([1โˆ’23+17+52+60+36โˆ’80+1โˆ’4+0โˆ’1โˆ’11+75+00โˆ’5])T=([โˆ’141283โˆ’21โˆ’4โˆ’285โˆ’5])T=[โˆ’181843โˆ’4512โˆ’2โˆ’2โˆ’5].
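All of the hand computations in Example 8.8 can be double-checked with a few lines of code; a sketch (NumPy assumed), with the six matrices entered row by row:

```python
import numpy as np

# The matrices of Example 8.8.
A = np.array([[1, 2, 0, 1], [3, 0, -4, 5], [7, 6, -1, 0]])
B = np.array([[-2, 4, -3], [5, 1, 9], [1, 1, -2]])
C = np.array([[0, -1, 6], [3, -2, 5], [1, 0, 4]])
D = np.array([[10, -4], [5, 2], [8, -1]])
E = np.array([[1, 0], [4, -3], [5, -1]])
F = np.array([[-2, 1, 5], [6, 3, -8], [1, 0, -1], [7, 0, -5]])

print(A @ F)           # part (a)
# part (b): A @ (B @ C) raises an error, since a 3x4 and a 3x3 matrix do not line up
print((B @ C) @ A)     # part (c)
print((B + C) @ D)     # part (d)
print(D.T @ E)         # part (e)
print((A.T + F).T)     # part (f)
```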

Example 8.9.

Let A=[2โˆ’17โˆ’2] and B=[46โˆ’35].

(a)

Determine the matrix sum A+B. Then use this sum to calculate (A+B)2.

Solution.

Adding corresponding terms shows that A+B=[6543]. Squaring this sum yields the result (A+B)2=[56453629].

(b)

Now calculate (A+B)2 in a different way. Use the fact that matrix multiplication distributes over matrix addition to expand (like foiling) (A+B)2 into a sum of matrix products. Then calculate each summand and add to find (A+B)2. You should obtain the same result as part (a). If not, what could be wrong?

Solution.

Expanding (A+B)2 (remember that matrix multiplication is not commutative) gives us

(A+B)2=(A+B)(A+B)=A2+AB+BA+B2=[โˆ’300โˆ’3]+[1173432]+[50โˆ’1629โˆ’7]+[โˆ’254โˆ’277]=[56453629]

just as in part (a). If instead you obtained the matrix [17684168] you likely made the mistake of equating (A+B)2 with A2+2AB+B2. These two matrices are not equal in general, because we cannot say that AB is equal to BA.
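The same comparison is easy to reproduce numerically (NumPy assumed):

```python
import numpy as np

# (A+B)^2 equals A^2 + AB + BA + B^2, but not A^2 + 2AB + B^2 in general.
A = np.array([[2, -1], [7, -2]])
B = np.array([[4, 6], [-3, 5]])

S = A + B
print(S @ S)                              # [[56, 45], [36, 29]]
print(A @ A + A @ B + B @ A + B @ B)      # the same matrix
print(A @ A + 2 * (A @ B) + B @ B)        # [[17, 68], [41, 68]]: different
```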

Subsection Summary

In this section we defined a matrix sum, scalar multiples of matrices, the matrix product, and the transpose of a matrix.

  • The sum of two mร—n matrices A=[aij] and B=[bij] is the mร—n matrix A+B whose i,jth entry is aij+bij.

  • If A=[aij] is an mร—n matrix, the scalar multiple kA of A by the scalar k is the mร—n matrix whose i,jth entry is kaij.

  • If A is a kร—m matrix and B=[b1 b2 โ‹ฏ bn] is an mร—n matrix, then the matrix product AB of the matrices A and B is the kร—n matrix

    [Ab1 Ab2 โ‹ฏ Abn].

    The matrix product is defined in this way so that the matrix of a composite Sโˆ˜T of linear transformations is the product of matrices of S and T.

  • An alternate way of calculating the product of a k×m matrix A with rows r1, r2, …, rk and an m×n matrix B with columns b1, b2, …, bn is that the product AB is the k×n matrix whose i,jth entry is ri⋅bj.

  • Matrix multiplication does not behave as the standard multiplication on real numbers. For example, we can have a product of two non-zero matrices equal to the zero matrix and there is no cancellation law for matrix multiplication.

  • The transpose of an mร—n matrix A=[aij] is the nร—m matrix AT whose i,jth entry is aji.

Exercises Exercises

1.

Calculate AB for each of the following matrix pairs by hand in two ways.

2.

For each of the following A matrices, find all 2ร—2 matrices B=[abcd] which commute with the given A. (Two matrices A and B commute with each other if AB=BA.)

3.

Find all matrices X, if any, satisfying each of the following matrix equations.

(c)

[1โˆ’2โˆ’24]X=[010โˆ’2]

4.

For each of the following matrices A, compute A2=AA, A3=AAA, and A4. Use your results to conjecture a formula for Am. Interpret your answer geometrically using the transformation interpretation.

5.

If Av=2v for an unknown matrix A and vector v, determine an expression for A2v, A3v, …, Amv.

6.

If Av=2v and Au=3u, find an expression for Am(av+bu) in terms of v and u.

7.

A matrix A is a nilpotent matrix if Am=0, i.e., Am is the zero matrix, for some positive integer m. Explain why the matrices

A=[0a00],B=[0ab00c000]

are nilpotent matrices.

8.

Suppose A is an nร—n matrix for which A2=0. Show that there is a matrix B for which (In+A)B=In where In is the identity matrix of size n.

10.

Let A, B, and C be matrices of the appropriate sizes for all sums and products to be defined and let a be a scalar. Verify the remaining parts of Theorem 8.3. That is, show that

(c)

There is a square matrix In with the property that AIn=A or InA=A for whichever product is defined.

12.

The matrix exponential is an important tool in solving differential equations. Recall from calculus that the Taylor series expansion for ex centered at x=0 is

ex=โˆ‘n=0โˆžxnn!=1+x+x22!+x33!+โ‹ฏ,

and that this Taylor series converges to ex for every real number x. We extend this idea to define the matrix exponential eA for any square matrix A with real entries as

eA=โˆ‘n=0โˆž1n!An=In+A+12!A2+13!A3+โ‹ฏ

We explore this idea with an example. Let B=[200โˆ’1].
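Before working the parts below, note that the truncated series is straightforward to compute by machine. A minimal sketch (Python/NumPy assumed; the number of terms kept is arbitrary):

```python
import numpy as np
from math import factorial

def exp_partial_sum(A, num_terms=20):
    """Truncated matrix exponential: I + A + A^2/2! + ... (an approximation)."""
    n = A.shape[0]
    total = np.zeros((n, n))
    power = np.eye(n)              # A^0
    for k in range(num_terms):
        total += power / factorial(k)
        power = power @ A          # next power of A
    return total

B = np.array([[2.0, 0.0], [0.0, -1.0]])
print(exp_partial_sum(B))          # compare with part (c)
```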

(a)

Calculate B2, B3, B4. Explain why Bn=[2n00(โˆ’1)n] for any positive integer n.

(b)

Show that I2+B+12!B2+13!B3+14!B4 is equal to

[1+2+222+233!+244!001+(โˆ’1)+(โˆ’1)22+(โˆ’1)33!+(โˆ’1)44!].
(c)

Explain why eB=[e200eโˆ’1].

13.

Show that if A and B are 2ร—2 rotation matrices, then AB is also a 2ร—2 rotation matrix.

14.

Label each of the following statements as True or False. Provide justification for your response. Throughout, assume that matrices are of the appropriate sizes so that any matrix sums or products are defined.

(a) True/False.

For any three matrices A,B,C with Aโ‰ 0, AB=AC implies B=C.

(b) True/False.

For any three matrices A,B,C with Aโ‰ 0, AB=CA implies B=C.

(c) True/False.

If A2 is the zero matrix, then A itself is the zero matrix.

(d) True/False.

If AB=BA for every nร—n matrix B, then A is the identity matrix In.

(e) True/False.

If matrix products AB and BA are both defined, then A and B are both square matrices of the same size.

(f) True/False.

If x1 is a solution for Ax=b1 (i.e., that Ax1=b1) and x2 is a solution for Bx=b2, then x1+x2 is a solution for (A+B)x=b1+b2.

(g) True/False.

If B is an m×n matrix with two equal columns, then the matrix AB has two equal columns for every k×m matrix A.

(h) True/False.

If A2=I2, then A=โˆ’I2 or A=I2.

Subsection Project: Strassen's Algorithm and Partitioned Matrices

Strassen's algorithm is an algorithm for matrix multiplication that can be more efficient than the standard row-column method. To understand this method, we begin with the 2ร—2 case which will highlight the essential ideas.

Project Activity 8.6.

We first work with the 2ร—2 case.

(a)

Let A=[aij]=[1234] and B=[bij]=[5678].

(i)

Calculate the matrix product AB.

(ii)

Rather than using eight multiplications to calculate AB, Strassen came up with the idea of using the following seven products:

h1=(a11+a22)(b11+b22)h2=(a21+a22)b11h3=a11(b12โˆ’b22)h4=a22(b21โˆ’b11)h5=(a11+a12)b22h6=(a21โˆ’a11)(b11+b12)h7=(a12โˆ’a22)(b21+b22).

Calculate h1 through h7 for the given matrices A and B. Then calculate the quantities

h1+h4โˆ’h5+h7, h3+h5, h2+h4,  and h1+h3โˆ’h2+h6.

What do you notice?

(b)

Now we repeat part (a) in general. Suppose we want to calculate the matrix product AB for arbitrary 2ร—2 matrices A=[a11a12a21a22] and B=[b11b12b21b22]. Let

h1=(a11+a22)(b11+b22)h2=(a21+a22)b11h3=a11(b12โˆ’b22)h4=a22(b21โˆ’b11)h5=(a11+a12)b22h6=(a21โˆ’a11)(b11+b12)h7=(a12โˆ’a22)(b21+b22).

Show that

AB=[h1+h4โˆ’h5+h7h3+h5h2+h4h1+h3โˆ’h2+h6].
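The same algebra is compact enough to write as a short function; the sketch below (Python/NumPy assumed) computes the seven products h1 through h7, assembles AB, and checks the result against the ordinary eight-multiplication product on the matrices of part (a).

```python
import numpy as np

# Strassen's seven products for a 2x2 matrix product.
def strassen_2x2(A, B):
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    h1 = (a11 + a22) * (b11 + b22)
    h2 = (a21 + a22) * b11
    h3 = a11 * (b12 - b22)
    h4 = a22 * (b21 - b11)
    h5 = (a11 + a12) * b22
    h6 = (a21 - a11) * (b11 + b12)
    h7 = (a12 - a22) * (b21 + b22)
    return np.array([[h1 + h4 - h5 + h7, h3 + h5],
                     [h2 + h4, h1 + h3 - h2 + h6]])

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
print(strassen_2x2(A, B))   # [[19, 22], [43, 50]]
print(A @ B)                # the same product, found with eight multiplications
```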

The next step is to understand how Strassen's algorithm can be applied to larger matrices. This involves the idea of partitioned (or block) matrices. Recall that the matrix-matrix product of the kร—m matrix A and the mร—n matrix B=[b1 b2 โ‹ฏ bn] is defined as

AB=[Ab1 Ab2 โ‹ฏ Abn].

In this process, we think of B as being partitioned into n columns. We can expand on this idea to partition both A and B when calculating a matrix-matrix product.

Project Activity 8.7.

We illustrate the idea of partitioned matrices with an example. Let A=[1−23−64752−103−8109]. We can partition A into smaller matrices

A=[1โˆ’23โˆ’64752โˆ’103โˆ’8109],

which are indicated by the vertical and horizontal lines. As a shorthand, we can describe this partition of A as

A=[A11A12A21A22],

where A11=[1โˆ’23752], A12=[โˆ’64โˆ’10], A21=[3โˆ’81], and A22=[0 9]. The submatrices Aij are called blocks. If B is a matrix such that AB is defined, then B must have five rows. As an example, AB is defined if B=[1320416542]. The partition of A breaks A up into blocks with three and two columns, respectively. So if we partition B into blocks with three and two rows, then we can use the blocks to calculate the matrix product AB. For example, partition B as

B=[1320416542]=[B11B21].

Show that

AB=[A11A12A21A22][B11B21]=[A11B11+A12B21A21B11+A22B21].

An advantage to using partitioned matrices is that computations with them can be done in parallel, which lessens the time it takes to do the work. In general, we can multiply partitioned matrices as though the submatrices are scalars. That is,

[A11A12โ‹ฏA1mA21A22โ‹ฏA2mโ‹ฎโ‹ฎโ‹ฑโ‹ฎAi1Ai2โ‹ฏAimโ‹ฎโ‹ฎโ‹ฑโ‹ฎAk1Ak2โ‹ฏAkm][B11B12โ‹ฏB1jโ‹ฏB1nB21B22โ‹ฏB2jโ‹ฏB2nโ‹ฎโ‹ฎโ‹ฑโ‹ฎโ‹ฑโ‹ฎBm1Bm2โ‹ฏBmjโ‹ฏBmn]=[Pij],

where

Pij=Ai1B1j+Ai2B2j+โ‹ฏ+AimBmj=โˆ‘t=1mAitBtj,

provided that all the submatrix products are defined.
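A numerical spot-check of block multiplication on the partition above (NumPy assumed; an illustration, not a proof):

```python
import numpy as np

# The partition of A and B from Project Activity 8.7.
A = np.array([[1, -2, 3, -6, 4],
              [7, 5, 2, -1, 0],
              [3, -8, 1, 0, 9]])
B = np.array([[1, 3], [2, 0], [4, 1], [6, 5], [4, 2]])

A11, A12 = A[:2, :3], A[:2, 3:]
A21, A22 = A[2:, :3], A[2:, 3:]
B11, B21 = B[:3, :], B[3:, :]

top = A11 @ B11 + A12 @ B21
bottom = A21 @ B11 + A22 @ B21
print(np.array_equal(np.vstack([top, bottom]), A @ B))   # True
```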

Now we can apply Strassen's algorithm to larger matrices using partitions. This method is sometimes referred to as divide and conquer.

Project Activity 8.8.

Let A and B be two rร—r matrices. If r is not a power of 2, then pad the rows and columns of A and B with zeros to make them of size 2mร—2m for some integer m. (From a practical perspective, we might instead just use unequal block sizes.) Let n=2m. Partition A and B as

A=[A11A12A21A22]  and  B=[B11B12B21B22],

where each submatrix is of size n2ร—n2. Now we use the Strassen algorithm just as in the 2ร—2 case, treating the submatrices as if they were scalars (with the additional constraints of making sure that the dimensions match up so that products are defined, and ensuring we multiply in the correct order). Letting

M1=(A11+A22)(B11+B22)M2=(A21+A22)B11M3=A11(B12โˆ’B22)M4=A22(B21โˆ’B11)M5=(A11+A12)B22M6=(A21โˆ’A11)(B11+B12)M7=(A12โˆ’A22)(B21+B22),

then the same algebra as in Project Activity 8.6 shows that

AB=[M1+M4โˆ’M5+M7M3+M5M2+M4M1+M3โˆ’M2+M6].

Apply Strassen's algorithm to calculate the matrix product AB, where

A=[13โˆ’12467โˆ’25] and B=[2532โˆ’41164].
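A recursive sketch of this divide-and-conquer scheme is given below (Python/NumPy assumed). It pads to the next power of 2 as described above and is checked against the ordinary product on a random example.

```python
import numpy as np

# Divide-and-conquer Strassen multiplication for square matrices,
# padding with zeros up to the next power of 2.
def strassen(A, B):
    n = A.shape[0]
    if n == 1:
        return A * B
    h = n // 2
    A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
    B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]
    M1 = strassen(A11 + A22, B11 + B22)
    M2 = strassen(A21 + A22, B11)
    M3 = strassen(A11, B12 - B22)
    M4 = strassen(A22, B21 - B11)
    M5 = strassen(A11 + A12, B22)
    M6 = strassen(A21 - A11, B11 + B12)
    M7 = strassen(A12 - A22, B21 + B22)
    return np.block([[M1 + M4 - M5 + M7, M3 + M5],
                     [M2 + M4, M1 + M3 - M2 + M6]])

def strassen_padded(A, B):
    r = A.shape[0]
    n = 1
    while n < r:
        n *= 2                                  # next power of 2
    Ap = np.zeros((n, n), dtype=A.dtype); Ap[:r, :r] = A
    Bp = np.zeros((n, n), dtype=B.dtype); Bp[:r, :r] = B
    return strassen(Ap, Bp)[:r, :r]

rng = np.random.default_rng(2)
A = rng.integers(-5, 6, size=(3, 3))
B = rng.integers(-5, 6, size=(3, 3))
print(np.array_equal(strassen_padded(A, B), A @ B))   # True
```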

While Strassen's algorithm can be more efficient, it does not always speed up the process. We investigate this in the next activity.

Project Activity 8.9.

We introduce a little notation to help us describe the efficiency of our calculations. We won't be formal with this notation; rather, we will work with it in an informal way. Big O (the letter “O”) notation is used to describe the complexity of an algorithm. Generally speaking, in computer science big O notation can be used to describe the run time of an algorithm, the space used by the algorithm, or the number of computations required. The letter “O” is used because the behavior described is also called the order. Big O measures the asymptotic time of an algorithm, not its exact time. For example, if it takes 6n2−n+8 steps to complete an algorithm, then we say that the algorithm grows at the order of n2 (we ignore the constants and the smaller power terms, since they become insignificant as n increases) and we describe its growth as O(n2). To measure the efficiency of an algorithm for computing a matrix product, we will count the number of operations it takes to calculate the product.

(a)

Suppose A and B are nร—n matrices. Explain why the operation of addition (that is, calculating A+B) is O(n2).

(b)

Suppose A and B are nร—n matrices. How many multiplications are required to calculate the matrix product AB? Explain.

(c)

The standard algorithm for calculating a matrix product of two n×n matrices requires n3 multiplications and a number of additions. Since additions are much less costly in terms of operations, the standard matrix product is O(n3). We won't show it here, but using Strassen's algorithm on a product of 2m×2m matrices is O(nlog2(7)), where n=2m. That means that Strassen's algorithm applied to an n×n matrix (where n is a power of 2) requires approximately nlog2(7) multiplications. We use this to determine when Strassen's algorithm is computationally more efficient than the standard algorithm; a small counting helper is sketched after part (ii) below.

(i)

Suppose A and B are 5ร—5 matrices. Determine the number of multiplications required to calculate the matrix product AB using the standard matrix product. Then determine the approximate number of multiplications required to calculate the matrix product AB using Strassen's algorithm. Which is more efficient? (Remember, we can only apply Strassen's algorithm to square matrices whose sizes are powers of 2.)

(ii)

Repeat part i. with 125ร—125 matrices. Which method is more efficient?
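The counts involved in this comparison can be tabulated with a small helper (Python assumed); plug in the sizes from parts (i) and (ii) above.

```python
from math import ceil, log2

# Multiplications used by the standard algorithm (n^3) versus Strassen's
# algorithm after padding n up to the next power of 2 (7^m for a 2^m x 2^m product).
def multiplication_counts(n):
    standard = n ** 3
    padded = 2 ** ceil(log2(n))
    strassen = 7 ** int(log2(padded))
    return standard, strassen

print(multiplication_counts(4))   # for a 4x4 product: (64, 49)
```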

As a final note, Strassen's algorithm is approximately O(n2.81). As of 2018, the best algorithm for matrix multiplication, developed by Virginia Williams at Stanford University, is approximately O(n2.373).โ€‰15โ€‰

Strassen, Volker, Gaussian Elimination is not Optimal, Numer. Math. 13, pp. 354–356, 1969.
Recall from Exercise 5 of Section 5 that the scalar product uโ‹…v of a 1ร—n matrix u=[u1 u2 โ€ฆ un] and an nร—1 vector v=[v1v2โ‹ฎvn] is uโ‹…v=u1v1+u2v2+u3v3+โ‹ฏ+unvn.
V. V. Williams, Multiplying matrices in O(n2.373) time, Stanford University, (2014).