
Section 25 Projections onto Subspaces and the Gram-Schmidt Process in R^n

Subsection Application: MIMO Systems

MIMO (Multiple Input Multiple Output) systems are widely used to increase data transmission rates and improve the quality of wireless communication systems. MIMO is one of several forms of smart antenna technology, and it makes use of multiple antennas at the transmitter to send signals and multiple antennas at the receiver to accept the signals. MIMO systems transmit the same data on multiple streams, which introduces redundancy into the system. MIMO systems can utilize bounced and reflected signals to actually improve signal strength. This is very useful in urban environments in the presence of large buildings and the absence of direct line-of-sight transmission. MIMO systems can transmit several information streams in parallel (known as spatial multiplexing), allowing for increased data transmission rates. The presence of multiple receiver antennas allows for greater reliability in the system.

Figure 25.1. A MIMO system.

In a MIMO system, every transmitter antenna sends a signal to every receiver antenna as illustrated in Figure 25.1. In each of these transmissions the signal can be disturbed in some way. Two types of disturbances are fading and noise. Fading is due to time variations in the received signal, which can occur because of atmospheric conditions or obstacles along the path that vary with time. In a MIMO system we have multipath fading for the different paths the signal takes from different antennas to receivers, which causes fluctuations in amplitude, phase, and angle of arrival of the received signal. Noise is an unwanted signal that interferes with the actual signal. Both fading and noise result in a received signal that is different from the transmitted signal. The problem is how to recover the original signal from the received signal. The QR decomposition is used in this process. To improve the efficiency of MIMO systems, different methods have been introduced to determine a QR decomposition. Later in this section we will discuss MIMO systems in more detail, and how Householder transformations can be used to find QR decompositions.

Subsection Introduction

In many situations (least squares approximations, for example) we want to find a vector w in a subspace W of R^n that is the "best" approximation to a vector v not in the subspace. A natural measure of "best" is to find a vector w in W, if one exists, such that the distance from w to v is a minimum. This means we want to find w so that the quantity

||w - v||

is as small as possible over all vectors w in W. To do this, we will need to find a suitable projection of the vector v onto the subspace W. We have already done this in the case that W=Span{w_1} is the span of a single vector -- we can project the vector v in the direction of w_1. Our goal is to generalize this to project a vector v onto an entire subspace in a useful way.

Preview Activity 25.1.

Let B = {w_1, w_2} be a basis for a subspace W of R^3, where w_1 = [1 0 0]^T and w_2 = [0 1 0]^T. Note that B is an orthonormal basis for W. Let v = [1 2 1]^T. Note that W is the xy-plane and that v is not in W, as illustrated in Figure 25.2.

Figure 25.2. The space W and vectors w1, w2, and v.
(a)

Find the projection u_1 of v onto W_1 = Span{w_1}.

Hint.

See Equation (23.2).

(b)

Find the projection u_2 of v onto W_2 = Span{w_2}.

(c)

Calculate the distance between v and u_1 and between v and u_2. Which of u_1 and u_2 is closer to v?

(d)

Show that the vector (3/4)[1 2 0]^T is in W and find the distance between v and (3/4)[1 2 0]^T.

(e)

Part (d) shows that neither u_1 = proj_{w_1} v nor u_2 = proj_{w_2} v is the vector in W that is closest to v. We should probably expect this: neither projection uses the fact that the other vector might contribute to the closest vector. Let us instead consider the sum u_3 = u_1 + u_2. Calculate the components of this vector u_3 and determine the distance between u_3 and v. Which of the three vectors u_1, u_2, and u_3 in W is closest to v?

(f)

A picture of w_1, w_2, W, and v is shown in Figure 25.2. Draw in u_1, u_2, and u_3. Draw a picture of the vector w in W that appears to be closest to v. Does this vector seem to have any relationship to u_1, u_2, or u_3? If yes, what relationship do you see?

Subsection Projections onto Subspaces and Orthogonal Projections

Preview Activity 25.1 gives an indication of how we can project a vector v in Rn onto a subspace W of Rn. If we have an orthogonal basis for W, we can just add the projections of v onto each basis vector. The resulting vector is called the projection of v onto the subspace W. As we did with projections onto vectors, we can also define the projection of v orthogonal to W. Note that to make this all work out properly, we will see that we need an orthogonal basis for W.

Definition 25.3.

Let W be a subspace of R^n and let {w_1, w_2, …, w_m} be an orthogonal basis for W. The projection of the vector v in R^n onto W is the vector

proj_W v = ((v·w_1)/(w_1·w_1)) w_1 + ((v·w_2)/(w_2·w_2)) w_2 + ⋯ + ((v·w_m)/(w_m·w_m)) w_m.

The projection of v orthogonal to W is the vector

proj_{W^⊥} v = v - proj_W v.

The notation proj_{W^⊥} v indicates that we expect this vector to be orthogonal to every vector in W. We address this in the following activity.
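To make the definition concrete, here is a minimal computational sketch in Python with numpy (the helper name proj_subspace is ours, not from the text), using the vectors of Preview Activity 25.1:

    import numpy as np

    def proj_subspace(v, basis):
        # Project v onto W = Span(basis), where basis is a list of
        # mutually orthogonal (not necessarily unit) vectors.
        w = np.zeros_like(v, dtype=float)
        for b in basis:
            w += (v @ b) / (b @ b) * b
        return w

    # W is the xy-plane in R^3, as in Preview Activity 25.1.
    w1 = np.array([1.0, 0.0, 0.0])
    w2 = np.array([0.0, 1.0, 0.0])
    v = np.array([1.0, 2.0, 1.0])
    proj = proj_subspace(v, [w1, w2])   # [1. 2. 0.]
    perp = v - proj                     # the projection of v orthogonal to W
    print(proj, perp, perp @ w1, perp @ w2)   # the last two dot products are 0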

Activity 25.2.

Let W = Span{w_1, w_2} in R^3, with w_1 = [1 0 0]^T and w_2 = [0 1 0]^T, and v = [1 2 1]^T as in Preview Activity 25.1. Recall that proj_W v = [1 2 0]^T. Find the projection of v orthogonal to W and show directly that proj_{W^⊥} v is orthogonal to the basis vectors for W (and hence to every vector in W).

Activity 25.2 indicates that the vector proj_{W^⊥} v is in fact orthogonal to every vector in W. To see that this is true in general, let B = {w_1, w_2, …, w_m} be an orthogonal basis for a subspace W of R^n and let v be a vector in R^n. Let

w = proj_W v = ((v·w_1)/(w_1·w_1)) w_1 + ((v·w_2)/(w_2·w_2)) w_2 + ⋯ + ((v·w_m)/(w_m·w_m)) w_m.

Then v - w is the projection of v orthogonal to W. We will show that v - w is orthogonal to every basis vector for W. Since B is an orthogonal basis for W, we know that w_i·w_j = 0 for i ≠ j. So if k is between 1 and m, then

w_k · (v - w) = (w_k·v) - (w_k·w)
= (w_k·v) - [w_k · (((v·w_1)/(w_1·w_1)) w_1 + ((v·w_2)/(w_2·w_2)) w_2 + ⋯ + ((v·w_m)/(w_m·w_m)) w_m)]
= (w_k·v) - ((v·w_k)/(w_k·w_k))(w_k·w_k)
= (w_k·v) - (v·w_k)
= 0.

So the vector v - w is orthogonal to every basis vector for W, and therefore to every vector in W (Theorem 23.16). Because B is a basis for W, if x is in W, then

x = c_1 w_1 + c_2 w_2 + ⋯ + c_m w_m

for some scalars c_1, c_2, …, c_m. So

(v - w) · x = ∑_{k=1}^{m} c_k ((v - w) · w_k) = 0,

and so v - w is orthogonal to every vector in W.

Subsection Best Approximations

We have seen that proj_{W^⊥} v is orthogonal to every vector in W, which suggests that proj_W v is in fact the vector in W that is closest to v. We now verify this fact and conclude that proj_W v is the vector in W closest to v and therefore the best approximation of v by a vector in W.

Theorem 25.4. Let W be a subspace of R^n, let v be a vector in R^n, and let x be a vector in W with x ≠ proj_W v. Then

||v - x|| > ||v - proj_W v||.

Proof.

Let B = {w_1, w_2, …, w_m} be an orthogonal basis for a subspace W of R^n and let v be a vector in R^n. Let x be a vector in W. The vector proj_W v - x is in W, so it is orthogonal to proj_{W^⊥} v = v - proj_W v. Thus, the dotted triangle whose vertices are the tips of the vectors v, x, and proj_W v in Figure 25.5 is a right triangle.

Figure 25.5. The best approximation to v in W

The Pythagorean theorem then shows that

||v - x||² = ||proj_{W^⊥} v||² + ||proj_W v - x||².

Now x ≠ proj_W v, so ||proj_W v - x||² > 0. This implies

||v - x||² > ||proj_{W^⊥} v||²

and it follows that

||v - x|| > ||proj_{W^⊥} v||.

Since ||proj_{W^⊥} v|| = ||v - proj_W v||, this theorem shows that the distance from proj_W v to v is less than the distance from any other vector in W to v. So proj_W v is the best approximation to v of all the vectors in W.

If v = [v_1 v_2 v_3 ⋯ v_m]^T and proj_W v = [w_1 w_2 w_3 ⋯ w_m]^T, then the square of the error in approximating v by proj_W v is given by

||v - proj_W v||² = ∑_{i=1}^{m} (v_i - w_i)².

So proj_W v minimizes this sum of squares over all vectors in W. As a result, we call proj_W v the least squares approximation to v.
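As a numerical sanity check of this best approximation property, the short sketch below (Python/numpy, reusing the vectors of Preview Activity 25.1, where proj_W v = [1 2 0]^T) compares the projection's distance to v with the distances from randomly chosen vectors in W; none of them should ever be closer:

    import numpy as np

    v = np.array([1.0, 2.0, 1.0])
    proj = np.array([1.0, 2.0, 0.0])    # proj_W v from Preview Activity 25.1
    best = np.linalg.norm(v - proj)     # 1.0
    rng = np.random.default_rng(0)
    for _ in range(1000):
        a, b = rng.uniform(-3, 3, size=2)
        x = np.array([a, b, 0.0])       # an arbitrary vector in W (the xy-plane)
        assert np.linalg.norm(v - x) >= best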

Activity 25.3.

Let B = {[1 0 1 0]^T, [0 1 0 1]^T, [1 1 1 1]^T} and let W = Span(B) in R^4. Find the best approximation in W to the vector v = [2 3 1 1]^T.

Subsection The Gram-Schmidt Process

We have seen that orthogonal bases make computations very convenient. However, until now we had no convenient way to construct orthogonal bases. The problem we address in this section is how to create an orthogonal basis from any basis.

Preview Activity 25.4.

Let W = Span{v_1, v_2, v_3} in R^4, where v_1 = [1 1 1 1]^T, v_2 = [1 4 4 1]^T, and v_3 = [2 1 1 0]^T. Our goal in this preview activity is to begin to understand how we can find an orthogonal set B = {w_1, w_2, w_3} in R^4 so that Span B = W. To begin, we could start by letting w_1 = v_1.

(a)

Now we want to find a vector in W that is orthogonal to w_1. Let W_1 = Span{w_1}. Explain why w_2 = proj_{W_1^⊥} v_2 is in W and is orthogonal to w_1. Then calculate the vector w_2.

(b)

Next we need to find a third vector w_3 that is in W and is orthogonal to both w_1 and w_2. Let W_2 = Span{w_1, w_2}. Explain why w_3 = proj_{W_2^⊥} v_3 is in W and is orthogonal to both w_1 and w_2. Then calculate the vector w_3.

(c)

Explain why the set {w_1, w_2, w_3} is an orthogonal basis for W.

Preview Activity 25.4 shows the first steps of the Gram-Schmidt process to construct an orthogonal basis from any basis of a subspace in R^n. To understand why the process works in general, let {v_1, v_2, …, v_m} be a basis for a subspace W of R^n. Let w_1 = v_1 and let W_1 = Span{w_1}. Since w_1 = v_1, we have that W_1 = Span{w_1} = Span{v_1}.

The vectors v_1 = w_1 and v_2 are possibly not orthogonal, but we know the projection of v_2 orthogonal to W_1 is orthogonal to w_1. Let

w_2 = proj_{W_1^⊥} v_2 = v_2 - ((v_2·w_1)/(w_1·w_1)) w_1.

Then {w_1, w_2} is an orthogonal set. Note that w_1 = v_1 ≠ 0, and the fact that v_2 ∉ W_1 implies that w_2 ≠ 0. So the set {w_1, w_2} is linearly independent, being a set of non-zero orthogonal vectors. Now the question is whether Span{w_1, w_2} = Span{v_1, v_2}. Note that w_2 is a linear combination of v_1 and v_2, so w_2 is in Span{v_1, v_2}. Since Span{w_1, w_2} is a 2-dimensional subspace of the 2-dimensional space Span{v_1, v_2}, it must be true that Span{w_1, w_2} = W_2 = Span{v_1, v_2}.

Now we take the next step, adding v3 into the mix. The vector

w_3 = proj_{W_2^⊥} v_3 = v_3 - ((v_3·w_1)/(w_1·w_1)) w_1 - ((v_3·w_2)/(w_2·w_2)) w_2

is orthogonal to both w_1 and w_2 and, by construction, w_3 is a linear combination of v_1, v_2, and v_3. So w_3 is in W_3 = Span{v_1, v_2, v_3}. The fact that v_3 ∉ W_2 implies that w_3 ≠ 0 and {w_1, w_2, w_3} is a linearly independent set. Since Span{w_1, w_2, w_3} is a 3-dimensional subspace of the 3-dimensional space Span{v_1, v_2, v_3}, we conclude that Span{w_1, w_2, w_3} = W_3 = Span{v_1, v_2, v_3}.

We continue inductively in this same manner. If we have constructed a set {w_1, w_2, w_3, …, w_{k-1}} of k-1 orthogonal vectors such that

Span{w_1, w_2, w_3, …, w_{k-1}} = Span{v_1, v_2, v_3, …, v_{k-1}},

then we let

w_k = proj_{W_{k-1}^⊥} v_k = v_k - ((v_k·w_1)/(w_1·w_1)) w_1 - ((v_k·w_2)/(w_2·w_2)) w_2 - ⋯ - ((v_k·w_{k-1})/(w_{k-1}·w_{k-1})) w_{k-1},

where

W_{k-1} = Span{w_1, w_2, w_3, …, w_{k-1}}.

We know that w_k is orthogonal to w_1, w_2, …, w_{k-1}. Since w_1, w_2, …, w_{k-1}, and v_k are all in W_k = Span{v_1, v_2, …, v_k}, we see that w_k is also in W_k. The fact that v_k ∉ W_{k-1} implies that w_k ≠ 0, so {w_1, w_2, …, w_k} is a linearly independent set. Then Span{w_1, w_2, w_3, …, w_k} is a k-dimensional subspace of the k-dimensional space W_k, so it follows that

Span{w_1, w_2, w_3, …, w_k} = W_k = Span{v_1, v_2, v_3, …, v_k}.

This process will end when we have an orthogonal set {w_1, w_2, w_3, …, w_m} with Span{w_1, w_2, w_3, …, w_m} = W.

We summarize the process in the following theorem.

Theorem 25.6 (The Gram-Schmidt Process). Let {v_1, v_2, …, v_m} be a basis for a subspace W of R^n. Define w_1 = v_1, and for each k between 2 and m let

w_k = v_k - ((v_k·w_1)/(w_1·w_1)) w_1 - ((v_k·w_2)/(w_2·w_2)) w_2 - ⋯ - ((v_k·w_{k-1})/(w_{k-1}·w_{k-1})) w_{k-1}.

Then {w_1, w_2, …, w_m} is an orthogonal basis for W, and Span{w_1, w_2, …, w_k} = Span{v_1, v_2, …, v_k} for each k between 1 and m.

The Gram-Schmidt process builds an orthogonal basis {w_1, w_2, w_3, …, w_m} for us from a given basis. To make an orthonormal basis {u_1, u_2, u_3, …, u_m}, all we need do is normalize each basis vector: that is, for each i, we let

u_i = w_i / ||w_i||.
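The process translates directly into code. The following Python/numpy sketch (a bare illustration of the algorithm above, with names of our choosing) applies Gram-Schmidt to the basis from Preview Activity 25.4:

    import numpy as np

    def gram_schmidt(vectors):
        # Return an orthogonal basis with the same span as `vectors`,
        # which are assumed to be linearly independent.
        ortho = []
        for v in vectors:
            w = v.astype(float)
            for u in ortho:
                w = w - (v @ u) / (u @ u) * u   # subtract the projection onto each earlier w
            ortho.append(w)
        return ortho

    v1 = np.array([1, 1, 1, 1])
    v2 = np.array([1, 4, 4, 1])
    v3 = np.array([2, 1, 1, 0])
    ws = gram_schmidt([v1, v2, v3])
    us = [w / np.linalg.norm(w) for w in ws]    # orthonormal version
    print(np.round(ws, 4))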

Activity 25.5.

Let W = Span{[1 1 0 0]^T, [0 1 1 0]^T, [0 0 1 1]^T}.

(a)

Use the Gram-Schmidt process to find an orthogonal basis for W.

Hint.

Order your vectors carefully to minimize computation.

(b)

Find the projection of the vector v = [0 1 1 1]^T onto W.

Subsection The QR Factorization of a Matrix

There are several different factorizations, or decompositions, of matrices where each matrix is written as a product of certain types of matrices: the LU decomposition using lower and upper triangular matrices (see Section 22), the EVD (EigenVector Decomposition) using eigenvectors and diagonal matrices (see Section 19), and, in this section, the QR decomposition. The QR decomposition is one of the most important tools for matrix computations. Jack Dongarra and Francis Sullivan [42] nominated the QR decomposition as one of the "10 algorithms with the greatest influence on the development and practice of science and engineering in the 20th century." The QR algorithm's importance stems from the fact that it is a "genuinely new contribution to the field of numerical analysis and not just a refinement of ideas given by Newton, Gauss, Hadamard, or Schur." [43] Most computer algebra systems (e.g., MATLAB) use a variant of the QR decomposition to compute eigenvalues. While there are other methods for approximating eigenvalues, the QR algorithm stands out. For example, the power method is useful when a matrix is large and contains a lot of zero entries (such matrices are called sparse). When most of the entries of a matrix are nonzero, the power method is not recommended. The better, more sophisticated, alternative in these situations is to use the QR decomposition. The QR factorization has applications to solving least squares problems and approximating eigenvalues of matrices.

Activity 25.6.

Let

A =
[ 1 0 ]
[ 0 0 ]
[ 0 2 ].

(a)

Find an orthonormal basis B for Col A. Let Q be the matrix whose columns are these orthonormal basis vectors.

(b)

Write the columns of A as linear combinations of the columns of Q. That is, if A = [a_1 a_2], find [a_1]_B and [a_2]_B. Let R = [[a_1]_B [a_2]_B].

(c)

Find the product QR and compare to A.

Activity 25.6 contains the main ideas to find the QR factorization of a matrix. Let

A = [a_1 a_2 a_3 ⋯ a_n]

be an m×n matrix with rank [44] n. We can use the Gram-Schmidt process to find an orthonormal basis {u_1, u_2, …, u_n} for Col A. Recall also that Span{u_1, u_2, …, u_k} = Span{a_1, a_2, …, a_k} for any k between 1 and n. Let

Q = [u_1 u_2 u_3 ⋯ u_n].

If k is between 1 and n, then a_k is in Span{u_1, u_2, …, u_k} and

a_k = r_{1k} u_1 + r_{2k} u_2 + ⋯ + r_{kk} u_k

for some scalars r_{1k}, r_{2k}, …, r_{kk}. Then

Q [r_{1k} r_{2k} ⋯ r_{kk} 0 ⋯ 0]^T = a_k.

If we let r_k = [r_{1k} r_{2k} ⋯ r_{kk} 0 ⋯ 0]^T for k from 1 to n, then

A = [Qr_1 Qr_2 Qr_3 ⋯ Qr_n].

This is the QR factorization of A into the product

A=QR

where the columns of Q form an orthonormal basis for Col A and

R = [r_1 r_2 r_3 ⋯ r_n] =
[ r_{11}  r_{12}  r_{13}  ⋯  r_{1,n-1}  r_{1n} ]
[ 0       r_{22}  r_{23}  ⋯  r_{2,n-1}  r_{2n} ]
[ 0       0       r_{33}  ⋯  r_{3,n-1}  r_{3n} ]
[ ⋮                       ⋱              ⋮     ]
[ 0       0       0       ⋯  0          r_{nn} ]

is an upper triangular matrix. Note that Q is an m×n matrix and R = [r_{ij}] is an n×n matrix with r_{ii} ≠ 0 for each i.
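The construction above can be carried out numerically. Here is one possible Python/numpy implementation (an illustration of the idea, not a production routine), tested on the 4×4 matrix from Example 25.7 below:

    import numpy as np

    def qr_gram_schmidt(A):
        # QR factorization of A (m x n with rank n) via Gram-Schmidt.
        # Returns Q (m x n, orthonormal columns) and R (n x n, upper triangular).
        A = A.astype(float)
        m, n = A.shape
        Q = np.zeros((m, n))
        R = np.zeros((n, n))
        for k in range(n):
            w = A[:, k].copy()
            for i in range(k):
                R[i, k] = Q[:, i] @ A[:, k]   # r_{ik} = a_k · u_i
                w -= R[i, k] * Q[:, i]
            R[k, k] = np.linalg.norm(w)
            Q[:, k] = w / R[k, k]
        return Q, R

    A = np.array([[1, 1, 1, 0], [0, 1, 2, 0], [0, 1, 0, 0], [0, 0, 1, 1]])
    Q, R = qr_gram_schmidt(A)
    print(np.round(Q @ R - A, 10))   # the zero matrix, up to rounding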

Activity 25.7.

The set {u_1, u_2}, where u_1 = [1 1 1 1]^T and u_2 = [1 -1 1 -1]^T, is an orthogonal basis for the column space of a 4×2 matrix A = [a_1 a_2]. Moreover, a_1 = 2u_1 and a_2 = 3u_1 + 4u_2. What is A?

The QR factorization provides a widely used algorithm (the QR algorithm) for approximating all of the eigenvalues of a matrix. The computer system MATLAB utilizes four versions of the QR algorithm to approximate the eigenvalues of real symmetric matrices, eigenvalues of real nonsymmetric matrices, eigenvalues of pairs of complex matrices, and singular values of general matrices.

The algorithm works as follows.

  • Start with an n×n matrix A. Let A_1 = A.

  • Find the QR factorization for A_1 and write it as A_1 = Q_1 R_1, where Q_1 is orthogonal and R_1 is upper triangular.

  • Let A_2 = Q_1^{-1} A_1 Q_1 = Q_1^T A_1 Q_1 = R_1 Q_1. Find the QR factorization of A_2 and write it as A_2 = Q_2 R_2.

  • Let A_3 = Q_2^{-1} A_2 Q_2 = Q_2^T A_2 Q_2 = R_2 Q_2. Find the QR factorization of A_3 and write it as A_3 = Q_3 R_3.

  • Continue in this manner to obtain a sequence {A_k} where A_k = Q_k R_k and A_{k+1} = R_k Q_k.

Note that A_{k+1} = Q_k^{-1} A_k Q_k, and so all of the matrices A_k are similar to each other and therefore all have the same eigenvalues. We won't go into the details, but it can be shown that if the eigenvalues of A are real and have distinct absolute values, then the sequence {A_k} converges to an upper triangular matrix with the eigenvalues of A as the diagonal entries. If some of the eigenvalues of A are complex, then the sequence {A_k} converges to a block upper triangular matrix, where the diagonal blocks are either 1×1 (approximating the real eigenvalues of A) or 2×2 (providing pairs of conjugate complex eigenvalues of A).
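A bare-bones version of this iteration is easy to write down. The sketch below (Python/numpy, with a small symmetric test matrix of our own choosing) shows the diagonal entries converging to the eigenvalues:

    import numpy as np

    def qr_algorithm(A, iterations=200):
        # Unshifted QR iteration: factor A_k = Q_k R_k, then set A_{k+1} = R_k Q_k.
        Ak = A.astype(float)
        for _ in range(iterations):
            Q, R = np.linalg.qr(Ak)
            Ak = R @ Q
        return Ak

    A = np.array([[2.0, 1.0], [1.0, 3.0]])
    print(np.round(np.diag(qr_algorithm(A)), 6))   # approximate eigenvalues
    print(np.round(np.linalg.eigvals(A), 6))       # check against a library routine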

Subsection Examples

What follows are worked examples that use the concepts from this section.

Example 25.7.

Let W be the subspace of R^4 spanned by w_1 = [1 0 0 0]^T, w_2 = [1 1 1 0]^T, and w_3 = [1 2 0 1]^T.

(a)

Use the Gram-Schmidt process to find an orthonormal basis for the subspace W of R^4 spanned by w_1 = [1 0 0 0]^T, w_2 = [1 1 1 0]^T, and w_3 = [1 2 0 1]^T.

Solution.

First note that w_1, w_2, and w_3 are linearly independent. We let v_1 = w_1, and the Gram-Schmidt process gives us

v_2 = w_2 - ((w_2·v_1)/(v_1·v_1)) v_1 = [1 1 1 0]^T - (1/1)[1 0 0 0]^T = [0 1 1 0]^T

and

v_3 = w_3 - ((w_3·v_1)/(v_1·v_1)) v_1 - ((w_3·v_2)/(v_2·v_2)) v_2 = [1 2 0 1]^T - (1/1)[1 0 0 0]^T - (2/2)[0 1 1 0]^T = [0 1 -1 1]^T.

So {v_1, v_2, v_3} is an orthogonal basis for W. An orthonormal basis is found by dividing each vector by its magnitude, so

{[1 0 0 0]^T, (1/√2)[0 1 1 0]^T, (1/√3)[0 1 -1 1]^T}

is an orthonormal basis for W.

(b)

Find a QR factorization of the matrix

A =
[ 1 1 1 0 ]
[ 0 1 2 0 ]
[ 0 1 0 0 ]
[ 0 0 1 1 ].

Solution.

Technology shows that the reduced row echelon form of A is I_4, so the columns of A are linearly independent and A has rank 4. From part (a) we have an orthogonal basis for the span of the first three columns of A. To find a fourth vector to add so that the span is Col A, we apply the Gram-Schmidt process one more time with w_4 = [0 0 0 1]^T:

v_4 = w_4 - ((w_4·v_1)/(v_1·v_1)) v_1 - ((w_4·v_2)/(v_2·v_2)) v_2 - ((w_4·v_3)/(v_3·v_3)) v_3 = [0 0 0 1]^T - (0/1)[1 0 0 0]^T - (0/2)[0 1 1 0]^T - (1/3)[0 1 -1 1]^T = (1/3)[0 -1 1 2]^T.

So {u_1, u_2, u_3, u_4} is an orthonormal basis for Col A, where

u_1 = [1 0 0 0]^T
u_2 = (√2/2)[0 1 1 0]^T
u_3 = (√3/3)[0 1 -1 1]^T
u_4 = (√6/6)[0 -1 1 2]^T.

This makes

Q =
[ 1  0     0      0     ]
[ 0  √2/2  √3/3   -√6/6 ]
[ 0  √2/2  -√3/3  √6/6  ]
[ 0  0     √3/3   √6/3  ].

To find the matrix R, we write the columns of A in terms of the basis vectors u_1, u_2, u_3, and u_4. Technology shows that the reduced row echelon form of [Q | A] is

[ 1 0 0 0 | 1 1  1  0    ]
[ 0 1 0 0 | 0 √2 √2 0    ]
[ 0 0 1 0 | 0 0  √3 √3/3 ]
[ 0 0 0 1 | 0 0  0  √6/3 ].

So

R =
[ 1 1  1  0    ]
[ 0 √2 √2 0    ]
[ 0 0  √3 √3/3 ]
[ 0 0  0  √6/3 ].

Example 25.8.

Let W = Span{[1 0 1 0]^T, [0 0 1 1]^T, [1 0 -1 0]^T}. Find the vector in W that is closest to the vector [1 1 1 1]^T. Provide a numeric measure of the error in approximating [1 1 1 1]^T by this vector.

Solution.

Our job is to find proj_W [1 1 1 1]^T. To do this, we need an orthogonal basis of W. Let v_1 = [1 0 1 0]^T, v_2 = [0 0 1 1]^T, and v_3 = [1 0 -1 0]^T. Technology shows that each column of the matrix [v_1 v_2 v_3] is a pivot column, so the set {v_1, v_2, v_3} is a basis for W. We apply the Gram-Schmidt process to this basis to obtain an orthogonal basis {w_1, w_2, w_3} of W. We start with w_1 = v_1, then

w_2 = v_2 - ((v_2·w_1)/(w_1·w_1)) w_1 = [0 0 1 1]^T - (1/2)[1 0 1 0]^T = (1/2)[-1 0 1 2]^T

and

w_3 = v_3 - ((v_3·w_1)/(w_1·w_1)) w_1 - ((v_3·w_2)/(w_2·w_2)) w_2 = [1 0 -1 0]^T - (0/2)[1 0 1 0]^T + (2/3)((1/2)[-1 0 1 2]^T) = (1/3)[2 0 -2 2]^T.

Then, letting z = [1 1 1 1]^T, we have

proj_W z = ((z·w_1)/(w_1·w_1)) w_1 + ((z·w_2)/(w_2·w_2)) w_2 + ((z·w_3)/(w_3·w_3)) w_3
= (2/2)[1 0 1 0]^T + (1/(3/2))((1/2)[-1 0 1 2]^T) + ((2/3)/(4/3))((1/3)[2 0 -2 2]^T)
= [1 0 1 0]^T + (1/3)[-1 0 1 2]^T + (1/6)[2 0 -2 2]^T
= [1 0 1 1]^T.

The norm of proj_{W^⊥} z = z - proj_W z tells us how well our projection [1 0 1 1]^T approximates z = [1 1 1 1]^T. Now

||proj_{W^⊥} z|| = ||[1 1 1 1]^T - [1 0 1 1]^T|| = ||[0 1 0 0]^T|| = 1,

so [1 0 1 1]^T is one unit away from z = [1 1 1 1]^T.
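For readers who want to check this example with technology, here is one possible numpy verification (variable names ours):

    import numpy as np

    v1 = np.array([1.0, 0.0, 1.0, 0.0])
    v2 = np.array([0.0, 0.0, 1.0, 1.0])
    v3 = np.array([1.0, 0.0, -1.0, 0.0])
    z = np.array([1.0, 1.0, 1.0, 1.0])
    # Orthogonal basis from the Gram-Schmidt computation above.
    w1 = v1
    w2 = v2 - (v2 @ w1) / (w1 @ w1) * w1
    w3 = v3 - (v3 @ w1) / (w1 @ w1) * w1 - (v3 @ w2) / (w2 @ w2) * w2
    proj = sum((z @ w) / (w @ w) * w for w in (w1, w2, w3))
    print(proj)                         # [1. 0. 1. 1.]
    print(np.linalg.norm(z - proj))     # 1.0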

Subsection Summary

  • The projection of the vector v in R^n onto W is the vector

    proj_W v = ((v·w_1)/(w_1·w_1)) w_1 + ((v·w_2)/(w_2·w_2)) w_2 + ⋯ + ((v·w_m)/(w_m·w_m)) w_m,

    where W is a subspace of R^n with orthogonal basis {w_1, w_2, …, w_m}. These projections are important in that proj_W v is the best approximation of the vector v by a vector in W in the least squares sense.

  • With W as above, the projection of v orthogonal to W is the vector

    proj_{W^⊥} v = v - proj_W v.

    The norm of proj_{W^⊥} v provides a measure of how well proj_W v approximates the vector v.

  • The Gram-Schmidt process produces an orthogonal basis from any basis. It works as follows. Let {v_1, v_2, …, v_m} be a basis for a subspace W of R^n. The set {w_1, w_2, w_3, …, w_m} defined by

    • w_1 = v_1,

    • w_2 = v_2 - ((v_2·w_1)/(w_1·w_1)) w_1,

    • w_3 = v_3 - ((v_3·w_1)/(w_1·w_1)) w_1 - ((v_3·w_2)/(w_2·w_2)) w_2,

    • w_m = v_m - ((v_m·w_1)/(w_1·w_1)) w_1 - ((v_m·w_2)/(w_2·w_2)) w_2 - ⋯ - ((v_m·w_{m-1})/(w_{m-1}·w_{m-1})) w_{m-1}

    is an orthogonal basis for W. Moreover,

    Span{w_1, w_2, w_3, …, w_k} = Span{v_1, v_2, v_3, …, v_k}

    for each k between 1 and m.

  • The QR factorization has applications to solving least squares problems and approximating eigenvalues of matrices. The QR factorization writes an m×n matrix with rank n as a product A = QR, where the columns of Q form an orthonormal basis for Col A and

    R=[r1 | r2 | r3 |  | rn]=[r11r12r13r1n1r1n0r22r23r2n1r2n00r33r3n1r3n0000rnn]

    is an upper triangular matrix.

Exercises

1.

Let W = Span{w_1, w_2} in R^3, with w_1 = [1 2 1]^T and w_2 = [1 1 1]^T, and v = [1 0 0]^T.

(a)

Find proj_W v and proj_{W^⊥} v.

(b)

Find the vector in W that is closest to v. How close is this vector to v?

2.

Let B = {[1 0 1 0]^T, [0 1 0 1]^T, [1 1 1 1]^T} and let W = Span(B) in R^4. Find the best approximation in W to the vector v = [2 3 1 1]^T and give a numeric estimate of how good this approximation is.

3.

In this exercise we determine the least-squares line (the line of best fit in the least squares sense) to a set of k data points (x_1,y_1), (x_2,y_2), …, (x_k,y_k) in the plane. In this case, we want to fit a line of the form f(x) = mx + b to the data. If the data were to lie on a line, then we would have a solution to the system

m x_1 + b = y_1
m x_2 + b = y_2
⋮
m x_k + b = y_k

This system can be written as

m w_1 + b w_2 = y,

where w_1 = [x_1 x_2 ⋯ x_k]^T, w_2 = [1 1 ⋯ 1]^T, and y = [y_1 y_2 ⋯ y_k]^T. If the data does not lie on a line, then the system won't have a solution. Instead, we want to minimize the square of the distance between y and a vector of the form m w_1 + b w_2. That is, minimize

(25.1)   ||y - (m w_1 + b w_2)||².

Rephrasing this in terms of projections, we are looking for the vector in W = Span{w_1, w_2} that is closest to y. In other words, the values of m and b will occur as the weights when we write proj_W y as a linear combination of w_1 and w_2. The one wrinkle in this problem is that we need an orthogonal basis for W to find this projection. Find the least squares line for the data points (1,2), (2,4), and (3,5) in R^2.
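As a sketch of how this computation might look in code (Python/numpy, names ours; the extraction of m and b follows the substitution u_2 = w_2 - k w_1 from the Gram-Schmidt step):

    import numpy as np

    xs = np.array([1.0, 2.0, 3.0])
    ys = np.array([2.0, 4.0, 5.0])
    w1 = xs
    w2 = np.ones_like(xs)
    # Gram-Schmidt: replace w2 by its component orthogonal to w1.
    k = (w2 @ w1) / (w1 @ w1)
    u2 = w2 - k * w1
    # proj_W y = c1*w1 + c2*u2, computed with the orthogonal basis {w1, u2}.
    c1 = (ys @ w1) / (w1 @ w1)
    c2 = (ys @ u2) / (u2 @ u2)
    # Rewrite c1*w1 + c2*(w2 - k*w1) = (c1 - c2*k)*w1 + c2*w2 = m*w1 + b*w2.
    m = c1 - c2 * k
    b = c2
    print(m, b)   # slope and intercept of the least squares line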

4.

Each set S is linearly independent. Use the Gram-Schmidt process to create an orthogonal set of vectors with the same span as S. Then find an orthonormal basis for the same span.

(a)

S = {[1 1 1]^T, [5 1 2]^T} in R^3

(b)

S = {[1 0 2]^T, [1 1 1]^T, [1 1 19]^T} in R^3

(c)

S = {[1 0 1 0 1 0 1]^T, [1 2 3 0 1 0 1]^T, [1 0 4 1 2 0 1]^T, [1 1 1 1 1 1 1]^T} in R^7

5.

Let S = {v_1, v_2, v_3} be a set of linearly dependent vectors in R^n for some integer n ≥ 3. What is the result if the Gram-Schmidt process is applied to the set S? Explain.

6.

A fellow student wants to find a QR factorization for a 3×4 matrix. What would you tell this student and why?

7.

Find the QR factorizations of the given matrices.

(c)

[ 0 1 0 ]
[ 0 0 0 ]
[ 3 0 4 ]
[ 0 0 5 ].

8.

Find an orthonormal basis {u_1, u_2, u_3} of R^3 such that Span{u_1, u_2} = Span{[1 2 3]^T, [1 1 0]^T}.

9.

(a)

Let

A = [a_1 a_2] =
[ 2 1 ]
[ 2 6 ]
[ 2 6 ]
[ 2 1 ].

A QR factorization of A has Q = [u_1 u_2] with u_1 = (1/2)[1 1 1 1]^T and u_2 = (1/2)[-1 1 1 -1]^T, and

R = [r_{ij}] =
[ 4 7 ]
[ 0 5 ].

Calculate the dot products a_i·u_j for i and j between 1 and 2. How are these dot products connected to the entries of R?

(b)

Explain why the result of part (a) is true in general. That is, if A = [a_1 a_2 ⋯ a_n] has a QR factorization with Q = [u_1 u_2 ⋯ u_n] and R = [r_{ij}], then r_{ij} = a_j·u_i.

10.

Upper triangular matrices play an important role in the QR decomposition. In this exercise we examine two properties of upper triangular matrices.

(a)

Show that if R and S are upper triangular n×n matrices with positive diagonal entries, then RS is also an upper triangular n×n matrix with positive diagonal entries.

(b)

Show that if R is an invertible upper triangular matrix with positive diagonal entries, then R^{-1} is also an upper triangular matrix with positive diagonal entries.

11.

In this exercise we demonstrate that the QR decomposition of an m×n matrix with linearly independent columns is unique.

(a)

Suppose that A=QR, where Q is an m×n matrix with orthogonal columns and R is an n×n upper triangular matrix. Show that R is invertible.

Hint.

What can we say about x if Rx=0?

(b)

Show that the only n×n orthogonal upper triangular matrix with positive diagonal entries is the identity matrix In.

Hint.

Let A = [a_1 a_2 ⋯ a_n] = [a_{ij}] be an n×n orthogonal upper triangular matrix with positive diagonal entries. What does that tell us about a_i·a_j?

(c)

Show that if Q1 and Q2 are m×n matrices with orthogonal columns, and S is a matrix such that Q1=Q2S, then S is an orthogonal matrix.

Hint.

Write Q_1^T Q_1 in terms of Q_2 and S.

(d)

Suppose that Q1 and Q2 are m×n matrices with orthogonal columns and R1 and R2 are n×n upper triangular matrices such that

A=Q1R1=Q2R2.

Show that Q1=Q2 and R1=R2. Conclude that the QR factorization of a matrix is unique.

Hint.

Use the previous parts of this problem along with the results of Exercise 10.

12.

Label each of the following statements as True or False. Provide justification for your response. Throughout, let V be a vector space.

(a) True/False.

If {w_1, w_2} is a basis for a subspace W of R^3, then the vector ((v·w_1)/(w_1·w_1)) w_1 + ((v·w_2)/(w_2·w_2)) w_2 is the vector in W closest to v.

(b) True/False.

If W is a subspace of R^n, then the vector v - proj_W v is orthogonal to every vector in W.

(c) True/False.

If u1, u2, u3 are vectors in Rn, then the Gram-Schmidt process constructs an orthogonal set of vectors {v1,v2,v3} with the same span as {u1,u2,u3}.

(d) True/False.

Any set {u1,u2,,uk} of orthogonal vectors in Rn can be extended to an orthogonal basis of V.

(e) True/False.

If A is an n×n matrix with AA^T = I_n, then the rows of A form an orthogonal set.

(f) True/False.

Every nontrivial subspace of Rn has an orthogonal basis.

(g) True/False.

If W is a subspace of R^n satisfying W^⊥ = {0}, then W = R^n.

Subsection Project: MIMO Systems and Householder Transformations

In a simplified model, a MIMO system will have k transmitting antennas and m receiving antennas. We record a transmitted symbol in a vector x = [x_1 x_2 ⋯ x_k]^T (one x_i for each transmitting antenna) and the received symbol as a vector y = [y_1 y_2 ⋯ y_m]^T (one y_j for each receiving antenna).

Between each transmit antenna and each receiver antenna is a fading channel. The MIMO system is the collection of fading channels. If we let h_{ij} be the fading between the jth transmitter and the ith receiver, then we can represent this multipath fading as an m×k matrix H = [h_{ij}]. We assume that there is also some noise in the received signal that we represent by n_j (for the jth receiving antenna). Our MIMO system is then represented as the matrix-vector equation

y=Hx+n,

where n = [n_1 n_2 ⋯ n_m]^T. The goal is to reproduce the original signal x from the received signal y. This is where the QR decomposition comes into play.

Project Activity 25.8.

To see why and how the QR decomposition is used in MIMO systems, we begin with an example. Assume that we have two transmitters and three receivers, and suppose

H =
[ 2 1 ]
[ 2 1 ]
[ 1 5 ].

From this point, to simplify the situation, we assume that noise is negligible and consider the system y = Hx. Assume that the received signal is y = [2 1 4]^T.

(a)

Show that the system Hx=y is inconsistent.

(b)

When we receive a signal, we need to interpret it, and that is difficult if we cannot find a solution. One reason that the system might not have a solution is that the elements in x have to come from a specified alphabet, and so we are looking for solutions that are in that alphabet, and there may be no direct solution in that alphabet. As a result, we generally have to find a "best" solution in the alphabet space. That can be done with the QR decomposition. Assume that H has a QR decomposition H = QR with Q having orthonormal columns and R an upper triangular matrix. Explain how the equation y = Hx can be transformed into

(25.2)   Q^T y = R x.
(c)

Return to our example of the matrix H above. Assume that H = QR, where

Q =
[ 2/3  -√2/6 ]
[ 2/3  -√2/6 ]
[ 1/3  2√2/3 ]

and

R =
[ 3  3   ]
[ 0  3√2 ].

Use (25.2) to find x if y = [2 1 4]^T.

In Project Activity 25.8 we saw that a QR decomposition allows us to replace the system y = Hx with an upper triangular system Q^T y = Rx that has a solution. This solution is what is called a least squares solution (which we will discuss in more detail in a later section). The key to all of this is finding an efficient way to calculate a QR decomposition. This is the focus of the remainder of this project.
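To illustrate, here is one possible numpy sketch of this recovery step for the example above (numpy's built-in QR may differ from the hand computation by signs in Q and R, which cancel when solving):

    import numpy as np

    H = np.array([[2.0, 1.0], [2.0, 1.0], [1.0, 5.0]])
    y = np.array([2.0, 1.0, 4.0])
    Q, R = np.linalg.qr(H)            # thin QR: Q is 3x2, R is 2x2 upper triangular
    x = np.linalg.solve(R, Q.T @ y)   # the least squares solution of Hx = y
    print(x)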

Subsection Householder Transformations and the QR Decomposition

There are three main ways to compute a QR decomposition.

  • The method discussed in this section uses Gram-Schmidt orthogonalization. This method is not recommended for numerical use as it is inherently numerically unstable. Under certain circumstances, cancellation can destroy the accuracy of the computed vectors and the resulting vectors may not be orthogonal.

  • The method that we will discuss in this project uses Householder transformations (developed by Alston S. Householder). In Gaussian elimination, we can zero out entries below the diagonal by multiplying by elementary matrices to perform row operations. Householder transformations do something similar, allowing us to replace columns with columns of mostly zeros, which then allows for more efficient computations.

  • Givens (or Jacobi) rotations (used by W. Givens and originally invented by Jacobi for use in solving the symmetric eigenvalue problem) are another method that allows us to selectively introduce zeros into a matrix. We will focus on Householder transformations in this project.

The first step in a QR decomposition using Householder matrices is to transform the given matrix into one that is upper triangular. To get to this form, we use a Householder transformation. A Householder transformation (or matrix) is a matrix of the form

H = I - 2 (v v^T)/(||v||²),

where v is a vector in Rn and I is the n×n identity matrix.

Project Activity 25.9.

We will discover some properties of Householder transformations in this activity.

(a)

Let x be any vector and let y be a unit vector. In this part of the exercise we show that, with a suitable choice of v, the Householder transformation H transforms x into a vector of the same length parallel to y. That is,

Hx = -σy

for some scalar σ. A similar argument shows that choosing v = x - σy instead gives Hx = σy. Letting y = e_1 gives us Lemma 25.9 below.

Hint.

Let σ = ||x||, and let v = x + σy. Apply H to x, factor out x + σy, and ultimately show that Hx = -σy.

(b)

The Householder matrix H has two other important properties. Show that H is symmetric and orthogonal.

Lemma 25.9. Let x be a nonzero vector in R^n, let σ = ||x||, and let v = x + σe_1. Then the Householder transformation H = I - 2(vv^T)/(||v||²) satisfies Hx = -σe_1. (Taking v = x - σe_1 instead gives Hx = σe_1.)

There are many advantages to using Householder matrices. One is that instead of having to store all entries of the matrix H = I - 2(vv^T)/(||v||²), all we need to store are the entries of v. Another advantage of Householder transformations is that they can be used to efficiently calculate QR decompositions. The next project activity shows how to use Lemma 25.9 and Householder transformations to compute a QR decomposition through an example.
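The following Python sketch (an illustrative implementation of our own, not an optimized library routine) assembles a QR decomposition exactly this way, choosing the sign of σ to match the first entry of the working column, as described in the activity below:

    import numpy as np

    def householder_qr(A):
        # QR via Householder reflections: returns orthogonal Q (m x m) and
        # upper triangular R (m x n) with A = QR.
        A = A.astype(float)
        m, n = A.shape
        R = A.copy()
        Q = np.eye(m)
        for k in range(min(m, n)):
            x = R[k:, k]
            sigma = np.linalg.norm(x)
            if sigma == 0:
                continue
            v = x.copy()
            v[0] += np.copysign(sigma, x[0])   # sign chosen to avoid cancellation
            Hk = np.eye(m - k) - 2.0 * np.outer(v, v) / (v @ v)
            H = np.eye(m)
            H[k:, k:] = Hk                     # block form, as in the activity below
            R = H @ R
            Q = Q @ H                          # each H is symmetric and orthogonal
        return Q, R

    A = np.array([[1.0, 0.0, 1.0], [7.0, 7.0, 8.0], [1.0, 2.0, 1.0],
                  [7.0, 7.0, 6.0], [0.0, 1.0, 0.0]])
    Q, R = householder_qr(A)
    print(np.round(Q @ R - A, 8))   # the zero matrix
    print(np.round(Q.T @ Q, 8))     # the identity matrix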

Project Activity 25.10.

Assume that we have five receiving antennas and three transmitting antennas, and that

A =
[ 1.0 0.0 1.0 ]
[ 7.0 7.0 8.0 ]
[ 1.0 2.0 1.0 ]
[ 7.0 7.0 6.0 ]
[ 0.0 1.0 0.0 ].

(We use A now instead of H so as to avoid confusion with Householder matrices.) The calculations in this activity can get rather messy, so use appropriate technology and round entries to four decimal places to the right of the decimal.

(a)

Let x be the first column of A. Identify an appropriate Householder transformation H_1 so that H_1 x is a constant multiple of [1 0 0 0 0]^T. Determine the matrix A_1, where A_1 = H_1 A. (Special note: when deciding which of x ± σe_1 to use to create H_1, it is best to use the one whose sign is the same as x_1, the first entry of x. We won't go into the details, but this helps prevent problems due to cancellation.)

(b)

Next we consider just the bottom right 4×2 portion Â_2 of the matrix A_1 found in part (a).

(i)

Repeat the process on Â_2 to find a Householder matrix Ĥ_2 that will make the first column of Â_2 have all zeros below its first entry.

(ii)

Let

H_2 =
[ I_1  0   ]
[ 0    Ĥ_2 ].

Explain what the matrix A_2 = H_2 H_1 A is.

(c)

As a final step, we consider just the bottom right 3×1 portion Â_3 of the matrix A_2 and repeat the process.

(i)

Find a matrix Ĥ_3 that produces zeros below the first entry.

(ii)

Let

H_3 =
[ I_2  0   ]
[ 0    Ĥ_3 ].

What matrix is H_3 H_2 H_1 A? Why?

(d)

Explain why Q = H_1 H_2 H_3 is an orthogonal matrix. Then find an upper triangular matrix R such that A = QR.

Project Activity 25.10 illustrates the general method for finding a QR decomposition of a matrix A by using Householder transformations to reduce A to an upper triangular matrix R. The product of the Householder matrices is then the orthogonal matrix Q. One final note: You may have noticed that the matrix Q you found in this project was a square 5×5 matrix and R was a 5×3 matrix, and so they have different sizes than the QR decomposition in this section. We could use the Gram-Schmidt process to obtain these larger matrices by extending the columns of a 5×3 matrix Q to form an orthonormal basis for R^5. However, if we want a QR decomposition for A in which Q is 5×3 and R is 3×3 from the work we did in Project Activity 25.10, we can simply delete the last two columns of the matrix Q and the last two rows of the matrix R we calculated. This smaller QR decomposition is often referred to as a thin QR decomposition.
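In numpy, for example, both versions are available through the mode argument of numpy.linalg.qr; a quick sketch with the matrix from Project Activity 25.10:

    import numpy as np

    A = np.array([[1.0, 0.0, 1.0], [7.0, 7.0, 8.0], [1.0, 2.0, 1.0],
                  [7.0, 7.0, 6.0], [0.0, 1.0, 0.0]])
    Qf, Rf = np.linalg.qr(A, mode='complete')   # full QR: Q is 5x5, R is 5x3
    Qt, Rt = np.linalg.qr(A, mode='reduced')    # thin QR: Q is 5x3, R is 3x3
    print(Qf.shape, Rf.shape, Qt.shape, Rt.shape)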

[42] Jack Dongarra and Francis Sullivan. Introduction to the top 10 algorithms. Computing in Science and Engineering, 2:22-23, 2000.
[43] Beresford N. Parlett. The QR algorithm. Computing in Science and Engineering, 2:38-42, 2000.
[44] Recall that the rank of a matrix A is the dimension of the column space of A.