By the end of this section, you should be able to give precise and thorough answers to the questions listed below. You may want to keep these questions in mind to focus your thoughts as you complete the section.
What is a projection of a vector onto a subspace and why are such projections important?
What is a projection of a vector orthogonal to a subspace and why are such orthogonal projections important?
What is the Gram-Schmidt process and why is it useful?
What is the QR-factorization of a matrix and why is it useful?
MIMO (Multiple Input Multiple Output) systems are widely used to increase data transmission rates and improve the quality of wireless communication systems. MIMO is one of several forms of smart antenna technology, and it makes use of multiple antennas at the transmitter to send signals and multiple antennas at the receiver to accept the signals. MIMO systems transmit the same data on multiple streams, which introduces redundancy into the system. MIMO systems can also utilize bounced and reflected signals to improve signal strength, which is very useful in urban environments in the presence of large buildings and the absence of direct line-of-sight transmission. MIMO systems can transmit several information streams in parallel (known as spatial multiplexing), allowing for increased data transmission rates. Finally, the presence of multiple receiver antennas allows for greater reliability in the system.
In a MIMO system, every transmitter antenna sends a signal to every receiver antenna, as illustrated in Figure 25.1. In each of these transmissions the signal can be disturbed in some way. Two types of disturbances are fading and noise. Fading is due to time variations in the received signal, which can occur because of atmospheric conditions or obstacles along the path that vary with time. In a MIMO system we have multipath fading for the different paths the signal takes from different antennas to receivers, which causes fluctuations in the amplitude, phase, and angle of arrival of the received signal. Noise is an unwanted signal that interferes with the actual signal. Both fading and noise result in a received signal that is different than the transmitted signal, and the problem is how to recover the original signal from the received signal. The QR decomposition is used in this process, and different methods for computing a QR decomposition have been introduced to improve the efficiency of MIMO systems. We will discuss MIMO systems in more detail later in this section, along with how Householder transformations can be used to find QR decompositions.
In many situations (least squares approximations, for example) we want to find a vector $\mathbf{w}$ in a subspace $W$ of $\mathbb{R}^n$ that is the "best" approximation to a vector $\mathbf{v}$ not in the subspace. A natural measure of "best" is to find a vector $\mathbf{w}$ in $W$, if one exists, such that the distance from $\mathbf{w}$ to $\mathbf{v}$ is a minimum. This means we want to find $\mathbf{w}$ so that the quantity
$$\|\mathbf{v} - \mathbf{w}\|$$
is as small as possible over all vectors $\mathbf{w}$ in $W$. To do this, we will need to find a suitable projection of the vector $\mathbf{v}$ onto the subspace $W$. We have already done this in the case that $W = \text{Span}\{\mathbf{u}\}$ is the span of a single vector $\mathbf{u}$: we can project the vector $\mathbf{v}$ in the direction of $\mathbf{u}$. Our goal is to generalize this idea to project a vector onto an entire subspace in a useful way.
Let $\{\mathbf{w}_1, \mathbf{w}_2\}$ be a basis for a subspace $W$ of $\mathbb{R}^3$, where $\mathbf{w}_1$ and $\mathbf{w}_2$ are unit vectors with $\mathbf{w}_1 \cdot \mathbf{w}_2 = 0$. Note that $\{\mathbf{w}_1, \mathbf{w}_2\}$ is an orthonormal basis for $W$. Let $\mathbf{v}$ be a vector in $\mathbb{R}^3$ that is not in $W$. Note that $W$ is a plane through the origin in $\mathbb{R}^3$ and that $\mathbf{v}$ is not in $W$, as illustrated in Figure 25.2.
Part (4) shows that neither $\text{proj}_{\mathbf{w}_1} \mathbf{v}$ nor $\text{proj}_{\mathbf{w}_2} \mathbf{v}$ is the vector in $W$ that is closest to $\mathbf{v}$. We should probably expect this, since neither projection uses the fact that the other vector might contribute to the closest vector. Let us instead consider the sum $\mathbf{w} = \text{proj}_{\mathbf{w}_1} \mathbf{v} + \text{proj}_{\mathbf{w}_2} \mathbf{v}$. Calculate the components of this vector and determine the distance between $\mathbf{v}$ and $\mathbf{w}$. Which of the three vectors $\text{proj}_{\mathbf{w}_1} \mathbf{v}$, $\text{proj}_{\mathbf{w}_2} \mathbf{v}$, and $\mathbf{w}$ in $W$ is closest to $\mathbf{v}$?
A picture of $W$ and $\mathbf{v}$ is shown in Figure 25.2. Draw in $\text{proj}_{\mathbf{w}_1} \mathbf{v}$ and $\text{proj}_{\mathbf{w}_2} \mathbf{v}$. Draw a picture of the vector $\mathbf{w}$ in $W$ that appears to be closest to $\mathbf{v}$. Does this vector seem to have any relationship to $\text{proj}_{\mathbf{w}_1} \mathbf{v}$ or $\text{proj}_{\mathbf{w}_2} \mathbf{v}$? If yes, what relationship do you see?
Preview Activity 25.1 gives an indication of how we can project a vector $\mathbf{v}$ in $\mathbb{R}^n$ onto a subspace $W$ of $\mathbb{R}^n$. If we have an orthogonal basis for $W$, we can just add the projections of $\mathbf{v}$ onto each basis vector. The resulting vector is called the projection of $\mathbf{v}$ onto the subspace $W$. As we did with projections onto vectors, we can also define the projection of $\mathbf{v}$ orthogonal to $W$. Note that to make this all work out properly, we will see that we need an orthogonal basis for $W$.
Let $W = \text{Span}\{\mathbf{w}_1, \mathbf{w}_2\}$ in $\mathbb{R}^3$, with $\mathbf{w}_1$, $\mathbf{w}_2$, and $\mathbf{v}$ as in Preview Activity 25.1. Recall that $\text{proj}_W \mathbf{v} = \text{proj}_{\mathbf{w}_1} \mathbf{v} + \text{proj}_{\mathbf{w}_2} \mathbf{v}$. Find the projection of $\mathbf{v}$ orthogonal to $W$ and show directly that $\text{proj}_{\perp W} \mathbf{v}$ is orthogonal to the basis vectors for $W$ (and hence to every vector in $W$).
Activity 25.2 indicates that the vector $\text{proj}_{\perp W} \mathbf{v}$ is in fact orthogonal to every vector in $W$. To see that this is true in general, let $\{\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_m\}$ be an orthogonal basis for a subspace $W$ of $\mathbb{R}^n$ and let $\mathbf{v}$ be a vector in $\mathbb{R}^n$. Let
$$\mathbf{w} = \text{proj}_W \mathbf{v} = \frac{\mathbf{v} \cdot \mathbf{w}_1}{\mathbf{w}_1 \cdot \mathbf{w}_1}\mathbf{w}_1 + \frac{\mathbf{v} \cdot \mathbf{w}_2}{\mathbf{w}_2 \cdot \mathbf{w}_2}\mathbf{w}_2 + \cdots + \frac{\mathbf{v} \cdot \mathbf{w}_m}{\mathbf{w}_m \cdot \mathbf{w}_m}\mathbf{w}_m.$$
Then $\mathbf{v} - \mathbf{w}$ is the projection of $\mathbf{v}$ orthogonal to $W$. We will show that $\mathbf{v} - \mathbf{w}$ is orthogonal to every basis vector for $W$. Since $\{\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_m\}$ is an orthogonal basis for $W$, we know that $\mathbf{w}_i \cdot \mathbf{w}_j = 0$ for $i \neq j$. So if $k$ is between 1 and $m$, then
$$(\mathbf{v} - \mathbf{w}) \cdot \mathbf{w}_k = \mathbf{v} \cdot \mathbf{w}_k - \frac{\mathbf{v} \cdot \mathbf{w}_k}{\mathbf{w}_k \cdot \mathbf{w}_k}(\mathbf{w}_k \cdot \mathbf{w}_k) = 0.$$
We have seen that $\text{proj}_{\perp W} \mathbf{v}$ is orthogonal to every vector in $W$, which suggests that $\text{proj}_W \mathbf{v}$ is in fact the vector in $W$ that is closest to $\mathbf{v}$. We now verify this fact and conclude that $\text{proj}_W \mathbf{v}$ is the vector in $W$ closest to $\mathbf{v}$ and is therefore the best approximation of $\mathbf{v}$ by a vector in $W$.
Let $\{\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_m\}$ be an orthogonal basis for a subspace $W$ of $\mathbb{R}^n$ and let $\mathbf{v}$ be a vector in $\mathbb{R}^n$. Let $\mathbf{x}$ be a vector in $W$. The vector $\text{proj}_W \mathbf{v} - \mathbf{x}$ is in $W$, so it is orthogonal to $\mathbf{v} - \text{proj}_W \mathbf{v} = \text{proj}_{\perp W} \mathbf{v}$. Thus, the dotted triangle whose vertices are the tips of the vectors $\mathbf{v}$, $\mathbf{x}$, and $\text{proj}_W \mathbf{v}$ in Figure 25.5 is a right triangle, and the Pythagorean theorem gives
$$\|\mathbf{v} - \mathbf{x}\|^2 = \|\mathbf{v} - \text{proj}_W \mathbf{v}\|^2 + \|\text{proj}_W \mathbf{v} - \mathbf{x}\|^2 \geq \|\mathbf{v} - \text{proj}_W \mathbf{v}\|^2.$$
This theorem shows that the distance from $\text{proj}_W \mathbf{v}$ to $\mathbf{v}$ is less than the distance from any other vector in $W$ to $\mathbf{v}$. So $\text{proj}_W \mathbf{v}$ is the best approximation to $\mathbf{v}$ of all the vectors in $W$.
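To see the best approximation property numerically, here is a short sketch in Python with NumPy (our own illustration: the orthogonal basis vectors and the vector $\mathbf{v}$ below are made-up values, not data from the text).

```python
import numpy as np

# An orthogonal (not yet normalized) basis for a subspace W of R^3
# and a vector v not in W. These specific vectors are illustrative.
w1 = np.array([1.0, 1.0, 0.0])
w2 = np.array([1.0, -1.0, 0.0])
v = np.array([2.0, 3.0, 4.0])

# proj_W v = sum over basis vectors of ((v . w_i)/(w_i . w_i)) w_i
proj = (v @ w1) / (w1 @ w1) * w1 + (v @ w2) / (w2 @ w2) * w2
perp = v - proj  # projection of v orthogonal to W

print(proj)                  # [2. 3. 0.]
print(perp)                  # [0. 0. 4.]
print(perp @ w1, perp @ w2)  # both 0: the residual is orthogonal to W

# Any other vector x in W is at least as far from v as proj_W v is.
x = 1.5 * w1 - 0.5 * w2
assert np.linalg.norm(v - x) >= np.linalg.norm(v - proj)
```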
We have seen that orthogonal bases make computations very convenient. However, until now we had no convenient way to construct orthogonal bases. The problem we address in this section is how to create an orthogonal basis from any basis.
Let $W = \text{Span}\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$, where $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$ are the given basis vectors. Our goal in this preview activity is to begin to understand how we can find an orthogonal set $\{\mathbf{w}_1, \mathbf{w}_2, \mathbf{w}_3\}$ in $W$ so that $W = \text{Span}\{\mathbf{w}_1, \mathbf{w}_2, \mathbf{w}_3\}$. To begin, we could start by letting $\mathbf{w}_1 = \mathbf{v}_1$.
Next we need to find a third vector $\mathbf{w}_3$ that is in $W$ and is orthogonal to both $\mathbf{w}_1$ and $\mathbf{w}_2$. Let $W_2 = \text{Span}\{\mathbf{w}_1, \mathbf{w}_2\}$. Explain why $\mathbf{w}_3 = \mathbf{v}_3 - \text{proj}_{W_2} \mathbf{v}_3$ is in $W$ and is orthogonal to both $\mathbf{w}_1$ and $\mathbf{w}_2$. Then calculate the vector $\mathbf{w}_3$.
Preview Activity 25.4 shows the first steps of the Gram-Schmidt process used to construct an orthogonal basis from any basis of a subspace of $\mathbb{R}^n$. To understand why the process works in general, let $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_m\}$ be a basis for a subspace $W$ of $\mathbb{R}^n$. Let $\mathbf{w}_1 = \mathbf{v}_1$ and let $W_1 = \text{Span}\{\mathbf{w}_1\}$. Since $\mathbf{w}_1 = \mathbf{v}_1$, we have that $W_1 = \text{Span}\{\mathbf{w}_1\} = \text{Span}\{\mathbf{v}_1\}$.
Now let
$$\mathbf{w}_2 = \mathbf{v}_2 - \text{proj}_{W_1}\mathbf{v}_2 = \mathbf{v}_2 - \frac{\mathbf{v}_2 \cdot \mathbf{w}_1}{\mathbf{w}_1 \cdot \mathbf{w}_1}\mathbf{w}_1.$$
Then $\{\mathbf{w}_1, \mathbf{w}_2\}$ is an orthogonal set. Note that $\mathbf{w}_1 = \mathbf{v}_1 \neq \mathbf{0}$, and the fact that $\mathbf{v}_2 \notin W_1$ implies that $\mathbf{w}_2 \neq \mathbf{0}$. So the set $\{\mathbf{w}_1, \mathbf{w}_2\}$ is linearly independent, being a set of nonzero orthogonal vectors. Now the question is whether $\text{Span}\{\mathbf{w}_1, \mathbf{w}_2\} = \text{Span}\{\mathbf{v}_1, \mathbf{v}_2\}$. Note that $\mathbf{w}_2$ is a linear combination of $\mathbf{v}_1$ and $\mathbf{v}_2$, so $\mathbf{w}_2$ is in $\text{Span}\{\mathbf{v}_1, \mathbf{v}_2\}$. Since $\text{Span}\{\mathbf{w}_1, \mathbf{w}_2\}$ is a 2-dimensional subspace of the 2-dimensional space $\text{Span}\{\mathbf{v}_1, \mathbf{v}_2\}$, it must be true that $\text{Span}\{\mathbf{w}_1, \mathbf{w}_2\} = \text{Span}\{\mathbf{v}_1, \mathbf{v}_2\}$.
Next let $W_2 = \text{Span}\{\mathbf{w}_1, \mathbf{w}_2\}$ and define $\mathbf{w}_3 = \mathbf{v}_3 - \text{proj}_{W_2}\mathbf{v}_3$. Then $\mathbf{w}_3$ is orthogonal to both $\mathbf{w}_1$ and $\mathbf{w}_2$ and, by construction, $\mathbf{w}_3$ is a linear combination of $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$. So $\mathbf{w}_3$ is in $\text{Span}\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$. The fact that $\mathbf{v}_3 \notin W_2$ implies that $\mathbf{w}_3 \neq \mathbf{0}$ and that $\{\mathbf{w}_1, \mathbf{w}_2, \mathbf{w}_3\}$ is a linearly independent set. Since $\text{Span}\{\mathbf{w}_1, \mathbf{w}_2, \mathbf{w}_3\}$ is a 3-dimensional subspace of the 3-dimensional space $\text{Span}\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$, we conclude that $\text{Span}\{\mathbf{w}_1, \mathbf{w}_2, \mathbf{w}_3\} = \text{Span}\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$.
In general, once we have constructed $\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_{k-1}$, we let $W_{k-1} = \text{Span}\{\mathbf{w}_1, \ldots, \mathbf{w}_{k-1}\}$ and define $\mathbf{w}_k = \mathbf{v}_k - \text{proj}_{W_{k-1}}\mathbf{v}_k$. We know that $\mathbf{w}_k$ is orthogonal to $\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_{k-1}$. Since $\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_{k-1}$ and $\mathbf{v}_k$ are all in $\text{Span}\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k\}$, we see that $\mathbf{w}_k$ is also in $\text{Span}\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k\}$. The fact that $\mathbf{v}_k \notin W_{k-1}$ implies that $\mathbf{w}_k \neq \mathbf{0}$ and that $\{\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_k\}$ is a linearly independent set. Then $\text{Span}\{\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_k\}$ is a $k$-dimensional subspace of the $k$-dimensional space $\text{Span}\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k\}$, so it follows that
$$\text{Span}\{\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_k\} = \text{Span}\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k\}.$$
The Gram-Schmidt process builds an orthogonal basis $\{\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_m\}$ for us from a given basis. To make an orthonormal basis $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_m\}$, all we need do is normalize each basis vector: that is, for each $i$, we let
$$\mathbf{u}_i = \frac{\mathbf{w}_i}{\|\mathbf{w}_i\|}.$$
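The process just described is mechanical enough to script. Below is a minimal sketch in NumPy (the function name gram_schmidt and the sample vectors are our own, not from the text). It implements the classical form of the algorithm; as noted later in this section, that form can be numerically unstable, so this is for illustration only.

```python
import numpy as np

def gram_schmidt(vectors):
    """Given a list of linearly independent vectors, return an
    orthogonal basis and an orthonormal basis for their span."""
    orthogonal = []
    for v in vectors:
        w = v.astype(float)
        # Subtract the projection of v onto each previously built w_i.
        for u in orthogonal:
            w = w - (v @ u) / (u @ u) * u
        orthogonal.append(w)
    orthonormal = [w / np.linalg.norm(w) for w in orthogonal]
    return orthogonal, orthonormal

# Example with an arbitrary basis of a subspace of R^3.
v1 = np.array([1.0, 1.0, 1.0])
v2 = np.array([1.0, 0.0, 1.0])
ws, us = gram_schmidt([v1, v2])
print(np.round(ws[0] @ ws[1], 10))  # 0.0: the w_i are orthogonal
```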
There are several different factorizations, or decompositions, of matrices where each matrix is written as a product of certain types of matrices: LU decomposition using lower and upper triangular matrices (see Section 22), EVD (EigenVector Decomposition) using eigenvectors and diagonal matrices (see Section 19), and, in this section, the QR decomposition. The QR decomposition is one of the most important tools for matrix computations. Jack Dongarra and Francis Sullivan [42] nominated the QR decomposition as one of the "10 algorithms with the greatest influence on the development and practice of science and engineering in the 20th century." The QR algorithm's importance stems from the fact that it is a "genuinely new contribution to the field of numerical analysis and not just a refinement of ideas given by Newton, Gauss, Hadamard, or Schur." [43] Most computer algebra systems (e.g., MATLAB) use a variant of the QR decomposition to compute eigenvalues. While there are other methods for approximating eigenvalues, the QR algorithm stands out. For example, the power method is useful when a matrix is large and contains a lot of zero entries (such matrices are called sparse). When most of the entries of a matrix are nonzero, the power method is not recommended, and the better, more sophisticated alternative in these situations is the QR decomposition. The QR factorization has applications to solving least squares problems and approximating eigenvalues of matrices.
Let $A = [\mathbf{a}_1 \ \mathbf{a}_2 \ \cdots \ \mathbf{a}_n]$ be an $m \times n$ matrix with rank $n$. [44] We can use the Gram-Schmidt process to find an orthonormal basis $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n\}$ for $\text{Col } A$. Recall also that $\text{Span}\{\mathbf{u}_1, \ldots, \mathbf{u}_k\} = \text{Span}\{\mathbf{a}_1, \ldots, \mathbf{a}_k\}$ for any $k$ between 1 and $n$. Let
$$Q = [\mathbf{u}_1 \ \mathbf{u}_2 \ \cdots \ \mathbf{u}_n].$$
The QR factorization provides a widely used algorithm (the QR algorithm) for approximating all of the eigenvalues of a matrix. The computer system MATLAB utilizes four versions of the QR algorithm to approximate the eigenvalues of real symmetric matrices, eigenvalues of real nonsymmetric matrices, eigenvalues of pairs of complex matrices, and singular values of general matrices.
The algorithm works as follows: beginning with $A_1 = A$, we find a QR factorization $A_1 = Q_1 R_1$ and set $A_2 = R_1 Q_1$; in general, $A_{k+1} = R_k Q_k$, where $A_k = Q_k R_k$. Note that $A_{k+1} = R_k Q_k = Q_k^{-1} A_k Q_k$, and so all of the matrices $A_k$ are similar to each other and therefore all have the same eigenvalues. We won't go into the details, but it can be shown that if the eigenvalues of $A$ are real and have distinct absolute values, then the sequence $\{A_k\}$ converges to an upper triangular matrix with the eigenvalues of $A$ as the diagonal entries. If some of the eigenvalues of $A$ are complex, then the sequence $\{A_k\}$ converges to a block upper triangular matrix, where the diagonal blocks are either $1 \times 1$ (approximating the real eigenvalues of $A$) or $2 \times 2$ (which provide a pair of complex conjugate eigenvalues of $A$).
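A bare-bones version of this iteration is easy to express in code. The sketch below is our own illustration with an arbitrary symmetric test matrix; it omits the shifts and Hessenberg reduction that practical implementations use.

```python
import numpy as np

def qr_algorithm(A, iterations=200):
    """Repeatedly factor A_k = Q_k R_k and recombine as A_{k+1} = R_k Q_k.
    Each step is a similarity transform, so eigenvalues are preserved."""
    Ak = np.array(A, dtype=float)
    for _ in range(iterations):
        Q, R = np.linalg.qr(Ak)
        Ak = R @ Q
    return Ak

A = np.array([[2.0, 1.0], [1.0, 3.0]])    # symmetric, real eigenvalues
Ak = qr_algorithm(A)
print(np.round(np.diag(Ak), 4))           # approximate eigenvalues
print(np.round(np.linalg.eigvals(A), 4))  # compare with NumPy's answer
```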
Technology shows that every column of the reduced row echelon form of $A$ is a pivot column, so the columns of $A$ are linearly independent and $A$ has rank 4. From part (a) we have an orthogonal basis $\{\mathbf{w}_1, \mathbf{w}_2, \mathbf{w}_3\}$ for the span of the first three columns of $A$. To find a fourth vector $\mathbf{w}_4$ to add so that the span is $\text{Col } A$, we apply the Gram-Schmidt process one more time with $W_3 = \text{Span}\{\mathbf{w}_1, \mathbf{w}_2, \mathbf{w}_3\}$:
$$\mathbf{w}_4 = \mathbf{a}_4 - \text{proj}_{W_3}\mathbf{a}_4 = \mathbf{a}_4 - \frac{\mathbf{a}_4 \cdot \mathbf{w}_1}{\mathbf{w}_1 \cdot \mathbf{w}_1}\mathbf{w}_1 - \frac{\mathbf{a}_4 \cdot \mathbf{w}_2}{\mathbf{w}_2 \cdot \mathbf{w}_2}\mathbf{w}_2 - \frac{\mathbf{a}_4 \cdot \mathbf{w}_3}{\mathbf{w}_3 \cdot \mathbf{w}_3}\mathbf{w}_3,$$
where $\mathbf{a}_4$ is the fourth column of $A$. So $\{\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3, \mathbf{u}_4\}$ is an orthonormal basis for $\text{Col } A$, where $\mathbf{u}_i = \frac{\mathbf{w}_i}{\|\mathbf{w}_i\|}$ for each $i$. This makes
$$Q = [\mathbf{u}_1 \ \mathbf{u}_2 \ \mathbf{u}_3 \ \mathbf{u}_4].$$
To find the matrix $R$, we write the columns of $A$ in terms of the basis vectors $\mathbf{u}_1$, $\mathbf{u}_2$, $\mathbf{u}_3$, and $\mathbf{u}_4$. Technology shows that the reduced row echelon form of the augmented matrix $[Q \mid A]$ has the form $[I \mid R]$ in its nonzero rows, and the resulting upper triangular matrix $R$ satisfies $A = QR$.
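Because the columns of $Q$ are orthonormal, $Q^{\mathsf{T}}Q = I$, so multiplying $A = QR$ on the left by $Q^{\mathsf{T}}$ gives $R = Q^{\mathsf{T}}A$ directly. A quick numerical check (with a made-up matrix, since the example's entries are not reproduced here):

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])        # any matrix with independent columns
Q, R = np.linalg.qr(A)            # reduced QR: Q is 3x2, R is 2x2
print(np.round(Q.T @ A - R, 10))  # zero matrix: R equals Q^T A
print(np.allclose(A, Q @ R))      # True
```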
Let $W = \text{Span}\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$. Find the vector in $W$ that is closest to the vector $\mathbf{v}$. Provide a numeric measure of the error in approximating $\mathbf{v}$ by this vector.
Solution.
Our job is to find $\text{proj}_W \mathbf{v}$. To do this, we need an orthogonal basis of $W$. Technology shows that each column of the matrix $[\mathbf{v}_1 \ \mathbf{v}_2 \ \mathbf{v}_3]$ is a pivot column, so the set $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ is a basis for $W$. We apply the Gram-Schmidt process to this basis to obtain an orthogonal basis $\{\mathbf{w}_1, \mathbf{w}_2, \mathbf{w}_3\}$ of $W$. We start with $\mathbf{w}_1 = \mathbf{v}_1$, then
$$\mathbf{w}_2 = \mathbf{v}_2 - \frac{\mathbf{v}_2 \cdot \mathbf{w}_1}{\mathbf{w}_1 \cdot \mathbf{w}_1}\mathbf{w}_1$$
and
$$\mathbf{w}_3 = \mathbf{v}_3 - \frac{\mathbf{v}_3 \cdot \mathbf{w}_1}{\mathbf{w}_1 \cdot \mathbf{w}_1}\mathbf{w}_1 - \frac{\mathbf{v}_3 \cdot \mathbf{w}_2}{\mathbf{w}_2 \cdot \mathbf{w}_2}\mathbf{w}_2.$$
Then, letting $W = \text{Span}\{\mathbf{w}_1, \mathbf{w}_2, \mathbf{w}_3\}$, we have
$$\text{proj}_W \mathbf{v} = \frac{\mathbf{v} \cdot \mathbf{w}_1}{\mathbf{w}_1 \cdot \mathbf{w}_1}\mathbf{w}_1 + \frac{\mathbf{v} \cdot \mathbf{w}_2}{\mathbf{w}_2 \cdot \mathbf{w}_2}\mathbf{w}_2 + \frac{\mathbf{v} \cdot \mathbf{w}_3}{\mathbf{w}_3 \cdot \mathbf{w}_3}\mathbf{w}_3.$$
The norm of $\text{proj}_{\perp W} \mathbf{v} = \mathbf{v} - \text{proj}_W \mathbf{v}$ tells us how well our projection approximates $\mathbf{v}$: the error in the approximation is $\|\mathbf{v} - \text{proj}_W \mathbf{v}\|$.
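The same computation is easy to carry out with technology. Here is a sketch (with arbitrary stand-in vectors, since the example's data is not reproduced here):

```python
import numpy as np

def project_onto(v, basis):
    """Project v onto the span of an *orthogonal* basis."""
    return sum((v @ w) / (w @ w) * w for w in basis)

# Stand-in data: an orthogonal basis for W and a vector v in R^4.
w1 = np.array([1.0, 0.0, 1.0, 0.0])
w2 = np.array([1.0, 0.0, -1.0, 0.0])
w3 = np.array([0.0, 1.0, 0.0, 0.0])
v = np.array([1.0, 2.0, 3.0, 4.0])

p = project_onto(v, [w1, w2, w3])
error = np.linalg.norm(v - p)  # numeric measure of the error
print(p, error)                # [1. 2. 3. 0.] 4.0
```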
The projection of the vector $\mathbf{v}$ in $\mathbb{R}^n$ onto $W$ is the vector
$$\text{proj}_W \mathbf{v} = \frac{\mathbf{v} \cdot \mathbf{w}_1}{\mathbf{w}_1 \cdot \mathbf{w}_1}\mathbf{w}_1 + \frac{\mathbf{v} \cdot \mathbf{w}_2}{\mathbf{w}_2 \cdot \mathbf{w}_2}\mathbf{w}_2 + \cdots + \frac{\mathbf{v} \cdot \mathbf{w}_m}{\mathbf{w}_m \cdot \mathbf{w}_m}\mathbf{w}_m,$$
where $W$ is a subspace of $\mathbb{R}^n$ with orthogonal basis $\{\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_m\}$. These projections are important in that $\text{proj}_W \mathbf{v}$ is the best approximation of the vector $\mathbf{v}$ by a vector in $W$ in the least squares sense.
With $W$ as in (a), the projection of $\mathbf{v}$ orthogonal to $W$ is the vector
$$\text{proj}_{\perp W} \mathbf{v} = \mathbf{v} - \text{proj}_W \mathbf{v}.$$
The norm of $\text{proj}_{\perp W} \mathbf{v}$ provides a measure of how well $\text{proj}_W \mathbf{v}$ approximates the vector $\mathbf{v}$.
The Gram-Schmidt process produces an orthogonal basis from any basis. It works as follows. Let $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_m\}$ be a basis for a subspace $W$ of $\mathbb{R}^n$. The set $\{\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_m\}$ defined by
$$\mathbf{w}_1 = \mathbf{v}_1,$$
$$\mathbf{w}_2 = \mathbf{v}_2 - \frac{\mathbf{v}_2 \cdot \mathbf{w}_1}{\mathbf{w}_1 \cdot \mathbf{w}_1}\mathbf{w}_1,$$
$$\mathbf{w}_3 = \mathbf{v}_3 - \frac{\mathbf{v}_3 \cdot \mathbf{w}_1}{\mathbf{w}_1 \cdot \mathbf{w}_1}\mathbf{w}_1 - \frac{\mathbf{v}_3 \cdot \mathbf{w}_2}{\mathbf{w}_2 \cdot \mathbf{w}_2}\mathbf{w}_2,$$
$$\vdots$$
$$\mathbf{w}_m = \mathbf{v}_m - \frac{\mathbf{v}_m \cdot \mathbf{w}_1}{\mathbf{w}_1 \cdot \mathbf{w}_1}\mathbf{w}_1 - \frac{\mathbf{v}_m \cdot \mathbf{w}_2}{\mathbf{w}_2 \cdot \mathbf{w}_2}\mathbf{w}_2 - \cdots - \frac{\mathbf{v}_m \cdot \mathbf{w}_{m-1}}{\mathbf{w}_{m-1} \cdot \mathbf{w}_{m-1}}\mathbf{w}_{m-1}$$
is an orthogonal basis for $W$. Moreover,
$$\text{Span}\{\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_k\} = \text{Span}\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k\}$$
for each $k$ between 1 and $m$.
The QR factorization has applications to solving least squares problems and approximating eigenvalues of matrices. The QR factorization writes an $m \times n$ matrix $A$ with rank $n$ as a product $A = QR$, where the columns of $Q$ form an orthonormal basis for $\text{Col } A$ and $R$ is an upper triangular matrix with positive diagonal entries.
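In practice, a library routine produces the factorization directly. A short illustration (note that a routine such as NumPy's np.linalg.qr may return a $Q$ and $R$ whose signs differ from a hand computation):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [0.0, 1.0],
              [1.0, 0.0]])
Q, R = np.linalg.qr(A)    # reduced ("thin") factorization
print(np.allclose(A, Q @ R))         # True: A = QR
print(np.round(Q.T @ Q, 10))         # identity: columns of Q orthonormal
print(np.round(np.tril(R, -1), 10))  # zeros: R is upper triangular
```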
In this exercise we determine the least squares line (the line of best fit in the least squares sense) to a set of data points $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ in the plane. In this case, we want to fit a line of the form $y = mx + b$ to the data. If the data were to lie on a line, then we would have a solution to the system
$$mx_1 + b = y_1$$
$$mx_2 + b = y_2$$
$$\vdots$$
$$mx_n + b = y_n.$$
This system can be written as
$$A\mathbf{z} = \mathbf{y},$$
where $A = \begin{bmatrix} x_1 & 1 \\ x_2 & 1 \\ \vdots & \vdots \\ x_n & 1 \end{bmatrix}$, $\mathbf{z} = \begin{bmatrix} m \\ b \end{bmatrix}$, and $\mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}$. If the data does not lie on a line, then the system will have no solution. Instead, we want to minimize the square of the distance between $\mathbf{y}$ and a vector of the form $A\mathbf{z}$. That is, minimize
$$\|\mathbf{y} - A\mathbf{z}\|^2. \tag{25.1}$$
Rephrasing this in terms of projections, we are looking for the vector in $W = \text{Span}\{\mathbf{a}_1, \mathbf{a}_2\}$, where $\mathbf{a}_1$ and $\mathbf{a}_2$ are the columns of $A$, that is closest to $\mathbf{y}$. In other words, the values of $m$ and $b$ will occur as the weights when we write $\text{proj}_W \mathbf{y}$ as a linear combination of $\mathbf{a}_1$ and $\mathbf{a}_2$. The one wrinkle in this problem is that we need an orthogonal basis for $W$ to find this projection. Find the least squares line for the given data points in the plane.
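As a sketch of how such a computation might be scripted (the three data points below are invented placeholders, not the exercise's points):

```python
import numpy as np

# Invented data points (x_i, y_i); replace with the exercise's points.
x = np.array([0.0, 1.0, 2.0])
y = np.array([1.0, 2.0, 2.5])

# Build A with columns (x_i) and (1), then minimize ||y - A z||^2.
A = np.column_stack([x, np.ones_like(x)])
(m, b), *_ = np.linalg.lstsq(A, y, rcond=None)
print(m, b)  # slope and intercept of the least squares line

# Equivalent view: project y onto Col A, then solve A z = proj exactly.
Q, R = np.linalg.qr(A)
proj = Q @ (Q.T @ y)             # proj_W y, where W = Col A
z = np.linalg.solve(R, Q.T @ y)  # same m and b as above
print(z)
```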
Each set $S$ below is linearly independent. Use the Gram-Schmidt process to create an orthogonal set of vectors with the same span as $S$. Then find an orthonormal basis for the same span.
Let $A = [\mathbf{a}_1 \ \mathbf{a}_2]$. A QR factorization of $A$ has $A = QR$ with $Q = [\mathbf{u}_1 \ \mathbf{u}_2]$ and $R = [r_{ij}]$. Calculate the dot products $\mathbf{u}_i \cdot \mathbf{a}_j$ for $i$ between 1 and 2 and $j$ between 1 and 2. How are these dot products connected to the entries of $R$?
Show that if $R$ and $S$ are upper triangular matrices with positive diagonal entries, then $RS$ is also an upper triangular matrix with positive diagonal entries.
Show that if $R$ is an invertible upper triangular matrix with positive diagonal entries, then $R^{-1}$ is also an upper triangular matrix with positive diagonal entries.
In a simplified model, a MIMO system will have $t$ transmitting antennas and $r$ receiving antennas. We record a transmitted symbol in a vector $\mathbf{x}$ (one component for each transmitting antenna) and the received symbol as a vector $\mathbf{y}$ (one component for each receiving antenna).
Between each transmit antenna and each receiver antenna is a fading channel. The MIMO system is the collection of these fading channels. If we let $h_{ij}$ be the fading between the $j$th transmitter and the $i$th receiver, then we can represent this multipath fading as an $r \times t$ matrix $H = [h_{ij}]$. We assume that there is also some noise in the received signal that we represent by $n_i$ (for the $i$th receiving antenna). Our MIMO system is then represented as the matrix-vector equation
$$\mathbf{y} = H\mathbf{x} + \mathbf{n}.$$
To see why and how the QR decomposition is used in MIMO systems, we begin with an example. Assume that we have two transmitters and three receivers, so that $H$ is a given $3 \times 2$ fading matrix. From this point, to simplify the situation, we assume that noise is negligible and consider the system $\mathbf{y} = H\mathbf{x}$. Assume that the received signal vector $\mathbf{y}$ is given.
When we receive a signal, we need to interpret it, and that is difficult if we cannot find a solution. One reason that the system $\mathbf{y} = H\mathbf{x}$ might not have a solution is that the elements of $\mathbf{x}$ have to come from a specified alphabet, so we are looking for solutions in that alphabet, and there may be no exact solution in that alphabet. As a result, we generally have to find a "best" solution in the alphabet space. That can be done with the QR decomposition. Assume that $H$ has a QR decomposition $H = QR$, with $Q$ having orthonormal columns and $R$ an upper triangular matrix. Explain how the equation $\mathbf{y} = H\mathbf{x}$ can be transformed into
$$Q^{\mathsf{T}}\mathbf{y} = R\mathbf{x}.$$
In Project Activity 25.8 we saw that a QR decomposition allows us to replace the system $\mathbf{y} = H\mathbf{x}$ with an upper triangular system $R\mathbf{x} = Q^{\mathsf{T}}\mathbf{y}$ that has a solution. This solution is what is called a least squares solution (which we will discuss in more detail in a later section). The key to all of this is finding an efficient way to calculate a QR decomposition. This is the focus of the remainder of this project.
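The full pipeline (factor, transform, back-substitute) is short in code. The sketch below uses an invented $3 \times 2$ channel matrix and a noiseless signal; a real detector would also round each recovered symbol to the nearest alphabet element.

```python
import numpy as np

# Invented channel matrix H (3 receivers, 2 transmitters) and signal y.
H = np.array([[1.0, 0.5],
              [0.2, 1.0],
              [0.4, 0.3]])
x_true = np.array([1.0, -1.0])  # transmitted symbols
y = H @ x_true                  # noiseless received signal

Q, R = np.linalg.qr(H)          # H = QR with R upper triangular (2x2)
z = Q.T @ y                     # transformed system: R x = Q^T y

# Back-substitution on the upper triangular system.
x = np.zeros(2)
x[1] = z[1] / R[1, 1]
x[0] = (z[0] - R[0, 1] * x[1]) / R[0, 0]
print(x)  # recovers [ 1. -1.]
```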
There are three main ways to compute a QR decomposition.
The method discussed in this section uses Gram-Schmidt orthogonalization. This method is not recommended for numerical use as it is inherently numerically unstable. Under certain circumstances, cancellation can destroy the accuracy of the computed vectors and the resulting vectors may not be orthogonal.
The method that we will discuss in this project uses Householder transformations (developed by Alston S. Householder). In Gaussian elimination, we can zero out entries below the diagonal by multiplying by elementary matrices to perform row operations. Householder transformations do something similar, allowing us to replace columns with columns of mostly zeros, which then allows for more efficient computations.
Givens (or Jacobi) rotations (used by W. Givens and originally invented by Jacobi for use in solving the symmetric eigenvalue problem) are another method that allows us to selectively introduce zeros into a matrix. We will focus on Householder transformations in this project.
The first step in a QR decomposition using Householder matrices is to create a matrix $R$ that is upper triangular. To get to this form, we use a Householder transformation. A Householder transformation (or matrix) is a matrix of the form
$$H = I - 2\mathbf{v}\mathbf{v}^{\mathsf{T}},$$
where $\mathbf{v}$ is a unit vector.
Let $\mathbf{x}$ be any vector and let $\mathbf{u}$ be a unit vector. In this part of the exercise we show that, with a suitable choice of $\mathbf{v}$, the Householder transformation $H = I - 2\mathbf{v}\mathbf{v}^{\mathsf{T}}$ transforms $\mathbf{x}$ into a vector of the same length parallel to $\mathbf{u}$. That is,
$$H\mathbf{x} = c\,\mathbf{u}$$
for some scalar $c$. Choosing $\mathbf{v} = \frac{\mathbf{x} - \|\mathbf{x}\|\mathbf{u}}{\|\mathbf{x} - \|\mathbf{x}\|\mathbf{u}\|}$ gives $H\mathbf{x} = \|\mathbf{x}\|\mathbf{u}$. A similar argument shows that choosing $\mathbf{v} = \frac{\mathbf{x} + \|\mathbf{x}\|\mathbf{u}}{\|\mathbf{x} + \|\mathbf{x}\|\mathbf{u}\|}$ gives $H\mathbf{x} = -\|\mathbf{x}\|\mathbf{u}$. Letting $\mathbf{u} = \mathbf{e}_1$ gives us the following result (Lemma 25.9): for any vector $\mathbf{x}$ there is a Householder matrix $H$ such that $H\mathbf{x} = \pm\|\mathbf{x}\|\mathbf{e}_1$.
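A quick numerical check of this reflection property (our own example: we take $\mathbf{u} = \mathbf{e}_1$ and an arbitrary vector $\mathbf{x}$, and we use the sign convention recommended in Project Activity 25.10 below):

```python
import numpy as np

x = np.array([3.0, 4.0, 0.0])  # arbitrary vector
e1 = np.array([1.0, 0.0, 0.0])

# Build v so that H x is parallel to e1; matching the sign of x[0]
# in the reflector avoids cancellation (see the special note below).
sign = np.sign(x[0]) if x[0] != 0 else 1.0
u = x + sign * np.linalg.norm(x) * e1
v = u / np.linalg.norm(u)             # unit vector defining H
H = np.eye(3) - 2.0 * np.outer(v, v)  # Householder matrix

print(H @ x)                          # [-5. 0. 0.]: same length, parallel to e1
print(np.allclose(H @ H, np.eye(3)))  # True: H is its own inverse
```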
There are many advantages to using Householder matrices. One is that instead of having to store all of the entries of the matrix $H$, we only need to store the entries of the vector $\mathbf{v}$. Another advantage of Householder transformations is that they can be used to efficiently calculate QR decompositions. The next project activity shows how to use Lemma 25.9 and Householder transformations to compute a QR decomposition through an example.
Assume that we have five receiving antennas and three transmitting antennas, and that the channel matrix is a given $5 \times 3$ matrix $A$. (We use $A$ now instead of $H$ so as to avoid confusion with Householder matrices.) The calculations in this activity can get rather messy, so use appropriate technology and round entries to four decimal places.
Let $\mathbf{x}$ be the first column of $A$. Identify an appropriate Householder transformation $H_1$ so that $H_1\mathbf{x}$ is a constant multiple of $\mathbf{e}_1$. Then determine the matrix $A_1 = H_1 A$. (Special note: when deciding which of $\pm\|\mathbf{x}\|\mathbf{e}_1$ to use to create $H_1$, it is best to use the one whose sign is the same as that of $x_1$, the first entry of $\mathbf{x}$. We won't go into the details, but this helps prevent problems due to cancellation.)
Project Activity 25.10 illustrates the general method for finding a QR decomposition of a matrix $A$ by using Householder transformations to reduce $A$ to an upper triangular matrix $R$. The product of the Householder matrices is then the orthogonal matrix $Q$. One final note: you may have noticed that the matrix $Q$ you found in this project was a square $5 \times 5$ matrix and $R$ was a $5 \times 3$ matrix, so they have different sizes than the QR decomposition in this section. We could use the Gram-Schmidt process to obtain these larger matrices by extending the columns of a matrix $Q$ to form an orthonormal basis for $\mathbb{R}^5$. However, if we want a QR decomposition for $A$ in which $Q$ is $5 \times 3$ and $R$ is $3 \times 3$ from the work we did in Project Activity 25.10, we can simply delete the last two columns of the matrix $Q$ and the last two rows of the matrix $R$ we calculated. This smaller QR decomposition is often referred to as a thin QR decomposition.
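For completeness, here is how the whole procedure might be scripted (a sketch using a random matrix; the function householder_qr is our own and mirrors the steps of the project rather than reproducing the activity's worked example).

```python
import numpy as np

def householder_qr(A):
    """QR via Householder reflections: returns full Q (m x m) and R (m x n)."""
    m, n = A.shape
    R = A.astype(float)
    Q = np.eye(m)
    for k in range(n):
        x = R[k:, k]
        sign = np.sign(x[0]) if x[0] != 0 else 1.0
        u = x + sign * np.linalg.norm(x) * np.eye(len(x))[0]
        v = u / np.linalg.norm(u)
        Hk = np.eye(len(x)) - 2.0 * np.outer(v, v)
        R[k:, k:] = Hk @ R[k:, k:]   # zero out column k below the diagonal
        Q[:, k:] = Q[:, k:] @ Hk     # accumulate Q = H_1 H_2 ... H_n
    return Q, R

A = np.random.rand(5, 3)
Q, R = householder_qr(A)
print(np.allclose(A, Q @ R))                # True
# Thin QR: keep the first 3 columns of Q and the first 3 rows of R.
print(np.allclose(A, Q[:, :3] @ R[:3, :]))  # True
```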
[42] Jack Dongarra and Francis Sullivan. Introduction to the top 10 algorithms. Computing in Science and Engineering, 2:22-23, 2000.
[43] Beresford N. Parlett. The QR algorithm. Computing in Science and Engineering, 2:38-42, 2000.
[44] Recall that the rank of a matrix $A$ is the dimension of the column space of $A$.