
Section 31 Vector Spaces

Subsection Application: The Hat Puzzle

In a New York Times article (April 10, 2001) “Why Mathematicians Now Care About Their Hat Color”, the following puzzle is posed.

“Three players enter a room and a red or blue hat is placed on each person's head. The color of each hat is determined by a coin toss, with the outcome of one coin toss having no effect on the others. Each person can see the other players' hats but not his own. No communication of any sort is allowed, except for an initial strategy session before the game begins. Once they have had a chance to look at the other hats, the players must simultaneously guess the color of their own hats or pass. The group shares a hypothetical $3 million prize if at least one player guesses correctly and no players guess incorrectly.”

The game can be played with more players, and the problem is to find a strategy for the group that maximizes its chance of winning. One strategy is for a designated player to make a random guess and for the others to pass. This gives a 50% chance of winning. However, there are much better strategies that provide a nearly 100% probability of winning as the number of players increases. One such strategy, explored in the project at the end of this section, is based on Hamming codes, which are subspaces of a particular vector space.

Subsection Introduction

We have previously seen that \(\R^n\) and the set of fixed size matrices have a nice algebraic structure when endowed with the addition and scalar multiplication operations. In fact, as we will see, there are many other sets of elements that have the same kind of structure with natural addition and scalar multiplication operations. Because these sets share this underlying structure, they can all be studied together. Mathematicians look for these kinds of connections between seemingly dissimilar objects and, from a mathematical standpoint, it is convenient to study all of these similar structures at once by combining them into a larger collection. This motivates the idea of a vector space that we will investigate in this chapter.

An example of a set that has a structure similar to vectors is a collection of polynomials. Let \(\pol_1\) be the collection of all polynomials of degree less than or equal to 1 with real coefficients. That is,

\begin{equation*} \pol_1 = \{ a_0 + a_1t : a_0, a_1 \in \R\}\text{.} \end{equation*}

So, for example, the polynomials \(2+t\text{,}\) \(5t\text{,}\) \(-7\text{,}\) and \(\sqrt{12}-\pi t\) are in \(\pol_1\text{,}\) but \(\sqrt{t}\) is not in \(\pol_1\text{.}\) Two polynomials \(a(t)=a_0+a_1t\) and \(b(t)=b_0+b_1t\) in \(\pol_1\) are equal if \(a_0=b_0\) and \(a_1=b_1\text{.}\)

We define addition of polynomials in \(\pol_1\) by adding the coefficients of the like degree terms. So if \(a(t) = a_0+a_1t\) and \(b(t) = b_0+b_1t\text{,}\) then the polynomial sum of \(a(t)\) and \(b(t)\) is

\begin{equation*} a(t)+b(t) = (a_0+a_1t) + (b_0+b_1t) = (a_0+b_0) + (a_1+b_1)t\text{.} \end{equation*}

So, for example,

\begin{equation*} (2+3t)+(-1+5t) = (2+(-1)) + (3+5)t = 1+8 t\text{.} \end{equation*}

We now consider the properties of the addition operation. For example, we can ask if polynomial addition is commutative. That is, if \(a(t)\) and \(b(t)\) are in \(\pol_1\text{,}\) must it be the case that

\begin{equation*} a(t)+b(t)~=~b(t)+a(t)? \end{equation*}

To show that addition is commutative in \(\pol_1\text{,}\) we choose arbitrary polynomials \(a(t) = a_0 + a_1t\) and \(b(t) = b_0 + b_1t\) in \(\pol_1\text{.}\) Then we have

\begin{align*} a(t)+b(t) \amp = (a_0+b_0) + (a_1+b_1)t\\ \amp = (b_0+a_0) + (b_1+a_1)t\\ \amp = b(t)+a(t)\text{.} \end{align*}

Note that in the middle step, we used the definition of equality of polynomials since \(a_0+b_0=b_0+a_0\) and \(a_1+b_1=b_1+a_1\) due to the fact that addition of real numbers is commutative. So addition of elements in \(\pol_1\) is a commutative operation.

Preview Activity 31.1.

(a)

Now we investigate other properties of addition in \(\pol_1\text{.}\)

(i)

To show addition is associative in \(\pol_1\text{,}\) we need to verify that if \(a(t) = a_0 + a_1t\text{,}\) \(b(t) = b_0+b_1t\text{,}\) and \(c(t) = c_0 + c_1t\) are in \(\pol_1\text{,}\) it must be the case that

\begin{equation*} (a(t) + b(t)) + c(t) = a(t) + (b(t)+c(t))\text{.} \end{equation*}

Either verify this property by using the definition of two polynomials being equal, or give a counterexample to show the equality fails in that case.

(ii)

Find a polynomial \(z(t) \in \pol_1\) such that

\begin{equation*} a(t) + z(t) = a(t) \end{equation*}

for all \(a(t) \in \pol_1\text{.}\) This polynomial is called the zero polynomial or the additive identity polynomial in \(\pol_1\text{.}\)

(iii)

If \(a(t) = a_0 + a_1t\) is an element of \(\pol_1\text{,}\) is there an element \(p(t) \in \pol_1\) such that

\begin{equation*} a(t) + p(t) = z(t)\text{,} \end{equation*}

where \(z(t)\) is the additive identity polynomial you found above? If not, why not? If so, what polynomial is \(p(t)\text{?}\) Explain.

(b)

We can also define a multiplication of polynomials by scalars (real numbers).

(i)

What element in \(\pol_1\) could be the scalar multiple \(\frac{1}{2} (2+3t)\text{?}\)

(ii)

In general, if \(k\) is a scalar and \(a(t) = a_0 + a_1t\) is in \(\pol_1\text{,}\) how do we define the scalar multiple \(ka(t)\) in \(\pol_1\text{?}\)

(iii)

If \(k\) is a scalar and \(a(t) = a_0 + a_1t\) and \(b(t) = b_0 + b_1t\) are elements in \(\pol_1\text{,}\) is it true that

\begin{equation*} k(a(t) + b(t)) = ka(t) + kb(t)? \end{equation*}

If no, explain why. If yes, verify your answer using the definition of two polynomials being equal.

(iv)

If \(k\) and \(m\) are scalars and \(a(t) = a_0 + a_1t\) is an element in \(\pol_1\text{,}\) is it true that

\begin{equation*} (k+m)a(t) = ka(t) + ma(t)? \end{equation*}

If no, explain why. If yes, verify your answer.

(v)

If \(k\) and \(m\) are scalars and \(a(t) = a_0 + a_1t\) is an element in \(\pol_1\text{,}\) is it true that

\begin{equation*} (km)a(t) = k(ma(t))? \end{equation*}

If no, explain why. If yes, verify your answer.

(vi)

If \(a(t) = a_0 + a_1t\) is an element of \(\pol_1\text{,}\) is it true that

\begin{equation*} 1a(t) = a(t)? \end{equation*}

If no, explain why. If yes, verify your answer.

Subsection Spaces with Similar Structure to \(\R^n\)

Mathematicians look for patterns and for similarities between mathematical objects. In doing so, mathematicians often consider larger collections of objects that are sorted according to their similarities and then study these collections rather than just the objects themselves. This perspective can be very powerful — whatever can be shown to be true about an arbitrary element in a collection will then be true for every specific element in the collection. In this section we study the larger collection of sets that share the algebraic structure of vectors in \(\R^n\text{.}\) These sets are called vector spaces.

In Preview Activity 31.1, we showed that the set \(\pol_1\) of polynomials of degree less than or equal to one with real coefficients, with the operations of addition and scalar multiplication defined by

\begin{equation*} (a_0+a_1t)+(b_0+b_1t) = (a_0+b_0) + (a_1+b_1)t \ \ \ \text{ and } \ \ \ k(a_0+a_1t) = (ka_0) + (ka_1)t\text{,} \end{equation*}

has a structure similar to \(\R^2\text{.}\)

By structure we mean how the elements in the set relate to each other under addition and multiplication by scalars. That is, if \(a(t)=a_0+a_1t\text{,}\) \(b(t)\text{,}\) and \(c(t)\) are elements of \(\pol_1\) and \(k\) and \(m\) are scalars, then

  1. \(a(t) + b(t)\) is an element of \(\pol_1\text{,}\)

  2. \(a(t)+b(t) = b(t) + a(t)\text{,}\)

  3. \((a(t)+b(t)) + c(t) = a(t) + (b(t)+c(t))\text{,}\)

  4. there is a zero polynomial \(z(t)\) (namely, \(0+0t\)) in \(\pol_1\) so that \(a(t) + z(t) = a(t)\text{,}\)

  5. there is an element \(-a(t)\) in \(\pol_1\) (namely, \((-a_0)+(-a_1)t\)) so that \(a(t) + (-a(t)) = z(t)\text{,}\)

  6. \(k a(t)\) is an element of \(\pol_1\text{,}\)

  7. \((k+m) a(t) = ka(t) + ma(t)\text{,}\)

  8. \(k(a(t)+b(t)) = ka(t) + kb(t)\text{,}\)

  9. \((km) a(t) = k(m a(t))\text{,}\)

  10. \(1 a(t) = a(t)\text{.}\)

The properties we saw for polynomials in \(\pol_1\) stated above are the same as the properties for vector addition and multiplication by scalars in \(\R^n\text{,}\) as well as matrix addition and multiplication by scalars identified in Section 8. This indicates that polynomials in \(\pol_1\text{,}\) vectors in \(\R^n\text{,}\) and the set of \(m \times n\) matrices behave in much the same way as regards their addition and multiplication by scalars. There is an even closer connection between linear polynomials and vectors in \(\R^2\text{.}\) An element \(a(t) = a_0 + a_1t\) in \(\pol_1\) can be naturally associated with the vector \(\left[ \begin{array}{c} a_0 \\ a_1 \end{array} \right]\) in \(\R^2\text{.}\) All the results of polynomial addition and multiplication by scalars then translate to corresponding results of addition and multiplication by scalars of vectors in \(\R^2\text{.}\) So for all intents and purposes, as far as addition and multiplication by scalars is concerned, there is no difference between elements in \(\pol_1\) and vectors in \(\R^2\) — the only difference is how we choose to present the elements (as polynomials or as vectors). This sameness of structure of our sets as it relates to addition and multiplication by scalars is the type of similarity mentioned in the introduction. We can study all of the types of objects that exhibit this same structure at one time by studying vector spaces.
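
To make this identification concrete, here is a minimal computational sketch (illustrative code only, with names chosen for this illustration): a polynomial \(a_0+a_1t\) in \(\pol_1\) is stored as its coefficient pair \((a_0, a_1)\text{,}\) and the operations on \(\pol_1\) become exactly the componentwise operations on vectors in \(\R^2\text{.}\)

def add(p, q):
    """Polynomial addition in P_1: add the coefficients of like-degree terms."""
    return (p[0] + q[0], p[1] + q[1])

def scale(k, p):
    """Scalar multiplication in P_1: multiply each coefficient by k."""
    return (k * p[0], k * p[1])

# (2 + 3t) + (-1 + 5t) = 1 + 8t, matching the example earlier in this section.
print(add((2, 3), (-1, 5)))   # (1, 8)
# 3(-7 + 2t) = -21 + 6t
print(scale(3, (-7, 2)))      # (-21, 6)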

Subsection Vector Spaces

We defined vector spaces in the context of subspaces of \(\R^n\) in Definition 12.3. In general, any set that has the same kind of additive and multiplicative structure as our sets of vectors, matrices, and linear polynomials is called a vector space. As we will see, the ideas that we introduced about subspaces of \(\R^n\) apply to vector spaces in general, so the material in this chapter should have a familiar feel.

Definition 31.1.

A set \(V\) on which an operation of addition and a multiplication by scalars is defined is a vector space if for all \(\vu\text{,}\) \(\vv\text{,}\) and \(\vw\) in \(V\) and all scalars \(a\) and \(b\text{:}\)

  1. \(\vu + \vv\) is an element of \(V\) (we say that \(V\) is closed under the addition in \(V\)),

  2. \(\vu + \vv = \vv + \vu\) (we say that the addition in \(V\) is commutative),

  3. \((\vu + \vv) + \vw = \vu + (\vv + \vw)\) (we say that the addition in \(V\) is associative),

  4. there is a zero vector \(\vzero\) in \(V\) so that \(\vu + \vzero = \vu\) (we say that \(V\) contains an additive identity \(\vzero\)),

  5. for each \(\vx\) in \(V\) there is an element \(\vy\) in \(V\) so that \(\vx + \vy = \vzero\) (we say that \(V\) contains an additive inverse \(\vy\) for each element \(\vx\) in \(V\)),

  6. \(a \vu\) is an element of \(V\) (we say that \(V\) is closed under multiplication by scalars),

  7. \((a+b) \vu = a\vu + b\vu\) (we say that multiplication by scalars distributes over scalar addition),

  8. \(a(\vu + \vv) = a\vu + a\vv\) (we say that multiplication by scalars distributes over addition in \(V\)),

  9. \((ab) \vu = a(b\vu)\text{,}\)

  10. \(1 \vu = \vu\text{.}\)

Note.

Unless otherwise stated, in this book the scalars will refer to real numbers. However, we can define vector spaces where scalars are complex numbers, or rational numbers, or integers modulo \(p\) where \(p\) is a prime number, or, more generally, elements of a field. A field is an algebraic structure which generalizes the structure of real numbers and rational numbers under the addition and multiplication operations. Since we will focus on the real numbers as scalars, the reader is not required to be familiar with the concept of a field.

Because of the similarity of the way elements in vector spaces behave compared to vectors in \(\R^n\text{,}\) we call the elements in a vector space vectors. There are many examples of vector spaces, which is what makes this idea so powerful.

Example 31.2.

(a)

The space \(\R^n\) of all vectors with \(n\) components is a vector space using the standard vector addition and multiplication by scalars. The zero element is the zero vector \(\vzero\) whose components are all 0.

(b)

The set \(\pol_1\) of all polynomials of degree less than or equal to 1 with addition and scalar multiplication as defined earlier. Recall that \(\pol_1\) is essentially the same as \(\R^2\text{.}\)

(c)

The properties listed in the introduction for \(\pol_1\) are equally true for the collection of all polynomials of degree less than or equal to some fixed number. We label as \(\pol_n\) this set of all polynomials of degree less than or equal to \(n\text{,}\) with the standard addition and scalar multiplication. Note that \(\pol_n\) is essentially the same as \(\R^{n+1}\text{.}\) More generally, the space \(\pol\) of all polynomials is also a vector space with standard addition and scalar multiplication.

(d)

As a subspace of \(\R^n\text{,}\) the eigenspace of an \(n \times n\) matrix corresponding to an eigenvalue \(\lambda\) is a vector space.

(e)

As a subspace of \(\R^n\text{,}\) the null space of an \(m \times n\) matrix is a vector space.

(f)

As a subspace of \(\R^m\text{,}\) the column space of an \(m \times n\) matrix is a vector space.

(g)

The span of a set of vectors in \(\R^n\) is a subspace of \(\R^n\text{,}\) and is therefore a vector space.

(h)

Let \(V\) be a vector space and let \(\vzero\) be the additive identity in \(V\text{.}\) The set \(\{\vzero\}\) is a vector space in which \(\vzero + \vzero = \vzero\) and \(k\vzero = \vzero\) for any scalar \(k\text{.}\) This space is called the trivial vector space.

(i)

The space \(\M_{m \times n}\) (or \(\M_{m \times n}(\R)\) when it is important to indicate that the entries of our matrices are real numbers) of all \(m \times n\) matrices with real entries with the standard addition and multiplication by scalars we have already defined. In this case, \(\M_{m \times n}\) is essentially the same vector space as \(\R^{mn}\text{.}\)

(j)

The space \(\F\) of all functions from \(\R\) to \(\R\text{,}\) where we define the sum of two functions \(f\) and \(g\) in \(\F\) as the function \(f+g\) satisfying

\begin{equation*} (f+g)(x) = f(x) + g(x) \end{equation*}

for all real numbers \(x\text{,}\) and the scalar multiple \(cf\) of the function \(f\) by the scalar \(c\) to be the function satisfying

\begin{equation*} (cf)(x) = cf(x) \end{equation*}

for all real numbers \(x\text{.}\) The verification of the vector space properties for this space is left to the reader.

(k)

The space \(\R^\infty\) of all infinite real sequences \((x_1, x_2, x_3, \ldots)\text{,}\) with addition and scalar multiplication defined termwise by

\begin{equation*} (x_1, x_2, x_3, \ldots) + (y_1, y_2, y_3, \ldots) = (x_1+y_1, x_2+y_2, x_3+y_3, \ldots) \end{equation*}
\begin{equation*} c(x_1, x_2, x_3, \ldots) = (cx_1, cx_2, cx_3, \ldots)\text{,} \end{equation*}

is a vector space. In addition, the set of convergent sequences inside \(\R^\infty\) forms a vector space using this addition and multiplication by scalars (as we did in \(\R^n\text{,}\) we will call this set of convergent sequences a subspace of \(\R^{\infty}\)).

(l)

(For those readers who are familiar with differential equations). The set of solutions to a second order homogeneous differential equation forms a vector space under addition and scalar multiplication defined as in the space \(\F\) above.

(m)

The set of polynomials of positive degree in \(\pol_1\) is not a vector space using the standard addition and multiplication by scalars in \(\pol_1\text{.}\) Notice that \(t + (-t)\) is not a polynomial of positive degree, and so this set is not closed under addition.

(n)

The color space in which each color is assigned an RGB (red, green, blue) coordinate between 0 and 255, with addition and scalar multiplication defined component-wise, does not define a vector space. The color space is not closed under either operation because the color coordinates must be integers ranging from 0 to 255.
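
As a quick illustration of this closure failure, the following sketch (illustrative code only; the function name is chosen for this illustration) scales an RGB color and checks whether the result still lies in the color space.

def in_color_space(c):
    """An RGB color must have integer coordinates between 0 and 255."""
    return all(isinstance(x, int) and 0 <= x <= 255 for x in c)

c = (200, 100, 50)
print(in_color_space(tuple(2 * x for x in c)))    # False: 400 is out of range
print(in_color_space(tuple(0.5 * x for x in c)))  # False: 100.0 is not an integer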

It is important to note that the set of defining properties of a vector space is intended to be a minimum set. Any other properties of a vector space must be verified or proved using the defining properties. For example, in \(\R^n\) it is clear that the scalar multiple \(0\vv\) is the zero vector for any vector \(\vv\) in \(\R^n\text{.}\) This might be true in any vector space, but it is not a defining property. Therefore, if this property is true, then we must be able to prove it using just the defining properties. To see how this might work, let \(\vv\) be any vector in a vector space \(V\text{.}\) We want to show that \(0 \vv = \vzero\) (the existence of the zero vector is property (4)). Using the fact that \(0+0 = 0\) and that scalar multiplication distributes over scalar addition, we can see that

\begin{equation*} 0\vv = (0+0)\vv = 0\vv + 0\vv\text{.} \end{equation*}

Property (5) tells us that \(V\) contains an additive inverse for every vector in \(V\text{,}\) so let \(\vu\) be an additive inverse of the vector \(0\vv\) in \(V\text{.}\) Then \(0\vv + \vu = \vzero\) (it is very important to keep track of the different kinds of zeros here: the boldface zero \(\vzero\) is the additive identity in the vector space and the non-bold 0 is the scalar zero) and so

\begin{align*} 0\vv + \vu \amp = (0\vv + 0\vv) + \vu\\ \vzero \amp = 0\vv + (0\vv + \vu)\\ \vzero \amp = 0\vv + \vzero\text{.} \end{align*}

Now \(\vzero\) has the property that \(\vzero + \vw = \vw+ \vzero = \vw\) for any vector \(\vw\) in \(V\) (by properties (4) and (2)), and so we can conclude that

\begin{equation*} \vzero = 0\vv\text{.} \end{equation*}

Activity 31.2.

Another property that will be useful is a cancellation property. In the set of real numbers we know that if \(a+b = c+b\text{,}\) then \(a=c\text{,}\) and we verify this by subtracting \(b\) from both sides. This is the same as adding the additive inverse of \(b\) to both sides, so we ought to be able to make the same argument using additive inverses in a vector space. To see how, let \(\vu\text{,}\) \(\vv\text{,}\) and \(\vw\) be vectors in a vector space and suppose that

\begin{equation} \vu + \vw = \vv + \vw\text{.}\tag{31.1} \end{equation}
(a)

Why does our space contain an additive inverse \(\vz\) of \(\vw\text{?}\)

(b)

Now add the vector \(\vz\) to both sides of equation (31.1) to obtain

\begin{equation} (\vu+\vw) + \vz = (\vv+\vw) + \vz\text{.}\tag{31.2} \end{equation}

Which property of a vector space allows us to state the following equality?

\begin{equation} \vu+(\vw+\vz) = \vv+(\vw+\vz)\text{.}\tag{31.3} \end{equation}
(c)

Now use the properties of additive inverses and the additive identity to explain why \(\vu = \vv\text{.}\) Conclude that we have a cancellation law for addition in any vector space.

We should also note that the definition of a vector space only states the existence of a zero vector and an additive inverse for each vector in the space, and does not say that there cannot be more than one zero vector or more than one additive inverse of a vector in the space. The reason why is that the uniqueness of the zero vector and of the additive inverse of a vector can be proved from the defining properties of a vector space, and so we don't list this consequence as a defining property. Similarly, the defining properties of a vector space do not state that the additive inverse of a vector \(\vv\) is the scalar multiple \((-1)\vv\text{.}\) Verification of these properties is left for the exercises.

Subsection Subspaces

In Section 12 we saw that \(\R^n\) contained subsets that we called subspaces that had the same algebraic structure as \(\R^n\text{.}\) The same idea applies to vector spaces in general.

Activity 31.3.

Let \(H = \left\{ at : a \in \R \right\}\text{.}\) Notice that \(H\) is a subset of \(\pol_1\text{.}\)

(a)

Is \(H\) closed under the addition in \(\pol_1\text{?}\) Verify your answer.

(b)

Does \(H\) contain the zero vector from \(\pol_1\text{?}\) Verify your answer.

(c)

Is \(H\) closed under multiplication by scalars? Verify your answer.

(d)

Explain why \(H\) satisfies every other property of the definition of a vector space automatically just by being a subset of \(\pol_1\) and using the same operations as in \(\pol_1\text{.}\) Conclude that \(H\) is a vector space.

Activity 31.3 illustrates an important point. There is a fundamental difference in the types of properties that define a vector space. Some of the properties that define a vector space are true for any subset of the vector space because they are properties of the operations (such as the commutative and associative properties). The other properties (closure, the inclusion of the zero vector, and the inclusion of additive inverses) are set properties, not properties of the operations. So these three properties have to be specifically checked to see if a subset of a vector space is also a vector space. This leads to the definition of a subspace, a subset of a vector space which is a vector space itself.

Definition 31.4.

A subset \(H\) of a vector space \(V\) is a subspace of \(V\) if

  1. whenever \(\vu\) and \(\vv\) are in \(H\) it is also true that \(\vu + \vv\) is in \(H\) (that is, \(H\) is closed under addition),

  2. whenever \(\vu\) is in \(H\) and \(a\) is a scalar it is also true that \(a\vu\) is in \(H\) (that is, \(H\) is closed under scalar multiplication),

  3. \(\vzero\) is in \(H\text{.}\)

Activity 31.4.

Is the given subset \(H\) a subspace of the indicated vector space \(V\text{?}\) Verify your answer.

(a)

\(V\) is any vector space and \(H = \{\vzero\}\)

(b)

\(V = M_{2 \times 2}\text{,}\) the vector space of \(2 \times 2\) matrices and \(H = \left\{ \left[ \begin{array}{cc} 2x\amp y \\ 0 \amp x \end{array} \right] \big| \text{ \(x\) and \(y\) are scalars } \right\}\text{.}\)

(c)

\(V = \pol_2\text{,}\) the vector space of all polynomials of degree less than or equal to 2 and \(H = \left\{ 2at^2+1 \mid a \text{ is a scalar } \right\}\text{.}\)

(d)

\(V=\pol_2\) and \(H=\left\{at\mid a \text{ is a scalar } \right\} \cup \left\{bt^2\mid b \text{ is a scalar } \right\}\text{.}\)

(e)

\(V=\F\) and \(H=\pol_2\text{.}\)

There is an interesting subspace relationship between the spaces \(\pol_1, \pol_2, \pol_3, \ldots\) and \(\pol\text{.}\) For every \(i\text{,}\) \(\pol_i\) is a subspace of \(\pol\text{.}\) Furthermore, \(\pol_1\) is a subspace of \(\pol_2\text{,}\) \(\pol_2\) is a subspace of \(\pol_3\text{,}\) and so on. Note however that a similar relationship does NOT hold for \(\R^n\text{,}\) even though \(\pol_i\) looks like \(\R^{i+1}\text{.}\) For example, \(\R^1\) is NOT a subspace of \(\R^2\text{.}\) Similarly, \(\R^2\) is NOT a subspace of \(\R^3\text{.}\) Since the vectors in different \(\R^n\)'s are of different sizes, none of the \(\R^i\)'s is a subset of another \(\R^n\) with \(i \neq n\text{,}\) and hence, \(\R^i\) is not a subspace of \(\R^n\) when \(i\lt n\text{.}\)

Subsection The Subspace Spanned by a Set of Vectors

In \(\R^n\) we showed that the span of any set of vectors forms a subspace of \(\R^n\text{.}\) The same is true in any vector space. Recall that the span of a set of vectors in \(\R^n\) is the set of all linear combinations of those vectors. So before we can discuss the span of a set of vectors in a vector space, we need to extend the definition of linear combinations to vector spaces (compare to Definition 4.8 and Definition 4.10).

Definition 31.5.

Let \(V\) be a vector space. A linear combination of vectors \(\vv_1\text{,}\) \(\vv_2\text{,}\) \(\ldots\text{,}\) \(\vv_k\) in \(V\) is a vector of the form

\begin{equation*} x_1\vv_1+x_2\vv_2 + \cdots + x_k \vv_k\text{,} \end{equation*}

where \(x_1\text{,}\) \(x_2\text{,}\) \(\ldots\text{,}\) \(x_k\) are scalars. The span of the vectors \(\vv_1\text{,}\) \(\vv_2\text{,}\) \(\ldots\text{,}\) \(\vv_k\) is the collection of all linear combinations of \(\vv_1\text{,}\) \(\vv_2\text{,}\) \(\ldots\text{,}\) \(\vv_k\text{.}\) That is,

\begin{equation*} \Span\{\vv_1, \vv_2, \ldots, \vv_k\} = \{x_1\vv_1+x_2\vv_2 + \cdots + x_k \vv_k \mid x_1, x_2, \ldots, x_k \text{ are scalars } \}\text{.} \end{equation*}

The argument that the span of any finite set of vectors in a vector space forms a subspace is the same as we gave for the span of a set of vectors in \(\R^n\) (see Theorem 12.7). The proof is left for the exercises.

The subspace \(\Span\{\vv_1, \vv_2, \ldots, \vv_m\}\) is called the subspace of \(V\) spanned by \(\vv_1, \vv_2, \ldots, \vv_m\).

Activity 31.5.
(a)

Let \(H = \left\{ a_2t^2-a_1t : a_2 \text{ and } a_1 \text{ are real numbers } \right\}\text{.}\) Note that \(H\) is a subset of \(\pol_2\text{.}\) Find two vectors \(\vv_1, \vv_2\) in \(\pol_2\) so that \(H= \Span\{\vv_1, \vv_2\}\) and hence conclude that \(H\) is a subspace of \(\pol_2\text{.}\) (Note that the vectors \(\vv_1, \vv_2\) are not unique.)

(b)

Let \(p_1(t) = 1-t^2\) and \(p_2(t) = 1+t^2\text{,}\) and let \(S = \{p_1(t), p_2(t)\}\) in \(\pol_2\text{.}\) Is the polynomial \(q(t) = 3-2t^2\) in \(\Span \ S\text{?}\) (Hint: Create a matrix equation of the form \(A \vx = \vb\) by setting up an appropriate polynomial equation involving \(p_1(t)\text{,}\) \(p_2(t)\) and \(q(t)\text{.}\) Under what conditions on \(A\) is the system \(A \vx = \vb\) consistent?)

(c)

With \(S\) as in part (b), describe as best you can the subspace \(\Span \ S\) of \(\pol_2\text{.}\)

Given a subspace \(H\text{,}\) a set \(S\) such that \(H=\Span \ S\) is called a spanning set of \(H\text{.}\) In order to determine if a set \(S=\{\vv_1, \vv_2, \ldots, \vv_k\}\) is a spanning set for \(H\text{,}\) all we need to do is to show that for every \(\vb\) in \(H\text{,}\) the equation

\begin{equation*} x_1 \vv_1 + x_2\vv_2 + \cdots + x_k \vv_k = \vb \end{equation*}

has a solution. We will see important uses of special spanning sets called bases in the rest of this chapter.
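
As a simple illustration of checking whether such an equation has a solution, consider whether the polynomial \(2+4t\) is in \(\Span\{1+t, 1-t\}\) in \(\pol_1\text{.}\) We look for scalars \(x_1\) and \(x_2\) so that \(x_1(1+t)+x_2(1-t) = 2+4t\text{.}\) Equating the coefficients of like-degree terms gives the system

\begin{equation*} x_1+x_2 = 2 \ \ \ \text{ and } \ \ \ x_1-x_2 = 4\text{,} \end{equation*}

which has the solution \(x_1=3\) and \(x_2=-1\text{.}\) So \(2+4t = 3(1+t) - (1-t)\) is in \(\Span\{1+t, 1-t\}\text{.}\)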

Subsection Examples

What follows are worked examples that use the concepts from this section.

Example 31.7.

Determine if each of the following sets is a vector space.

(a)

\(V = \{(x,y,z) : x,y,z \in \R\}\) with addition and multiplication by scalars defined by

\begin{equation*} (a,b,c) \oplus (x,y,z) = (a+x, c+z, b+y) \ \text{ and } \ k(x,y,z) = (kx,kz,ky)\text{,} \end{equation*}

where \((a,b,c)\) and \((x,y,z)\) are in \(V\) and \(k \in \R\)

Solution.

We consider the vector space properties in Definition 31.1. Let \((a,b,c)\text{,}\) \((u,v,w)\text{,}\) and \((x,y,z)\) be in \(V\) and let \(k,m \in \R\text{.}\) By the definition of addition and multiplication by scalars, both \((a,b,c) \oplus (x,y,z)\) and \(k (x,y,z)\) are in \(V\text{.}\) Note also that

\begin{align*} (a,b,c) \oplus (x,y,z) \amp = (a+x, c+z, b+y)\\ \amp = (x+a, z+c, y+b)\\ \amp = (x,y,z) \oplus (a,b,c)\text{,} \end{align*}

and so addition is commutative in \(V\text{.}\) Since

\begin{equation*} ((1,1,0)\oplus(0,1,1)) \oplus (0,0,1) = (1,1,2) \oplus (0,0,1) = (1,3,1) \end{equation*}

and

\begin{equation*} (1,1,0)\oplus ((0,1,1) \oplus (0,0,1)) = (1,1,0) \oplus (0,2,1) = (1,1,3)\text{,} \end{equation*}

we see that addition is not associative and conclude that \(V\) is not a vector space. At this point we can stop since we have shown that \(V\) is not a vector space.
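
The failure of associativity can also be checked by direct computation. The following short sketch (illustrative code only) encodes the operation \(\oplus\) and evaluates both groupings used above.

def oplus(u, v):
    """The addition defined on V: oplus((a,b,c), (x,y,z)) = (a+x, c+z, b+y)."""
    (a, b, c), (x, y, z) = u, v
    return (a + x, c + z, b + y)

left = oplus(oplus((1, 1, 0), (0, 1, 1)), (0, 0, 1))
right = oplus((1, 1, 0), oplus((0, 1, 1), (0, 0, 1)))
print(left, right)  # (1, 3, 1) (1, 1, 3), so the two groupings differ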

(b)

\(V = \{x \in \R : x > 0\}\) with addition \(\oplus\) and multiplication by scalars defined by

\begin{equation*} x \oplus y = xy \ \text{ and } \ kx = x^k\text{,} \end{equation*}

where \(x\) and \(y\) are in \(V\text{,}\) \(k \in \R\text{,}\) and \(xy\) is the standard product of \(x\) and \(y\)

Solution.

We consider the vector space properties in Definition 31.1. Let \(x\text{,}\) \(y\text{,}\) and \(z\) be in \(V\) and let \(k,m \in \R\text{.}\) Since \(x\) and \(y\) are both positive real numbers, we know that \(xy\) is a positive real number. Thus, \(x \oplus y \in V\) and \(V\) is closed under its addition. Also, \(x^k\) is a positive real number, so \(x^k \in V\) as well. Now

\begin{equation*} x \oplus y = xy = yx = y \oplus x \end{equation*}

and addition is commutative in \(V\text{.}\) Also,

\begin{equation*} (x \oplus y) \oplus z = (xy) \oplus z = (xy)z = x(yz) = x \oplus (yz) = x \oplus (y \oplus z) \end{equation*}

and addition is associative in \(V\text{.}\) Since

\begin{equation*} 1 \oplus x = 1x = x\text{,} \end{equation*}

\(V\) contains an additive identity, which is \(1\text{.}\) The fact that \(x\) is a positive real number implies that \(\frac{1}{x}\) is a positive real number. Thus, \(\frac{1}{x} \in V\) and

\begin{equation*} x \oplus \frac{1}{x} = x\left(\frac{1}{x}\right) = 1 \end{equation*}

and \(V\) contains an additive inverse for each of its elements. We have that

\begin{align*} (k+m)x \amp = x^{k+m} = x^kx^m = x^k \oplus x^m = k(x) \oplus m(x),\\ k(x \oplus y) \amp = k(xy) = x^ky^k = x^k \oplus y^k = k(x) \oplus k(y),\\ (km)x \amp = x^{km} = \left(x^m\right)^k = (m(x))^k = k(m(x)),\\ 1x \amp = x^1 = x\text{.} \end{align*}

So \(V\) satisfies all of the properties of a vector space.

(c)

The set \(W\) of all \(2 \times 2\) matrices of the form \(\left[ \begin{array}{cc} a\amp 0\\b\amp 0 \end{array} \right]\) where \(a\) and \(b\) are real numbers using the standard addition and multiplication by scalars on matrices.

Solution.

Recall that \(\M_{2 \times 2}\) is a vector space using the standard addition and multiplication by scalars on matrices. Any matrix of the form \(\left[ \begin{array}{cc} a\amp 0\\b\amp 0 \end{array} \right]\) can be written as

\begin{equation*} \left[ \begin{array}{cc} a\amp 0\\b\amp 0 \end{array} \right] = a\left[ \begin{array}{cc} 1\amp 0\\0\amp 0 \end{array} \right] + b \left[ \begin{array}{cc} 0\amp 0\\1\amp 0 \end{array} \right]\text{.} \end{equation*}

So \(W = \Span\left\{\left[ \begin{array}{cc} 1\amp 0\\0\amp 0 \end{array} \right], \left[ \begin{array}{cc} 0\amp 0\\1\amp 0 \end{array} \right]\right\}\) and \(W\) is a subspace of \(\M_{2 \times 2}\text{.}\) Thus, \(W\) is a vector space.

(d)

The set \(W\) of all functions \(f\) from \(\R\) to \(\R\) such that \(f(0) \geq 0\) using the standard addition and multiplication by scalars on functions.

Solution.

We will show that \(W\) is not a vector space. Let \(f: \R \to \R\) be defined by \(f(x) = 1\text{.}\) Then \(f(0) \geq 0\) and \(f \in W\text{.}\) However, if \(h = (-1)f\text{,}\) then \(h(0) = (-1)f(0) = -1\) and \(h \notin W\text{.}\) It follows that \(W\) is not closed under multiplication by scalars and \(W\) is not a vector space.

Example 31.8.

Let \(V\) be a vector space and \(\vu\) and \(\vv\) vectors in \(V\text{.}\) Also, let \(a\) and \(b\) be scalars. You may use the result of Exercise 5 that \(c \vzero = \vzero\) for any scalar \(c\) in any vector space.

(a)

If \(a \vv = b \vv\) and \(\vv \neq \vzero\text{,}\) must \(a = b\text{?}\) Use the properties of a vector space or provide a counterexample to justify your answer.

Solution.

We will show that this statement is true. Suppose \(a \vv = b \vv\) and \(\vv \neq \vzero\text{.}\) Then \(\vzero = a \vv - b \vv = (a-b)\vv\text{.}\) If \(a = b\text{,}\) then we are done. So suppose \(a \neq b\text{.}\) Then \(a-b \neq 0\) and \(\frac{1}{a-b}\) is a real number. Then

\begin{align*} \frac{1}{a-b} \vzero \amp = \frac{1}{a-b}((a-b)\vv)\\ \vzero \amp = \left(\frac{1}{a-b}(a-b)\right) \vv\\ \vzero \amp = \vv\text{.} \end{align*}

But we assumed that \(\vv \neq \vzero\text{,}\) so we can conclude that \(a = b\) as desired.

(b)

If \(a \vu = a \vv\) and \(a \neq 0\text{,}\) must \(\vu = \vv\text{?}\) Use the properties of a vector space or provide a counterexample to justify your answer.

Solution.

We will show that this statement is true. Suppose \(a \vu = a \vv\) and \(a \neq 0\text{.}\) Then \(\vzero = a \vu - a \vv = a(\vu - \vv)\text{.}\) Since \(a \neq 0\text{,}\) we know that \(\frac{1}{a}\) is a real number. Thus,

\begin{align*} \frac{1}{a} \vzero \amp = \frac{1}{a} \left(a (\vu - \vv)\right)\\ \vzero \amp = \left( \frac{1}{a}a\right) (\vu - \vv)\\ \vzero \amp = \vu - \vv\\ \vu \amp = \vv\text{.} \end{align*}
(c)

If \(a\vu = b\vv\text{,}\) must \(a = b\) and \(\vu = \vv\text{?}\) Use the properties of a vector space or provide a counterexample to justify your answer.

Solution.

We will demonstrate that this statement is false with a counterexample. Let \(a = 1\text{,}\) \(b = 2\text{,}\) \(\vu = [2 \ 0]^{\tr}\) and \(\vv = [1 \ 0]^{\tr}\) in \(\R^2\text{.}\) Then

\begin{equation*} a \vu = 1[2 \ 0]^{\tr} = [2 \ 0]^{\tr} = 2[1 \ 0]^{\tr} = b \vv\text{,} \end{equation*}

but \(a \neq b\) and \(\vu \neq \vv\text{.}\)

Subsection Summary

  • A set \(V\) on which an operation of addition and a multiplication by scalars is defined is a vector space if for all \(\vu\text{,}\) \(\vv\text{,}\) and \(\vw\) in \(V\) and all scalars \(a\) and \(b\text{:}\)

    1. \(\vu + \vv\) is an element of \(V\text{,}\)

    2. \(\vu + \vv = \vv + \vu\text{,}\)

    3. \((\vu + \vv) + \vw = \vu + (\vv + \vw)\text{,}\)

    4. there is a zero vector \(\vzero\) in \(V\) so that \(\vu + \vzero = \vu\text{,}\)

    5. for each \(\vx\) in \(V\) there is an element \(\vy\) in \(V\) so that \(\vx + \vy = \vzero\text{,}\)

    6. \(a \vu\) is an element of \(V\text{,}\)

    7. \((a+b) \vu = a\vu + b\vu\text{,}\)

    8. \(a(\vu + \vv) = a\vu + a\vv\text{,}\)

    9. \((ab) \vu = a(b\vu)\text{,}\)

    10. \(1 \vu = \vu\text{.}\)

  • A subset \(H\) of a vector space \(V\) is a subspace of \(V\) if

    1. whenever \(\vu\) and \(\vv\) are in \(H\) it is also true that \(\vu + \vv\) is in \(H\text{,}\)

    2. whenever \(\vu\) is in \(H\) and \(a\) is a scalar it is also true that \(a\vu\) is in \(H\text{,}\)

    3. \(\vzero\) is in \(H\text{.}\)

  • A linear combination of vectors \(\vv_1\text{,}\) \(\vv_2\text{,}\) \(\ldots\text{,}\) \(\vv_k\) in a vector space \(V\) is a vector of the form

    \begin{equation*} x_1\vv_1+x_2\vv_2 + \cdots + x_k \vv_k\text{,} \end{equation*}

    where \(x_1\text{,}\) \(x_2\text{,}\) \(\ldots\text{,}\) \(x_k\) are scalars.

  • The span of the vectors \(\vv_1\text{,}\) \(\vv_2\text{,}\) \(\ldots\text{,}\) \(\vv_k\) in a vector space \(V\) is the collection of all linear combinations of \(\vv_1\text{,}\) \(\vv_2\text{,}\) \(\ldots\text{,}\) \(\vv_k\text{.}\) That is,

    \begin{equation*} \Span\{\vv_1, \vv_2, \ldots, \vv_k\} = \{x_1\vv_1+x_2\vv_2 + \cdots + x_k \vv_k : x_1, x_2, \ldots, x_k \text{ are scalars } \}\text{.} \end{equation*}
  • The span of any finite set of vectors in a vector space \(V\) is always a subspace of \(V\text{.}\)

  • This concept of vector space is important because there are many different types of sets (e.g., \(\R^n\text{,}\) \(\M_{m \times n}\text{,}\) \(\pol_n\text{,}\) \(\F\)) that have similar structure, and we can relate them all as members of this larger collection of vector spaces.

Exercises Exercises

1.

The definition of a vector space only states the existence of a zero vector and does not say how many zero vectors the space might have. In this exercise we show that the zero vector in a vector space is unique. To show that the zero vector is unique, we assume that two vectors \(\vzero_1\) and \(\vzero_2\) have the zero vector property.

(a)

Using the fact that \(\vzero_1\) is a zero vector, what vector is \(\vzero_1 + \vzero_2\text{?}\)

Hint.

Use the fact that \(\vzero_1+\vv = \vv\) for any vector \(\vv\) in our vector space.

(b)

Using the fact that \(\vzero_2\) is a zero vector, what vector is \(\vzero_1 + \vzero_2\text{?}\)

Hint.

Same reasoning as in part (a).

(c)

How do we conclude that the zero vector is unique?

Hint.

Use the transitive property of equality.

2.

The definition of a vector space only states the existence of an additive inverse for each vector in the space, but does not say how many additive inverses a vector can have. In this exercise we show that the additive inverse of a vector in a vector space is unique. To show that a vector \(\vv\) has only one additive inverse, we suppose that \(\vv\) has two additive inverses, \(\vu\) and \(\vw\text{,}\) and demonstrate that \(\vu = \vw\text{.}\)

(a)

What equations must \(\vu\) and \(\vw\) satisfy if \(\vu\) and \(\vw\) are additive inverses of \(\vv\text{?}\)

(b)

Use the equations from part (a) to show that \(\vu = \vw\text{.}\) Clearly identify all vector space properties you use in your argument.

3.

4.

Let \(V\) be a vector space and \(\vv\) a vector in \(V\text{.}\) In all of the vector spaces we have seen to date, the additive inverse of the vector \(\vv\) is equal to the scalar multiple \((-1)\vv\text{.}\) This seems reasonable, but it is important to note that this result is not stated in the definition of a vector space, so it is something that we need to verify. To show that \((-1)\vv\) is an additive inverse of the vector \(\vv\text{,}\) we need to demonstrate that

\begin{equation*} \vv + (-1)\vv = \vzero\text{.} \end{equation*}

Verify this equation, explicitly stating which properties you use at each step.

Hint.

Use the fact that \(-1 + 1 = 0\text{.}\)

5.

It is reasonable to expect that if \(c\) is any scalar and \(\vzero\) is the zero vector in a vector space \(V\text{,}\) then \(c \vzero = \vzero\text{.}\) Use the fact that \(\vzero + \vzero = \vzero\) to prove this statement.

6.

Let \(W_1, W_2\) be two subspaces of a vector space \(V\text{.}\) Determine whether \(W_1 \cap W_2\) and \(W_1 \cup W_2\) are subspaces of \(V\text{.}\) Justify each answer clearly.

7.

Find three vectors \(\vv_1, \vv_2, \vv_3\) to express

\begin{equation*} W=\left\{ \left[ \begin{array}{c} a+2b+c \\ b-3c \\ a-c \end{array} \right] : a, b, c \text{ in } \R \right\} \end{equation*}

as \(\Span \{\vv_1, \vv_2, \vv_3\}\text{.}\) How does this justify why \(W\) is a subspace of \(\R^3\text{?}\)

8.

Find three vectors \(\vv_1, \vv_2, \vv_3\) to express

\begin{equation*} W=\left\{ \left[ \begin{array}{cc} a+b \amp a-2c \\ 3b+c \amp a+b-c \end{array} \right] : a, b, c \text{ in } \R \right\} \end{equation*}

as \(\Span \{\vv_1, \vv_2, \vv_3\}\text{.}\) How does this justify why \(W\) is a subspace of \(\M_{2 \times 2}\text{?}\)

9.

Let \(\F\) be the set of all functions from \(\R\) to \(\R\text{,}\) where we define the sum of two functions \(f\) and \(g\) in \(\F\) as the function \(f+g\) satisfying

\begin{equation*} (f+g)(x) = f(x) + g(x) \end{equation*}

for all real numbers \(x\text{,}\) and the scalar multiple \(cf\) of the function \(f\) by the scalar \(c\) to be the function satisfying

\begin{equation*} (cf)(x) = cf(x) \end{equation*}

for all real numbers \(x\text{.}\) Show that \(\F\) is a vector space using these operations.

11.

Determine if each of the following sets of elements is a vector space or not. If appropriate, you can identify a set as a subspace of another vector space, or as a span of a collection of vectors to shorten your solution.

(a)

A line through the origin in \(\R^n\text{.}\)

(b)

The first quadrant in \(\R^2\text{.}\)

(c)

The set of vectors \(\left\{ \left[ \begin{array}{c} a \\ 0 \end{array} \right] : a \text{ in } \Z \right\}\text{.}\)

(d)

The set of all differentiable functions from \(\R\) to \(\R\text{.}\)

(e)

The set of all functions from \(\R\) to \(\R\) which are increasing for every \(x\text{.}\) (Assume that a function \(f\) is increasing if \(f(a) > f(b)\) whenever \(a > b\text{.}\))

(f)

The set of all functions \(f\) from \(\R\) to \(\R\) for which \(f(c)=0\) for some fixed \(c\) in \(\R\text{.}\)

(g)

The set of polynomials of the form \(a+bt\text{,}\) where \(a+b=0\text{.}\)

(h)

The set of all upper triangular \(4\times 4\) real matrices.

(i)

The set of complex numbers \(\C\) where scalar multiplication is defined as multiplication by real numbers.

12.

A reasonable way to extend the idea of the vector space \(\R^n\) to infinity is to let \(\R^\infty\) be the set of all sequences of real numbers. Define addition and multiplication by scalars on \(\R^{\infty}\) by

\begin{equation*} \{x_n\} + \{y_n\} = \{x_n + y_n\} \ \text{ and } \ c\{x_n\} = \{cx_n\} \end{equation*}

where \(\{x_n\}\) denotes the sequence \(x_1, x_2, x_3, \ldots\text{,}\) \(\{y_n\}\) denotes the sequence \(y_1, y_2, y_3, \ldots\) and \(c\) is a scalar.

(a)

Show that \(\R^{\infty}\) is a vector space using these operations.

(b)

Is the set of sequences that have infinitely many zeros a subspace of \(\R^{\infty}\text{?}\) Verify your answer.

(c)

Is the set of sequences which are eventually zero a subspace of \(\R^{\infty}\text{?}\) Verify your answer. (A sequence \(\{x_n\}\) is eventually zero if there is an index \(k_0\) such that \(x_n = 0\) whenever \(n \geq k_0\text{.}\))

(d)

Is the set of decreasing sequences a subspace of \(\R^{\infty}\text{?}\) Verify your answer. (A sequence \(\{x_n\}\) is decreasing if \(x_{n+1} \leq x_n\) for each \(n\text{.}\))

(e)

Is the set of sequences in \(\R^{\infty}\) that have limits at infinity a subspace of \(\R^{\infty}\text{?}\)

(f)

Let \(\ell^2\) be the set of all square summable sequences in \(\R^{\infty}\text{,}\) that is, sequences \(\{x_n\}\) so that \(\sum_{k = 1}^{\infty} x_k^2\) is finite. So, for example, the sequence \(\left\{\frac{1}{n}\right\}\) is in \(\ell^2\text{.}\) Show that \(\ell^2\) is a subspace of \(\R^{\infty}\) (the set \(\ell^2\text{,}\) with the inner product \(\langle \{x_n\}, \{y_n\} \rangle = \sum_{n = 1}^{\infty} x_ny_n\text{,}\) is an example of what is called a Hilbert space). (Hint: show that \(2u^2 + 2v^2 - (u+v)^2 \geq 0\) for any real numbers \(u\) and \(v\text{.}\))

13.

Given two subspaces \(H_1, H_2\) of a vector space \(V\text{,}\) define

\begin{equation*} H_1+H_2 = \{ \vw \mid \vw=\vu+\vv \text{ where } \vu \text{ in } H_1, \vv \text{ in } H_2\} \,\text{.} \end{equation*}

Show that \(H_1+H_2\) is a subspace of \(V\) containing both \(H_1, H_2\) as subspaces. The space \(H_1+H_2\) is the sum of the subspaces \(H_1\) and \(H_2\text{.}\)

14.

Label each of the following statements as True or False. Provide justification for your response.

(a) True/False.

The intersection of any two subspaces of \(V\) is also a subspace.

(b) True/False.

The union of any two subspaces of \(V\) is also a subspace.

(c) True/False.

If \(H\) is a subspace of a vector space \(V\text{,}\) then \(-H=\{ (-1)\vv: \vv \text{ in } H\}\) is equal to \(H\text{.}\)

(d) True/False.

If \(\vv\) is a nonzero vector in \(H\text{,}\) a subspace of \(\R^n\text{,}\) then \(H\) contains the line through the origin and \(\vv\) in \(\R^n\text{.}\)

(e) True/False.

If \(\vv_1, \vv_2\) are nonzero, non-parallel vectors in \(H\text{,}\) a subspace of \(\R^n\text{,}\) then \(H\) contains the plane through the origin, \(\vv_1\) and \(\vv_2\) in \(\R^n\text{.}\)

(f) True/False.

The smallest subspace in \(\R^n\) containing a vector \(\vv\) is a line through the origin.

(g) True/False.

The largest subspace of \(V\) is \(V\text{.}\)

(h) True/False.

The space \(\pol_1\) is a subspace of \(\pol_n\) for \(n \geq 1\text{.}\)

(i) True/False.

The set of constant functions from \(\R\) to \(\R\) is a subspace of \(\F\text{.}\)

(j) True/False.

The set of all polynomial functions with rational coefficients is a subspace of \(\F\text{.}\)

Subsection Project: Hamming Codes and the Hat Puzzle

Recall the hat problem from the beginning of this section. Three players are assigned either a red or blue hat and can only see the colors of the hats of the other players. The goal is to devise a strategy that maximizes the probability that at least one player correctly guesses the color of their hat and no player guesses incorrectly. The players have a 50% chance of winning if one player guesses randomly and all of the others pass. However, the group can do better than 50% with a reasonably simple strategy. There are 2 possibilities for each hat color for a total of \(2^3 = 8\) possible distributions of hat colors. Of these, only red-red-red and blue-blue-blue contain only one hat color, so \(6/8\) or \(3/4\) of the possible hat distributions have two hats of one color and one of the other color. So if a player sees two hats of the same color, that player guesses the other color, and passes otherwise. This gives a 75% chance of winning. This strategy will only work for three players, though. We want to develop an effective strategy that works for larger groups of players.
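
Before generalizing, here is a quick brute-force check of the three-player strategy just described (illustrative code only, not part of the original project); it enumerates all eight hat distributions and confirms the 75% success rate.

from itertools import product

def play(hats):
    """hats is a tuple of three colors, 0 or 1. Each player sees the other
    two hats; if those two match, the player guesses the other color,
    otherwise the player passes (None)."""
    guesses = []
    for i in range(3):
        others = [hats[j] for j in range(3) if j != i]
        guesses.append(1 - others[0] if others[0] == others[1] else None)
    correct = any(g == hats[i] for i, g in enumerate(guesses))
    wrong = any(g is not None and g != hats[i] for i, g in enumerate(guesses))
    return correct and not wrong

wins = sum(play(h) for h in product((0, 1), repeat=3))
print(wins, "wins out of 8")  # 6 wins out of 8, a 75% chance of winning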

There is a strategy, based on Hamming codes, that can be used when the number of players is of the form \(2^k-1\) with \(k \geq 2\text{.}\) This strategy will provide a winning probability of

\begin{equation*} 1 - 2^{-k}\text{.} \end{equation*}

Note that as \(k \to \infty\text{,}\) this probability has a limit of 1. Note also that if \(k=2\) (so that there are 3 players), then the probability is \(\frac{3}{4}\) or \(75\%\text{,}\) the same as for the strategy we came up with earlier.

To understand this strategy, we need to build a slightly different kind of vector space than we have seen until now, one that is based on a binary choice of red or blue. To do so, we identify the hat colors with numbers — 0 for red and 1 for blue. So let \(\mathbb{F} = \{0,1\}\text{.}\) Assume there are \(n = 2^k-1\) players for some integer \(k \geq 2\text{.}\) We can then view a distribution of hats among the \(n = 2^k-1\) players as a vector with \(n\) components from \(\mathbb{F}\text{.}\) That is,

\begin{equation*} \mathbb{F}^n = \{ [\alpha_1 \ \alpha_2 \ \cdots \ \alpha_n]^{\tr} : \alpha_i \in \mathbb{F}\}\text{.} \end{equation*}

We can give some structure to both \(\mathbb{F}\) and \(\mathbb{F}^n\) by noting that we can define addition and multiplication in \(\mathbb{F}\) by

\begin{alignat*}{3} 0 + 0 \amp = 0, \amp 0 + 1 \amp = 1 + 0 = 1, \amp 1+1 \amp = 0\\ 0 \cdot 0 \amp = 0, \amp 0 \cdot 1 \amp = 1 \cdot 0 = 0, \amp 1 \cdot 1 \amp = 1\text{.} \end{alignat*}
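
In computational terms these tables are simply the bitwise XOR and AND operations on the bits 0 and 1; the following sketch (illustrative code only) encodes them.

def f_add(x, y):
    """Addition in F = {0, 1}: exclusive or, since 1 + 1 = 0."""
    return x ^ y

def f_mul(x, y):
    """Multiplication in F = {0, 1}: logical and."""
    return x & y

print([f_add(x, y) for x in (0, 1) for y in (0, 1)])  # [0, 1, 1, 0]
print([f_mul(x, y) for x in (0, 1) for y in (0, 1)])  # [0, 0, 0, 1]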

Project Activity 31.6.

Show that \(\mathbb{F}\) has the same structure as \(\R\text{.}\) That is, show that for all \(x\text{,}\) \(y\text{,}\) and \(z\) in \(\mathbb{F}\text{,}\) the following properties are satisfied.

(a)

\(x + y \in \mathbb{F}\) and \(xy \in \mathbb{F}\)

(b)

\(x + y = y + x\) and \(xy=yx\)

(c)

\((x + y) + z = x + (y + z)\) and \((xy)z = x(yz)\)

(d)

There is an element \(0\) in \(\mathbb{F}\) such that \(x+ 0 = x\)

(e)

There is an element \(1\) in \(\mathbb{F}\) such that \((1)x = x\)

(f)

There is an element \(-x\) in \(\mathbb{F}\) such that \(x+(-x) = 0\)

(g)

If \(x \neq 0\text{,}\) there is an element \(\frac{1}{x}\) in \(\mathbb{F}\) such that \(x\left(\frac{1}{x}\right) = 1\)

(h)

\(x (y + z) = (x y) + (x z)\)

Project Activity 31.6 shows that \(\mathbb{F}\) has the same properties as \(\R\) — that is, \(\mathbb{F}\) is a field. Until now, we have worked with vector spaces whose scalars come from the set of real numbers, but that is not necessary. None of the results we have discovered so far about vector spaces require our scalars to come from \(\R\text{.}\) In fact, we can replace \(\R\) with any field and all of the same vector space properties hold. It follows that \(V = \mathbb{F}^n\) is a vector space over \(\mathbb{F}\text{.}\) As we did in \(\R^n\text{,}\) we define the standard unit vectors \(\ve_1 = [1 \ 0 \ 0 \ \cdots \ 0]^{\tr}\text{,}\) \(\ve_2 = [0 \ 1 \ 0 \ 0 \ \ldots \ 0]^{\tr}\text{,}\) \(\ldots\text{,}\) \(\ve_n = [0 \ 0 \ 0 \ \ldots \ 0 \ 1]^{\tr}\) in \(V = \mathbb{F}^n\text{.}\)

Now we return to the hat puzzle. We have \(n = 2^k-1\) players. Label the players \(1\text{,}\) \(2\text{,}\) \(\ldots\text{,}\) \(n\text{.}\) We can now represent a random placement of hats on heads as a vector \(\vv = [\alpha_1 \ \alpha_2 \ \cdots \ \alpha_n]^{\tr}\) in \(V = \mathbb{F}^n\text{,}\) where \(\alpha_i = 0\) in the \(i\)th entry represents a red hat and \(\alpha_i =1\) a blue hat on player \(i\text{.}\) Since player \(i\) can see all of the other hats, from player \(i\)'s perspective the distribution of hats has the form

\begin{equation*} \vv = \vv_i+ \beta_i \ve_i\text{,} \end{equation*}

where \(\beta_i\) is the unknown color of hat on player \(i\)'s head and

\begin{equation*} \vv_i = [\alpha_1 \ \alpha_2 \ \cdots \ \alpha_{i-1} \ 0 \ \alpha_{i+1} \ \cdots \alpha_n]^{\tr}\text{.} \end{equation*}

In order to analyze the vectors \(\vv\) from player \(i\)'s perspective and to devise an effective strategy, we will partition the set \(V\) into an appropriate disjoint union of subsets.

To provide a different way to look at players, we will use a subspace of \(V\text{.}\) Let \(W\) be a subspace of \(V\) that has a basis of \(k\) vectors. The elements of \(W\) are the linear combinations of \(k\) basis vectors, and each basis vector in a linear combination has 2 possibilities for its weight (from \(\mathbb{F}\)). Thus, \(W\) contains exactly \(2^k = n+1\) vectors. We can then use the \(n=2^k-1\) nonzero vectors in \(W\) to represent our players. Each distribution of hats can be seen as a linear combination of the vectors in \(W\text{.}\) Let \(\vw_1\text{,}\) \(\vw_2\text{,}\) \(\ldots\text{,}\) \(\vw_{2^k-1}\) be the nonzero vectors in \(W\text{.}\) We then define a function \(\varphi : V \to W\) as

\begin{equation*} \varphi([\alpha_1 \ \alpha_2 \ \cdots \ \alpha_n]^{\tr}) = \sum_{i=1}^{n} \alpha_i \vw_i \end{equation*}

that identifies a distribution of hats with a vector in \(W\text{.}\) The subspace that we need to devise our strategy is what is called a Hamming code.

Project Activity 31.7.

Let

\begin{equation*} H = \left\{[\alpha_1 \ \alpha_2 \ \cdots \ \alpha_n]^{\tr} \in V : \sum_{i=1}^{n} \alpha_i \vw_i = \vzero\right\}\text{.} \end{equation*}

Show that \(H\) is a subspace of \(V\text{.}\) (The subspace \(H\) is called the \(\left(2^k-1, 2^k-k-1\right)\) Hamming code (where the first component is the number of elements in a basis for \(V\) and the second the number of elements in a basis for \(H\)). Hamming codes are examples of linear codes — those codes that are subspaces of the larger vector space.)

Now for each \(i\) between 0 and \(n\) we define \(H_i = \ve_i + H\) as

\begin{equation*} H_i = \ve_i + H = \{\ve_i + \vh : \vh \in H\}\text{,} \end{equation*}

where we let \(\ve_0 = \vzero\text{.}\) The sets \(H_i\) are called cosets of \(H\text{.}\)
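
As a small illustration, consider the case \(k=2\text{,}\) so that \(n=3\) and \(V = \mathbb{F}^3\text{.}\) The subspace \(W\) has exactly three nonzero vectors, and, relabeling if necessary, we may assume \(\vw_3 = \vw_1 + \vw_2\text{.}\) Then \(\alpha_1\vw_1 + \alpha_2\vw_2 + \alpha_3\vw_3 = (\alpha_1+\alpha_3)\vw_1 + (\alpha_2+\alpha_3)\vw_2\text{,}\) which equals \(\vzero\) exactly when \(\alpha_1 = \alpha_2 = \alpha_3\text{.}\) So \(H = \{[0 \ 0 \ 0]^{\tr}, [1 \ 1 \ 1]^{\tr}\}\) consists of the two distributions in which every hat is the same color, and the cosets are \(H_0 = H\text{,}\) \(H_1 = \{[1 \ 0 \ 0]^{\tr}, [0 \ 1 \ 1]^{\tr}\}\text{,}\) \(H_2 = \{[0 \ 1 \ 0]^{\tr}, [1 \ 0 \ 1]^{\tr}\}\text{,}\) and \(H_3 = \{[0 \ 0 \ 1]^{\tr}, [1 \ 1 \ 0]^{\tr}\}\text{.}\) These four disjoint sets together contain all \(2^3 = 8\) vectors in \(V\text{,}\) and the distributions in \(H\) are precisely the ones on which the three-player strategy from the start of this project fails.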

Project Activity 31.8.

To complete our strategy for the hat puzzle, we need to know some additional information about the sets \(H_i\text{.}\)

(a)

Show that the sets \(H_i\) are disjoint. That is, show that \(H_i \cap H_j = \emptyset\) if \(i \neq j\text{.}\)

Hint.

If \(\vv \in H_i\) and \(\vv \in H_j\text{,}\) what can we say about \(\ve_i - \ve_j\text{?}\)

(b)

Since \(H_i \subseteq V\) for each \(i\text{,}\) it follows that \(\bigcup_{i=0}^n H_i \subseteq V\text{.}\) Now we show that \(V = \bigcup_{i=0}^n H_i\) by demonstrating that \(\bigcup_{i=0}^n H_i\) has exactly the same number of elements as \(V\text{.}\) We will need one fact for our argument. We will see in a later section that \(H\) has a basis of \(n-k\) elements, so the number of elements in \(H\) is \(2^{n-k}\text{.}\)

(i)

Since the sets \(H_i\) are disjoint, the number of elements in \(\bigcup_{i=1}^n H_i\) is equal to the sum of the number of elements in each \(H_i\text{.}\) Show that each \(H_i\) has the same number of elements as \(H\text{.}\)

(ii)

Now use the fact that the number of elements in \(\bigcup_{i=0}^n H_i\) is equal to the sum of the number of elements in each \(H_i\) to argue that \(V = \bigcup_{i=0}^n H_i\text{.}\)

The useful idea from Project Activity 31.8 is that any hat distribution in \(V\) is in exactly one of the sets \(H_i\text{.}\) Recall that a hat distribution \(\vv = [\alpha_1 \ \alpha_2 \ \cdots \ \alpha_n]^{\tr}\) in \(V\) can be written from player \(i\)'s perspective as

\begin{equation*} \vv = \vv_i + \beta_i \ve_i\text{,} \end{equation*}

where \(\vv_i = [\alpha_1 \ \alpha_2 \ \cdots \ \alpha_{i-1} \ 0 \ \alpha_{i+1} \ \cdots \ \alpha_n]^{\tr}\text{.}\) Our strategy for the hat game can now be revealed.

  • If \(\vv_i + \beta_i \ve_i\) is not in \(H\) for either choice of \(\beta_i\text{,}\) then player \(i\) should pass.

  • If \(\vv_i + \beta_i \ve_i\) is in \(H\text{,}\) then player \(i\) guesses \(1 + \beta_i\text{.}\)

Project Activity 31.9.

Let us analyze this strategy.

(a)

Explain why every player guesses wrong if \(\vv\) is in \(H\text{.}\)

(b)

Now we determine that our strategy is a winning strategy for all hat distributions \(\vv\) that are not in \(H\text{.}\) First we need to know that these two options are the only ones. That is, show that it is not possible for \(\vv_i + \beta_i \ve_i\) to be in \(H\) for both choices of \(\beta_i\text{.}\)

(c)

Now we want to demonstrate that this is a winning strategy if \(\vv \notin H\text{.}\) That is, at least one player guesses a correct hat color and no one else guesses incorrectly. So assume \(\vv \notin H\text{.}\)

(i)

We know that \(\vv \in H_i\) for some unique choice of \(i\text{,}\) so let \(\vv = \ve_i + h\) for some \(h \in H\text{.}\) Explain why player \(i\) can correctly choose color \(1+ \alpha_i\text{.}\)

(ii)

Finally, we need to argue that every player except player \(i\) must pass. So consider player \(j\text{,}\) with \(j \neq i\text{.}\) Recall that

\begin{equation*} \vv = \vv_j + \alpha_j \ve_j\text{.} \end{equation*}

Analyze our strategy and the conditions under which player \(j\) does not pass. Show that this leads to a contradiction.

Project Activity 31.9 completes our analysis of this strategy and shows that our strategy results in a win with probability

\begin{equation*} 1 - \frac{|H|}{|V|} = 1 - \frac{2^{2^k-k-1}}{2^{2^k-1}} = 1 - 2^{-k}\text{.} \end{equation*}
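
As a final check, the following simulation sketch (illustrative code only; it makes the concrete choice of representing \(\vw_i\) by the binary expansion of the player label \(i\)) enumerates all \(2^7\) hat distributions for \(k=3\) and counts the wins produced by the strategy above.

from itertools import product

k = 3
n = 2**k - 1  # 7 players, labeled 1, ..., n

def syndrome(hats):
    """Compute phi(v) by XORing the labels of the players wearing color 1.
    This realizes the map v -> sum of alpha_i w_i, with w_i taken to be the
    binary expansion of i (a concrete choice of the nonzero vectors of W)."""
    s = 0
    for i in range(1, n + 1):
        if hats[i - 1] == 1:
            s ^= i
    return s

def guesses(hats):
    """Each player's guess (0, 1, or None for a pass), computed from the
    visible hats only, following the strategy described above."""
    out = []
    for i in range(1, n + 1):
        visible = tuple(0 if j == i else hats[j - 1] for j in range(1, n + 1))
        s0 = syndrome(visible)      # syndrome if player i's hat were color 0
        if s0 == 0:                 # color 0 would put the distribution in H
            out.append(1)
        elif s0 == i:               # color 1 would put the distribution in H
            out.append(0)
        else:
            out.append(None)
    return out

wins = 0
for hats in product((0, 1), repeat=n):
    g = guesses(hats)
    correct = any(x == hats[i] for i, x in enumerate(g))
    wrong = any(x is not None and x != hats[i] for i, x in enumerate(g))
    wins += correct and not wrong
print(wins, "wins out of", 2**n)  # 112 wins out of 128

The resulting count, \(112/128 = 1 - 2^{-3}\text{,}\) agrees with the probability computed above.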