
Section 23 The Dot Product in \(\R^n\)

Subsection Application: Hidden Figures in Computer Graphics

In video games, the speed at which a computer can render changing graphics views is vitally important. To increase a computer's ability to render a scene, programs often try to identify those parts of the images a viewer could see and those parts the viewer could not see. For example, in a scene involving buildings, the viewer could not see any images blocked by a solid building. In the mathematical world, this can be visualized by graphing surfaces. In Figure 23.1 we see a crude image of a house made up of small polygons (this is how programs generally represent surfaces). On the left in Figure 23.1 we see all of the polygons that are needed to construct the entire surface, even those polygons that lie behind others and would be hidden if the surface were solid. On the right in Figure 23.1 we have hidden the parts of the polygons that we cannot see from our view.

Figure 23.1. Images of a house.

We also see this idea in mathematics when we graph surfaces. Figure 23.2 shows the graph of the surface defined by \(f(x,y) = \sqrt{4-x^2}\) that is made up of polygons. At left we see all of the polygons and at right only those parts that would be visible from our viewing perspective.

Figure 23.2. Graphs of \(f(x,y) = \sqrt{4-x^2}\text{.}\)

By eliminating the parts of the polygons we cannot see from our viewing perspective, the computer program can more quickly render the viewing image. Later in this section we will explore one method for how programs remove the hidden portions of images. This process involves the dot product of vectors.

Subsection Introduction

Orthogonality, which generalizes the idea of perpendicularity, is an important concept in linear algebra. We use the dot product to define orthogonality and, more generally, angles between vectors in \(\R^n\) for any dimension \(n\text{.}\) The dot product has many applications, e.g., finding components of forces acting in different directions in physics and engineering. The dot product is also an example of a larger concept, inner products, that we will discuss later. We introduce and investigate dot products in this section.

We will illustrate the dot product in \(\R^2\text{,}\) but the process we go through will translate to any dimension. Recall that we can represent the vector \(\vv = \left[ \begin{array}{c} v_1 \\ v_2 \end{array} \right]\) as the directed line segment (or arrow) from the origin to the point \((v_1, v_2)\) in \(\R^2\text{,}\) as illustrated in Figure 23.3. Using the Pythagorean Theorem we can then define the length (or magnitude or norm) of the vector \(\vv\) in \(\R^2\) as

\begin{equation*} || \vv || = \sqrt{v_1^2 + v_2^2}\text{.} \end{equation*}

We can also write this norm as

\begin{equation*} \sqrt{v_1v_1 + v_2v_2}\text{.} \end{equation*}

The expression under the square root is an important one, and we will extend it and give it a special name.

Figure 23.3. A vector in \(\R^2\) from the origin to a point.

If \(\vu = [ u_1 \ u_2 ]^{\tr}\) and \(\vv = [ v_1 \ v_2]^{\tr}\) are vectors in \(\R^2\text{,}\) then we call the expression \(u_1v_1+u_2v_2\) the dot product of \(\vu\) and \(\vv\text{,}\) and denote it as \(\vu \cdot \vv\text{.}\) With this idea in mind, we can rewrite the norm of the vector \(\vv\) as

\begin{equation*} || \vv || = \sqrt{\vv \cdot \vv}\text{.} \end{equation*}

The definition of the dot product translates naturally to \(\R^n\) (see Exercise 5 in Section 5).

Definition 23.4.

Let \(\vu = [u_1 \ u_2 \ \cdots \ u_n]^{\tr}\) and \(\vv = [ v_1 \ v_2 \ \cdots \ v_n ]^{\tr}\) be vectors in \(\R^n\text{.}\) The dot product (or scalar product) of \(\vu\) and \(\vv\) is the scalar

\begin{equation*} \vu \cdot \vv = u_1v_1 + u_2v_2 + \cdots + u_nv_n = \displaystyle \sum_{i=1}^n u_iv_i\text{.} \end{equation*}

The dot product then allows us to define the norm (or magnitude or length) of any vector in \(\R^n\text{.}\)

Definition 23.5.

The norm \(||\vv||\) of the vector \(\vv \in \R^n\) is

\begin{equation*} ||\vv|| = \sqrt{\vv \cdot \vv}\text{.} \end{equation*}

We also use the words magnitude or length as alternatives for the word norm. We can equivalently write the norm of the vector \(\vv = [ v_1 \ v_2 \ \cdots \ v_n ]^{\tr}\) as

\begin{equation*} ||\vv|| = \sqrt{v_1^2 + v_2^2 + \cdots + v_n^2}\text{.} \end{equation*}

We can also realize the dot product as a matrix product. If \(\vu = [ u_1 \ u_2 \ \cdots \ u_n ]^{\tr}\) and \(\vv = [ v_1 \ v_2 \ \cdots \ v_n ]^{\tr}\text{,}\) then

\begin{equation*} \vu \cdot \vv = \vu^{\tr}\vv\text{.} \end{equation*}

(Technically, \(\vu^{\tr}\vv\) is a \(1 \times 1\) matrix and not a scalar, but we usually think of \(1 \times 1\) matrices as scalars.)
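As a quick numerical illustration, here is a minimal sketch in Python (assuming the NumPy package is available; the vectors are arbitrary illustrative choices) that computes a dot product, a norm, and the matrix-product form of the dot product:

import numpy as np

u = np.array([1, 2, 3])
v = np.array([4, 5, 6])

dot = np.dot(u, v)                    # 1*4 + 2*5 + 3*6 = 32
norm_v = np.sqrt(np.dot(v, v))        # same value as np.linalg.norm(v)
as_matrix = u.reshape(1, 3) @ v.reshape(3, 1)   # the 1 x 1 matrix [[32]]

print(dot, norm_v, as_matrix[0, 0])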

IMPORTANT NOTE.

The dot product is only defined between two vectors with the same number of components.

Preview Activity 23.1.

(a)

Find \(\vu \cdot \vv\) if \(\vu = [2 \ 3 \ -1 \ 4]^{\tr}\) and \(\vv = [4 \ 6 \ 7 \ -5]^{\tr}\) in \(\R^4\text{.}\)

(b)

The dot product satisfies some useful properties as given in the next theorem.

Theorem 23.6.

Let \(\vu\text{,}\) \(\vv\text{,}\) and \(\vw\) be vectors in \(\R^n\text{,}\) and let \(c\) be a scalar. Then

(a) \(\vu \cdot \vv = \vv \cdot \vu\) (the dot product is commutative),

(b) \((\vu + \vv) \cdot \vw = (\vu \cdot \vw) + (\vv \cdot \vw)\) (the dot product distributes over vector addition),

(c) \((c\vu) \cdot \vv = \vu \cdot (c\vv) = c(\vu \cdot \vv)\text{,}\)

(d) \(\vu \cdot \vu \geq 0\text{,}\) with \(\vu \cdot \vu = 0\) if and only if \(\vu = \vzero\text{.}\)

Verification of some of these properties is left to the exercises. Let \(\vu\) and \(\vv\) be vectors in \(\R^5\) with \(\vu \cdot \vv = -1\text{,}\) \(|| \vu || = 2\) and \(|| \vv || = 3\text{.}\)

(i)

Use property (c) of Theorem 23.6 to determine the value of \(\vu \cdot 2\vv\text{.}\)

(ii)

Use property (b) of Theorem 23.6 to determine the value of \((\vu + \vv) \cdot \vv\text{.}\)

(iii)

Use whatever properties of Theorem 23.6 are needed to determine the value of \((2\vu+4\vv) \cdot (\vu - 7\vv)\text{.}\)

(c)

At times we will want to find vectors in the direction of a given vector that have a certain magnitude. Let \(\vu = [2 \ 2 \ 1]^{\tr}\) in \(\R^3\text{.}\)

(i)

What is \(|| \vu ||\text{?}\)

(ii)

Show that \(\left| \left| \frac{1}{||\vu||} \vu \right| \right| = 1\text{.}\)

(iii)

Vectors with magnitude 1 are important and are given a special name.

Definition 23.7.

A vector \(\vv\) in \(\R^n\) is a unit vector if \(|| \vv || = 1\text{.}\)

We can use unit vectors to find vectors of a given length in the direction of a given vector. Let \(c\) be a positive scalar and \(\vv\) a vector in \(\R^n\text{.}\) Use properties from Theorem 23.6 to show that the magnitude of the vector \(c \frac{\vv}{||\vv||}\) is \(c\text{.}\)
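The following minimal Python sketch (again assuming NumPy; the vector and the length are arbitrary choices) normalizes a vector and then rescales it to a prescribed magnitude:

import numpy as np

v = np.array([3.0, 4.0])        # an arbitrary nonzero vector, with ||v|| = 5
c = 7.0                         # the desired magnitude

unit = v / np.linalg.norm(v)    # unit vector in the direction of v
w = c * unit                    # vector of magnitude c in the direction of v

print(np.linalg.norm(unit))     # 1.0
print(np.linalg.norm(w))        # 7.0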

Subsection The Distance Between Vectors

Finding optimal solutions to systems is an important problem in applied mathematics. It is often the case that we cannot find an exact solution that satisfies certain constraints, so we look instead for the “best” solution that satisfies the constraints. An example of this is fitting a least squares line to a set of data. As we will see, the dot product will allow us to find “best” solutions to certain types of problems, where we measure accuracy using the notion of a distance between vectors. Geometrically, we can represent a vector \(\vu\) as a directed line segment from the origin to the point defined by \(\vu\text{.}\) If we have two vectors \(\vu\) and \(\vv\text{,}\) we can think of the length of the difference \(\vu - \vv\) as a measure of how far apart the two vectors are from each other. It is natural, then, to define the distance between vectors as follows.

Definition 23.8.

Let \(\vu\) and \(\vv\) be vectors in \(\R^n\text{.}\) The distance between \(\vu\) and \(\vv\) is the length of the difference \(\vu - \vv\) or

\begin{equation*} || \vu - \vv ||\text{.} \end{equation*}

As Figure 23.9 illustrates, if vectors \(\vu\) and \(\vv\) emanate from the same initial point, and \(P\) and \(Q\) are the terminal points of \(\vu\) and \(\vv\text{,}\) respectively, then the difference \(|| \vu - \vv||\) is the standard Euclidean distance between the points \(P\) and \(Q\text{.}\)

Figure 23.9. \(|| \vu - \vv||\text{.}\)
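A minimal Python sketch (assuming NumPy; the vectors are arbitrary) of this distance computation:

import numpy as np

u = np.array([1.0, 5.0])
v = np.array([4.0, 1.0])

dist = np.linalg.norm(u - v)    # sqrt((1-4)^2 + (5-1)^2) = 5.0
print(dist)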

Subsection The Angle Between Two Vectors

Determining a “best” solution to a problem often involves finding a solution that minimizes a distance. We generally accomplish a minimization through orthogonality — which depends on the angle between vectors. Given two vectors \(\vu\) and \(\vv\) in \(\R^n\text{,}\) we position the vectors so that they emanate from the same initial point. If the vectors are nonzero, then they determine a plane in \(\R^n\text{.}\) In that plane there are two angles that these vectors create. We will define the angle between the vectors to be the smaller of these two angles. The dot product will tell us how to find the angle between vectors. Let \(\vu\) and \(\vv\) be vectors in \(\R^n\) and \(\theta\) the angle between them as illustrated in Figure 23.10.

Figure 23.10. The angle between \(\vu\) and \(\vv\)

Using the Law of Cosines, we have

\begin{equation*} ||\vu-\vv||^2=||\vu||^2+||\vv||^2 - 2||\vu|| \ ||\vv|| \ \cos(\theta) \,\text{.} \end{equation*}

Rearranging, we obtain

\begin{align*} ||\vu|| \ ||\vv|| \ \cos(\theta) \amp = \frac{1}{2} \left( ||\vu||^2+||\vv||^2 - ||\vu-\vv||^2 \right)\\ \amp = \frac{1}{2} (||\vu||^2+||\vv||^2 - (\vu-\vv)\cdot (\vu-\vv) )\\ \amp = \frac{1}{2} (||\vu||^2+||\vv||^2 - \vu\cdot \vu +2\vu \cdot \vv - \vv\cdot \vv )\\ \amp = \vu \cdot \vv \, \text{.} \end{align*}

So the angle \(\theta\) between two nonzero vectors \(\vu\) and \(\vv\) in \(\R^n\) satisfies the equation

\begin{equation} \cos(\theta) = \frac{\vu \cdot \vv}{||\vu|| \ ||\vv||}\text{.}\tag{23.1} \end{equation}
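For example, a minimal Python sketch (assuming NumPy; the vectors are arbitrary choices) that computes the angle from Equation (23.1):

import numpy as np

u = np.array([1.0, 0.0, 1.0])
v = np.array([0.0, 1.0, 1.0])

cos_theta = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
theta = np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0)))   # clip guards against roundoff
print(theta)                    # 60.0 degrees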

Of particular interest to us will be the situation where vectors \(\vu\) and \(\vv\) are orthogonal (perpendicular). (We use the term orthogonal instead of perpendicular because we will be able to extend this idea to situations where we normally don't think of objects as being perpendicular.) Intuitively, we think of two vectors as orthogonal if the angle between them is \(90^{\circ}\text{.}\)

Activity 23.2.

(a)

The vectors \(\ve_1 = [ 1 \ 0]^{\tr}\) and \(\ve_2 = [0 \ 1]^{\tr}\) are perpendicular in \(\R^2\text{.}\) What is \(\ve_1 \cdot \ve_2\text{?}\)

(b)

Now let \(\vu\) and \(\vv\) be any vectors in \(\R^n\text{.}\)

(i)

Suppose the angle between nonzero vectors \(\vu\) and \(\vv\) is \(90^{\circ}\text{.}\) What does Equation (23.1) tell us about \(\vu \cdot \vv\text{?}\)

(ii)

Now suppose that \(\vu \cdot \vv = 0\text{.}\) What does Equation (23.1) tell us about the angle between \(\vu\) and \(\vv\text{?}\) Why?

(iii)

Explain why the following definition makes sense.

Definition 23.11.

Two vectors \(\vu\) and \(\vv\) in \(\R^n\) are orthogonal if \(\vu \cdot \vv = 0\text{.}\)

(iv)

According to Definition 23.11, to which vectors is \(\vzero\) orthogonal? Does this make sense to you intuitively? Explain.

Activity 23.3.

(a)

Find the angle between the two vectors \(\vu = [1 \ 3 \ -2 \ 5]^{\tr}\) and \(\vv = [5 \ 2 \ 3 \ -1]^{\tr}\text{.}\)

(b)

Find, if possible, two non-parallel vectors orthogonal to \(\vu = \left[ \begin{array}{r} 0\\3\\-2\\1 \end{array} \right]\text{.}\)

Subsection Orthogonal Projections

When running a sprint, the racers may be aided or slowed by the wind. The wind assistance is a measure of the wind speed that is helping push the runners down the track. It is much easier to run a very fast race if the wind is blowing hard in the direction of the race. So that world records aren't dependent on the weather conditions, times are only recorded as record times if the wind aiding the runners is less than or equal to 2 meters per second. Wind speed for a race is recorded by a wind gauge that is set up close to the track. It is important to note, however, that weather is not always as cooperative as we might like. The wind does not always blow exactly in the direction of the track, so the gauge must account for the angle the wind makes with the track. If the wind is blowing in the direction of the vector \(\vu\) in Figure 23.12 and the track is in the direction of the vector \(\vv\) in Figure 23.12, then only part of the total wind vector is actually working to help the runners. This part is called the orthogonal projection of the vector \(\vu\) onto the vector \(\vv\) and is denoted \(\proj_{\vv} \vu\text{.}\) The next activity shows how to find this projection.

Figure 23.12. The orthogonal projection of \(\vu\) onto \(\vv\text{.}\)

Activity 23.4.

Since the orthogonal projection \(\proj_{\vv} \vu\) is in the direction of \(\vv\text{,}\) there exists a constant \(c\) such that \(\proj_{\vv} \vu = c \vv\text{.}\) If we determine the value of \(c\text{,}\) we can find \(\proj_{\vv} \vu\text{.}\)

(a)

The wind component that acts perpendicular to the direction of \(\vv\) is called the projection of \(\vu\) orthogonal to \(\vv\) and is denoted \(\proj_{\perp \vv} \vu\) as shown in Figure 23.12. Write an equation that involves \(\proj_{\vv} \vu\text{,}\) \(\proj_{\perp \vv} \vu\text{,}\) and \(\vu\text{.}\) Then solve that equation for \(\proj_{\perp \vv} \vu\text{.}\)

(b)

Given that \(\vv\) and \(\proj_{\perp \vv} \vu\) are orthogonal, what does that tell us about \(\vv \cdot \proj_{\perp \vv} \vu\text{?}\) Combine this fact with the result of part (a) and that \(\proj_{\vv} \vu = c \vv\) to obtain an equation involving \(\vv\text{,}\) \(\vu\text{,}\) and \(c\text{.}\)

(c)

Solve for \(c\) using the equation you found in the previous step.

(d)

Use your value of \(c\) to identify \(\proj_{\vv} \vu\text{.}\)

To summarize:

Definition 23.13.

Let \(\vu\) and \(\vv\) be vectors in \(\R^n\) with \(\vv \neq \vzero\text{.}\)

  1. The orthogonal projection of \(\vu\) onto \(\vv\) is the vector

    \begin{equation} \proj_{\vv} \vu = \frac{\vu \cdot \vv}{||\vv||^2} \vv\text{.}\tag{23.2} \end{equation}
  2. The projection of \(\vu\) orthogonal to \(\vv\) is the vector

    \begin{equation*} \proj_{\perp \vv} \vu = \vu - \proj_{\vv} \vu\text{.} \end{equation*}
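A minimal Python sketch (assuming NumPy; the vectors are arbitrary choices) that computes both projections and confirms that \(\proj_{\perp \vv} \vu\) is orthogonal to \(\vv\text{:}\)

import numpy as np

u = np.array([1.0, 4.0])
v = np.array([3.0, 1.0])

proj = (np.dot(u, v) / np.dot(v, v)) * v    # proj_v u, using (u . v)/||v||^2
perp = u - proj                             # the projection of u orthogonal to v

print(proj)                                 # [2.1 0.7]
print(perp)                                 # [-1.1  3.3]
print(np.dot(v, perp))                      # 0.0 (up to roundoff)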

Activity 23.5.

Let \(\vu = \left[ \begin{array}{c} 5\\8 \end{array} \right]\) and \(\vv = \left[ \begin{array}{r} 6\\-10 \end{array} \right]\text{.}\) Find \(\proj_{\vv} \vu\) and \(\proj_{\perp \vv} \vu\) and draw a picture to illustrate.

The orthogonal projection of a vector \(\vu\) onto a vector \(\vv\) is really a projection of the vector \(\vu\) onto the space \(\Span\{\vv\}\text{.}\) The vector \(\proj_{\vv} \vu\) is the best approximation to \(\vu\) of all the vectors in \(\Span\{\vv\}\) in the sense that \(\proj_{\vv} \vu\) is the closest to \(\vu\) among all vectors in \(\Span\{\vv\}\text{,}\) as we will prove later.

Subsection Orthogonal Complements

In Activity 23.2 we defined two vectors \(\vu\) and \(\vv\) in \(\R^n\) to be orthogonal (or perpendicular) if \(\vu \cdot \vv = 0\text{.}\) A use of orthogonality in geometry is to define a plane. A plane through the origin in \(\R^3\) is a two-dimensional subspace of \(\R^3\text{,}\) and a plane is defined to be the set of all vectors in \(\R^3\) that are orthogonal to a given vector (called a normal vector). For example, to find the equation of the plane through the origin in \(\R^3\) orthogonal to the normal vector \(\vn = [1 \ 2 \ -1]^{\tr}\text{,}\) we seek all the vectors \(\vv = [x \ y \ z]^{\tr}\) such that

\begin{equation*} \vv \cdot \vn = 0\text{.} \end{equation*}

This gives us the equation

\begin{equation*} x+2y-z = 0 \end{equation*}

as the equation of this plane. The collection of all vectors that are orthogonal to a given subspace of vectors is called the orthogonal complement of that subspace in \(\R^n\text{.}\)

Definition 23.14.

Let \(W\) be a subspace of \(\R^n\) for some \(n \geq 1\text{.}\) The orthogonal complement of \(W\) is the set

\begin{equation*} W^{\perp} = \{\vx \in \R^n : \vx \cdot \vw = 0 \text{ for all } \vw \in W\}\text{.} \end{equation*}

Preview Activity 23.6.

Let \(W = \Span\{[1 \ -1]^{\tr}\}\) in \(\R^2\text{.}\) Completely describe all vectors in \(W^{\perp}\) both algebraically and geometrically.

There is a more general idea here. If we have any set \(S\) of vectors in \(\R^n\text{,}\) we let \(S^{\perp}\) (read as “\(S\) perp”, called the orthogonal complement of \(S\)) be the set of all vectors in \(\R^n\) that are orthogonal to every vector in \(S\text{.}\) In our plane example, the set \(S\) is \(\{\vn\}\) and \(S^{\perp}\) is the plane with equation \(x+2y-z=0\text{.}\)

Activity 23.7.

We have seen another example of orthogonal complements. Let \(A\) be an \(m \times n\) matrix with rows \(\vr_1\text{,}\) \(\vr_2\text{,}\) \(\ldots\text{,}\) \(\vr_m\) in order. Consider the three spaces \(\Nul A\text{,}\) \(\Row A\text{,}\) and \(\Col A\) related to \(A\text{,}\) where \(\Row A = \Span\{\vr_1, \vr_2, \ldots, \vr_m\}\) (that is, \(\Row A\) is the span of the rows of \(A\)). Let \(\vx\) be a vector in \(\Row A\text{.}\)

(a)

What does it mean for \(\vx\) to be in \(\Row A\text{?}\)

(b)

Now let \(\vy\) be a vector in \(\Nul A\text{.}\) Use the result of part (a) and the fact that \(A \vy = \vzero\) to explain why \(\vx \cdot \vy = 0\text{.}\) Explain how this verifies \((\Row A)^{\perp} = \Nul A\text{.}\)

Hint.

Calculate \(A \vy\) using scalar products of rows of \(A\) with \(\vy\text{.}\)

(c)

Use \(A^\tr\) in place of \(A\) in the result of the previous part to show \((\Col A)^{\perp} = \Nul A^{\tr}\text{.}\)

The activity proves the following theorem.

Theorem 23.15.

Let \(A\) be an \(m \times n\) matrix. Then

\begin{equation*} (\Row A)^{\perp} = \Nul A \ \text{ and } \ (\Col A)^{\perp} = \Nul A^{\tr}\text{.} \end{equation*}
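We can also check this theorem numerically; the following is a minimal sketch assuming both NumPy and SciPy are available (the matrix is an arbitrary example):

import numpy as np
from scipy.linalg import null_space

A = np.array([[1.0, 2.0, -1.0],
              [0.0, 1.0,  3.0]])

N = null_space(A)    # the columns of N form a basis for Nul A

# Every row of A (and hence every vector in Row A) is orthogonal to
# every vector in Nul A, so A @ N is a matrix of (numerical) zeros.
print(np.allclose(A @ N, 0))    # True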

To show that a vector is in the orthogonal complement of a subspace, it is not necessary to demonstrate that the vector is orthogonal to every vector in the subspace. If we have a basis for the subspace, it suffices to show that the vector is orthogonal to every vector in that basis, as the next theorem demonstrates.

Theorem 23.16.

Let \(\CB = \{\vw_1, \vw_2, \ldots, \vw_m\}\) be a basis for a subspace \(W\) of \(\R^n\text{.}\) A vector \(\vv\) in \(\R^n\) is orthogonal to every vector in \(W\) if and only if \(\vv\) is orthogonal to every vector in \(\CB\text{.}\)

Proof.

Let \(\CB = \{\vw_1, \vw_2, \ldots, \vw_m\}\) be a basis for a subspace \(W\) of \(\R^n\) and let \(\vv\) be a vector in \(\R^n\text{.}\) Our theorem is a biconditional, so we need to prove both implications. Since \(\CB \subset W\text{,}\) it follows that if \(\vv\) is orthogonal to every vector in \(W\text{,}\) then \(\vv\) is orthogonal to every vector in \(\CB\text{.}\) This proves the forward implication. Now we assume that \(\vv\) is orthogonal to every vector in \(\CB\) and show that \(\vv\) is orthogonal to every vector in \(W\text{.}\) Let \(\vx\) be a vector in \(W\text{.}\) Then

\begin{equation*} \vx = x_1\vw_1 + x_2\vw_2 + \cdots + x_m\vw_m \end{equation*}

for some scalars \(x_1\text{,}\) \(x_2\text{,}\) \(\ldots\text{,}\) \(x_m\text{.}\) Then

\begin{align*} \vv \cdot \vx \amp = \vv \cdot (x_1\vw_1 + x_2\vw_2 + \cdots + x_m\vw_m)\\ \amp = x_1(\vv \cdot \vw_1) + x_2(\vv \cdot \vw_2) + \cdots + x_m(\vv \cdot \vw_m)\\ \amp = 0\text{.} \end{align*}

Thus, \(\vv\) is orthogonal to \(\vx\) and \(\vv\) is orthogonal to every vector in \(W\text{.}\)
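A quick numerical illustration of this argument (a Python sketch assuming NumPy; the basis vectors and the combination are arbitrary choices):

import numpy as np

w1 = np.array([1.0, 0.0, 2.0])
w2 = np.array([0.0, 1.0, 1.0])
v = np.array([-2.0, -1.0, 1.0])    # orthogonal to both w1 and w2

x = 5.0 * w1 - 2.0 * w2            # an arbitrary vector in W = Span{w1, w2}
print(np.dot(v, w1), np.dot(v, w2), np.dot(v, x))    # 0.0 0.0 0.0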

Activity 23.8.

Let \(W = \Span\left\{ \left[ \begin{array}{c} 1 \\ 1 \\ 0 \end{array} \right], \left[ \begin{array}{c} 0 \\ 0 \\ 1 \end{array} \right] \right\}\text{.}\) Find all vectors in \(W^{\perp}\text{.}\)

We will work more closely with projections and orthogonal complements in later sections.

Subsection Examples

What follows are worked examples that use the concepts from this section.

Example 23.17.

Let \(\ell\) be the line defined by the equation \(ax+by+c=0\) in \(\R^2\text{,}\) and let \(P = (x_0,y_0)\) be a point in the plane. In this example we will learn how to find the distance from \(P\) to \(\ell\text{.}\)

(a)

Show that \(\vn = [a \ b]^{\tr}\) is orthogonal to the line \(\ell\text{.}\) That is, \(\vn\) is orthogonal to any vector on the line \(\ell\text{.}\)

Solution.

Any vector on the line \(\ell\) is a vector between two points on the line. Let \(Q = (x_1,y_1)\) and \(R = (x_2,y_2)\) be points on the line \(\ell\text{.}\) Then \(\vu = \overrightarrow{QR} = [x_2-x_1 \ y_2-y_1]^{\tr}\) is a vector on line \(\ell\text{.}\) Since \(Q\) and \(R\) are on the line, we know that \(ax_1+by_1+c=0\) and \(ax_2+by_2+c = 0\text{.}\) So \(-c = ax_1+by_1 = ax_2+by_2\) and

\begin{equation*} 0 = a(x_2-x_1) + b(y_2-y_1) = \vn \cdot \vu\text{.} \end{equation*}

Thus, \(\vn = [a \ b]^{\tr}\) is orthogonal to every vector on the line \(\ell\text{.}\)

(b)

Let \(Q = (x_1, y_1)\) be any point on line \(\ell\text{.}\) Draw a representative picture of \(P\text{,}\) \(\vn\) with its initial point at \(P\text{,}\) along with \(Q\) and \(\ell\text{.}\) Explain how to use a projection to determine the distance from \(P\) to \(\ell\text{.}\)

Solution.

A picture of the situation is shown in Figure 23.18. If \(\vv = \overrightarrow{PQ}\text{,}\) then the distance from point \(P\) to line \(\ell\) is given by \(||\proj_{\vn} \vv||\text{.}\)

Figure 23.18. Distance from a point to a line.

(c)

Use the idea from part (b) to show that the distance \(d\) from \(P\) to \(\ell\) satisfies

\begin{equation} d = \frac{|ax_0+by_0+c|}{\sqrt{a^2+b^2}}\text{.}\tag{23.3} \end{equation}
Solution.

Recall that \(\vn = [a \ b]^{\tr}\) and \(\vv = \overrightarrow{PQ} = [x_1-x_0 \ y_1-y_0]^{\tr}\text{.}\) Since \(ax_1+by_1+c = 0\text{,}\) we have

\begin{align*} \proj_{\vn} \vv \amp = \frac{\vv \cdot \vn}{||\vn||^2} \vn\\ \amp = \frac{a(x_1-x_0)+ b(y_1-y_0)}{a^2+b^2} [a \ b]^{\tr}\\ \amp = \frac{ax_1+by_1-ax_0-by_0}{a^2+b^2} [a \ b]^{\tr}\\ \amp = \frac{-c-ax_0-by_0}{a^2+b^2} [a \ b]^{\tr}\text{.} \end{align*}

So

\begin{equation*} ||\proj_{\vn} \vv|| = \frac{|ax_0+by_0+c|}{a^2+b^2} \sqrt{a^2+b^2} = \frac{|ax_0+by_0+c|}{\sqrt{a^2+b^2}}\text{.} \end{equation*}
(d)

Use Equation (23.3) to find the distance from the point \((3,4)\) to the line \(y = 2x+1\text{.}\)

Solution.

Here we have \(P = (3,4)\text{,}\) and the equation of our line is \(2x-y+1=0\text{.}\) So \(a=2\text{,}\) \(b=-1\text{,}\) and \(c=1\text{.}\) Thus, the distance from \(P\) to the line is

\begin{equation*} \frac{|ax_0+by_0+c|}{\sqrt{a^2+b^2}} = \frac{|2(3)-(4)+1|}{\sqrt{4+1}} = \frac{3}{\sqrt{5}}\text{.} \end{equation*}
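A one-line check of this computation in Python (assuming NumPy):

import numpy as np

a, b, c = 2.0, -1.0, 1.0    # the line 2x - y + 1 = 0
x0, y0 = 3.0, 4.0           # the point P

d = abs(a * x0 + b * y0 + c) / np.sqrt(a ** 2 + b ** 2)
print(d)                    # 3/sqrt(5), approximately 1.342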

Example 23.19.

Let \(a\text{,}\) \(b\text{,}\) and \(c\) be scalars with \(a \neq 0\text{,}\) and let

\begin{equation*} W = \{[x \ y \ z]^{\tr} \in \R^3 : ax+by+cz=0\} \,\text{.} \end{equation*}
(a)

Find two vectors that span \(W\text{,}\) showing that \(W\) is a subspace of \(\R^3\text{.}\) (In fact, \(W\) is a plane through the origin in \(\R^3\text{.}\))

Solution.

The coefficient matrix for the equation \(ax+by+cz = 0\) is \([a \ b \ c]\text{.}\) The first column is a pivot column and the others are not. So \(y\) and \(z\) are free variables and

\begin{equation*} [x \ y \ z]^{\tr} = \left[-\frac{b}{a}y - \frac{c}{a}z, y, z\right]^{\tr} = y\left[ -\frac{b}{a} \ 1 \ 0\right]^{\tr} + z\left[ -\frac{c}{a} \ 0 \ 1\right]^{\tr}\text{.} \end{equation*}

So \(W = \Span\left\{\left[ -\frac{b}{a} \ 1 \ 0\right]^{\tr}, \left[ -\frac{c}{a} \ 0 \ 1\right]^{\tr}\right\}\text{.}\)

(b)

Find a vector \(\vn\) that is orthogonal to the two vectors you found in part (a).

Solution.

If we let \(\vn = [a \ b \ c]^{\tr}\text{,}\) then

\begin{align*} \vn \cdot \left[ -\frac{b}{a} \ 1 \ 0\right]^{\tr} \amp = -b+b = 0\\ \vn \cdot \left[ -\frac{c}{a} \ 0 \ 1\right]^{\tr} \amp = -c+c = 0\text{.} \end{align*}

Thus, \([a \ b \ c]^{\tr}\) is orthogonal to both \(\left[ -\frac{b}{a} \ 1 \ 0\right]^{\tr}\) and \(\left[ -\frac{c}{a} \ 0 \ 1\right]^{\tr}\text{.}\)

(c)

Explain why \(\{\vn\}\) is a basis for \(W^{\perp}\text{.}\)

Solution.

Let \(\vu = \left[ -\frac{b}{a} \ 1 \ 0\right]^{\tr}\) and \(\vv = \left[ -\frac{c}{a} \ 0 \ 1\right]^{\tr}\text{.}\) Every vector in \(W\) has the form \(x\vu + y\vv\) for some scalars \(x\) and \(y\text{,}\) and

\begin{equation*} \vn \cdot (x\vu + y\vv) = x(\vn \cdot \vu) + y (\vn \cdot \vv) = 0\text{.} \end{equation*}

So \(\vn \in W^{\perp}\text{.}\) Now we need to verify that \(\{\vn\}\) spans \(W^{\perp}\text{.}\) Let \(\vw = [w_1 \ w_2 \ w_3]^{\tr}\) be in \(W^{\perp}\text{.}\) Then \(\vw \cdot \vz = 0\) for every \(\vz \in W\text{.}\) In particular, \(\vw \cdot \vu = 0\) or \(-\frac{b}{a}w_1 + w_2 = 0\text{,}\) and \(\vw \cdot \vv = 0\) or \(-\frac{c}{a}w_1 + w_3 = 0\text{.}\) Equivalently, we have \(w_2 = \frac{b}{a}w_1\) and \(w_3 = \frac{c}{a}w_1\text{.}\) So

\begin{align*} \vw \amp = [w_1 \ w_2 \ w_3]^{\tr}\\ \amp = \left[ w_1 \ \frac{b}{a}w_1 \ \frac{c}{a}w_1\right]^{\tr}\\ \amp = \frac{1}{a}[a \ b \ c]^{\tr}w_1\\ \amp = \frac{w_1}{a} \vn\text{.} \end{align*}

So every vector in \(W^{\perp}\) is a multiple of \(\vn\text{,}\) and \(\{\vn\}\) spans \(W^{\perp}\text{.}\) We conclude that \(\{\vn\}\) is a basis for \(W^{\perp}\text{.}\) Thus, the vector \([a \ b \ c]^{\tr}\) is a normal vector to the plane \(ax+by+cz=0\) if \(a \neq 0\text{.}\) The same reasoning works if at least one of \(a\text{,}\) \(b\text{,}\) or \(c\) is nonzero, so we can say in every case that \([a \ b \ c]^{\tr}\) is a normal vector to the plane \(ax+by+cz=0\text{.}\)

Subsection Summary

  • The dot product of vectors \(\vu = [u_1 \ u_2 \ \cdots \ u_n ]^{\tr}\) and \(\vv = [ v_1 \ v_2 \ \cdots \ v_n ]^{\tr}\) in \(\R^n\) is the scalar

    \begin{equation*} \vu \cdot \vv = u_1v_1 + u_2v_2 + \cdots + u_nv_n = \displaystyle \sum_{i=1}^n u_iv_i\text{.} \end{equation*}
  • The angle \(\theta\) between two nonzero vectors \(\vu\) and \(\vv\) in \(\R^n\) satisfies the equation

    \begin{equation*} \cos(\theta) = \frac{\vu \cdot \vv}{||\vu|| \ ||\vv||} \end{equation*}

and \(0 \leq \theta \leq 180^{\circ}\text{.}\)

  • Two vectors \(\vu\) and \(\vv\) are orthogonal if \(\vu \cdot \vv = 0\text{.}\)

  • The length, or norm, of the vector \(\vu\) can be found as \(\ds || \vu || = \sqrt{\vu \cdot \vu}\text{.}\)

  • The distance between the vectors \(\vu\) and \(\vv\) in \(\R^n\) is \(||\vu - \vv ||\text{,}\) which is the length of the difference \(\vu - \vv\text{.}\)

  • Let \(\vu\) and \(\vv\) be vectors in \(\R^n\text{.}\)

    • The orthogonal projection of \(\vu\) onto \(\vv\) is the vector

      \begin{equation*} \proj_{\vv} \vu = \frac{\vu \cdot \vv}{||\vv||^2} \vv\text{.} \end{equation*}
    • The projection of \(\vu\) perpendicular to \(\vv\) is the vector

      \begin{equation*} \proj_{\perp \vv} \vu = \vu - \proj_{\vv} \vu\text{.} \end{equation*}
  • The orthogonal complement of the subspace \(W\) of \(\R^n\) is the set

    \begin{equation*} W^{\perp} = \{\vx \in \R^n : \vx \cdot \vw = 0 \text{ for all } \vw \in W\}\text{.} \end{equation*}

Exercises Exercises

1.

For each of the following pairs of vectors, find \(\vu \cdot \vv\text{,}\) calculate the angle between \(\vu\) and \(\vv\text{,}\) determine if \(\vu\) and \(\vv\) are orthogonal, find \(||\vu||\) and \(||\vv||\text{,}\) calculate the distance between \(\vu\) and \(\vv\text{,}\) and determine the orthogonal projection of \(\vu\) onto \(\vv\text{.}\)

(a)

\(\vu = [1 \ 2]^{\tr}\text{,}\) \(\vv = [-2 \ 1]^{\tr}\)

(b)

\(\vu = [2 \ -2]^{\tr}\text{,}\) \(\vv = [1 \ -1]^{\tr}\)

(c)

\(\vu = [2 \ -1]^{\tr}\text{,}\) \(\vv = [1 \ 3]^{\tr}\)

(d)

\(\vu = [1 \ 2 \ 0]^{\tr}\text{,}\) \(\vv = [-2 \ 1 \ 1]^{\tr}\)

(e)

\(\vu = [0 \ 0 \ 1]^{\tr}\text{,}\) \(\vv = [1 \ 1 \ 1]^{\tr}\)

2.

Given \(\vu=[2 \ 1\ 2]^\tr\text{,}\) find a vector \(\vv\) so that the angle between \(\vu\) and \(\vv\) is \(60^\circ\) and the orthogonal projection of \(\vv\) onto \(\vu\) has length 2.

3.

For which value(s) of \(h\) is the angle between \([1\ 1\ h]^\tr\) and \([1\ 2\ 1]^\tr\) equal to \(60^\circ\text{?}\)

4.

Let \(A = [a_{ij}]\) be a \(k \times m\) matrix with rows \(\vr_1\text{,}\) \(\vr_2\text{,}\) \(\ldots\text{,}\) \(\vr_k\text{,}\) and let \(B = [\vb_1 \ \vb_2 \ \cdots \ \vb_n]\) be an \(m \times n\) matrix with columns \(\vb_1\text{,}\) \(\vb_2\text{,}\) \(\ldots\text{,}\) \(\vb_n\text{.}\) Show that we can write the matrix product \(AB\) in a shorthand way as \(AB = [\vr_i \cdot \vb_j]\text{.}\)

5.

Let \(A\) be an \(m \times n\) matrix, \(\vu\) a vector in \(\R^n\text{,}\) and \(\vv\) a vector in \(\R^m\text{.}\) Show that

\begin{equation*} A\vu \cdot \vv = \vu \cdot A^{\tr} \vv\text{.} \end{equation*}

6.

Let \(\vu\text{,}\) \(\vv\text{,}\) and \(\vw\) be vectors in \(\R^n\text{.}\) Show that

(a)

\((\vu + \vv) \cdot \vw = (\vu \cdot \vw) + (\vv \cdot \vw)\) (the dot product distributes over vector addition)

(b)

If \(c\) is an arbitrary constant, then \((c\vu) \cdot \vv = \vu \cdot (c\vv) = c(\vu \cdot \vv)\)

7.

The Pythagorean Theorem states that if \(a\) and \(b\) are the lengths of the legs of a right triangle whose hypotenuse has length \(c\text{,}\) then \(a^2+b^2=c^2\text{.}\) If we think of the legs as defining vectors \(\vu\) and \(\vv\text{,}\) then the hypotenuse is the vector \(\vu+\vv\) and we can restate the Pythagorean Theorem as

\begin{equation*} ||\vu+\vv||^2 = ||\vu||^2+||\vv||^2\text{.} \end{equation*}

In this exercise we show that this result holds in any dimension.

(a)

Let \(\vu\) and \(\vv\) be orthogonal vectors in \(\R^n\text{.}\) Show that \(||\vu+\vv||^2 = ||\vu||^2+||\vv||^2\text{.}\)

Hint.

Rewrite \(||\vu+\vv||^2\) using the dot product.

(b)

Must it be true that if \(\vu\) and \(\vv\) are vectors in \(\R^n\) with \(||\vu+\vv||^2 = ||\vu||^2+||\vv||^2\text{,}\) then \(\vu\) and \(\vv\) are orthogonal? If not, provide a counterexample. If true, verify the statement.

8.

The Cauchy-Schwarz inequality,

\begin{equation} |\vu \cdot \vv| \leq ||\vu|| \ ||\vv||\tag{23.4} \end{equation}

for any vectors \(\vu\) and \(\vv\) in \(\R^n\text{,}\) is considered one of the most important inequalities in mathematics. We verify the Cauchy-Schwarz inequality in this exercise. Let \(\vu\) and \(\vv\) be vectors in \(\R^n\text{.}\)

(a)

Explain why the inequality (23.4) is true if either \(\vu\) or \(\vv\) is the zero vector. As a consequence, we assume that \(\vu\) and \(\vv\) are nonzero vectors for the remainder of this exercise.

(b)

Let \(\vw = \proj_{\vv} \vu = \frac{\vu \cdot \vv}{||\vv||^2} \vv\) and let \(\vz = \vu - \vw\text{.}\) We know that \(\vw \cdot \vz = 0\text{.}\) Use Exercise 7 of this section to show that

\begin{equation*} ||\vu||^2 \geq ||\vw||^2\text{.} \end{equation*}
(c)

Now show that \(||\vw||^2 = \frac{|\vu \cdot \vv|^2}{||\vv||^2}\text{.}\)

(d)

Combine parts (b) and (c) to explain why equation (23.4) is true.

9.

Let \(\vu\) and \(\vv\) be vectors in \(\R^n\text{.}\) Then \(\vu\text{,}\) \(\vv\) and \(\vu+\vv\) form a triangle. We should then expect that the length of any one side of the triangle is smaller than the sum of the lengths of the other sides (since the straight line distance is the shortest distance between two points). In other words, we expect that

\begin{equation} ||\vu + \vv|| \leq ||\vu|| + ||\vv||\text{.}\tag{23.5} \end{equation}

Equation (23.5) is called the Triangle Inequality. Use the Cauchy-Schwarz inequality (Exercise 8) to prove the triangle inequality.

10.

Let \(W\) be a subspace of \(\R^n\) for some \(n\text{.}\) Show that \(W^{\perp}\) is also a subspace of \(\R^n\text{.}\)

11.

Let \(W\) be a subspace of \(\R^n\text{.}\) Show that \(W\) is a subspace of \((W^\perp)^\perp\text{.}\)

12.

If \(W\) is a subspace of \(\R^n\) for some \(n\text{,}\) what is \(W \cap W^{\perp}\text{?}\) Verify your answer.

13.

Suppose \(W_1\subseteq W_2\) are two subspaces of \(\R^n\text{.}\) Show that \(W_2^\perp \subseteq W_1^\perp\text{.}\)

14.

What are \(\left(\R^n\right)^{\perp}\) and \(\{\vzero\}^{\perp}\) in \(\R^n\text{?}\) Justify your answers.

15.

Label each of the following statements as True or False. Provide justification for your response.

(a) True/False.

The dot product is defined between any two vectors.

(b) True/False.

If \(\vu\) and \(\vv\) are vectors in \(\R^n\text{,}\) then \(\vu \cdot \vv\) is another vector in \(\R^n\text{.}\)

(c) True/False.

If \(\vu\) and \(\vv\) are vectors in \(\R^n\text{,}\) then \(\vu \cdot \vv\) is always non-negative.

(d) True/False.

If \(\vv\) is a vector in \(\R^n\text{,}\) then \(\vv \cdot \vv\) is never negative.

(e) True/False.

If \(\vu\) and \(\vv\) are vectors in \(\R^n\) and \(\vu \cdot \vv = 0\text{,}\) then \(\vu = \vv = \vzero\text{.}\)

(f) True/False.

If \(\vv\) is a vector in \(\R^n\) and \(\vv \cdot \vv = 0\text{,}\) then \(\vv = \vzero\text{.}\)

(g) True/False.

The norm of the sum of vectors is the sum of the norms of the vectors.

(h) True/False.

If \(\vu\) and \(\vv\) are vectors in \(\R^n\text{,}\) then \(\proj_{\vv} \vu\) is a vector in the same direction as \(\vu\text{.}\)

(i) True/False.

The only subspace \(W\) of \(\R^n\) for which \(W^\perp=\{\vzero\}\) is \(W=\R^n\text{.}\)

(j) True/False.

If a vector \(\vu\) is orthogonal to \(\vv_1\) and \(\vv_2\text{,}\) then \(\vu\) is also orthogonal to \(\vv_1+\vv_2\text{.}\)

(k) True/False.

If a vector \(\vu\) is orthogonal to \(\vv_1\) and \(\vv_2\text{,}\) then \(\vu\) is also orthogonal to all linear combinations of \(\vv_1\) and \(\vv_2\text{.}\)

(l) True/False.

If \(\vu\neq \vzero\) and \(\vv\) are parallel, then the orthogonal projection of \(\vv\) onto \(\vu\) equals \(\vv\text{.}\)

(m) True/False.

If \(\vu\neq \vzero\) and \(\vv\) are orthogonal, then the orthogonal projection of \(\vv\) onto \(\vu\) equals \(\vv\text{.}\)

(n) True/False.

For any vector \(\vv\) and \(\vu\neq \vzero\text{,}\) \(||\proj_\vu \vv|| \leq ||\vv||\text{.}\)

(o) True/False.

Given an \(m\times n\) matrix \(A\text{,}\) \(\dim(\Row A)+\dim\left((\Row A)^\perp\right) = n\text{.}\)

(p) True/False.

If \(A\) is a square matrix, then the columns of \(A\) are orthogonal to the vectors in \(\Nul A\text{.}\)

(q) True/False.

The vectors in the null space of an \(m \times n\) matrix \(A\) are orthogonal to the vectors in the row space of \(A\text{.}\)

Subsection Project: Back-Face Culling

To identify hidden polygons in a surface, we will utilize a technique called back-face culling. This involves identifying which polygons are back-facing and which are front-facing relative to the viewer's perspective. The first step is to assign a direction to each polygon in a surface.

Project Activity 23.9.

Consider the polygon \(ABCD\) in Figure 23.20. Since a polygon is flat, every vector in the polygon is perpendicular to a fixed vector (which we call a normal vector to the polygon). A normal vector \(\vn\) for the polygon \(ABCD\) in Figure 23.20 is shown. In this activity we learn how to find a normal vector to a polygon.

Let \(\vx = [x_1 \ x_2 \ x_3]^{\tr}\) and \(\vy = [y_1 \ y_2 \ y_3]^{\tr}\) be two vectors in \(\R^3\text{.}\) If \(\vx\) and \(\vy\) are linearly independent, then \(\vx\) and \(\vy\) determine a polygon as shown in Figure 23.20. Our goal is to find a vector \(\vn\) that is orthogonal to both \(\vx\) and \(\vy\text{.}\) Let \(\vw = [w_1 \ w_2 \ w_3]^{\tr}\) be another vector in \(\R^3\) and let \(C = \left[ \begin{array}{c} \vw^{\tr} \\ \vx^{\tr} \\ \vy^{\tr} \end{array} \right]\) be the matrix whose rows are \(\vw\text{,}\) \(\vx\text{,}\) and \(\vy\text{.}\) Let \(C_{ij}\) be the \(ij\)th cofactor of \(C\text{,}\) that is, \(C_{ij}\) is \((-1)^{i+j}\) times the determinant of the submatrix of \(C\) obtained by deleting the \(i\)th row and \(j\)th column of \(C\text{.}\) Now define the vector \(\vx \times \vy\) as follows:

\begin{equation*} \vx \times \vy = C_{11}\ve_1 + C_{12}\ve_2 + C_{13} \ve_3\text{.} \end{equation*}

The vector \(\vx \times \vy\) is called the cross product of the vectors \(\vx\) and \(\vy\text{.}\) (Note that the cross product is only defined for vectors in \(\R^3\text{.}\)) We will show that \(\vx \times \vy\) is orthogonal to both \(\vx\) and \(\vy\text{,}\) making \(\vx \times \vy\) a normal vector to the polygon defined by \(\vx\) and \(\vy\text{.}\)

Figure 23.20. Normal vector to a polygon.
(a)

Show that

\begin{equation*} \vx \times \vy = \left[ \begin{array}{c} x_2y_3-x_3y_2 \\ x_3y_1-x_1y_3 \\ x_1y_2-x_2y_1 \end{array} \right]\text{.} \end{equation*}
(b)

Use a cofactor expansion of \(C\) along the first row and properties of the dot product to show that

\begin{equation*} \det(C) = \vw \cdot (\vx \times \vy)\text{.} \end{equation*}
(c)

Use the result of part (b) and properties of the determinant to calculate \(\vx \cdot (\vx \times \vy)\) and \(\vy \cdot (\vx \times \vy)\text{.}\) Explain why \(\vx \times \vy\) is orthogonal to both \(\vx\) and \(\vy\) and is therefore a normal vector to the polygon determined by \(\vx\) and \(\vy\text{.}\)

Project Activity 23.9 shows how we can find a normal vector to a parallelogram — take two vectors \(\vx\) and \(\vy\) between the vertices of the parallelogram and calculate their cross product. Such a normal vector can define a direction for the parallelogram. There is still a problem, however.
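For example, a minimal Python sketch (assuming NumPy, whose cross function computes the cross product defined above; the vertices are arbitrary choices) of finding a normal vector to a parallelogram:

import numpy as np

# Three vertices of a parallelogram, labeled counterclockwise
A = np.array([0.0, 0.0, 0.0])
B = np.array([2.0, 0.0, 0.0])
D = np.array([0.0, 3.0, 0.0])

x = B - A                    # vector from vertex A to vertex B
y = D - A                    # vector from vertex A to vertex D
n = np.cross(x, y)           # a normal vector to the parallelogram

print(n)                     # [0. 0. 6.]
print(np.dot(n, x), np.dot(n, y))    # 0.0 0.0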

Project Activity 23.10.

Let \(\vx = [x_1 \ x_2 \ x_3]^{\tr}\) and \(\vy = [y_1 \ y_2 \ y_3]^{\tr}\) be any vectors in \(\R^3\text{.}\) There is a relationship between \(\vx \times \vy\) and \(\vy \times \vx\text{.}\) Find and verify this relationship.

Project Activity 23.10 shows that the cross product is anticommutative, so we get different directions if we switch the order in which we calculate the cross product. To fix a direction, we establish the convention that we always label the vertices of our parallelogram in the counterclockwise direction as shown in Figure 23.20. This way we always use \(\vx\) as the vector from vertex \(A\) to vertex \(B\) rather than the reverse. With this convention established, we can now define the direction of a parallelogram as the direction of its normal vector.

Once we have a normal vector established for each polygon, we can determine which polygons are back-facing and which are front-facing. Figure 23.21 at left provides the gist of the idea, where we represent the polygons with line segments to illustrate. If the viewer's eye is at point \(P\) and views the figures, the normal vectors of the visible polygons point in a direction toward the viewer (front-facing) and the normal vectors of the polygons hidden from the viewer point away from the viewer (back-facing). What remains is to determine an effective computational way to identify the front-facing and back-facing polygons.

Figure 23.21. Left: Hidden faces. Right: Back face culling.

Project Activity 23.11.

Consider the situation as depicted at right in Figure 23.21. Assume that \(AB\) and \(RS\) are polygons (rendered one dimensionally here) with normal vectors \(\vn\) at their centers as shown. The viewer's eye is at point \(P\text{,}\) and the viewer's lines of vision to the centers \(C_{AB}\) and \(C_{RS}\) are indicated by the vectors \(\vv\text{.}\) Each vector \(\vv\) makes an angle \(\theta\) with the normal to the polygon.

(a)

What can be said about the angle \(\theta\) for a front-facing polygon? What must be true about \(\vv \cdot \vn\) for a front-facing polygon? Why?

(b)

What can be said about the angle \(\theta\) for a back-facing polygon? What must be true about \(\vv \cdot \vn\) for a back-facing polygon? Why?

(c)

The dot product then provides us with a simple computational tool for identifying back-facing polygons (assuming we have already calculated all of the normal vectors). We can then create an algorithm to cull the back-facing polygons. Assuming that we know the viewpoint \(P\) and the coordinates of the polygons of the surface, complete the pseudo-code for a back-face culling algorithm:

for all polygons on the surface do
  calculate the normal vector n using the ______ product for the current polygon
  calculate the center C of the current polygon
  calculate the viewing vector ______
  if ______ then
    render the current polygon
  end if
end for

As a final comment, back-face culling generally reduces the number of polygons to be rendered by half. This algorithm is not perfect and does not always do what we want it to do (e.g., it may not remove all parts of a polygon that we don't see), so there are other algorithms to use in concert with back-face culling to correctly render objects.
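To tie the pieces together, here is one possible realization of the culling loop as a short Python sketch (assuming NumPy; the function name, the polygon representation, and the test data are illustrative assumptions, with outward normals obtained from counterclockwise vertex labeling as described above):

import numpy as np

def visible_polygons(polygons, P):
    # Return the polygons that face the viewpoint P. Each polygon is an
    # array of vertices in R^3 listed counterclockwise as seen from the
    # front, so the cross product of two edge vectors gives an outward
    # normal vector.
    rendered = []
    for poly in polygons:
        n = np.cross(poly[1] - poly[0], poly[2] - poly[0])   # normal via the cross product
        center = poly.mean(axis=0)       # center of the current polygon
        v = center - P                   # viewing vector from the eye to the center
        if np.dot(v, n) < 0:             # front-facing: v and n make an obtuse angle
            rendered.append(poly)
    return rendered

# One square face of a cube, facing a viewer located above it
face = np.array([[0, 0, 1], [1, 0, 1], [1, 1, 1], [0, 1, 1]], dtype=float)
P = np.array([0.5, 0.5, 5.0])
print(len(visible_polygons([face], P)))    # 1 (the face is rendered)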
