Skip to main content

Section The Hamming Metric

In our society, a great deal of information is communicated electronically. Bank transactions, television programs, military communications, cell phone calls, digital images, and almost any interchange one can think of either can be or is digitized and transmitted electronically. In many situations we need to compare one set of data to another (e.g., Internet searches for text strings or image matches, DNA strands), and metrics are often used for this purpose. Computers work in a binary system, that is they recognize only zeros and ones. So a digital text message is a string of zeros and ones. That is, a digital message is a collection of elements in the space Xn for some positive integer n, where X={0,1}. Each element in Xn is called a word - that is, a word is an element in Xn denoted in the form (x1,x1,…,xn). Just like in the English language, where not every combination of letters corresponds to words that make sense, not every word is recognizable as part of an intelligible message. We might, for example, code the letters of the alphabet by assigning numbers 1-26 to the letters, then make them elements of Xn by converting to binary. The collection of all intelligible words is called a code. So a code is just some subset of Xn that all parties agree are sensible words. The words in a code are called code words. To deal with problems that occur in transmitting digital messages, like scrambling (encoding) messages, unscrambling (decoding) messages, and detecting and correcting errors in messages, it is useful to have a way to measure distance between words. One way is to use the Hamming metric.

Definition 4.1.

Let x=(x1,x2,…,xn) and y=(y1,y2,…,yn) be words in Xn. The Hamming distance dH between x and y is
dH(x,y)=βˆ‘i=1n|xiβˆ’yi|.
Recall that for each i both xi and yi are either 0 or 1. So
|xiβˆ’yi|={0 if xi=yi1 if xiβ‰ yi.
In other words, dH(x,y) counts the number of components at which x and y are different.

Activity 4.1.

(b)

Suppose we create a code
C={c1,c2,c3,c4,c5,c6,c7,c8}
in X6, where
c1=(0,0,0,0,0,0)c2=(0,0,0,0,1,1)c3=(0,0,0,1,0,1)c4=(0,0,1,0,0,1)c5=(0,0,0,1,1,0)c6=(0,0,1,0,1,0)c7=(0,0,1,1,0,0)c8=(0,0,1,1,1,1)
That is, the words c1, c2, c3, c4, c5, c6, c7, c8 are the only words that can comprise a message. Find dH(c2,c8).

(c)

Suppose we are on the receiving end of the message
(4.1)(0,0,0,1,1,1) (0,0,1,1,0,0) (1,0,0,0,0,0) (0,0,0,0,1,1) (0,0,1,0,0,1).
(i)
How do we know that an error has occurred in transmission of the message we received?
(ii)
To correct the errors in this received message, we replace the incorrect words with the code word(s) in C closest to them. Correct this message. (Note that there may be more than one possible substitution. Find all of the possibilities.)