The Hamming Metric


In our society, a great deal of information is communicated electronically. Bank transactions, television programs, military communications, cell phone calls, digital images, and almost any other interchange one can think of either can be or already is digitized and transmitted electronically. In many situations we need to compare one set of data to another (e.g., Internet searches for text strings or image matches, or comparisons of DNA strands), and metrics are often used for this purpose. Computers work in a binary system; that is, they recognize only zeros and ones. So a digital text message is a string of zeros and ones. In other words, a digital message is a collection of elements of the space X^{n} for some positive integer n, where X = {0, 1}. Each element of X^{n} is called a word; that is, a word is an n-tuple of the form (x_{1}, x_{2}, \dots, x_{n}) in which each x_{i} is either 0 or 1.

Just as in English, where not every combination of letters forms a word that makes sense, not every word in X^{n} is recognizable as part of an intelligible message. We might, for example, encode the letters of the alphabet by assigning the numbers 1-26 to the letters and then converting those numbers to binary, which makes them elements of X^{n}. The collection of all intelligible words is called a code. So a code is just a subset of X^{n} whose elements all parties agree are sensible words. The words in a code are called code words. To deal with problems that arise in transmitting digital messages, such as scrambling (encoding) messages, unscrambling (decoding) messages, and detecting and correcting errors in messages, it is useful to have a way to measure the distance between words. One way is to use the Hamming metric.
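The letter-numbering example can be made concrete with a short sketch. It assumes a word length of n = 5 (chosen here because 2^5 = 32 is the smallest power of two that is at least 26) and uses the hypothetical helper names letter_to_word and is_code_word; none of these choices come from the text, and other encodings would serve equally well.

```python
import string

N = 5  # assumed word length: 2**5 = 32 >= 26, so 5 bits are enough

def letter_to_word(letter):
    """Map 'a'..'z' to its number 1..26, written as a tuple of N bits."""
    k = string.ascii_lowercase.index(letter.lower()) + 1
    # Read off the bits of k from most significant to least significant.
    return tuple((k >> (N - 1 - i)) & 1 for i in range(N))

# The code: the subset of X^5 consisting of the 26 words that represent letters.
code = {letter_to_word(c) for c in string.ascii_lowercase}

def is_code_word(word):
    """Return True when the given word of X^5 belongs to the code."""
    return tuple(word) in code

print(letter_to_word("a"))            # (0, 0, 0, 0, 1)
print(letter_to_word("z"))            # (1, 1, 0, 1, 0), since 26 = 11010 in binary
print(len(code), 2 ** N)              # 26 32
print(is_code_word((0, 0, 0, 0, 0)))  # False: 0 is not assigned to any letter
```

Under these assumptions the code contains 26 of the 32 words in X^{5}; the word (0, 0, 0, 0, 0) and the words representing 27 through 31 are elements of X^{5} that are not code words.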
