Codes and ciphers

Codes and ciphers are forms of secret communication. In general, a code replaces words, phrases, or sentences with groups of letters or numbers. A cipher rearranges letters or uses substitutes to disguise a message.

The technology of secret communication is called cryptology. It has two opposing parts: communications security and communications intelligence. People use communications security, also called COMSEC, to make messages secret. The study and practice of COMSEC methods is called cryptography. Communications intelligence, also called COMINT, consists of learning about messages without the permission of the communicators. COMINT includes eavesdropping, bugging rooms, wiretapping telephone conversations, and cracking the codes or ciphers of enemy forces. Solving such secret communications is called cryptanalysis.

In cryptology, the original message is called the plaintext. Its secret form is the ciphertext or cryptogram. The mathematical process that changes one into the other is the cryptographic algorithm. A key controls the operations of an algorithm. The receiver of a ciphertext must have been given the algorithm and key to convert the ciphertext back into plaintext. Encrypting is the process of converting plaintext into ciphertext. Decrypting is the process by which the intended receiver changes ciphertext back into plaintext.

There are two types of cryptosystems, or types of algorithms: (1) secret-key or symmetrical systems and (2) public-key or asymmetrical systems. In a secret-key system, the same key is used for both encryption and decryption. Anyone knowing the key can both encrypt and decrypt messages. In a public-key system, there are two keys. One key, the public key, encrypts a message. Another key, the private key, decrypts it.

This article explains only encrypting procedures. From them, decrypting procedures can be determined.

Communications security

The letters, numbers, words, punctuation marks, and other symbols that make up a plaintext can be turned into secret form in only two ways. One method, called transposition, rearranges them. The other, called substitution, replaces them with other characters or symbols. Simple ciphers treat the plaintext as letters and numbers and use transposition or substitution alone or in simple combinations to construct ciphertext. The sender often transmits the resulting ciphertext in blocks (groups) of an equal number of letters or numbers, regardless of the true word divisions, to help further conceal the plaintext. More complex ciphers first convert the plaintext into a sequence of numbers and then use combinations of transposition and substitution, along with arithmetic operations, to construct the ciphertext. The sender then often transmits the ciphertext as a continuous stream of numbers.

Transposition.

All transposition ciphers need a rule for mixing up the symbols in the plaintext. A simple transposition reverses consecutive pairs of letters. In such a cipher, the message DO NOT DEPART would become ODOND TPERA T. A more secure form of transposition is columnar transposition. In this method, shown below, the coder writes the plaintext horizontally by lines under a series of key numbers and then takes out the coded message vertically by columns in the order of the key numbers.

Transposition example

Thus, the message AWAIT MY ORDERS becomes WYRAO STDAM EIR.
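Both transposition rules described above can be sketched in a few lines of Python. This is only an illustration, not a standard implementation. The function names are invented for this sketch, and the key numbers 4 1 2 5 3 are an assumption: they were worked out backward from the stated result WYRAO STDAM EIR, because the example figure is not reproduced here.

```python
def letters_only(text):
    """Keep only the letters A-Z, in upper case."""
    return [c for c in text.upper() if "A" <= c <= "Z"]


def swap_pairs(plaintext):
    """Reverse each consecutive pair of letters; a leftover last letter stays put."""
    letters = letters_only(plaintext)
    for i in range(0, len(letters) - 1, 2):
        letters[i], letters[i + 1] = letters[i + 1], letters[i]
    return "".join(letters)


def columnar_transposition(plaintext, key):
    """Write the text in rows under the key numbers, then read off the columns
    in ascending order of their key numbers."""
    letters = letters_only(plaintext)
    width = len(key)
    # Column i holds every width-th letter, starting with letter i.
    columns = [letters[i::width] for i in range(width)]
    order = sorted(range(width), key=lambda i: key[i])
    return "".join("".join(columns[i]) for i in order)


def in_blocks(text, size=5):
    """Split the ciphertext into equal groups to hide the word divisions."""
    return " ".join(text[i:i + size] for i in range(0, len(text), size))


ASSUMED_KEY = [4, 1, 2, 5, 3]  # inferred from the example result, not given in the text

print(in_blocks(swap_pairs("DO NOT DEPART")))
print(in_blocks(columnar_transposition("AWAIT MY ORDERS", ASSUMED_KEY)))
```

With the assumed key, the second line printed matches the ciphertext given in the text.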

Substitution.

The simplest form of substitution is monoalphabetic substitution, where a single cipher alphabet is used. It replaces each letter of the plaintext with a particular symbol. For example, if the substitute for a is X, all the a’s in the plaintext become X’s in the ciphertext. The complete list of substitutes for the 26 letters may be set out in a cipher alphabet:

Cipher alphabet example

The plaintext attack would become ciphertext XBBX+6. The ciphertext 3$ decrypts into go.
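A monoalphabetic substitution amounts to a lookup table. The sketch below is an illustration only. It holds just the six letter-symbol pairs that the two examples above reveal (a, t, c, k, g, and o); the rest of the 26-letter cipher alphabet is not known here and is deliberately left out. The function names are invented for this sketch.

```python
# Only the pairs recoverable from "attack -> XBBX+6" and "3$ -> go".
# The remaining 20 substitutes are not reproduced here, so they are omitted.
KNOWN_PAIRS = {"a": "X", "t": "B", "c": "+", "k": "6", "g": "3", "o": "$"}


def encipher(plaintext, table=KNOWN_PAIRS):
    """Replace each plaintext letter with its fixed substitute.
    Raises KeyError for any letter missing from the partial table."""
    return "".join(table[ch] for ch in plaintext)


def decipher(ciphertext, table=KNOWN_PAIRS):
    """Invert the table and map each cipher symbol back to its letter."""
    inverse = {symbol: letter for letter, symbol in table.items()}
    return "".join(inverse[ch] for ch in ciphertext)


print(encipher("attack"))
print(decipher("3$"))
```

Because every letter always receives the same substitute, the repeated t in attack shows up as a repeated B in the ciphertext.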

In polyalphabetic substitution, the coder replaces a plaintext letter with substitutes from several cipher alphabets rather than from a single one. A common substitution method uses a table like that below.

Substitution table example

A common and more flexible method of polyalphabetic substitution employs a keyword to specify the cipher alphabets to be used. If the keyword is BOX, for example, the correspondents will use the cipher alphabets beginning with the letters B, O, and X in that order. To encipher, the coder writes the keyword repeatedly above the plaintext. The substitute for each plaintext letter appears under that plaintext letter in the cipher alphabet that begins with its key letter:

Keyword example

The great advantage of the use of a keyword is that correspondents can change it easily in case of overuse or actual or feared discovery. Its disadvantage is its regular repetition. One way of avoiding this repetition is to use a long phrase. Such a phrase is called a running key.
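The keyword method can be sketched as follows. The sketch assumes that each cipher alphabet is the normal alphabet shifted so that it begins with its key letter, as in the traditional alphabet table of this kind; the article's own table is not reproduced here, so that layout is an assumption. The plaintext ATTACK and the function name are chosen only for illustration.

```python
from itertools import cycle
import string

ALPHABET = string.ascii_uppercase


def encipher_with_keyword(plaintext, keyword):
    """Encipher each letter with the shifted alphabet named by its key letter.
    The keyword repeats for as long as the plaintext lasts."""
    letters = [c for c in plaintext.upper() if c in ALPHABET]
    ciphertext = []
    for plain_letter, key_letter in zip(letters, cycle(keyword.upper())):
        # Assumed layout: the alphabet beginning with B shifts every letter
        # by 1, the one beginning with O by 14, the one beginning with X by 23.
        shift = ALPHABET.index(key_letter)
        position = (ALPHABET.index(plain_letter) + shift) % 26
        ciphertext.append(ALPHABET[position])
    return "".join(ciphertext)


print(encipher_with_keyword("ATTACK", "BOX"))
```

Under this assumption, the two T's in ATTACK receive different substitutes because they fall under different key letters. The weakness noted above is also visible: the keyword starts over after every three letters.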

Polyalphabetic substitutions adapt easily to cipher machines and, in certain systems, are difficult to solve. They are thus among the most widely used ciphers.

In polygraphic substitution, the coder puts two or more letters into cipher as a unit. The Playfair algorithm, used during World War I (1914-1918) and World War II (1939-1945), starts with a 25-letter scrambled alphabet arranged in a 5-letter by 5-letter square. The algorithm enciphers plaintext letters in pairs according to their positions in relation to each other within the square. The Hill Cipher converts plaintext letters into numbers and then plugs them into algebraic equations. To encipher the text, the coder simply solves the equations.
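The idea behind the Hill cipher can be shown with pairs of letters and two equations. In the sketch below, the letters a through z are numbered 0 through 25, and the results are reduced modulo 26 so that they map back to letters. The four key numbers, the padding letter, and the function name are assumptions chosen for illustration; they do not come from the article.

```python
# Hypothetical 2-by-2 key. Its determinant (3*5 - 3*2 = 9) shares no factor
# with 26, which is what allows a receiver to reverse the equations.
KEY = [[3, 3],
       [2, 5]]


def hill_encipher(plaintext, key=KEY):
    """Encipher letters two at a time with a pair of linear equations mod 26."""
    numbers = [ord(c) - ord("A") for c in plaintext.upper() if "A" <= c <= "Z"]
    if len(numbers) % 2:
        numbers.append(ord("X") - ord("A"))  # assumed padding letter
    result = []
    for i in range(0, len(numbers), 2):
        x, y = numbers[i], numbers[i + 1]
        result.append((key[0][0] * x + key[0][1] * y) % 26)
        result.append((key[1][0] * x + key[1][1] * y) % 26)
    return "".join(chr(n + ord("A")) for n in result)


print(hill_encipher("help"))
```

Each cipher letter depends on both plaintext letters of its pair, so single-letter frequencies are partly hidden.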

Public-key systems

Public-key systems involve mathematical problems that are easy to solve in one direction but hard to solve in the other. Such a problem is called a trapdoor function. The most widely used public-key system is known as the RSA algorithm, named for its inventors, the computer scientists Leonard Adleman and Ronald L. Rivest of the United States and Adi Shamir of Israel. The algorithm is based on the fact that it is easy to multiply together two large prime numbers (numbers evenly divisible only by themselves and the number one), but hard to find the prime numbers after they have been multiplied together. The actual construction of the algorithm relies on the branch of mathematics known as number theory. Other public-key algorithms use other types of mathematical problems.
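The arithmetic behind RSA can be followed with very small numbers. The sketch below is a toy illustration only. The primes 61 and 53, the public exponent 17, and the message value 65 are assumptions chosen to keep the numbers readable. Real keys use primes hundreds of digits long, and real systems add further safeguards that are omitted here.

```python
# Toy values, far too small to be secure. Requires Python 3.8 or later
# for the modular inverse computed with pow(e, -1, phi).
p, q = 61, 53             # the two secret primes
n = p * q                 # 3233, published as part of the public key
phi = (p - 1) * (q - 1)   # 3120, computable only by someone who knows p and q
e = 17                    # public exponent
d = pow(e, -1, phi)       # private exponent: e * d leaves remainder 1 mod phi


def encrypt(message):
    """Anyone who knows the public key (n, e) can encrypt."""
    return pow(message, e, n)


def decrypt(ciphertext):
    """Only the holder of the private exponent d can decrypt."""
    return pow(ciphertext, d, n)


ciphertext = encrypt(65)
print(ciphertext, decrypt(ciphertext))
```

An outsider who sees only n and e would have to split 3233 back into 61 and 53 to find d. That is easy for a number this small but impractical when the primes are very large.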

Cipher machines.

Machines can generate complex algorithms that people cannot work easily or accurately by hand. The most famous mechanical algorithm was invented by Boris C. W. Hagelin, a Swedish engineer. It shifts a cipher alphabet to various positions in a key sequence more than 100 million letters long. The U.S. Army used Hagelin’s machine during World War II.

An Enigma machine (shown on the left) during World War II

Electromechanical rotor machines are more powerful than mechanical machines. The “Enigma,” used by the Germans during World War II, was the best known. This machine had an electric keyboard into which the plaintext was typed. The electric signal representing the plaintext letter passed through a succession of wired code wheels called rotors. This created an electric maze that changed as the rotors turned, thus constantly changing the cipher alphabet. The most common version of the Enigma used 17,576 cipher alphabets. Today, electronic computers regularly carry out cryptographic algorithms.

Codes.

Most codes fill a book, while a cipher can be written on a single piece of paper or embodied in a cipher machine. The cryptosystem is called a code if the plaintext elements consist not only of letters but also of hundreds of words, phrases, sentences, syllables, numbers, and punctuation marks. In the code, all these elements are replaced by groups of numbers or letters. For example, 77181 may mean Wait for further instructions.

Communications intelligence

Frequency analysis.

Cryptanalysis is the process of studying the ciphertext to extract information about the plaintext when one does not have the key. Statistics plays an important role in this process. Letters occur with varying frequency in English and other languages. The proportion of their frequency is remarkably stable. For example, in English, the letter e is used more than any other (13 percent), followed by t (9 percent). If a cryptanalyst counts the letters of a long monoalphabetic substitution and finds that X is the most common, he or she guesses that X stands for e. The analyst replaces all the X’s with e’s and starts to guess at words. For example, e?e? might be even or ever. But in short messages, the most frequent letter may not be e.

Clues are also provided by contacts, that is, which letters stand to the right and to the left of a particular letter. For example, three high-frequency letters that rarely contact each other are a, o, and i. A high-frequency letter that follows vowels in 80 percent of its appearances is n. One that precedes vowels 100 times more often than it follows them is h. The five most common letter pairs are, in order, th, he, in, er, and an. The five most common words are the, of, and, to, and a.

Frequency analysis is much harder to use when the cryptogram is enciphered by polyalphabetic substitution. The cryptanalyst must first identify the different cipher alphabets used, then solve each alphabet separately.

Unbreakable ciphers.

A basic assumption of practical cryptography is that outsiders know the general system. Secrecy must reside only in the keys. For example, possession of a cipher machine should not permit a cryptanalyst to solve messages encrypted with it if he or she does not know the key settings.

Modern cryptosystems are designed so that frequency analysis has little effect against them. But even the most sophisticated cryptosystems are vulnerable to a “brute-force” attack. Such an attack tries every possible key until one is successful. The best protection against a brute-force attack is to make the number of possible keys so large that it is impractical to try all of them in a reasonable amount of time, even with the fastest available computers.
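A rough calculation shows why the number of possible keys matters. The sketch below is an estimate only. The rate of one trillion keys per second is a hypothetical figure, not a measurement of any real computer. The key lengths of 56 and 128 bits are used as examples because they are the key sizes of the DES algorithm and of the shortest AES key.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def years_to_try_all_keys(key_bits, keys_per_second):
    """Time needed to try every key of the given length at the given rate."""
    return 2 ** key_bits / keys_per_second / SECONDS_PER_YEAR


ASSUMED_RATE = 1e12  # hypothetical: one trillion keys tried per second

for bits in (56, 128):
    print(bits, "bits:", years_to_try_all_keys(bits, ASSUMED_RATE), "years")
```

Each added bit doubles the number of keys, so a modest increase in key length makes the search grow from hours to a span far longer than the age of the universe.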

Voynich manuscript page from botanical section

The only cryptosystem known to be unbreakable, even by a brute-force attack, is called the one-time pad. On a computer, the plaintext is first converted to a sequence of zeroes and ones (called “bits”). This stage may be performed by another cipher that represents the letters of the alphabet by batches of bits. Then the key, which consists of another sequence of zeroes and ones exactly as long as the plaintext, is constructed completely at random. For example, if the plaintext has 20 bits, one may choose the key by flipping a coin 20 times and recording heads as one and tails as zero. The ciphertext is made by writing the key above the plaintext and combining the bits according to the rules 0 + 0 = 0, 0 + 1 = 1, 1 + 0 = 1, and 1 + 1 = 0. For example:

Computer data encryption example

To recover the plaintext, the process is repeated by combining the bits of the ciphertext with the key according to the same rules. Because the key is random and just as long as the plaintext, it is impossible to analyze the ciphertext to recover any information. Trying all possible keys will only yield all possible strings of zeroes and ones of the given length.
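The one-time pad procedure can be sketched in a few lines. The combining rule in the text is the operation that programmers call exclusive or, written ^ in Python. The sample plaintext bits and the function names are assumptions chosen for illustration, and Python's secrets module stands in for the coin flips.

```python
import secrets


def random_key(number_of_bits):
    """One random bit per plaintext bit, like flipping a coin each time."""
    return [secrets.randbits(1) for _ in range(number_of_bits)]


def combine(bits, key):
    """Apply the rules 0+0=0, 0+1=1, 1+0=1, 1+1=0 to each pair of bits."""
    if len(bits) != len(key):
        raise ValueError("the key must be exactly as long as the message")
    return [bit ^ key_bit for bit, key_bit in zip(bits, key)]


plaintext = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1, 1]  # sample bits
key = random_key(len(plaintext))
ciphertext = combine(plaintext, key)
recovered = combine(ciphertext, key)  # the same rules undo the encryption
print(recovered == plaintext)
```

The key must never be reused. Combining two ciphertexts made with the same key cancels the key and leaves the two plaintexts combined with each other.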

History

Documents indicate that secret writing arose independently in many civilizations as soon as writing became widely used. The Arabs first devised a science of cryptanalysis in the A.D. 700’s, using letter frequencies.

Zimmermann note

Cryptology came into widespread use in the West during the 1300’s, when ambassadors were first assigned residence in other countries. They frequently used codes to send confidential reports home and to get secret orders.

Decoded Zimmermann telegram

During the mid-1800’s, the widespread use of the telegraph led to the development of military field ciphers. In the early 1900’s, military forces sent many messages in cipher by radio. Because enemies could easily intercept these transmissions, cryptanalysis became a powerful intelligence force during World War I.

Possibly the most important single solution in history occurred during World War I. The British cryptanalyzed a message from the German foreign minister, Arthur Zimmermann, to the German ambassador in Mexico. It promised that if Mexico would fight the United States, Germany would see that Mexico got back its “lost territories” of Texas, Arizona, and New Mexico. This disclosure helped bring the United States into the war.

The enormous wartime burden of encrypting radiograms stimulated inventors to mechanize the work. In 1917, Gilbert S. Vernam, an American engineer, automated cryptography by joining an electromagnetic ciphering device to a teletypewriter. Using a key of punched tape, the mechanism encrypted the plaintext and transmitted the cryptogram. A receiving cipher teletypewriter automatically decrypted the ciphertext and printed out the plaintext. In 1918, Joseph O. Mauborgne, a U.S. Army major, devised the one-time pad. About the same time, the rotor was invented independently by both Edward H. Hebern, an American businessman, and Arthur Scherbius, a German electrical engineer.

Later developments.

In 1932, Marian Rejewski, a Polish mathematician, aided by information from a spy, solved the coding procedures of Scherbius’s machine, the Enigma. During World War II, the British mathematician Alan Turing modified Rejewski’s solution to decrypt German messages. United States and British codebreakers helped defeat German submarines in the Atlantic Ocean. In the Pacific Ocean, cryptanalysis played a crucial role in sinking Japan’s merchant marine fleet. Codebreaking enabled Allied forces to identify and shoot down the airplane carrying Admiral Isoroku Yamamoto, Japan’s chief naval leader. Cryptanalysis also led to victories against German forces in North Africa and Europe. The code solutions hastened the defeats of Germany and Japan and shortened the war by months. See World War II (The Ultra secret).

The Colossus machine, a British computer used during World War II

In 1976, Martin Hellman, an electrical engineer at Stanford University, and his student Whitfield Diffie published the concept of asymmetric, or public-key, ciphers. The first practical realization of this concept was the RSA algorithm, developed in 1977 at the Massachusetts Institute of Technology.

Since the 1970’s, the use of cryptography in private business has grown rapidly. In 1977, the U.S. government approved a secret-key system that uses a complicated electronic transposition-substitution algorithm called the Data Encryption Standard (DES). DES was designed to protect data stored in or transmitted between computers. During the 1990’s, the tremendous increase in computer speed made DES vulnerable to brute-force attacks. A new algorithm called the Advanced Encryption Standard (AES), approved in 2001, replaced DES in many electronic commerce applications.

Modern applications of cryptography

Modern applications of cryptography include the encryption and decryption of information sent via cellular telephone or the internet. In both types of communication, cryptography provides a way to protect the privacy of the transmitted information. For example, in an e-commerce (electronic commerce) transaction on the internet, it prevents wrongdoers from intercepting credit card numbers or other personal data. All modern web browsers (computer programs used to access websites on the internet) have a built-in feature called Transport Layer Security (TLS). This feature uses both symmetric encryption and public-key encryption to protect e-commerce transactions. An application called cryptographic authentication is used to verify that a person transmitting or receiving information is who he or she claims to be.

Cellular telephones encrypt voice transmissions to prevent eavesdropping. A cell phone must also identify the caller so that the phone center can bill the correct person for the call. Identity authentication involves the use of a secret key and algorithm. Each time a call is placed, the phone center sends a different random number, called a challenge, to the caller’s phone. A secret key stored in the phone, and shared by the phone center, encrypts the challenge and returns it. If the response agrees with the center’s encryption, the center accepts the caller as legitimate.
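The challenge-and-response exchange can be sketched as follows. This is an illustration of the idea only, not the procedure of any actual telephone network. The article does not name the algorithm, so the sketch swaps in a keyed hash (HMAC with SHA-256) from Python's standard library in place of the unspecified encryption step. The function names and the key and challenge sizes are likewise assumptions.

```python
import hashlib
import hmac
import secrets


def new_challenge():
    """The phone center picks a fresh random number for every call."""
    return secrets.token_bytes(16)


def respond(secret_key, challenge):
    """The phone transforms the challenge with its stored secret key.
    Assumption: a keyed hash stands in for the unspecified algorithm."""
    return hmac.new(secret_key, challenge, hashlib.sha256).digest()


def center_accepts(secret_key, challenge, response):
    """The center repeats the computation with its own copy of the key
    and compares the two results."""
    expected = respond(secret_key, challenge)
    return hmac.compare_digest(expected, response)


shared_key = secrets.token_bytes(32)  # stored in the phone and at the center
challenge = new_challenge()
print(center_accepts(shared_key, challenge, respond(shared_key, challenge)))
print(center_accepts(shared_key, challenge, respond(b"wrong key", challenge)))
```

Because the challenge changes with every call, an eavesdropper who records one response cannot replay it later to pose as the caller.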