The Code Book: The Secret History of Codes and Code-breaking. Simon Singh
a certain number of places (in this case three), relative to the plain alphabet. The convention in cryptography is to write the plain alphabet in lower-case letters, and the cipher alphabet in capitals. Similarly, the original message, the plaintext, is written in lower case, and the encrypted message, the ciphertext, is written in capitals.
Caesar used secret writing so frequently that Valerius Probus wrote an entire treatise on his ciphers, which unfortunately has not survived. However, thanks to Suetonius’ Lives of the Caesars LVI, written in the second century AD, we do have a detailed description of one of the types of substitution cipher used by Julius Caesar. He simply replaced each letter in the message with the letter that is three places further down the alphabet. Cryptographers often think in terms of the plain alphabet, the alphabet used to write the original message, and the cipher alphabet, the letters that are substituted in place of the plain letters. When the plain alphabet is placed above the cipher alphabet, as shown in Figure 3, it is clear that the cipher alphabet has been shifted by three places, and hence this form of substitution is often called the Caesar shift cipher, or simply the Caesar cipher. A cipher is the name given to any form of cryptographic substitution in which each letter is replaced by another letter or symbol.
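The shift described above can be sketched in a few lines of Python. This is only an illustration, assuming the modern 26-letter alphabet; the function name `caesar_encrypt` and the sample message are chosen for the example and do not come from Suetonius. It follows the convention of lower-case plaintext and upper-case ciphertext.

```python
import string

def caesar_encrypt(plaintext: str, shift: int = 3) -> str:
    """Replace each letter with the one `shift` places further down the alphabet."""
    result = []
    for ch in plaintext.lower():
        if ch in string.ascii_lowercase:
            # Wrap around the end of the alphabet, so x, y, z become A, B, C.
            index = (ord(ch) - ord("a") + shift) % 26
            result.append(chr(ord("A") + index))
        else:
            result.append(ch)  # spaces and punctuation pass through unchanged
    return "".join(result)

print(caesar_encrypt("veni, vidi, vici"))  # YHQL, YLGL, YLFL
```

Decryption is the same operation with the shift reversed, which is why the shift alone is all the receiver needs to know.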
Although Suetonius mentions only a Caesar shift of three places, it is clear that by using any shift between 1 and 25 places it is possible to generate 25 distinct ciphers. In fact, if we do not restrict ourselves to shifting the alphabet and permit the cipher alphabet to be any rearrangement of the plain alphabet, then we can generate an even greater number of distinct ciphers. There are over 400,000,000,000,000,000,000,000,000 such rearrangements, and therefore the same number of distinct ciphers.
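The figure of over 400,000,000,000,000,000,000,000,000 is the number of ways of ordering 26 letters, which is 26 factorial. A short check, assuming a 26-letter alphabet (the variable names are illustrative):

```python
import math

shift_keys = 25                      # shifts of 1 to 25 places
rearrangements = math.factorial(26)  # every possible ordering of 26 letters

print(shift_keys)                # 25
print(rearrangements)            # 403291461126605635584000000
print(f"{rearrangements:.2e}")   # 4.03e+26
```

Strictly, this count includes the unshuffled alphabet itself and the 25 Caesar shifts, but removing them makes no visible difference to a number of this size.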
Each distinct cipher can be considered in terms of a general encrypting method, known as the algorithm, and a key, which specifies the exact details of a particular encryption. In this case, the algorithm involves substituting each letter in the plain alphabet with a letter from a cipher alphabet, and the cipher alphabet is allowed to consist of any rearrangement of the plain alphabet. The key defines the exact cipher alphabet to be used for a particular encryption. The relationship between the algorithm and the key is illustrated in Figure 4.
Figure 4 To encrypt a plaintext message, the sender passes it through an encryption algorithm. The algorithm is a general system for encryption, and needs to be specified exactly by selecting a key. Applying the key and algorithm together to a plaintext generates the encrypted message, or ciphertext. The ciphertext may be intercepted by an enemy while it is being transmitted to the receiver, but the enemy should not be able to decipher the message. However, the receiver, who knows both the key and the algorithm used by the sender, is able to turn the ciphertext back into the plaintext message.
An enemy studying an intercepted scrambled message may have a strong suspicion of the algorithm, but would not know the exact key. For example, they may well suspect that each letter in the plaintext has been replaced by a different letter according to a particular cipher alphabet, but they are unlikely to know which cipher alphabet has been used. If the cipher alphabet, the key, is kept a closely guarded secret between the sender and the receiver, then the enemy cannot decipher the intercepted message. The significance of the key, as opposed to the algorithm, is an enduring principle of cryptography. It was definitively stated in 1883 by the Dutch linguist Auguste Kerckhoffs von Nieuwenhof in his book La Cryptographie militaire: ‘Kerckhoffs’ Principle: The security of a cryptosystem must not depend on keeping secret the crypto-algorithm. The security depends only on keeping secret the key.’
In addition to keeping the key secret, a secure cipher system must also have a wide range of potential keys. For example, if the sender uses the Caesar shift cipher to encrypt a message, then encryption is relatively weak because there are only 25 potential keys. From the enemy’s point of view, if they intercept the message and suspect that the algorithm being used is the Caesar shift, then they merely have to check the 25 possibilities. However, if the sender uses the more general substitution algorithm, which permits the cipher alphabet to be any rearrangement of the plain alphabet, then there are 400,000,000,000,000,000,000,000,000 possible keys from which to choose. One such is shown in Figure 5. From the enemy’s point of view, if the message is intercepted and the algorithm is known, there is still the horrendous task of checking all possible keys. If an enemy agent were able to check one of the 400,000,000,000,000,000,000,000,000 possible keys every second, it would take roughly a billion times the lifetime of the universe to check all of them and decipher the message.
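The contrast between the two key spaces can be made concrete with a rough sketch. The first part tries all 25 Caesar shifts against an illustrative ciphertext; the second estimates how long an exhaustive search of the general substitution cipher would take at one key per second. The age of the universe used here (about 13.8 billion years) is an assumed figure that is not given in the text, and the names are chosen for the example.

```python
import math

def caesar_decrypt(ciphertext: str, shift: int) -> str:
    """Undo a Caesar shift of `shift` places; returns lower-case plaintext."""
    out = []
    for ch in ciphertext.upper():
        if "A" <= ch <= "Z":
            out.append(chr((ord(ch) - ord("A") - shift) % 26 + ord("a")))
        else:
            out.append(ch)  # spaces and punctuation pass through unchanged
    return "".join(out)

# Brute force against the Caesar shift: only 25 keys to try.
intercepted = "YHQL, YLGL, YLFL"  # illustrative ciphertext, not from the text
candidates = {shift: caesar_decrypt(intercepted, shift) for shift in range(1, 26)}
print(candidates[3])  # veni, vidi, vici

# Brute force against general substitution, checking one key per second.
SECONDS_PER_YEAR = 365.25 * 24 * 3600
AGE_OF_UNIVERSE_YEARS = 13.8e9  # assumption, not stated in the text
total_keys = math.factorial(26)
universe_lifetimes = total_keys / SECONDS_PER_YEAR / AGE_OF_UNIVERSE_YEARS
print(f"{universe_lifetimes:.1e}")
```

Under these assumptions the second figure comes out a little under a billion lifetimes of the universe, consistent with the estimate above. In practice the codebreaker would still have to recognise which of the 25 Caesar candidates reads as sensible language, but that is a trivial task by comparison.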
Plain alphabet   a b c d e f g h i j k l m n o p q r s t u v w x y z
Cipher alphabet  J L P A W I Q B C T R Z Y D S K E G F X H U O N V M
Plaintext        e t   t u,   b r u t e ?
Ciphertext       W X   X H,   L G H X W ?
Figure 5 An example of the general substitution algorithm, in which each letter in the plaintext is substituted with another letter according to a key. The key is defined by the cipher alphabet, which can be any rearrangement of the plain alphabet.
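The division of labour between algorithm and key can be illustrated with a minimal Python sketch. The algorithm is the pair of functions, which could be published freely; the key is the cipher alphabet of Figure 5, which is the only thing that needs to be kept secret. The function names `encrypt` and `decrypt` are chosen for the example.

```python
import string

PLAIN = string.ascii_lowercase
KEY = "JLPAWIQBCTRZYDSKEGFXHUONVM"  # the cipher alphabet shown in Figure 5

# A valid key must be a rearrangement of the full alphabet.
assert sorted(KEY) == sorted(string.ascii_uppercase)

def encrypt(plaintext: str, key: str = KEY) -> str:
    """Substitute each plain letter with the key letter in the same position."""
    return plaintext.lower().translate(str.maketrans(PLAIN, key))

def decrypt(ciphertext: str, key: str = KEY) -> str:
    """Reverse the substitution by mapping key letters back to plain letters."""
    return ciphertext.upper().translate(str.maketrans(key, PLAIN))

print(encrypt("et tu, brute?"))  # WX XH, LGHXW?
print(decrypt("WX XH, LGHXW?"))  # et tu, brute?
```

Changing `KEY` to any other rearrangement gives a different cipher without touching the functions at all.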
The beauty of this type of cipher is that it is easy to implement, but provides a high level of security. It is easy for the sender to define the key, which consists merely of stating the order of the 26 letters in the rearranged cipher alphabet, and yet it is effectively impossible for the enemy to check all possible keys by the so-called brute-force attack. The simplicity of the key is important, because the sender and receiver have to share knowledge of the key, and the simpler the key, the less the chance of a misunderstanding.
In fact, an even simpler key is possible if the sender is prepared to accept a slight reduction in the number of potential keys. Instead of randomly rearranging the plain alphabet to achieve the cipher alphabet, the sender chooses a keyword or keyphrase. For example, to use JULIUS CAESAR as a keyphrase, begin by removing any spaces and repeated letters (JULISCAER), and then use this as the beginning of the jumbled cipher alphabet. The remainder of the cipher alphabet is merely the remaining letters of the alphabet, in their correct order, starting where the keyphrase ends. Hence, the cipher alphabet would read as follows.
Plain alphabet   a b c d e f g h i j k l m n o p q r s t u v w x y z
Cipher alphabet  J U L I S C A E R T V W X Y Z B D F G H K M N O P Q
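The construction can be sketched as follows. This is one reading of the rule, assuming that "starting where the keyphrase ends" means continuing from the letter after the last letter of the shortened keyphrase and wrapping round from Z to A; the function name `keyphrase_alphabet` is illustrative.

```python
import string

ALPHABET = string.ascii_uppercase

def keyphrase_alphabet(keyphrase: str) -> str:
    """Build a cipher alphabet from a keyword or keyphrase."""
    letters = [ch for ch in keyphrase.upper() if ch in ALPHABET]
    # dict.fromkeys keeps the first occurrence of each letter, in order.
    prefix = "".join(dict.fromkeys(letters))
    if not prefix:
        raise ValueError("keyphrase must contain at least one letter")
    # Continue from the letter after the keyphrase's last letter, wrapping round.
    start = ALPHABET.index(prefix[-1]) + 1
    rotated = ALPHABET[start:] + ALPHABET[:start]
    remainder = "".join(ch for ch in rotated if ch not in prefix)
    return prefix + remainder

print(keyphrase_alphabet("JULIUS CAESAR"))  # JULISCAERTVWXYZBDFGHKMNOPQ
```

The sender and receiver need only remember the phrase; each can rebuild the full cipher alphabet from it whenever it is required.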
The advantage of building a cipher alphabet in this way is that it is easy to memorise the keyword or keyphrase, and hence the cipher alphabet. This is important, because if the sender has to keep the cipher alphabet on a piece of paper, the enemy can capture the paper, discover the key, and read any communications that have been encrypted with it. However, if the key can be committed to memory it is less likely to fall into enemy hands. Clearly the number of cipher alphabets generated by keyphrases is smaller than the number of cipher alphabets generated without restriction, but the number is still immense, and it would be effectively impossible for the enemy to unscramble a captured message by testing all possible keyphrases.
This simplicity and strength meant that the substitution cipher dominated the art of secret writing throughout the first millennium AD. Codemakers had evolved a system for guaranteeing secure communication, so there was no need for further development – without necessity, there was no spur to invention. The onus had fallen upon the codebreakers, those who were attempting to crack the substitution cipher. Was there any way for an enemy interceptor to unravel an encrypted message? Many ancient scholars considered that the substitution cipher was unbreakable, thanks to the gigantic number of possible keys, and for centuries this seemed to be true. However, codebreakers would eventually find a shortcut to the process of exhaustively searching all keys. Instead of taking billions of years to crack a cipher, the shortcut could reveal the message in a matter of minutes. The breakthrough occurred in the East, and required a brilliant combination of linguistics, statistics and religious devotion.
The Arab Cryptanalysts
At the age of about forty, Muhammad began regularly visiting an isolated cave on Mount Hira just outside Mecca. This was a retreat, a place for prayer, meditation and contemplation. It was during a period of deep reflection, around AD 610, that he was visited by the archangel Gabriel, who proclaimed that Muhammad was to be the messenger of God. This was the first