Differential Cryptanalysis: A Literature Survey

Research Comments from Ciphers By Ritter

Terry Ritter

Differential Cryptanalysis covers a growing variety of attacks on various block ciphers. It appears to be most useful on iterative (round-based) ciphers, perhaps because these can only weakly diffuse the transformations which occur in later rounds. Differential Cryptanalysis is normally a defined-plaintext attack.

The basic idea of Differential Cryptanalysis is to first cipher some plaintext, then make particular changes in that plaintext and cipher it again. Particular ciphertext differences occur more frequently with some key values than others, so when those differences occur, particular keys are (weakly) indicated. With huge numbers of tests, false indications will be distributed randomly, but true indications always point at the same key values and so will eventually rise above the noise to indicate some part of the key.

The basic concept can be applied to virtually any sort of statistic which relates ciphertext changes to key values, even in relatively weak ways. But because the probabilities involved are generally quite small, success generally depends upon having very substantial amounts of known-plaintext. Thus, in practice, Differential Cryptanalysis would seem to be defeated by the simple use of message keys and limitations on the amount of material ciphered under a single message key.

Some versions (e.g., Biham and Shamir 92) can be applied to separately-keyed blocks with a similar overall probability of success. But that success reveals only one of the many keys at random, and a success does not help with the other keys. Nor does Differential Cryptanalysis apply to message keys, since the message key value is not available as known-plaintext. Thus, while Differential Cryptanalysis is powerful, it is not magic, and has very significant requirements which may not be met in practice.

Differential Cryptanalysis depends upon known tables in which the key value selects various data differentials. Consequently, Differential Cryptanalysis might also be defeated by:

  1. Keying which selects among "every" possible table (instead of using a few pre-defined "ideal" tables);
  2. Using data to dynamically select among a large working set of tables (instead of just four); and
  3. Effectively mixing table results as soon as table operations occur (rather than depending upon future "rounds" for mixing, which is risky since there are no future rounds after the last one). Effective mixing should prevent tables from being isolated and separately attacked.


1990 -- Biham and Shamir

Biham, E. and A. Shamir. 1990. Differential Cryptanalysis of DES-like Cryptosystems. Advances in Cryptology -- CRYPTO '90. Springer-Verlag. 2-21.

"Iterated cryptosystems are a family of cryptographically strong functions based on iterating a weaker function n times. Each iteration is called a round and the cryptosystem is called an n round cryptosystem. The round function is a function of the output of the previous round and of a subkey which is a key dependent value calculated via a key scheduling algorithm. The round function is usually based on S boxes, bit permutations, arithmetic operations and the exclusive-or (denoted by + and XOR) operations. The S boxes are nonlinear translation tables mapping a small number of input bits to a small number of output bits. They are usually the only part of the cryptosystem which is not linear and thus the security of the cryptosystem crucially depends upon their choice. The bit permutation is used to rearrange the output bits of the S boxes in order to make the input bits of each S box in the following round depend upon the output of as many S boxes as possible."

"In this paper we describe a new kind of attack that can be applied to many DES-like iterated cryptosystems. This is a chosen plaintext attack which uses only the resultant ciphertexts. The basic tool of the attack is the ciphertext pair which is a pair of ciphertexts whose plaintexts have particular differences. The two plaintexts can be chosen at random, as long as they satisfy the difference condition, and the cryptanalyst does not have to know their values. The attack is statistical in nature and can fail in rare instances."

"Differential cryptanalysis is a method which analyzes the effect of particular differences in plaintext pairs on the differences of the resultant ciphertext pairs. These differences can be used to assign probabilities to the possible keys and to locate the most probable key. This method usually works on many pairs of plaintexts with the same particular difference using only the resultant ciphertext pairs. For DES-like cryptosystems the difference is chosen as a fixed XORed value of two plaintexts."

"Although DES seems to be very non linear in its input bits, when particular combinations of input bits are modified simultaneously, particular intermediate bits are modified in a usable way with a relatively high probability after several rounds."

". . . every input XOR of an S box suggests a probabilistic distribution of the possible output XORs. In this distribution several output XORs have a relatively high probability. Table 1 describes the distribution of the output XORs for several input XORs in S1." "Many entries are impossible or have a small probability. However, there are several entries with probability 1/4 (i.e., 16 out of 64) or close to it."

"We can use this property as a tool to identify key bits. If we can find the output XOR of the F function of the last round, we can filter the set of possible subkeys entering this F function when the pair of ciphertexts is known. Using both ciphertexts it is easy to calculate the input XOR of the F function of the last round and its output XOR. Then the input XOR and output XOR of each S box in the last round are known. In case k input pairs can lead to that entry in the table, exactly k values of the corresponding six subkey bits are possible. Most subkey values are suggested by only a few pairs. However, the real value of the subkey bits is suggested by all the pairs and can be identified."

1991 -- Biham and Shamir

Biham, E. and A. Shamir. 1991. Differential Cryptanalysis of Snefru, Khafre, REDOC-II, LOKI and Lucifer. Advances in Cryptology -- CRYPTO '91. Springer-Verlag. 156-171.

"Two-pass Snefru is easily breakable within three minutes on a personal computer."

"Khafre with 16 rounds is breakable by a differential cryptanalytic chosen plaintext attack using about 1500 encryptions within about an hour on a personal computer."

"REDOC-II with one round is breakable by a differential cryptanalytic chosen plaintext attack using about 2300 encryptions within less than a minute on a personal computer."

1991 -- Nyberg

Nyberg, K. 1991. Perfect nonlinear S-boxes. Advances in Cryptology -- EUROCRYPT '91. 378-386.


"A perfect nonlinear S-box is a substitution transformation with evenly distributed directional derivatives. Since the method of differential cryptanalysis presented by E. Biham and A. Shamir makes use of nonbalanced direction derivatives, the perfect nonlinear S-boxes are immune to this attack. The main result is that for a perfect nonlinear S-box the number of input variables is at least twice the number of output variables." [p.378]

1. Introduction

"In [12] Meier and Stafflebach discuss perfect nonlinear Boolean functions, which are defined to be at maximum distance from linear structures. These functions are the same as the previously known bent functions [15]. To construct perfect nonlinear S-boxes it is necessary that each output bit is a perfect nonlinear function of the input. But it is not sufficient, indeed, also every linear combination of output variables have to be perfect nonlinear. We present two different constructions to achieve this property." [p.378]

1991 -- Dawson and Tavares

Dawson, M. and S. Tavares. 1991. An Expanded Set of S-box Design Criteria Based on Information Theory and its Relation to Differential-Like Attacks. Advances in Cryptology -- EUROCRYPT '91. 353-367.


"In this work we present an expanded set of design criteria for creating good S-boxes based on information theoretic concepts and show that an S-Box that meets these criteria is immune to differential cryptanalysis [1]."

"We have defined a set of six properties that an Ideal S-box is required to meet. This set of properties has a broader scope than those of Forre and any S-box that meets these properties will also meet Forre's. The properties are grouped into a set of static properties and a set of dynamic properties."

Static Properties

"The first static property is that the partial information about the inputs and outputs does not reduce the uncertainty in an unknown output."

"The second static property is that the partial information about the inputs and outputs does not reduce the uncertainty in an unknown output."

"The third static property is that the uncertainty in a data value is reduced by the minimum amount possible when it passes through an S-box."

Dynamic Properties

"The dynamic properties are similar to the static properties except that they deal with the changes in inputs and outputs."

Analysis of DES S-boxes Using The Design Criteria

". . . we could not find S-boxes with substantially better information theoretic properties than the S-boxes of DES and which also meet the acknowledged DES design criteria." ". . . there were many S-boxes found which met the acknowledged DES design criteria but had poor information theoretic properties."

". . . the properties of the inverses of the DES 4x4 S-boxes were as good as those of the S-boxes themselves." ". . . the inverses of the DES 4x4 S-boxes meet the acknowledged DES design criterion which requires that at least two bits change in the output whenever one input bit is changed. These two discoveries indicate that the designers of DES placed an equal emphasis on the properties of the S-boxes and their inverses.

"In every case we found that the properties of the complete 6x4 S-boxes were better than any individual 4x4 sub-box. We concluded that using multiple sub-boxes to form a larger S-box is an important method which can be used to create S-boxes that have better properties than are possible in a single S-box."

". . . no nxn S-box can meet the Dynamic criteria perfectly because, due to the nature of the XOR function, output XOR values always occur in pairs (since a XOR b = b XOR a)."

1992 -- Biham and Shamir

Biham, E. and A. Shamir. 1992. Differential Cryptanalysis of the Full 16-round DES. Advances in Cryptology -- CRYPTO '92. Springer-Verlag. 487-496.

"In this paper we finally break the 16-round barrier. We develop an improved version of differential cryptanalysis which can break the full 16-round DES in 237 time and negligible space by analyzing 236 ciphertexts obtained from a larger pool of 247 chosen plaintexts. An interesting feature of the new attack is that it can be applied with the same complexity and success probability even if the key is frequently changed and thus the collected ciphertexts are derived from many different keys."

"Any pair of plaintexts which gives rise to the intermediate XORs specified by this characteristic is called a right pair. The attack tries many pairs of plaintexts, and eliminates any pair which is obviously wrong due to its known input and output values. However since the cryptanalyst cannot actually determine the intermediate values, the elimination process is imperfect and leaves behind a mixture of right and wrong pairs.

"In earlier versions of differential cryptanalysis, each surviving pair suggested several possible values for certain key bits. Right pairs always suggest the correct value for these key bits (along with several wrong values), while wrong pairs suggest random values. When sufficiently many right pairs are analyzed, the correct value (signal) overcomes the random values (noise) by becoming the most frequently suggested value. The actual algorithm is to keep a separate counter for the number of times each value is suggested, and to output the index of the counter with the maximal final value. This approach requires a huge memory (with up to 242 counters in the attack on the 15-round variant of DES), and has a negligible probability of success when the number of analyzed pairs is reduced below the threshold implied by the signal to noise ratio.

"In the new version of differential cryptanalysis, we work somewhat harder on each pair, and suggest a list of complete 56-bit keys rather than possible values for a subset of key bits. As a result, we can immediately test each suggested key via trial encryption, without using any counters. These tests can be carried out in parallel on disconnected processors with very small local memories, and the algorithm is guaranteed to discover the correct key as soon as the first right pair is encountered. Since the processing of different pairs are unrelated, they can be generated by different keys at different times due to frequent key changes, and the discovery of a key can be announced in real time while it is still valid (e.g., in order to forge authenticators for banking messages)."

1992 -- Nyberg and Knudsen

Nyberg, K. and L. Knudsen. 1993. Provable Security Against Differential Cryptanalysis. Advances in Cryptology -- CRYPTO '92. 566-574.

1 Introduction

"The purpose of this paper is to show that there exist DES-like iterated ciphers which are provably resistant against differential attacks."

2 Differential Cryptanalysis of DES-like Iterated Ciphers

"In [1] Biham and Shamir introduced differential cryptanalysis of DES-like ciphers. In their attacks they make use of characteristics, which describe the behaviour of input and output differences for some number of consecutive rounds. The probability of a one-round characteristic is the conditional probability that given a certain difference in the inputs to the round we get a certain difference in the outputs of that round. Assume that in every round the inputs E(R) XOR K to f are independent and random. This assumption is satisfied if the round keys are uniformly random and independent. Then the probability of an r-round characteristic is obtained by multiplying the probabilities of the r one-round characteristics.

"Lai and Massey [3] observed that for the success of differential cryptanalysis it is not necessary to fix the values of input and output differences for the intermediate rounds in a characteristic. They introduced the notion of differentials. The probability of an r-round differential is the conditional probability that given an input difference at the first round, the output difference at the r'th round will be some fixed value. Note that the probability of an r-round differential with input difference A and output difference B is the sum of the probabilities of all r-round characteristics with input difference A and output difference B. For r <= 2 the probabilities for a differential and for the corresponding characteristic are equal, but in general the probabilities for differentials would be higher.

"In order to make a successful attack on a DES-like iterated cipher by differential cryptanalysis the existence of good characteristics is sufficient. On the other hand to prove security against differential attacks for DES-like iterated ciphers we must ensure that there is no differential with a probability high enough to enable successful attacks."

1992 -- Adams

Adams, C. 1992. On immunity against Biham and Shamir's "differential cryptanalysis." Information Processing Letters. 41: 77-80.

2. Avoiding differential cryptanalysis

"Differential cryptanalysis [5] is based on the fact that in many s-boxes certain input XORs (i.e., certain fixed changes in the s-box input vector) lead to certain output XORs (fixed changes in the s-box output vector) with fairly high probability ([about] 25%) and to certain other output XORs with very low or zero probability. Chosen plaintext attacks can be mounted which take advantage of the relatively high probabilities to reduce the search space for the key in use. It is obvious, therefore, that if all output XORs occurred with similar (ideally, equal) probability, differential cryptanalysis would have no greater chance of success than exhaustive search.

"We can design s-boxes with equiprobable output XORs through the use of bent functions ([10,14,2], and others)."

". . . the s-boxes described above cannot be n x n bijective s-boxes since columns in the representative matrix are bent and bent functions are not weight balanced. Therefore, SPN cryptosystems taking advantage of this work must be constructed such that it is never required to go 'backwards' through any of their component s-boxes."

1993 -- Ben-Aroya and Biham

Ben-Aroya, I. and E. Biham. 1993. Differential Cryptanalysis of Lucifer. Advances in Cryptology -- CRYPTO '93. Springer-Verlag. 186-199.


"Differential cryptanalysis was introduced as an approach to analyze the security of DES-like cryptosystems. The first example of a DES-like cryptosystem was Lucifer, the direct predecessor of DES, which is still believed by many people to be much more secure than DES, since it has 128 key bits, and since no attacks against (the full variant of) Lucifer were ever reported in the cryptographic literature. In this paper we introduce a new extension of differential cryptanalysis, devised to extend the class of vulnerable cryptosystems. This new extension suggests key-dependent characteristics, called conditional characteristics, selected to enlarge the characteristics' probabilities for keys in subsets of the key space. The application of conditional characteristics to Lucifer shows that more than half of the keys of Lucifer are insecure, and the attack requires about 236 complexity and chosen plaintexts to find those keys. The same extension can also be used to attack a new variant of DES, called RDES, which was designed to be immune against differential cryptanalysis. These new attacks flash new light on the design of DES, and show that the transition of Lucifer to DES strengthened the later cryptosystem."

"In this paper we extend differential cryptanalysis in several directions: The main extension of this paper lets differential cryptanalysis to analyze a wider set of cryptosystems. We define conditional characteristics as key-dependent characteristics selected to maximize the characteristic's probability (the fraction of right pairs) for only a specific subset of the key space. The required coverage of (almost) all the key space is done via selection of several conditional characteristics designed for different fractions of the key space."

"Several researchers studied how to make cryptosystems immune against differential analysis, but till now, this effort was not very successful. Many of them[1,9,18] suggested the use of S boxes whose difference distribution tables are uniform, and in particular they suggested the use of bent functions. However, the application of this suggestion to DES was studied in [2,7], and it was shown that the resultant cryptosystems become much weaker than DES."

"Differential cryptanalysis requires one to find good characteristics, i.e., to find pairs of messages, such that the difference of the output of the nth round during encryption of these messages is predictable with a relatively high probability." "In [3,2] the characteristic's probability is defined as the probability that a random pair (whose plaintext difference is omegap) is a right pair with respect to a random key, and it is shown that the probability that a random pair is a right pair with respect to a fixed key may depend on the choice of the key. In this paper we are interested in characteristics for which the probability that a random pair is a right pair vary between different keys. We call these characteristics conditional characteristics."

1993 -- O'Connor

O'Connor, L. 1993. On the Distribution of Characteristics in Bijective Mappings. Advances in Cryptology -- EUROCRYPT 93. 360-370.

1 Introduction and Results

"Differential cryptanalysis is a statistical attack popularized by Biham and Shamir in a series of well-known papers [1, 2, 3]. The attack has been applied to a wide range of iterated mappings including LUCIFER, DES, FEAL, REDOC, Kahfre [4, 5, 12, 13, 17, 19]. As explained below, the attack is based on a quantity O called a characteristic, which has some probability pO of giving information about the secret key used in the mapping. The attack is universal in that characteristics O will always exist for any iterated mapping; however pO may be very small, and possibly less likely to furnish information concerning the key than the success of guessing the secret key at random. For this reason, differential cryptanalysis has had varying success against the iterated mappings it has been applied to, and little is known about how useful the attack is expected to be against an arbitrary iterated mapping."

"We will give a brief description of differential cryptanalysis with reference to product ciphers, though any iterated mapping would suffice. For a product cipher E that consists of R rounds, let Er(X,K) be the encryption of the plaintext X under the key K for r rounds, 1 <= r <= R. Note that ER(X,K) = E(X,K) = C is the ciphertext for X. Let dC(r) = Er(X,K) + Er(X',K) be the difference between the ciphertexts of two plaintexts X,X' after r rounds where 1 <= r <= R. For our purposes the difference operator + will refer to addition in the vector space Z2m. An r-round characteristic is defined as an (r+1)-tuple OR(dX,dY1, dY2,...,dYr) where dX is a plaintext difference, and the dYi are ciphertext differences. A plaintext pair X,X' of difference dX = X + X' is called a right pair with respect to a key K and a characteristic Or(dX,dY1, dY2,...,dYr) if when X and X' are encrypted, dC(i) = dYi for 1 <= i <= r. That is, X,X' is a right pair if the characteristic correctly predicts the ciphertext differences at each round. The characteristic Or has probability pOr if a fraction pOr of the plaintext pairs of difference dX are right pairs. On the other hand, if X,X' such that dX = X + X' is not a right pair, then it is said to be a wrong pair (with respect to the characteristic and the key). A table which records the number of pairs of difference dX that give the output difference dY for a mapping PI is called the XOR table distribution of PI. A characteristic dX,dY is said to be impossible for PI if its corresponding XOR table entry is zero. Also a characteristic will be called nonzero if w(dX),w(dY) > 0 , where w(.) is the Hamming weight function. Using a characteristic O of appropriate length it is then possible to devise a statistical experiment which when repeated a sufficient number of times will yield the subkey of the last round (see [1] for details)."

4 Conclusion and Remarks

"Our results then show that a relatively simple design can produce product ciphers for which all characteristics O are expected to (correctly) predict differences with low probability. We further note that random m-bit permutations can be generated efficiently [15], and that the fraction of permutations that are . . . linear [7] or degenerate [14] in any output bit is tending to zero rapidly as a function of m. On the other hand, Biham and Shamir [3] found that replacing the S-boxes of DES by random 4-bit permutations yielded systems that were far weaker than the original DES. The weakness of these S-boxes appears to be due to the dimension of the permutation rather than the use of [random] permutations per se."

1995 -- Youssef, Tavares, Mister and Adams

Youssef, A., S. Tavares, S. Mister and C. Adams. 1995. Linear Approximation of Injective S-boxes. IEE Electronics Letters. 31(25): 2168-2169.


"Nonlinearity is a crucial requirement for the substitution boxes in secure block ciphers. In this letter, we derive an estimate for the expected nonlinearity of a randomly selected injective substitution box."


"Differential cryptanalysis [1] and linear cryptanalysis [2] are powerful cryptanalytic attacks on private-key block ciphers. The complexity of differential cryptanalysis depends on the size of the largest entry in the XOR table, the total number of zeros in the XOR table, and the number of nonzero entries in the first column of that table [1], [3]. The complexity of linear cryptanalysis depends on the size of the largest entry in the linear approximation table (LAT)[2].

"One way to reduce the size of the largest entry in the XOR table is to use injective substitution boxes (s-boxes) such that the number of output bits of the s-box is sufficiently larger than the number of input bits. In this way, it is very likely that the entries in the XOR distribution table of a randomly chosen injective s-box will have only small values, making the block cipher resistant to differential cryptanalysis. Some proposed block ciphers, such as CAST [4] and Blowfish [5], take advantage of this property.

"On the other hand, Biham [6] proved that if for an nxm s-box described by f: Z2n -> Z2m we have m >= 2n - n, then at least one linear combination of the output bits must be an affine combination of the input bits and the block cipher can be trivially broken by linear cryptanalysis. In this letter, we estimate the size of the largest entry in the LAT of a randomly selected injective s-box."

1995 -- Youssef and Tavares

Youssef, A., S. Tavares. 1995. Resistance of Balanced S-boxes to Linear and Differential Cryptanalysis. Information Processing Letters. 56: 249-252.


"In this letter, we study the marginal density of the XOR distribution table, and the linear approximation table entries of regular substitution boxes (s-boxes). Based on this, we show that the fraction of good s-boxes (with regard to immunity against linear and differential cryptanalysis) increases dramatically with the number of input variables."


"Differential cryptanalysis [1], and linear cryptanalysis [3] are currently the most powerful cryptanalytic attacks on private-key block ciphers. The complexity of differential cryptanalysis depends on the size of the largest entry in the XOR table, the total number of zeros in the XOR table, and the number of nonzero entries in the first column in that table [1], [8]. The complexity of linear cryptanalysis depends upon the size of the largest entry in the linear approximation table (LAT).

"One requirement in s-box design is to have a balanced s-box (also known as a regular s-box). This means that each output symbol should appear an equal number of times when the input is varied aver all possible values.

"Gordon and Retkin calculated the probability that one or more of the output coordinates of a random, reversible s-box is an affine function. By showing that this probability decreases dramatically with the number of input variables, they conjectured that larger s-boxes are better. In this letter, we provide further evidence for their conjecture by showing that the fraction of good s-boxes, with regard to immunity against linear and differential cryptanalysis, increases dramatically with the number of input variables."

1995 -- Vaudenay

Vaudenay, S. 1995. An Experiment on DES Statistical Cryptanalysis.


"Linear cryptanalysis and differential cryptanalysis are the most important methods of attack against block ciphers. Their efficiency have been demonstrated against several ciphers, including the Data Encryption Standard. We prove that both of them can be considered, improved and joined in a more general statistical framework. We also show that the very same results as those obtained in the case of DES can be found without any linear analysis and we slightly improve them into an attack with theoretical complexity 242.9.

"We can apply another statistical attack -- the X2-cryptanalysis -- on the same characteristics without a definite idea of what happens in the encryption process. It appears to be roughly as efficient as both differential and linear cryptanalysis."

"The success of those methods have focused the attention on the linear properties of the boxes. In this paper, we try to prove that the linear properties are not so important."

Terry Ritter, his current address, and his top page.

Last updated: 1997-09-25