Baby DES 3 Statistical considerations \302\251 2007 by Mike May, S.J., Saint Louis University - maymk@slu.edu The third worksheet with BabyDES looks at diffusion and does a statistical analysis of how many rounds we need to use so that \342\200\234every bit of the ciphertext is equally dependent on each bit of the plaintext and each bit of the key. The methods used here for statistical analysis can easily be adapted for other block ciphers. restart: with(Statistics):
<Text-field style="Heading 1" layout="Heading 1">Overview</Text-field> This worksheet looks at how effectively Baby DES is when measured from some statistical criteria. These criteria can be used to evaluate other symmetric ciphers. The idea to be explored is that if Baby DES is a strong cryptographic system then each output bit acts like a random variable with an even chance of being 1 or 0, and that each bit of the ciphertext is independent of all of the other bits of ciphertext. There are many ways of testing that assertion, but here are a few specific tests: 1) I can focus first on the number of bits changed in the ciphertext if I change a single bit of the plain text or of the key. Does the number of bits changed in the ciphertext behave like what I should expect if each bit was changed independently? 2) I can switch focus to a single bit in the ciphertext. If I look at the behavior of a single bit as I change each bit of key or plaintext, does it behave like I should expect if it were changing randomly? 3) Instead of focusing on the collection of encryptions obtained by moving a single bit from a fixed point, I can look at collections of pairs of words that differ by a single bit and ask the same questions I asked above. I can also look at pairs of words that differ by any fixed difference. 4) I can also look at the collection of ciphertexts I get by encrypting a bock of plaintext with a fixed key or encrypting a single word with a block of keys. 5) A final focus to consider is the connection between bits in ciphertexts. In particular I can ask if bits in pairs of positions act as if they are independent. (This is beyond the scope of this worksheet.) This worksheet assumes that you have already worked through the first worksheet on Baby DES. That first Baby DES worksheet creates the BabyDES.m. The BabyDES.m file should be in the current directory. read `BabyDES.m`: If the attempt to read the file failed, check the home directory and see that BabyDES.m is stored there. If not rerun the previous worksheet and put a copy in the current directory. currentdir();
<Text-field style="Heading 1" layout="Heading 1">Statistical background</Text-field> Since we are checking if particular bits behave like a random variable with an even probability of coming up 0 or 1 it is worthwhile reviewing some of the statistics we need to do the examination. If each bit is changed with a probability of .5, then the sum of a collection of bits should produce a binomial distribution. Maple lets us produce random numbers from such a distribution. We use the Statistics package and produce a random variable R12 that mimics this behavior. R12 := RandomVariable(Binomial(12,.5)); We are looking at numbers that we hope are the result of counting the number of successes in 12 trials, each of which has a .5 chance of success. We expect to get 6 = 12*.5, but will get a variety of answers with 6 being most common. Sample(R12,1); Sample(R12,10); We expect the standard deviation of this measure to be sqrt(12*.5*.5) or approximately 1.7 Consider what happens if we take 400 such fake lists and do basic descriptive statistics on the set of lists. L1 := Sample(R12,400); DataSummary(L1); Tally(L1); "Mean"=Mean(L1); Histogram(L1,discrete=true,frequencyscale=absolute); "Mode"=Mode(L1,type=sample); We can construct a confidence interval for the mean in a case like this. If our sample is random, the observed mean should be within 1.96 standard deviations of the true mean 95% of the time. If we are looking at a large sample as above the standard deviation for the average of the means of n samples is sqrt(12*.5*.5/n). Thus in the case above with 400 samples, the observed mean should be within .17 of 6.0 for 19 out of 20 tries. sqrt(12*.5*.5)/20*1.96;
<Text-field style="Heading 2" layout="Heading 2">Exercise</Text-field> 1) Repeat the process above 10 times, each time finding the mean of 400 samples of the random binomial distribution (12, .5). In how many of the 10 cases is the mean in the expected range?
<Text-field style="Heading 1" layout="Heading 1">Shifting the plaintext one bit at a time</Text-field> The first way to test for random behavior is to ask if we start from a fixed plaintext and key, and change each bit of the plaintext one at a time, do the bits of the ciphertext behave randomly.
<Text-field style="Heading 2" layout="Heading 2">Counting the number of bits changed in each ciphertext word</Text-field> A first examination looks at how many bits of output are changed if we change each single bit of the input or of the key. Each bit of the output should be changed with a probability of .5. To make our task easier, we want routines that will easily produce various keys or plaintext messages. Recall that for Baby DES, the input block is 12 bits and the key is 9 bits. intToBitKey := i -> substring(convert(convert(2^9+i,binary),string),2..10): intToBitMessage := i -> substring(convert(convert(2^12+i,binary),string),2..13): zerokey := intToBitKey(0); zeromessage := intToBitMessage(0); The first test is to compare the ciphertext obtained from the zerokey and zeromessage with the ciphertext obtained by encoding a message with a single bit changed. We start by looking at 2 rounds. numRounds := 2; cipher0 := BabyDES(zeromessage, zerokey, numRounds); print("bit changed, plain text, cipher text, change in cipher"); for loc from 0 to 11 do message := intToBitMessage(2^loc); cipher := BabyDES(message, zerokey,numRounds); comparison := xorNbits(cipher, cipher0, 12); print(loc,message, cipher, comparison); end do: Ideally we expect the change in the ciphertext to be half ones and half zeroes with the ones randomly distributed. Starting with the zerotext and zerokey, with 2 round Baby DES, we changed an average of 3 bits of output when we changed one round of input. This is enough less than the hoped for value of 6 bits that we conclude 2 rounds is not enough. Obviously Baby DES need more than 2 rounds to be effective. Before testing more rounds we want to produce procedures that will make the results easier to analyze. We want procedures to convert a bit string into a list of digits and another procedure to count the number of ones in such a list. zeroList12 := [seq(0,place=1..12)]: bitStringToList := bitString -> map(parse,StringTools[Explode](bitString)): addList := bitList -> add(i, i=bitList): We now count the number of bits changes by individually changing each bit in the plaintext message and compute the mean and standard deviation of these results. diffCount := [seq(addList(bitStringToList(xorNbits(cipher0, BabyDES(intToBitMessage(2^i),zerokey,2),12))),i=0..11)]; Mean(diffCount); This looks quite a bit different than the behavior we expected. While it does not formally prove that Baby DES with 2 rounds is weak, it makes us very suspicious. (Taking 12 samples from a (12, .5) binomial distribution simplifies the computation. The standard deviation of the sample means is sqrt(12*.5*.5/12) = .5. Thus in 19 of 20 cases the mean should be within .98 of 6. Maybe we are very unlucky with the starting point, but we doubt it.) We want to generalize the procedure above so that we can input a key, a message and a number of rounds and we look at the change caused by changing each bit of the plaintext, one at a time. changeEachBitMessage := proc(message, key, rounds) local cipher0, diffCount, newMessage, changebits, pos: cipher0 := BabyDES(message, key, rounds): diffCount := vector(12): for pos from 0 to 11 do newMessage := xorNbits(message, intToBitMessage(2^pos), 12): changebits := xorNbits(cipher0, BabyDES(newMessage, key, rounds), 12): diffCount[pos+1] := addList(bitStringToList(changebits)): od: convert(diffCount,list); end: count1 := changeEachBitMessage(zeromessage, zerokey,2); Mean(count1); We can try Baby DES with a variable number of rounds and see how many rounds we need for the behavior to look like bits are randomly changed. for rounds from 2 to 10 do countList := changeEachBitMessage(zeromessage, zerokey,rounds): print(rounds,countList, "Mean"=Mean(countList), "SD"=StandardDeviation(countList)): end do: At this first level of analysis, it looks like 5 rounds of Baby DES changes bits with a probability of about .5. With a little more work we can repeat this process starting with a random plaintext an a random key. This lets us take more samples and reduce the confidence interval. for counter from 1 to 10 do message := intToBitMessage(rand(2^12)()): key := intToBitKey(rand(2^9)()): countList := changeEachBitMessage(message, key, 5): print("mesage"=message, "key"=key, countList, "Mean"=Mean(countList)): end do: From the above output I am really only interested in the means of the average number of changed bits. Visual inspection has the average centering closer to 5.8 than to 6.0. meanList := [seq(0, i=1..16)]: for counter from 1 to 16 do message := intToBitMessage(rand(2^12)()): key := intToBitKey(rand(2^9)()): countList := changeEachBitMessage(message, key, 5): meanList[counter] := Mean(countList): end do: print("List"=meanList, "Mean"=Mean(meanList)): Since I have increased the number of samples by 16, I reduce the confidence interval by sqrt(1/16) = 1/4. I repeat the process with a variable number of rounds and more samples, so that my confidence interval is about .1. for roundCount from 2 to 10 do meanList := [seq(0, i=1..100)]: for counter from 1 to 100 do message := intToBitMessage(rand(2^12)()): key := intToBitKey(rand(2^9)()): countList := changeEachBitMessage(message, key, roundCount): meanList[counter] := Mean(countList): end do: print("Number of rounds = ", roundCount, " averagebits changed = ", Mean(meanList)): end do: Unless we believe too many rounds will make the algorithm less secure, it looks like we want 8 rounds for Baby DES to act randomly on bits.
<Text-field style="Heading 3" layout="Heading 3">Exercise:</Text-field> 2) Modify the above to look at 10 trials of 100 randomly chosen plaintexts and keys. What is the confidence interval for the means of bits changed? In how many of your trials was the mean in the confidence interval?
<Text-field style="Heading 2" layout="Heading 2">Counting the number of words for which a bit is changed</Text-field> We initially asked if we change the plaintext one bit at a time, how many bits are changed in each new ciphertext. The other obvious way to arrange the data asks in how many of the ciphertexts is a particular bit changed. Since we are looking at sets of 12 words, the statistics will behave the same way. changeEachBitMessage3 := proc(message, key, rounds) local cipher0, diffCount, newMessage, changebits, pos: cipher0 := BabyDES(message, key, rounds): diffCount := [seq(0,i=1..12)]: for pos from 0 to 11 do newMessage := xorNbits(message, intToBitMessage(2^pos), 12): changebits := xorNbits(cipher0, BabyDES(newMessage, key, rounds), 12): diffCount := zip(`+`,diffCount,bitStringToList(changebits)): od: diffCount: end: changeEachBitMessage3(zeromessage, zerokey,8); If we do this for a number of words we can then accumulate these changes starting at a number of different base words. key := intToBitKey(rand(2^9)()); numBaseWords := 100: countList := [seq(0,i=1..12)]: for count from 1 to numBaseWords do message := intToBitMessage(rand(2^12)()): tempList := changeEachBitMessage3(message, key, 8): countList := zip(`+`,countList, tempList): end do: countList; map(x -> x/numBaseWords*1.0, countList); errInt := sqrt(12/(4*numBaseWords)*1.96); As above, each mean should be within errInt of the expected normalized mean of 6. We expect this to happen 19 out of 20 times. To gain a better perspective, we repeat the process from 10 randomly chosen keys. numBaseWords := 100: errInt := sqrt(12/(4*numBaseWords)*1.96); for i from 1 to 10 do key := intToBitKey(rand(2^9)()); countList := [seq(0,i=1..12)]: for count from 1 to numBaseWords do message := intToBitMessage(rand(2^12)()): tempList := changeEachBitMessage3(message, key, 8): countList := zip(`+`,countList, tempList): end do: print(key,countList); print(map(x -> x/numBaseWords*1.0, countList)); end do:
<Text-field style="Heading 3" layout="Heading 3">Exercise</Text-field> 3) How many of the means found above are outside the expected confidence interval? Is this consistent with the expected results if the bit switched are random?
<Text-field style="Heading 1" layout="Heading 1">Pairs of plaintext blocks</Text-field> If the ciphertext behavior is random, we get the same effect if we look at pairs of bits that are separated by a single bit or by any specified difference. It is worth verifying this pattern of behavior.
<Text-field style="_cstyle7" layout="_pstyle12">Counting the number of bits changed in each word</Text-field> We start by looking at how many bits are changed between the ciphertexts of a pair of plaintexts key := intToBitKey(rand(2^9)()); shiftWord := intToBitMessage(rand(2^12)()); firstWord := intToBitMessage(rand(2^12)()); secondWord := xorNbits(firstWord, shiftWord, 12); cipherDiff := xorNbits(BabyDES(firstWord,key,8),BabyDES(secondWord,key,8),12); diffCount := addList(bitStringToList(cipherDiff)); Now we create a procedure to do this with a bunch of pairs pairShiftCounter := proc(key, shiftWord, rounds, numPairs) local countList, firstWord, secondWord, cipherDiff, i: countList := table(): for i from 1 to numPairs do firstWord := intToBitMessage(rand(2^12)()); secondWord := xorNbits(firstWord, shiftWord, 12); cipherDiff := xorNbits(BabyDES(firstWord,key,rounds), BabyDES(secondWord,key,rounds),12); countList[i] := addList(bitStringToList(cipherDiff)); od: convert(countList,list): end: We use this procedure to test a selection of 400 randomly chosen words. The expected mean of the number of bits shifted in each word is 6 with a confidence interval of sqrt(12/4*400)*1.96, or about .17 each way. countList := pairShiftCounter(key, shiftWord, 8, 400): Tally(countList); Mean(countList); StandardDeviation(countList); Once again, the behavior is reasonably close to what we expect from random
<Text-field style="Heading 3" layout="Heading 3">Exercise</Text-field> 4) Repeat the test above with 10 randomly chosen keys. In how many of the cases does the mean fall in the appropriate confidence interval?
<Text-field style="Heading 2" layout="Heading 2">Counting the number of words for which a bit is changed</Text-field> The other way of arranging the data is to look at how many pairs change a particular bit. We want a procedure that takes a given key and shiftword and looks at random pairs that differ by the shift, encrypted by the key, pairShiftCounter2 := proc(key, shiftWord, rounds, numPairs) local countList, firstWord, secondWord, cipherDiff, i: countList := [seq(0,i=1..12)]: for i from 1 to numPairs do firstWord := intToBitMessage(rand(2^12)()); secondWord := xorNbits(firstWord, shiftWord, 12); cipherDiff := xorNbits(BabyDES(firstWord,key,rounds), BabyDES(secondWord,key,rounds),12); countList := zip(`+`,countList,bitStringToList(cipherDiff)); od: countList: end: We want to apply this procedure to 400 words and see how often each bit is changed. We expect the expected mean for the number of pairs where each bit changes with 400 pairs of plaintext is 400*.5=200 and the expected standard deviation is sqrt(400*.5*.5)=10 key := intToBitKey(rand(2^9)()); shiftWord := intToBitMessage(rand(2^12)()); print("Key = ", key, " Shiftword = ", shiftWord); countList := pairShiftCounter2(key, shiftWord, 8, 400); Tally(countList); Mean(countList); StandardDeviation(countList); The confidence interval for the mean of the 12 values should be 200 plus or minus sqrt(400/(4*12))*1.96 = 5.658.
<Text-field style="Heading 3" layout="Heading 3">Exercise</Text-field> 5) Repeat the test above 10 times with randomly chosen keys and shift words. In how many of the 10 trials is the mean in the expected confidence interval?
<Text-field style="Heading 1" layout="Heading 1">Clusters of plaintext blocks</Text-field> Another way to test for randomness is to fix a key and a starting plaintext and look at the aggregated result of encrypting a block of plaintext words. We will look at a block of 32 words obtained by letting 5 consecutive plaintext bits vary. We start with some technical procedures for looking at a block of plaintexts. makePlainBlock := proc(startPlain,shiftBits) local intShifts,i: intShifts := [seq(i*2^shiftBits,i=0..31)]: map(xorNbits,map(intToBitMessage,intShifts),startPlain,12): end: starter :=intToBitMessage(rand(2^12)()); makePlainBlock(starter,7); We want a procedure to take each entry in such a block, encrypt it with Baby DES, turn the ciphertext into a list of bits, and add the bits for each position. testPlainBlock := proc(startPlain, key, shiftBits, rounds) local plainBlock, i, distList: plainBlock := makePlainBlock(startPlain, shiftBits): distList := [seq(0,i=1..12)]: for i from 1 to 32 do distList := zip(`+`, distList, bitStringToList(BabyDES(plainBlock[i],key,rounds))): end do: distList; end: Now we can test the procedure with a randomly chosen plaintext, key and shift. key := intToBitKey(rand(2^9)()); startPlain := intToBitMessage(rand(2^12)()); shiftBits := rand(8)(); testPlainBlock(startPlain, key, shiftBits, 8); We can run this procedure with a variable number of rounds to see when Baby DES becomes effective. print(key, startPlain, shiftBits); for rounds from 2 to 10 do d2list := testPlainBlock(startPlain, key, shiftBits, rounds): d2listMean := evalf(Mean(d2list),3): d2listStandDev := evalf(StandardDeviation(d2list),3): print(rounds, d2list, d2listMean, d2listStandDev); end do: Since we are looking at 32 words we expect the mean of the the values to be 32*.5=16. If the bits are acting like random variables the standard deviation should be about sqrt(32*.5*.5)=2.828. We can also compute a confidence interval for the standard deviation. The standard deviation of the standard deviation is called the standard error. The standard error is the standard deviation divided by the square root of twice the sample size. In this case the standard error is about .35. That means our confidence interval for standard deviations is roughly from 2.13 to 3.53.
<Text-field style="_pstyle9" layout="_pstyle9">Exercise</Text-field> 6) Repeat the above exercise with 10 randomly chosen keys, initial words, and shift blocks. Explain why the evidence shows 3 rounds is not enough to make Baby DES act randomly on bits.
The ways we have gathered data can obviously be adapted to other cryptographic systems. One can also do more sophisticated statistical analyses on the gathered data. Reference: Cryptography with Coding Theory, by Trappe and Washington, second edition, ISBN-13: 978-0131862395, Prentise Hall Publishing \302\2512005. Legal Notice: The copyright for this application is owned by the author(s). Neither Maplesoft nor the author are responsible for any errors contained within and are not liable for any damages resulting from the use of this material. This application is intended for non-commercial, non-profit use only. Contact the author for permission if you wish to use this application in for-profit activities.