An Examination of the RSA Cryptosystem Using Distributed and Heuristic Methods

Submitted May 2004, in partial fulfilment of the conditions of the award of the degree B.Sc in Computer Science

Drew Masters (Djm01u)
School of Computer Science and Information Technology
University of Nottingham

I hereby declare that this dissertation is all my own work, except as indicated in the text:

Signature ______________________ Date ____/_____/_____


Abstract

This document begins by exploring and extending an existing brute force attack on the RSA cryptosystem. After discovering particular flaws in the original system, it goes on to investigate these flaws in a heuristic manner using a distributed system, finding conspicuous artefacts and examining them. It concludes with the results of the investigation and suggests possible areas for future expansion.


Table of Contents

1. Introduction
   1.1 Aim
   1.2 Outline
2. Background Information
   2.1 History of Encryption
   2.2 Secret Key Cryptography
   2.3 Public Key Encryption
   2.4 Hybrid Systems
   2.5 RSA
      2.5.1 Proof of RSA
      2.5.2 Generating a Secure Key
      2.5.3 Phi(n)
      2.5.4 Search Space Boundaries
   2.6 Related Work
3. System Design
   3.1 Finding Phi(n)
   3.2 Choice of Language
   3.3 Distribution
   3.4 Final Design
   3.5 Server Design
      3.5.1 Random Keys
      3.5.2 Generating Prime Numbers
   3.6 Client Design
   3.7 Communication
   3.8 Deployment
4. Data Collection
   4.1 False Positives
   4.2 Adapted Aim
   4.3 Adapted System Design
5. Results & Analysis
   5.1 Counting False Positives
   5.2 Counting Phi(n) With Assigned e
   5.3 Conclusion of Results
6. Conclusion
   6.1 Final Thoughts
7. References
   7.1 References
   7.2 Bibliography
A. Appendix A – Relevant Mathematics
   A.1 Fermat Theorem
   A.2 Euler Theorem
   A.3 Totient Function
B. Appendix B – Sample Data
   B.1 Sample 16 Bit Key Data
   B.2 Sample 16 Bit Keys Counting Candidates
   B.3 Sample 16 Bit Keys Counting Candidates, e Set as 17


1. Introduction

As the world becomes increasingly paranoid in the wake of modern terrorist acts both at home and abroad, the question of security, at both a personal and a national level, has become an important issue in today's society. Protecting our children, money and property against criminals becomes more difficult, especially as certain aspects of these tangible items become digital; for example, the majority of money (or at least its transactions) is now stored in digital form, and intellectual property is stored throughout the world in both public and private repositories. Some of this data is public, or is not thought important enough to protect from prying eyes; however, much of it is not. Few people would be happy with the idea that anyone could view their bank details or other personal information. Banks now store digital representations of a signature, which, coupled with other details, provides all the information a dishonest individual might need to conduct fraudulent acts, from stealing money to stealing a person's identity.

It used to be that data could be secured by limiting who had access to it, or possibly by hiding the very fact that the information existed, and in some respects this is probably the securest method; after all, a secret shared is no longer a secret. However, ever since the need to send a private message has existed, so has the need to ensure that even should the message fall into the wrong hands it would remain unreadable. The most famous early example of this is the Caesar shift cipher, where messages to his generals were written with each letter shifted 3 places. Obviously the security of this method relies on keeping the details of the cipher secret, but even if they were known it still presents a simple problem to solve, there being (with today's alphabet) only 26 possibilities for the message, the majority of which won't make sense. Since then, encryption methods have become much more advanced, making use of complicated mathematical theories and methods. These have developed into two branches of encryption: public key encryption and secret key encryption.

1.1 Aim

The original aim of this project was to investigate a possible brute force attack on the RSA encryption system. Initially it would closely follow and extend Matthew Diver's "A distributed Heuristic Brute Force Attack of the RSA Cryptosystem" [1] to gain a complete set of results, improve on the methods already implemented, and from there continue investigating new ideas along the same theme. However, based on results from a test run of the project code, this aim was changed to further examine certain artefacts found in the results and how, using a heuristic method, these might relate to certain properties of the RSA system.


1.2 Outline

The project initially consisted of following Diver's methods [1] and reproducing his results. Much justification is given as to how Diver may have calculated his boundaries, and this project's initial boundaries agree with Diver's. However, upon implementation it was found that Diver's methods may have been flawed in their method of calculation, and certain unexpected results were found. The project then further examines these properties rather than continuing to try to improve on Diver's methods. To examine these properties the system generates a large amount of data that shows these unexpected results and uses this for analysis.


2. Background Information

2.1 History of Encryption

The first known encryption technique was used by the Spartans [10] several centuries B.C. The technique involved wrapping a narrow strip of paper (or equivalent) around a thin stick or cylinder known as a "scytale" and writing the message along it. Once unwound, the message was unreadable unless rewound around another stick or cylinder of the same diameter. Essentially this type of encryption reorders the plaintext into an unreadable format; mathematically, all that is required to break the Spartans' encryption is to know the separation between letters, and once this is known the message can easily and quickly be deciphered.

Another form of early encryption was steganography, which to some extent is still used today; steganography consists of hiding messages in a public medium, usually trying to hide the fact that the message exists rather than scrambling it completely. For example, the use of a grille over a large piece of text can reveal a hidden message. A particularly early example of this method was around 500 B.C., when a Greek at the Persian court sent a message by tattooing it onto the head of a slave (invisible once the hair grew back) [10]. A more modern example of steganography is the belief that many terrorist organisations may use public television and written media to trigger events or pass important messages to particular cells both abroad and at home.

From these early seeds, encryption has developed over the past millennia into a complex and important technology and a lucrative way to make money. Today encryption is classified into two distinct groups: public key cryptography and secret key cryptography.

2.2 Secret Key Cryptography

Secret key cryptography works around the idea that some mathematical algorithm combined with a key is used to "scramble" the message in some way; to decrypt the message the recipient must reverse the mathematical process using the same key. The secrecy of the message relies on keeping the key a secret. However, there are many ways for secret key cryptography to be implemented, some much more secure than others.

A shift cipher is a cipher that involves applying a particular algorithm and key to each letter to produce the cipher text to be transmitted. The Caesar cipher is a perfect example of a secret key shift cipher, albeit an insecure one. The Caesar cipher is a shift cipher with the key set to 3, such that A becomes D and B becomes E; the algorithm is simply to add 3 to each letter. However, this cipher only has 26 distinct possibilities, for once 26 is used as the key the results are the same as using 0 as the key (the equivalent of the plaintext message).

Better secret key encryption relies on mono-alphabetic ciphers; these have a 1 to 1 mapping for each letter, giving 26! possible solutions to each problem, although with today's computers this is still an incredibly easy encryption to break using a simple brute force system. The most secure, and actually unbreakable, cipher known is the one time pad; this involves using a key, usually a string of random data as long as the message itself, to create the cipher text (in practice the key is often a way to generate the pseudorandom data rather than the data itself). It is impossible to decrypt the data without the key; for example the data "11111111" could mean any 8 letter string, depending upon the contents of the key.

The problem with secret key cryptography is the fact that it requires a secret key: the security of the message can only ever be as secure as the key. The main use of cryptography is to send messages such that if they were intercepted they remain unreadable; however, this does not address the problem of avoiding the interception of the key itself. The very fact that encryption is being used often means that the transmission medium, be it electronic, audible or through the postal system, is not trusted. This means that another medium must be used to send the key, and it is incredibly hard to implement a way of sending any type of data so as to completely remove the possibility of interception. So, in essence, the security of any secret key cryptosystem becomes reliant upon the security of its key distribution.
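To make the shift cipher idea described above concrete, the following is a minimal Java sketch; it is not part of the project code, and the class name and message are illustrative only.

```java
public class ShiftCipher {
    // Shift each upper-case letter by 'key' places, wrapping around the alphabet.
    // Decryption is simply encryption with the key 26 - key.
    static String encrypt(String plaintext, int key) {
        StringBuilder out = new StringBuilder();
        for (char c : plaintext.toCharArray()) {
            if (c >= 'A' && c <= 'Z') {
                out.append((char) ('A' + (c - 'A' + key) % 26));
            } else {
                out.append(c); // leave spaces and punctuation untouched
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String message = "ATTACK AT DAWN";
        String cipher = encrypt(message, 3);         // Caesar cipher: key = 3
        System.out.println(cipher);                  // DWWDFN DW GDZQ
        System.out.println(encrypt(cipher, 26 - 3)); // recovers the plaintext
    }
}
```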

2.3 Public Key Encryption

Public key encryption addresses the problem of key distribution. An extremely good analogy of the idea behind this is: two people are able to shout numbers at each other across a crowded room of mathematicians, and only those two people are able to understand what is being said. At one time this was thought impossible; however, today it is the basis for almost all modern cryptosystems. The idea for public key cryptography first came to light in a paper published by Diffie and Hellman, although it later turned out that James Ellis, working at the UK's GCHQ, had invented the idea earlier. It was kept secret, and by the time the idea was publicly released GCHQ had already developed working implementations.

Public key cryptography uses two different keys that are related through a particular mathematical function, one to encrypt and one to decrypt. The encrypting key is made public for all to see and use, and the decrypting key is kept secret; it is in this way that public key cryptography solves the key distribution problem. It is important to realise that once the message is encrypted, even the sender can't decrypt it (of course they may still have the original message). So if person A wanted to send a message to person B, B must first have created their own public/private key pair and made the public key available. A would use B's public key to encrypt the message and send it to B; only B has the private key that enables the message to be decrypted, so any intercepted copy of the message could not be decrypted using the available public key.

The idea behind public key cryptography is the one way function, or "trapdoor" [5]. It is easy to use the function to create an encrypted message, but not feasible to use the encrypted message and the function to re-create the original message. The main one way function used in today's cryptography is factoring integers [12]. Finding the prime factors of an integer is computationally expensive. It is important to know that it is not impossible to factor an integer into its component primes, but that it becomes increasingly more time consuming as the size of the integer grows.

Public key encryption can also be used to create digital signatures: a message encrypted with the sender's private key can only properly be decrypted by their public key. This can help ensure correct verification of the sender, as only they could have created the message (known as a signature).

2.4. Hybrid Systems

There exist today hybrid systems that make use of both secret key and public key encryption; the most common of these is called PGP (Pretty Good Privacy). Because encrypting and decrypting messages using public key cryptography can be computationally expensive compared to secret key cryptography, public key cryptography is now often used mainly as a key distribution method. An example of the PGP approach would be: if A wanted to send a message to B, A would generate a secret key for that session (a session key). The message is then hashed and the hash encrypted using A's private key (signing the message); this is added to the end of the message as a signature. The message and signature are then encrypted using the session key, the session key is encrypted using B's public key, and both the encrypted key and message are then sent to B. B reverses the process, decrypting the session key using his private key, using this to decrypt the message and signature, and then using A's public key to decrypt the signature and verify that it was A who sent the message.
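The flow just described can be sketched with the standard Java cryptography classes, purely as an illustration of the hybrid idea; PGP itself uses its own packet formats and algorithms, and the class name, key sizes and sample message below are assumptions made only for this example.

```java
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.Signature;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.SecretKeySpec;

public class HybridSketch {
    public static void main(String[] args) throws Exception {
        KeyPairGenerator rsaGen = KeyPairGenerator.getInstance("RSA");
        rsaGen.initialize(2048);
        KeyPair a = rsaGen.generateKeyPair();   // A's signing key pair
        KeyPair b = rsaGen.generateKeyPair();   // B's key pair (public part is published)

        byte[] message = "example message".getBytes("UTF-8");

        // A hashes and signs the message with A's private key.
        Signature signer = Signature.getInstance("SHA256withRSA");
        signer.initSign(a.getPrivate());
        signer.update(message);
        byte[] signature = signer.sign();

        // A generates a fresh session key and encrypts message and signature with it.
        SecretKey session = KeyGenerator.getInstance("AES").generateKey();
        Cipher aes = Cipher.getInstance("AES");
        aes.init(Cipher.ENCRYPT_MODE, session);
        byte[] encMessage = aes.doFinal(message);
        byte[] encSignature = aes.doFinal(signature);

        // A encrypts the session key with B's public key and sends everything to B.
        Cipher rsa = Cipher.getInstance("RSA");
        rsa.init(Cipher.ENCRYPT_MODE, b.getPublic());
        byte[] encSession = rsa.doFinal(session.getEncoded());

        // B reverses the process: recover the session key, decrypt, then verify the signature.
        rsa.init(Cipher.DECRYPT_MODE, b.getPrivate());
        SecretKey recovered = new SecretKeySpec(rsa.doFinal(encSession), "AES");
        aes.init(Cipher.DECRYPT_MODE, recovered);
        byte[] plain = aes.doFinal(encMessage);
        Signature verifier = Signature.getInstance("SHA256withRSA");
        verifier.initVerify(a.getPublic());
        verifier.update(plain);
        System.out.println("signature valid: " + verifier.verify(aes.doFinal(encSignature)));
    }
}
```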

2.5. RSA

The RSA algorithm is perhaps the most well known of today's public key encryption methods, and its strength is based on the problem of factoring integers. The RSA [2] algorithm is a public key cryptography system developed in 1977 by Ronald Rivest, Adi Shamir, and Leonard Adleman (hence "RSA").


Public key cryptography works by generating two keys, one public and one private. The public key is then distributed to third parties. To send a message, the sender encrypts it using the public key; on receiving the message the recipient can decrypt it using the private key.

The RSA algorithm is used as shown:

• Firstly, to generate keys, two prime numbers p and q are chosen; the length of these depends upon the size of the required key.
• Secondly, the modulus n is generated as p.q.
• phi(n) is now calculated as (p-1)(q-1); this is the totient function, the number of integers less than n that are co-prime to n.
• The public value e is now chosen; this must be co-prime to phi(n).
• Finally, the private value d is generated as the inverse of e modulo phi(n).

The public key pair is (e, n) and the private key pair is (d, n).

The encryption formula, used by a 3rd party sending a message m is:-

c = m^e (mod n)

Where c is the ciphertext created.

Upon receiving the message c the recipient decodes it using the formula:-

m = c^d (mod n)

For example, to create an 8 bit key, two primes are chosen to have a bit length of approximately 4.


p = 13
q = 11

n = 143
phi(n) = (13-1)(11-1) = 120

Choose e such that it is co-prime to 120, so for example e might be 23

d = e^-1 (mod phi(n)) = 47

Assuming m = 100 (almost all messages are converted to numeric values before sending)

c = 100^23 (mod 143) = 133

To decrypt this message

m = c^47 = 133^47 (mod 143) = 100
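The worked example above can be reproduced directly with Java's BigInteger class; this is only a toy sketch (the class name is illustrative), not part of the project software.

```java
import java.math.BigInteger;

public class RsaToyExample {
    public static void main(String[] args) {
        BigInteger p = BigInteger.valueOf(13);
        BigInteger q = BigInteger.valueOf(11);
        BigInteger n = p.multiply(q);                                // 143
        BigInteger phi = p.subtract(BigInteger.ONE)
                          .multiply(q.subtract(BigInteger.ONE));     // 120
        BigInteger e = BigInteger.valueOf(23);                       // co-prime to 120
        BigInteger d = e.modInverse(phi);                            // 47

        BigInteger m = BigInteger.valueOf(100);                      // message
        BigInteger c = m.modPow(e, n);                               // 100^23 mod 143 = 133
        BigInteger recovered = c.modPow(d, n);                       // 133^47 mod 143 = 100

        System.out.println("n=" + n + " phi(n)=" + phi + " d=" + d);
        System.out.println("ciphertext=" + c + " decrypted=" + recovered);
    }
}
```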

When dealing with the RSA cryptosystem it is important to understand the magnitude of the numbers involved. It is usually the size of the modulus that determines how difficult a particular key would be to break: the larger the modulus, the longer it would take to factor. It is important when choosing a key size to think about the required security of the data, especially remembering that the speed at which computers work today is not necessarily the speed available when the key is eventually attacked, as computer specifications improve almost daily. To account for this it is usually assumed that an individual key will have an expiry date, so for practical purposes of cracking an RSA key, it must be cracked before its expiry date.


The current recommended bit size for corporate use by RSA Laboratories is 1024 bits, with 2048 bits used for extremely valuable data [2]; it is to be assumed that government security services use bit lengths far exceeding this. A 1024 bit number is larger than 1 followed by 300 zeros; this is a great many times larger than the estimated number of atoms in the universe.

2.5.1. Proof of RSA

To further understand the workings of the RSA algorithm, and perhaps to help provide information that may make it easier to set boundaries or use particular methods to break the encryption, it is important to understand exactly how RSA works. To do this one must look at the proof of RSA. To prove the RSA algorithm we must first understand what exactly requires proving: RSA is based on being able to recover the original message from an encrypted message.

The RSA algorithm states that:

m^e ≡ c (mod n) – encrypt message m to c.

c^d ≡ m (mod n) – decrypt cipher c to m.

So we need to prove that (m^e)^d ≡ m (mod n), or m ≡ m^(e.d) (mod n).

We know that e.d ≡ 1 (mod phi(n))

So e.d – 1 is a multiple of phi(n)

For some integer k, e.d = k.phi(n) + 1


m^(k.phi(n)+1) = m.(m^phi(n))^k

Applying Euler's Theorem, which states that m^phi(n) ≡ 1 (mod n) for any m co-prime to n:

m^(e.d) = m^(k.phi(n)+1) ≡ m (mod n)

2.5.2. Generating a Secure Key

As has previously been mentioned, the security of any individual key depends on how it is generated; using an extremely low modulus, for example, can result in quick factoring and an easy cracking of the key. However, due to the nature of RSA it is possible that any particular crack attempt may break a key on the first go; although this is not at all likely, it is worth remembering. RSA state "Users should keep in mind that the estimated times to break the RSA system are averages only. A large factoring effort, attacking many thousands of moduli, may succeed in factoring at least one in a reasonable time." [2].

The main feature to bear in mind when generating keys is choosing the prime numbers p and q. Picking these two primes is a fine balance between making the difference between them too great, which can leave the system open to certain known factoring techniques that are fairly efficient in that case, and making the two primes so close together that both are approximately the square root of n. It was suggested that "strong primes" (a prime number p such that p-1 has a large prime factor) should be used, although RSA now acknowledge that such primes have no real advantage with today's factoring techniques [2].

Diver has an excellent graph [1], recreated in fig.1 with 16 bit numbers, that shows how the possible sizes of the search space may change with the difference between p and q. It illustrates that the search space size has a great variability when p and q are near each other in value; from the point of view of cracking RSA this makes it much harder to set boundaries as to where to search for phi(n), and obviously the complete search space must cover the minimum and maximum values for n. The same "cone" shaped graph can be seen at larger bit sizes, each with its own peak, increasing rapidly as the bit size increases, as illustrated in fig.2.


Fig.1 Possible search space shown for 16 bit numbers (p*q plotted against the difference between p and q). This data was generated from 1000 random 16 bit keys.

Fig.2 Search spaces between 16 and 32 bits (p*q plotted against the difference between p and q). Generated from 17000 random keys.

As can be seen in Fig.2 the search space range for a 32bit key is much larger than that of a 31bit key, between 3 and 4 times as big. This is a good example of how increasing the size of p and q (and thus n) can increase the strength of the RSA algorithm.

2.5.3. Phi(n)

If we examine how phi(n) is created we can cut down the range of values that it could possibly take. One improvement to Diver's method would be to check only even numbers: we know that p and q are both odd (accepting that 2 is too low to be used sensibly), so p-1 and q-1 are both even, and phi(n), being the product of two even numbers, must itself be even.

2.5.4. Search Space Boundaries

To determine the search space boundaries one must first look at the heuristics of the RSA algorithm regarding the value of phi(n). The largest search space for phi(n) would be a brute force search between 0 and n; for larger key sizes this space quickly becomes infeasible, and with current computing power would take longer than the lifespan of a human and, more importantly, longer than the lifespan of the privacy requirement of the encrypted data, so that even if the encryption were broken the data recovered would be worthless and most likely already in the public domain. We know that the highest value of phi(n) would be n-1 if n were a prime number; however, considering that n is the product of two primes, p and q, we know that this is not the case. This leaves us with estimating the possible values of phi(n) based on the possible values of p and q, such that

Phi(n) = (p-1)(q-1)

Being aware that the maximum values for p and q are approximately sqrt(n), this gives a first estimate for the maximum value of phi(n):

phi(n) = (sqrt(n) - 1)^2 = n - 2.sqrt(n) + 1


However, the lower boundary of phi(n) takes a little more thought. We know that p and q are roughly the same length (to avoid certain algorithms for factoring n) and that these values will have a bit length of around half that of n. We can therefore surmise that both p and q are bounded above by approximately 2^(b/2), where b is the bit length of n.

Since phi(n) = n - (p + q) + 1, this gives a lower estimate of n - 2(2^(b/2)) + 1.
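As a rough sketch of how these two estimates might be computed for a given modulus (the class and method names are illustrative, and BigInteger.sqrt() requires Java 9 or later):

```java
import java.math.BigInteger;

public class PhiBoundaries {
    // Upper estimate: phi(n) <= (sqrt(n) - 1)^2 = n - 2*sqrt(n) + 1.
    static BigInteger upperBound(BigInteger n) {
        return n.subtract(n.sqrt().shiftLeft(1)).add(BigInteger.ONE);
    }

    // Lower estimate: with p and q each of roughly b/2 bits (b = bit length of n),
    // p + q < 2 * 2^(b/2), so phi(n) = n - (p + q) + 1 > n - 2*2^(b/2) + 1.
    static BigInteger lowerBound(BigInteger n) {
        int halfBits = n.bitLength() / 2;
        return n.subtract(BigInteger.ONE.shiftLeft(halfBits + 1)).add(BigInteger.ONE);
    }

    public static void main(String[] args) {
        BigInteger n = BigInteger.valueOf(143);  // toy 8 bit modulus from the earlier example
        System.out.println(lowerBound(n) + " <= phi(n) <= " + upperBound(n));
        // prints 112 <= phi(n) <= 122; the actual phi(143) is 120
    }
}
```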

These are a logical starting place to begin a search for efficient boundary values, as they should encapsulate the entire possible range for phi(n) based on our knowledge of the bit size and of the value of n. The same values are used to create the basis for Diver's final boundary values. After generating several thousand keys using an adapted version of the server program, it was possible to further analyse these boundaries. The keys generated were 1000 keys for each bit length between 16 and 48 (the 48 bit limit was imposed here as the main program used to analyse the data would be Microsoft Excel, which has a maximum precision of 15 digits; 49 bit numbers can exceed this). The initial findings were similar to those of Diver [1], in that the actual phi(n) appears to lie towards the top of the search space generated by the boundaries, although his report suggests that only around 33% are found within the top 10% of the search space; my data suggests otherwise. A sample of data taken from a list of 16 bit keys, shown in fig.3, displays how phi(n) seems to skew towards the maximum boundary value rather than the minimum.


Fig.3 Sample of data from 16 bit keys, showing phi(n) against the difference between p and q, together with the minimum and maximum boundaries.

It is not just at the lower bit lengths that this can be observed; phi(n) appears to be heavily shifted towards the maximum boundary throughout the bit space, as can be shown by the single point taken from a 48 bit key shown in fig.4.

Fig.4 A single point showing one 48 bit key; due to the scale it is impossible to show both the separation between the boundaries and more than one point.


Looking further into the location of phi(n), it was calculated where, on average, phi(n) fell within the search boundaries. What was found was intriguing, as it suggested a reason to use bit lengths that were odd rather than even (the odd bit lengths created a phi(n) that fell into a much smaller search space). As RSA keys are generally cited as powers of 2 this was not a problem, although it does raise certain security issues should anyone choose to use an even number of bits for their key. This is shown in fig.5. It is also worthy of note that the difference between the average and the minimum for odd bit lengths is much smaller than the difference between the average and the minimum for even bit lengths.

Fig.5 The distribution of average and minimum percentages of the search space against bit length; it is interesting to note that two distinct lines are formed for each set.

Based on the values shown in fig.5, it was decided to increase the lower boundary for phi(n), as it appeared that on average at least 95% of the numbers being tested would never be phi(n). As the bottom line displays the minimum percentages for each bit length, a good starting point would be just below that. It was decided that the system should initially search within the top 6% of the search space (as opposed to Diver's 10%); this is approximately ±3% around the average position of phi(n), which is around 97% of the way up the search space created by the initial boundaries. After this, should the correct value not be found, the minimum boundary should be extended down towards the 85% mark, and if the correct value has still not been found, the minimum boundary should then extend to the original value (n – 2(2^(b/2)) + 1).
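A short sketch of how this staged widening of the lower boundary might be expressed, assuming the two boundary estimates above; the percentages follow the strategy just described, and the class and method names are illustrative rather than taken from the actual system.

```java
import java.math.BigInteger;

public class StagedSearch {
    // Position the lower search limit at a given percentage of the way up the
    // [min, max] range estimated for phi(n); 0% falls back to the full range.
    static BigInteger lowerLimitAt(BigInteger min, BigInteger max, int percent) {
        BigInteger range = max.subtract(min);
        return min.add(range.multiply(BigInteger.valueOf(percent))
                            .divide(BigInteger.valueOf(100)));
    }

    public static void main(String[] args) {
        BigInteger min = BigInteger.valueOf(112);   // estimates for n = 143 from the earlier sketch
        BigInteger max = BigInteger.valueOf(122);

        // Stage 1: search only the top 6% of the range, stage 2: widen to the 85% mark,
        // stage 3: fall back to the full estimated range.
        int[] stages = {94, 85, 0};
        for (int percent : stages) {
            System.out.println("search from " + lowerLimitAt(min, max, percent) + " up to " + max);
        }
    }
}
```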


2.6. Related Work

There are several different methods available to try and break RSA; the simplest of these, but generally computationally infeasible, is to factor n into p and q. According to RSA Security [2] it is also possible to break RSA by computing e-th roots modulo n, if an effective method of doing this could be found. Much of the problem of breaking RSA is equivalent to factoring n; however, there are other techniques that can improve the chances of reading individual messages without knowing the values for d, p or q. There are guessed plaintext attacks, which rely on guessing what could have been encrypted, encrypting it with the known public key and comparing the encrypted texts. Other techniques are available if the same message encrypted with the same exponent can be intercepted; using comparison techniques it is possible to work out what the message reads. It is important to note that these attacks do not mean that the private key is known, or that future messages will be easily broken. They are one off techniques designed to break a single message.

This project's aim was to extend and improve Matthew Diver's paper [1], so we shall first examine this as a reference point for the rest of the project. The paper begins with a brief introduction to public key cryptography and then a slightly more in depth explanation of the RSA algorithm. The paper then describes certain heuristics of the RSA algorithm; these are standard guidelines for choosing the values for p and q, and also how e and d are related to each other and to p and q (through phi(n) and n). The paper makes certain assumptions as to how p and q have been chosen and what follow on effects this has on the resulting values for n and phi(n). This has allowed an approximate value for phi(n) to be presented; these approximate values provide a starting point for my own project and allow me to begin searching for phi(n) around this area. Although one would agree that the maximum estimated value for phi(n) is a sensible value, the accuracy of the minimum bound appears to lack justification. The minimum boundary has been chosen using a probabilistic search on known values of phi(n) generated for keys of different bit lengths; the probability that phi(n) is within a certain percentage of the maximum search space has been generated. Again, some of this data appears to be flawed and incomplete. This project will re-create this data and confirm the correct values for the probabilities of phi(n) being in different percentages of the maximum search space.

Unfortunately Diver's paper leaves many questions unanswered which this project hopes to address. There is little resulting data, and even fewer conclusions drawn from it. It is felt that the boundary values for the maximum and minimum values of phi(n) can be vastly improved. It also appears that there has been no implementation to avoid checking values of phi(n) that cannot possibly be valid, such as odd numbers; avoiding these numbers could drastically reduce the number of possible values checked.

Other distributed attacks have been carried out on the RSA algorithm; in fact RSA itself runs challenges, and in 1999 RSA's 512 bit challenge number was factorised using the General Number Field Sieve method [9]. The GNFS is one of the fastest known factoring algorithms used today.


It is important to remember that the security of RSA has not been proven to be equivalent to the difficulty of factoring n, although it is thought that this might be the case.


3. System Design

The whole reason that RSA remains a strong cryptosystem is that the factoring of large numbers is time intensive on even the most up-to-date machines and architectures. The computational infeasibility of this means that even for bit lengths that are relatively small compared to the recommended 1024 bits [2], on a standard 1.8GHz Pentium 4 the estimated time to find phi(n) using Diver's search space would be many thousands of years [1]. Because of this, speed was an essential factor in system design.

The system was designed such that it could make the best use of the resources available, it had to be able to handle being spread across 1 or 2 machines or up to and over 100. It was hoped the system would be able to handle clients connecting and disconnecting all the time (to allow for possible extendibility in the future). The system setup was as follows, a central server would generate RSA keys and the clients would connect to the server, the keys would be split up into blocks to more easily manage the distribution. Once the client had received its block it would search through the range of numbers it had been given for the correct value of phi(n), there would be no contact between the client and server unless the client had found the correct value.

3.1 Finding Phi(n)

The idea behind the project was this: given the public key (e, n), is it possible to come up with the correct value of (d, n)? The project looked at the heuristics of the RSA algorithm and attempted to use these to compute phi(n) from (e, n). Once a possible value for phi(n) had been generated, a possible value for d could also be created using the known value for e and the possible value for phi(n). Typically a standard brute force attack on the RSA system would involve trying to factor n into p and q, and using these to create a value for d [5]. However, this project used known ideas behind the choosing of the original values for p and q (the heuristics of RSA) to generate a range in which one would expect the value of phi(n) to lie. It used this range to search through initially, and then, assuming phi(n) was not found in this range, the system would expand the range based on generated probabilities for the value of phi(n). The primary goal was to set boundaries for the maximum and minimum values of phi(n) that give a 100% success rate for finding the correct value of phi(n) first time round. Once we have a possible value for phi(n) we must decide whether it is correct; the steps for this are:

• Compare phi(n) to our known value of e; they must be co-prime (the greatest common divisor must equal 1).
• Generate a possible value for d such that d = e^-1 (mod phi(n)).
• Use d to decrypt some known ciphertext to compare for a match.
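A minimal Java sketch of this candidate test, assuming a known plaintext/ciphertext pair; the actual client performed the equivalent steps in C++ using the GMP library, and the names here are illustrative.

```java
import java.math.BigInteger;

public class CandidateCheck {
    /**
     * Returns the value of d derived from the candidate phi(n), or null if the
     * candidate fails any of the three checks described above.
     */
    static BigInteger testCandidate(BigInteger phiCandidate, BigInteger e, BigInteger n,
                                    BigInteger knownPlain, BigInteger knownCipher) {
        // 1. e must be co-prime to the candidate phi(n).
        if (!e.gcd(phiCandidate).equals(BigInteger.ONE)) return null;
        // 2. Derive a possible private exponent d = e^-1 (mod candidate).
        BigInteger d = e.modInverse(phiCandidate);
        // 3. The derived d must decrypt the known ciphertext back to the known plaintext.
        return knownCipher.modPow(d, n).equals(knownPlain) ? d : null;
    }

    public static void main(String[] args) {
        // Toy key from section 2.5: n = 143, e = 23, and the message 100 encrypts to 133.
        BigInteger n = BigInteger.valueOf(143), e = BigInteger.valueOf(23);
        BigInteger m = BigInteger.valueOf(100), c = BigInteger.valueOf(133);
        System.out.println(testCandidate(BigInteger.valueOf(120), e, n, m, c)); // prints 47
    }
}
```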


In the real world this method for determining the correct value for phi(n) is somewhat artificial, as a known plaintext/ciphertext pair is not usually available (if the plaintext were already known, trying to break RSA to recover it would be redundant). Perhaps a method involving matching decrypted text with dictionary words, or other known values, would be used; however, for the purpose of testing this idea we are using a known value to compare against, as we are only trying to test whether finding phi(n) is possible rather than deciding whether the plaintext generated is correct or not. However, certain file formats contain standard headers and footers, and depending upon the block size used in the encryption process it may be possible to exploit this to obtain an at least partly known plaintext/ciphertext pair.

3.2. Choice of Language

The two initial contenders for a language to implement this system were C++ and Java; both had their advantages and disadvantages. Java has a very extensive API and contains much of the functionality required to implement a distributed system; it also provides many supporting functions for large integers and cryptographic functions, such as easy creation of secure random integers [7]. C++ has a well established support structure and is extremely well documented; it provides the power and functionality required for a project such as this. Although it doesn't contain functionality for large integers or many of the required cryptographic functions as standard, freely available libraries [4] provide that functionality and support. C++ also has one major advantage over Java for this project: it is faster than Java [6]. For small applications this difference is usually insignificant, but over the amount of time this project was expected to run, and given the required processing, it could make the difference between finding a given value of phi(n) in a specified time or not. For this reason it was decided to implement the system using C++; it was felt this would have a great speed advantage over Diver's implementation using Java.

One of the main problems to overcome in designing the system was the hardware limitation on the size of integers; currently on most home computers, and on the machines being used for this project, this is 32 bits, giving an unsigned integer range of 0 – 4294967295. Currently the recommended bit length for keys is between 768 and 2048 bits; a 2048 bit number is approximately 10^617, or equivalent to a decimal number with 617 figures. Unfortunately, as stated earlier, standard data types in most programming languages only allow for integers of 32 bits (approximately 10 digits long). To overcome this, a way had to be found to allow large integers to be processed in C++.

To allow for these large integers it was decided to use the GMP library [4]; this is a free library for C++ that allows arbitrarily long integers, above 32 bits. It has support for many different operations that were useful throughout the project, such as finding prime numbers and computing the greatest common divisor. The library operates under the GNU LGPL [15]; this allows anyone to freely use, edit and distribute the library. The library is intended for cryptographic use and has, in particular, functions for manipulating numbers within a particular modulus; this was incredibly useful during the development of the software.

3.3. Distribution

Because of the processing required, this project was to be distributed across many machines, all working together to find the correct value of phi(n) for each key given. Each machine would be given a possible range for phi(n) and then search through it; if the value was found, the client would flag, either in a database or to a server application, that it had found the value of phi(n). There were several different options available to use as a distribution medium; although it was initially felt that almost every feasible implementation would make use of a database to hold certain data, this eventually changed.

The first option was that a simple server/client application could be created using C++ socket connectivity, although this would only allow data streams to be set up between a client and server, rather than extensive variable passing between the two. The server would access the database, which would contain certain values such as known plaintext and known values for d and phi(n). It would also contain some representation of the possible values of phi(n) that are distributed to the clients.

Another design could use the remote access possibilities of a database, and allow the information in the database to govern the actions of individual clients: each client would connect to the database to collect its values for phi(n) and could flag a Boolean value in the database if it found the correct value. A separate "governor" program would be responsible for generating and governing the actual data in the database; this would mean that no interaction between clients or the governor was necessary. It may be slightly easier to implement this method, although the resulting system might be limited in its extendibility.

The final method available was CORBA (Common Object Request Broker Architecture). This is a platform and language independent system. It would allow many of the clients to interact with the server using a common protocol. CORBA allows great extensibility and would allow clients written in languages other than C++ to easily interact with the server program.

One final problem left to solve was how to update the client software, as it would probably be necessary to make changes to the way it works, and it would be cumbersome and time consuming to update each client individually. The possibilities of batch processes, ftp requests and .dll downloads were looked at to allow the clients to automatically update themselves whenever required.


3.4. Final Design

After researching the technologies and implementations of CORBA, it was felt to be overkill for this particular project, coupled with the fact that there aren't many free distributions that fully comply with the CORBA standard. Secondly, it was decided that the server would probably need to be threaded to handle multiple client requests at once, and the quickest and simplest way to do this would be to implement a Java version. As the server itself wasn't going to be doing any heavy processing (unlike the clients), the speed element could be set aside, as having a slower server would not slow the entire system down. Once it had been decided to use a Java server and C++ clients, a communications method had to be devised. As CORBA had already been eliminated as not feasible for this particular project, and Java's RMI wouldn't work with C++ clients, it was decided to use the socket API that is available to both languages. This allowed byte streams to be set up between the client and server.

The basic idea for the system was that the server would create a key of a specified bit length and then set the boundaries for the search space (these would change during the course of the project as the boundaries were adjusted to find the optimum). The clients would simply connect to the server, request a block of numbers to try, and check these for the correct value of phi(n). If a client found the correct value, it would return that result to the server and the server would generate a new key; should the client not find the required result it would simply request a new block from the server. It was decided that, for easy collection and processing of data, the server would be the only point that held the data that was needed; this meant that each client would have to pass what data it had to the server at regular intervals. The server would output data as comma separated values.

3.5. Server Design

The server was written in Java; its main aim was to generate RSA keys and create minimum and maximum boundaries, based on the value of n, in which to search for phi(n). It would then separate this block of numbers into smaller blocks to be distributed amongst the clients. If a key was broken (the correct value of phi(n) found) then the server would store all the values of the key, and the time at which it was found.

The server operated by listening on a particular port for client connections via TCP (initially the port was 666, although it was possible to change this should it be required). As each client connected it was added to a list of clients and assigned an id and the job id of the current key being distributed. Once a client requested a key block, the server would pass it a text string of the required values (a starting point, end point, and values for e, n and d). It would then increment the starting point and ending point such that the next client request would be assigned the correct values; should the values extend further than the set boundaries then the server would create a new key, and reassign the starting and ending points for the next block to be distributed. The server also had the capability of shutting individual clients down by sending a particular command.

To handle the clients a threaded handler was created that would listen on the specified port for connections; once a connection was received an individual client thread would be created for that client. The client thread could then call other parts of the server system; however, to ensure concurrency throughout the system, for example to ensure that two clients weren't issued with the same key block for processing (which would waste valuable resources), any method that was accessed by a client thread was synchronized, ensuring that only one client at a time could perform a critical command on the server.

To enable the server to shut down at a specified time (or day) a TimerTask was created at the beginning; a TimerTask allows a thread to be scheduled to run at a specific time. The TimerTask simply caused a System.exit(0) command to be executed, thus shutting the server down; the clients were programmed to also shut down once the server did. To determine what time to shut down, the program was required to be passed a date and time on its command line. This was parsed to tell the TimerTask when to run.
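A much simplified, illustrative sketch of this server structure (listening loop, one handler thread per client, a synchronized block allocator and a TimerTask shutdown); it is not the dissertation's actual server source, and the class name, block size and message handling details are assumptions.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.Date;
import java.util.Timer;
import java.util.TimerTask;

public class KeyServerSketch {
    private long nextStart = 0;                 // start of the next block to hand out
    private static final long BLOCK_SIZE = 100000;

    // Synchronized so two client threads can never be issued the same block.
    synchronized String nextBlock() {
        long start = nextStart;
        nextStart += BLOCK_SIZE;
        return start + "#" + (start + BLOCK_SIZE);   // the real server also sent e, n and d
    }

    void run(int port, Date shutdownAt) throws Exception {
        // Schedule the server to stop at the requested time, as the real system did.
        new Timer(true).schedule(new TimerTask() {
            public void run() { System.exit(0); }
        }, shutdownAt);

        ServerSocket listener = new ServerSocket(port);
        while (true) {
            Socket client = listener.accept();
            new Thread(() -> handle(client)).start();   // one thread per connected client
        }
    }

    void handle(Socket client) {
        try (BufferedReader in = new BufferedReader(new InputStreamReader(client.getInputStream()));
             PrintWriter out = new PrintWriter(client.getOutputStream(), true)) {
            String line;
            while ((line = in.readLine()) != null) {
                if (line.startsWith("R")) {            // client requests a key block
                    out.println("R");
                    out.println(nextBlock());
                } else if (line.startsWith("F")) {     // client reports a found phi(n)
                    System.out.println("phi(n) candidate reported: " + line.substring(1));
                }
            }
        } catch (Exception ignored) { }
    }
}
```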

3.5.1 Random Keys

To generate a random key, it was decided to implement the server so that it generated random values for p and q of a specified bit length (half that of the desired modulus length). Generating random numbers is a problem for computers: usually the numbers are pseudo random, such that they appear random but are in fact generated from an algorithm, often given a seed so that the starting point for generating the numbers changes. Knowing that these numbers are not truly random can be exploited, especially from the point of view of cryptography. Luckily the Java language provides a particular implementation of a random number generator that is cryptographically secure, the SecureRandom class. It was decided to use this, even though for the sake of this particular project pseudo random numbers would have made little or no difference to the outcome; this was mainly to remove the "laboratory environment" from the testing and also to allow for possible expansion in the future.


3.5.2 Generating Prime Numbers

Another problem presented during the server design was that of generating prime numbers. It was undesirable to generate a large random number, check that it was prime, and if not generate another, as this would take a large amount of processing time for each key that was generated. The BigInteger class has an acceptable method, probablePrime(), which will generate an integer of a specified bit length that is probably prime; the chance of it not being a prime is no greater than 2^-100. This is acceptable for generating RSA keys [2].
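A short sketch of key generation along these lines, using SecureRandom and BigInteger.probablePrime as discussed above; the bit length and names are illustrative only.

```java
import java.math.BigInteger;
import java.security.SecureRandom;

public class KeyGenSketch {
    public static void main(String[] args) {
        SecureRandom random = new SecureRandom();        // cryptographically strong source
        int modulusBits = 32;                            // toy size; real keys are far larger
        BigInteger e = BigInteger.valueOf(17);           // public exponent

        BigInteger p, q, n, phi;
        do {
            // p and q each get half the bit length of the desired modulus.
            p = BigInteger.probablePrime(modulusBits / 2, random);
            q = BigInteger.probablePrime(modulusBits / 2, random);
            n = p.multiply(q);
            phi = p.subtract(BigInteger.ONE).multiply(q.subtract(BigInteger.ONE));
            // Retry if e is not co-prime to phi(n), or if p and q happen to collide.
        } while (!phi.gcd(e).equals(BigInteger.ONE) || p.equals(q));

        BigInteger d = e.modInverse(phi);
        System.out.println("p=" + p + " q=" + q + " n=" + n + " e=" + e + " d=" + d);
    }
}
```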

3.6. Client Design

The client was written in C++ because of its speed benefits. The aim of the client was to run at a specific time and connect to the server. Two parameters had to be passed to the client as command line arguments: the IP address of the server, and the port to which it should connect. Without these the client wouldn't know where to connect to; this also left the possibility of running two servers with separate jobs, with some clients connecting to one and some to the other.

The client worked by requesting a key block from the server, received as a text string, which the client would then parse and interpret. The client would loop through the range of numbers given, incrementing in twos, as we already know that phi(n) must be an even number. The client would check its value of e to see if it was co-prime to the test phi(n); if it was, it would then find the inverse of e modulo the test phi(n). The client would then use this to decrypt a message that had known plaintext. If the plaintext matched the text that the client had been passed, it would return that value of phi(n) as correct. If the client lost its connection to the server it would assume the server had shut down and shut itself down.
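The actual client was written in C++ with the GMP library; the Java sketch below shows the same search logic over a single block (stepping by two and deriving a candidate d), with illustrative names and the toy key from section 2.5 (for which the message 2 encrypts to 85).

```java
import java.math.BigInteger;

public class ClientLoopSketch {
    private static final BigInteger TWO = BigInteger.valueOf(2);

    /** Search one block [start, end] for a phi(n) candidate that decrypts the known text. */
    static BigInteger searchBlock(BigInteger start, BigInteger end, BigInteger e,
                                  BigInteger n, BigInteger knownPlain, BigInteger knownCipher) {
        // phi(n) = (p-1)(q-1) must be even, so only even candidates are tried.
        BigInteger candidate = start.testBit(0) ? start.add(BigInteger.ONE) : start;
        for (; candidate.compareTo(end) <= 0; candidate = candidate.add(TWO)) {
            if (!e.gcd(candidate).equals(BigInteger.ONE)) continue;   // e must be co-prime
            BigInteger d = e.modInverse(candidate);
            if (knownCipher.modPow(d, n).equals(knownPlain)) return candidate;
        }
        return null;   // not in this block; the real client would request another block
    }

    public static void main(String[] args) {
        // Toy key: n = 143, e = 23, plaintext 2 encrypts to 85, and phi(n) = 120.
        BigInteger found = searchBlock(BigInteger.valueOf(112), BigInteger.valueOf(122),
                BigInteger.valueOf(23), BigInteger.valueOf(143),
                BigInteger.valueOf(2), BigInteger.valueOf(85));
        System.out.println("candidate phi(n) found: " + found);   // prints 120
    }
}
```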

3.7. Communication

As has been mentioned, the server and client communicated with each other using the sockets interface over TCP/IP. This meant that most commands were sent in ASCII text, and the server and client both had to understand what was being communicated by the other party. To do this a code, almost like a protocol, had to be decided upon and implemented. It was decided that the first letter of each message would indicate the type of message; the message body would then follow, its format depending upon the message, and the message was terminated with a new line character. Due to the way the client handled strings, it was easier to separate messages from the server into lines, such that the first line contained the message code and the second contained the message body; a third line was unnecessary, as the end of line character of the second line could be used to terminate the message. It was decided to use single characters for the type of message, as it was then easy to add extra functionality to both the server and client should it be required.

Messages from the server were:

X – Caused the client to shutdown upon finishing its key block.
R – Sets the client ready to receive a key block. A key block was sent with the individual values separated by hashes; for example "2#30#3#33#7#4" would indicate that the client was to loop from 2 to 30 using e = 3, n = 33, d = 7.

Client messages were:

F – Told the server that a value for phi(n) had been found; the message body was the value of phi(n). So the reply to "2#30#3#33#7#4" would be "F20".
X – Told the server that the client was shutting down (usually abnormally).
R – A request to the server to send a key block; the server would then respond with message type R.

It was found that once both the server and client had been correctly set up to read in lines from the socket interface, this method worked extremely well; the time to parse the strings into actual numbers was minimal and did not appear to slow the system down at all.
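As an illustration of this message format, a small Java sketch that parses a key block string of the form given above; the field names follow the description, and the sixth field in the example string is not used here.

```java
import java.math.BigInteger;

public class KeyBlockParser {
    // Fields of one key block as described above: the loop range plus the key values.
    BigInteger start, end, e, n, d;

    static KeyBlockParser parse(String body) {
        String[] fields = body.split("#");
        KeyBlockParser block = new KeyBlockParser();
        block.start = new BigInteger(fields[0]);
        block.end   = new BigInteger(fields[1]);
        block.e     = new BigInteger(fields[2]);
        block.n     = new BigInteger(fields[3]);
        block.d     = new BigInteger(fields[4]);
        return block;
    }

    public static void main(String[] args) {
        KeyBlockParser block = parse("2#30#3#33#7#4");
        // Loop from 2 to 30 with e = 3, n = 33, d = 7, as in the example above.
        System.out.println(block.start + ".." + block.end + " e=" + block.e
                + " n=" + block.n + " d=" + block.d);
    }
}
```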


3.8. Deployment

There were several issues to overcome when deploying the system on multiple machines. The first was the problem of installing the client software on possibly hundreds of machines; however, this was easily solved as TSG have administration software that allows this to be done remotely from one source. The second was starting the software at a specific time; this could be done by scheduling tasks to start at a specific time, a capability of both the Windows XP operating system and Linux. The final problem to overcome was updating the software on the client machines remotely; this was addressed by storing a copy of the client software in a central location. The client machine ran a script that simply downloaded and ran another script stored alongside the central client version; this script could be edited either to just run the client on the machine, or to download a new one via ftp and run that instead. This solved the problem of editing code to fix bugs or address new ideas. The server would be run as a scheduled task from a script; the script would pass the date and time for the server to stop, which meant the script had to be updated every day, although passing the date value did mean that the server could run all weekend without interruption.


4. Data Collection

The data would all be output from the server as a file; initially it was to be piped straight from the Java output to a file. The output would be in comma separated values, a common format that would allow the data to be interpreted by custom code if needed, or by ready made professional software such as Microsoft Excel or, in fact, most other spreadsheet programs. Allowing the data to be represented in a format readable by a spreadsheet allowed easy manipulation of the data, and especially allowed graphs to be plotted to gain a visual representation of the data collected.

4.1 False Positives

The first test run of the code, purely to test that the algorithms worked, was aimed at running the code without a calculated boundary value (essentially a brute force attack between 0 and n). During this initial test run, based on Diver's methods, several anomalies in the returned results were noticed: it appeared that the value for phi(n) that was returned was significantly lower than the correct value of phi(n). After checking what these values were, and that they did indeed give the correct inverse of e such that the plaintext value could be decrypted, it appeared that there were other "candidate" values for phi(n) that could return the correct value of d from e. As there were now several possible values that would decrypt some plaintext, it was impossible for the system, without checking against the known value, to know whether it had found the correct value of phi(n) or a false positive. Unfortunately, allowing the code access to the actual value of phi(n) would serve no useful purpose in attacking the RSA system. However, the fact that these values gave the correct value for d was an incredibly valuable discovery, as it may allow a possible weakness to be exploited. Upon further analysis it seemed that phi(n) was a multiple of some of these false positives, although some still remain unexplained.

An excellent example of this is:

p = 3
q = 11
n = 33
phi(n) = 20

Choose e = 3, therefore d = 7

Consider that the system starts from zero and counts up to n, when it reaches 10

e.d = 21 ≡ 1 (mod 10)

The system has found a candidate value of phi(n) that gives the correct value of d. Notice that 20 is a multiple of 10, and that in fact, working modulo any value X that divides e.d - 1 (here 20) exactly, where e and d are individually smaller than X, the correct value of d would be generated. What this meant was that the system was finding earlier false positives for phi(n) that, while incorrect compared to the real value computed from the original p and q, could still generate the correct value of d from e. So despite not achieving the desired results, the initial method did highlight areas of further interest. Remembering that these results were found without using a calculated search space, it is worth pointing out that once the search space is reduced to optimise the search for phi(n) we are unlikely to come across these candidate values for phi(n), although this too may be worth examining at a later date. Based on these test runs it was decided to change the direction of the project and examine the idea of false positives more closely. The software and hardware already set up to try and find correct values of phi(n) was ideal for this, and only a few changes to the code were required.
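The behaviour described above can be reproduced in a few lines of Java: for the toy key p = 3, q = 11, e = 3, the sketch below lists every candidate X below n for which e is invertible and the derived exponent still decrypts a known ciphertext. The class name and sample message are illustrative.

```java
import java.math.BigInteger;

public class FalsePositiveCount {
    public static void main(String[] args) {
        BigInteger n = BigInteger.valueOf(33);    // p = 3, q = 11, true phi(n) = 20
        BigInteger e = BigInteger.valueOf(3);
        BigInteger m = BigInteger.valueOf(5);     // an arbitrary known plaintext
        BigInteger c = m.modPow(e, n);            // its ciphertext, 5^3 mod 33 = 26

        for (BigInteger x = BigInteger.valueOf(2); x.compareTo(n) < 0; x = x.add(BigInteger.ONE)) {
            if (!e.gcd(x).equals(BigInteger.ONE)) continue;
            BigInteger d = e.modInverse(x);       // candidate private exponent
            if (c.modPow(d, n).equals(m)) {
                // Every X printed here is a value the search could mistake for phi(n).
                System.out.println("candidate phi(n) = " + x + " gives d = " + d);
            }
        }
    }
}
```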

4.2. Adapted Aim

The adapted aim was now to examine how it might be possible to exploit the phi(n) candidates to further increase the chance of cracking an RSA key. The main aim of simply trying to beat Diver's cracking attempts was laid aside, and it was decided that instead more research would be done into the workings of RSA itself and how these false positives could be used. The aim was to try and understand why this might be happening, and how it could be predicted in a way that was useful to attacking the RSA system.

4.3. Adapted System Design

Once it was decided to further research the false positives, it became fairly unnecessary to distribute the system in the original way, as pure processing power was simply not needed as it was before; however, keeping the distributed system was going to allow the project to generate more data to analyse. Due to the change in the nature of the data required, much of it could now be generated on a single computer that simply ran the server and one client. It is worth remembering that a decision was made to search for false positives outside the generated boundaries; the implication of this was that for each key, many more values had to be checked compared to simply trying to crack it. It was felt that the size of the keys was not as important as it had been, so it was just as valuable to collect lots of data at lower bit lengths as less data at higher bit lengths.

The server was changed such that it no longer generated a new key every time a client returned a value; rather, it recorded the minimum value found for phi(n) coupled with the maximum value and, more importantly, the number of false positives for each key. The server generated a new key once every value between 0 and n had been checked. The message passed to the client was also changed: instead of passing a plaintext and ciphertext value to be checked against, all that was required was e, n and d. The client was changed such that, on finding a value of phi(n) that gave the correct value for d, it would pass a message to the server with that value but continue to work through the rest of the number range it had been given, instead of requesting a new block.


5. Results & Analysis
Due to the changing nature of the project, many of these results were collected on machines available on a small private network, using the same software developed for distribution. It is important to note that the same value of phi(n) may produce a different number of false positives for a different value of e. This means that the number of false positives may depend as much on the choice of the public key as on the choice of the two primes p and q.

5.1 Counting False Positives
Initially it was decided to count the number of false positives for each key over a distribution of around 1000 16-bit keys and 1000 24-bit keys. It is important to note that, as the boundary values are no longer in place, this is equivalent to running the original specification software on a much higher key length.
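This kind of distribution can be reproduced in miniature with a driver of the following shape. This is a condensed, single-machine sketch rather than the project's distributed implementation: it uses small 8-bit primes, a random public exponent and a direct-search inverse purely to keep the example short, and tallies the candidate counts into a histogram.

```cpp
#include <cstdint>
#include <iostream>
#include <map>
#include <numeric>
#include <random>
#include <vector>

// Count every X above d and below n with e.d = 1 (mod X): the candidates for
// phi(n), including the real value (p-1)(q-1).
static int count_candidates(int64_t e, int64_t d, int64_t n) {
    int count = 0;
    for (int64_t X = d + 1; X < n; ++X)
        if ((e * d) % X == 1) ++count;
    return count;
}

int main() {
    // Small illustrative prime pool (the project used 16 and 24 bit keys;
    // 8-bit primes are used here only to keep the sketch fast).
    std::vector<int64_t> primes = {131, 137, 139, 149, 151, 157, 163, 167,
                                   173, 179, 181, 191, 193, 197, 199, 211,
                                   223, 227, 229, 233, 239, 241, 251};
    std::mt19937 rng(2004);
    std::uniform_int_distribution<size_t> pick(0, primes.size() - 1);

    std::map<int, int> histogram;   // number of candidates -> frequency
    for (int trial = 0; trial < 1000; ++trial) {
        int64_t p = primes[pick(rng)], q = primes[pick(rng)];
        if (p == q) { --trial; continue; }     // require distinct primes here
        int64_t n = p * q, phi = (p - 1) * (q - 1);

        // Random odd e co-prime to phi(n), then d = e^-1 mod phi(n) by search.
        int64_t e;
        do { e = 3 + 2 * (rng() % 1000); } while (std::gcd(e, phi) != 1);
        int64_t d = 0;
        for (int64_t k = 1; k < phi; ++k)
            if ((e * k) % phi == 1) { d = k; break; }

        ++histogram[count_candidates(e, d, n)];
    }
    for (auto& [candidates, freq] : histogram)
        std::cout << candidates << " candidates: " << freq << " keys\n";
    return 0;
}
```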

The initial results suggested that although a significant number of keys had many false positives, the majority of keys had only a small number of candidate values for phi(n) that invert e correctly.

[Figure: frequency distribution; x-axis: number of candidates (0 to 700), y-axis: frequency (0 to 120).]

Fig.6 Frequency distribution for false positives in 1000 random 24-bit keys.

As we can see from Fig.6, the frequency falls off dramatically as the number of candidates for phi(n) increases, suggesting that for most keys we are unlikely to find a large number of false positives. However, if we examine the frequency table in Fig.7 we can see that approximately one tenth of all the keys have only one candidate for phi(n) (this being the actual value of (p-1)(q-1)); from the point of view of attacking RSA this means that almost 90% of keys will have at least one extra value, or false positive, to search for.

Values (number of candidates)  Frequency (keys)
  1                            106
 10                            357
 50                            382
100                             81
150                             36
200                             16
250                              6
300                              3
350                              7
400                              3
450                              2
500                              0
600                              1

Fig.7 Frequency table for phi(n)

5.2. Counting Phi(n) With Assigned e
Whilst researching RSA it was discovered that, rather than randomly picking a value of e, a Fermat Prime [16] is often used, as this speeds up certain operations. A Fermat Prime is a prime of the form 2^n + 1 where n is itself a power of 2, i.e. n = 2^m; the known Fermat Primes correspond to m between 0 and 4 inclusive. It was decided to re-run the counting of phi(n) candidates using a Fermat Prime, as it was thought this might make a difference to the number of false positives that can be classed as candidates for a particular key. The particular Fermat Prime used was 17, which satisfies 2^4 + 1.
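As a quick check on the definition, the sketch below lists the five known Fermat Primes and shows the gcd test that would still be needed before fixing e = 17 for a given key; the phi(n) value used in the test is an illustrative assumption.

```cpp
#include <cstdint>
#include <iostream>
#include <numeric>   // std::gcd

int main() {
    // The five known Fermat Primes F_m = 2^(2^m) + 1 for m = 0..4.
    for (int m = 0; m <= 4; ++m) {
        uint64_t fm = (uint64_t{1} << (1u << m)) + 1;   // 3, 5, 17, 257, 65537
        std::cout << "F_" << m << " = " << fm << "\n";
    }

    // Before fixing e = 17 for a key, e must still be co-prime to phi(n);
    // illustrative phi(n) value only.
    const uint64_t e = 17, phi = 45472;
    std::cout << "e = 17 usable for this key: " << (std::gcd(e, phi) == 1) << "\n";
    return 0;
}
```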


[Figure: frequency distribution; x-axis: number of candidates (0 to 160), y-axis: frequency (0 to 100).]

Fig.8 Frequency of phi(n) candidates for 16-bit keys without assigned e.

[Figure: frequency distribution; x-axis: number of candidates (0 to 16), y-axis: frequency (0 to 250).]

Fig.9 Frequency distribution of candidates for phi(n) for 16-bit keys with e set to 17.

As we can see by comparing Fig.9 with Fig.8, the number of candidates when using a Fermat Prime is greatly reduced. This indicates that using a Fermat Prime could perhaps make RSA more secure, and thus harder to crack by exploiting the false positive theory. However, the graph still shows that over 80% of keys had more than one false positive, so this continues to be an area of great interest when considering ways to attack RSA.


The most interesting graph obtained from these results is the spread of the minimum values of the phi(n) candidates found for 16-bit keys, shown in Fig.10.

[Figure: scatter plot of the minimum phi(n) candidate against n, with reference lines for Phi(n)/2 and Phi(n)/4; x-axis: N (0 to 70000), y-axis: Min Phi(n) (0 to 70000).]

Fig.10 Minimum values of phi(n) plotted against n; phi(n)/2 and phi(n)/4 are shown to highlight certain artefacts.

From the graph it is possible to spot the keys that had only one candidate: the top line displays these, as they are also the actual value of phi(n), which is proportional to n. The highlighted Phi(n)/2 line is much more interesting: a line of points forms along the halfway point of phi(n), suggesting that false positives may often lie at this value. Another line appears to be forming along the Phi(n)/4 line as well, although this may become more apparent with more data.
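One plausible explanation, offered here as a sketch rather than a result established by the project: e.d - 1 is always a multiple of phi(n), so any divisor of phi(n) — such as phi(n)/2 or phi(n)/4, both whole numbers when p and q are odd primes — also divides e.d - 1, and whenever d happens to be smaller than that divisor it inverts e to the correct d and therefore appears as a candidate. The short C++ check below demonstrates this for a single illustrative key; the values of p, q and e are assumptions for the example, not taken from the project's runs.

```cpp
#include <cstdint>
#include <iostream>

int main() {
    // Illustrative key only: p = 233, q = 197, e = 17.
    const int64_t p = 233, q = 197, e = 17;
    const int64_t phi = (p - 1) * (q - 1);          // 45472
    // d = e^-1 mod phi(n), found here by direct search for clarity.
    int64_t d = 0;
    for (int64_t k = 1; k < phi; ++k) {
        if ((e * k) % phi == 1) { d = k; break; }
    }

    const int64_t half = phi / 2, quarter = phi / 4;
    std::cout << "d = " << d << "\n";
    std::cout << "e*d - 1 divisible by phi/2: " << ((e * d - 1) % half == 0) << "\n";
    std::cout << "phi/2 is a candidate (d < phi/2): " << (d < half) << "\n";
    std::cout << "e*d - 1 divisible by phi/4: " << ((e * d - 1) % quarter == 0) << "\n";
    std::cout << "phi/4 is a candidate (d < phi/4): " << (d < quarter) << "\n";
    return 0;
}
```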

5.3. Conclusion of Results
These results suggest that phi(n) and its properties may have much more of a part to play in the breaking of RSA than was thought at the beginning of the project. It may be possible to study these properties further, either to create much more accurate boundaries or possibly to find an exploitable weakness in the choosing of phi(n). Of course, it may be possible to deliberately choose p and q such that the phi(n) they generate has no false positives; however, this simple act reduces the possible values of phi(n) for any p, q pair and may in turn increase the risk of the correct phi(n), and thus the correct value of d, being found, breaking that individual key. The mathematical explanation for the false positives is fairly easy to comprehend, yet there appears to be no previous work carried out in this particular area. For this reason it is believed that this would be a valuable area of research in the future, either to rule it out as an option for attacking RSA or to include its methods to further increase the chance of finding the correct value of phi(n). Given more time it would be useful to see how these properties hold with larger key sizes; it may be that the frequencies and distributions change as the key sizes grow, and should the frequencies increase and more false positives be found, the greater the chance of breaking a larger key.


6. Conclusion
To conclude, this project has been successful in that it has identified further areas of study in the RSA cryptosystem. Although the original aims were not met, the adapted aims were studied and produced enough results for analysis and for conclusions to be drawn from them. The results of the project concerning false positives proved to be extremely interesting, with many visible artefacts in the generated data (most visible in the graphs). The idea that there may be alternative methods of cracking an RSA key is often overlooked, with the majority of implementations attempting to factor n. Even if it turns out that guessing the value of phi(n) is of a similar complexity to factoring n, it remains an important avenue for research; it may even be that the existence of false positives for phi(n) further decreases the complexity, making this avenue the most desirable for attacking the RSA system.

6.1 Final Thoughts
My final thoughts on the project are that it was a partial success in examining Diver's [1] methods, and a complete success in starting to examine the false positives identified after the first test runs of the original code. I believe that further research into this area could yield interesting results, both in suggesting how to generate more secure keys (from the point of view of protecting information) and in how to attack keys. Large amounts of data were generated for examination, and this in itself was also a success, as the numbers involved are incredibly large, even for today's computers.


7. References
7.1. References
[1] Mathew Diver, 2003, A Distributed Heuristic Brute Force Attack of the RSA Cryptosystem.
[2] RSA Security, [online] http://www.rsasecurity.com, last accessed 10th December 2003.
[3] Mike Hanson, RSA Encryption, University of Minnesota.
[4] GMP, [online] http://www.swox.com/gmp/, last accessed 10th December 2003.
[5] Centre for Quantum Computation, [online] http://cam.qubit.org/articles/crypto/publickey.php, last accessed 11th December 2003.
[6] Eric Galyon, C++ vs Java, [online] http://www.cs.colostate.edu/~cs154/PerfComp/, last accessed 7th December 2003.
[7] James Gosling, [online] http://java.sun.com, last accessed 9th December.
[8] Advanced Micro Devices, [online] http://www.amd.com, last accessed 7th December.
[9] J. Buchmann, J. Loho, J. Zayer, An implementation of the general number field sieve, 1993.
[10] CES, [online] http://www.cescomm.co.nz/about/history.html, last accessed 28th April 2004.
[15] GNU LGPL, [online] http://www.gnu.org/copyleft/lesser.html, last accessed 1st May 2004.
[16] DI Management, [online] http://www.di-mgt.com.au/rsa_alg.html, last accessed 28th April 2004.

7.2. Bibliography
[11] C.W. Dawson, The Essence of Computing Projects, Prentice Hall, Harlow, 2000.
[12] N. Smart, Cryptography: An Introduction, McGraw-Hill, Berkshire, 2003.


[13] J.K. Truss, Discrete Mathematics for Computer Scientists, Addison-Wesley, Harlow, 1999.
[14] Herbert Schildt, C/C++ Programmer's Reference, Osborne McGraw-Hill, California, 1997.


A. Appendix A – Relevant Mathematics

A.1. Fermat's Theorem
If p is prime and a is less than p and co-prime to p, then

a^(p-1) = 1 (mod p)

A.2. Euler's Theorem
For a co-prime to n,

a^phi(n) = 1 (mod n)

Multiplying by a gives us:

a^(phi(n)+1) = a (mod n)

A.3. Totient Function
The Totient or Euler function of a natural number n, commonly written phi(n), is the number of positive integers less than n that are relatively prime to n. For any prime number p, phi(p) = p - 1.
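For completeness, a short numerical check of the two theorems above, using illustrative values only:

```cpp
#include <cstdint>
#include <iostream>
#include <numeric>

// Modular exponentiation by repeated squaring.
static int64_t pow_mod(int64_t base, int64_t exp, int64_t mod) {
    int64_t result = 1 % mod;
    base %= mod;
    while (exp > 0) {
        if (exp & 1) result = (result * base) % mod;
        base = (base * base) % mod;
        exp >>= 1;
    }
    return result;
}

int main() {
    // Totient by direct count, matching the definition in A.3.
    const int64_t n = 33;                       // illustrative composite, 3 * 11
    int64_t phi = 0;
    for (int64_t k = 1; k < n; ++k)
        if (std::gcd(k, n) == 1) ++phi;         // phi(33) = 20

    // Euler's theorem (A.2): a^phi(n) = 1 (mod n) for a co-prime to n.
    const int64_t a = 5;
    std::cout << "phi(" << n << ") = " << phi << "\n";
    std::cout << a << "^phi(n) mod n = " << pow_mod(a, phi, n) << "\n";   // 1

    // Fermat's theorem (A.1): a^(p-1) = 1 (mod p) for prime p not dividing a.
    const int64_t p = 13;
    std::cout << a << "^(p-1) mod p = " << pow_mod(a, p - 1, p) << "\n";  // 1
    return 0;
}
```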


B. Appendix B – Sample Data

B.1. Sample 16 Bit Key Data

Columns: Bit Length, p, q, n, phi(n), phi(n) as percentage of n, Minimum Boundary, Maximum Boundary, Search Space Size.

16  149  227  33823  33448  98.8912869940573  33312  33456  144
16  157  179  28103  27768  98.8079564459310  27592  27769  177
16  173  227  39271  38872  98.9839830918490  38760  38876  116
16  167  151  25217  24900  98.7429115279375  24706  24900  194
16  157  167  26219  25896  98.7680689576262  25708  25896  188
16  239  151  36089  35700  98.9221092299593  35578  35710  132
16  229  193  44197  43776  99.0474466592755  43686  43778  92
16  251  173  43423  43000  99.0258618704373  42912  43007  95
16  139  173  24047  23736  98.7066993803801  23536  23738  202
16  191  173  33043  32680  98.9014314680870  32532  32680  148
16  227  239  54253  53788  99.1429045398411  53742  53788  46
16  167  197  32899  32536  98.8966229976595  32388  32537  149
16  157  199  31243  30888  98.8637454789873  30732  30890  158
16  173  229  39617  39216  98.9878082641290  39106  39220  114
16  181  197  35657  35280  98.9427040973722  35146  35280  134
16  179  197  35263  34888  98.9365624025182  34752  34888  136
16  233  211  49163  48720  99.0989158513516  48652  48721  69
16  179  251  44929  44500  99.0451601415567  44418  44506  88
16  229  227  51983  51528  99.1247138487582  51472  51528  56
16  131  137  17947  17680  98.5122861759626  17436  17680  244
16  179  179  32041  31684  98.8858025654630  31530  31684  154
16  173  157  27161  26832  98.7887043923272  26650  26832  182
16  197  241  47477  47040  99.0795543105082  46966  47042  76
16  251  181  45431  45000  99.0513085778433  44920  45006  86
16  173  173  29929  29584  98.8472718767750  29418  29584  166
16  233  193  44969  44544  99.0549044897596  44458  44546  88
16  163  181  29503  29160  98.8374063654544  28992  29160  168
16  197  167  32899  32536  98.8966229976595  32388  32537  149
16  193  241  46513  46080  99.0690774622149  46002  46083  81
16  223  167  37241  36852  98.9554523240514  36730  36856  126
16  163  173  28199  27864  98.8120146104472  27688  27864  176
16  239  181  43259  42840  99.0314154280034  42748  42844  96
16  197  179  35263  34888  98.9365624025182  34752  34888  136
16  197  179  35263  34888  98.9365624025182  34752  34888  136
16  233  173  40309  39904  98.9952616041083  39798  39908  110

B.2. Sample 16 Bit Keys Counting Candidates

Columns: Bit Length, p, q, n, phi(n), Number of Candidates, Minimum, Maximum.

16  179  241  43139  42720  9   28480  42720
16  131  251  32881  32500  22  7592   32500
16  173  131  22663  22360  2   21758  22360
16  167  223  37241  36852  4   24568  36852
16  191  151  28841  28500  12  8916   28500
16  157  173  27161  26832  36  3984   26832
16  191  211  40301  39900  65  15050  39900
16  229  157  35953  35568  8   15852  35568
16  179  223  39917  39516  1   39516  39516
16  251  179  44929  44500  4   31250  44500
16  157  167  26219  25896  49  636    25896
16  173  193  33389  33024  31  9216   33024
16  241  197  47477  47040  9   15680  47040
16  173  131  22663  22360  4   13910  22360
16  191  191  36481  36100  2   31748  36100
16  233  223  51959  51504  7   34336  51504
16  211  229  48319  47880  22  22914  47880
16  179  173  30967  30616  29  11214  30616
16  167  229  38243  37848  2   37544  37848
16  167  151  25217  24900  19  6700   24900
16  233  191  44503  44080  4   35264  44080
16  131  251  32881  32500  2   32240  32500
16  157  151  23707  23400  3   11700  23400
16  157  227  35639  35256  26  5424   35256
16  191  173  33043  32680  5   21470  32680
16  199  137  27263  26928  32  6936   26928
16  233  131  30523  30160  7   17680  30160
16  193  227  43811  43392  10  7232   43392
16  157  131  20567  20280  16  12168  20280
16  193  251  48443  48000  21  20840  48000
16  241  229  55189  54720  38  11096  54720
16  181  179  32399  32040  7   19640  32040
16  167  251  41917  41500  16  15800  41500
16  229  197  45113  44688  2   44016  44688
16  157  137  21509  21216  1   21216  21216

B.3. Sample 16 Bit Keys Counting Candidates, e Set to 17

Columns: Bit Length, p, q, n, phi(n), Number of Candidates, Minimum, Maximum.

16  233  197  45901  45472  6   17052  45472
16  167  211  35237  34860  6   11620  34860
16  151  211  31861  31500  10  2100   31500
16  197  163  32111  31752  2   29484  31752
16  179  131  23449  23140  2   19580  23140
16  167  163  27221  26892  1   26892  26892
16  229  137  31373  31008  2   27744  31008
16  229  181  41449  41040  6   20520  41040
16  139  163  22657  22356  1   22356  22356
16  179  223  39917  39516  5   6586   39516
16  193  197  38021  37632  9   7056   37632
16  191  179  34189  33820  2   27056  33820
16  149  157  23393  23088  4   11544  23088
16  157  181  28417  28080  2   24336  28080
16  163  251  40913  40500  2   37800  40500
16  157  157  24649  24336  1   24336  24336
16  131  223  29213  28860  7   5772   28860
16  191  229  43739  43320  8   10830  43320
16  173  191  33043  32680  1   32680  32680
16  193  211  40723  40320  4   32760  40320
16  139  139  19321  19044  5   6348   19044
16  223  173  38579  38184  3   19092  38184
16  229  181  41449  41040  6   20520  41040
16  167  173  28891  28552  1   28552  28552
16  137  239  32743  32368  4   22848  32368
16  197  251  49447  49000  1   49000  49000
16  233  211  49163  48720  5   29232  48720
16  211  157  33127  32760  1   32760  32760
16  229  181  41449  41040  6   20520  41040
16  163  157  25591  25272  7   8424   25272
16  149  163  24287  23976  1   23976  23976
16  179  131  23449  23140  2   19580  23140
16  139  229  31831  31464  4   15732  31464
16  191  241  46031  45600  3   39900  45600
16  151  241  36391  36000  10  6750   36000