|
|
|
PasswordsHow to choose a good password |
|
|
|
|
||||
|
|
||||
LegendUser password = a string of characters known by the user, which is supplied to an encryption program. Hash = a sort of a signature of a string of characters, which does not reveal the original text (= there is no "decryption" of the hash which would lead to the original text). Encryption = process through which a string of characters is transformed into another string of characters by mixing it with the user password, result which is unreadable without the user password. Decryption = the reverse process of the encryption. This transforms the result of an encryption process into the original text by mixing it with the user password (which was used for encryption). Entropy = measure of the disorder / randomness of a password.
Text passwordsHow can you create a strong text password? Such a password has to resist to dictionary attacks for years, even if the attack is conducted with dedicated distributed processing systems. A text password should be long enough so that it can't be found out by using dictionary attacks, and easy enough to remember. The best method to form a password is to use only (small) letters with which to form words, written one after another. Tips:
Any language has tens of thousands of words, but even if we consider just 10'000 words for a language, a password formed from six words would offer 10^24 possible combinations. This is better and easier than any password formed from lots of weird characters. Strong text passwords have to be used only in few cases. You should not try to memorize more than two passwords at the same time. One password should be for your computer, and another one for your private key for asymmetric encryption. Note that encryption algorithms are much stronger than passwords are. For example, an encryption algorithm on 128 bits has more than 10^38 possible combinations. This means that for a password to be just as strong, it would have to be formed from ten words. So, don't worry about the strength of the encryption algorithm, worry about your password. See this for how to use the password generator from FileMatrix. See this for how to compute securely. See Diceware for how to choose a strong random password.
Dictionary attackA dictionary attack is one way crackers use to find out what password was used to encrypt something, or to find out what text was hashed. If there is no flaw in the encryption algorithm and the encryption password can't be retrieved in any other way, a dictionary attack is the only method possible to find the user password. When crackers try to find out your password using a dictionary attack, they simply try all combinations of the words from a dictionary against the encryption algorithm. A cracker can also try all combinations of the binary password, but this requires far more time to be useful; this is a brute force attack. To do this, they use a computer program which first generates a text password by cycling through all possible combinations of words from a given dictionary. Then, this text password is used to decrypt the message which has to be cracked. Usually, an implementation of an encryption algorithm has some sort of signature which allows the encryption program to know if the user password is the one used for encryption. Basically, the program uses the user password to decrypt a short text, then it creates a hash of the decrypted text and compares it with a stored signature. If the two match, the program considers that the user password is good and proceeds with the decryption of the entire encrypted message. A brute force attack requires an enormous amount of processing power because the cracker needs to check every binary combination of user passwords. Normally, this would make the method unacceptably slow. However, the total number of user password combinations remains stable because people can't remember long passwords, while computers become faster every year. This means that brute force attacks becomes more and more feasible. To make a dictionary attack unfeasible these days, you would have to use a lot of words in your text password.
Pre-computingOne solution to make dictionary attacks even faster is to pre-compute the signatures for every encryption and hashing algorithm. Once they are computed, they are stored (including their associated user password). Basically, the cracker pre-computes the signatures for all combinations of words (for example, up to 4 words), and stores them. When a password must be found out, a simple binary comparison is performed:
When a match is found, the user password associated with the pre-computed stored signature / hash is simply retrieved from the storage. The problem with this method is that it requires enormous amounts of memory to store the pre-computed data.
Biometric informationThe biometric information of a person is whatever data related to the physical appearance of that person. Some of this information changes, some doesn't. Fingerprints, for example, don't change in time. Some biometric information can be read only with special devices, so this is not important for passwords. However, some biometric information can be obtained with means available to anyone. This type of information is easy to remember. Here are some examples: birth date, city of birth, personal name (first and last), names of parents, height, shoe size (sole length). So, what good is such information, considering that it can be obtained by just about anyone? Well, it is not useful directly, but it makes passwords personal. For example, anyone could use a password which is easy to remember, like "silly password". If you add your biometric information to this password, you force a cracker to personalize his attack for you, and not generate an all-purpose dictionary of hashes. The biometric information acts like the "salt" used by encryption programs, which is an un-encrypted string of random bits used to make pre-computing beyond technical possibilities.
HashingThe main purpose of hashing is to strengthen (to dictionary attacks) the text passwords which you must remember. To hash your text password, you can use the password generator from FileMatrix or you can use BedazzlePass. Write your text password in the "Password" edit-box of the dialog. Click the "Generate hash" button to generate a hash value of the password and place it in the edit-box (instead of your original password). Now, use this hash value where you need a password. How does this help you? Well, on one hand you only need to remember a simple password, just a few words. On the other hand, since you use the hash value as password, anybody who attacks your password by brute force will have to crack a password of maximum length (= the hash), length beyond which is easier to crack the encryption algorithm than to crack the password. The cracker could also attempt to find out (through a dictionary attack) the text password which was used to generate the encryption password. This is exactly the point where hashing provides value. The cracker first puts together a combination of word from a dictionary. Since this is done very fast, it would be easy for the cracker to try all combinations of (say) 4 words. But, if you have hashed these words and used the result as password, you also force the cracker to hash each combination of words (because he needs to try the password which was actually used for encryption). Thus, he needs more time to try each combination. A single hashing is of little importance, but the hashing can be applied iteratively to require more time to get to the result. The hash can be computed while the user is typing the password. This way the user would not feel the time consumed for hashing. However, this is too complicated to implement, relative to its advantages.
Iterations As explained above, a single hashing slows down a dictionary attack in an insignificant manner. So, in order to increase the strength of your text passwords to dictionary attacks, you can increase the number of iterations of the hash. To choose a number of iterations, test how much time your computer needs to generate a hash, by increasing this number in steps until your computer needs about one second to generate a hash. This way you force the cracker to spend about one second for each combination of words that he wants to try. If you hash your password once, a cracker would need a specific time to find it. If you iterate the hashing 100'000 times, the cracker needs 100'000 times more time to find your password. My test system can generate 730'000 hashes per second. I consider a second is fast enough to generate the hash each time I have to use the password, but really slow for a cracker. Some encryption programs use some sort of hashing as described here, but to be sure you should do this yourself directly to the password. The effect is the same, but if the encryption program (where you will use the password) also uses user password hashing, it's even better because crackers will need twice the amount of time to crack your password.
Unknown iterations Note that the cracker doesn't know what number of iterations you used, and this means that he needs to waste even more time for each combination he tries. However, this only works if the encryption algorithm itself uses iterations to initialize the encryption. If the encryption algorithm starts directly with the user's password, the cracker only needs to test each choice for the number of iterations of the hash, one after another. He would not need to start over for each choice because the hashes are consecutive for each iteration. But, if the encryption algorithm also uses some iterative process (which starts in a different way than the hashing algorithm used by the user), the cracker does need to start over this iterative process for each password supplied to the encryption algorithm. An unknown number of iterations if good for one thing: it increases the memory necessary to store pre-computed hashes. The bottom line is that it's not worth to bother using some hidden number of iterations, which you would have to remember, because you are safer using one more word or biometric information.
Putting it all together Remember that the cracker could pre-compute hashes? Even though he needs a lot of memory to store the pre-computed data, he could still find enough memory to do this for passwords up to 4 words. After all, how many people would use longer passwords. At this point, your biometric information comes in. By adding your biometric information to your text password, you make it longer without making it harder for you to remember it. Sure, the cracker would probably know your biometric information, but he doesn't know which information you actually use and how you arrange it. The cracker would also be unable to pre-compute the hashes for all text passwords which are combinations of (say) 4 words, because these passwords are no longer the same for all people in the world. Each of these combinations becomes personal because it is added your biometric information. Thus, the cracker would need even more time and memory to pre-compute the hashes, and at this point his technical abilities are left in dust by his needs. In order to thwart pre-computing, the program which performs the hashing should concatenate (to the user's password) a predefined text, text which would be changed periodically; old values must be preseved in the program, in an option field.
Root passwordOne nice thing about iterated hashing is that you can use the same root password to login to different websites, if you add something different for each website (like its name). For example, use the password "Root password used First website" for the website "First website", and the password "Root password used Second website" for the website "Second website". This works because the hash that you use for one website can't be used to deduce the hash you use for another website. To do this you need the starting point, and that can be obtained only through dictionary attack. So, even though each website knows the hash you use to login to it, without your root password they can't obtain the hashes you use for the other websites. But, use this with caution. If your text password is obtained from one hash, using a dictionary attack, all you other passwords with the same root are exposed.
Plain explanationEncryption algorithm strength is never measured in entropy, but in processing power required to crack it by brute force or dictionary attacks. Password strength is measured in entropy. For the same password entropy, encryption algorithms will be weaker in 100 years from now (because computers will be faster). The entropy of passwords is what makes pre-computing of encryption algorithm solutions more difficult.
What does "iterated" mean? An iterative process is a cycle. For example adding 1 iteratively N times would give you a total N. It's ((1 + 1) + 1) + 1... Basically, each time you perform the operation again, you do it on the previous result. In the case of addition, there is a formula which gives you the result right away (= N * 1), but there are functions (like hash functions) which don't have a reduction formula, so you are forced to calculate all N steps. This takes time.
Do I have to remember all that nasty looking digits and letters? No. If your text password is "SomePassword", you need to remember only this. But every time you have to encrypt something, you hash it and use the result as password (for 100'000 iterations you get "5D6A E83B F0EE BDDE 350C 9999 99EE F5F7 8BC0 A934"). It only takes an extra second for you.
Anything like this is cracked once the method is known because the space to look in is small and the hashing doesn't increase the entropy. True, but that's not easy to do. If you consider that there are about 10'000 in a dictionary, for a password of 3 words there are 10^12 combinations. The cracker needs to try (virtually) all combinations to find out your text password. Let's say he can generate 10^10 (non-iterated) hashes per seconds. Thus, he needs 100 seconds to try all passwords. Now, here is the thing. Let's say your text password is "stuff flying around". It you hash it iteratively 100'000 times, you get "1DC8 42C7 E7D3 3D22 3774 7310 1F2A 8C18 18B7 AD03". You can get this done in real-time (= less than a second). Then you use this hash to encrypt what you want. Obviously, the cracker can't try all combinations of the hash because he would start from 0 and would need a lot of time to get to the hash you used. So, what can he do instead? He can try all combinations of 3 words, but he must also hash each of them 100'000 times. So instead of 100 seconds, he needs 100 * 100'000 seconds (which is 10^7 seconds) to find your text password. At this point you could say that the cracker can pre-compute the hashes. True, but here is where the user's biometric information kicks in. The user puts his name in the password: "stuff flying around john doe". For the user, this is very easy to remember because it's information he uses all the time, or information which he can retrieve at any time (like his height). Yet, the cracker is confronted with a problem: he can no longer use the pre-computed passwords made of 3 words. He is forced to pre-compute more passwords, and this requires more time and memory. The name alone would be equivalent with one more word; your birth date the same.
Okay, but why not simply use more random words? Because people can't remember too many unrelated words. Using this method, you get the equivalent of 4 more words with no effort.
Okay, so this works, but that's only if nobody knows on your personal information. It doesn't matter if anyone knows or not your biometric information. The biometric information is there only to make it (almost) impossible for the cracker to pre-compute the hash. It would simply take him too much time and memory.
So, if the cracker would know that my text password is either "1" or "2" and I hashed it 100'000 times, he would have to compute 100'000 hashes in order to find out the hash password (the one I actually used for encryption). Thus, even though I only have 1 bit of entropy in my text password, the cracker would have to do a lot of work to find it. Right. Hashing doesn't increase the entropy, but has a similar effect. It's all about processing power. Consider this, if I would use my computer to make an encryption and you would know I only have to choices for my password, but you would have to do the decryption manually, would it matter the entropy? No, it's all about processing power. Of course, having just two choices for my text password means that the cracker can pre-compute the solutions of the encryption (for these two passwords). This is the reason for the addition of biometric info. The cracker can not pre-compute the hashes before he knows who to attack, because he doesn't know the biometric information of the person who has to be attacked.
So, the goal is to force the cracker to do more work. This can be accomplished either by increasing entropy or by adding computational complexity (or a combination of both). Right! Of course, it's a matter of balance: how much time do you need to generate the hash compared with how much time does the cracker need.
ConclusionWhen you need to use a password for an encryption program (like your PGP passphrase), password which must be strong and easy to remember, but must not be stored, do the following:
Use as password the text which results from this. You don't have to worry about losing the hashed password because it can be reproduced from the text password you remember. Simply apply the hash algorithm again on your text password. Don't write down the hash! Every time you need to use the password (for example, to encrypt something), hash your text password again and use the result. Don't try to remember the hash, and don't store it. Note that in order to reproduce the hash of your text password, you must remember and use both the text password and the number of hash iterations. |
||||
|
|
||||