On 18 Nov 2018 at 18:09:04 user 'jcobban' wrote: Passwords

You all know the recommendations about managing your passwords:

  • Do not use ordinary dictionary words. In particular do not use 'password' as your password!
  • Do not use the same password for multiple sites
  • Use a mixture of character types: upper case, lower case, numbers, and special characters.

These may seem annoying at times, but they are just the simplest rules required to protect not only your own privacy but also the privacy of the other users of your web-sites.  In this essay I intend to explain how your password may be compromised, and how once your password is compromised not only are you at risk, but everyone else on the sites you use is at risk.  Using good passwords is like getting vaccinated against common childhood diseases: it protects not just you but others who may become infected because you are carrying the infection.

The risks addressed by the advice arise from the way that hackers work to break into sites.

The most common form of attack on a site is simply trying common passwords in combination with e-mail addresses.  As you know from the amount of SPAM which comes your way there is no way to keep e-mail addresses secret, and most people have user names which are closely related to their e-mail addresses. Furthermore if possible most clients try to use the same user name on all of their sites.  If a client has both the same user name and the same password on multiple sites then it is like leaving the door key under the welcome mat because hackers just try that combination on as many sites as possible.

The really dangerous form of attack is the type that results in the penetration becoming front-page news around the world.  This is the situation where the hackers manage to steal a copy of the site's client database.  Because information is so easy to move these days this sort of theft is unfortunately all too common.  This sort of attack is particularly dangerous because the hackers do not have to guess at the user names:  they appear in the clear in the database.  Because this sort of theft is so common, the passwords are of course not stored as they are entered.  Instead the site uses a mathematical transformation of the password which generates an extremely large number, and only that number, called a hash, is stored.  Your password is validated by applying the algorithm to the value you supply and then comparing the resulting number to the saved hash.
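The store-a-hash-and-compare scheme described above can be sketched in a few lines of Python.  This is only an illustration under stated assumptions:  SHA-256 is picked here because it produces a 256-bit number, and the function names `hash_password` and `verify_password` are made up for this example rather than taken from any particular site.  Real sites generally prefer deliberately slow password-hashing functions, which this sketch does not attempt to show.

```python
import hashlib
import hmac


def hash_password(password: str) -> str:
    """Return the SHA-256 hash of the password as 64 hexadecimal digits."""
    # SHA-256 is an assumption made for illustration only.
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def verify_password(candidate: str, stored_hash: str) -> bool:
    """Hash the supplied value and compare the result to the saved hash."""
    # compare_digest compares the two strings in constant time.
    return hmac.compare_digest(hash_password(candidate), stored_hash)


# Only the hash is kept in the user database, never the password itself.
stored = hash_password("To be or Not to be")
print(len(stored))                                    # 64
print(verify_password("To be or Not to be", stored))  # True
print(verify_password("to be or not to be", stored))  # False
```

Note that the site never needs the original password again:  signing on only requires recomputing the hash from whatever the user typed.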

The special class of formulae used for generating a hash for a password has two important characteristics:  there is no reverse formula that can retrieve the original password from the computed number, and the algorithm is extremely unlikely to generate the same number given two different passwords.  If there were a reverse formula, hackers in possession of a copy of the user database could just apply it to extract all of the passwords.  The numbers generated by these hash algorithms are typically enormous:  most of the algorithms in use today generate numbers which would require over 75 decimal digits to represent!  The numbers are usually represented as 64 hexadecimal digits.  The first characteristic of the hash algorithm means that even the administrators of a site have no way of knowing what your password is.  That is why the only option you are given if you forget your password is to change your password.  The second characteristic avoids what are called "collisions", which would assist hackers in identifying the exact algorithm used to generate the hash values.  Hackers identify the algorithm by assuming that there is at least one user who is using a common password.  The hackers have a large file which contains the hash values generated by all of the combinations of common passwords and common algorithms.  All the hackers need is to find one matching hash value and they know not only the password for that particular account, but also the algorithm which was used to obfuscate all of the passwords on the site.  This still does not give them access to all of the accounts on the site, but it does give them access to all of the accounts that are using weak passwords.
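To make the "large file" of precomputed hashes concrete, here is a minimal sketch of how such a table gives away both a weak password and the algorithm in one lookup.  The four-entry password list, the three algorithms, the user names, and the helper names `build_lookup_table` and `identify` are all assumptions chosen for illustration;  a real table would cover vastly more combinations.

```python
import hashlib

# Tiny stand-ins for the attacker's lists (illustrative assumptions).
COMMON_PASSWORDS = ["password", "123456", "qwerty", "letmein"]
COMMON_ALGORITHMS = ["sha1", "sha256", "sha512"]


def build_lookup_table(passwords, algorithms):
    """Map every precomputed hash to the (password, algorithm) that made it."""
    table = {}
    for algorithm in algorithms:
        for password in passwords:
            digest = hashlib.new(algorithm, password.encode("utf-8")).hexdigest()
            table[digest] = (password, algorithm)
    return table


def identify(stolen_hashes, table):
    """Return the accounts whose stored hash appears in the lookup table."""
    return {user: table[h] for user, h in stolen_hashes.items() if h in table}


# A hypothetical stolen database: one weak password, one strong one.
stolen = {
    "alice": hashlib.sha256(b"password").hexdigest(),
    "bob": hashlib.sha256(b"kT9#vq!2Lm0z").hexdigest(),
}

table = build_lookup_table(COMMON_PASSWORDS, COMMON_ALGORITHMS)
print(identify(stolen, table))  # {'alice': ('password', 'sha256')}
```

One match on the hypothetical account "alice" reveals that the site uses SHA-256, while the strong password on "bob" stays out of reach of the table.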

The site can protect against this sort of attack by simply appending a secret string to the password before applying the algorithm.  That is because the hackers must determine not only what algorithm is used to generate the hash, but also what the secret string is.  In particular any site which collects financial information about its customers, for example credit card numbers required to pay for services provided by the site, must take extra precautions because of the potential for fraud.  Of course hackers are more likely to attack such sites because otherwise there would be no way to recover the costs of breaking into the site.
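The effect of the secret string can be sketched as follows.  The value of `SITE_SECRET` and the function name `hash_with_secret` are hypothetical, and SHA-256 is again only an assumed algorithm;  the point is simply that the stored hash of a common password no longer matches anything in a table built without the secret.

```python
import hashlib

# Hypothetical site-wide secret; a real one would be long, random, and
# kept somewhere other than the user database.
SITE_SECRET = "hypothetical-site-secret"


def hash_with_secret(password: str, secret: str = SITE_SECRET) -> str:
    """Append the secret string to the password before hashing."""
    return hashlib.sha256((password + secret).encode("utf-8")).hexdigest()


plain = hashlib.sha256(b"password").hexdigest()
protected = hash_with_secret("password")

# The precomputed hash of 'password' no longer matches the stored value.
print(plain == protected)  # False
# Validation still works because the site knows its own secret.
print(protected == hash_with_secret("password"))  # True
```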

So if you use a common password and the site uses one of the common hash algorithms you are increasing not just your own risk but the risk to everyone else on the site, and the risk to all of your other accounts on other sites.  So in particular avoid any of the 10,000 most commonly used passwords as available from the site PasswordRandom.com.  This list contains actual passwords which were broken by hackers.  Note that the most common password is 'password'!

Which brings us to the question of what constitutes a good password.  The third guideline at the top of this page gives a general impression, but there is actually a mathematical formula that can be used to determine how good a password is.  That number is called the "entropy" of the password and is the number of different passwords that can possibly exist using the type of characters used for the password and the number of characters in the password.  That is the total number of characters in the chosen set raised to the power of the length of the password.  (Strictly speaking entropy is usually quoted in bits, which is the base-2 logarithm of this count.)  For example if you construct an 8 character password using only lower case letters the entropy is 26^8 or 208,827,064,576.  That may seem like a large number but hackers apply hundreds or even thousands of computers working in parallel and they can try that many passwords in a few hours.  This is why you are advised to use mixed case, numbers and special characters.  There are 95 characters that can be easily typed on an American English keyboard, so if the hackers must assume that any of those 95 characters may be used in a password then they must try 95^8 or 6,634,204,312,890,625 passwords, which takes almost 32,000 times longer!

One of my pet peeves is sites that arbitrarily restrict which characters may be included in a password, for example by explicitly defining that only some of the 32 special characters in the basic ASCII code page may be used.  For example while forbidding a password to begin or end with a space will probably avoid confusing results, there is no reason to forbid a space in the middle of a password.  For example "To be or Not to be" is unlikely to be tried by a hacker because almost no sites permit a space anywhere in a password.

But beyond that there is actually no particular reason to limit the characters used in a password to the 95 defined by the basic ASCII code page, particularly if you are not an English speaking user.  The algorithm used to construct the hash number works on any data.  For example one of the popular hash algorithms is used to verify that very large datafiles, for example DVD images, have not been corrupted in transmission.  Multilingual text is normally represented as a string of data bytes using multiple bytes for each character which is not part of the ASCII code page.  This is called Universal Coded Character Set Transformation Format – 8-bit or UTF-8 for short.  UTF-8 is not only used for foreign alphabets but also for emojis, Egyptian hieroglyphics, Mayan numerals, Norse runes, Chinese, Japanese, Thai, Tamil, Hindi, Arabic, Hebrew, and every other language representation you can think of (except for Klingon).  Since the hash algorithm works on any data it will work on UTF-8, except that most web-sites will not permit you to use it.  As of June 2018 UTF-8 can represent 137,439 different characters.  It is left as an exercise for the reader to calculate how much 137,439 raised to the 8th power is.
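The counts quoted above are easy to check, and the exercise is easy to hand to a computer.  The following is a small sketch;  the function names `password_count` and `entropy_bits` are made up for this example, SHA-256 is an assumed algorithm, and the French phrase is just an arbitrary non-ASCII password.

```python
import hashlib
import math


def password_count(alphabet_size: int, length: int) -> int:
    """Number of possible passwords: alphabet size raised to the length."""
    return alphabet_size ** length


def entropy_bits(alphabet_size: int, length: int) -> float:
    """The same quantity expressed in bits (base-2 logarithm of the count)."""
    return length * math.log2(alphabet_size)


lower = password_count(26, 8)      # lower case letters only
printable = password_count(95, 8)  # all 95 printable ASCII characters

print(f"{lower:,}")                       # 208,827,064,576
print(f"{printable:,}")                   # 6,634,204,312,890,625
print(round(printable / lower))           # 31769
print(f"{entropy_bits(26, 8):.1f} bits")  # 37.6 bits
print(f"{entropy_bits(95, 8):.1f} bits")  # 52.6 bits

# The exercise for the reader: 8 characters drawn from 137,439.
print(f"{password_count(137_439, 8):,}")

# A non-ASCII password: 19 characters, but 21 bytes once encoded as UTF-8,
# because each accented letter takes two bytes. The hash does not care.
phrase = "\u00catre ou ne pas \u00eatre"
encoded = phrase.encode("utf-8")
print(len(phrase), len(encoded))                # 19 21
print(len(hashlib.sha256(encoded).hexdigest())) # 64
```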

The best way to obtain a strong password, that is one which both has high entropy and is not likely to be guessed even by someone who knows your personal history and could therefore guess what you might use in your password, is to use a random password generator.  There are many sites on the Web that will do this for you.  Many of them will generate multiple passwords at a time:  just copy them to your computer and delete them from the list each time you change the password on one of the sites that you use.  I actually recommend that anyone administering a site include a random password generator button on the change password page of their site.  In particular if you click on the "contribute" button at the top right hand of every page on this site and register as a user of the site you will see both the effect of an algorithm which computes the "entropy" of any password you type in and a button which generates a new random password.
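For anyone curious what such a generator might look like, here is a minimal sketch using Python's standard `secrets` module, which is intended for security-sensitive random choices.  It is not the generator used on this site;  the name `generate_password`, the default length of 16, and the decision to leave out the space character (so that a password never begins or ends with one) are all assumptions made for this example.

```python
import secrets
import string

# The 94 printable ASCII characters other than the space.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def generate_password(length: int = 16, alphabet: str = ALPHABET) -> str:
    """Build a password by picking each character independently at random."""
    if length < 1:
        raise ValueError("length must be at least 1")
    return "".join(secrets.choice(alphabet) for _ in range(length))


# Generate several at a time, as many password sites do.
for _ in range(3):
    print(generate_password())
```

Because every character is chosen independently from the whole alphabet, a 16 character password from this sketch is one of 94^16 equally likely possibilities.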

But, you say, it is too hard to manage a different password on every service you use.  Not really.  Your browser will remember your passwords for you and store them in a file on your computer, cell phone, or tablet that you can view while you are signed on.  As long as sites use authentication on the signon page these stored passwords are safe. That is because the authentication mechanism verifies the identity of the remote site, for example that the site claiming to be your bank really is your bank, before it makes the password information available to the dynamic functionality sent to your browser by the remote site.  This is important because hackers often create fake web-sites with all the expected logos et cetera and then send the links to those sites in phishing e-mails.  You can ask your browser to display all of the username and password information which it has collected.

Any page used to sign on to a site should use encrypted transmission, identified by a URL starting with https:.  This means that every message exchanged between you and the site is encrypted and therefore extremely difficult for anyone, even the National Security Agency, to intercept and read.  Your browser will also show an indicator that your connection is secure, for example a green padlock.  Even if you are using a public wi-fi hotspot, where in theory a hacker could be monitoring your Web traffic, this encryption protects you. Never supply a username or password except to a web-site that uses encryption.

In particular never enter any private information in an ordinary e-mail, because in most cases e-mails do not use encryption:  e-mails are sent hop by hop around the world and there is no end-to-end session to exchange encryption information.  Furthermore the reason there is so much e-mail SPAM is that even e-mails which encrypt the body of the e-mail cannot encrypt the addressing information.  That is similar to conventional or snail-mail.  In snail-mail the contents of the letter are largely protected because they are sealed into an envelope, and in any event could be an encrypted message, but the destination and return addresses on the envelope must be readable by every post office along the path to the addressee.

So to communicate with anyone, for example the support department at a supplier, it is best to use a web form because that delivers the information directly from you to the addressee along a secure encrypted path.  Similarly it is safer to communicate with anyone using the features of a social-media web-site like Facebook than by sending an e-mail.  You can communicate with the author of this page in such a secure way by clicking on the Contact Author link at the bottom of this page.

James Cobban

