I’ve done some research on hashing and information security at the Fraunhofer CESE. After some recent data record breaches, I thought it would be relevant to share some security considerations for users and administrators.

User account databases are common targets for hackers that are frequently breached. To prevent attackers from gaining access to user accounts, user passwords should never be stored in plain string form, whether in a text file or a database. Passwords are generally encrypted with a hashing algorithm, such as one of the SHA family, so that even if an attacker were to gain access to the database, they would not be able to access the passwords directly.

As with any security measure, attackers have found ways to retrieve user passwords, despite hashing, through clever methods like rainbow tables. Database administrators have responded in kind by introducing their own countermeasure to these attacks: adding salt, an artificial addition to a password.

Hash cracking: attacks

Brute force

The simplest form of any attack is the brute-force attack. For hash-cracking, this means taking a set of characters, hashing it, comparing this guess hash to the hash being cracked, and repeating this for every set of characters until a match is found.

Because the entire possible password space is explored a brute-force attack technically guarantees a solution, meaning it cannot be prevented. However, it can be made ineffective to the point of being impractical to even try by increasing the time it takes. One common way of doing this is by making the hashing algorithm slower, requiring the attacker to spend more time computing hashes.

Another way to make brute-force attacks less effective is to make passwords longer. Modern processors can compute the hashes for all 3 character passwords very quickly, but the time for computing all hashes for x characters increases exponentially with x.

Jeff Atwood did some experimenting in 2012 with his own hardware (i.e. two ATI Radeon 7970 GPUs, which is pretty impressive for a personal desktop but nothing compared to what a dedicated cracking rig would have in 2016) and came to the conclusion that cracking a 12 character password hashed with MD5 (which is now considered to be insecure, even) takes about 75 days - very impractical for an attacker for a single password. Even if they get access to the database, is it worthwhile to wait 75 days to crack a single password?

Lookup and Rainbow tables

Attackers got around the problem of computation time by precomputing the hashes and putting them in a lookup table. This allows attackers to quickly retrieve the corresponding password for a hash they find in the breached database.

Because the size of the possible password space could be trillions, lookup tables are memory expensive to the point where it’s impractical to be able to store every single possible password. Attackers then traded off hash-cracking speed for more storage space, allowing the table to store more passwords and increase its effectiveness. A lookup table with this tradeoff is called a rainbow table.

The idea of a rainbow table is terrifying to a database administrator. Up to a point limited by storage space rainbow tables can crack any password with 100% success. There exist MD5 rainbow tables that can crack all passwords up to 8 characters in length.

Lookup and rainbow tables are giant monsters that threaten the peace and security of database records across the internet. Luckily, these monsters happen to be giant slugs - and salt defeats the slug.

Adding salt

A salt is a string that is appended or prepended to a password before it is hashed.

salt("weakpassword") -> "NaCl-weakpassword"

By changing a string right before hashing, the mappings between passwords and hashes are changed completely. Rainbow and lookup tables are made useless because all of their precomputed mappings no longer apply!

Proper salting

  1. A salt should be unique to each user. If there were a single salt for every password, the attacker could just precompute a lookup table with the salt. Randomization and uniqueness prevents this.
  2. A salt doesn’t need to be hidden, just attached to the user. Salts protect against lookup tables simply by preventing precomputation. It is recommended to store the salt in the user entry in the database.
  3. A salt should be generated randomly using a cryptographically secure pseudo-random number generator (CSPRNG). Most programming langauges come with its own implementation of a CSPRNG (e.g. Java’s SecureRandom).
  4. A salt should be sufficiently long to prevent attackers from creating lookup tables for all possible salts (possible to do for a short salt). A rule of thumb is that the salt should be at least the same size as the output of the hash function. For example, when using SHA-512 (a 256-bit hash), the salt should also be at least 256 bits in length.

That’s all nice; but what’s a person to do?

As an administrator, choose a good password encryption function. The defacto standard for password hashing right now is bcrypt. Despite it being around for over 10 years and its cipher for over 20, it is well vetted and tested and is accepted by the security community as standard. Not only does it salt its passwords, it has a cool adaptive feature where it varies hashing speed with iteration count: more attempts, and the algorithm will take longer, making it more resistant to brute-force attacks. It also comes in a couple of libraries for common languages technologies, such as Ruby and node.js. PBKDF2 and scrypt are both good alternatives. Stay far away from MD5, however; it is widely considered to be insecure.

As a user, choose a password that is at least 12 characters. The 12 character length isn’t only suggested by Jeff Atwood, but also the Georgia Tech Research Institute and Taylor Hornby of Defuse Security.

Using a sentence as a password

It is widely suggested that you should choose a sentence as your password, rather than a cryptic and complicated sequence of special characters. Not only is it easier for a person to remember, it can be pretty long and secure. This suggestion isn’t only made by a popular xkcd; researchers at the Georgia Tech Research Institute also agree.

Of course, progress in security is always met with new techniques for attack. Numerous papers have found that the trend towards sentence and phrase-like passwords has actually resulted in lower password security because attackers can now anticipate passwords to have grammar, and thus certain patterns of words, making them easier to attack. The effectiveness of their grammar-based guessing schemes is pretty alarming.

The Weir approach cracked significantly more passwords than John the Ripper:

In one series of experiments, training on a set of disclosed passwords, our approach was able to crack 28% to 129% more passwords than John the Ripper, a publicly available standard password cracking program.

The University of Ontario did even better than Weir:

The Weir approach is in fact the worst, with our approach cracking approximately 67% more passwords than it.

Carnegie Mellon actually used Parts-of-Speech tagging and tried to identify a relationship between sentence structure (i.e., it’s tag-rule) of a password and its difficulty to crack:

From the guessing effort and time estimates, we see that passphrase strength is not a direct function of the number of words or characters in the passphrase. The passphrase “Th3r3 can only b3 #1!” has more words than “Hammered asinine requirements.”, but is one order of magnitude weaker. Similarly, “Hammered asinine requirements.” has more characters than “Superman is $uper str0ng!”, but is one order of magnitude weaker. Underlying structures and not just the number of characters or words determine the strength of a passphrase.

Unhelpfully, it seems that they were unsuccessful in finding such a relationship:

More research is necessary to fully understand the effect of structures on long passwords.

So it seems that using a sentence or phrase as a password is quickly becoming a security mistake. Though these are recent developments, there is no doubt that attackers are developing new cracking methods based on grammar identification and tag-rules. So what does that mean we should do about our passwords?

Bruce Schneier suggests taking a sentence or phrase, but abbreviating it in a non-uniform way:

My advice is to take a sentence and turn it into a password. Something like “This little piggy went to market” might become “tlpWENT2m”. That nine-character password won’t be in anyone’s dictionary.

This is excellent: though it is a little harder to remember and not quite as long as a simple sentence password, it eliminates grammar as information that attackers can use to more efficiently guess passwords.

The bottom line for your passwords

Use the Schneier method to generate your own passwords that are at least 12 characters in length, being sure to mix up abbreviations, words, capitalization, and special characters. Doing so should keep your passwords resilient against cracking threats for the forseeable future - at least, until a new development in the security world results in a new type of attack!