The weakest point of any one-way hash algorithm, such as the MD5 and SHA series, is that one string will always generate the exact same hash string, no matter how many times you hash it. This leads to a major downfall in that common words have common hashes. Take a PHP script that simply hashes the string buddha and hands us back an MD5 hash: 4b3bd325788f47666ec36669c8aaf5d0.
No matter how many times I execute that PHP script, the resulting hash for buddha will always be the previously stated 32-character value. Because of this unavoidable downfall, many databases have been created that will accept a 32-character MD5 hash, or a 40-character SHA1 hash, or even a 64-character SHA256 hash (although it is very rare to find a database that stores these, given how rarely they appear in common applications), and give me back the word. No Albert Einsteins were resurrected to devise a mathematical equation that reverses the hash. As of the time of writing, MD5 and SHA1 are both still irreversible, despite their source code being readily available for the entire world and its gremlins to see. All such a database has done is perform a simple query:
SELECT myWord FROM myTable WHERE myHash = '4b3bd325788f47666ec36669c8aaf5d0'
You see, if the password were scrambled before hashing into a nonsensical word such as !7ex?buddha, then the resulting hash would return null when entered into a database of commonly stored hash strings. Notice that my password, buddha, is still in the string, but hashing the aforementioned string would give me a totally different hash: 936cc740307df25a1aca31b3afda2d3d.
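To make the lookup idea concrete, here is a minimal sketch in Python (chosen purely for illustration, rather than the PHP and SQL used elsewhere in this article). The word list and the names md5_hex, LOOKUP_TABLE and reverse_lookup are my own assumptions, standing in for a real database of common hashes.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Return the MD5 digest of text as 32 hexadecimal characters."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


# Hypothetical stand-in for a public lookup database: hash -> word.
COMMON_WORDS = ["password", "letmein", "buddha", "dragon"]
LOOKUP_TABLE = {md5_hex(word): word for word in COMMON_WORDS}


def reverse_lookup(hash_hex: str):
    """Mimic the SELECT query: return the word for a known hash, else None."""
    return LOOKUP_TABLE.get(hash_hex)


plain_hash = md5_hex("buddha")              # a common word, so it is in the table
salted_hash = md5_hex("!7ex?" + "buddha")   # same password with the salt in front

print(reverse_lookup(plain_hash))    # buddha
print(reverse_lookup(salted_hash))   # None
```

Nothing is being reversed here: the table only works because the same word always hashes to the same value, and it stops working as soon as the hashed string is no longer a common word.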
You may now be asking yourself how, if his password is buddha, a user is going to remember the new password that we have constructed for him. The answer is, he doesn't. He still only has to remember the password he entered, and we add the rest when he attempts to log in. The only aspect we are concerned about is having a unique hash, so that individuals who somehow steal our hash strings from the database cannot look them up and then cheekily access all our members' accounts.
In the example that I have crafted, !7ex?buddha, the !7ex? part would become our salt, whereas the word buddha would be the normal password that we enter. Upon logging in we prepend !7ex? to buddha and MD5 them together. The hash would match the one in our database, provided we hashed !7ex? with buddha in the first place when the user registered. However, it would not match any common MD5 strings and that's definitely a big plus.
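The register-then-login flow just described could be sketched roughly as follows. This is Python for illustration only; the in-memory members dictionary and the function names are assumptions standing in for a real members table, and the salt is the !7ex? example from above.

```python
import hashlib
import hmac

STATIC_SALT = "!7ex?"  # the example salt from the text, shared by every user

# Hypothetical in-memory stand-in for the members table: username -> stored hash.
members = {}


def salted_md5(password: str) -> str:
    """Prepend the salt to the password and MD5 the two together."""
    return hashlib.md5((STATIC_SALT + password).encode("utf-8")).hexdigest()


def register(username: str, password: str) -> None:
    """Store only the salted hash, never the password itself."""
    members[username] = salted_md5(password)


def login(username: str, password: str) -> bool:
    """Re-apply the same salt to the entered password and compare hashes."""
    stored = members.get(username)
    if stored is None:
        return False
    return hmac.compare_digest(stored, salted_md5(password))


register("exampleUser", "buddha")
print(login("exampleUser", "buddha"))   # True
print(login("exampleUser", "buddah"))   # False
```

The user only ever types buddha; the salt is added behind the scenes both when registering and when logging in, which is why the two hashes line up.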
If you can find a way to transform 936cc740307df25a1aca31b3afda2d3d into 4b3bd325788f47666ec36669c8aaf5d0 then you're wasting your life away on the Internet! You may be able to do it once in this instance, but coming up with an algorithm that transforms them all into their common hash, by removing the salt, is certainly a little more tricky.
Speaking only in essence, there are two ways of determining a salt. See our article on randomising values for generating a random salt. The two ways are as follows:
- Static salt: the salt is always the same for every password hashed.
- Dynamic salt: each user has a unique salt which is stored in the database alongside their hashed password.
Using the static option, guessing one salt, although pretty harmless on its own, would mean you know every single salt. With the dynamic option, however, you would need to acquire the salts individually, because everyone has a different salt. The dynamic option also puts a little extra load on your query, but nothing major enough to cause any concern. We could write the login SQL statement like so:
-- 'exampleUser' is a placeholder for the username submitted at login
SELECT username FROM members WHERE username = 'exampleUser' AND password = MD5(CONCAT(salt, 'buddha'))
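For the dynamic option, a rough Python sketch (again for illustration only, not the article's own PHP) might look like this. The members dictionary, the 16-character salt length and the function names are assumptions of mine; the point is simply that each row carries its own salt next to its hash.

```python
import hashlib
import hmac
import secrets

# Hypothetical stand-in for the members table: username -> (salt, hash).
members = {}


def hash_password(salt: str, password: str) -> str:
    """Same idea as MD5(CONCAT(salt, password)) in the SQL statement."""
    return hashlib.md5((salt + password).encode("utf-8")).hexdigest()


def register(username: str, password: str) -> None:
    """Generate a fresh random salt per user and store it beside the hash."""
    salt = secrets.token_hex(8)  # 16 random hexadecimal characters
    members[username] = (salt, hash_password(salt, password))


def login(username: str, password: str) -> bool:
    """Fetch this user's own salt, re-hash the entered password, and compare."""
    row = members.get(username)
    if row is None:
        return False
    salt, stored = row
    return hmac.compare_digest(stored, hash_password(salt, password))


register("userOne", "buddha")
register("userTwo", "buddha")
print(login("userOne", "buddha"))                     # True
print(members["userOne"][1] == members["userTwo"][1])  # almost certainly False: the salts differ
```

Two members with the same password end up with different stored hashes, so cracking one tells you nothing about the other. MD5 is kept here only to mirror the SQL; Python's hashlib.pbkdf2_hmac takes a salt in the same way and is deliberately slower, which matters against brute force.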
As you can see, our hash string can no longer be looked up. Placing the hash into a website that checks against thousands, if not millions, of commonly used hashes is not going to make any difference whatsoever. In fact, the only security issue you have to worry about is how the cheeky little devils got hold of the hashes in the first place!
Incidentally, this article actually stemmed from an article I clicked through to from Pixel2Life. In the guy's tutorial he mentioned the following. Note that I've quoted exactly how he wrote it - no humorous modifications.
Note: I would link to the article itself but it's not even worth a back-link.
As you can clearly see from the article, a cryptographic salt prevents an MD5 or SHA1 algorithm from outputting common hash strings when the string it receives to hash is a common word, often a dictionary word. Although a salt cannot in itself prevent brute force attacks, it does prevent people who have access to the hashes from simply looking the password up in a database of common hashes.