All hail the humorous titles. The cryptographic definition of salt, as taken from Wikipedia.org, is that a salt consists of random bits used as one of the inputs to a key derivation function. In more human terms, a salt is probably your best friend.
The weakest point of any one-way hash algorithm, such as the MD5 and SHA series, is that a given string will always generate the exact same hash, no matter how many times you hash it. This leads to a major downfall: common words have common hashes. Take the following code, which simply hashes the string buddha and hands us back an MD5 hash.
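A minimal sketch of such a script, using PHP's built-in md5() function (the variable names are only illustrative):

```php
<?php
// Hash the plain-text word with PHP's built-in md5() function.
$word = 'buddha';    // illustrative variable name
$hash = md5($word);  // always the same 32-character hexadecimal string for this input

echo $hash, PHP_EOL;
```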
The resulting hash is 4b3bd325788f47666ec36669c8aaf5d0. MD5 is a 128-bit algorithm, which means its output is typically written as 32 characters in the base 16 numbering system, hexadecimal, with each character ranging from 0 to F.
No matter how many times I execute that PHP script, the resulting hash for buddha will always be the previously stated 32-character value. Because of this unavoidable downfall, many databases have been created that will accept a 32-character MD5 hash, or a 40-character SHA1 hash, or even a 64-character SHA256 hash (although it is very rare to find a database that stores these, due to their rarity in common applications), and give me back the word. No Albert Einsteins were resurrected to devise a mathematical equation for reversing the hash. As of the time of writing, MD5 and SHA1 are both still irreversible, despite their source code being readily available for the entire world and its gremlins to see. All the database has done is perform a simple query:
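To see how the hexadecimal length relates to the size of the digest in bits, here is a small sketch that measures both for MD5, SHA1 and SHA256. The $lengths array is just an illustrative name; passing true as the last argument asks PHP for the raw binary digest instead of the hexadecimal string.

```php
<?php
// Compare the hexadecimal length with the raw digest size for each algorithm.
$word = 'buddha';

$lengths = [
    'md5'    => [strlen(md5($word)), strlen(md5($word, true)) * 8],
    'sha1'   => [strlen(sha1($word)), strlen(sha1($word, true)) * 8],
    'sha256' => [strlen(hash('sha256', $word)), strlen(hash('sha256', $word, true)) * 8],
];

foreach ($lengths as $name => [$hexChars, $bits]) {
    echo "$name: $hexChars hex characters, $bits bits", PHP_EOL;
}
// md5: 32 hex characters, 128 bits
// sha1: 40 hex characters, 160 bits
// sha256: 64 hex characters, 256 bits
```

Each hexadecimal character carries 4 bits, which is why the character count is always a quarter of the bit count.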
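SELECT word FROM hashes WHERE myHash = '4b3bd325788f47666ec36669c8aaf5d0'
-- "word" and "hashes" are illustrative names; only the myHash comparison matters here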
The result is the word buddha because buddha always generates the exact same hash - the one stored in the database. If that hash had been taken from a member table, I would now know the individual's password and be able to log in immediately. This is where salt changes all of that.
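The lookup such a site performs can be sketched in a few lines. This is only an illustration: the word list and the variable names are made up, and a real service would hold millions of entries in a database rather than a PHP array.

```php
<?php
// Build a tiny hash-to-word lookup table from a hypothetical word list.
$commonWords = ['password', 'letmein', 'buddha', 'dragon'];

$lookup = [];
foreach ($commonWords as $word) {
    $lookup[md5($word)] = $word;
}

// A hash "stolen" from a member table (computed here rather than hard-coded).
$stolenHash = md5('buddha');

// No reversal takes place: it is a plain key lookup.
$recovered = $lookup[$stolenHash] ?? null;

echo $recovered === null ? 'not found' : "found: $recovered", PHP_EOL; // found: buddha
```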
You see, if the string behind the stored MD5 hash was scrambled into a nonsensical word such as !7ex?buddha, then looking up the resulting hash in a database of common hash strings would return nothing. Notice that my password, buddha, is still in the string, but hashing the aforementioned string gives me a totally different hash: 936cc740307df25a1aca31b3afda2d3d.
You may now be asking yourself how, if his password is buddha, a user is going to remember the new password that we have constructed for him. The answer is, he doesn't. He still only has to remember the password he entered, and we prepend the rest when he attempts to log in. The only aspect we are concerned about is having a unique hash, so that individuals who somehow steal our hash strings from the database cannot then cheekily access all our members' accounts.
In the example that I have crafted, !7ex?buddha, the !7ex? part would become our salt, whereas the word buddha would be the normal password that we enter. Upon logging in we prepend !7ex? to buddha and MD5 them together. The hash would match the one in our database if we hashed !7ex? with buddha in the first place when the user registered. However, it would not match any common MD5 strings and that's definitely a big plus.
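A minimal sketch of that registration and login flow might look like the following. The function and constant names (saltedHash, checkLogin, SALT) are hypothetical, and the stored hash is kept in a plain variable instead of a database to keep the example self-contained.

```php
<?php
// Salt from the example in the text; in practice it would live in configuration.
const SALT = '!7ex?';

// Prepend the salt to the password and hash the combined string.
function saltedHash(string $password): string
{
    return md5(SALT . $password);
}

// Login: salt and hash the submitted password, then compare with the stored hash.
function checkLogin(string $submitted, string $storedHash): bool
{
    return hash_equals($storedHash, saltedHash($submitted));
}

// Registration: store only the salted hash, never the password itself.
$storedHash = saltedHash('buddha');

var_dump(checkLogin('buddha', $storedHash)); // bool(true)
var_dump(checkLogin('budda', $storedHash));  // bool(false)
```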
This will give us the exact same hash as we previously generated: 936cc740307df25a1aca31b3afda2d3d. The crafty thing is, even if an attacker knows what the salt is, it's pretty useless to him because neither MD5 nor SHA1 can be reversed. Stripping the salt back out of a hash is practically impossible and would take an exceptionally fast machine - we do have those capabilities in the super-computers around the world, but this is outside the scope of the article.
If you can find a way to transform 936cc740307df25a1aca31b3afda2d3d into 4b3bd325788f47666ec36669c8aaf5d0 then you're wasting your life away on the Internet! You may be able to do it once in this instance, but coming up with an algorithm that transforms them all into their common hash, by removing the salt, is certainly a little more tricky.
Speaking only in essence, there are two ways of determining a salt. See our article on randomising values for generating a random hash. The two ways are as follows:
Static salt: the salt is always the same for every password hashed.
Dynamic salt: each user has a unique salt which is stored in the database alongside their hashed password.
Using the static option, guessing one salt, although pretty harmless, would mean you know every single salt. However, with the dynamic option, you would need to acquire the salts individually, because everyone has a different salt. The dynamic option also puts a little extra load on your query, but nothing major enough to cause any concern. We could write the login SQL statement like so:
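SELECT username FROM members WHERE password = MD5(CONCAT(salt, 'buddha'))
-- "members" is an illustrative table name; salt and password are columns on each user's row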
This would accept our password, buddha, concatenate it to the salt and MD5 the result. If a match is found then it would return the username, Wildhoney.
As you can see, our hash string is now undecipherable. Placing the hash into a website that checks it against thousands, if not millions, of commonly used hashes is not going to make any difference whatsoever. In fact, the only security issue you have to worry about is how the cheeky little devils got hold of the hash in the first place!
Incidentally, this article actually stemmed from an article I clicked through to from Pixel2Life. In the guy's tutorial he mentioned the following. Note that I've quoted exactly how he wrote it - no humorous modifications.
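On the PHP side, the dynamic option could be sketched as follows. Everything here is an assumption made for illustration: the function names, the array standing in for a row of the member table, and the choice of 16 random bytes for the salt.

```php
<?php
// Generate a random salt for one user: 16 random bytes as 32 hex characters.
function generateSalt(): string
{
    return bin2hex(random_bytes(16));
}

// Build the row that would be stored in a (hypothetical) member table.
function register(string $username, string $password): array
{
    $salt = generateSalt();
    return [
        'username' => $username,
        'salt'     => $salt,
        'password' => md5($salt . $password), // mirrors MD5(CONCAT(salt, ...)) in SQL
    ];
}

// Re-hash the submitted password with this user's own salt and compare.
function login(array $row, string $submitted): bool
{
    return hash_equals($row['password'], md5($row['salt'] . $submitted));
}

$first  = register('Wildhoney', 'buddha');
$second = register('AnotherUser', 'buddha');

var_dump(login($first, 'buddha'));                     // bool(true)
var_dump($first['password'] === $second['password']);  // bool(false): the salts differ
```

Two members who pick the same password end up with different stored hashes, so cracking one tells the attacker nothing about the other.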
$password = 'dog'; // lets pretend the password is dog - it is very bad to keep it in the database like this.
$password = md5('dog'); // slightly better, but can be decrypted easily.
$password = sha1(md5(md5(sha1(md5(sha1(sha1('dog'))))))); // much better, hackers would be quite good to decrypt that.
Now what's so bad about this? Well, for a start, hash algorithms inside hash algorithms inside hash algorithms are likely to cause all sorts of unwanted issues, such as a significant increase in the chance of hash collisions. But the most worrying part is that it would considerably slow down the script itself. I can't put it any simpler: the code is both unnecessary and idiotic. The surprising thing is that I found that code in an article aptly named Top 10 Beginner Mistakes in PHP - Learn about common coding errors in PHP. Now if that was me, that would be number 1.
Note: I would link to the article itself but it's not even worth a back-link.
As you can clearly see from the article, a cryptographic salt prevents an MD5 or SHA1 algorithm from outputting common hash strings when the string it receives to hash is a common word, often a dictionary word. Although salt in itself cannot prevent brute force attacks, it does prevent people who have access to the hashes from simply looking up the password.