All hail the humorous titles. The cryptographic definition of salt, as taken from Wikipedia.org, is that a salt consists of random bits used as one of the inputs to a key derivation function. In more human terms, a salt is probably your best friend.
The weakest point of any one-way hash algorithm, such as the MD5 and SHA series, is that one string will always generate the exact same hash - no matter how many times you hash it. This leads to a major downfall in that common words have common hashes. Take the following code, which simply hashes the string buddha and hands us back an MD5 hash.
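The listing referred to above is not reproduced here. A minimal sketch of what such a script could look like, assuming PHP's built-in md5() function (the variable names are illustrative, not the original's):

```php
<?php
// Hash the string "buddha" with PHP's built-in md5() function.
$word = 'buddha';
$hash = md5($word);

// md5() returns 32 lowercase hexadecimal characters (128 bits).
echo $hash, PHP_EOL;

// Hashing the same input again always yields the same output.
var_dump($hash === md5($word)); // bool(true)
```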
The resulting hash is 4b3bd325788f47666ec36669c8aaf5d0. MD5 is a 128-bit algorithm, which means its output is normally written as 32 characters in the base 16 numbering system, hexadecimal, where each character runs from 0 to F.
No matter how many times I execute that PHP script, the resulting hash for buddha will always be the previously stated 32-character value. From this unavoidable downfall, many databases have been created that will accept a 32-character MD5 hash, or a 40-character SHA1 hash, or even a 64-character SHA256 hash (although it's very rare that you'll find a database that stores these, due to their rarity in common applications), and give me back the word. No Albert Einsteins were resurrected to produce a mathematical equation for reversing the hash. As of the time of writing, MD5 and SHA1 are still both irreversible despite their source code being readily available for the entire world and its gremlins to see. All the database has done is perform a simple query:
myHash = '4b3bd325788f47666ec36669c8aaf5d0'
The result is the word buddha because buddha always generates the exact same hash - the one stored in the database. If that hash had been taken from a member table, I would now know the individual's password and be able to log in immediately. This is where salt changes all of that.
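To make the lookup concrete, here is a small sketch of how such a database could work, using a PHP array in place of a real table. The word list and the function names buildLookupTable() and lookupHash() are made up for illustration:

```php
<?php
// Build a hash => word table from a list of common words.
function buildLookupTable(array $words): array
{
    $table = [];
    foreach ($words as $word) {
        $table[md5($word)] = $word;
    }
    return $table;
}

// Return the word for a hash, or null if the hash is not in the table.
function lookupHash(array $table, string $hash): ?string
{
    return $table[$hash] ?? null;
}

$table = buildLookupTable(['password', 'letmein', 'buddha', 'dog']);

// An unsalted hash of a common word is found straight away.
var_dump(lookupHash($table, md5('buddha')));      // string(6) "buddha"

// The same word with a salt in front of it is not in the table.
var_dump(lookupHash($table, md5('!7ex?buddha'))); // NULL
```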
You see, if the string behind the stored MD5 hash was scrambled into a nonsensical word such as !7ex?buddha, then entering the resulting hash into a database of commonly stored hash strings would return null. Notice that my password, buddha, is still in the string, but hashing the aforementioned string gives me a totally different hash: 936cc740307df25a1aca31b3afda2d3d.
You may now be asking yourself how, if his password is buddha, a user is going to remember the new password that we have constructed for him. The answer is, he doesn't. He still only has to remember the password he entered, and we add the rest when he attempts to log in. The only aspect we are concerned about is having a unique hash, so that individuals who somehow steal our hash strings from the database cannot then cheekily access all our members' accounts.
In the example that I have crafted, !7ex?buddha, the !7ex? part would become our salt, whereas the word buddha would be the normal password that we enter. Upon logging in we prepend !7ex? to buddha and MD5 them together. The hash would match the one in our database, provided we hashed !7ex? with buddha in the first place when the user registered. However, it would not match any common MD5 strings and that's definitely a big plus.
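The code that originally followed is not reproduced here. As a rough sketch, assuming the salt is simply prepended to the entered password before hashing (the function name saltedHash() is hypothetical):

```php
<?php
// Prepend the salt to the password and hash the combined string.
function saltedHash(string $salt, string $password): string
{
    return md5($salt . $password);
}

$salt = '!7ex?';

// At registration: store this value instead of md5('buddha').
$storedHash = saltedHash($salt, 'buddha');

// At login: hash the entered password the same way and compare.
$entered = 'buddha';
var_dump(hash_equals($storedHash, saltedHash($salt, $entered))); // bool(true)

// The salted hash differs from the plain, unsalted one.
var_dump($storedHash === md5('buddha')); // bool(false)
```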
This will give us the exact same hash as we previously hashed: 936cc740307df25a1aca31b3afda2d3d. The crafty thing is, even if an attacker knows what the salt is, it's pretty useless to him because neither MD5 nor SHA1 can be reversed, and the salted hash will not appear in any database of common hashes. Recovering the password would mean brute-forcing every candidate with the salt attached, which would take an exceptionally fast machine - we do have those capabilities in the super-computers around the world, but this is outside the scope of the article.
If you can find a way to transform 936cc740307df25a1aca31b3afda2d3d into 4b3bd325788f47666ec36669c8aaf5d0 then you're wasting your life away on the Internet! You may be able to do it once in this instance, but coming up with an algorithm that transforms them all into their common hash, by removing the salt, is certainly a little more tricky.
Speaking only in essence, there are two ways of determining a salt. See our article on randomising values for generating a random salt. The two ways are as follows:
Static salt: the salt is always the same for every password hashed.
Dynamic salt: each user has a unique salt which is stored in the database alongside their hashed password.
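The difference between the two can be sketched as follows. This assumes PHP 7 or later for random_bytes(); the salt length, the user names and the helper makeSalt() are arbitrary choices for illustration:

```php
<?php
// Generate a random salt as a hexadecimal string (16 bytes => 32 characters).
function makeSalt(int $bytes = 16): string
{
    return bin2hex(random_bytes($bytes));
}

$password = 'buddha';

// Static salt: one salt shared by every user.
$staticSalt   = '!7ex?';
$aliceStatic  = md5($staticSalt . $password);
$bobStatic    = md5($staticSalt . $password);
var_dump($aliceStatic === $bobStatic); // bool(true): same password, same hash

// Dynamic salt: each user gets their own salt, stored next to their hash.
$aliceSalt    = makeSalt();
$bobSalt      = makeSalt();
$aliceDynamic = md5($aliceSalt . $password);
$bobDynamic   = md5($bobSalt . $password);

// With different salts, identical passwords no longer share a hash.
var_dump($aliceDynamic === $bobDynamic);
```

With the static salt, cracking one hash reveals every user who shares that password; with dynamic salts each hash has to be attacked on its own.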
Using the static option, guessing one salt, although pretty harmless, would mean you know every single salt. However, with the dynamic salt, you would need to acquire the salts individually, because everyone has a different salt. The dynamic option would also require a little extra load on your query, but nothing at all major to cause any concerns. We could achieve the login SQL statement like so:
password = MD5(CONCAT(salt, 'buddha'))
This would accept our password, buddha, concatenate it with the salt and MD5 the result. If a matching row is found then it would return the username, Wildhoney.
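The same check can also be done in PHP once the row has been fetched. The sketch below uses an in-memory array in place of the member table; the column names and the function checkLogin() are assumptions for illustration, not the article's actual schema:

```php
<?php
// Stand-in for the member table: username => [salt, password hash].
$salt = bin2hex(random_bytes(8));
$members = [
    'Wildhoney' => ['salt' => $salt, 'password' => md5($salt . 'buddha')],
];

// Return the username if the password matches the stored hash, else null.
function checkLogin(array $members, string $username, string $entered): ?string
{
    if (!isset($members[$username])) {
        return null;
    }
    $row = $members[$username];
    // Mirrors password = MD5(CONCAT(salt, 'buddha')) from the SQL above.
    $candidate = md5($row['salt'] . $entered);
    return hash_equals($row['password'], $candidate) ? $username : null;
}

var_dump(checkLogin($members, 'Wildhoney', 'buddha')); // string(9) "Wildhoney"
var_dump(checkLogin($members, 'Wildhoney', 'dog'));    // NULL
```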
As you can see, our hash string is now undecipherable. Placing the hash into a website that checks against thousands, if not millions, of commonly used hashes is not going to make any difference whatsoever. In fact, the only security issue you have to worry about is how the cheeky little devils got the hash in the first place!
Incidentally, this article actually stemmed from an article I clicked through to from Pixel2Life. In the guy's tutorial he mentioned the following. Note that I've quoted exactly how he wrote it - no humorous modifications.
$password = 'dog'; // lets pretend the password is dog - it is very bad to keep it in the database like this.
$password = md5('dog'); // slightly better, but can be decrypted easily.
$password = sha1(md5(md5(sha1(md5(sha1(sha1('dog'))))))); // much better, hackers would be quite good to decrypt that.
Now what's so bad about this? Well, for a start, hash algorithms inside hash algorithms inside hash algorithms have the likely possibility of causing all sorts of unwanted issues, such as an increase in the chance of hash collisions. But the most worrying part of the script is that it would considerably slow down the script itself. I can't put it any simpler: the code is both unnecessary and idiotic. The surprising thing is that I found that code in an article that was aptly named, Top 10 Beginner Mistakes in PHP - Learn about common coding errors in PHP. Now if that was me, that would be number 1.
Note: I would link to the article itself but it's not even worth a back-link.
As you can clearly see from the article, a cryptographic salt prevents an MD5 or SHA1 algorithm from outputting common hash strings when the string it receives to hash is a common word, often a dictionary word. Although salt in itself cannot prevent brute force attacks, it does prevent people who have access to the hashes from simply looking the password up.
For password systems, I much prefer sha1, easy and uncrackable. sha1 uses a destructive, but consistent algorithm; it has been reverse engineered, but it still leaves an infinite amount of possibilities. More than one sha1 is not needed.
Very good tut. I don't usually register to sites, but after finding this one on pixel2life I felt like I wanted to respond, so now you have a new member. I too read the other tutorial found there about encrypting your strings with a ton of md5's and sha1's and did a legit spit take all over my monitor. Don't worry, it's clean now. The way I do my salts is I concatenate the timestamp from when they registered to their password. Before reading this I would do 2 queries, one to get the timestamp, then concat, md5, then another query to check against the hashed password in the database. Now thanks to this tut, I know that you can md5 something right in your query. Strange I've never come across that before in any of my books or other tutorials, so thank you very much, you saved me 1 query and a couple of microseconds, which is always a plus. Keep up the good work.
That system hasn't cracked SHA1, it seems to be a comparison system...
If you go here and create an alphanumeric string with lower and upper case characters, say about 6 characters long, and make an SHA1 hash of it, then try to decrypt it with the one on undoMD5, it will more than likely not be able to do it. But if you first enter those same characters into undoMD5, create a hash of it and then try to decrypt it, it will find the characters.
All it does is search for the entered characters and bring back the hash from another field. The hash was created and stored in the database along with the string you wanted to encrypt when you first created a hash, so when you go to decrypt the hash, all it does is search for the hash and bring back the plain text characters for that hash. It's hardly "decrypting" anything, it's kind of false advertising ;). The more plain text characters entered on that site, the better it becomes at finding the hash.
Not at all, it's simply a database comparison, not cracked at all. I could write a script that complex in 20min, the ajax would take longer than the php.
It is literally impossible to crack sha1, it is a destructive algorithm. Have you ever noticed that all hashes, no matter how long the input, are the same length? The closest you can come to hacking it is reverse engineering the algorithm, leaving an unlimited number of possibilities, a large number of practical possibilities. If you know the format of what it is (such as a credit card number), you might be able to crack it. Therefore it is not safe to keep credit card numbers with it, you use 256 bit encryption for that.
A group in China is rumored to have found a process to reverse engineer it, but I don't know if it's true.