I wrote some articles about hash collisions a while back. Short introduction: A cryptographic hash-function is a function where you can put in any data and you'll get a unique, fixed-size value. »unique« in this case scenario means that it's very hard to calculate two different strings matching to the same hash value. If you can do that, the function should be considered broken.
The MD5 function got broken some years back (2004) and it's more or less a question of time when the same will happen to SHA1. There have been scientific results claiming that an attacker with enough money could easily create a supercomputer able to create collisions on SHA1. The evil thing is: Due to the design of both functions, if you have one collision, you can create many more easily.
Although those facts are well known, SHA1 is still widely used (just have a look at your SSL connections or at the way the PGP web of trust works) and MD5 isn't dead either. The fact that a well-known piece of software got issues depending on hash collisions should raise attention. Pretty much all security considerations on cryptographic protocols rely on the collision resistance of hash functions.
The NIST plans to define new hash functions until 2012, until then it's probably a safe choice to stick with SHA256 or SHA512.
I was always under the impression that the attacks against MD5 just showed that it's possible to generate collisions when you can control both inputs. This would not allow you to generate a colliding input for a given hash?!
I think the same goes for SHA1... but I'm not really sure.
A new hash function would be nice though.
Besides that... With a 128 bit hash function like MD5 you'll always end up with collisions. Just use a proper salt, so Rainbow Tables won't work for your hashes :)
Hi, that is party true. You can not generate a hash to a given input (that would be a preimage attack), but you can generate different inputs that make sense if you only control parts of it. Someone has shown this with two postscript files with the same md5.
But the fact is, this doesn't help you, as hash functions are often used in complex protocols where all security assumptions rely on the collision resistance.
For your last statement, an 128 bit hash still leads to a optimal complexity of 64 bit, which makes it nearly impossible to generate collisions. The fact that you can generate them on md5 is because it has additional flaws.