-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Yesterday, I read an article. I have no comment on the article, but I do have a comment about the "teaser image", namely the PHP code that goes like this:

    lang:Php
    function IsLoggedIn() {
        global $username, $password;
        if ($username && $password) {
            $pass = md5(GetPassword($username));
            return ($password == $pass);
        }
        return FALSE;
    }

I am not going to comment on the frightening global data dependencies, the poor choice of variable names, or the fact that it's PHP as an intro to a Javascript article. While all are hilarious, that's not what this article is about. This article is about the call to md5.

Just md5-ing a password before storing it is, these days, basically the same as storing the password in cleartext. As an example, I ran "foobar" through md5 and googled the hash. The first few results make it clear that 8263563fa9b78f349b9103ca2840f304 corresponds to "foobar". You can download "rainbow tables" of pretty much every n-character permutation and its md5 sum, these days. So while md5-ing every possible password may theoretically take a long time, it's been a long time since people started md5-ing their passwords. They've all been cracked. Unless your password is a 20-character random string of line noise, md5-ing it is about the same as storing it in cleartext. (And if there's one thing you can count on, it's that your users aren't using very good passwords.)

With that in mind, why did the author of this code choose to md5 the passwords? Probably because he heard that the correct storage solutions are based on md5, and PHP has the md5 function built in. He couldn't figure out how to reverse the md5 hashes himself, so he typed three characters and was on his way with the rest of the code.

The underlying problem is with the built-in functions. If you restrict yourself to only using built-in functions, you start programming at the wrong level. This guy is trying to write a business application of some sort.
Unfortunately, he is stuck in the weeds and is also forced to implement a password storage system. His application should not be concerned with the details of securely storing passwords; it should only be concerned with the business domain. But since "securely storing passwords" is a Really Easy call to this built-in md5 function, his application becomes a business-app-and-oh-yeah-some-password-storage-logic. It's easy!

To really do it right, his code should not concern itself with the implementation of things outside the domain of his application. He should have some external function to determine if the password the user typed in is the one in the database. Anything else is beyond the scope of what he's writing. (I'm not saying he shouldn't write this, just that it shouldn't be mixed right in with the application code. The IsLoggedIn function should delegate to something that only knows about passwords, and that does not know about the rest of the application.)

So now we are at the part of the blog post where I tell you why Perl is better than PHP.

Let's imagine that I am sitting around, writing an application, and need to store a password. I've heard of this md5 thing, and I want to use it to store the passwords. I type md5, and oh look, nothing. Off to CPAN, I search for "md5 password" and see Digest::MD5. OK, that's nice, I can md5 stuff with Perl. But I browse around in the results a bit and also see Authen::Passphrase. That sounds like something closer to the domain I'm working in (passwords, not message digests).
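
A quick sketch of why "passwords, not message digests" matters, using only the core Digest::MD5 module. The word list, the salt value, and the helper names crack_unsalted_md5 and salted_md5 are all made up for illustration; a real attacker's table is vastly larger, and the salted md5 is here only to show what a salt does, not as a recommended scheme.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Hypothetical word list standing in for a real precomputed table.
my @guesses = qw(password letmein foobar qwerty);

# Build digest => plaintext once. An unsalted digest depends on nothing
# but the password, so this one table applies to every user.
my %plaintext_for = map { md5_hex($_) => $_ } @guesses;

# Hypothetical helper: "reverse" an unsalted md5 by table lookup.
sub crack_unsalted_md5 {
    my ($digest) = @_;
    return $plaintext_for{$digest};    # undef if the digest is not in the table
}

# Two users with the same password end up with the same stored value.
my $alice = md5_hex('foobar');
my $bob   = md5_hex('foobar');

# Hypothetical helper: mix a per-user salt into the digest input.
# (Illustration only: a real scheme is also deliberately slow.)
sub salted_md5 {
    my ($salt, $password) = @_;
    return md5_hex($salt . $password);
}

# Same password, but the precomputed table no longer contains the result.
my $salted = salted_md5('x9Qz', 'foobar');
```

The table is built once and works against every account, which is the whole problem with a bare digest. A salt forces the attacker to redo the work for each user.
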
Well, those both look good, so I'll see what my code with each would look like:

    lang:Perl
    use Digest::MD5;

    sub get_encoded_password {
        my ($password) = @_;
        my $md5 = Digest::MD5->new;
        $md5->add($password);
        return $md5->hexdigest;
    }

    sub check_encoded_password {
        my ($clear, $encoded) = @_;
        my $md5 = Digest::MD5->new;
        $md5->add($clear);
        return $encoded eq $md5->hexdigest;
    }

That's fine, I guess, but what about with Authen::Passphrase:

    lang:Perl
    use Authen::Passphrase;
    use Authen::Passphrase::BlowfishCrypt;

    sub get_encoded_password {
        my ($password) = @_;
        my $pw = Authen::Passphrase::BlowfishCrypt->new(
            cost        => 8,
            salt_random => 1,
            passphrase  => $password,
        );
        return $pw->as_rfc2307;
    }

    sub check_encoded_password {
        my ($clear, $encoded) = @_;
        my $pw = Authen::Passphrase->from_rfc2307($encoded);
        return $pw->check_passphrase($clear);
    }

This is about the same amount of code, but it is closer to the actual domain and turns out to be more secure and more maintainable. (If we ever start using a different hashing algorithm, check_encoded_password will still work with the old data!)

Now, having decided to use Authen::Passphrase instead of Digest::MD5, we are back in our application, writing the IsLoggedIn function. (Not that I would ever call it that.)

    lang:Perl
    sub is_logged_in {
        # I am not advocating the use of global state, or this API, I
        # am just trying to be true to the original example.
        # get_password is assumed to return an Authen::Passphrase object.
        return unless $username && $password;
        return get_password($username)->check_passphrase($password);
    }

(As an aside, I realize that this is not true to the original example. In the original example, the global $password is already md5-encoded. I have no idea why anyone would do that, especially without the safety of a type system, so I am just going to pretend that you are trying to do things right.)

The point is, don't let tasks like this distract your application from its true purpose. When doing something in your application, make the application code say what it's doing, not how it's doing it. In this PHP example, the call to "md5" doesn't mean anything.
That's how you hash a password, yes, but the function in the PHP example is called IsLoggedIn. "md5" has nothing to do with being logged in. In the Perl example, the method we call is check_passphrase, which is reasonable in the domain of seeing if a user is logged in. (In a real app, obviously you would have a user object, a session object, and a web framework to determine the relationship between the two. But the point is, we're delegating to something sane.)

The reason you want to do this is so that you can change the implementation without changing the application. If this guy realizes his mistake of using unsalted hashed passwords, he is going to have a hard time fixing his application. He will have to dive deep into his application's internal logic to fix a problem that has nothing to do with his application. In our Perl example, we would just write a new Authen::Passphrase subclass, make the "create user" and "change password" functions use the new subclass, and everything would continue to work. (Ideally, "passphrase_class" would be an attribute supplied from outside the code, and then you would just change the class name in your configuration. This is the advantage of thinking about your code a bit before typing it into your editor.)

So anyway, I think the reason the author of the PHP used "md5" was because it was easy. It's built right into the language, and is only three characters. Typing those three characters is a lot easier than searching for a library, finding one, learning how it works, navigating a complex class hierarchy, and so on. But his app is insecure, and he can't easily fix it. His language convinced him to do it wrong, and now he is pretty much stuck. And that's bad.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)

iEYEARECAAYFAkp1594ACgkQ2rw+dVvzZm1YIwCePP8u3S77kniUTjuknmBECBa9
3lQAnR3YHM63WKGW8Hdt8b2AaYQCu5z6
=bB39
-----END PGP SIGNATURE-----