-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Yesterday, I read an article about L.
I have no comment on the article, but I do have a comment about the
"teaser image", namely the PHP code that goes like this:
lang:Php
function IsLoggedIn() {
    global $username, $password;
    if ($username && $password) {
        $pass = md5(GetPassword($username));
        return ($password == $pass);
    }
    return FALSE;
}
I am not going to comment on the frightening global data dependencies,
the poor choice of variable names, or the fact that it's PHP as an
intro to a Javascript article. While all are hilarious, that's not
what this article is about.
This article is about the call to "md5".
Just md5-ing a password before storing it is, these days, basically
the same as storing the password in cleartext. As an example, I ran
"foobar" through md5 and googled the hash. The first few results make
it clear that the plaintext of
8263563fa9b78f349b9103ca2840f304 is "foobar". You can download
"rainbow tables" of pretty much every n-character permutation and its
md5 sum, these days. So while md5-ing every possible password may
theoretically take a long time, it's been a long time since people
started md5-ing their passwords. They've all been cracked. Unless
your password is a 20-character random string of line noise, md5ing it
is about the same as storing it cleartext. (And if there's one thing
you can count on, it's that your users aren't using very good
passwords.)
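To make the lookup-table point concrete, here is a minimal sketch in core Perl. The word list is a made-up stand-in for a real precomputed table, and reverse_md5 is a name invented for this illustration. The only point is that, with unsalted md5, the expensive work is done once, ahead of time, and every later "crack" is a hash lookup.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Hypothetical word list standing in for a real precomputed table.
my @candidates = qw(password letmein foobar qwerty 123456);

# Build the reverse mapping once: md5 hex digest => plaintext.
my %plaintext_for;
for my $word (@candidates) {
    $plaintext_for{ md5_hex($word) } = $word;
}

# Reverse a hash by plain lookup; returns undef when it is not in the table.
sub reverse_md5 {
    my ($hash) = @_;
    return $plaintext_for{ lc $hash };
}

# What an unsalted md5 password column would hold for "foobar".
my $stolen = md5_hex('foobar');
my $found  = reverse_md5($stolen);
my $report = defined $found ? "recovered: $found\n" : "not in table\n";
print $report;
```

A per-user random salt defeats this particular trick, because one table no longer covers every user. A deliberately slow algorithm such as bcrypt also makes rebuilding the table per salt expensive.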
With that in mind, why did the author of this code choose to md5 the
passwords? Probably because he heard that the correct storage
solutions are based on md5, and PHP has the md5 function built-in.
He couldn't figure out how to reverse the md5 hashes, so he typed
three characters, and was on his way with the rest of the code.
The underlying problem is with the built-in functions. If you
restrict yourself to only using built-in functions, you start
programming at the wrong level. This guy is trying to write a
business application of some sort. Unfortunately, he is stuck in the
weeds and is also forced to implement a password storage system. His
application should not be concerned with the details of securely
storing passwords; it should only be concerned with the business
domain. But since "securely storing passwords" is a Really Easy call
to this built-in md5 function, his application becomes a
business-app-and-oh-yeah-some-password-storage-logic. It's easy!
To really do it right, his code should not concern itself with the
implementation of things outside the domain of his application. He
should have some external function to determine if the password the
user typed in is the one in the database. Anything else is beyond the
scope of what he's writing. (I'm not saying he shouldn't write this,
just that it shouldn't be mixed right in with the application code.
The IsLoggedIn function should delegate to something that only knows
about passwords, and that does not know about the rest of the
application.)
So now we are at the part of the blog post where I tell you why Perl is
better than PHP. Let's imagine that I am sitting around, writing an
application, and need to store a password. I've heard of this md5
thing, and I want to use it to store the passwords. I look for a
built-in md5 function, and oh look, nothing. Off to CPAN, I search for
"md5 password" and see Digest::MD5. OK, that's nice, I can md5
stuff with Perl. But I browse around in the results a bit and also
see Authen::Passphrase. That sounds like something closer to the
domain I'm working in (passwords, not message digests). Well, those
both look good, so I'll see what my code with each would look like:
lang:Perl
use Digest::MD5;

sub get_encoded_password {
    my ($password) = @_;
    my $md5 = Digest::MD5->new;
    $md5->add($password);
    return $md5->hexdigest;
}

sub check_encoded_password {
    my ($clear, $encoded) = @_;
    my $md5 = Digest::MD5->new;
    $md5->add($clear);
    return $encoded eq $md5->hexdigest;
}
That's fine, I guess, but what about with Authen::Passphrase:
use Authen::Passphrase;
use Authen::Passphrase::BlowfishCrypt;

sub get_encoded_password {
    my ($password) = @_;
    my $pw = Authen::Passphrase::BlowfishCrypt->new(
        cost        => 8,
        salt_random => 1,
        passphrase  => $password,
    );
    return $pw->as_rfc2307;
}

sub check_encoded_password {
    my ($clear, $encoded) = @_;
    my $pw = Authen::Passphrase->from_rfc2307($encoded);
    return $pw->check_passphrase($clear);
}
This is about the same amount of code, but it is closer to the actual
domain and turns out to be more secure and more maintainable. (If we
ever start using a different hashing algorithm,
check_encoded_password will still work with the old data!)
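The "old data keeps working" part comes from the stored string naming its own scheme, so the checker can dispatch on that tag. Here is a rough sketch of that idea in core Perl only. It is not how Authen::Passphrase is implemented, and the tag names and payload layout are invented for illustration rather than real RFC 2307 encodings. The salted SHA-256 scheme is only a stand-in for "the newer algorithm" because bcrypt is not in core Perl; it is not a recommendation.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);
use Digest::SHA qw(sha256_hex);

# Hypothetical scheme table: tag => code that checks a cleartext
# against the stored payload for that scheme.
my %checker_for = (
    MD5 => sub {
        my ($clear, $payload) = @_;
        return md5_hex($clear) eq $payload;
    },
    SALTED_SHA256 => sub {
        my ($clear, $payload) = @_;
        my ($salt, $digest) = split /\$/, $payload, 2;
        return defined $digest && sha256_hex($salt . $clear) eq $digest;
    },
);

# New passwords are always written with the newer scheme.
sub get_encoded_password {
    my ($password) = @_;
    # rand() is NOT a cryptographic source; it only keeps the sketch core-only.
    my $salt = join '', map { sprintf('%02x', int rand 256) } 1 .. 8;
    return '{SALTED_SHA256}' . $salt . '$' . sha256_hex($salt . $password);
}

# Checking dispatches on whatever scheme the stored value names.
sub check_encoded_password {
    my ($clear, $encoded) = @_;
    my ($scheme, $payload) = $encoded =~ /\A\{([A-Z0-9_]+)\}(.*)\z/s
        or return 0;
    my $checker = $checker_for{$scheme} or return 0;
    return $checker->($clear, $payload) ? 1 : 0;
}

my $old = '{MD5}' . md5_hex('foobar');       # a row written before the switch
my $new = get_encoded_password('foobar');    # a row written after it
for my $stored ($old, $new) {
    my $msg = check_encoded_password('foobar', $stored) ? "ok\n" : "rejected\n";
    print $msg;
}
```

Both rows verify, even though only the second was written by the current get_encoded_password. The calling code never learns which algorithm was involved.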
Now, having decided to use Authen::Passphrase instead of Digest::MD5,
we are back in our application, writing the IsLoggedIn function.
(Not that I would ever call it that.)
sub is_logged_in {
    # I am not advocating the use of global state, or this API, I
    # am just trying to be true to the original example
    return unless $username && $password;
    # assumes get_password returns an Authen::Passphrase object
    return get_password($username)->check_passphrase($password);
}
(As an aside, I realize that this is not true to the original example.
In the original example, the global $password is already md5-encoded.
I have no idea why anyone would do that, especially without the safety
of a type system, so I am just going to pretend that you are trying to
do things right.)
The point is, don't let tasks like this distract your application from
its true purpose. When doing something in your application, make the
application code say what it's doing, not how it's doing it.
In this PHP example, the call to "md5" doesn't mean anything. That's
how you hash a password, yes, but the function in the PHP example
is called IsLoggedIn. "md5" has nothing to do with being logged in.
In the Perl example, our function is called check_passphrase, which is
reasonable in the domain of seeing if a user is logged in. (In a real
app, obviously you would have a user object, a session object, and a
web framework to determine the relationship between the two. But the
point is, we're delegating to something sane.)
The reason you want to do this is so that you can change the
implementation without changing the application. If this guy realizes
his mistake of using unsalted hashed passwords, he is going to have a
hard time fixing his application. He will have to dive deep into his
application's internal logic to fix a problem that has nothing to do
with his application. In our Perl example, we would just write a new
Authen::Passphrase subclass, make the "create user" and "change
password" functions use the new subclass, and everything would
continue to work. (Ideally, "passphrase_class" would be an attribute
supplied by your L, and then you
would just change the class name in your configuration. This is the
advantage of thinking about your code a bit before typing it in to
your editor.)
So anyway, I think the reason the author of the PHP used "md5" was
because it was easy. It's built right into the language, and is only
three characters. Typing those three characters is a lot easier than
searching for a library, finding one, learning how it works, navigating
a complex class hierarchy, and so on. But his app is insecure, and he
can't easily fix it. His language convinced him to do it wrong, and
now he is pretty much stuck. And that's bad.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
iEYEARECAAYFAkp1594ACgkQ2rw+dVvzZm1YIwCePP8u3S77kniUTjuknmBECBa9
3lQAnR3YHM63WKGW8Hdt8b2AaYQCu5z6
=bB39
-----END PGP SIGNATURE-----