=pod

Lately I have been bugged more and more by the "theoretical divide" between industry and academia in the computer science world. Specifically, it bugs me that we "know" that subtypes should be substitutable for their supertypes (Liskov), but that nearly all OO programs ignore this and do the opposite. In fact, Moose actively discourages you from applying Liskov by allowing a hierarchy like:

  lang:Perl
  class A {
      has 'attribute' => ( is => 'rw', isa => 'Num', required => 1 );
  }

  class B extends A {
      has '+attribute' => ( isa => 'Int' );
  }

(This is C syntax, BTW. Everything in the article is valid Perl code if you load this module.)

And prohibiting a hierarchy like:

  class A {
      has 'attribute' => ( is => 'rw', isa => 'Int', required => 1 );
  }

  class B extends A {
      has '+attribute' => ( isa => 'Num' );
  }

The subtle difference is how a program expecting an A will react when it gets a B instead. In the first case, you would not be able to set "attribute" to 3.1415 when you have a B, but you could have when you had an A. That means the calling code that understood A will have to behave in a special way when it sees a B, or the program will die. (Or hey, you could just get lucky, and the calling code only ever tries to assign an integer to the attribute.)

In the second example, any code that was correct with an A would also be correct with a B. Integers are a subset of Numbers, so B is more permissive than A. This is Liskov's substitution principle.

I have thought about this disparity between what Moose wants you to do and what Liskov wants you to do for a while, and finally reconciled it in my mind (while writing that Java article from earlier today). Basically, there are at least two kinds of inheritance that programmers use.

The first, less common, variety is type/subtype inheritance that Liskov covers. This is when you subclass something to add a new property, not to take things away or change old properties.

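
To see the first hierarchy bite, here is a minimal sketch in plain Perl, with no Moose involved. The names NumHolder, IntHolder, check and store_pi are made up for illustration, and the hand-written check methods are only an assumption about how a 'Num' and an 'Int' constraint behave; they are not Moose's actual type checks.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Hypothetical stand-in for class A with isa => 'Num'.
package NumHolder {
    sub new {
        my ($class, %args) = @_;
        my $self = bless {}, $class;
        $self->attribute($args{attribute});   # attribute is required
        return $self;
    }

    # Accepts any number (assumed approximation of 'Num').
    sub check {
        my ($self, $value) = @_;
        return Scalar::Util::looks_like_number($value);
    }

    # Combined getter/setter, like is => 'rw'.
    sub attribute {
        my ($self, @value) = @_;
        if (@value) {
            die "invalid value\n" unless $self->check($value[0]);
            $self->{attribute} = $value[0];
        }
        return $self->{attribute};
    }
}

# Hypothetical stand-in for class B, which narrows the attribute to 'Int'.
package IntHolder {
    use parent -norequire, 'NumHolder';

    # Accepts integers only (assumed approximation of 'Int').
    sub check {
        my ($self, $value) = @_;
        return defined $value && $value =~ /\A-?[0-9]+\z/;
    }
}

package main;

# A caller written against NumHolder: it assumes any number is acceptable.
sub store_pi {
    my ($holder) = @_;
    return eval { $holder->attribute(3.1415); 1 } ? 'ok' : 'died';
}

my $with_parent = store_pi(NumHolder->new(attribute => 1));   # 'ok'
my $with_child  = store_pi(IntHolder->new(attribute => 1));   # 'died'
print "NumHolder: $with_parent, IntHolder: $with_child\n";
```

The caller never changed, but it only works with the supertype. Swap the two checks (integers in the parent, any number in the child) and store_pi would not be legal against the parent in the first place, so nothing that was correct could break.
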
An example that comes to mind is subclassing a Set to get a Bag. In the Set, when a duplicate element is added, the class silently ignores it. In the Bag subclass, the class will increment a counter for that element, and provide a method to introspect that counter. The Set and Bag would both have C methods that would return a list of unique elements, but the Bag could provide another method like C that would return elements the same number of times that they were added. The idea is that if you drop a Bag into a place expecting a Set, the program will still behave exactly as it did before. You also get to reuse a lot of the Set code, which is always good.

As an aside, I read an article today where the author "proved" that C++ is bad because he tried to subclass a Bag into a Set, and noticed that he violated Liskov. This doesn't say much about C++, but it is why I picked the example. Intuitively, it seems like a Set is a Bag but with some restrictions on adding elements. You have all the element adding and extracting logic in the Bag class, and all you need to do to get a Set is to throw an exception when the user tries to add the same element twice. So you write some elegant code like this:

  class Set extends Bag {
      before insert($element) {
          die 'adding element twice' if $self->contains($element);
      }
  }

And you get a whole new data type with about one line of code. The problem is that you can't substitute a Set for a Bag: code that legitimately adds the same element twice to a Bag will die when it is handed a Set. So you really have two new types, not a supertype and a subtype. (BTW, I can think of an API that lets you subclass a Bag into a Set without violating Liskov. Handwave, exercise for the reader, and all that...)

This is actually OK, and it brings us to the second variety of inheritance -- subclassing something because it is "like" another thing you want. In the example above, we subclass Bag to get a Set not because Set is actually a subtype of Bag, but because we want to use the code in Bag to make something that is kinda-sorta like a Bag: a Set.
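
Here is a rough sketch of that failure in plain Perl. The Bag internals, the size method and the insert_twice caller are all hypothetical, and the Set overrides insert directly instead of using a "before" modifier, so treat it as an illustration and not as the code from the article.

```perl
use strict;
use warnings;

# Hypothetical minimal Bag: counts how many times each element was inserted.
package Bag {
    sub new {
        my ($class) = @_;
        return bless { count => {} }, $class;
    }

    sub insert {
        my ($self, $element) = @_;
        $self->{count}{$element}++;
        return $self;                 # allow chained inserts
    }

    sub contains {
        my ($self, $element) = @_;
        return exists $self->{count}{$element};
    }

    # Total number of insertions, duplicates included.
    sub size {
        my ($self) = @_;
        my $total = 0;
        $total += $_ for values %{ $self->{count} };
        return $total;
    }
}

# Set reuses all of Bag and adds one restriction.
package Set {
    use parent -norequire, 'Bag';

    sub insert {
        my ($self, $element) = @_;
        die "adding element twice\n" if $self->contains($element);
        return $self->SUPER::insert($element);
    }
}

package main;

# A caller that is perfectly correct for a Bag: duplicates are allowed there.
sub insert_twice {
    my ($bag) = @_;
    return eval { $bag->insert('x')->insert('x'); $bag->size } // 'died';
}

my $bag_result = insert_twice(Bag->new);   # 2
my $set_result = insert_twice(Set->new);   # 'died'
print "Bag: $bag_result, Set: $set_result\n";
```

The Set really did cost about one line of logic, and insert_twice really does stop working when it is handed one.
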
We add a restriction or two here and there, and voila, we have our Set, and we reused the Bag code. This makes up about 95% of the OO code that I see in real life. We use subclassing to reuse code, not to make a type hierarchy, so we ignore Liskov. It makes perfect sense, and isn't really a "conflict of interests" -- it is simply using one tool for two purposes.

The good news is that with Roles, we can reuse code, and always use inheritance for type/subtype relationships. This makes us happy as engineers (reusing the maximum amount of code possible), and as computer scientists (keeping types/subtypes meaningful).

(If you are not familiar with Roles yet, please read the L first. Roles are Traits with state.)

Let's try an example. Something I see mentioned from time to time is that filenames and strings are similar, and it would be nice to have a subclass, Filename, of String that concatenated things by adding "/" between elements, instead of with the empty string as with normal Strings. We will also enforce the validity of the filename, so no slashes or nulls in the strings we are joining. This is a terrible way to implement filename handling code, but it is a good example for our purposes.

Traditionally, one would be tempted to write:

  class String {
      has 'string' => ( is => 'rw', isa => 'Str' );

      # ... reusable things go here ...

      # $right is a plain Perl string, so it can be appended directly
      method concatenate(Str $right) { # we are "left"
          return $self->string( $self->string . $right );
      }
  }

  class Filename extends String {
      around concatenate(Str $right) {
          die 'no slashes or nulls' if $right =~ m{[/\000]};
          return $self->$orig("/$right"); # call into superclass
      }
  }

The main idea is to reuse all the stuff that we elide with the C<...>s and is common to both Strings and Filenames. This is generally good -- we got our new type with a minimum of bloat -- but Filename is not really a String, it is just something that I a string, sometimes. When you use the words I or I, you know you really want to use a Role.
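
To make the substitution problem concrete, here is a small sketch in plain Perl. The constructors, the append_suffix caller and the sample values are hypothetical, and concatenate takes a plain string here, which is my assumption about what the example intends.

```perl
use strict;
use warnings;

package String {
    sub new {
        my ($class, $string) = @_;
        return bless { string => $string }, $class;
    }

    sub string { return $_[0]{string} }

    # Appends $right to the stored string and returns the new value.
    sub concatenate {
        my ($self, $right) = @_;
        $self->{string} .= $right;
        return $self->{string};
    }
}

package Filename {
    use parent -norequire, 'String';

    # Same name, different contract: validates and inserts a separator.
    sub concatenate {
        my ($self, $right) = @_;
        die "no slashes or nulls\n" if $right =~ m{[/\000]};
        return $self->SUPER::concatenate("/$right");
    }
}

package main;

# A caller written against String: it expects the old value followed by
# exactly what it passed in.
sub append_suffix {
    my ($string, $suffix) = @_;
    return eval { $string->concatenate($suffix) } // 'died';
}

my $plain = append_suffix(String->new('report'),   '.txt');   # 'report.txt'
my $file  = append_suffix(Filename->new('report'), '.txt');   # 'report/.txt'
my $slash = append_suffix(Filename->new('a'),      'b/c');    # 'died'
print "$plain | $file | $slash\n";
```

Both classes answer to concatenate, but a caller that was correct for String gets either a different result or an exception from Filename.
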
We can be rigorous and still reuse code this way:

  role HasString {
      has 'string' => ( is => 'rw', isa => 'Str' );

      # ... reusable things go here ...
  }

  class String with HasString {
      method concatenate(Str $right) { ... }
  }

  class Filename with HasString {
      method concatenate(Str $right) { ... }
  }

Here, we get the same reuse that we did with inheritance, but without creating a problem with subtype substitutability. It is clear from the definition that a Filename is not a String, so you wouldn't think to substitute one for the other. But if you just want something that reads like a string, you can say you want something that does HasString, and use either a Filename or a String. Your code is reusable, and you don't break Liskov.

(If concatenation were too complicated to implement in both classes, we could factor that out into a role that implemented the raw concatenation routine, and compose that Role into each class or into the HasString role. Perl already does this with the "." operator, so I omitted it from this example.)

Anyway, I am glad to have finally cleared this up for myself. When I violate Liskov, at least I know why, and I know how to avoid it if I think it would improve my application.

If you are using a language without Roles, all is not lost. Often, you can use an abstract base class. Lisp takes this approach with Vectors and Lists, for example. Instead of a List being a Vector (or vice-versa), substitutability is provided by making each a subtype of Sequence. A function that expects a List can't accept a Vector, but one that accepts a Sequence can use either without its behavior changing. So even without Roles, you can make it work.
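
The abstract base class fallback can be sketched in plain Perl as well. StringLike, upper and shout are hypothetical names, and refusing to instantiate the base class in new is just one way to fake "abstract" in Perl, so take this as a sketch under those assumptions.

```perl
use strict;
use warnings;

# Hypothetical abstract base: holds the shared state and shared code, and
# leaves concatenate() to the concrete classes.
package StringLike {
    sub new {
        my ($class, $string) = @_;
        die "StringLike is abstract\n" if $class eq __PACKAGE__;
        return bless { string => $string }, $class;
    }

    sub string { return $_[0]{string} }
    sub upper  { return uc $_[0]{string} }     # a "reusable thing"

    sub concatenate { die ref($_[0]) . " must implement concatenate\n" }
}

# String and Filename are siblings; neither is a subtype of the other.
package String {
    use parent -norequire, 'StringLike';

    sub concatenate {
        my ($self, $right) = @_;
        return $self->{string} .= $right;
    }
}

package Filename {
    use parent -norequire, 'StringLike';

    sub concatenate {
        my ($self, $right) = @_;
        die "no slashes or nulls\n" if $right =~ m{[/\000]};
        return $self->{string} .= "/$right";
    }
}

package main;

# Accepts any StringLike and only relies on the shared interface.
sub shout {
    my ($thing) = @_;
    die "need a StringLike\n" unless $thing->isa('StringLike');
    return $thing->upper;
}

my $s = String->new('foo');
my $f = Filename->new('usr');
$s->concatenate('bar');                       # 'foobar'
$f->concatenate('bin');                       # 'usr/bin'
my @shouted = (shout($s), shout($f));         # 'FOOBAR', 'USR/BIN'
my $filename_is_string = Filename->new('x')->isa('String') ? 1 : 0;   # 0
print join(' ', @shouted), "\n";
```

Code that asks for a StringLike can take either class, and code that asks for a String will never be handed a Filename by accident.
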