-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

The main goal in programming is to generate a computer program that works. A secondary goal is to generate a program whose intent is obvious just by glancing at the source code. After you have let your program sit for a few weeks, you don't want to have to reverse-engineer it in the debugger; you want to open up a file, glance at what's there, and start hacking away again. The way you ensure this is by writing the smallest amount of code possible, and writing it in such a way as to minimize any redundancy.

If you are working in a certain domain, even a broad domain like "object-oriented programming", it makes sense to write the program text in the vocabulary of the domain. That way, your program describes its intent, instead of merely describing its specific actions. Computers are good at figuring out the specific actions, while humans are good at getting "the big picture". If you use words that convey a lot of meaning to both the computer and the human, then you can build complicated programs that are easy for humans to understand. That means you satisfy two goals: you get a working computer program, and you actually understand what it does.

Let's explore the "object-oriented programming" domain a bit. Plain OO in Perl, as you are likely aware, is a little on the verbose side. Here's a class that has a single attribute:

    lang:Perl
    package HasFoo;
    use strict;

    sub new {
        my ($class, %args) = @_;
        die 'foo is required' unless exists $args{foo};
        return bless { foo => $args{foo} }, $class;
    }

    sub get_foo {
        my $self = shift;
        return $self->{foo};
    }

This program only satisfies the first goal; it is a working program, but it doesn't say anything about what it is trying to do. It is very clear to the computer what happens. When someone calls new on this "class", an exception is thrown if there is not a foo key in the args hash, and then some data structure is blessed into a "class", and the result is returned.
When you call the get_foo method, the computer extracts something from this self hash and returns it.

That is all very fine and nice, but the intent is missing. When I described the problem, I was talking about classes and attributes, but the word "class" only appears in a variable name (holding a class I<name>, incidentally), and the word "attribute" never appears at all. The code looks nothing like the problem I said I was trying to solve. The code is nothing more than a sequence of instructions that returns the right answer. If there were a thousand "attributes" described like this, a human would have no hope of really understanding what was going on. (This, incidentally, has resulted in a lot of blog posts about how "Perl is dead".)

We can do better. What if we abstracted away this common pattern, and used domain words to do it? What if we could describe the above class like:

    class HasFoo {
        has_attribute 'foo' => (
            reader_method => 'get_foo',
        );
    }

This is mostly-valid Perl, and is much easier to read. We can see that we're making a class called HasFoo, because we declare it with a keyword called "class". We see that it has an attribute called foo, because it says, C<has_attribute 'foo'>. We see that we can read foo with a method called C<get_foo>, because ... it says that. Now we are managing complexity properly; the difficult task of building a class is hidden behind keywords that describe what the human actually wants to happen. The details of making this work are in the guts of a library which You Don't Have To Maintain.

This sort of thinking is what led to L<Moose> and L<MooseX::Declare>. Moose is more than just sugar, of course, but the sugar is what's relevant to this discussion. C<MooseX::Declare> is even more sugar to make the domain-specific language Really Nice, even when Perl itself isn't. (No more C<my $self = shift>, which is very liberating to not have to type anymore. But the big thing is to get keyword-and-block syntax, which Perl uses extensively for itself, but which it does not allow user programs to use.
A little bit of cut-n-paste from the perl core into an XS module, and I<that> problem is solved.)

Anyway, once you start using Moose, you will see new places in your code that would benefit from domain-specific languages. Sometimes plain Moose is a bit verbose and only describes the OOP domain, rather than the class's real role in your application. This is better than a bunch of Perl code, but is not better than a custom language for describing the domain you want to work in. Fortunately, Moose comes with machinery that makes supplementing it with domain-specific keywords really easy.

An example of the verbosity I see in Moose is when I create a class that operates on files and directories. The naive programmer would write:

    class File::Frob {
        has [qw/input_file output_file/] => (
            is       => 'ro',
            isa      => 'Str',
            required => 1,
        );
    }

This is OK; you can say

    my $frob = File::Frob->new(
        input_file  => 'hello.txt',
        output_file => 'OH_HAI.txt',
    );

and it is pretty clear what's going on. The problem comes when you want to use the files (not names) in your object. It would be nice to say:

    method write_to_output_file(Str $string){
        use autodie ':file';
        my $fh = $self->output_file->openw;
        $fh->write($string);
        $fh->close;
    }

The problem is that C<output_file> is just a string; it isn't actually the output file. That's easy to fix, though, with L<MooseX::Types::Path::Class> and L<Path::Class>. We just have to inflate our class definition a bit:

    class File::Frob {
        use MooseX::Types::Path::Class qw(File);
        has [qw/input_file output_file/] => (
            is       => 'ro',
            isa      => File,
            coerce   => 1,
            required => 1,
        );
    }

Now you can write a method like C<write_to_output_file> above. It all works fine.

The problem is that we are getting farther and farther from readable code. C<coerce>? What does that have to do with files? C<< is => 'ro' >>? What? This attribute definition is more about describing an object-oriented program than saying "I want a file".

So, let's fix that. The first step is to imagine what you want.
I want to be able to define an attribute that holds a file like this:

    has_file 'input_file';
    has_file 'output_file';

It would also be nice to parameterize it:

    has_file 'input_file' => (
        must_exist => 1,
    );

This tells the reader a lot more about what's going on. We reuse the C<has> keyword so that the reader realizes this is creating an attribute. But we also introduce domain-specific language; we have a I<file>, and it has to have some property that a file could have, namely I<existence>.

Now that we know what we want, it's time to start writing some code to make this possible. The first step is to write the C<has_file> function. To do this, you need to know a little about Moose, namely that if you have a class metaclass object ("metaclass" from here), you call the method C<add_attribute> on it to set up an attribute. In other words:

    sub has_file {
        my ($meta, $name, %options) = @_;
        $meta->add_attribute( $name => (
            is       => 'ro',
            isa      => File,
            coerce   => 1,
            required => 1,
            %options,
        ));
    }

Simple enough. Now we need to get this function into the package where we are defining attributes, and we need to curry in C<$meta> like the plain-old C<has> keyword does. That is where we meet the Moose machinery, specifically C<Moose::Exporter>. Let's call our module which provides this sugar C<HasFile>. The module ends up looking like this:

    package HasFile;
    use strict;
    use warnings;
    use Moose::Exporter;

    Moose::Exporter->setup_import_methods(
        with_meta => ['has_file'],
    );

    use MooseX::Types::Path::Class qw(File);

    sub has_file { ... }

The C<< Moose::Exporter->setup_import_methods >> call is what actually curries the C<has_file> function and installs it into the consuming package's namespace. (Actually, it sets up code so that this will happen at the right time, but that is a minor implementation detail.)
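To take some of the magic out of "curry in C<$meta> and install the function", here is a minimal core-Perl sketch of the idea. It is a toy built on stated assumptions, not how C<Moose::Exporter> is actually implemented: C<Toy::Meta>, C<Toy::Sugar>, and C<install_curried> are hypothetical names invented for this illustration, and the "metaclass" only records attribute options in a hash.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a metaclass: it only records attribute definitions.
package Toy::Meta;

sub new {
    my ($class, %args) = @_;
    return bless { name => $args{name}, attributes => {} }, $class;
}

sub add_attribute {
    my ($self, $name, %options) = @_;
    $self->{attributes}{$name} = {%options};
    return;
}

sub get_attribute {
    my ($self, $name) = @_;
    return $self->{attributes}{$name};
}

# Hypothetical sugar module: has_file expects the metaclass as its first argument.
package Toy::Sugar;

sub has_file {
    my ($meta, $name, %options) = @_;
    $meta->add_attribute( $name => ( is => 'ro', required => 1, %options ) );
    return;
}

# Install a curried copy of $code into package $target under $name.
# The closure remembers $meta, so callers never pass it themselves.
sub install_curried {
    my ($target, $name, $code, $meta) = @_;
    no strict 'refs';
    *{"${target}::${name}"} = sub { $code->($meta, @_) };
    return;
}

package main;

my $meta = Toy::Meta->new( name => 'File::Frob' );
Toy::Sugar::install_curried( 'File::Frob', 'has_file', \&Toy::Sugar::has_file, $meta );

# The installed keyword is called without any mention of $meta.
File::Frob::has_file( 'input_file', required => 0 );
```

The design point is that the sugar function stays an ordinary sub taking C<$meta> explicitly, which keeps it easy to test; only the installed closure hides that argument from the class author.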
Now you can write classes like:

    class System::Killer::Unix {
        use HasFile;
        has_file 'password_file' => (
            default => '/etc/passwd',
        );
        method kill_system {
            $self->password_file->unlink;
        }
    }

And use it as you'd expect:

    my $killer = System::Killer::Unix->new;
    $killer->kill_system;
    say 'Good luck logging in' unless -e $killer->password_file;

(This is probably a good example B<not> to cut-n-paste into your REPL that's running as root, by the way.)

Anyway, we now have slightly more readable code, and in the future there will be much less typing required. This is a real problem I had, and a more complete version of the example code is on CPAN as L<MooseX::FileAttribute>.

This is pretty good, sure, but sometimes just adding some sugar isn't enough. You need to change the very core of what an "object" is to something that is more suitable to your application or task. Fortunately, Moose has a meta-object protocol that allows you to cleanly customize how individual classes work.

Before you can start changing what objects and classes I<are>, you need to understand what they are. That's simple, though; a class is just a container with some properties: a name, a list of attributes, a list of methods, a note about which method is the "constructor", some information about how to actually store the values of the attributes in memory, etc. You could even write a class to represent classes, like:

    class Class {
        has 'attributes' => (
            traits   => ['Array'],
            is       => 'ro',
            isa      => ArrayRef[Attribute],
            default  => sub { [] },
            required => 1,
            handles  => {
                'add_attribute' => 'push',
            }
        );
        ...
    }

(This is very readable, BTW; a class has attributes, and we wrote "class has attributes".) Then when you define a class like this:

    class Foo {
        has 'bar' => ( ....
        );
    }

you are really saying:

    my $foo = Class->new( name => 'foo', attributes => [] );
    $foo->add_attribute( Attribute->new( name => 'bar') );

Then when you instantiate a class, you really mean:

    my $instance = $foo->create_instance( attribute_values => { bar => 42 } );

This is actually exactly how Moose works. Classes are defined as instances of classes. (Attributes, methods, packages, and so on are also instances of classes.) The only sticky point is that C<< Class->new >> requires an instance of Class's class:

    my $class    = $class_class->create_instance( ... );
    my $foo      = $class->create_instance( ... );
    my $instance = $foo->create_instance( ... );

Well, actually, C<$class_class> needs a class too, meaning there should be a C<$class_class_class>. This is sticky because every class is an instance of some class, and that class is itself an instance of a class, and so on without end. This is called I<metacircularity>, and is handled by hard-coding some assumptions into the object system, and then overwriting those assumptions with a real object "later". You can read the C<Class::MOP> source code to see how Perl does that, or you can read a book called I<The Art of the Metaobject Protocol> to see how Common Lisp does it. Good reading, but not actually relevant to our domain-specific classes. While our app's classes will be "special", the class of these classes won't be, so any metacircularity is already handled for you by Moose's usual method. So you don't even need to think about this, but you can if you want to.

With that out of the way, let's start making classes. One of my ongoing projects at work is to convert Excel spreadsheets to Perl, so that their calculations and results can be useful to something other than a PowerPoint presentation.
Given a spreadsheet like:

    lang:undef
         A         B     C
    1    Offset:   42
    2
    3    a:        b:    result:
    4    2         3     47
    5    0         0     42

I would translate this to Perl as something like:

    lang:Perl
    class Add {
        has [qw/offset a b/] => ( is => 'ro', isa => Num, required => 1 );
        has 'result' => ( is => 'ro', isa => Num, lazy_build => 1 );
        method _build_result { return $self->offset + $self->a + $self->b }
    }

This is OK, but not perfect, as it doesn't convey as much information as the original spreadsheet does. "Offset" is a common value shared between rows, whereas a and b are unique for the row. a and b are also required to be defined, whereas "result" is built dynamically.

So all in all, this is not good. There would be no way to build a batch process around this, where the common variables (offset) are specified in advance, the individual inputs (a and b) are read from a file, and the result column is written to a file. Well, there is a way to do that, but it would involve hard-coding that exact logic somewhere. And that is tedious and error-prone, just like our original "pure perl" OO class. We don't want that, because while it would be a program that would produce the right answer, we couldn't reuse that code to produce similar right answers. You also can't understand the spreadsheet logic at a glance; you would have to dive into the internals that are handling the reading and writing of the files to see what the inputs and outputs are.

Let's fix that. As with the other languages we've designed, we need to decide what a domain-specific language for representing spreadsheets should look like before we can begin implementing it. I think we should have three types of attributes: regular attributes, which are per-run, and inputs and outputs, which are per-row. Inputs are inputs, and outputs are outputs. (How did you guess?)
So, with this in mind, our new class might look like:

    class Add {
        use Spreadsheet;
        has 'offset' => ( is => 'ro', isa => Num, required => 1 );
        has_input  'a'      => ( isa => 'Num' );
        has_input  'b'      => ( isa => 'Num' );
        has_output 'result' => ( isa => 'Num' );
        method _build_result { $self->offset + $self->a + $self->b }
    }

Clean. It's now clear exactly what our spreadsheet is doing. We know a is an input, because the source code says, C<has_input 'a'>.

Now we can implement this! Because C<input>s and C<output>s are properties of the class, we will eventually need to store that information with the rest of the class's information: in the metaclass. This means we need to write a role to apply to the metaclass that will contain the data we care about:

    role Spreadsheet::Meta::Class {
        use MooseX::Types::Moose qw(ArrayRef Str HashRef);

        has 'inputs' => (
            traits     => ['Array'],
            reader     => 'get_all_inputs',
            isa        => ArrayRef[Str],
            default    => sub { [] },
            required   => 1,
            auto_deref => 1,
            handles    => {
                '_add_input' => 'push',
            }
        );

        has 'outputs' => (
            traits     => ['Array'],
            reader     => 'get_all_outputs',
            isa        => ArrayRef[Str],
            default    => sub { [] },
            required   => 1,
            auto_deref => 1,
            handles    => {
                '_add_output' => 'push',
            }
        );

        method add_input(Str $name, HashRef $options){
            $self->_add_input($name);
            $self->add_attribute( $name => %$options );
        }

        method add_output(Str $name, HashRef $options){
            $self->_add_output($name);
            $self->add_attribute( $name => %$options );
        }
    }

If this is applied to your metaclass, you will be able to say:

    $meta->add_input( foo => { is => 'ro', isa => Int, required => 1 } );
    $meta->add_input( bar => { is => 'ro', isa => Int, required => 1 } );

Then, in addition to treating foo and bar as attributes, you could also see that they were inputs by calling C<< Add->meta->get_all_inputs >>. (I called the accessor C<get_all_inputs> instead of just C<inputs> because that's what the rest of Moose does.
At some point, to actually work like Moose, we will need to rewrite this accessor to inspect roles and superclasses for additional inherited or composed inputs and outputs. That is left as an exercise to the reader.)

So that is one piece. We also need the Moose sugar, like:

    sub has_input {
        my ($meta, $name, %options) = @_;
        $meta->add_input( $name => {
            is => 'ro',
            %options,
        });
    }

C<has_output> looks similar but adds C<lazy_build>; metaprogramming it is left as an exercise for the reader ;)

The final piece of the puzzle is a method that will "run" a given spreadsheet:

    role Spreadsheet::Object with (MooseX::Runnable, MooseX::Getopt, MooseX::Clone) {
        use autodie ':file';
        use MooseX::FileAttribute; # told you this was useful.

        has_file 'input_file'  => ( required => 1, must_exist => 1 );
        has_file 'output_file' => ( required => 1, default => 'out' );

        method _parse_input_line(Str $line) {
            my %result;
            /^([^:]+):(.+)$/ and $result{$1} = $2 for split /\s*,\s*/, $line;
            return %result;
        }

        method run {
            my $ifh = $self->input_file->openr;
            my $ofh = $self->output_file->openw;
            while( my $line = <$ifh> ){
                my %inputs = $self->_parse_input_line($line);
                # clone wants name => value pairs, so pair each declared
                # input with the value parsed from this line
                my $row = $self->clone(
                    map { $_ => $inputs{$_} } $self->meta->get_all_inputs
                );
                for my $output ($self->meta->get_all_outputs) {
                    print {$ofh} "$output:",$row->$output,",";
                }
                print {$ofh} "\n";
            }
        }
    }

This is a role that, when composed into a spreadsheet class, makes it runnable from the command-line. It composes in L<MooseX::Runnable> so that you can run it with C<mx-run> (instead of a boilerplate script). C<MooseX::Getopt> is so that you can specify run-wide parameters (and the input_file and output_file) on the command line, and so that C<--help> will print out some documentation. C<MooseX::Clone> is so that we can create a fresh instance of the spreadsheet for each set of inputs. This way, data won't accidentally leak between rows, and we won't need any messy C<'rw'> attributes. (Mutable state is bad, m'kay?)
The rest of the code is the implementation of the batch process; we read inputs in the form of "input1:value,input2:value2" from a file, create a spreadsheet row from that information, and then print out the output in a similar form.

That's everything. Now we need to tie this all together so that when someone says C<use Spreadsheet>, they get the metaclass, the syntax sugar, and the role automatically. Exporting the sugar to the consuming class is easy; it works just like the C<HasFile> example:

    package Spreadsheet;
    use strict;
    use warnings;
    use Moose::Exporter;

    Moose::Exporter->setup_import_methods(
        with_meta => [qw/has_input has_output/],
    );

    sub has_input ...
    sub has_output ...

That's the sugar. Now we need to apply the metaclass role C<Spreadsheet::Meta::Class> to the metaclass, and apply the instance role C<Spreadsheet::Object> to our class's base class. The C<Moose::Util::MetaRole> module takes care of this for us. We just need to implement an C<init_meta> function:

    sub init_meta {
        my ($m, %options) = @_;
        my $caller = $options{for_class};

        Moose::Util::MetaRole::apply_metaclass_roles(
            for_class       => $caller,
            metaclass_roles => ['Spreadsheet::Meta::Class'],
        );

        Moose::Util::MetaRole::apply_base_class_roles(
            for_class => $caller,
            roles     => ['Spreadsheet::Object'],
        );

        return $caller->meta;
    }

That is the Spreadsheet module. (It is worth noting at this point that because of how we implemented this, with roles, you can use most other C<MooseX::> modules with this one. Everything is designed to compose cleanly and generally play nice. If something conflicts, you will get a descriptive error message, not strange runtime behavior. That's what makes the Moose system so nice to work with!)
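For readers who want to see the batch format on its own, here is a minimal core-Perl sketch of the same read, compute, and write loop with no Moose involved. The function names (C<parse_input_line>, C<format_output_line>, C<run_rows>) and the hard-coded C<result> output are assumptions made for this illustration; in the real role the input and output names come from the metaclass and the calculation lives in the spreadsheet class.

```perl
use strict;
use warnings;

# Parse "a:2,b:3" into (a => 2, b => 3); fields without a colon are skipped.
sub parse_input_line {
    my ($line) = @_;
    chomp $line;
    my %result;
    for my $field (split /\s*,\s*/, $line) {
        $result{$1} = $2 if $field =~ /^([^:]+):(.+)$/;
    }
    return %result;
}

# Render outputs in the same "name:value," form that the run method prints.
sub format_output_line {
    my ($outputs, $values) = @_;
    return join '', map { "$_:$values->{$_}," } @$outputs;
}

# Feed every input line through $calc, which stands in for the spreadsheet
# class: it receives the per-run parameters merged with the per-row inputs.
sub run_rows {
    my ($params, $calc, @lines) = @_;
    return map {
        my %inputs = parse_input_line($_);
        format_output_line( ['result'], { result => $calc->( %$params, %inputs ) } );
    } @lines;
}

my @out = run_rows(
    { offset => 0 },
    sub { my %v = @_; return $v{offset} + $v{a} + $v{b} },
    "a:0,b:0\n", "a:2,b:3\n",
);
print "$_\n" for @out;
```

With an offset of 0 this prints C<result:0,> and C<result:5,> on separate lines. Each row is computed from a fresh hash of values, which mirrors the reason the role clones the object per row instead of mutating it.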
Now we can write a spreadsheet class like:

    class Add {
        use Spreadsheet;
        has 'offset' => ( is => 'ro', isa => Num, required => 1 );
        has_input  'a'      => ( isa => 'Num' );
        has_input  'b'      => ( isa => 'Num' );
        has_output 'result' => ( isa => 'Num' );
        method _build_result { $self->offset + $self->a + $self->b }
    }

Then we can create an input file C<in>:

    lang:undef
    a:0,b:0
    a:2,b:3

And run the spreadsheet from the command line:

    $ mx-run -Ilib Add --offset 0 --input_file in

That runs, and produces an output file that looks like:

    result:0,
    result:5,

Now when you want to add another input, output, parameter, or an entirely new spreadsheet, there is almost no code to write. And, of course, you can still use your spreadsheet objects as normal classes:

    lang:Perl
    my $add = Add->new( offset => 1, a => 1, b => 2 );
    say "1 + 1 + 2 is ", $add->result;

So, I hope this helps demonstrate how Moose and your own custom domain-specific languages can make it easier for you to write correct programs that are easy for humans to understand!

(Next week, we'll talk about the implementation of another module that uses even more MetaRole features.)

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)

iEYEARECAAYFAksBTV8ACgkQ2rw+dVvzZm3wtwCeP6uO0BihnbYSwcVozq8Frg73
zFEAnjEg2H4DIkYGO2kkpG6iyqCE3QeX
=3MFj
-----END PGP SIGNATURE-----