Creating (and Maintaining) Perl Modules
Goals
The goal of this web page is to help you write easily maintainable and re-usable code. In Perl, re-usability is implemented through modules, which are similar to libraries in other languages.
This page will guide you through creating your module and documenting it, as well as giving you some tips on how to make your code as maintainable and re-usable as possible.
Creating Perl Modules
Perl modules are those files that end in .pm. If you do things right, you can make the process of writing, testing, and installing your module really slick. You'll also be able to easily bundle up your module for testing and installation on other machines, or uploading to CPAN.
Here are the steps in creating a module:
* Create a place to develop your module
* Create skeleton files for your module
* Document your module
* Write some Perl code
* Write some tests for your code
* Install the module
* Tips
Create a place to develop your module
The simplest way to do this is to create one directory per module. Give this directory any name that clearly identifies the module that it contains. See the Math::BaseCalc module for a simple example, or the Apache::Filter modules for a more involved one.
Create skeleton files for your module
Perl is distributed with a program called h2xs. This program, while initially intended to help programmers implement C extensions to Perl, can also be used to generate skeleton files for a new module.
Let's create a module called NewModule.pm that doesn't do very much. I'll run the h2xs program:
[~/modules],2:05pm% h2xs -AXc -n NewModule
Writing NewModule/NewModule.pm
Writing NewModule/Makefile.PL
Writing NewModule/test.pl
Writing NewModule/Changes
Writing NewModule/MANIFEST
[~/modules],2:05pm% cd NewModule/
[~/modules/NewModule],2:05pm% ls
Changes MANIFEST Makefile.PL NewModule.pm test.pl
The Changes file is where you might keep track of the changes you make to your module as you write new versions. If you're using RCS or CVS version control, you shouldn't maintain the Changes file by hand, since all your history and logs will be in revision control, where they are much more reliable (you are adding detailed revision notes in version control, aren't you?). I've found that the best scheme is to build the Changes file automatically from the revision control history, but your preferences might vary.
MANIFEST contains a list of files in this directory. If you add new files to the directory, you should also add them to the MANIFEST. The MANIFEST is used to create a tarball of your module for distribution, and it's also checked when people unpack the tarball and install the module.
Makefile.PL is a Perl program used to create a Unix Makefile. You'll use this Makefile to test and install your module.
NewModule.pm is your module. You'll write the code here in the next step.
test.pl is a Perl program that tests your module. You don't run it directly; instead, you type "make test" at a Unix prompt and make runs it for you. We'll develop this test suite a little later.
Document your module
One of the great things about Perl modules is that they can have their documentation right in the same file. Once this module is installed, its documentation can be read by typing "perldoc NewModule" at a Unix prompt. Keeping the code and documentation together is a great thing, since it means you'll always have the most recent documentation if you've got the most recent code.
Here is some sample documentation.
=head1 NAME
NewModule - Perl module for hooting
=head1 SYNOPSIS
use NewModule;
my $hootie = new NewModule;
$hootie->verbose(1);
$hootie->hoot; # Hoots
$hootie->verbose(0);
$hootie->hoot; # Doesn't hoot
=head1 DESCRIPTION
This module hoots when it's verbose, and doesn't do anything
when it's not verbose.
=head2 Methods
=over 4
=item * $object->verbose(true or false)
=item * $object->verbose()
Returns the value of the 'verbose' property. When called with an
argument, it also sets the value of the property. Use a true or false
Perl value, such as 1 or 0.
=item * $object->hoot()
Returns a hoot if we're supposed to be verbose. Otherwise it returns
nothing.
=back
=head1 AUTHOR
Ken Williams (ken@mathforum.org)
=head1 COPYRIGHT
Copyright 1998 Swarthmore College. All rights reserved.
This library is free software; you can redistribute it and/or
modify it under the same terms as Perl itself.
=head1 SEE ALSO
perl(1).
=cut
When you create the module using h2xs it will create several sections for you automatically. They are:
* NAME - The name of the module and a very short description of what it does.
* SYNOPSIS - This should be a few lines of example code demonstrating how to use the major functions/methods of your module.
* DESCRIPTION - This should be some prose text describing what your module is for.
* AUTHOR - You.
* SEE ALSO - This should point the person reading your docs to other documentation that may be useful (docs for other Modules, C library docs, etc.)
One other critical section that you should create is FUNCTIONS or METHODS (depending on whether your code is function-based or object-oriented). This section should list every single function or method intended for public use. At the very minimum, these descriptions should list the parameters each function/method takes and the return values it can give back.
Feel free to expand your documentation beyond these sections. Make sure to note any areas where your module does something that might go against someone else's assumption. Also make sure to mention limitations of the module that might not be obvious without looking at the code.
Your documentation is complete when someone can use your module without ever having to look at its code. This is very important. This makes it possible for you to separate your module's documented interface from its internal implementation (guts). This is good because it means that you are free to change the module's internals as long as the interface remains the same.
POD (Plain Old Documentation)
The perldoc program expects your documentation to be in POD format. The POD format has a few (very few) tags that you use to mark up plain text. As an aside, the Perl compiler ignores POD commands, so they can be used for extended comments inside your code.
Here is a list of some of the tags, with some HTML tags that are similar in spirit:
POD tag | HTML equivalent | Description
=head1 | <H1> | Primary heading.
=head2 | <H2> | Secondary heading.
=over N | <UL> or <OL> | Indent N spaces until it finds a =back tag. The convention is generally to indent in multiples of 4, so you see =over 4 a lot.
=back | </UL> or </OL> | Indicates that you are done with indenting.
=item | <LI> | Indicates a list item. The convention is to use =over to begin a list, and =back to end it. Generally you do =item *, which puts bullets in front of each list item.
=cut | (none) | Indicates the end of a POD section.
For more information on POD, type perldoc perlpod at a UNIX prompt. There's not much to POD, and it will behoove you to know it inside & out.
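As a small illustration of the aside about extended comments, here is a minimal sketch (the variable name $total is made up for the example). Perl skips everything from the =pod line through the =cut line, so the assignment inside the block is documentation and never runs.

```perl
use strict;
use warnings;

my $total = 0;

=pod

Perl skips this whole block, so it can hold notes that are too long
for ordinary # comments. The next line is documentation, not code:

    $total = 999;

=cut

$total += 1;
print "total is $total\n";
```

A POD command such as =pod has to start in the first column of a line, at a point where Perl would otherwise expect a new statement.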
Write some Perl code
What you've got now is a documented, fully functional Perl module that doesn't do anything. We've got to write some code in NewModule.pm to make it do something. This code should implement the interface defined in the documentation we just wrote.
The NewModule.pm file will have this in it already:
package NewModule;
use strict;
use vars qw($VERSION @ISA @EXPORT @EXPORT_OK);
require Exporter;
@ISA = qw(Exporter AutoLoader);
# Items to export into callers namespace by default. Note: do not export
# names by default without a very good reason. Use EXPORT_OK instead.
# Do not simply export all your public functions/methods/constants.
@EXPORT = qw(
);
$VERSION = '0.01';
# Preloaded methods go here.
# Autoload methods go after =cut, and are processed by the autosplit program.
1;
__END__
Here is a line by line explanation of what this means:
*
use strict;
This turns on the strict pragma. It means that you must declare all your variables (with "my") before using them, that you are not allowed to use soft references (also known as symbolic or dynamic references), and that barewords are not silently treated as strings, so every subroutine call must be preceded by an ampersand (&) or followed by parentheses unless the subroutine has already been declared.
This is crucial to writing maintainable code for the following reasons:
1. It forces you either to declare all your global variables in advance or to use lexically scoped variables (the preferred choice). This leads to cleaner code with cleaner function interfaces and fewer hidden dependencies.
2. It will help you track down typos in variable names because it will complain about undeclared variables when you make a typo.
3. It doesn't allow you to use soft references, which are almost never needed and are a maintenance nightmare.
It can also make your code run a bit faster, because lexical (my) variables are resolved at compile time, whereas soft references have to be looked up by name at run time.
*
use vars qw($VERSION @ISA @EXPORT @EXPORT_OK);
The vars pragma is a way of declaring your global variables. Any variable listed here is available throughout the entire package scope.
*
require Exporter;
This loads the Exporter module, which allows you to export variables and subroutines into the calling package's namespace when the module is loaded with the use statement.
*
@ISA = qw(Exporter AutoLoader);
By placing a module name in the @ISA array, we are saying that the current package is a subclass (in the object-oriented sense) of that package. In practical terms, this means that if we try to use a method NewModule->foo and the &foo subroutine is not found in the NewModule package, then Perl will also check for &Exporter::foo and &AutoLoader::foo. The search is depth-first and left to right: Perl checks Exporter, then Exporter's parents and so on up the tree, before moving on to AutoLoader and its parents.
In general, you can remove the references to AutoLoader as you probably won't use this. If you are writing an object-oriented module (a class) then you should remove the Exporter-related code as well.
*
@EXPORT = qw(
);
Any variables or subroutines listed here will automatically be placed into the calling package's namespace. It is important to document these in order to prevent namespace conflicts.
*
$VERSION = '0.01';
All modules should have a version number. This helps with version control. In addition, it allows you to do:
use My::Module 1.21;
This will cause the compiler to die if it cannot find at least version 1.21 of the module. It checks the version by looking at the $VERSION variable.
If you're using RCS or CVS and you want to synchronize $VERSION with the revision numbers, you can do the following for the $VERSION scalar:
$VERSION = sprintf "%d.%03d", q$Revision: 1.6 $ =~ /: (\d+)\.(\d+)/;
This must all be on one line. The Revision tag will be kept up to date by the RCS programs. Perl in turn will use the regex to extract the major and minor numbers, which are then formatted by sprintf. In this case $VERSION will be set to '1.006'.
*
The references to preloaded and autoloaded methods are only relevant if you are using the AutoLoader module. Don't use the AutoLoader module unless you know what you're doing; it's generally not worth the trouble.
*
1;
When a module is loaded (via use) the compiler will complain unless the last statement executed when it is loaded is true. This line ensures that this is the case (as long as you don't place any code after this line). It's Perl's way of making sure that it successfully parsed all the way to the end of the file.
*
__END__
Anything after this token is ignored by the compiler. This is generally where you will place your documentation.
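To make the use strict point above concrete, here is a minimal sketch of strict catching a misspelled variable name. The names $counter and $countr are made up for the example, and the faulty code is kept in a string so that the script itself still compiles.

```perl
use strict;
use warnings;

# The snippet is kept in a string so that this script itself compiles;
# eval then compiles it at run time, with strict in effect.
my $snippet = <<'END_SNIPPET';
use strict;
my $counter = 1;
$countr = $counter + 1;    # typo: $countr was never declared
END_SNIPPET

my $compiled = eval "$snippet; 1";
my $error    = $@;

print $compiled ? "no error\n" : "strict refused to compile it:\n$error";
```

Without use strict, the same snippet would compile and quietly create a new global variable named $countr, which is exactly the kind of bug that is hard to find later.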
Let's create some sample code. Don't worry about what this code does or how it works. We're mostly concerned with having a few methods so we can demonstrate how to document a module. For reference, this code is using the Object Oriented Perl syntax and features that became available with Perl 5.
package NewModule;
use strict;
use vars qw($VERSION);
$VERSION = '0.01';
sub new {
my $package = shift;
return bless({}, $package);
}
sub verbose {
my $self = shift;
if (@_) {
$self->{'verbose'} = shift;
}
return $self->{'verbose'};
}
sub hoot {
my $self = shift;
return "Don't pollute!" if $self->{'verbose'};
return;
}
1;
__END__
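To see the documented interface in action, here is a minimal sketch that exercises the three methods. It assumes the package is defined inline (copied from the listing above) so that the example runs as a single file; with the module installed you would write "use NewModule;" instead. The variable names $loud and $quiet are made up for the example.

```perl
use strict;
use warnings;

# NewModule is defined inline here only so this sketch is self-contained.
package NewModule;

sub new {
    my $package = shift;
    return bless({}, $package);
}

sub verbose {
    my $self = shift;
    if (@_) {
        $self->{'verbose'} = shift;
    }
    return $self->{'verbose'};
}

sub hoot {
    my $self = shift;
    return "Don't pollute!" if $self->{'verbose'};
    return;
}

package main;

my $hootie = NewModule->new;

$hootie->verbose(1);
my $loud = $hootie->hoot;     # the hoot message

$hootie->verbose(0);
my $quiet = $hootie->hoot;    # undef: hoot returns nothing when not verbose

print "verbose on:  ", (defined $loud  ? $loud  : "(nothing)"), "\n";
print "verbose off: ", (defined $quiet ? $quiet : "(nothing)"), "\n";
```

Note that a bare "return;" yields undef in scalar context and an empty list in list context, which is why the second call is checked with defined.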
Write some tests for your code
One of the benefits of developing modules this way is that you can maintain a list of tests for your code that make sure it's working properly. This is what the test.pl file is for. Let's put a couple of tests at the end of the file - here is the complete file now:
# Before `make install' is performed this script should be runnable with
# `make test'. After `make install' it should work as `perl test.pl'
######################### We start with some black magic to print on failure.
# Change 1..1 below to 1..last_test_to_print .
# (It may become useful if the test is moved to ./t subdirectory.)
BEGIN { $| = 1; print "1..1\n"; }
END {print "not ok 1\n" unless $loaded;}
use NewModule;
$loaded = 1;
print "ok 1\n";
######################### End of black magic.
# Insert your test code below (better if it prints "ok 13"
# (correspondingly "not ok 13") depending on the success of chunk 13
# of the test code):
# Test 2:
my $obj = new NewModule;
$obj->verbose(1);
my $result = $obj->hoot;
print ($result eq "Don't pollute!" ? "ok 2\n" : "not ok 2\n");
# Test 3:
$obj->verbose(0);
$result = $obj->hoot;   # reuse $result; a second "my" would redeclare it
print (!defined($result) ? "ok 3\n" : "not ok 3\n");   # hoot returns undef here
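The first test was created by h2xs in step one. It just makes sure we can load NewModule.pm in the first place. The second and third tests, written by the programmer, check that the hoot method returns the right things. Strictly speaking, the plan line printed in the BEGIN block should now read 1..3 rather than 1..1, as the generated comment suggests, so that a test harness knows to expect three results.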
These tests should completely exercise every function/method of the entire module, as exhaustively as possible. This script should be the regression test for your module. Every time you make a change to the module's implementation, you can test it against this script to make sure that nothing is broken. It also lets you determine whether your code will work on different platforms.
While this is a significant time commitment for a large module, it also has a big payoff. Whenever a change is made to the module, you can find out very quickly whether or not the existing functionality has been changed. And when a bug gets reported, the first thing you can do is add a test to test.pl that exhibits the bug. Once you fix the bug, the test will keep it from ever creeping back in unnoticed.
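The hand-printed "ok N" lines above can also be produced by the Test::More module, which ships with later versions of Perl. This is a different technique from the test.pl that h2xs generates, shown here only as a sketch. NewModule is again defined inline so the example runs as a single file, and the test descriptions are made up.

```perl
use strict;
use warnings;
use Test::More tests => 3;    # the plan: exactly three tests

# NewModule is defined inline here only so this sketch is self-contained.
package NewModule;

sub new { my $package = shift; return bless({}, $package); }

sub verbose {
    my $self = shift;
    $self->{'verbose'} = shift if @_;
    return $self->{'verbose'};
}

sub hoot {
    my $self = shift;
    return "Don't pollute!" if $self->{'verbose'};
    return;
}

package main;

my $obj = NewModule->new;
isa_ok($obj, 'NewModule');

$obj->verbose(1);
is(scalar $obj->hoot, "Don't pollute!", 'hoot returns the message when verbose');

# scalar() matters here: in list context a bare return gives an empty
# list, which would shift the remaining arguments to is().
$obj->verbose(0);
is(scalar $obj->hoot, undef, 'hoot returns nothing when not verbose');
```

Test::More numbers the tests and checks the count against the plan for you, so adding a regression test for a newly reported bug is a one-line change plus a bump of the plan.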
Install the module
Now we've got everything written, we can try installing the module. The general procedure for installing any Perl module is:
perl Makefile.PL
make
make test
make install
Let's try it now.
[~/modules/NewModule],6:07pm% ls
Changes MANIFEST Makefile.PL NewModule.pm README test.pl
[~/modules/NewModule],6:07pm% perl Makefile.PL
Checking if your kit is complete...
Looks good
Writing Makefile for NewModule
[~/modules/NewModule],6:09pm% make
mkdir ./blib
mkdir ./blib/lib
mkdir ./blib/arch
mkdir ./blib/arch/auto
mkdir ./blib/arch/auto/NewModule
mkdir ./blib/lib/auto
mkdir ./blib/lib/auto/NewModule
mkdir ./blib/man3
cp NewModule.pm ./blib/lib/NewModule.pm
Manifying ./blib/man3/NewModule.3
[~/modules/NewModule],6:09pm% make test
PERL_DL_NONLAZY=1 /usr/local/bin/perl -I./blib/arch -I./blib/lib -I/usr/local/lib/perl5/alpha-dec_osf/5.00404 -I/usr/local/lib/perl5 test.pl
1..1
ok 1
ok 2
ok 3
[~/modules/NewModule],6:10pm% su
s/key 1111 aa11111
Password:
[forum]:/home/ken/modules/NewModule# make install
Installing /usr/local/lib/perl5/site_perl/./NewModule.pm
Installing /usr/local/lib/perl5/man/man3/./NewModule.3
Writing /usr/local/lib/perl5/site_perl/alpha-dec_osf/auto/NewModule/.packlist
Appending installation info to /usr/local/lib/perl5/alpha-dec_osf/5.00404/perllocal.pod
Notice that I had to become root to install the module globally. Installation involves copying files into the Perl library directory, which most people don't have permission to copy into. Since this isn't a very useful module, I installed it and then immediately uninstalled it by deleting the first three files mentioned in the "make install" step.
If you want to install the module in some non-standard location (like /foo/bar/lib), you give a LIB directive in the Makefile.PL step, for example perl Makefile.PL LIB=/foo/bar/lib. This can also be put inside Makefile.PL if you're writing a module that will only be used locally and has a specific installation location that you want to enforce.
Tips
* Write the documentation for a module first, before writing any code. Discuss the module with other people first, before writing any code. Plan the module first, before writing any code.
It's easy to come up with a solution to a problem. It takes planning to come up with a good solution. Remember: the documentation, not the code, defines what a module does.
* Every module should have a purpose. There's a proliferation of modules with names like "perlutils.pm", "rcs_utils.pm", and "utilUtils.pm" that have no obvious purpose, and it's difficult to know what each does. This leads to confusion and duplication of code.
* A note about naming conventions. The Perl standard for module names is that all modules start with a capital letter. Names starting with a lower case letter are reserved for pragmas. In addition, it is not a bad thing to have your name include two colons, as in the name Text::Wrap. When this module is installed, it will be placed in a directory named Text under the root library directory. The module code itself will be in a file called Wrap.pm. This helps keep the library directory more organized. In addition, the :: naming convention can also indicate class hierarchies, although it does not have to.
* If you're writing an object-oriented module, don't use the Exporter stuff. Exporter lets someone "use" your module and then use some of your module's functions without fully package-qualifying those functions. See the DataExchange module for an example, and read the Exporter documentation for the details.
* The AutoLoader stuff lets your module compile certain subroutines only as they're needed. It can make programs that use your module faster if they only use small parts of your module. It's usually not necessary to use this, but if you're writing a really huge module (like CGI.pm) it might be worth it. Read the AutoLoader docs for the details. In general, avoid it; it's not worth the headache.
* You can type "make realclean" to get rid of temporary files and directories that are created during the testing and installation process.
* You can type "make dist" to create a file suitable for uploading to CPAN or giving to your friends at the mall.
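Returning to the Exporter tip above, here is a minimal sketch of a function-based module that exports a function only on request, through @EXPORT_OK. The package name Hooter and the variables $short and $full are hypothetical. The package is defined inline so the example runs as a single file, which is why import is called by hand instead of writing "use Hooter qw(hoot);".

```perl
use strict;
use warnings;

# Hypothetical function-based module; in practice this would be Hooter.pm.
package Hooter;
use vars qw(@ISA @EXPORT_OK);
require Exporter;
@ISA       = qw(Exporter);
@EXPORT_OK = qw(hoot);        # exported only when the caller asks for it

sub hoot { return "Don't pollute!" }

package main;

# For a module loaded from a file, "use Hooter qw(hoot);" does this step.
Hooter->import('hoot');

my $short = hoot();           # unqualified call, thanks to the import
my $full  = Hooter::hoot();   # the fully qualified name always works
print "$short\n$full\n";
```

Listing names in @EXPORT_OK rather than @EXPORT keeps the caller's namespace clean unless the caller explicitly opts in.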