PERLHACK(1)            Perl Programmers Reference Guide            PERLHACK(1)



NAME
       perlhack - How to hack at the Perl internals

DESCRIPTION
       This document attempts to explain how Perl development takes place, and
       ends with some suggestions for people wanting to become bona fide
       porters.

       The perl5-porters mailing list is where the Perl standard distribution
       is maintained and developed.  The list can get anywhere from 10 to 150
       messages a day, depending on the heatedness of the debate.  Most days
       there are two or three patches, extensions, features, or bugs being
       discussed at a time.

       A searchable archive of the list is at either:

           http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/

       or

           http://archive.develooper.com/perl5-porters@perl.org/

       List subscribers (the porters themselves) come in several flavours.
       Some are quiet curious lurkers, who rarely pitch in and instead watch
       the ongoing development to ensure they're forewarned of new changes or
       features in Perl.  Some are representatives of vendors, who are there
       to make sure that Perl continues to compile and work on their plat-
       forms.  Some patch any reported bug that they know how to fix, some are
       actively patching their pet area (threads, Win32, the regexp engine),
       while others seem to do nothing but complain.  In other words, it's
       your usual mix of technical people.

       Over this group of porters presides Larry Wall.  He has the final word
       in what does and does not change in the Perl language.  Various
       releases of Perl are shepherded by a ``pumpking'', a porter responsible
       for gathering patches and deciding, on a patch-by-patch and feature-by-
       feature basis, what will and will not go into the release.  For
       instance, Gurusamy Sarathy was the pumpking for the 5.6 release of
       Perl, Jarkko Hietaniemi was the pumpking for the 5.8 release, and Hugo
       van der Sanden and Rafael Garcia-Suarez share the pumpking role for the
       5.10 release.

       In addition, various people are pumpkings for different things.  For
       instance, Andy Dougherty and Jarkko Hietaniemi did a grand job as the
       Configure pumpkin up till the 5.8 release. For the 5.10 release H.Mer-
       ijn Brand took over.

       Larry sees Perl development along the lines of the US government:
       there's the Legislature (the porters), the Executive branch (the pump-
       kings), and the Supreme Court (Larry).  The legislature can discuss and
       submit patches to the executive branch all they like, but the executive
       branch is free to veto them.  Rarely, the Supreme Court will side with
       the executive branch over the legislature, or the legislature over the
       executive branch.  Mostly, however, the legislature and the executive
       branch are supposed to get along and work out their differences without
       impeachment or court cases.

       You might sometimes see reference to Rule 1 and Rule 2.  Larry's power
       as Supreme Court is expressed in The Rules:

       1   Larry is always by definition right about how Perl should behave.
           This means he has final veto power on the core functionality.

       2   Larry is allowed to change his mind about any matter at a later
           date, regardless of whether he previously invoked Rule 1.

       Got that?  Larry is always right, even when he was wrong.  It's rare to
       see either Rule exercised, but they are often alluded to.

       New features and extensions to the language are contentious, because
       the criteria used by the pumpkings, Larry, and other porters to decide
       which features should be implemented and incorporated are not codified
       in a few small design goals as with some other languages.  Instead, the
       heuristics are flexible and often difficult to fathom.  Here is one
       person's list, roughly in decreasing order of importance, of heuristics
       that new features have to be weighed against:

       Does the concept match the general goals of Perl?
           These haven't been written anywhere in stone, but one approximation
           is:

            1. Keep it fast, simple, and useful.
            2. Keep features/concepts as orthogonal as possible.
            3. No arbitrary limits (platforms, data sizes, cultures).
            4. Keep it open and exciting to use/patch/advocate Perl everywhere.
            5. Either assimilate new technologies, or build bridges to them.

       Where is the implementation?
           All the talk in the world is useless without an implementation.  In
           almost every case, the person or people who argue for a new feature
           will be expected to be the ones who implement it.  Porters capable
           of coding new features have their own agendas, and are not avail-
           able to implement your (possibly good) idea.

       Backwards compatibility
           It's a cardinal sin to break existing Perl programs.  New warnings
           are contentious--some say that a program that emits warnings is not
           broken, while others say it is.  Adding keywords has the potential
           to break programs, and changing the meaning of existing token
           sequences or functions might break programs too.

       Could it be a module instead?
           Perl 5 has extension mechanisms, modules and XS, specifically to
           avoid the need to keep changing the Perl interpreter.  You can
           write modules that export functions, you can give those functions
           prototypes so they can be called like built-in functions, you can
           even write XS code to mess with the runtime data structures of the
           Perl interpreter if you want to implement really complicated
           things.  If it can be done in a module instead of in the core, it's
           highly unlikely to be added.

       Is the feature generic enough?
           Is this something that only the submitter wants added to the lan-
           guage, or would it be broadly useful?  Sometimes, instead of adding
           a feature with a tight focus, the porters might decide to wait
           until someone implements the more generalized feature.  For
           instance, instead of implementing a ``delayed evaluation'' feature,
           the porters are waiting for a macro system that would permit
           delayed evaluation and much more.

       Does it potentially introduce new bugs?
           Radical rewrites of large chunks of the Perl interpreter have the
           potential to introduce new bugs.  The smaller and more localized
           the change, the better.

       Does it preclude other desirable features?
           A patch is likely to be rejected if it closes off future avenues of
           development.  For instance, a patch that placed a true and final
           interpretation on prototypes is likely to be rejected because there
           are still options for the future of prototypes that haven't been
           addressed.

       Is the implementation robust?
           Good patches (tight code, complete, correct) stand more chance of
           going in.  Sloppy or incorrect patches might be placed on the back
           burner until the pumpking has time to fix them, or might be
           discarded altogether without further notice.

       Is the implementation generic enough to be portable?
           The worst patches make use of system-specific features.  It's
           highly unlikely that nonportable additions to the Perl language
           will be accepted.

       Is the implementation tested?
           Patches which change behaviour (fixing bugs or introducing new fea-
           tures) must include regression tests to verify that everything
           works as expected.  Without tests provided by the original author,
           how can anyone else changing perl in the future be sure that they
           haven't unwittingly broken the behaviour the patch implements? And
           without tests, how can the patch's author be confident that his/her
           hard work put into the patch won't be accidentally thrown away by
           someone in the future?

       Is there enough documentation?
           Patches without documentation are probably ill thought out or
           incomplete.  Nothing can be added without documentation, so submit-
           ting a patch for the appropriate manpages as well as the source
           code is always a good idea.

       Is there another way to do it?
           Larry said ``Although the Perl Slogan is There's More Than One Way
           to Do It, I hesitate to make 10 ways to do something''.  This is a
           tricky heuristic to navigate, though--one man's essential addition
           is another man's pointless cruft.

       Does it create too much work?
           Work for the pumpking, work for Perl programmers, work for module
           authors, ...  Perl is supposed to be easy.

       Patches speak louder than words
           Working code is always preferred to pie-in-the-sky ideas.  A patch
           to add a feature stands a much higher chance of making it to the
           language than does a random feature request, no matter how fer-
           vently argued the request might be.  This ties into ``Will it be
           useful?'', as the fact that someone took the time to make the patch
           demonstrates a strong desire for the feature.

       If you're on the list, you might hear the word ``core'' bandied around.
       It refers to the standard distribution.  ``Hacking on the core'' means
       you're changing the C source code to the Perl interpreter.  ``A core
       module'' is one that ships with Perl.

       Keeping in sync

       The source code to the Perl interpreter, in its different versions, is
       kept in a repository managed by a revision control system (currently
       the Perforce program, see http://perforce.com/).  The pumpkings and a
       few others have access to the repository to check in changes.
       Periodically the pumpking for the development version of Perl will
       release a new version, so the rest of the porters can see what's
       changed.  The current state of the main trunk of the repository, and
       patches that describe the individual changes that have happened since
       the last public release, are available at these locations:

           http://public.activestate.com/gsar/APC/
           ftp://ftp.linux.activestate.com/pub/staff/gsar/APC/

       If you're looking for a particular change, or a change that affected a
       particular set of files, you may find the Perl Repository Browser use-
       ful:

           http://public.activestate.com/cgi-bin/perlbrowse

       You may also want to subscribe to the perl5-changes mailing list to
       receive a copy of each patch that gets submitted to the maintenance and
       development "branches" of the perl repository.  See
       http://lists.perl.org/ for subscription information.

       If you are a member of the perl5-porters mailing list, it is a good
       thing to keep in touch with the most recent changes, if only to verify
       that what you would have posted as a bug report isn't already solved
       in the most recent available perl development branch, also known as
       perl-current, bleading edge perl, bleedperl or bleadperl.

       Needless to say, the source code in perl-current is usually in a per-
       petual state of evolution.  You should expect it to be very buggy.  Do
       not use it for any purpose other than testing and development.

       Keeping in sync with the most recent branch can be done in several
       ways, but the most convenient and reliable way is using rsync, avail-
       able at ftp://rsync.samba.org/pub/rsync/ .  (You can also get the most
       recent branch by FTP.)

       If you choose to keep in sync using rsync, there are two approaches to
       doing so:

       rsync'ing the source tree
           Presuming you are in the directory where your perl source resides
           and you have rsync installed and available, you can `upgrade' to
           the bleadperl using:

            # rsync -avz rsync://ftp.linux.activestate.com/perl-current/ .

           This takes care of updating every single item in the source tree to
           the latest applied patch level, creating files that are new (to
           your distribution) and setting date/time stamps of existing files
           to reflect the bleadperl status.

           Note that this will not delete any files that were in '.' before
           the rsync. Once you are sure that the rsync is running correctly,
           run it with the --delete and the --dry-run options like this:

            # rsync -avz --delete --dry-run rsync://ftp.linux.activestate.com/perl-current/ .

           This will simulate an rsync run that also deletes files not present
           in the bleadperl master copy. Observe the results from this run
           closely. If you are sure that the actual run would delete no files
           precious to you, you could remove the '--dry-run' option.
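
           To make the dry-run output easier to review, the deletions can be
           filtered out of it.  The following is only a sketch: the helper
           name list_deletions is hypothetical, and it assumes that rsync's
           verbose output marks each file it would remove with a leading
           "deleting " (check this against your rsync version).

```shell
#!/bin/sh
# Hypothetical helper: print only the paths a --delete run would remove.
# Assumption: rsync -v prefixes each removal with "deleting ".
list_deletions() {
    sed -n 's/^deleting //p'
}

# Example use, run from the top of your perl source tree:
#   rsync -avz --delete --dry-run \
#       rsync://ftp.linux.activestate.com/perl-current/ . | list_deletions
```

           If the list is empty, or names nothing you care about, the same
           command without '--dry-run' should be safe to run.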

           You can then check which patch was the latest to be applied by
           looking in the file .patch, which will show the number of the
           latest patch.
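
           Reading the .patch file can be wrapped in a small helper, for
           instance to compare several trees.  This is a minimal sketch: the
           function name patch_level is hypothetical, and it assumes the
           patch number sits on the first line of the file.

```shell
#!/bin/sh
# Hypothetical helper: print the patch level recorded in a source tree,
# or "unknown" when the tree has no readable .patch file.
patch_level() {
    dir=${1:-.}
    if [ -r "$dir/.patch" ]; then
        # Assumption: the number is on the first line of .patch.
        head -n 1 "$dir/.patch"
    else
        echo unknown
    fi
}

# Example use:
#   patch_level ../perl-current
```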

           If you have more than one machine to keep in sync, and not all of
           them have access to the WAN (so you are not able to rsync all the
           source trees to the real source), there are some ways to get around
           this problem.

           Using rsync over the LAN
               Set up a local rsync server which makes the rsynced source tree
               available to the LAN and sync the other machines against this
               directory.

               From http://rsync.samba.org/README.html :

                  "Rsync uses rsh or ssh for communication. It does not need to be
                   setuid and requires no special privileges for installation.  It
                   does not require an inetd entry or a daemon.  You must, however,
                   have a working rsh or ssh system.  Using ssh is recommended for
                   its security features."
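
               Following the quoted README, a machine without WAN access
               could pull the tree over ssh from a LAN host that has already
               synced.  This is a sketch only: lan_sync is a hypothetical
               name, and the host and path in the usage line are
               placeholders for your own setup.

```shell
#!/bin/sh
# Hypothetical helper: pull a source tree from a LAN host over ssh into
# the current directory.  $1 is the host name, $2 the path to its tree.
lan_sync() {
    rsync -avz -e ssh "$1:$2/" .
}

# Example use (host1 and the path are placeholders):
#   cd perl-current && lan_sync host1 /pro/3gl/CPAN/perl-current
```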

           Pushing over NFS
               With the other systems mounted over NFS, you can take an
               active pushing approach by checking the just-updated tree
               against the other not-yet-synced trees.  An example would be

                 #!/usr/bin/perl -w

                 use strict;
                 use File::Copy;

                 # Map each file named in MANIFEST to its local stat data.
                 my %MF = map {
                     m/(\S+)/;
                     $1 => [ (stat $1)[2, 7, 9] ];     # mode, size, mtime
                     } `cat MANIFEST`;

                 # Each remote tree is expected to be mounted under /<host>.
                 my %remote = map { $_ => "/$_/pro/3gl/CPAN/perl-5.7.1" } qw(host1 host2);

                 foreach my $host (keys %remote) {
                     unless (-d $remote{$host}) {
                         print STDERR "Cannot Xsync for host $host\n";
                         next;
                         }
                     foreach my $file (keys %MF) {
                         my $rfile = "$remote{$host}/$file";
                         my ($mode, $size, $mtime) = (stat $rfile)[2, 7, 9];
                         defined $size or ($mode, $size, $mtime) = (0, 0, 0);
                         # Skip files whose size and mtime already match.
                         $size == $MF{$file}[1] && $mtime == $MF{$file}[2] and next;
                         printf "%4s %-34s %8d %9d  %8d %9d\n",
                             $host, $file, $MF{$file}[1], $MF{$file}[2], $size, $mtime;
                         unlink $rfile;
                         unless (copy ($file, $rfile)) {
                             warn "Cannot copy $file to $rfile: $!\n";
                             next;
                             }
                         # Give the copy the mtime and mode of the original.
                         utime time, $MF{$file}[2], $rfile;
                         chmod $MF{$file}[0], $rfile;
                         }
                     }

               though this is not perfect.  It could be improved by checking
               file checksums before updating.  Not all NFS systems support
               utime reliably (when used over NFS).
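
               The checksum comparison mentioned above could look roughly
               like this.  It is a sketch in shell rather than Perl, the
               name needs_update is hypothetical, and it uses the POSIX
               cksum utility, which is only one of several ways to compare
               file contents.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit status 0) when the destination file
# is missing or its checksum differs from the source file's checksum.
needs_update() {
    src=$1
    dst=$2
    [ -f "$dst" ] || return 0
    # cksum reading from stdin prints "<crc> <size>" with no file name.
    [ "$(cksum < "$src")" != "$(cksum < "$dst")" ]
}

# Example use (the remote path is a placeholder):
#   needs_update perl.c /host1/pro/3gl/CPAN/perl-5.7.1/perl.c &&
#       cp -p perl.c /host1/pro/3gl/CPAN/perl-5.7.1/perl.c
```

               Because the comparison reads file contents, it does not
               depend on the modification times that some NFS setups fail
               to preserve, at the cost of reading both files in full.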

       rsync'ing the patches
           The source tree is maintained by the pumpking who applies patches
           to the files in the tree. These patches are either created by the
           pumpking himself using "diff -c" after updating the file manually
           or by applying patches sent in by posters on the perl5-porters
           list.  These patches are also saved and rsync'able, so you can
           apply them yourself to the source files.

           Presuming you are in a directory where your patches reside, you can
           get them in sync with

            # rsync -avz rsync://ftp.linux.activestate.com/perl-current-diffs/ .

           This makes sure the latest available patch is downloaded to your
           patch directory.

           It's then up to you to apply these patches, using something like

            # last=`ls -t *.gz | sed q`
            # rsync -avz rsync://ftp.linux.activestate.com/perl-current-diffs/ .
            # find . -name '*.gz' -newer $last -exec gzcat {} \; >blead.patch
            # cd ../perl-current
            # patch -p1 -N <../perl-current-diffs/blead.patch

           or, since this is only a hint towards how it works, use CPAN-
           patchaperl from Andreas K



perl v5.8.6                       2004-11-05                       PERLHACK(1)