���ѧۧݧ�ӧ�� �ާ֧ߧ֧էا֧� - ���֧էѧܧ�ڧ��ӧѧ�� - /home/ukubnwwtacc0unt/chapelbellstudios.com/uploads/cover/Util.zip
���ѧ٧ѧ�
PK �b�\�;�S^t ^t FieldHash.pmnu �[��� package Hash::Util::FieldHash; use 5.009004; use strict; use warnings; use Scalar::Util qw( reftype); our $VERSION = '1.19'; require Exporter; our @ISA = qw(Exporter); our %EXPORT_TAGS = ( 'all' => [ qw( fieldhash fieldhashes idhash idhashes id id_2obj register )], ); our @EXPORT_OK = ( @{ $EXPORT_TAGS{'all'} } ); { require XSLoader; my %ob_reg; # private object registry sub _ob_reg { \ %ob_reg } XSLoader::load(); } sub fieldhash (\%) { for ( shift ) { return unless ref() && reftype( $_) eq 'HASH'; return $_ if Hash::Util::FieldHash::_fieldhash( $_, 0); return $_ if Hash::Util::FieldHash::_fieldhash( $_, 2) == 2; return; } } sub idhash (\%) { for ( shift ) { return unless ref() && reftype( $_) eq 'HASH'; return $_ if Hash::Util::FieldHash::_fieldhash( $_, 0); return $_ if Hash::Util::FieldHash::_fieldhash( $_, 1) == 1; return; } } sub fieldhashes { map &fieldhash( $_), @_ } sub idhashes { map &idhash( $_), @_ } 1; __END__ =head1 NAME Hash::Util::FieldHash - Support for Inside-Out Classes =head1 SYNOPSIS ### Create fieldhashes use Hash::Util qw(fieldhash fieldhashes); # Create a single field hash fieldhash my %foo; # Create three at once... fieldhashes \ my(%foo, %bar, %baz); # ...or any number fieldhashes @hashrefs; ### Create an idhash and register it for garbage collection use Hash::Util::FieldHash qw(idhash register); idhash my %name; my $object = \ do { my $o }; # register the idhash for garbage collection with $object register($object, \ %name); # the following entry will be deleted when $object goes out of scope $name{$object} = 'John Doe'; ### Register an ordinary hash for garbage collection use Hash::Util::FieldHash qw(id register); my %name; my $object = \ do { my $o }; # register the hash %name for garbage collection of $object's id register $object, \ %name; # the following entry will be deleted when $object goes out of scope $name{id $object} = 'John Doe'; =head1 FUNCTIONS C<Hash::Util::FieldHash> offers a number of functions in support of L<The Inside-out Technique> of class construction. =over =item id id($obj) Returns the reference address of a reference $obj. If $obj is not a reference, returns $obj. This function is a stand-in replacement for L<Scalar::Util::refaddr|Scalar::Util/refaddr>, that is, it returns the reference address of its argument as a numeric value. The only difference is that C<refaddr()> returns C<undef> when given a non-reference while C<id()> returns its argument unchanged. C<id()> also uses a caching technique that makes it faster when the id of an object is requested often, but slower if it is needed only once or twice. =item id_2obj $obj = id_2obj($id) If C<$id> is the id of a registered object (see L</register>), returns the object, otherwise an undefined value. For registered objects this is the inverse function of C<id()>. =item register register($obj) register($obj, @hashrefs) In the first form, registers an object to work with for the function C<id_2obj()>. In the second form, it additionally marks the given hashrefs down for garbage collection. This means that when the object goes out of scope, any entries in the given hashes under the key of C<id($obj)> will be deleted from the hashes. It is a fatal error to register a non-reference $obj. Any non-hashrefs among the following arguments are silently ignored. It is I<not> an error to register the same object multiple times with varying sets of hashrefs. Any hashrefs that are not registered yet will be added, others ignored. Registry also implies thread support. When a new thread is created, all references are replaced with new ones, including all objects. If a hash uses the reference address of an object as a key, that connection would be broken. With a registered object, its id will be updated in all hashes registered with it. =item idhash idhash my %hash Makes an idhash from the argument, which must be a hash. An I<idhash> works like a normal hash, except that it stringifies a I<reference used as a key> differently. A reference is stringified as if the C<id()> function had been invoked on it, that is, its reference address in decimal is used as the key. =item idhashes idhashes \ my(%hash, %gnash, %trash) idhashes \ @hashrefs Creates many idhashes from its hashref arguments. Returns those arguments that could be converted or their number in scalar context. =item fieldhash fieldhash %hash; Creates a single fieldhash. The argument must be a hash. Returns a reference to the given hash if successful, otherwise nothing. A I<fieldhash> is, in short, an idhash with auto-registry. When an object (or, indeed, any reference) is used as a fieldhash key, the fieldhash is automatically registered for garbage collection with the object, as if C<register $obj, \ %fieldhash> had been called. =item fieldhashes fieldhashes @hashrefs; Creates any number of field hashes. Arguments must be hash references. Returns the converted hashrefs in list context, their number in scalar context. =back =head1 DESCRIPTION A word on terminology: I shall use the term I<field> for a scalar piece of data that a class associates with an object. Other terms that have been used for this concept are "object variable", "(object) property", "(object) attribute" and more. Especially "attribute" has some currency among Perl programmer, but that clashes with the C<attributes> pragma. The term "field" also has some currency in this sense and doesn't seem to conflict with other Perl terminology. In Perl, an object is a blessed reference. The standard way of associating data with an object is to store the data inside the object's body, that is, the piece of data pointed to by the reference. In consequence, if two or more classes want to access an object they I<must> agree on the type of reference and also on the organization of data within the object body. Failure to agree on the type results in immediate death when the wrong method tries to access an object. Failure to agree on data organization may lead to one class trampling over the data of another. This object model leads to a tight coupling between subclasses. If one class wants to inherit from another (and both classes access object data), the classes must agree about implementation details. Inheritance can only be used among classes that are maintained together, in a single source or not. In particular, it is not possible to write general-purpose classes in this technique, classes that can advertise themselves as "Put me on your @ISA list and use my methods". If the other class has different ideas about how the object body is used, there is trouble. For reference C<Name_hash> in L</Example 1> shows the standard implementation of a simple class C<Name> in the well-known hash based way. It also demonstrates the predictable failure to construct a common subclass C<NamedFile> of C<Name> and the class C<IO::File> (whose objects I<must> be globrefs). Thus, techniques are of interest that store object data I<not> in the object body but some other place. =head2 The Inside-out Technique With I<inside-out> classes, each class declares a (typically lexical) hash for each field it wants to use. The reference address of an object is used as the hash key. By definition, the reference address is unique to each object so this guarantees a place for each field that is private to the class and unique to each object. See C<Name_id> in L</Example 1> for a simple example. In comparison to the standard implementation where the object is a hash and the fields correspond to hash keys, here the fields correspond to hashes, and the object determines the hash key. Thus the hashes appear to be turned I<inside out>. The body of an object is never examined by an inside-out class, only its reference address is used. This allows for the body of an actual object to be I<anything at all> while the object methods of the class still work as designed. This is a key feature of inside-out classes. =head2 Problems of Inside-out Inside-out classes give us freedom of inheritance, but as usual there is a price. Most obviously, there is the necessity of retrieving the reference address of an object for each data access. It's a minor inconvenience, but it does clutter the code. More important (and less obvious) is the necessity of garbage collection. When a normal object dies, anything stored in the object body is garbage-collected by perl. With inside-out objects, Perl knows nothing about the data stored in field hashes by a class, but these must be deleted when the object goes out of scope. Thus the class must provide a C<DESTROY> method to take care of that. In the presence of multiple classes it can be non-trivial to make sure that every relevant destructor is called for every object. Perl calls the first one it finds on the inheritance tree (if any) and that's it. A related issue is thread-safety. When a new thread is created, the Perl interpreter is cloned, which implies that all reference addresses in use will be replaced with new ones. Thus, if a class tries to access a field of a cloned object its (cloned) data will still be stored under the now invalid reference address of the original in the parent thread. A general C<CLONE> method must be provided to re-establish the association. =head2 Solutions C<Hash::Util::FieldHash> addresses these issues on several levels. The C<id()> function is provided in addition to the existing C<Scalar::Util::refaddr()>. Besides its short name it can be a little faster under some circumstances (and a bit slower under others). Benchmark if it matters. The working of C<id()> also allows the use of the class name as a I<generic object> as described L<further down|/"The Generic Object">. The C<id()> function is incorporated in I<id hashes> in the sense that it is called automatically on every key that is used with the hash. No explicit call is necessary. The problems of garbage collection and thread safety are both addressed by the function C<register()>. It registers an object together with any number of hashes. Registry means that when the object dies, an entry in any of the hashes under the reference address of this object will be deleted. This guarantees garbage collection in these hashes. It also means that on thread cloning the object's entries in registered hashes will be replaced with updated entries whose key is the cloned object's reference address. Thus the object-data association becomes thread-safe. Object registry is best done when the object is initialized for use with a class. That way, garbage collection and thread safety are established for every object and every field that is initialized. Finally, I<field hashes> incorporate all these functions in one package. Besides automatically calling the C<id()> function on every object used as a key, the object is registered with the field hash on first use. Classes based on field hashes are fully garbage-collected and thread safe without further measures. =head2 More Problems Another problem that occurs with inside-out classes is serialization. Since the object data is not in its usual place, standard routines like C<Storable::freeze()>, C<Storable::thaw()> and C<Data::Dumper::Dumper()> can't deal with it on their own. Both C<Data::Dumper> and C<Storable> provide the necessary hooks to make things work, but the functions or methods used by the hooks must be provided by each inside-out class. A general solution to the serialization problem would require another level of registry, one that associates I<classes> and fields. So far, the functions of C<Hash::Util::FieldHash> are unaware of any classes, which I consider a feature. Therefore C<Hash::Util::FieldHash> doesn't address the serialization problems. =head2 The Generic Object Classes based on the C<id()> function (and hence classes based on C<idhash()> and C<fieldhash()>) show a peculiar behavior in that the class name can be used like an object. Specifically, methods that set or read data associated with an object continue to work as class methods, just as if the class name were an object, distinct from all other objects, with its own data. This object may be called the I<generic object> of the class. This works because field hashes respond to keys that are not references like a normal hash would and use the string offered as the hash key. Thus, if a method is called as a class method, the field hash is presented with the class name instead of an object and blithely uses it as a key. Since the keys of real objects are decimal numbers, there is no conflict and the slot in the field hash can be used like any other. The C<id()> function behaves correspondingly with respect to non-reference arguments. Two possible uses (besides ignoring the property) come to mind. A singleton class could be implemented this using the generic object. If necessary, an C<init()> method could die or ignore calls with actual objects (references), so only the generic object will ever exist. Another use of the generic object would be as a template. It is a convenient place to store class-specific defaults for various fields to be used in actual object initialization. Usually, the feature can be entirely ignored. Calling I<object methods> as I<class methods> normally leads to an error and isn't used routinely anywhere. It may be a problem that this error isn't indicated by a class with a generic object. =head2 How to use Field Hashes Traditionally, the definition of an inside-out class contains a bare block inside which a number of lexical hashes are declared and the basic accessor methods defined, usually through C<Scalar::Util::refaddr>. Further methods may be defined outside this block. There has to be a DESTROY method and, for thread support, a CLONE method. When field hashes are used, the basic structure remains the same. Each lexical hash will be made a field hash. The call to C<refaddr> can be omitted from the accessor methods. DESTROY and CLONE methods are not necessary. If you have an existing inside-out class, simply making all hashes field hashes with no other change should make no difference. Through the calls to C<refaddr> or equivalent, the field hashes never get to see a reference and work like normal hashes. Your DESTROY (and CLONE) methods are still needed. To make the field hashes kick in, it is easiest to redefine C<refaddr> as sub refaddr { shift } instead of importing it from C<Scalar::Util>. It should now be possible to disable DESTROY and CLONE. Note that while it isn't disabled, DESTROY will be called before the garbage collection of field hashes, so it will be invoked with a functional object and will continue to function. It is not desirable to import the functions C<fieldhash> and/or C<fieldhashes> into every class that is going to use them. They are only used once to set up the class. When the class is up and running, these functions serve no more purpose. If there are only a few field hashes to declare, it is simplest to use Hash::Util::FieldHash; early and call the functions qualified: Hash::Util::FieldHash::fieldhash my %foo; Otherwise, import the functions into a convenient package like C<HUF> or, more general, C<Aux> { package Aux; use Hash::Util::FieldHash ':all'; } and call Aux::fieldhash my %foo; as needed. =head2 Garbage-Collected Hashes Garbage collection in a field hash means that entries will "spontaneously" disappear when the object that created them disappears. That must be borne in mind, especially when looping over a field hash. If anything you do inside the loop could cause an object to go out of scope, a random key may be deleted from the hash you are looping over. That can throw the loop iterator, so it's best to cache a consistent snapshot of the keys and/or values and loop over that. You will still have to check that a cached entry still exists when you get to it. Garbage collection can be confusing when keys are created in a field hash from normal scalars as well as references. Once a reference is I<used> with a field hash, the entry will be collected, even if it was later overwritten with a plain scalar key (every positive integer is a candidate). This is true even if the original entry was deleted in the meantime. In fact, deletion from a field hash, and also a test for existence constitute I<use> in this sense and create a liability to delete the entry when the reference goes out of scope. If you happen to create an entry with an identical key from a string or integer, that will be collected instead. Thus, mixed use of references and plain scalars as field hash keys is not entirely supported. =head1 EXAMPLES The examples show a very simple class that implements a I<name>, consisting of a first and last name (no middle initial). The name class has four methods: =over =item * C<init()> An object method that initializes the first and last name to its two arguments. If called as a class method, C<init()> creates an object in the given class and initializes that. =item * C<first()> Retrieve the first name =item * C<last()> Retrieve the last name =item * C<name()> Retrieve the full name, the first and last name joined by a blank. =back The examples show this class implemented with different levels of support by C<Hash::Util::FieldHash>. All supported combinations are shown. The difference between implementations is often quite small. The implementations are: =over =item * C<Name_hash> A conventional (not inside-out) implementation where an object is a hash that stores the field values, without support by C<Hash::Util::FieldHash>. This implementation doesn't allow arbitrary inheritance. =item * C<Name_id> Inside-out implementation based on the C<id()> function. It needs a C<DESTROY> method. For thread support a C<CLONE> method (not shown) would also be needed. Instead of C<Hash::Util::FieldHash::id()> the function C<Scalar::Util::refaddr> could be used with very little functional difference. This is the basic pattern of an inside-out class. =item * C<Name_idhash> Idhash-based inside-out implementation. Like C<Name_id> it needs a C<DESTROY> method and would need C<CLONE> for thread support. =item * C<Name_id_reg> Inside-out implementation based on the C<id()> function with explicit object registry. No destructor is needed and objects are thread safe. =item * C<Name_idhash_reg> Idhash-based inside-out implementation with explicit object registry. No destructor is needed and objects are thread safe. =item * C<Name_fieldhash> FieldHash-based inside-out implementation. Object registry happens automatically. No destructor is needed and objects are thread safe. =back These examples are realized in the code below, which could be copied to a file F<Example.pm>. =head2 Example 1 use strict; use warnings; { package Name_hash; # standard implementation: the # object is a hash sub init { my $obj = shift; my ($first, $last) = @_; # create an object if called as class method $obj = bless {}, $obj unless ref $obj; $obj->{ first} = $first; $obj->{ last} = $last; $obj; } sub first { shift()->{ first} } sub last { shift()->{ last} } sub name { my $n = shift; join ' ' => $n->first, $n->last; } } { package Name_id; use Hash::Util::FieldHash qw(id); my (%first, %last); sub init { my $obj = shift; my ($first, $last) = @_; # create an object if called as class method $obj = bless \ my $o, $obj unless ref $obj; $first{ id $obj} = $first; $last{ id $obj} = $last; $obj; } sub first { $first{ id shift()} } sub last { $last{ id shift()} } sub name { my $n = shift; join ' ' => $n->first, $n->last; } sub DESTROY { my $id = id shift; delete $first{ $id}; delete $last{ $id}; } } { package Name_idhash; use Hash::Util::FieldHash; Hash::Util::FieldHash::idhashes( \ my (%first, %last) ); sub init { my $obj = shift; my ($first, $last) = @_; # create an object if called as class method $obj = bless \ my $o, $obj unless ref $obj; $first{ $obj} = $first; $last{ $obj} = $last; $obj; } sub first { $first{ shift()} } sub last { $last{ shift()} } sub name { my $n = shift; join ' ' => $n->first, $n->last; } sub DESTROY { my $n = shift; delete $first{ $n}; delete $last{ $n}; } } { package Name_id_reg; use Hash::Util::FieldHash qw(id register); my (%first, %last); sub init { my $obj = shift; my ($first, $last) = @_; # create an object if called as class method $obj = bless \ my $o, $obj unless ref $obj; register( $obj, \ (%first, %last) ); $first{ id $obj} = $first; $last{ id $obj} = $last; $obj; } sub first { $first{ id shift()} } sub last { $last{ id shift()} } sub name { my $n = shift; join ' ' => $n->first, $n->last; } } { package Name_idhash_reg; use Hash::Util::FieldHash qw(register); Hash::Util::FieldHash::idhashes \ my (%first, %last); sub init { my $obj = shift; my ($first, $last) = @_; # create an object if called as class method $obj = bless \ my $o, $obj unless ref $obj; register( $obj, \ (%first, %last) ); $first{ $obj} = $first; $last{ $obj} = $last; $obj; } sub first { $first{ shift()} } sub last { $last{ shift()} } sub name { my $n = shift; join ' ' => $n->first, $n->last; } } { package Name_fieldhash; use Hash::Util::FieldHash; Hash::Util::FieldHash::fieldhashes \ my (%first, %last); sub init { my $obj = shift; my ($first, $last) = @_; # create an object if called as class method $obj = bless \ my $o, $obj unless ref $obj; $first{ $obj} = $first; $last{ $obj} = $last; $obj; } sub first { $first{ shift()} } sub last { $last{ shift()} } sub name { my $n = shift; join ' ' => $n->first, $n->last; } } 1; To exercise the various implementations the script L<below|/"Example 2"> can be used. It sets up a class C<Name> that is a mirror of one of the implementation classes C<Name_hash>, C<Name_id>, ..., C<Name_fieldhash>. That determines which implementation is run. The script first verifies the function of the C<Name> class. In the second step, the free inheritability of the implementation (or lack thereof) is demonstrated. For this purpose it constructs a class called C<NamedFile> which is a common subclass of C<Name> and the standard class C<IO::File>. This puts inheritability to the test because objects of C<IO::File> I<must> be globrefs. Objects of C<NamedFile> should behave like a file opened for reading and also support the C<name()> method. This class juncture works with exception of the C<Name_hash> implementation, where object initialization fails because of the incompatibility of object bodies. =head2 Example 2 use strict; use warnings; $| = 1; use Example; { package Name; use parent 'Name_id'; # define here which implementation to run } # Verify that the base package works my $n = Name->init(qw(Albert Einstein)); print $n->name, "\n"; print "\n"; # Create a named file handle (See definition below) my $nf = NamedFile->init(qw(/tmp/x Filomena File)); # use as a file handle... for ( 1 .. 3 ) { my $l = <$nf>; print "line $_: $l"; } # ...and as a Name object print "...brought to you by ", $nf->name, "\n"; exit; # Definition of NamedFile package NamedFile; use parent 'Name'; use parent 'IO::File'; sub init { my $obj = shift; my ($file, $first, $last) = @_; $obj = $obj->IO::File::new() unless ref $obj; $obj->open($file) or die "Can't read '$file': $!"; $obj->Name::init($first, $last); } __END__ =head1 GUTS To make C<Hash::Util::FieldHash> work, there were two changes to F<perl> itself. C<PERL_MAGIC_uvar> was made available for hashes, and weak references now call uvar C<get> magic after a weakref has been cleared. The first feature is used to make field hashes intercept their keys upon access. The second one triggers garbage collection. =head2 The C<PERL_MAGIC_uvar> interface for hashes C<PERL_MAGIC_uvar> I<get> magic is called from C<hv_fetch_common> and C<hv_delete_common> through the function C<hv_magic_uvar_xkey>, which defines the interface. The call happens for hashes with "uvar" magic if the C<ufuncs> structure has equal values in the C<uf_val> and C<uf_set> fields. Hashes are unaffected if (and as long as) these fields hold different values. Upon the call, the C<mg_obj> field will hold the hash key to be accessed. Upon return, the C<SV*> value in C<mg_obj> will be used in place of the original key in the hash access. The integer index value in the first parameter will be the C<action> value from C<hv_fetch_common>, or -1 if the call is from C<hv_delete_common>. This is a template for a function suitable for the C<uf_val> field in a C<ufuncs> structure for this call. The C<uf_set> and C<uf_index> fields are irrelevant. IV watch_key(pTHX_ IV action, SV* field) { MAGIC* mg = mg_find(field, PERL_MAGIC_uvar); SV* keysv = mg->mg_obj; /* Do whatever you need to. If you decide to supply a different key newkey, return it like this */ sv_2mortal(newkey); mg->mg_obj = newkey; return 0; } =head2 Weakrefs call uvar magic When a weak reference is stored in an C<SV> that has "uvar" magic, C<set> magic is called after the reference has gone stale. This hook can be used to trigger further garbage-collection activities associated with the referenced object. =head2 How field hashes work The three features of key hashes, I<key replacement>, I<thread support>, and I<garbage collection> are supported by a data structure called the I<object registry>. This is a private hash where every object is stored. An "object" in this sense is any reference (blessed or unblessed) that has been used as a field hash key. The object registry keeps track of references that have been used as field hash keys. The keys are generated from the reference address like in a field hash (though the registry isn't a field hash). Each value is a weak copy of the original reference, stored in an C<SV> that is itself magical (C<PERL_MAGIC_uvar> again). The magical structure holds a list (another hash, really) of field hashes that the reference has been used with. When the weakref becomes stale, the magic is activated and uses the list to delete the reference from all field hashes it has been used with. After that, the entry is removed from the object registry itself. Implicitly, that frees the magic structure and the storage it has been using. Whenever a reference is used as a field hash key, the object registry is checked and a new entry is made if necessary. The field hash is then added to the list of fields this reference has used. The object registry is also used to repair a field hash after thread cloning. Here, the entire object registry is processed. For every reference found there, the field hashes it has used are visited and the entry is updated. =head2 Internal function Hash::Util::FieldHash::_fieldhash # test if %hash is a field hash my $result = _fieldhash \ %hash, 0; # make %hash a field hash my $result = _fieldhash \ %hash, 1; C<_fieldhash> is the internal function used to create field hashes. It takes two arguments, a hashref and a mode. If the mode is boolean false, the hash is not changed but tested if it is a field hash. If the hash isn't a field hash the return value is boolean false. If it is, the return value indicates the mode of field hash. When called with a boolean true mode, it turns the given hash into a field hash of this mode, returning the mode of the created field hash. C<_fieldhash> does not erase the given hash. Currently there is only one type of field hash, and only the boolean value of the mode makes a difference, but that may change. =head1 AUTHOR Anno Siegel (ANNO) wrote the xs code and the changes in perl proper Jerry Hedden (JDHEDDEN) made it faster =head1 COPYRIGHT AND LICENSE Copyright (C) 2006-2007 by (Anno Siegel) This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.7 or, at your option, any later version of Perl 5 you may have available. =cut PK �v�\#��� � Exec.pmnu �[��� package Filter::Util::Exec ; require 5.006 ; require XSLoader; our $VERSION = "1.58" ; XSLoader::load('Filter::Util::Exec'); 1 ; __END__ =head1 NAME Filter::Util::Exec - exec source filter =head1 SYNOPSIS use Filter::Util::Exec; =head1 DESCRIPTION This module is provides the interface to allow the creation of I<Source Filters> which use a Unix coprocess. See L<Filter::exec>, L<Filter::cpp> and L<Filter::sh> for examples of the use of this module. Note that the size of the buffers is limited to 32-bit. =head2 B<filter_add()> The function, C<filter_add> installs a filter. It takes one parameter which should be a reference. The kind of reference used will dictate which of the two filter types will be used. If a CODE reference is used then a I<closure filter> will be assumed. If a CODE reference is not used, a I<method filter> will be assumed. In a I<method filter>, the reference can be used to store context information. The reference will be I<blessed> into the package by C<filter_add>. See L<Filter::Util::Call> for examples of using context information using both I<method filters> and I<closure filters>. =head1 AUTHOR Paul Marquess =head1 DATE 11th December 1995. =cut PK �v�\U���6 �6 Call.pmnu �[��� # Call.pm # # Copyright (c) 1995-2011 Paul Marquess. All rights reserved. # Copyright (c) 2011-2014 Reini Urban. All rights reserved. # Copyright (c) 2014-2017 cPanel Inc. All rights reserved. # # This program is free software; you can redistribute it and/or # modify it under the same terms as Perl itself. package Filter::Util::Call ; require 5.006 ; # our require Exporter; use XSLoader (); use strict; use warnings; our @ISA = qw(Exporter); our @EXPORT = qw( filter_add filter_del filter_read filter_read_exact) ; our $VERSION = "1.58" ; our $XS_VERSION = $VERSION; $VERSION = eval $VERSION; sub filter_read_exact($) { my ($size) = @_ ; my ($left) = $size ; my ($status) ; unless ( $size > 0 ) { require Carp; Carp::croak("filter_read_exact: size parameter must be > 0"); } # try to read a block which is exactly $size bytes long while ($left and ($status = filter_read($left)) > 0) { $left = $size - length $_ ; } # EOF with pending data is a special case return 1 if $status == 0 and length $_ ; return $status ; } sub filter_add($) { my($obj) = @_ ; # Did we get a code reference? my $coderef = (ref $obj eq 'CODE'); # If the parameter isn't already a reference, make it one. if (!$coderef and (!ref($obj) or ref($obj) =~ /^ARRAY|HASH$/)) { $obj = bless (\$obj, (caller)[0]); } # finish off the installation of the filter in C. Filter::Util::Call::real_import($obj, (caller)[0], $coderef) ; } XSLoader::load('Filter::Util::Call'); 1; __END__ =head1 NAME Filter::Util::Call - Perl Source Filter Utility Module =head1 SYNOPSIS use Filter::Util::Call ; =head1 DESCRIPTION This module provides you with the framework to write I<Source Filters> in Perl. An alternate interface to Filter::Util::Call is now available. See L<Filter::Simple> for more details. A I<Perl Source Filter> is implemented as a Perl module. The structure of the module can take one of two broadly similar formats. To distinguish between them, the first will be referred to as I<method filter> and the second as I<closure filter>. Here is a skeleton for the I<method filter>: package MyFilter ; use Filter::Util::Call ; sub import { my($type, @arguments) = @_ ; filter_add([]) ; } sub filter { my($self) = @_ ; my($status) ; $status = filter_read() ; $status ; } 1 ; and this is the equivalent skeleton for the I<closure filter>: package MyFilter ; use Filter::Util::Call ; sub import { my($type, @arguments) = @_ ; filter_add( sub { my($status) ; $status = filter_read() ; $status ; } ) } 1 ; To make use of either of the two filter modules above, place the line below in a Perl source file. use MyFilter; In fact, the skeleton modules shown above are fully functional I<Source Filters>, albeit fairly useless ones. All they does is filter the source stream without modifying it at all. As you can see both modules have a broadly similar structure. They both make use of the C<Filter::Util::Call> module and both have an C<import> method. The difference between them is that the I<method filter> requires a I<filter> method, whereas the I<closure filter> gets the equivalent of a I<filter> method with the anonymous sub passed to I<filter_add>. To make proper use of the I<closure filter> shown above you need to have a good understanding of the concept of a I<closure>. See L<perlref> for more details on the mechanics of I<closures>. =head2 B<use Filter::Util::Call> The following functions are exported by C<Filter::Util::Call>: filter_add() filter_read() filter_read_exact() filter_del() =head2 B<import()> The C<import> method is used to create an instance of the filter. It is called indirectly by Perl when it encounters the C<use MyFilter> line in a source file (See L<perlfunc/import> for more details on C<import>). It will always have at least one parameter automatically passed by Perl - this corresponds to the name of the package. In the example above it will be C<"MyFilter">. Apart from the first parameter, import can accept an optional list of parameters. These can be used to pass parameters to the filter. For example: use MyFilter qw(a b c) ; will result in the C<@_> array having the following values: @_ [0] => "MyFilter" @_ [1] => "a" @_ [2] => "b" @_ [3] => "c" Before terminating, the C<import> function must explicitly install the filter by calling C<filter_add>. =head2 B<filter_add()> The function, C<filter_add>, actually installs the filter. It takes one parameter which should be a reference. The kind of reference used will dictate which of the two filter types will be used. If a CODE reference is used then a I<closure filter> will be assumed. If a CODE reference is not used, a I<method filter> will be assumed. In a I<method filter>, the reference can be used to store context information. The reference will be I<blessed> into the package by C<filter_add>, unless the reference was already blessed. See the filters at the end of this documents for examples of using context information using both I<method filters> and I<closure filters>. =head2 B<filter() and anonymous sub> Both the C<filter> method used with a I<method filter> and the anonymous sub used with a I<closure filter> is where the main processing for the filter is done. The big difference between the two types of filter is that the I<method filter> uses the object passed to the method to store any context data, whereas the I<closure filter> uses the lexical variables that are maintained by the closure. Note that the single parameter passed to the I<method filter>, C<$self>, is the same reference that was passed to C<filter_add> blessed into the filter's package. See the example filters later on for details of using C<$self>. Here is a list of the common features of the anonymous sub and the C<filter()> method. =over 5 =item B<$_> Although C<$_> doesn't actually appear explicitly in the sample filters above, it is implicitly used in a number of places. Firstly, when either C<filter> or the anonymous sub are called, a local copy of C<$_> will automatically be created. It will always contain the empty string at this point. Next, both C<filter_read> and C<filter_read_exact> will append any source data that is read to the end of C<$_>. Finally, when C<filter> or the anonymous sub are finished processing, they are expected to return the filtered source using C<$_>. This implicit use of C<$_> greatly simplifies the filter. =item B<$status> The status value that is returned by the user's C<filter> method or anonymous sub and the C<filter_read> and C<read_exact> functions take the same set of values, namely: < 0 Error = 0 EOF > 0 OK =item B<filter_read> and B<filter_read_exact> These functions are used by the filter to obtain either a line or block from the next filter in the chain or the actual source file if there aren't any other filters. The function C<filter_read> takes two forms: $status = filter_read() ; $status = filter_read($size) ; The first form is used to request a I<line>, the second requests a I<block>. In line mode, C<filter_read> will append the next source line to the end of the C<$_> scalar. In block mode, C<filter_read> will append a block of data which is <= C<$size> to the end of the C<$_> scalar. It is important to emphasise the that C<filter_read> will not necessarily read a block which is I<precisely> C<$size> bytes. If you need to be able to read a block which has an exact size, you can use the function C<filter_read_exact>. It works identically to C<filter_read> in block mode, except it will try to read a block which is exactly C<$size> bytes in length. The only circumstances when it will not return a block which is C<$size> bytes long is on EOF or error. It is I<very> important to check the value of C<$status> after I<every> call to C<filter_read> or C<filter_read_exact>. =item B<filter_del> The function, C<filter_del>, is used to disable the current filter. It does not affect the running of the filter. All it does is tell Perl not to call filter any more. See L<Example 4: Using filter_del> for details. =item I<real_import> Internal function which adds the filter, based on the L<filter_add> argument type. =item I<unimport()> May be used to disable a filter, but is rarely needed. See L<filter_del>. =back =head1 LIMITATIONS See L<perlfilter/LIMITATIONS> for an overview of the general problems filtering code in a textual line-level only. =over =item __DATA__ is ignored The content from the __DATA__ block is not filtered. This is a serious limitation, e.g. for the L<Switch> module. See L<http://search.cpan.org/perldoc?Switch#LIMITATIONS> for more. =item Max. codesize limited to 32-bit Currently internal buffer lengths are limited to 32-bit only. =back =head1 EXAMPLES Here are a few examples which illustrate the key concepts - as such most of them are of little practical use. The C<examples> sub-directory has copies of all these filters implemented both as I<method filters> and as I<closure filters>. =head2 Example 1: A simple filter. Below is a I<method filter> which is hard-wired to replace all occurrences of the string C<"Joe"> to C<"Jim">. Not particularly Useful, but it is the first example and I wanted to keep it simple. package Joe2Jim ; use Filter::Util::Call ; sub import { my($type) = @_ ; filter_add(bless []) ; } sub filter { my($self) = @_ ; my($status) ; s/Joe/Jim/g if ($status = filter_read()) > 0 ; $status ; } 1 ; Here is an example of using the filter: use Joe2Jim ; print "Where is Joe?\n" ; And this is what the script above will print: Where is Jim? =head2 Example 2: Using the context The previous example was not particularly useful. To make it more general purpose we will make use of the context data and allow any arbitrary I<from> and I<to> strings to be used. This time we will use a I<closure filter>. To reflect its enhanced role, the filter is called C<Subst>. package Subst ; use Filter::Util::Call ; use Carp ; sub import { croak("usage: use Subst qw(from to)") unless @_ == 3 ; my ($self, $from, $to) = @_ ; filter_add( sub { my ($status) ; s/$from/$to/ if ($status = filter_read()) > 0 ; $status ; }) } 1 ; and is used like this: use Subst qw(Joe Jim) ; print "Where is Joe?\n" ; =head2 Example 3: Using the context within the filter Here is a filter which a variation of the C<Joe2Jim> filter. As well as substituting all occurrences of C<"Joe"> to C<"Jim"> it keeps a count of the number of substitutions made in the context object. Once EOF is detected (C<$status> is zero) the filter will insert an extra line into the source stream. When this extra line is executed it will print a count of the number of substitutions actually made. Note that C<$status> is set to C<1> in this case. package Count ; use Filter::Util::Call ; sub filter { my ($self) = @_ ; my ($status) ; if (($status = filter_read()) > 0 ) { s/Joe/Jim/g ; ++ $$self ; } elsif ($$self >= 0) { # EOF $_ = "print q[Made ${$self} substitutions\n]" ; $status = 1 ; $$self = -1 ; } $status ; } sub import { my ($self) = @_ ; my ($count) = 0 ; filter_add(\$count) ; } 1 ; Here is a script which uses it: use Count ; print "Hello Joe\n" ; print "Where is Joe\n" ; Outputs: Hello Jim Where is Jim Made 2 substitutions =head2 Example 4: Using filter_del Another variation on a theme. This time we will modify the C<Subst> filter to allow a starting and stopping pattern to be specified as well as the I<from> and I<to> patterns. If you know the I<vi> editor, it is the equivalent of this command: :/start/,/stop/s/from/to/ When used as a filter we want to invoke it like this: use NewSubst qw(start stop from to) ; Here is the module. package NewSubst ; use Filter::Util::Call ; use Carp ; sub import { my ($self, $start, $stop, $from, $to) = @_ ; my ($found) = 0 ; croak("usage: use Subst qw(start stop from to)") unless @_ == 5 ; filter_add( sub { my ($status) ; if (($status = filter_read()) > 0) { $found = 1 if $found == 0 and /$start/ ; if ($found) { s/$from/$to/ ; filter_del() if /$stop/ ; } } $status ; } ) } 1 ; =head1 Filter::Simple If you intend using the Filter::Call functionality, I would strongly recommend that you check out Damian Conway's excellent Filter::Simple module. Damian's module provides a much cleaner interface than Filter::Util::Call. Although it doesn't allow the fine control that Filter::Util::Call does, it should be adequate for the majority of applications. It's available at http://search.cpan.org/dist/Filter-Simple/ =head1 AUTHOR Paul Marquess =head1 DATE 26th January 1996 =head1 LICENSE Copyright (c) 1995-2011 Paul Marquess. All rights reserved. Copyright (c) 2011-2014 Reini Urban. All rights reserved. Copyright (c) 2014-2017 cPanel Inc. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself. =cut PK �v�\R���W W perlfilter.podnu �[��� =head1 NAME perlfilter - Source Filters =head1 DESCRIPTION This article is about a little-known feature of Perl called I<source filters>. Source filters alter the program text of a module before Perl sees it, much as a C preprocessor alters the source text of a C program before the compiler sees it. This article tells you more about what source filters are, how they work, and how to write your own. The original purpose of source filters was to let you encrypt your program source to prevent casual piracy. This isn't all they can do, as you'll soon learn. But first, the basics. =head1 CONCEPTS Before the Perl interpreter can execute a Perl script, it must first read it from a file into memory for parsing and compilation. If that script itself includes other scripts with a C<use> or C<require> statement, then each of those scripts will have to be read from their respective files as well. Now think of each logical connection between the Perl parser and an individual file as a I<source stream>. A source stream is created when the Perl parser opens a file, it continues to exist as the source code is read into memory, and it is destroyed when Perl is finished parsing the file. If the parser encounters a C<require> or C<use> statement in a source stream, a new and distinct stream is created just for that file. The diagram below represents a single source stream, with the flow of source from a Perl script file on the left into the Perl parser on the right. This is how Perl normally operates. file -------> parser There are two important points to remember: =over 5 =item 1. Although there can be any number of source streams in existence at any given time, only one will be active. =item 2. Every source stream is associated with only one file. =back A source filter is a special kind of Perl module that intercepts and modifies a source stream before it reaches the parser. A source filter changes our diagram like this: file ----> filter ----> parser If that doesn't make much sense, consider the analogy of a command pipeline. Say you have a shell script stored in the compressed file I<trial.gz>. The simple pipeline command below runs the script without needing to create a temporary file to hold the uncompressed file. gunzip -c trial.gz | sh In this case, the data flow from the pipeline can be represented as follows: trial.gz ----> gunzip ----> sh With source filters, you can store the text of your script compressed and use a source filter to uncompress it for Perl's parser: compressed gunzip Perl program ---> source filter ---> parser =head1 USING FILTERS So how do you use a source filter in a Perl script? Above, I said that a source filter is just a special kind of module. Like all Perl modules, a source filter is invoked with a use statement. Say you want to pass your Perl source through the C preprocessor before execution. As it happens, the source filters distribution comes with a C preprocessor filter module called Filter::cpp. Below is an example program, C<cpp_test>, which makes use of this filter. Line numbers have been added to allow specific lines to be referenced easily. 1: use Filter::cpp; 2: #define TRUE 1 3: $a = TRUE; 4: print "a = $a\n"; When you execute this script, Perl creates a source stream for the file. Before the parser processes any of the lines from the file, the source stream looks like this: cpp_test ---------> parser Line 1, C<use Filter::cpp>, includes and installs the C<cpp> filter module. All source filters work this way. The use statement is compiled and executed at compile time, before any more of the file is read, and it attaches the cpp filter to the source stream behind the scenes. Now the data flow looks like this: cpp_test ----> cpp filter ----> parser As the parser reads the second and subsequent lines from the source stream, it feeds those lines through the C<cpp> source filter before processing them. The C<cpp> filter simply passes each line through the real C preprocessor. The output from the C preprocessor is then inserted back into the source stream by the filter. .-> cpp --. | | | | | <-' cpp_test ----> cpp filter ----> parser The parser then sees the following code: use Filter::cpp; $a = 1; print "a = $a\n"; Let's consider what happens when the filtered code includes another module with use: 1: use Filter::cpp; 2: #define TRUE 1 3: use Fred; 4: $a = TRUE; 5: print "a = $a\n"; The C<cpp> filter does not apply to the text of the Fred module, only to the text of the file that used it (C<cpp_test>). Although the use statement on line 3 will pass through the cpp filter, the module that gets included (C<Fred>) will not. The source streams look like this after line 3 has been parsed and before line 4 is parsed: cpp_test ---> cpp filter ---> parser (INACTIVE) Fred.pm ----> parser As you can see, a new stream has been created for reading the source from C<Fred.pm>. This stream will remain active until all of C<Fred.pm> has been parsed. The source stream for C<cpp_test> will still exist, but is inactive. Once the parser has finished reading Fred.pm, the source stream associated with it will be destroyed. The source stream for C<cpp_test> then becomes active again and the parser reads line 4 and subsequent lines from C<cpp_test>. You can use more than one source filter on a single file. Similarly, you can reuse the same filter in as many files as you like. For example, if you have a uuencoded and compressed source file, it is possible to stack a uudecode filter and an uncompression filter like this: use Filter::uudecode; use Filter::uncompress; M'XL(".H<US4''V9I;F%L')Q;>7/;1I;_>_I3=&E=%:F*I"T?22Q/ M6]9*<IQCO*XFT"0[PL%%'Y+IG?WN^ZYN-$'J.[.JE$,20/?K=_[> ... Once the first line has been processed, the flow will look like this: file ---> uudecode ---> uncompress ---> parser filter filter Data flows through filters in the same order they appear in the source file. The uudecode filter appeared before the uncompress filter, so the source file will be uudecoded before it's uncompressed. =head1 WRITING A SOURCE FILTER There are three ways to write your own source filter. You can write it in C, use an external program as a filter, or write the filter in Perl. I won't cover the first two in any great detail, so I'll get them out of the way first. Writing the filter in Perl is most convenient, so I'll devote the most space to it. =head1 WRITING A SOURCE FILTER IN C The first of the three available techniques is to write the filter completely in C. The external module you create interfaces directly with the source filter hooks provided by Perl. The advantage of this technique is that you have complete control over the implementation of your filter. The big disadvantage is the increased complexity required to write the filter - not only do you need to understand the source filter hooks, but you also need a reasonable knowledge of Perl guts. One of the few times it is worth going to this trouble is when writing a source scrambler. The C<decrypt> filter (which unscrambles the source before Perl parses it) included with the source filter distribution is an example of a C source filter (see Decryption Filters, below). =over 5 =item B<Decryption Filters> All decryption filters work on the principle of "security through obscurity." Regardless of how well you write a decryption filter and how strong your encryption algorithm is, anyone determined enough can retrieve the original source code. The reason is quite simple - once the decryption filter has decrypted the source back to its original form, fragments of it will be stored in the computer's memory as Perl parses it. The source might only be in memory for a short period of time, but anyone possessing a debugger, skill, and lots of patience can eventually reconstruct your program. That said, there are a number of steps that can be taken to make life difficult for the potential cracker. The most important: Write your decryption filter in C and statically link the decryption module into the Perl binary. For further tips to make life difficult for the potential cracker, see the file I<decrypt.pm> in the source filters distribution. =back =head1 CREATING A SOURCE FILTER AS A SEPARATE EXECUTABLE An alternative to writing the filter in C is to create a separate executable in the language of your choice. The separate executable reads from standard input, does whatever processing is necessary, and writes the filtered data to standard output. C<Filter::cpp> is an example of a source filter implemented as a separate executable - the executable is the C preprocessor bundled with your C compiler. The source filter distribution includes two modules that simplify this task: C<Filter::exec> and C<Filter::sh>. Both allow you to run any external executable. Both use a coprocess to control the flow of data into and out of the external executable. (For details on coprocesses, see Stephens, W.R., "Advanced Programming in the UNIX Environment." Addison-Wesley, ISBN 0-210-56317-7, pages 441-445.) The difference between them is that C<Filter::exec> spawns the external command directly, while C<Filter::sh> spawns a shell to execute the external command. (Unix uses the Bourne shell; NT uses the cmd shell.) Spawning a shell allows you to make use of the shell metacharacters and redirection facilities. Here is an example script that uses C<Filter::sh>: use Filter::sh 'tr XYZ PQR'; $a = 1; print "XYZ a = $a\n"; The output you'll get when the script is executed: PQR a = 1 Writing a source filter as a separate executable works fine, but a small performance penalty is incurred. For example, if you execute the small example above, a separate subprocess will be created to run the Unix C<tr> command. Each use of the filter requires its own subprocess. If creating subprocesses is expensive on your system, you might want to consider one of the other options for creating source filters. =head1 WRITING A SOURCE FILTER IN PERL The easiest and most portable option available for creating your own source filter is to write it completely in Perl. To distinguish this from the previous two techniques, I'll call it a Perl source filter. To help understand how to write a Perl source filter we need an example to study. Here is a complete source filter that performs rot13 decoding. (Rot13 is a very simple encryption scheme used in Usenet postings to hide the contents of offensive posts. It moves every letter forward thirteen places, so that A becomes N, B becomes O, and Z becomes M.) package Rot13; use Filter::Util::Call; sub import { my ($type) = @_; my ($ref) = []; filter_add(bless $ref); } sub filter { my ($self) = @_; my ($status); tr/n-za-mN-ZA-M/a-zA-Z/ if ($status = filter_read()) > 0; $status; } 1; All Perl source filters are implemented as Perl classes and have the same basic structure as the example above. First, we include the C<Filter::Util::Call> module, which exports a number of functions into your filter's namespace. The filter shown above uses two of these functions, C<filter_add()> and C<filter_read()>. Next, we create the filter object and associate it with the source stream by defining the C<import> function. If you know Perl well enough, you know that C<import> is called automatically every time a module is included with a use statement. This makes C<import> the ideal place to both create and install a filter object. In the example filter, the object (C<$ref>) is blessed just like any other Perl object. Our example uses an anonymous array, but this isn't a requirement. Because this example doesn't need to store any context information, we could have used a scalar or hash reference just as well. The next section demonstrates context data. The association between the filter object and the source stream is made with the C<filter_add()> function. This takes a filter object as a parameter (C<$ref> in this case) and installs it in the source stream. Finally, there is the code that actually does the filtering. For this type of Perl source filter, all the filtering is done in a method called C<filter()>. (It is also possible to write a Perl source filter using a closure. See the C<Filter::Util::Call> manual page for more details.) It's called every time the Perl parser needs another line of source to process. The C<filter()> method, in turn, reads lines from the source stream using the C<filter_read()> function. If a line was available from the source stream, C<filter_read()> returns a status value greater than zero and appends the line to C<$_>. A status value of zero indicates end-of-file, less than zero means an error. The filter function itself is expected to return its status in the same way, and put the filtered line it wants written to the source stream in C<$_>. The use of C<$_> accounts for the brevity of most Perl source filters. In order to make use of the rot13 filter we need some way of encoding the source file in rot13 format. The script below, C<mkrot13>, does just that. die "usage mkrot13 filename\n" unless @ARGV; my $in = $ARGV[0]; my $out = "$in.tmp"; open(IN, "<$in") or die "Cannot open file $in: $!\n"; open(OUT, ">$out") or die "Cannot open file $out: $!\n"; print OUT "use Rot13;\n"; while (<IN>) { tr/a-zA-Z/n-za-mN-ZA-M/; print OUT; } close IN; close OUT; unlink $in; rename $out, $in; If we encrypt this with C<mkrot13>: print " hello fred \n"; the result will be this: use Rot13; cevag "uryyb serq\a"; Running it produces this output: hello fred =head1 USING CONTEXT: THE DEBUG FILTER The rot13 example was a trivial example. Here's another demonstration that shows off a few more features. Say you wanted to include a lot of debugging code in your Perl script during development, but you didn't want it available in the released product. Source filters offer a solution. In order to keep the example simple, let's say you wanted the debugging output to be controlled by an environment variable, C<DEBUG>. Debugging code is enabled if the variable exists, otherwise it is disabled. Two special marker lines will bracket debugging code, like this: ## DEBUG_BEGIN if ($year > 1999) { warn "Debug: millennium bug in year $year\n"; } ## DEBUG_END The filter ensures that Perl parses the code between the <DEBUG_BEGIN> and C<DEBUG_END> markers only when the C<DEBUG> environment variable exists. That means that when C<DEBUG> does exist, the code above should be passed through the filter unchanged. The marker lines can also be passed through as-is, because the Perl parser will see them as comment lines. When C<DEBUG> isn't set, we need a way to disable the debug code. A simple way to achieve that is to convert the lines between the two markers into comments: ## DEBUG_BEGIN #if ($year > 1999) { # warn "Debug: millennium bug in year $year\n"; #} ## DEBUG_END Here is the complete Debug filter: package Debug; use strict; use warnings; use Filter::Util::Call; use constant TRUE => 1; use constant FALSE => 0; sub import { my ($type) = @_; my (%context) = ( Enabled => defined $ENV{DEBUG}, InTraceBlock => FALSE, Filename => (caller)[1], LineNo => 0, LastBegin => 0, ); filter_add(bless \%context); } sub Die { my ($self) = shift; my ($message) = shift; my ($line_no) = shift || $self->{LastBegin}; die "$message at $self->{Filename} line $line_no.\n" } sub filter { my ($self) = @_; my ($status); $status = filter_read(); ++ $self->{LineNo}; # deal with EOF/error first if ($status <= 0) { $self->Die("DEBUG_BEGIN has no DEBUG_END") if $self->{InTraceBlock}; return $status; } if ($self->{InTraceBlock}) { if (/^\s*##\s*DEBUG_BEGIN/ ) { $self->Die("Nested DEBUG_BEGIN", $self->{LineNo}) } elsif (/^\s*##\s*DEBUG_END/) { $self->{InTraceBlock} = FALSE; } # comment out the debug lines when the filter is disabled s/^/#/ if ! $self->{Enabled}; } elsif ( /^\s*##\s*DEBUG_BEGIN/ ) { $self->{InTraceBlock} = TRUE; $self->{LastBegin} = $self->{LineNo}; } elsif ( /^\s*##\s*DEBUG_END/ ) { $self->Die("DEBUG_END has no DEBUG_BEGIN", $self->{LineNo}); } return $status; } 1; The big difference between this filter and the previous example is the use of context data in the filter object. The filter object is based on a hash reference, and is used to keep various pieces of context information between calls to the filter function. All but two of the hash fields are used for error reporting. The first of those two, Enabled, is used by the filter to determine whether the debugging code should be given to the Perl parser. The second, InTraceBlock, is true when the filter has encountered a C<DEBUG_BEGIN> line, but has not yet encountered the following C<DEBUG_END> line. If you ignore all the error checking that most of the code does, the essence of the filter is as follows: sub filter { my ($self) = @_; my ($status); $status = filter_read(); # deal with EOF/error first return $status if $status <= 0; if ($self->{InTraceBlock}) { if (/^\s*##\s*DEBUG_END/) { $self->{InTraceBlock} = FALSE } # comment out debug lines when the filter is disabled s/^/#/ if ! $self->{Enabled}; } elsif ( /^\s*##\s*DEBUG_BEGIN/ ) { $self->{InTraceBlock} = TRUE; } return $status; } Be warned: just as the C-preprocessor doesn't know C, the Debug filter doesn't know Perl. It can be fooled quite easily: print <<EOM; ##DEBUG_BEGIN EOM Such things aside, you can see that a lot can be achieved with a modest amount of code. =head1 CONCLUSION You now have better understanding of what a source filter is, and you might even have a possible use for them. If you feel like playing with source filters but need a bit of inspiration, here are some extra features you could add to the Debug filter. First, an easy one. Rather than having debugging code that is all-or-nothing, it would be much more useful to be able to control which specific blocks of debugging code get included. Try extending the syntax for debug blocks to allow each to be identified. The contents of the C<DEBUG> environment variable can then be used to control which blocks get included. Once you can identify individual blocks, try allowing them to be nested. That isn't difficult either. Here is an interesting idea that doesn't involve the Debug filter. Currently Perl subroutines have fairly limited support for formal parameter lists. You can specify the number of parameters and their type, but you still have to manually take them out of the C<@_> array yourself. Write a source filter that allows you to have a named parameter list. Such a filter would turn this: sub MySub ($first, $second, @rest) { ... } into this: sub MySub($$@) { my ($first) = shift; my ($second) = shift; my (@rest) = @_; ... } Finally, if you feel like a real challenge, have a go at writing a full-blown Perl macro preprocessor as a source filter. Borrow the useful features from the C preprocessor and any other macro processors you know. The tricky bit will be choosing how much knowledge of Perl's syntax you want your filter to have. =head1 LIMITATIONS Source filters only work on the string level, thus are highly limited in its ability to change source code on the fly. It cannot detect comments, quoted strings, heredocs, it is no replacement for a real parser. The only stable usage for source filters are encryption, compression, or the byteloader, to translate binary code back to source code. See for example the limitations in L<Switch>, which uses source filters, and thus is does not work inside a string eval, the presence of regexes with embedded newlines that are specified with raw C</.../> delimiters and don't have a modifier C<//x> are indistinguishable from code chunks beginning with the division operator C</>. As a workaround you must use C<m/.../> or C<m?...?> for such patterns. Also, the presence of regexes specified with raw C<?...?> delimiters may cause mysterious errors. The workaround is to use C<m?...?> instead. See L<http://search.cpan.org/perldoc?Switch#LIMITATIONS> Currently the content of the C<__DATA__> block is not filtered. Currently internal buffer lengths are limited to 32-bit only. =head1 THINGS TO LOOK OUT FOR =over 5 =item Some Filters Clobber the C<DATA> Handle Some source filters use the C<DATA> handle to read the calling program. When using these source filters you cannot rely on this handle, nor expect any particular kind of behavior when operating on it. Filters based on Filter::Util::Call (and therefore Filter::Simple) do not alter the C<DATA> filehandle, but on the other hand totally ignore the text after C<__DATA__>. =back =head1 REQUIREMENTS The Source Filters distribution is available on CPAN, in CPAN/modules/by-module/Filter Starting from Perl 5.8 Filter::Util::Call (the core part of the Source Filters distribution) is part of the standard Perl distribution. Also included is a friendlier interface called Filter::Simple, by Damian Conway. =head1 AUTHOR Paul Marquess E<lt>Paul.Marquess@btinternet.comE<gt> Reini Urban E<lt>rurban@cpan.orgE<gt> =head1 Copyrights The first version of this article originally appeared in The Perl Journal #11, and is copyright 1998 The Perl Journal. It appears courtesy of Jon Orwant and The Perl Journal. This document may be distributed under the same terms as Perl itself. PK �}�\��?O O FieldHash/FieldHash.sonu ȯ�� ELF >