XML::Parser::PerlSAX(3)   User Contributed Perl Documentation   XML::Parser::PerlSAX(3)



NAME
       XML::Parser::PerlSAX - Perl SAX parser using XML::Parser

SYNOPSIS
        use XML::Parser::PerlSAX;

        $parser = XML::Parser::PerlSAX->new( [OPTIONS] );
        $result = $parser->parse( [OPTIONS] );

        $result = $parser->parse($string);

DESCRIPTION
       "XML::Parser::PerlSAX" is a PerlSAX parser using the XML::Parser mod-
       ule.  This man page summarizes the specific options, handlers, and
       properties supported by "XML::Parser::PerlSAX"; please refer to the
       PerlSAX standard in `"PerlSAX.pod"' for general usage information.

METHODS
       new Creates a new parser object.  Default options for parsing,
           described below, are passed as key-value pairs or as a single hash.
           Options may be changed directly in the parser object unless stated
           otherwise.  Options passed to `"parse()"' override the default
           options in the parser object for the duration of the parse.

       parse
           Parses a document.  Options, described below, are passed as key-
           value pairs or as a single hash.  Options passed to `"parse()"'
           override default options in the parser object.

       location
           Returns the location as a hash:

             ColumnNumber    The column number of the parse.
             LineNumber      The line number of the parse.
             BytePosition    The current byte position of the parse.
             PublicId        A string containing the public identifier, or undef
                             if none is available.
             SystemId        A string containing the system identifier, or undef
                             if none is available.
             Base            The current value of the base for resolving relative
                             URIs.

           ALPHA WARNING: The `"SystemId"' and `"PublicId"' properties
           returned are the system and public identifiers of the document
           passed to `"parse()"', not the identifiers of the currently parsing
           external entity.  The column, line, and byte positions are of the
           current entity being parsed.
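
       As a minimal sketch of `"new()"' and `"parse()"', the example below
       counts elements in a document held in a string.  The "ElementCounter"
       package and the "count_elements" function are hypothetical names
       introduced for illustration.  The sketch relies on the PerlSAX
       convention that the value returned by `"end_document()"' becomes the
       result of `"parse()"'.

```perl
use strict;
use warnings;
use XML::Parser::PerlSAX;

# Hypothetical handler: counts start_element events per element name.
package ElementCounter;

sub new {
    my ($class) = @_;
    return bless { count => {} }, $class;
}

sub start_element {
    my ($self, $element) = @_;
    $self->{count}{ $element->{Name} }++;
}

# Under the PerlSAX convention, this return value is what parse() returns.
sub end_document {
    my ($self) = @_;
    return $self->{count};
}

package main;

# Parse an XML string and return a hash of element name => occurrence count.
sub count_elements {
    my ($xml) = @_;
    # Default options are given to new(); the Source is given to parse().
    my $parser = XML::Parser::PerlSAX->new( Handler => ElementCounter->new );
    return $parser->parse( Source => { String => $xml } );
}

my $counts = count_elements('<doc><p>one</p><p>two</p></doc>');
print "$_: $counts->{$_}\n" for sort keys %$counts;
```

       Passing the handler to `"new()"' makes it a default for every later
       parse; it could equally be passed to `"parse()"' to apply to a single
       document only.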

OPTIONS
       The following options are supported by "XML::Parser::PerlSAX":

        Handler            default handler to receive events
        DocumentHandler    handler to receive document events
        DTDHandler         handler to receive DTD events
        ErrorHandler       handler to receive error events
        EntityResolver     handler to resolve entities
        Locale             locale to provide localisation for errors
        Source             hash containing the input source for parsing
        UseAttributeOrder  set to true to provide AttributeOrder and Defaulted
                           properties in `"start_element()"'

       If no handlers are provided then all events will be silently ignored,
       except for `"fatal_error()"' which will cause a `"die()"' to be called
       after calling `"end_document()"'.

       If a single string argument is passed to the `"parse()"' method, it is
       treated as if a `"Source"' option was given with a `"String"' parame-
       ter.

       The `"Source"' hash may contain the following parameters:

        ByteStream       The raw byte stream (file handle) containing the
                         document.
        String           A string containing the document.
        SystemId         The system identifier (URI) of the document.
        PublicId         The public identifier.
        Encoding         A string describing the character encoding.

       If more than one of `"ByteStream"', `"String"', or `"SystemId"' is
       given, then preference is given first to `"ByteStream"', then
       `"String"', then `"SystemId"'.
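
       The preference order above can be sketched as a small lookup.  The
       function "preferred_source_key" is a hypothetical helper written for
       illustration; it is not part of "XML::Parser::PerlSAX" and only
       mirrors the documented rule.

```perl
use strict;
use warnings;

# Hypothetical helper: reports which Source parameter would be used,
# following the documented preference ByteStream > String > SystemId.
sub preferred_source_key {
    my ($source) = @_;
    for my $key (qw(ByteStream String SystemId)) {
        return $key if defined $source->{$key};
    }
    return;    # none of the three input parameters was given
}

my %source = ( String => '<doc/>', SystemId => 'doc.xml' );
print preferred_source_key( \%source ), "\n";
```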

HANDLERS
       The following handlers and properties are supported by
       "XML::Parser::PerlSAX":

       DocumentHandler methods


           start_document
               Receive notification of the beginning of a document.

               No properties defined.

           end_document
               Receive notification of the end of a document.

               No properties defined.

           start_element
               Receive notification of the beginning of an element.

                Name             The element type name.
                Attributes       A hash containing the attributes attached to the
                                 element, if any.

               The `"Attributes"' hash contains only string values.

               If the `"UseAttributeOrder"' parser option is true, the follow-
               ing properties are also passed to `"start_element"':

                AttributeOrder   An array of attribute names in the order they were
                                 specified, followed by the defaulted attribute
                                 names.
                Defaulted        The index number of the first defaulted attribute in
                                 `AttributeOrder'.  If this index is equal to the
                                 length of `AttributeOrder', there were no defaulted
                                 values.

               Note to "XML::Parser" users:  `"Defaulted"' will be half the
               value of "XML::Parser::Expat"'s `"specified_attr()"' function
               because only attribute names are provided, not their values.
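
               To illustrate how `"AttributeOrder"' and `"Defaulted"' fit
               together, the sketch below splits the attribute names into
               those written in the document and those supplied by default.
               The function "split_attribute_order" is a hypothetical helper,
               and the sample hash is hand-built to have the shape described
               above rather than produced by a real parse.

```perl
use strict;
use warnings;

# Hypothetical helper: takes the hash passed to start_element() and returns
# two array references: specified attribute names, then defaulted ones.
sub split_attribute_order {
    my ($element) = @_;
    my @order           = @{ $element->{AttributeOrder} };
    my $first_defaulted = $element->{Defaulted};

    # Names before the Defaulted index were specified in the document.
    my @specified = @order[ 0 .. $first_defaulted - 1 ];
    # Names from the Defaulted index onward came from defaults; this slice
    # is empty when Defaulted equals the length of AttributeOrder.
    my @defaulted = @order[ $first_defaulted .. $#order ];

    return ( \@specified, \@defaulted );
}

# Hand-built sample event, assuming "align" was supplied as a default.
my %event = (
    Name           => 'img',
    Attributes     => { src => 'a.png', alt => 'A', align => 'left' },
    AttributeOrder => [ 'src', 'alt', 'align' ],
    Defaulted      => 2,
);
my ( $specified, $defaulted ) = split_attribute_order( \%event );
print "specified: @$specified\n";
print "defaulted: @$defaulted\n";
```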

           end_element
               Receive notification of the end of an element.

                Name             The element type name.

           characters
               Receive notification of character data.

                Data             The characters from the XML document.

           processing_instruction
               Receive notification of a processing instruction.

                Target           The processing instruction target.
                Data             The processing instruction data, if any.

           comment
               Receive notification of a comment.

                Data             The comment data, if any.

           start_cdata
               Receive notification of the start of a CDATA section.

               No properties defined.

           end_cdata
               Receive notification of the end of a CDATA section.

               No properties defined.

           entity_reference
               Receive notification of an internal entity reference.  If this
               handler is defined, internal entities will be neither expanded
               nor passed to the `"characters()"' handler.  If this handler is
               not defined, internal entities will be expanded if possible and
               passed to the `"characters()"' handler.

                Name             The entity reference name
                Value            The entity reference value
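
               A handler that wants to keep references unexpanded might look
               like the sketch below.  The "EntityPreservingText" package is
               a hypothetical name, and the events are simulated by calling
               the handler directly with hashes of the documented shape, so
               the example makes no claim about how a real parse splits the
               character data.

```perl
use strict;
use warnings;

# Hypothetical DocumentHandler: rebuilds text content while keeping internal
# entity references in their unexpanded "&name;" form.
package EntityPreservingText;

sub new {
    my ($class) = @_;
    return bless { text => '' }, $class;
}

# Ordinary character data is appended unchanged.
sub characters {
    my ($self, $chars) = @_;
    $self->{text} .= $chars->{Data};
}

# Because this method is defined, references are delivered here instead of
# being expanded and passed to characters().
sub entity_reference {
    my ($self, $ref) = @_;
    $self->{text} .= '&' . $ref->{Name} . ';';
}

sub text {
    my ($self) = @_;
    return $self->{text};
}

package main;

# Simulated events for the content "Copyright &year; Example".
my $handler = EntityPreservingText->new;
$handler->characters( { Data => 'Copyright ' } );
$handler->entity_reference( { Name => 'year', Value => '2003' } );
$handler->characters( { Data => ' Example' } );
print $handler->text, "\n";
```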

       DTDHandler methods


           notation_decl
               Receive notification of a notation declaration event.

                Name             The notation name.
                PublicId         The notation's public identifier, if any.
                SystemId         The notation's system identifier, if any.
                Base             The base for resolving a relative URI, if any.

           unparsed_entity_decl
               Receive notification of an unparsed entity declaration event.

                Name             The unparsed entity's name.
                SystemId         The entity's system identifier.
                PublicId         The entity's public identifier, if any.
                Base             The base for resolving a relative URI, if any.

           entity_decl
               Receive notification of an entity declaration event.

                Name             The entity name.
                Value            The entity value, if any.
                PublicId         The entity's public identifier, if any.
                SystemId         The entity's system identifier, if any.
                Notation         The notation declared for this entity, if any.

               For internal entities, the `"Value"' parameter will contain the
               value, and `"PublicId"', `"SystemId"', and `"Notation"' will be
               undefined.  For external entities, `"Value"' will be undefined,
               `"SystemId"' will contain the system identifier, `"PublicId"'
               will contain the public identifier if one was provided
               (undefined otherwise), and `"Notation"' will contain the
               notation name for unparsed entities.  If this is a parameter
               entity declaration, a '%' will be prefixed to the entity name.

               Note that `"entity_decl()"' and `"unparsed_entity_decl()"'
               overlap.  If both methods are implemented by a handler, then
               `"entity_decl()"' will not be called for unparsed entities.
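
               The rules above for telling the kinds of entity declaration
               apart can be sketched as follows.  The function
               "classify_entity_decl" and the keys of the hash it returns are
               hypothetical, and the sample hashes are hand-built to match
               the properties listed for `"entity_decl()"'.

```perl
use strict;
use warnings;

# Hypothetical helper: classifies the hash passed to entity_decl().
sub classify_entity_decl {
    my ($decl) = @_;

    # A defined Value means internal; otherwise a defined Notation marks an
    # unparsed entity, and anything else is an external parsed entity.
    my $kind = defined $decl->{Value}    ? 'internal'
             : defined $decl->{Notation} ? 'unparsed'
             :                             'external';

    # Parameter entities are reported with a leading '%' on the name.
    my $name         = $decl->{Name};
    my $is_parameter = ( $name =~ s/^%// ) ? 1 : 0;

    return { Name => $name, Kind => $kind, IsParameter => $is_parameter };
}

# Hand-built sample declarations.
my @samples = (
    { Name => 'year',  Value    => '2003' },
    { Name => 'logo',  SystemId => 'logo.gif', Notation => 'gif' },
    { Name => '%defs', SystemId => 'defs.ent' },
);
for my $decl (@samples) {
    my $info = classify_entity_decl($decl);
    print "$info->{Name}: $info->{Kind}",
        ( $info->{IsParameter} ? ' (parameter)' : '' ), "\n";
}
```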

           element_decl
               Receive notification of an element declaration event.

                Name             The element type name.
                Model            The content model as a string.

           attlist_decl
               Receive notification of an attribute list declaration event.

               This handler is called for each attribute in an ATTLIST decla-
               ration found in the internal subset. So an ATTLIST declaration
               that has multiple attributes will generate multiple calls to
               this handler.

                ElementName      The element type name.
                AttributeName    The attribute name.
                Type             The attribute type.
                Fixed            True if this is a fixed attribute.

               The attribute's default will be either "#REQUIRED", "#IMPLIED",
               or a quoted string (i.e. the returned string will begin and end
               with a quote character).

           doctype_decl
               Receive notification of a DOCTYPE declaration event.

                Name             The document type name.
                SystemId         The document's system identifier.
                PublicId         The document's public identifier, if any.
                Internal         The internal subset as a string, if any.

               `"Internal"' will contain all whitespace, comments, processing
               instructions, and declarations seen in the internal subset.
               The declarations will be there whether or not they have been
               processed by another handler (except for unparsed entities
               processed by the `"unparsed_entity_decl()"' handler).  However,
               comments and processing instructions will not appear if they
               have been processed by their respective handlers.

           xml_decl
               Receive notification of an XML declaration event.

                Version          The version.
                Encoding         The encoding string, if any.
                Standalone       True, false, or undefined if not declared.

       EntityResolver methods


           resolve_entity
               Allow the handler to resolve external entities.

                Name             The entity name.
                SystemId         The entity's system identifier.
                PublicId         The entity's public identifier, if any.
                Base             The base for resolving a relative URI, if any.

               `"resolve_entity()"' should return either undef, to request
               that the parser open a regular URI connection to the system
               identifier, or a hash describing the new input source.  This
               hash has the same properties as the `"Source"' parameter to
               `"parse()"':

                 PublicId    The public identifier of the external entity being
                             referenced, or undef if none was supplied.
                 SystemId    The system identifier of the external entity being
                             referenced.
                 String      String containing XML text
                 ByteStream  An open file handle.
                 CharacterStream
                             An open file handle.
                 Encoding    The character encoding, if known.
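
               A resolver that serves some external entities from memory and
               defers the rest to the parser might be sketched as below.  The
               "LocalEntityResolver" package and its constructor arguments
               are hypothetical; the method is called directly here with a
               hand-built hash rather than by a real parse.

```perl
use strict;
use warnings;

# Hypothetical EntityResolver: maps known system identifiers to XML text
# held in memory.
package LocalEntityResolver;

sub new {
    my ($class, %known) = @_;
    return bless { known => {%known} }, $class;
}

sub resolve_entity {
    my ($self, $entity) = @_;
    my $text = $self->{known}{ $entity->{SystemId} };

    # Returning undef asks the parser to open the system identifier itself.
    return undef unless defined $text;

    # Otherwise return a hash with the same properties as a Source hash.
    return {
        SystemId => $entity->{SystemId},
        PublicId => $entity->{PublicId},
        String   => $text,
    };
}

package main;

my $resolver = LocalEntityResolver->new( 'chapter1.xml' => '<p>Hello</p>' );
my $source   = $resolver->resolve_entity(
    { Name => 'chapter1', SystemId => 'chapter1.xml' } );
print "resolved from memory: $source->{String}\n";
```

               Such an object would be passed as the `"EntityResolver"'
               option, or as the default `"Handler"'.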

AUTHOR
       Ken MacLeod, ken@bitsko.slc.ut.us

SEE ALSO
       perl(1), PerlSAX.pod(3)

        Extensible Markup Language (XML) <http://www.w3c.org/XML/>
        SAX 1.0: The Simple API for XML <http://www.megginson.com/SAX/>



perl v5.8.6                       2003-10-21           XML::Parser::PerlSAX(3)