XML::DOM::Parser
XML::DOM::Parser(3) User Contributed Perl Documentation XML::DOM::Parser(3)
NAME
XML::DOM::Parser - An XML::Parser that builds XML::DOM document struc-
tures
SYNOPSIS
use XML::DOM;
my $parser = new XML::DOM::Parser;
my $doc = $parser->parsefile ("file.xml");
$doc->dispose; # Avoid memory leaks - cleanup circular references
DESCRIPTION
XML::DOM::Parser extends XML::Parser
The XML::Parser module was written by Clark Cooper and is built on top
of XML::Parser::Expat, which is a lower level interface to James
Clark's expat library.
XML::DOM::Parser parses XML strings or files and builds a data struc-
ture that conforms to the API of the Document Object Model as described
at <http://www.w3.org/TR/REC-DOM-Level-1>. See the XML::Parser manpage
for other additional properties of the XML::DOM::Parser class. Note
that the 'Style' property should not be used (it is set internally.)
The XML::Parser NoExpand option is more or less supported, in that it
will generate EntityReference objects whenever an entity reference is
encountered in character data. I'm not sure how useful this is. Any
comments are welcome.
As described in the synopsis, when you create an XML::DOM::Parser
object, the parse and parsefile methods create an XML::DOM::Document
object from the specified input. This Document object can then be exam-
ined, modified and written back out to a file or converted to a string.
When using XML::DOM with XML::Parser version 2.19 and up, setting the
XML::DOM::Parser option KeepCDATA to 1 will store CDATASections in
CDATASection nodes, instead of converting them to Text nodes. Subse-
quent CDATASection nodes will be merged into one. Let me know if this
is a problem.
Using LWP to parse URLs
The parsefile() method now also supports URLs, e.g.
http://www.erols.com/enno/xsa.xml. It uses LWP to download the file
and then calls parse() on the resulting string. By default it will use
a LWP::UserAgent that is created as follows:
use LWP::UserAgent;
$LWP_USER_AGENT = LWP::UserAgent->new;
$LWP_USER_AGENT->env_proxy;
Note that env_proxy reads proxy settings from environment variables,
which is what I need to do to get thru our firewall. If you want to use
a different LWP::UserAgent, you can either set it globally with:
XML::DOM::Parser::set_LWP_UserAgent ($my_agent);
or, you can specify it for a specific XML::DOM::Parser by passing it to
the constructor:
my $parser = new XML::DOM::Parser (LWP_UserAgent => $my_agent);
Currently, LWP is used when the filename (passed to parsefile) starts
with one of the following URL schemes: http, https, ftp, wais, gopher,
or file (followed by a colon.) If I missed one, please let me know.
The LWP modules are part of libwww-perl which is available at CPAN.
perl v5.8.6 2002-02-08 XML::DOM::Parser(3)