parser(3)
NAME
XML::DOM::Parser - An XML::Parser that builds XML::DOM
document structures
SYNOPSIS
use XML::DOM;
my $parser = new XML::DOM::Parser;
my $doc = $parser->parsefile ("file.xml");
DESCRIPTION
XML::DOM::Parser extends XML::Parser
The XML::Parser module was written by Clark Cooper and is
built on top of XML::Parser::Expat, which is a lower level
interface to James Clark's expat library.
XML::DOM::Parser parses XML strings or files and builds a
data structure that conforms to the API of the Document
Object Model as described at
<http://www.w3.org/TR/REC-DOM-Level-1>. See the
XML::Parser manpage for other additional properties of the
XML::DOM::Parser class. Note that the 'Style' property
should not be used (it is set internally.)
The XML::Parser NoExpand option is more or less supported,
in that it will generate EntityReference objects whenever
an entity reference is encountered in character data. I'm
not sure how useful this is. Any comments are welcome.
As described in the synopsis, when you create an
XML::DOM::Parser object, the parse and parsefile methods
create an XML::DOM::Document object from the specified
input. This Document object can then be examined, modified
and written back out to a file or converted to a string.
When using XML::DOM with XML::Parser version 2.19 and up,
setting the XML::DOM::Parser option KeepCDATA to 1 will
store CDATASections in CDATASection nodes, instead of con
verting them to Text nodes. Subsequent CDATASection nodes
will be merged into one. Let me know if this is a problem.
Using LWP to parse URLs
- The parsefile() method now also supports URLs, e.g.
http://www.erols.com/enno/xsa.xml. It uses LWP to down
load the file and then calls parse() on the resulting
string. By default it will use a LWP::UserAgent that is
created as follows: - use LWP::UserAgent;
$LWP_USER_AGENT = LWP::UserAgent->new;
$LWP_USER_AGENT->env_proxy; - Note that env_proxy reads proxy settings from environment
variables, which is what I need to do to get thru our
firewall. If you want to use a different LWP::UserAgent,
you can either set it globally with:
XML::DOM::Parser::set_LWP_UserAgent ($my_agent);- or, you can specify it for a specific XML::DOM::Parser by
passing it to the constructor:
my $parser = new XML::DOM::Parser (LWP_UserAgent =>- $my_agent);
- Currently, LWP is used when the filename (passed to parse
file) starts with one of the following URL schemes: http,
https, ftp, wais, gopher, or file (followed by a colon.)
If I missed one, please let me know. - The LWP modules are part of libwww-perl which is available
at CPAN.