XML::Parser::PerlSAX(3)

NAME

XML::Parser::PerlSAX - Perl SAX parser using XML::Parser

SYNOPSIS

use XML::Parser::PerlSAX;
$parser = XML::Parser::PerlSAX->new( [OPTIONS] );
$result = $parser->parse( [OPTIONS] );
$result = $parser->parse($string);

DESCRIPTION

"XML::Parser::PerlSAX" is a PerlSAX parser using the
XML::Parser module. This man page summarizes the specific
options, handlers, and properties supported by
"XML::Parser::PerlSAX"; please refer to the PerlSAX standard
in "PerlSAX.pod" for general usage information.

METHODS

new
    Creates a new parser object. Default options for parsing,
    described below, are passed as key-value pairs or as a single
    hash. Options may be changed directly in the parser object
    unless stated otherwise. Options passed to "parse()" override
    the default options in the parser object for the duration of
    the parse.

parse
    Parses a document. Options, described below, are passed as
    key-value pairs or as a single hash. Options passed to
    "parse()" override default options in the parser object.

location
    Returns the location as a hash:

        ColumnNumber    The column number of the parse.
        LineNumber      The line number of the parse.
        BytePosition    The current byte position of the parse.
        PublicId        A string containing the public identifier,
                        or undef if none is available.
        SystemId        A string containing the system identifier,
                        or undef if none is available.
        Base            The current value of the base for resolving
                        relative URIs.

    ALPHA WARNING: The "SystemId" and "PublicId" properties
    returned are the system and public identifiers of the document
    passed to "parse()", not the identifiers of the external entity
    currently being parsed. The column, line, and byte positions
    are those of the entity currently being parsed.
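
The three methods above can be sketched together as follows. This is a
minimal illustration, not a reference implementation: the "LineTracker"
package is hypothetical, "location()" is assumed to return a hash
reference, and the value returned by "end_document()" is assumed to
become the result of "parse()" (see "PerlSAX.pod" for the general rule).

```perl
use strict;
use warnings;
use XML::Parser::PerlSAX;

# Hypothetical handler that records the line on which each element starts.
package LineTracker;

sub new {
    my ($class) = @_;
    return bless { parser => undef, seen => [] }, $class;
}

# Defined (even though empty) so the parser has set up its internal
# state before location() is first called; this is an assumption about
# the implementation, not something the man page promises.
sub start_document { }

sub start_element {
    my ($self, $element) = @_;
    my $loc = $self->{parser}->location;    # assumed to be a hash reference
    push @{ $self->{seen} }, "$element->{Name}\@$loc->{LineNumber}";
}

# Assumption: whatever end_document() returns is handed back by parse().
sub end_document {
    my ($self) = @_;
    return $self->{seen};
}

package main;

my $tracker = LineTracker->new;
my $parser  = XML::Parser::PerlSAX->new(Handler => $tracker);
$tracker->{parser} = $parser;    # lets the handler call location()

my $seen = $parser->parse("<doc>\n<item/>\n<item/>\n</doc>");
print join(", ", @$seen), "\n";
```

The handler keeps a reference to the parser only so that it can ask for
the current location from inside a callback.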

OPTIONS

The following options are supported by "XML::Parser::PerlSAX":

    Handler             default handler to receive events
    DocumentHandler     handler to receive document events
    DTDHandler          handler to receive DTD events
    ErrorHandler        handler to receive error events
    EntityResolver      handler to resolve entities
    Locale              locale to provide localisation for errors
    Source              hash containing the input source for parsing
    UseAttributeOrder   set to true to provide AttributeOrder and
                        Defaulted properties in "start_element()"

If no handlers are provided then all events will be silently
ignored, except for "fatal_error()", which will cause "die()" to
be called after calling "end_document()".

If a single string argument is passed to the "parse()" method, it
is treated as if a "Source" option was given with a "String"
parameter.

The "Source" hash may contain the following parameters:

    ByteStream    The raw byte stream (file handle) containing the
                  document.
    String        A string containing the document.
    SystemId      The system identifier (URI) of the document.
    PublicId      The public identifier.
    Encoding      A string describing the character encoding.

If more than one of "ByteStream", "String", or "SystemId" is
given, then preference is given first to "ByteStream", then
"String", then "SystemId".
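
As a rough sketch of how these options combine, the example below sets a
default "Handler" in "new()", overrides it for a single "parse()" call,
and then uses the single-string shorthand on a second parser. The
"NameCollector" package and the sample documents are hypothetical and
exist only for illustration.

```perl
use strict;
use warnings;
use XML::Parser::PerlSAX;

# Hypothetical handler that just collects element names.
package NameCollector;

sub new {
    my ($class) = @_;
    return bless { names => [] }, $class;
}

sub start_element {
    my ($self, $element) = @_;
    push @{ $self->{names} }, $element->{Name};
}

package main;

my $default  = NameCollector->new;
my $override = NameCollector->new;

# $default is the handler for any parse that does not name its own.
my $parser = XML::Parser::PerlSAX->new(Handler => $default);

# Explicit Source hash; the Handler given here replaces the default
# for the duration of this parse.
$parser->parse(
    Source  => { String => '<c><d/></c>' },
    Handler => $override,
);

# A lone string argument is shorthand for Source => { String => ... }.
XML::Parser::PerlSAX->new(Handler => $default)->parse('<a><b/></a>');

print "default:  @{ $default->{names} }\n";
print "override: @{ $override->{names} }\n";
```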

HANDLERS

The following handlers and properties are supported by
"XML::Parser::PerlSAX":

DocumentHandler methods
    start_document
        Receive notification of the beginning of a document.
        No properties defined.

    end_document
        Receive notification of the end of a document.
        No properties defined.

    start_element
        Receive notification of the beginning of an element.

            Name            The element type name.
            Attributes      A hash containing the attributes
                            attached to the element, if any.

        The "Attributes" hash contains only string values.

        If the "UseAttributeOrder" parser option is true, the
        following properties are also passed to "start_element":

            AttributeOrder  An array of attribute names in the order
                            they were specified, followed by the
                            defaulted attribute names.
            Defaulted       The index number of the first defaulted
                            attribute in "AttributeOrder". If this
                            index is equal to the length of
                            "AttributeOrder", there were no
                            defaulted values.

        Note to "XML::Parser" users: "Defaulted" will be half the
        value of "XML::Parser::Expat"'s "specified_attr()" function
        because only attribute names are provided, not their
        values.
    end_element
        Receive notification of the end of an element.

            Name            The element type name.

    characters
        Receive notification of character data.

            Data            The characters from the XML document.

    processing_instruction
        Receive notification of a processing instruction.

            Target          The processing instruction target.
            Data            The processing instruction data, if any.

    comment
        Receive notification of a comment.

            Data            The comment data, if any.

    start_cdata
        Receive notification of the start of a CDATA section.
        No properties defined.

    end_cdata
        Receive notification of the end of a CDATA section.
        No properties defined.

    entity_reference
        Receive notification of an internal entity reference. If
        this handler is defined, internal entities will be neither
        expanded nor passed to the "characters()" handler. If this
        handler is not defined, internal entities will be expanded
        if possible and passed to the "characters()" handler.

            Name            The entity reference name.
            Value           The entity reference value.
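
To make the shape of these callbacks concrete, here is a small sketch of
a document handler that logs a few of the events listed above. The
"EventLog" package and the sample document are hypothetical; each method
is assumed to receive the handler object followed by a single hash
reference carrying the properties described for that event.

```perl
use strict;
use warnings;
use XML::Parser::PerlSAX;

# Hypothetical handler that logs events and accumulates text.
package EventLog;

sub new {
    my ($class) = @_;
    return bless { events => [], text => '' }, $class;
}

sub start_element {
    my ($self, $element) = @_;
    # AttributeOrder is present only when UseAttributeOrder is true.
    my @order = @{ $element->{AttributeOrder} || [] };
    my $attrs = join ' ', map { "$_=$element->{Attributes}{$_}" } @order;
    push @{ $self->{events} }, "start $element->{Name} [$attrs]";
}

sub end_element {
    my ($self, $element) = @_;
    push @{ $self->{events} }, "end $element->{Name}";
}

# Character data may arrive in several chunks, so append rather than assign.
sub characters {
    my ($self, $chars) = @_;
    $self->{text} .= $chars->{Data};
}

sub comment {
    my ($self, $comment) = @_;
    push @{ $self->{events} }, "comment $comment->{Data}";
}

sub processing_instruction {
    my ($self, $pi) = @_;
    push @{ $self->{events} }, "pi $pi->{Target}";
}

package main;

my $log    = EventLog->new;
my $parser = XML::Parser::PerlSAX->new(
    Handler           => $log,
    UseAttributeOrder => 1,
);
$parser->parse('<doc z="1" a="2"><?app go?><!--note-->hello</doc>');

print "$_\n" for @{ $log->{events} };
print "text: $log->{text}\n";
```

Walking "AttributeOrder" instead of the keys of "Attributes" keeps the
attributes in the order they appeared in the document, which a plain
Perl hash does not preserve.
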
DTDHandler methods
    notation_decl
        Receive notification of a notation declaration event.

            Name            The notation name.
            PublicId        The notation's public identifier, if any.
            SystemId        The notation's system identifier, if any.
            Base            The base for resolving a relative URI,
                            if any.

    unparsed_entity_decl
        Receive notification of an unparsed entity declaration
        event.

            Name            The unparsed entity's name.
            SystemId        The entity's system identifier.
            PublicId        The entity's public identifier, if any.
            Base            The base for resolving a relative URI,
                            if any.

    entity_decl
        Receive notification of an entity declaration event.

            Name            The entity name.
            Value           The entity value, if any.
            PublicId        The entity's public identifier, if any.
            SystemId        The entity's system identifier, if any.
            Notation        The notation declared for this entity,
                            if any.

        For internal entities, the "Value" parameter will contain
        the value and the "PublicId", "SystemId", and "Notation"
        parameters will be undefined. For external entities, the
        "Value" parameter will be undefined, the "SystemId"
        parameter will have the system id, the "PublicId" parameter
        will have the public id if it was provided (it will be
        undefined otherwise), and the "Notation" parameter will
        contain the notation name for unparsed entities. If this is
        a parameter entity declaration, then a '%' will be prefixed
        to the entity name.

        Note that "entity_decl()" and "unparsed_entity_decl()"
        overlap. If both methods are implemented by a handler, then
        this handler will not be called for unparsed entities.
    element_decl
        Receive notification of an element declaration event.

            Name            The element type name.
            Model           The content model as a string.

    attlist_decl
        Receive notification of an attribute list declaration
        event.

        This handler is called for each attribute in an ATTLIST
        declaration found in the internal subset. So an ATTLIST
        declaration that has multiple attributes will generate
        multiple calls to this handler.

            ElementName     The element type name.
            AttributeName   The attribute name.
            Type            The attribute type.
            Fixed           True if this is a fixed attribute.

        The default for "Type" is the default value, which will
        either be "#REQUIRED", "#IMPLIED" or a quoted string (i.e.
        the returned string will begin and end with a quote
        character).

    doctype_decl
        Receive notification of a DOCTYPE declaration event.

            Name            The document type name.
            SystemId        The document's system identifier.
            PublicId        The document's public identifier, if any.
            Internal        The internal subset as a string, if any.

        "Internal" will contain all whitespace, comments,
        processing instructions, and declarations seen in the
        internal subset. The declarations will be there whether or
        not they have been processed by another handler (except for
        unparsed entities processed by the Unparsed handler).
        However, comments and processing instructions will not
        appear if they've been processed by their respective
        handlers.

    xml_decl
        Receive notification of an XML declaration event.

            Version         The version.
            Encoding        The encoding string, if any.
            Standalone      True, false, or undefined if not
                            declared.
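
A DTD handler can be sketched along the same lines. The example below is
only an illustration under stated assumptions: the "DeclLog" package and
the sample document are hypothetical, and only the "Version" and "Name"
properties are read, since the exact form of the other properties can
depend on the installed XML::Parser version.

```perl
use strict;
use warnings;
use XML::Parser::PerlSAX;

# Hypothetical handler that records a few declaration events.
package DeclLog;

sub new {
    my ($class) = @_;
    return bless { version => undef, doctype => undef, elements => [] }, $class;
}

sub xml_decl {
    my ($self, $decl) = @_;
    $self->{version} = $decl->{Version};
}

sub doctype_decl {
    my ($self, $decl) = @_;
    $self->{doctype} = $decl->{Name};
}

sub element_decl {
    my ($self, $decl) = @_;
    push @{ $self->{elements} }, $decl->{Name};
}

package main;

my $xml = <<'XML';
<?xml version="1.0"?>
<!DOCTYPE note [
  <!ELEMENT note (#PCDATA)>
]>
<note>hi</note>
XML

my $decls  = DeclLog->new;
my $parser = XML::Parser::PerlSAX->new(DTDHandler => $decls);
$parser->parse($xml);

print "version:  $decls->{version}\n";
print "doctype:  $decls->{doctype}\n";
print "elements: @{ $decls->{elements} }\n";
```

Because only "DTDHandler" is supplied, document events such as
"start_element()" are silently ignored during this parse.
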
EntityResolver
    resolve_entity
        Allow the handler to resolve external entities.

            Name            The entity name.
            SystemId        The entity's system identifier.
            PublicId        The entity's public identifier, if any.
            Base            The base for resolving a relative URI,
                            if any.

        "resolve_entity()" should return undef to request that the
        parser open a regular URI connection to the system
        identifier, or a hash describing the new input source. This
        hash has the same properties as the "Source" parameter to
        "parse()":

            PublicId        The public identifier of the external
                            entity being referenced, or undef if
                            none was supplied.
            SystemId        The system identifier of the external
                            entity being referenced.
            String          A string containing XML text.
            ByteStream      An open file handle.
            CharacterStream An open file handle.
            Encoding        The character encoding, if known.

AUTHOR

Ken MacLeod, ken@bitsko.slc.ut.us

SEE ALSO

perl(1), PerlSAX.pod(3)

Extensible Markup Language (XML) <http://www.w3c.org/XML/>

SAX 1.0: The Simple API for XML <http://www.megginson.com/SAX/>