html::tree(3)
NAME
HTML::Tree - overview of HTML::TreeBuilder et al
SYNOPSIS
    use HTML::TreeBuilder;
    my $tree = HTML::TreeBuilder->new();
    $tree->parse_file($filename);

    # Then do something with the tree, using HTML::Element
    # methods -- for example:
    $tree->dump;

    # Then:
    $tree->delete;
DESCRIPTION
HTML-Tree is a suite of Perl modules for making parse
trees out of HTML source. It consists mainly of two
modules, whose documentation you should refer to:
HTML::TreeBuilder and HTML::Element.
HTML::TreeBuilder is the module that builds the parse
trees. (It uses HTML::Parser to do the work of breaking
the HTML up into tokens.)
The tree that TreeBuilder builds for you is made up of
objects of the class HTML::Element.
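
As a rough illustration of that relationship, here is a minimal sketch that builds a tree from a string and then queries it with HTML::Element methods. The sample HTML and the variable names are made up for this example and do not come from the distribution. Feeding the source with parse() followed by eof() is only one way to load it; the SYNOPSIS uses parse_file() instead.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical sample input, invented for this sketch.
my $html = '<html><head><title>Demo</title></head>'
         . '<body><p>See <a href="http://example.com/">this link</a>.</p>'
         . '</body></html>';

my $tree = HTML::TreeBuilder->new();
$tree->parse($html);   # feed the HTML source to the builder
$tree->eof();          # signal end of input so the tree is completed

# The nodes returned by look_down are HTML::Element objects.
# In scalar context look_down returns the first match;
# in list context it returns all matches.
my $title = $tree->look_down(_tag => 'title');
my @links = $tree->look_down(_tag => 'a');

# Copy out plain strings before the tree is deleted.
my $title_text = $title->as_text;
my @hrefs      = map { $_->attr('href') } @links;
my @link_texts = map { $_->as_text } @links;

print "Title: $title_text\n";
print "$hrefs[$_] => $link_texts[$_]\n" for 0 .. $#links;

$tree->delete;   # release the tree once you are finished with it
```

The strings are copied out before the call to delete because the element objects should not be used after the tree has been deleted.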
If you find that you do not properly understand the
documentation for HTML::TreeBuilder and HTML::Element, it may
be because you are unfamiliar with tree-shaped data
structures, or with object-oriented modules in general. I have
written some articles for The Perl Journal ("www.tpj.com")
that seek to provide that background: my article "A User's
View of Object-Oriented Modules" in TPJ17; my article
"Trees" in TPJ18; and my article "Scanning HTML" in TPJ19.
The full text of those articles is contained in this
distribution, as:
HTML::Tree::AboutObjects -- article: "User's View of Object-Oriented Modules"
HTML::Tree::AboutTrees -- article: "Trees"
HTML::Tree::Scanning -- article: "Scanning HTML"
Readers already familiar with object-oriented modules and
tree-shaped data structures should read just the last
article. Readers without that background should read the
first, then the second, and then the third.
SEE ALSO
HTML::TreeBuilder, HTML::Element, HTML::Tagset,
HTML::Parser
HTML::DOMbo
The book Perl & LWP by me, Sean M. Burke, published by
O'Reilly and Associates, 2002. ISBN: 0-596-00178-9
It has several chapters to do with HTML processing in
general, and HTML-Tree specifically. There's more info at:
http://www.oreilly.com/catalog/perllwp/
http://www.amazon.com/exec/obidos/ASIN/0596001789
COPYRIGHT
Copyright 1995-1998 Gisle Aas; copyright 1999-2002 Sean M.
Burke. (Except the articles contained in
HTML::Tree::AboutObjects, HTML::Tree::AboutTrees, and
HTML::Tree::Scanning, which are all copyright 2000 The
Perl Journal.)
Except for those three TPJ articles, the whole HTML-Tree
distribution, of which this file is a part, is free
software; you can redistribute it and/or modify it under the
same terms as Perl itself.
Those three TPJ articles may be distributed under the same
terms as Perl itself.
The programs and documentation in this dist are
distributed in the hope that they will be useful, but without
any warranty; without even the implied warranty of
merchantability or fitness for a particular purpose.
AUTHOR
Original HTML-Tree author Gisle Aas <gisle@aas.no>;
current maintainer Sean M. Burke, <sburke@cpan.org>