html2latex(1)

NAME

html2latex - HTML to LaTeX converter

SYNOPSIS

html2latex [OPTION]... URLS...

DESCRIPTION

html2latex uses HTML::TreeBuilder to parse an HTML file
and then converts the resulting HTML::Element tree into a
LaTeX file. Each URL has any .*html extension stripped. If
you give a URL, the files fetched from the Internet are
stored in your ~/.html2latex directory. If pictures
are included, they are converted to PNG, which can only
be used with pdflatex. As an added bonus, there is an
option to automatically create a PDF from the LaTeX file
(using pdflatex).
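
The extension stripping mentioned above can be pictured with a
small Python sketch. It assumes that ".*html" is meant as a
glob, so that .html, .shtml and .xhtml all count; the function
name strip_html_extension is hypothetical and is not part of
html2latex.

```python
import re

def strip_html_extension(name: str) -> str:
    """Drop one trailing extension that ends in 'html' (assumed reading of '.*html')."""
    return re.sub(r"\.[A-Za-z]*html$", "", name)

print(strip_html_extension("index.html"))   # index
print(strip_html_extension("page.shtml"))   # page
print(strip_html_extension("notes.txt"))    # notes.txt (unchanged)
```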

REQUIRES

If html2latex is not working correctly, you may be
missing some of the Perl modules it needs.
html2latex requires HTML::TreeBuilder and possibly
LWP::Simple and URI. If you do not have all of these, try
typing `perl -MCPAN -e shell' at the command line. This
brings up a shell for CPAN (the Comprehensive Perl Archive
Network). Then, as root, try typing
`install HTML::TreeBuilder'. Should work like magic.

URLS

In your list of URLs, any filename given after a URL
continues to use the most recent host given. Also, files
default to index.html, regardless of what the server
thinks. So, if you type:

`html2latex http://slashdot.org foo.html http://linuxtoday.net bar.html'

html2latex will try to grab http://slashdot.org/index.html,
http://slashdot.org/foo.html,
http://linuxtoday.net/index.html, and
http://linuxtoday.net/bar.html.
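
The host carry-over rule can be sketched in Python as follows.
This is only an illustration of the behaviour described above,
not the actual html2latex code; the function name
resolve_arguments is hypothetical, and treating a path that
ends in "/" as a request for index.html is an assumption.

```python
from urllib.parse import urlsplit

def resolve_arguments(args):
    """Expand a mixed list of URLs and bare filenames into full URLs."""
    resolved = []
    host = None  # most recent scheme://host seen so far
    for arg in args:
        parts = urlsplit(arg)
        if parts.scheme and parts.netloc:
            # A full URL: remember its host for later bare filenames.
            host = f"{parts.scheme}://{parts.netloc}"
            path = parts.path
            if path == "":
                path = "/index.html"      # default document
            elif path.endswith("/"):
                path += "index.html"      # assumption: directories default too
            resolved.append(host + path)
        elif host is not None:
            # A bare filename: attach it to the latest host.
            resolved.append(f"{host}/{arg.lstrip('/')}")
        else:
            # No host given yet: leave it as a local file name.
            resolved.append(arg)
    return resolved

urls = resolve_arguments(
    ["http://slashdot.org", "foo.html", "http://linuxtoday.net", "bar.html"]
)
for url in urls:
    print(url)
```

Run on the example command line, the sketch prints the same
four URLs listed above.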

OPTIONS

Command-line options are secondary to document-specified
options. So, if your HTML file has border=1, a border will
be printed regardless of the --border option. They do
override, however, options given in the configuration file.
If you want to change things more permanently, try changing
the config file, html2latex.xml. For information on it, see
the HTML::Latex manpage under section CONFIGURATION FILE.
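
The precedence order (document first, then command line, then
configuration file) can be modelled with a layered lookup. The
Python sketch below is illustrative only; the dictionary keys
are hypothetical and do not reflect how html2latex stores its
settings.

```python
from collections import ChainMap

# Hypothetical settings from each source, lowest priority first.
config_file = {"border": False, "font_size": 12}
command_line = {"border": False, "font_size": 11}
document = {"border": True}  # e.g. the HTML says border=1

# ChainMap returns the value from the first mapping that has the key,
# so the document wins over the command line, which wins over the config.
effective = ChainMap(document, command_line, config_file)

print(effective["border"])     # True
print(effective["font_size"])  # 11
```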

-h -? --help
Print the brief help and usage.
--latex2pdf --pdf -p
Automatically create a PDF named FILE.pdf along with
the LaTeX file. This may fail and print a number of
cryptic errors.
-i --image --image_scale=SCALE
Set the scale for images in the LaTeX file. This is
useful because some images in HTML are much too big to
fit on a page. Default is 1.0. SCALE can be any positive
floating-point number; large numbers are not
recommended.
-f --font --font_size=SIZE
Set the default font size. Can be 10-12. Do not try
anything else. html2latex will not check it, but the
latex file will not compile (at least I think not).
Default is 12.
-d --debug
Level of debugging info to print. The more times this
option is used, the higher the level. Default is 0,
and you cannot lower that. Right now, 0 prints
nothing, 1 prints fun code-tracking info, and 2 prints
lots of data-structure information, so don't do it
unless you're serious.
--border --table --table_border
Turns table borders on. Default is off. Also,
--noborder or --notable will explicitly turn table
borders off.
--class --document --document_class=CLASS
Set the document class to use. Any valid LaTeX
document class is accepted. Examples are report, book,
and article. article is the default. If an invalid
document class is used, the output LaTeX file will not
compile.
--package=PACKAGE
html2latex will create a LaTeX file using any packages
that you specify. PACKAGE will be added to the list
of packages to put in the file. html2latex will not
make sure the packages are valid, but if they aren't,
the LaTeX file won't compile.
--head=HEAD
LaTeX allows you to add options in the preamble of the
form \documentclass[OPTIONS]{article}. Each HEAD you
give is added to that list of options. For instance,
you could use `--head=twocolumn' to add the
'twocolumn' feature of LaTeX. Since font sizes are
already added, don't add them yourself. See `--font'.
--mbox -m
With either of these, html2latex will put a \mbox
around each of the tables it creates. I do not know
why, but with a lot of tables (especially nested
ones), the TeX and PDF output works better this way.
So, if you do not like your output with tables, try
this.
--paragraph --par -P
Uses HTML-style paragraphs. This is the default, so
try --noparagraph or --nopar or -P! to turn it back to
LaTeX-style paragraphs.
--cache --local
--log -l LOGFILE
Print all messages to LOGFILE instead of STDERR.
--conf -C CONFFILE
Change the configuration file to CONFFILE. For more
information on this file, see the HTML::Latex manpage.
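
To show how --document_class, --font_size, --head and --package
fit together, here is a rough Python sketch that assembles the
kind of preamble these options describe. It is an assumption
about the shape of the output, not a copy of what html2latex
writes; build_preamble and its parameters are hypothetical
names.

```python
def build_preamble(document_class="article", font_size=12, heads=(), packages=()):
    """Return LaTeX preamble lines for the given (hypothetical) option values."""
    # The font size goes first in the option list, followed by each HEAD.
    options = ",".join([f"{font_size}pt", *heads])
    lines = ["\\documentclass[" + options + "]{" + document_class + "}"]
    # One \usepackage line per PACKAGE, in the order given.
    lines += ["\\usepackage{" + pkg + "}" for pkg in packages]
    return "\n".join(lines)

print(build_preamble(heads=["twocolumn"], packages=["graphicx"]))
# \documentclass[12pt,twocolumn]{article}
# \usepackage{graphicx}
```

As the option descriptions warn, nothing here checks that the
class, font size or package names are valid; a bad value only
shows up when the LaTeX file is compiled.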

DEVELOPMENT

Development is being carried out by Peter Thatcher
(peterthatcher@asu.edu) and Stan Seibert
(volsung@asu.edu). Homepage is
http://html2latex.sourceforge.net.