htmldoc(1)
NAME
htmldoc - convert HTML source files into HTML, PostScript, or PDF.
SYNOPSIS
htmldoc [options] filename1.html [ ... filenameN.html ]
htmldoc [options]
htmldoc [filename.book]
DESCRIPTION
HTMLDOC converts HTML source files into indexed HTML, PostScript, or
Portable Document Format (PDF) files that can be viewed online or
printed. With no options an HTML document is produced on stdout.
The second form of HTMLDOC reads HTML source from stdin, which allows
you to use HTMLDOC as a filter.
The third form of HTMLDOC launches a graphical interface that allows
you to change options and generate documents interactively.
COMMON MISTAKES
There are two types of HTML files - structured documents using headings (H1, H2, etc.) which HTMLDOC calls "books", and unstructured documents that do not use headings which HTMLDOC calls "web pages".
A very common mistake is to try converting a web page using:
htmldoc -f filename.pdf filename.html
which will likely produce a PDF file with no pages. To convert web page files you must use the --webpage or --continuous option on the command line, or choose Web Page or Continuous in the input tab of the GUI.
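As a rough illustration of the corrected invocation, the sketch below prints the web page form of the command and runs it only when htmldoc and the input file are actually present. The helper name webpage_cmd and the file names are placeholders for this example, not part of HTMLDOC; only --webpage and -f come from this manual page.

```shell
#!/bin/sh
# Print the command line for converting an unstructured web page to PDF.
# webpage_cmd is a hypothetical helper used only for illustration.
webpage_cmd() {
    # $1 = input HTML file, $2 = output PDF file
    printf 'htmldoc --webpage -f %s %s\n' "$2" "$1"
}

# Show the command that should be used instead of the failing one.
webpage_cmd filename.html filename.pdf

# Run the conversion only if htmldoc is installed and the input exists.
if command -v htmldoc >/dev/null 2>&1 && [ -f filename.html ]; then
    htmldoc --webpage -f filename.pdf filename.html
fi
```

Replacing --webpage with --continuous in the same command would, per the option descriptions, omit the page breaks between input files.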
OPTIONS
The following command-line options are supported by HTMLDOC:
- --batch filename.book
- Generates the specified book file without opening the GUI.
- --bodycolor color
- Specifies the background color for all pages.
- --bodyfont {courier,helvetica,monospace,sans,serif,times}
- --textfont {courier,helvetica,monospace,sans,serif,times}
- Specifies the default typeface for all normal text.
- --bodyimage filename
- Specifies the background image that is tiled on all pages.
- --book
- Specifies that the HTML sources are structured (headings, chapters, etc.)
- --bottom margin
- Specifies the bottom margin in points (no suffix or ##pt), inches (##in), centimeters (##cm), or millimeters (##mm).
- --charset charset-id
- Specifies the ISO character set to use for the output. Supported charsets include some Windows code pages (cp-###), ISO 8859 sets 1-9, 14, and 15 (iso8859-##), and koi8-r.
- --color
- Specifies that PostScript or PDF output should be in color.
- --continuous
- Specifies that the HTML sources are unstructured (plain web pages). No page breaks are inserted between files or URLs in the output.
- --datadir directory
- Specifies the location of the HTMLDOC data files, usually /usr/share/htmldoc or C:\Program Files\HTMLDOC.
- --duplex
- Specifies that the output should be formatted for double-sided printing.
- --effectduration { 0.1..10.0 }
- Specifies the duration in seconds of PDF page transition effects.
- --embedfonts
- Specifies that fonts should be embedded in PDF and PostScript output.
- --encryption
- Enables encryption of PDF files.
- --fontsize size
- Specifies the default font size for body text.
- --fontspacing spacing
- Specifies the default line spacing for body text. The line spacing is a multiplier for the font size, so a value of 1.2 will provide an additional 20% of space between the lines.
- --footer fff
- Sets the page footer to use on body pages. See the HEADER/FOOTER FORMATS section below.
- --format format
- -t format
- Specifies the output format: html, htmlsep (separate HTML files for each heading in the table-of-contents), ps or ps2 (PostScript Level 2), ps1 (PostScript Level 1), ps3 (PostScript Level 3), pdf11 (PDF 1.1/Acrobat 2.0), pdf12 (PDF 1.2/Acrobat 3.0), pdf or pdf13 (PDF 1.3/Acrobat 4.0), or pdf14 (PDF 1.4/Acrobat 5.0).
- --gray
- Specifies that PostScript or PDF output should be grayscale.
- --header fff
- Sets the page header to use on body pages. See the HEADER/FOOTER FORMATS section below.
- --headfootfont font
- Sets the font to use on headers and footers.
- --headfootsize size
- Sets the size of the font to use on headers and footers.
- --headingfont typeface
- Sets the typeface to use for headings.
- --help
- Displays a summary of command-line options.
- --helpdir directory
- Specifies the location of the HTMLDOC on-line help files, usually /usr/share/doc/htmldoc or C:\Program Files\HTMLDOC\DOC.
- --jpeg[=quality]
- Sets the JPEG compression level to use for large images. A value of 0 disables JPEG compression.
- --left margin
- Specifies the left margin in points (no suffix or ##pt), inches (##in), centimeters (##cm), or millimeters (##mm).
- --linkcolor color
- Sets the color of links.
- --links
- Enables generation of links in PDF files (default).
- --linkstyle {plain,underline}
- Sets the style of links.
- --logoimage filename
- Specifies an image to be used as a logo in the header or footer in a PostScript or PDF document, and in the navigation bar of an HTML document.
- Note that you need to use the --header and/or --footer options with the l parameter or use the corresponding HTML page comments to display the logo image in the header or footer.
- --no-compression
- Disables compression of PostScript or PDF files.
- --no-duplex
- Disables double-sided printing.
- --no-embedfonts
- Specifies that fonts should not be embedded in PDF and PostScript output.
- --no-encryption
- Disables document encryption.
- --no-jpeg
- Disables JPEG compression of large images.
- --no-links
- Disables generation of links in a PDF document.
- --no-numbered
- Disables automatic heading numbering.
- --no-pscommands
- Disables generation of PostScript setpagedevice commands.
- --no-strict
- Disables strict HTML input checking.
- --no-title
- Disables generation of a title page.
- --no-toc
- Disables generation of a table of contents.
- --numbered
- Numbers all headings in a document.
- --nup pages
- Sets the number of pages that are placed on each output page. Valid values are 1, 2, 4, 6, 9, and 16.
- --outdir directory
- -d directory
- Specifies that output should be sent to a directory in multiple files. (Not compatible with PDF output.)
- --outfile filename
- -f filename
- Specifies that output should be sent to a single file.
- --owner-password password
- Sets the owner password for encrypted PDF files.
- --pageduration {1.0..60.0}
- Sets the view duration of a page in a PDF document.
- --pageeffect effect
- Specifies the page transition effect for all pages; this attribute is ignored by all Adobe PDF viewers...
- --pagelayout {single,one,twoleft,tworight}
- Specifies the initial layout of pages for a PDF file.
- --pagemode {document,outlines,fullscreen}
- Specifies the initial viewing mode for a PDF file.
- --path
- Specifies a search path for files in a document.
- --permissions permission[,permission,...]
- Specifies document permissions for encrypted PDF files. The following permissions are understood: all, none, annotate, no-annotate, copy, no-copy, modify, no-modify, print, and no-print. Separate multiple permissions with commas.
- --pscommands
- Specifies that PostScript setpagedevice commands should be included in the output.
- --quiet
- Suppresses all messages, even error messages.
- --referer url
- Specifies the URL that is passed in the Referer: field of HTTP requests.
- --right margin
- Specifies the right margin in points (no suffix or ##pt), inches (##in), centimeters (##cm), or millimeters (##mm).
- --size pagesize
- Specifies the page size using a standard name or in points (no suffix or ##x##pt), inches (##x##in), centimeters (##x##cm), or millimeters (##x##mm). The standard sizes that are currently recognized are "letter" (8.5x11in), "legal" (8.5x14in), "a4" (210x297mm), and "universal" (8.27x11in).
- --strict
- Enables strict HTML input checking.
- --textcolor color
- Specifies the default color of all text.
- --title
- Enables the generation of a title page.
- --titlefile filename
- --titleimage filename
- Specifies the file to use for the title page. If the file is an image then the title page is automatically generated using the document meta data and title image.
- --tocfooter fff
- Sets the page footer to use on table-of-contents pages. See the HEADER/FOOTER FORMATS section below.
- --tocheader fff
- Sets the page header to use on table-of-contents pages. See the HEADER/FOOTER FORMATS section below.
- --toclevels levels
- Sets the number of levels in the table-of-contents.
- --toctitle string
- Sets the title for the table-of-contents.
- --top margin
- Specifies the top margin in points (no suffix or ##pt), inches (##in), centimeters (##cm), or millimeters (##mm).
- --user-password password
- Specifies the user password for encryption of PDF files.
- --verbose
- -v
- Provides verbose messages.
- --version
- Displays the current version number.
- --webpage
- Specifies that the HTML sources are unstructured (plain web pages). A page break is inserted between each file or URL in the output.
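To show how several of the options above combine, here is a minimal sketch that assembles a command line for a structured ("book") document. Every option is taken from the list above; the function name book_cmd, the file names, and the chosen values (page size, margins, heading levels) are placeholders for illustration only.

```shell
#!/bin/sh
# Assemble, print, and (if possible) run a "book" conversion.
# book_cmd is a hypothetical helper; the file names are placeholders.
book_cmd() {
    # Build the argument list; margins mix the inch and millimeter suffixes.
    set -- --book --format pdf14 --size a4 \
        --top 1in --bottom 1in --left 20mm --right 20mm \
        --numbered --toclevels 2 --toctitle Contents \
        --outfile manual.pdf chapter1.html chapter2.html

    # Show the command that would be run.
    echo "htmldoc $*"

    # Run it only if htmldoc is installed and both inputs exist.
    if command -v htmldoc >/dev/null 2>&1 \
        && [ -f chapter1.html ] && [ -f chapter2.html ]; then
        htmldoc "$@"
    fi
}

book_cmd
```

Keeping the arguments in the positional parameters and passing them as "$@" avoids re-splitting values that contain spaces, such as a longer --toctitle string.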
HEADER/FOOTER FORMATS
The header and footer of each page can contain up to three preformatted
values. These values are specified using a single character for the
left, middle, and right of the page, resulting in the fff notation
shown previously.
Each character can be one of the following:
- .
- blank
- /
- n/N arabic page numbers (1/3, 2/3, 3/3)
- :
- c/C arabic chapter page numbers (1/2, 2/2, 1/4, 2/4, ...)
- 1
- arabic numbers (1, 2, 3, ...)
- a
- lowercase letters
- A
- uppercase letters
- c
- current chapter heading
- C
- current chapter page number (arabic)
- d
- current date
- D
- current date and time
- h
- current heading
- i
- lowercase roman numerals
- I
- uppercase roman numerals
- l
- logo image
- t
- title text
- T
- current time
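The fff notation can be easier to read with a small decoder. The sketch below maps each of the three characters to the description in the list above. The function names describe_char and describe_fff are made up for this example, and the use of "." for a blank field is assumed from HTMLDOC's documented syntax.

```shell
#!/bin/sh
# Describe one header/footer format character, following the list above.
# describe_char and describe_fff are hypothetical helper names.
describe_char() {
    case "$1" in
        .) echo "blank" ;;   # assumption: "." marks a blank field
        /) echo "n/N arabic page numbers" ;;
        :) echo "c/C arabic chapter page numbers" ;;
        1) echo "arabic numbers" ;;
        a) echo "lowercase letters" ;;
        A) echo "uppercase letters" ;;
        c) echo "current chapter heading" ;;
        C) echo "current chapter page number" ;;
        d) echo "current date" ;;
        D) echo "current date and time" ;;
        h) echo "current heading" ;;
        i) echo "lowercase roman numerals" ;;
        I) echo "uppercase roman numerals" ;;
        l) echo "logo image" ;;
        t) echo "title text" ;;
        T) echo "current time" ;;
        *) echo "unknown" ;;
    esac
}

# Describe the left, middle, and right fields of an fff string.
describe_fff() {
    fff=$1
    if [ "${#fff}" -ne 3 ]; then
        echo "expected exactly three characters" >&2
        return 1
    fi
    left=${fff%??}                        # first character
    middle=${fff#?}; middle=${middle%?}   # second character
    right=${fff#??}                       # third character
    echo "left: $(describe_char "$left")"
    echo "middle: $(describe_char "$middle")"
    echo "right: $(describe_char "$right")"
}

describe_fff .t.
describe_fff h.1
```

Under these assumptions, a command such as htmldoc --book --header .t. --footer h.1 -f manual.pdf manual.html would place the title in the middle of the header, and the current heading and an arabic page number at the left and right of the footer.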
ENVIRONMENT VARIABLES
HTMLDOC looks for several environment variables which can override the
default directories, display additional debugging information, and disable CGI mode:
- HTMLDOC_DATA
- This environment variable specifies the location of HTMLDOC's data and fonts directories, normally /usr/share/htmldoc or C:\Program Files\Easy Software Products\HTMLDOC.
- HTMLDOC_DEBUG
- This environment variable enables debugging information that is sent to stderr. The value is a list of any of the following keywords separated by spaces: "all", "links", "memory", "remotebytes", "table", "tempfiles", and/or "timing".
- HTMLDOC_HELP
- This environment variable specifies the location of HTMLDOC's documentation directory, normally /usr/share/doc/htmldoc or C:\Program Files\Easy Software Products\HTMLDOC\doc.
- HTMLDOC_NOCGI
- This environment variable, when set (the value doesn't matter), disables CGI mode. It is most useful for using HTMLDOC on a web server from a scripting language or invocation from a program.
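A minimal sketch of how these variables might be set before calling HTMLDOC from a script follows. The variable names and debug keywords come from the list above; the file names report.html, report.pdf, and htmldoc-debug.log are placeholders.

```shell
#!/bin/sh
# Disable CGI mode; the value itself does not matter, only that it is set.
HTMLDOC_NOCGI=1

# Request debugging output for links and timing only (space-separated keywords).
HTMLDOC_DEBUG="links timing"

export HTMLDOC_NOCGI HTMLDOC_DEBUG

# Debugging information goes to stderr, so capture it in a log file.
# The conversion runs only if htmldoc is installed and the input exists.
if command -v htmldoc >/dev/null 2>&1 && [ -f report.html ]; then
    htmldoc --webpage -f report.pdf report.html 2>htmldoc-debug.log
fi
```

Exporting the variables in the calling script, instead of setting them system-wide, keeps the change limited to that script and the processes it starts.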
SEE ALSO
HTMLDOC Software Users Manual
http://www.htmldoc.org/documentation.php
AUTHOR
Michael Sweet, Easy Software Products
TRADEMARKS
PostScript is a trademark that may be registered in some countries and
Adobe is a registered trademark of Adobe Systems Incorporated.
COPYRIGHTS
Portable Document Format Copyright 1993-1999 by Adobe Systems Incorporated.
HTMLDOC and <HTML>DOC are the trademark property of Easy Software Products. HTMLDOC is copyright 1997-2005 by Easy Software Products.
This program is based in part on the work of the Independent JPEG
Group.
NO WARRANTY
This program is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.