djvu2hocr(1)

NAME

djvu2hocr - DjVu to hOCR converter

SYNOPSIS

djvu2hocr [option...] djvu-file
djvu2hocr {--version | --help | -h}

DESCRIPTION

djvu2hocr converts hidden text from a DjVu file to the hOCR[1] format.
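
As a rough illustration of what hOCR output looks like to a consumer, the sketch below pulls words and their bounding boxes out of an hOCR fragment using only the Python standard library. The sample markup is hand-written for this example and is not actual djvu2hocr output; the names parse_bbox, WordCollector and extract_words are hypothetical helpers, and the sketch assumes words are marked with the hOCR class ocrx_word and contain no nested elements.

```python
from html.parser import HTMLParser

# Hand-written hOCR fragment for illustration only (not real djvu2hocr output).
SAMPLE = (
    '<div class="ocr_page" title="bbox 0 0 600 800">'
    '<span class="ocr_line" title="bbox 50 100 550 130">'
    '<span class="ocrx_word" title="bbox 50 100 120 130">Hello</span> '
    '<span class="ocrx_word" title="bbox 130 100 210 130">world</span>'
    '</span></div>'
)


def parse_bbox(title):
    """Return (x0, y0, x1, y1) from an hOCR title attribute, or None."""
    # hOCR properties are separated by semicolons, e.g. "bbox 1 2 3 4; x_wconf 90".
    for prop in title.split(";"):
        parts = prop.split()
        if len(parts) == 5 and parts[0] == "bbox":
            return tuple(int(p) for p in parts[1:])
    return None


class WordCollector(HTMLParser):
    """Collect (text, bbox) pairs for elements whose class includes ocrx_word."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._in_word = False
        self._bbox = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "ocrx_word" in (attrs.get("class") or "").split():
            self._in_word = True
            self._bbox = parse_bbox(attrs.get("title") or "")
            self._text = []

    def handle_data(self, data):
        if self._in_word:
            self._text.append(data)

    def handle_endtag(self, tag):
        # Assumption: a word element has no nested tags, so the first
        # closing tag seen while inside a word ends that word.
        if self._in_word:
            self.words.append(("".join(self._text), self._bbox))
            self._in_word = False


def extract_words(hocr):
    collector = WordCollector()
    collector.feed(hocr)
    return collector.words


if __name__ == "__main__":
    for text, bbox in extract_words(SAMPLE):
        print(text, bbox)
```

Real hOCR files may nest further markup inside words or carry extra title properties, so a production consumer would need a more careful parser.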

OPTIONS

Text segmentation options

--word-segmentation=simple
    Use the word segmentation found in the DjVu file. This is the default.
--word-segmentation=uax29
    Use the Unicode Text Segmentation[2] algorithm to break lines into words. This may correct the word segmentation found in the DjVu file.

Other options

--version
    Output version information and exit.
-h, --help
    Display help and exit.
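
The options above can be combined into a command line as in the following minimal sketch, which drives djvu2hocr from Python. The helpers build_command and convert, and the file names, are hypothetical. The sketch also assumes that djvu2hocr writes the hOCR document to standard output, which this page does not state explicitly.

```python
import shutil
import subprocess

# The two values documented for --word-segmentation.
SEGMENTATION_MODES = ("simple", "uax29")


def build_command(djvu_file, word_segmentation="simple"):
    """Build the argument list for a djvu2hocr invocation (hypothetical helper)."""
    if word_segmentation not in SEGMENTATION_MODES:
        raise ValueError(f"unknown word segmentation: {word_segmentation!r}")
    return ["djvu2hocr", f"--word-segmentation={word_segmentation}", djvu_file]


def convert(djvu_file, hocr_file, word_segmentation="simple"):
    """Run djvu2hocr and save its output to hocr_file.

    Assumption: the hOCR document is written to standard output.
    """
    if shutil.which("djvu2hocr") is None:
        raise FileNotFoundError("djvu2hocr is not on PATH")
    with open(hocr_file, "wb") as out:
        subprocess.run(
            build_command(djvu_file, word_segmentation),
            stdout=out,
            check=True,  # raise CalledProcessError on a non-zero exit status
        )


if __name__ == "__main__":
    # Only prints the command; call convert() to actually run the tool.
    print(build_command("book.djvu", "uax29"))
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.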

SEE ALSO

djvu(1)

AUTHOR

Jakub Wilk <ubanus@users.sf.net>
Author.

COPYRIGHT

Copyright (C) 2009, 2010 Jakub Wilk

NOTES

1. hOCR
http://docs.google.com/View?docid=dfxcv4vc_67g844kf
2. Unicode Text Segmentation
http://unicode.org/reports/tr29/