CNTRAINING(1)

NAME

cntraining - character normalization training for tesseract

SYNOPSIS

cntraining fontfile_1.tr fontfile_2.tr ...

DESCRIPTION

This manual page briefly documents the cntraining command.

cntraining is part of the process of training tesseract for a new language. Once the character features of all the training pages have been extracted, they need to be clustered to create the prototypes. The character shape features are clustered using the mftraining and cntraining programs.

cntraining reads the .tr feature files named on the command line and outputs the normproto data file (the character normalization sensitivity prototypes).
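As an illustration only, the invocation shown in the SYNOPSIS can be wrapped in a small shell function that passes every .tr file in a directory to cntraining. The function name run_cntraining and the directory argument are hypothetical and not part of tesseract; the sketch assumes the .tr files have already been produced by an earlier training step and that cntraining is on the PATH.

```shell
# run_cntraining DIR
# Hypothetical helper (not shipped with tesseract): runs cntraining on
# every .tr feature file found in DIR (default: current directory).
run_cntraining() {
    dir="${1:-.}"

    # Expand the glob into the positional parameters.
    set -- "$dir"/*.tr

    # If the glob matched nothing, "$1" is the literal pattern and does not exist.
    if [ ! -e "$1" ]; then
        echo "no .tr files found in $dir" >&2
        return 1
    fi

    # Fail clearly if the tool is not installed.
    if ! command -v cntraining >/dev/null 2>&1; then
        echo "cntraining not found in PATH" >&2
        return 127
    fi

    # Same form as the SYNOPSIS: cntraining fontfile_1.tr fontfile_2.tr ...
    cntraining "$@"
}
```

Calling run_cntraining with the directory that holds the feature files is then equivalent to listing each .tr file by hand.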

tesseract is a commercial-quality OCR engine originally developed at HP between 1985 and 1995. In 1995, it was among the top three engines evaluated by UNLV. It was open-sourced by HP and UNLV in 2005.

SEE ALSO

feh(1), convert(1), mftraining(1), tesseract(1), unicharset_extractor(1), wordlist2dawg(1).

AUTHOR

tesseract was written by Ray Smith.

This manual page was written by Jeffrey Ratcliffe <Jeffrey.Ratcliffe@gmail.com>, for the Debian project (but may be used by others).