CNTRAINING(1)
NAME
cntraining - character normalization training for tesseract
SYNOPSIS
Part of the process to train tesseract for a new language. When the character features of all the training pages have been extracted, we need to cluster them to create the prototypes. The character shape features can be clustered using the mftraining and cntraining programs:
    cntraining fontfile_1.tr fontfile_2.tr ...
This will output the normproto data file (the character normalization sensitivity prototypes).
DESCRIPTION
This manual page briefly documents the cntraining command.
tesseract is a commercial-quality OCR engine originally developed at HP between 1985 and 1995. In 1995, this engine was among the top 3 evaluated by UNLV. It was open-sourced by HP and UNLV in 2005.
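EXAMPLES
The invocation shown in the synopsis takes one .tr file per training page. As a minimal sketch, the Python helper below collects every .tr file in a directory and builds the matching cntraining command line. The function names and the directory layout are assumptions made for illustration; only the "cntraining file.tr ..." form comes from this page.

```python
import shutil
import subprocess
from pathlib import Path


def build_cntraining_command(training_dir):
    """Return the argv list for cntraining over every .tr file in training_dir.

    Hypothetical helper: it only mirrors the documented form
    `cntraining fontfile_1.tr fontfile_2.tr ...`.
    """
    tr_files = sorted(Path(training_dir).glob("*.tr"))
    if not tr_files:
        raise FileNotFoundError(f"no .tr files found in {training_dir}")
    return ["cntraining"] + [str(path) for path in tr_files]


def run_cntraining(training_dir):
    """Run cntraining on the collected files, if the program is installed."""
    command = build_cntraining_command(training_dir)
    if shutil.which(command[0]) is None:
        raise RuntimeError("cntraining was not found on PATH")
    # check=True raises CalledProcessError on a non-zero exit status.
    subprocess.run(command, check=True)
```

Calling run_cntraining("training") would then be expected to produce the normproto file described above. Where that file is written is not stated on this page, so check the working directory after the run.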
SEE ALSO
feh(1), convert(1), mftraining(1), tesseract(1), unicharset_extractor(1), wordlist2dawg(1).
AUTHOR
tesseract was written by Ray Smith.
This manual page was written by Jeffrey Ratcliffe <Jeffrey.Ratcliffe@gmail.com>, for the Debian project (but may be used by others).