MFTRAINING(1)

NAME

mftraining - cluster character shape features into prototypes for tesseract training

SYNOPSIS

mftraining is part of the process of training tesseract for a new
language. When the character features of all the training pages have
been extracted, we need to cluster them to create the prototypes. The
character shape features can be clustered using the mftraining and
cntraining programs:

mftraining fontfile_1.tr fontfile_2.tr ...

This will output two data files: inttemp (the shape prototypes) and
pffmtable (the number of expected features for each character). (A
third file called Microfeat is also written by this program, but it is
not used.)
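
As a rough illustration of the step above, the following shell sketch runs mftraining on every .tr file in a directory and then checks that inttemp and pffmtable were produced. The function names missing_outputs and cluster_features are hypothetical helpers introduced here, and the sketch assumes that mftraining is on PATH and writes its output files into the current working directory; neither assumption is stated by this manual page.

```shell
#!/bin/sh
# Hypothetical helpers, not part of tesseract.

# Print the name of each expected output file that is missing from
# directory $1; print nothing when both files are present.
missing_outputs() {
    for f in inttemp pffmtable; do
        [ -f "$1/$f" ] || printf '%s\n' "$f"
    done
}

# Run mftraining inside directory $1 on all of its .tr files, then
# verify the outputs. Assumes mftraining is on PATH and writes into
# the current working directory.
cluster_features() {
    (
        cd "$1" || exit 1
        set -- *.tr
        # If the glob matched nothing, $1 is the literal pattern.
        [ -e "$1" ] || { echo "no .tr files found" >&2; exit 1; }
        mftraining "$@"
    ) || return 1

    missing=$(missing_outputs "$1")
    if [ -n "$missing" ]; then
        echo "mftraining did not write: $missing" >&2
        return 1
    fi
}
```

With these definitions loaded, calling cluster_features with the directory that holds the .tr files would return non-zero if either data file is absent afterwards.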

DESCRIPTION

This manual page briefly documents the mftraining command.

tesseract is a commercial-quality OCR engine originally developed at HP between 1985 and 1995. In 1995, it was among the top three engines evaluated by UNLV. It was open-sourced by HP and UNLV in 2005.

SEE ALSO

feh(1), convert(1), tesseract(1), cntraining(1), unicharset_extractor(1), wordlist2dawg(1).

AUTHOR

tesseract was written by Ray Smith.

This manual page was written by Jeffrey Ratcliffe <Jeffrey.Ratcliffe@gmail.com>, for the Debian project (but may be used by others).