
Create Tesseract OCR Engine Action

WinAutomation supports several OCR Engines, each through its own dedicated Action. The Properties of each Action, and the Values they accept, reflect the capabilities of the corresponding Engine.

This Action creates a Tesseract OCR Engine which, in combination with the Extract Text With OCR Action, lets you extract text from Image Files.

Please Note: Tesseract is the only OCR Engine that comes ready to use with WinAutomation without the need to install it.

The Tesseract Engine can recognize text in a number of different languages [1], and this Action lets you select any of them. It also lets you rescale your image: you can resize its width and height [2] independently of one another by means of multipliers, which is useful because Tesseract works best on images with a DPI (Dots Per Inch) of at least 300. The Action returns an Ocr Engine Data Type [3] stored in a variable:

create-tesseract-ocr-engin.png

Tesseract Language:

This drop-down menu lets you select the language of the text in the image that Tesseract will recognize:

tesseract-available-languages.png

Width & Height Multipliers:

These multipliers allow you to rescale an image in order to help the OCR Engine read the text in it. Tesseract works best on images with a DPI of at least 300. Since DPI, PPI and the optimal image settings for OCR text extraction are easily confused, feel free to experiment with the available options (try, for example, 2, 3 or 4) and keep the value that gives the best results.
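As a rough starting point for that experimentation, the arithmetic behind choosing a multiplier can be sketched in Python. This is only an illustration and not part of WinAutomation: the function names `suggest_multiplier` and `scaled_size` are hypothetical, and the sketch assumes that enlarging an image by a factor k is roughly equivalent to raising its effective DPI by the same factor, and that whole-number multipliers are used, as in the examples above.

```python
import math

TARGET_DPI = 300  # resolution at which Tesseract is said to work best


def suggest_multiplier(source_dpi: float, target_dpi: float = TARGET_DPI) -> int:
    """Return the smallest whole multiplier that brings source_dpi up to target_dpi.

    Hypothetical helper: assumes effective DPI grows linearly with the multiplier.
    """
    if source_dpi <= 0:
        raise ValueError("source_dpi must be positive")
    return max(1, math.ceil(target_dpi / source_dpi))


def scaled_size(width_px: int, height_px: int,
                width_multiplier: float, height_multiplier: float) -> tuple[int, int]:
    """Return the pixel size after applying independent width and height multipliers."""
    return round(width_px * width_multiplier), round(height_px * height_multiplier)


if __name__ == "__main__":
    # Typical screen captures are often 72 or 96 DPI; scans are often 150 or 300 DPI.
    for dpi in (72, 96, 150, 300):
        m = suggest_multiplier(dpi)
        print(f"{dpi} DPI: multiplier {m}, 800x600 becomes {scaled_size(800, 600, m, m)}")
```

Under these assumptions a 96 DPI screenshot would call for a multiplier of 4 and a 150 DPI scan for a multiplier of 2, which is consistent with the range suggested above. The computed value is only a first guess; the best setting for a given image may still differ.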

OCR Engine:

In this text field, enter the name of the variable that will hold the Ocr Engine Data Type value produced by this Action.