Hello guys,
I need your help :)
I use Tesseract for text recognition. To do that, I extract a lot of ROIs from my image.
Of course some ROIs are garbage or contain only a symbol (see the pictures). But Tesseract still finds text in them, such as "“v" or a single letter -_-.
Symbol
https://picload.org/image/rdwaaiop/symbol.png
Garbage
https://picload.org/image/rdwaailg/garbage.png
**Initialize Tesseract OCR engine**
tesseract::TessBaseAPI *myOCR = new tesseract::TessBaseAPI();
myOCR->Init(NULL, "eng");
// 7 == tesseract::PSM_SINGLE_LINE (treat the image as a single text line)
tesseract::PageSegMode pagesegmode = static_cast<tesseract::PageSegMode>(7);
myOCR->SetPageSegMode(pagesegmode);
How can I prevent Tesseract from returning text for ROIs like these?
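To make the goal concrete, here is a minimal sketch of the kind of filter I mean, not a tested solution. It assumes the recognized string comes from `myOCR->GetUTF8Text()` and the confidence (0 to 100) from `myOCR->MeanTextConf()`. The helper name `looksLikeText` and both thresholds are made up for illustration and would need tuning on real ROIs.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of Tesseract): decide whether the OCR result
// for one ROI looks like real text. Both thresholds are assumptions.
bool looksLikeText(const std::string& text, int meanConf,
                   int minConf = 60, std::size_t minAlnum = 2) {
    // Reject results whose mean confidence is below the threshold.
    if (meanConf < minConf) return false;

    // Count ASCII letters and digits; bytes of multi-byte UTF-8 characters
    // such as curly quotes are not counted in the default "C" locale.
    std::size_t alnum = 0;
    for (unsigned char c : text) {
        if (std::isalnum(c)) ++alnum;
    }

    // Require a minimum number of alphanumeric characters, so results like
    // a lone letter or a quote plus one letter are dropped.
    return alnum >= minAlnum;
}
```

With these defaults, a result like "“v" or a single letter would be rejected even at high confidence, while a confidently recognized word would pass. Note that the buffer returned by `GetUTF8Text()` has to be released with `delete[]` after copying it into a `std::string`.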
Thanks.