How to scan and OCR like a pro with open source tools
Posted by: Anonymous
[ip: 71.234.246.22]
on June 25, 2008 04:01 AM
Tesseract is the OCR package to use. I've had good success with it in a health-information application. It is especially accurate on numbers, such as dates and record numbers, even across a variety of fonts.
You may find it helpful to put an alarm (a timeout) around your tesseract runs. I've found some particularly nasty pages on which it hangs indefinitely.
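For anyone wanting to try the alarm idea, here is a rough sketch in Python. It uses the `timeout` argument of `subprocess.run` rather than a real signal alarm, so it is a stand-in for the technique, not necessarily what I ran. The helper names `run_with_alarm` and `ocr_page` and the 120-second limit are made up for illustration; the only thing assumed about tesseract is the basic `tesseract <image> <outputbase>` invocation.

```python
import subprocess


def run_with_alarm(cmd, timeout_s):
    """Run cmd and return its exit code, or None if it ran past timeout_s.

    subprocess.run kills the child process when the timeout expires,
    so a hung run does not linger in the background.
    """
    try:
        completed = subprocess.run(
            cmd,
            timeout=timeout_s,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        return None
    return completed.returncode


def ocr_page(image_path, output_base, timeout_s=120):
    """Hypothetical wrapper: OCR one page, giving up after timeout_s seconds.

    Assumes the tesseract binary is on PATH; the 120-second default is an
    arbitrary guess, not a measured value.
    """
    return run_with_alarm(["tesseract", image_path, output_base], timeout_s)
```

A return value of None from `ocr_page` means the page hit the limit, so a batch job can log it and move on to the next page instead of stalling.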