
[kubuntu] OCR with ocrfeeder and tesseract (or tesseract-ocr)


http://ubuntuforums.org – I'm running Kubuntu 13.04 (Raring) on a Lenovo laptop. I have a scan of a page of typed text in both PDF and TIFF formats, and I'm trying to use OCR (optical character recognition) to turn that scanned image into actual text. From what I've read, the best tool for this is ocrfeeder together with tesseract. The correct procedure appears to be to open ocrfeeder and select File / Import PDF. When I do that, I get a screen that displays the scanned image, though for some reason the file listing on the left shows it as a JPG, not a PDF. I then go to Document / Recognize Document. Wha (Hardware)
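One way to check whether tesseract itself works, independent of the ocrfeeder GUI, might be to run it directly on the TIFF. Below is a rough Python sketch, assuming the `tesseract` binary (Ubuntu package tesseract-ocr) is on the PATH and the English language data is installed. The function names and the file name `scan.tiff` are made up for illustration and are not from the original post; the command it builds is the usual `tesseract <image> <outputbase> -l <lang>` form.

```python
import shutil
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, output_base, lang="eng"):
    """Return the argv list for: tesseract <image> <output_base> -l <lang>."""
    return ["tesseract", str(image_path), str(output_base), "-l", lang]


def ocr_image(image_path, lang="eng"):
    """Run tesseract on image_path and return the recognized text.

    Hypothetical helper: writes <image name without extension>.txt next
    to the image and reads it back.
    """
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract not found on PATH (package tesseract-ocr)")
    image_path = Path(image_path)
    # tesseract appends ".txt" to the output base on its own
    output_base = image_path.with_suffix("")
    subprocess.run(
        build_tesseract_command(image_path, output_base, lang),
        check=True,  # raise CalledProcessError if tesseract fails
    )
    return Path(f"{output_base}.txt").read_text(encoding="utf-8")
```

Calling `ocr_image("scan.tiff")` would then return the recognized text if tesseract succeeds. If that works on the TIFF but ocrfeeder still gives nothing, the problem is more likely in ocrfeeder's PDF import or engine configuration than in tesseract.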