Hi, I’m a newbie here. I tried the Keyword Search module and enabled its OCR function. It could read text from image files, but it couldn’t read text from a scanned PDF document. Is there any way to solve this issue? I really appreciate your help.
What is the MIME type of your document? How was it created?
OCR only processes “image” MIME types.
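To check which type a file really is, regardless of its extension, you can look at its first few bytes. The sketch below is only an illustration of that idea and is not how Autopsy itself identifies file types; the function names `sniff_mime_type` and `is_ocr_candidate` are made up for this example, and the signature table covers just a handful of common formats.

```python
def sniff_mime_type(header: bytes) -> str:
    """Guess a MIME type from the leading bytes of a file (hypothetical helper)."""
    signatures = [
        (b"%PDF-", "application/pdf"),
        (b"\x89PNG\r\n\x1a\n", "image/png"),
        (b"\xff\xd8\xff", "image/jpeg"),
        (b"II*\x00", "image/tiff"),   # little-endian TIFF
        (b"MM\x00*", "image/tiff"),   # big-endian TIFF
    ]
    for magic, mime in signatures:
        if header.startswith(magic):
            return mime
    # Unknown signature: fall back to the generic binary type.
    return "application/octet-stream"


def is_ocr_candidate(header: bytes) -> bool:
    """Mirror the rule above: only "image" MIME types would be sent to OCR."""
    return sniff_mime_type(header).startswith("image/")
```

A scanned PDF starts with `%PDF-`, so under this rule it would be skipped even though every page inside it is a picture.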
I agree with downey that OCR will only process “image” MIME types. To overcome the problem, a module could be developed that converts the PDF to a series of images and inserts them back as derived files. Another method is to extract the text in the PDF to a text file using the pdf2text Python module.
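As a rough sketch of the first approach outside of Autopsy, the pages can be rendered to PNG and then passed to an OCR engine. This example assumes the `pdftoppm` (Poppler) and `tesseract` command-line tools are installed and on the PATH; it is not an Autopsy module, it does not add derived files to a case, and the function names are hypothetical.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def rasterize_command(pdf_path, out_prefix, dpi=300):
    """Build the pdftoppm call that renders each PDF page to a PNG file."""
    return ["pdftoppm", "-r", str(dpi), "-png", str(pdf_path), str(out_prefix)]


def ocr_command(image_path):
    """Build the tesseract call that prints the recognized text to stdout."""
    return ["tesseract", str(image_path), "stdout"]


def page_number(image_path):
    """Read the page number from a name such as page-03.png."""
    return int(Path(image_path).stem.rsplit("-", 1)[1])


def ocr_scanned_pdf(pdf_path, dpi=300):
    """Return the OCR text of every page, joined in page order."""
    # Assumption: both external tools are installed; fail early if not.
    for tool in ("pdftoppm", "tesseract"):
        if shutil.which(tool) is None:
            raise RuntimeError(f"{tool} is not installed or not on PATH")
    with tempfile.TemporaryDirectory() as tmp:
        prefix = Path(tmp) / "page"
        subprocess.run(rasterize_command(pdf_path, prefix, dpi), check=True)
        pages = []
        # Sort numerically so page 10 does not come before page 2.
        for image in sorted(Path(tmp).glob("page-*.png"), key=page_number):
            result = subprocess.run(ocr_command(image), check=True,
                                    capture_output=True, encoding="utf-8")
            pages.append(result.stdout)
    return "\n".join(pages)
```

Calling `ocr_scanned_pdf("scan.pdf")` would return the text as one string, which could then be saved to a text file and indexed for keyword search. The file name `scan.pdf` is just a placeholder.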
It was a ".pdf" file, and it was produced by a Fujitsu scanner. Since I am a fraud examiner, I need software that can read scanned PDF documents. I usually use Nuix software for keyword analysis.
Thanks for your response.
Btw, is there any third-party Autopsy module that can solve this issue?
The upcoming release will be able to OCR scanned PDFs.