Reading Image Based PDFs with Tika Parser

<p>This workflow uses the Tika Parser node to read the characters from a PDF whose content appears to be stored as an image rather than as text. The page has an image at the top that contains some text, with additional text below the image. The workflow then uses an Image Reader (Table) node followed by the Tess4J node to run OCR on the characters in the PDF.</p>
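
<p>Outside KNIME, the same parse-then-OCR idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the workflow's actual implementation: <code>ParsedPdf</code> and <code>read_pdf_text</code> are hypothetical names, the OCR engine is passed in as a plain callable, and the rule "use the text layer if there is one, otherwise OCR each image" is an assumed design choice.</p>

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ParsedPdf:
    """Hypothetical container for what a PDF parser hands back."""
    text: Optional[str]   # embedded text layer, if the PDF has one
    images: List[bytes]   # raw bytes of each embedded image, in page order


def read_pdf_text(parsed: ParsedPdf, ocr: Callable[[bytes], str]) -> str:
    """Return the text layer if present, otherwise OCR every embedded image.

    `ocr` is any function that turns image bytes into a string; it stands in
    for an OCR engine and is an assumption of this sketch.
    """
    text_layer = (parsed.text or "").strip()
    if text_layer:
        return text_layer
    # Image-based PDF: no usable text layer, so fall back to OCR per image.
    recognized = [ocr(image).strip() for image in parsed.images]
    # Skip images that produced no text and keep the rest in page order.
    return "\n".join(part for part in recognized if part)
```

<p>In a real setup, <code>ParsedPdf</code> would be filled from the parser's output and <code>ocr</code> would wrap an actual OCR engine. Keeping the engine as a parameter means the routing logic can be checked without any OCR software installed.</p>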
