Conda must be installed on the machine and configured as described in the "Prerequisites" section of this documentation (under Preferences - KNIME - Conda).
For portability, the Conda Environment Propagation (CEP) node sets up the environment, so it should not be necessary to create the following environment manually. The commands are listed for the sake of completeness, in case a workaround without the CEP node is needed:
Linux: conda create -n knime_ocr_tess_pdfium -c knime -c conda-forge --strict-channel-priority python=3.11 knime-python-scripting=5.8 pypdfium2 opencv pytesseract tesseract pillow numpy pandas
Windows: conda create -n knime_ocr_tess_pdfium -c knime -c conda-forge -c pypdfium2-team -c bblanchon --strict-channel-priority python=3.11 knime-python-scripting=5.8 pypdfium2-team::pypdfium2_helpers opencv pytesseract tesseract pillow numpy pandas
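If the environment is created manually with one of the commands above, a quick import check can confirm that the packages are visible to Python. The following is a minimal sketch and not part of the workflow: the helper name missing_modules is hypothetical, and the list uses the packages' usual import names (cv2 for opencv, PIL for pillow). It is assumed that the Windows package pypdfium2_helpers also provides the pypdfium2 module.

```python
import importlib.util

# Usual import names for the packages installed by the conda commands above.
# Note: opencv is imported as cv2 and pillow as PIL.
REQUIRED_MODULES = ["pypdfium2", "cv2", "pytesseract", "PIL", "numpy", "pandas"]


def missing_modules(names):
    """Return the top-level module names that cannot be found (hypothetical helper)."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

Run the script with the Python interpreter of the knime_ocr_tess_pdfium environment, for example after "conda activate knime_ocr_tess_pdfium".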
Note: If any language other than English is selected, the workflow will download the appropriate Tesseract language files and store them within the workflow folder under /data/tessdata/.
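How the workflow fetches the language files is not shown here. The following is a minimal sketch of the idea, assuming the files come from the tesseract-ocr/tessdata repository on GitHub. The URL, the function names, and the folder argument are assumptions for illustration, not the workflow's actual code.

```python
import os
import urllib.request

# Assumed download source (the tesseract-ocr/tessdata repository);
# the workflow itself may use a different one.
TESSDATA_URL = "https://github.com/tesseract-ocr/tessdata/raw/main/{lang}.traineddata"


def traineddata_path(tessdata_dir, lang):
    """Build the expected file path for a language code such as 'deu'."""
    return os.path.join(tessdata_dir, f"{lang}.traineddata")


def ensure_language(tessdata_dir, lang):
    """Download <lang>.traineddata into tessdata_dir unless it already exists."""
    target = traineddata_path(tessdata_dir, lang)
    if not os.path.exists(target):
        os.makedirs(tessdata_dir, exist_ok=True)
        urllib.request.urlretrieve(TESSDATA_URL.format(lang=lang), target)
    return target
```

With the file in place, pytesseract can be pointed at the folder through its config string, for example config='--tessdata-dir "data/tessdata"' together with lang="deu" for German.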
This workflow is based on this one: https://hub.knime.com/s/hDBtIjjK900pPNaK