Run OCR for local file
Process a local file with the OCR API.
Language used for OCR. If no language is specified, English (eng) is taken as the default.
IMPORTANT: The language code always has 3 letters (not 2), so it is eng and not en.
Engine2 has automatic Western language detection, so this value will be ignored. Any Western language can be processed.
Engine3 supports additional writing systems/languages. More can be added on request.
ara
bul
chs
cht
hrv
cze
dan
dut
eng
fin
fre
ger
gre
hun
kor
ita
jpn
pol
por
rus
slv
spa
swe
tur
Engine3 also supports:
hin
kan
per
tel
tam
tai
vie
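The language rules above can be checked before a request is sent. The following is a minimal sketch, not part of the node or the OCR API: the helper name check_language and its engine argument are hypothetical, and the code sets are simply copied from the lists above. It assumes that Engine3 accepts the general list plus its additional codes.

```python
# Codes copied from the general language list above.
BASE_LANGUAGES = frozenset({
    "ara", "bul", "chs", "cht", "hrv", "cze", "dan", "dut",
    "eng", "fin", "fre", "ger", "gre", "hun", "kor", "ita",
    "jpn", "pol", "por", "rus", "slv", "spa", "swe", "tur",
})

# Additional codes listed above for Engine3 only.
ENGINE3_EXTRA = frozenset({"hin", "kan", "per", "tel", "tam", "tai", "vie"})


def check_language(code: str = "eng", engine: int = 1) -> str:
    """Return the code unchanged if it is a listed 3-letter code.

    Hypothetical helper: raises ValueError for 2-letter codes such as
    "en" and for codes that are not listed for the chosen engine.
    """
    if len(code) != 3:
        raise ValueError(f"language code must have 3 letters, got {code!r}")
    allowed = BASE_LANGUAGES | ENGINE3_EXTRA if engine == 3 else BASE_LANGUAGES
    if code not in allowed:
        raise ValueError(f"language {code!r} is not listed for engine {engine}")
    return code
```

For Engine2 the value is ignored by the service anyway, so this check is mainly useful for Engine1 and Engine3.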
If true, returns the coordinates of the bounding boxes for each word. If false, the OCR'ed text is returned only as a text block (this makes the JSON response smaller). Overlay data can be used, for example, to show text over the image.
The detected rotation is reported in the TextOrientation parameter of the JSON response. If the image is not rotated, then TextOrientation=0; otherwise it is the rotation in degrees, e.g. "270".
If true, the API generates a searchable PDF. This parameter automatically sets isOverlayRequired = true.
If true, the text layer is hidden (not visible).
Note that the API uses scale=false by default, so scale=true has to be set explicitly.
The OCR API offers two different OCR engines with different processing logic. We recommend that you try both and then use whichever engine gives you the best OCR result. You can use both OCR engines with our free online OCR service on the front page and with the OCREngine=1/2 parameter in your API call.
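To illustrate how these options interact, here is a minimal sketch that collects them into form fields for an API call. Only isOverlayRequired, scale and OCREngine are named on this page; the other field names (language, detectOrientation, isCreateSearchablePdf, isSearchablePdfHideTextLayer) and the function build_payload itself are assumptions made for this example and should be checked against the OCR API documentation.

```python
def build_payload(
    language: str = "eng",
    engine: int = 1,
    overlay: bool = False,
    detect_orientation: bool = False,
    searchable_pdf: bool = False,
    hide_text_layer: bool = False,
    scale: bool = False,
) -> dict:
    """Build form fields for an OCR request (hypothetical helper)."""
    # A searchable PDF automatically requires overlay data (see above).
    if searchable_pdf:
        overlay = True

    def flag(value: bool) -> str:
        # Booleans are sent as lowercase strings in form data.
        return "true" if value else "false"

    return {
        "language": language,                        # assumed field name
        "OCREngine": str(engine),
        "isOverlayRequired": flag(overlay),
        "detectOrientation": flag(detect_orientation),            # assumed
        "isCreateSearchablePdf": flag(searchable_pdf),            # assumed
        "isSearchablePdfHideTextLayer": flag(hide_text_layer),    # assumed
        "scale": flag(scale),                        # false by default
    }
```

The resulting dictionary would be sent together with the API key and the file itself; those parts are left out here because they are not described on this page.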
Features of OCR Engine 1:
Features of OCR Engine 2:
Features of OCR Engine 3:
Features of OCR Engine 5:
Enterprise Support: OCR engines 1 and 2 are both available for offline self-hosting as On-Premise OCR.
The returned OCR result JSON response is identical for both engines, so you can switch between them as needed. The features that are not mentioned in this OCR engine comparison are the same for both engines, for example PDF OCR, orientation detection and receipt scanning support. If you have any questions about using the different OCR engines, please ask in our OCR API Forum.
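Because the JSON response has the same shape for every engine, one parser can serve all of them. The sketch below shows one way to read word bounding boxes and the text orientation from a decoded response. Apart from TextOrientation, the key names used here (ParsedResults, TextOverlay, Lines, Words, WordText, Left, Top, Width, Height) are assumptions about the response layout and are not stated on this page; the function names are hypothetical.

```python
from typing import Any


def extract_words(response: dict) -> list:
    """Collect word boxes from an already decoded JSON response.

    Returns an empty list when no overlay data is present, for example
    when isOverlayRequired was false.
    """
    words = []
    for page in response.get("ParsedResults") or []:
        overlay = page.get("TextOverlay") or {}
        for line in overlay.get("Lines") or []:
            for word in line.get("Words") or []:
                words.append({
                    "text": word.get("WordText", ""),
                    "left": word.get("Left"),
                    "top": word.get("Top"),
                    "width": word.get("Width"),
                    "height": word.get("Height"),
                })
    return words


def text_orientation(page: dict) -> int:
    """Return the rotation in degrees; 0 means the image is not rotated."""
    value: Any = page.get("TextOrientation")
    return int(value) if value not in (None, "") else 0
```

The word boxes returned by extract_words could then be drawn over the source image, which is the overlay use case mentioned above.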
Specify how the response should be mapped to the table output. The following formats are available:
Raw Response: Returns the raw response in a single row with the following columns:
To use this node in KNIME, install the extension OCR Space from the below update site following our NodePit Product and Node Installation Guide:
A zipped version of the software site can be downloaded here.