Extracting a Table from a PDF

Given a text-based PDF document that contains a table, can you extract part of that table into a KNIME data table for further analysis? In this challenge, we extract the table from the PDF document and partially reconstruct it within KNIME.

The corresponding KNIME table should contain the following columns:
* Day
* Max
* Min
* Norm
* Depart
* Heat
* Cool

Note 1: Your final output should be a table, not a single row with all the relevant data.

Note 2: The Tika Parser node is better suited for this task than the PDF Parser node. We completed this task without components, regular expressions, or code-snippet nodes.
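To picture the target shape from Note 1 outside KNIME, here is a minimal Python sketch that is not part of the workflow. It assumes the parsed PDF text arrives as one string with one table row per line and whitespace between cells. The helper `rows_from_text` and the sample text are hypothetical, and the sample values are made up rather than taken from the PDF.

```python
# Column names requested by the challenge.
COLUMNS = ["Day", "Max", "Min", "Norm", "Depart", "Heat", "Cool"]


def rows_from_text(text: str) -> list[dict[str, str]]:
    """Turn whitespace-separated text lines into one dict per table row.

    Hypothetical helper: assumes one table row per line and one cell per column.
    """
    rows = []
    for line in text.splitlines():
        cells = line.split()
        # Keep only lines with one cell per column that start with a day number;
        # this drops headings and the header line itself.
        if len(cells) == len(COLUMNS) and cells[0].isdigit():
            rows.append(dict(zip(COLUMNS, cells)))
    return rows


# Made-up placeholder text, NOT the content of the challenge PDF.
sample = """Some heading
Day Max Min Norm Depart Heat Cool
1 70 50 60 0 5 0
2 72 52 61 1 3 0
"""

table = rows_from_text(sample)
for row in table:
    print(row)
```

Each data line becomes its own row keyed by the seven column names, which is the same "one row per day" layout the KNIME table should have.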

Workflow nodes, as listed in the workflow diagram: Tika Parser, Row Filter, Cell Splitter, Ungroup, Row Filter, Cell Splitter, String Manipulation (Multi Column), String Replacer, Transpose, RowID, Insert Column Header, Column Filter.