Load text-based files

Loads text-based files into a new table column from a column of URLs or full filepaths. Each file is added in its entirety to a single multi-line String cell in a new column added to the output table. The column can then be re-typed (e.g. to mol, mol2, PDB etc.) as required.

When the 'Guess' option is selected, the file encoding is determined as follows:

  1. First, the URL connection to the supplied file is inspected for encoding information.
  2. If none is available, the first 4 bytes of the file are inspected for a byte order mark (BOM). If one is present, the following encodings are recognised: UTF-8, UTF-16 (big- and little-endian) and UTF-32 (big- and little-endian).
  3. Finally, if no encoding has been found, the default (UTF-8) is assumed. As UTF-8 is not required to provide a BOM, this is a reasonable guess in most cases.

A console INFO entry is added for each encoding detected, and a WARN entry is added when the default is used because none could be detected.
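
The detection order described above can be illustrated with a minimal Python sketch. This is not the node's actual implementation. The function names `guess_encoding` and `decode_text` are hypothetical, and the `declared` parameter is an assumed stand-in for whatever encoding information the URL connection reports.

```python
import codecs
import logging
from typing import Optional

logger = logging.getLogger("load_text_files_sketch")

# Longer BOMs are checked first: the UTF-32 LE BOM (FF FE 00 00)
# begins with the same two bytes as the UTF-16 LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
]


def guess_encoding(data: bytes, declared: Optional[str] = None) -> str:
    """Return an encoding name using the three-step order described above."""
    # Step 1: use encoding information from the connection, if any (assumed input).
    if declared:
        logger.info("Encoding %s reported by connection", declared)
        return declared
    # Step 2: inspect the first 4 bytes for a byte order mark.
    head = data[:4]
    for bom, name in _BOMS:
        if head.startswith(bom):
            logger.info("Encoding %s detected from BOM", name)
            return name
    # Step 3: fall back to UTF-8 and warn that nothing was detected.
    logger.warning("No encoding detected; assuming utf-8")
    return "utf-8"


def decode_text(data: bytes, declared: Optional[str] = None) -> str:
    """Decode a whole file into one string, dropping a leading BOM character."""
    text = data.decode(guess_encoding(data, declared))
    return text[1:] if text.startswith("\ufeff") else text
```

Calling `decode_text(raw_bytes)` on the full contents of a file would give the single multi-line string that ends up in one cell. One limitation of any BOM-based check is that a UTF-16 little-endian file whose first character after the BOM is NUL is indistinguishable from UTF-32 little-endian.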

This node was developed by Vernalis (Cambridge, UK). For feedback and more information, please contact knime@vernalis.com

Options

Select filepath column
Select the column containing the paths or URLs to the files
Remove input column
If checked, the input column containing the paths is removed
Select file encoding
Select the required file encoding method. The default 'Guess' option behaves as described above. Choosing an encoding that does not match a file's actual encoding may return garbled text.
Txt Column name
Enter a name for the column containing the loaded files

Input Ports

Input table containing the filepath or URL column

Output Ports

Output table with the loaded files added

Popular Successors

  • Table Row to Variable (11 %)
  • XPath (7 %)
  • Cell Splitter (5 %)
  • Cell Splitter (4 %)
  • HTML Parser (4 %)
  • Save File Locally (3 %)
  • String Replacer (3 %)
  • Punctuation Erasure (deprecated) (3 %)
  • Loop End (2 %)
  • Column Filter (2 %)
  • Molecule Type Cast (2 %)
  • RDKit From Molecule (2 %)
  • Load text-based files (2 %)
  • Table Row To Variable Loop Start (2 %)
  • String To SVG (2 %)
  • Rule-based Row Splitter (2 %)
  • Interactive Table (2 %)
  • Strings To Document (2 %)
  • String To XML (2 %)
  • FASTA Sequence Extractor (1 %)
  • SMARTSViewer (1 %)
  • CSV Writer (1 %)
  • Loop End (1 %)
  • Regex Split (1 %)
  • Excel Writer (XLS) (1 %)
  • String to JSON (1 %)
  • Lhasa Type Caster (1 %)
  • Lhasa Type Caster (1 %)
  • Clean HTML Retriever (< 1 %)
  • Load Text Files (< 1 %)
  • PDB Loader (< 1 %)
  • MolConverter (< 1 %)
  • String to Molecule (< 1 %)
  • String to Table (< 1 %)
  • Split Collection Column (< 1 %)
  • Catch Errors (Data Ports) (< 1 %)
  • Table Writer (< 1 %)
  • Java Edit Variable (< 1 %)
  • Java Snippet (< 1 %)
  • Variable Condition Loop End (< 1 %)
  • Concatenate (< 1 %)
  • One to Many (< 1 %)
  • Row Filter (< 1 %)
  • GroupBy (< 1 %)
  • Joiner (< 1 %)
  • Table Manipulator (< 1 %)
  • Pivoting (< 1 %)
  • Column Rename (< 1 %)
  • Column Splitter (< 1 %)
  • String Manipulation (< 1 %)
  • String Replace (Dictionary) (< 1 %)
  • Column Expressions (< 1 %)
  • Table to PDF (< 1 %)
  • OpenBabel (< 1 %)
  • DL4J Feedforward Predictor (Classification) (< 1 %)
  • NGram creator (< 1 %)
  • Punctuation Erasure (< 1 %)
  • Abner tagger (< 1 %)
  • POS tagger (< 1 %)
  • Bag Of Words Creator (< 1 %)
  • Sentence Extractor (< 1 %)
  • Document Viewer (< 1 %)
  • Tag Cloud (< 1 %)
  • Path to String (< 1 %)
  • Table View (< 1 %)
  • JSON Path (< 1 %)
  • JSON to Table (< 1 %)
  • Python Script (legacy) (< 1 %)
  • Webpage Retriever (< 1 %)
  • Mirabilis Reaction Classifier (< 1 %)
  • Date Extractor (< 1 %)

Views

This node has no views
