
Reading CSV that is incorrectly formatted - two options

Option 1: rewrite the file, then read it with CSV Reader

In this option, we read the file in, amend its format and write it back out as it should have been written. It is then ready to read in again using CSV Reader. This prepares the file by removing double quotes from the beginning and end of each record, if they are present, and then saving back (overwriting) the file as tab-delimited, with the no-quotes option specified.

Warning: don't feed a file through that has quoted columns but doesn't have double quotes around the record, as this will break it. It IS OK to feed a file back through that has already been prepared by this workflow, as it will not have double quotes to strip.

Nodes and their annotations:
Line Reader: Read the file
Regex Split: Remove double quotes at ends if present
CSV Writer: Write to CSV in the expected format
Cell Splitter: Split by tab into columns
Column Filter: Remove original single column
CSV Reader: ... now we can read it correctly

Option 2: read the file as lines and rebuild the columns

In this option we read the file as lines. They are split using regex to remove the start and end double quotes, and then split into columns by tab character. When reading the file, we told it NOT to treat the first row as column header. This means the first row *contains* the column headings as data, which we can then transpose and apply as the column headings.

Nodes and their annotations:
Line Reader: Read the file
Regex Split: Remove double quotes at ends if present
Column Filter: Remove original single column
Cell Splitter: Split by tab into columns
Row Splitter: Row 1 contains column headings, so split these off as a single row with which we can rename the columns
Transpose: Make into lookup table of column names
Insert Column Header: Rename the columns

... at this point we have the required input from the "csv" file.
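For readers who want to see the same logic outside KNIME, here is a minimal Python sketch of both options. It is not part of the workflow itself. The function names (strip_outer_quotes, repair_file, read_repaired, read_records) are hypothetical, and the sketch assumes UTF-8 input and treats a record as wrapped only when it both starts and ends with a double quote.

```python
import csv
from pathlib import Path


def strip_outer_quotes(line: str) -> str:
    """Drop one leading and one trailing double quote when both are present."""
    if len(line) >= 2 and line[0] == '"' and line[-1] == '"':
        return line[1:-1]
    return line


def repair_file(path: Path) -> None:
    """Option 1: overwrite the file as plain tab-delimited text, no quotes.

    Running this on an already repaired file leaves it unchanged, because
    there are no wrapping quotes left to strip.
    """
    lines = path.read_text(encoding="utf-8").splitlines()
    cleaned = [strip_outer_quotes(line) for line in lines]
    path.write_text("".join(line + "\n" for line in cleaned), encoding="utf-8")


def read_repaired(path: Path) -> list[dict[str, str]]:
    """Read a repaired file as tab-delimited with quoting disabled."""
    with path.open(newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        return list(reader)


def read_records(text: str) -> list[dict[str, str]]:
    """Option 2: parse in memory and use the first row as column names."""
    rows = [
        strip_outer_quotes(line).split("\t")
        for line in text.splitlines()
        if line  # skip empty lines
    ]
    if not rows:
        return []
    header, *data = rows
    return [dict(zip(header, row)) for row in data]
```

As with the workflow, this sketch will damage a file whose records are not wrapped but whose first and last columns are both quoted, so the same warning applies.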
