Icon

m_​001_​import_​r_​excel_​unite_​fields

use R to list excel sheet names, extract the data and keep only columns that are present in all sheets
https://forum.knime.com/t/reading-excel-sheets-with-different-column-counts-and-different-data-types/19245/2?u=mlauber71https://stackoverflow.com/questions/17944777/how-to-extract-sheet-names-from-excel-file-in-r use R package readxl to list all sheets of excel files from a folder, determine their sheets and columns and guess the type.In the end keep only those columns and data that are present in all files df <- read_excel(path = v_path , sheet = knime.flow.in[["v_sheets"]], col_types = 'guess', guess_max = 50000)# if you want to import all data as string and deal with formats later use# col_types = 'text' knime://knime.workflow/../data/START Outer loopread sheet namesjust to get R startingwrite resultSTART inner loopguess typeor fore allcolumns to textv_excel_filev_excel_sheetEND inner loopv_excel_sheetv_excel_fileEND Outer loopcol_name_typecalculate occurance ofcolumns and typesMaximum number of Variable Type combinationsuniquevariables / Column namesfrom all Excel fileskeeponly columns with maximumcoveragekeep only columnsthat are present in all/mostexcel filescolum namesas v_excel_filev_excel_sheet List Files Table Row ToVariable Loop Start R Snippet R Source (Table) Sorter Table Writer Table Row ToVariable Loop Start R Snippet URL to File Path Extract Table Spec ConstantValue Column ConstantValue Column Loop End (2 ports) ConstantValue Column ConstantValue Column Loop End (2 ports) Column Combiner GroupBy GroupBy Joiner Joiner GroupBy Transpose ReferenceColumn Filter RowID ConstantValue Column ConstantValue Column collect meta data https://forum.knime.com/t/reading-excel-sheets-with-different-column-counts-and-different-data-types/19245/2?u=mlauber71https://stackoverflow.com/questions/17944777/how-to-extract-sheet-names-from-excel-file-in-r use R package readxl to list all sheets of excel files from a folder, determine their sheets and columns and guess the type.In the end keep only those columns and data that are present in all files df <- read_excel(path = v_path , sheet = knime.flow.in[["v_sheets"]], col_types = 'guess', guess_max = 50000)# if you want to import all data as string and deal with formats later use# col_types = 'text' knime://knime.workflow/../data/START Outer loopread sheet namesjust to get R startingwrite resultSTART inner loopguess typeor fore allcolumns to textv_excel_filev_excel_sheetEND inner loopv_excel_sheetv_excel_fileEND Outer loopcol_name_typecalculate occurance ofcolumns and typesMaximum number of Variable Type combinationsuniquevariables / Column namesfrom all Excel fileskeeponly columns with maximumcoveragekeep only columnsthat are present in all/mostexcel filescolum namesas v_excel_filev_excel_sheetList Files Table Row ToVariable Loop Start R Snippet R Source (Table) Sorter Table Writer Table Row ToVariable Loop Start R Snippet URL to File Path Extract Table Spec ConstantValue Column ConstantValue Column Loop End (2 ports) ConstantValue Column ConstantValue Column Loop End (2 ports) Column Combiner GroupBy GroupBy Joiner Joiner GroupBy Transpose ReferenceColumn Filter RowID ConstantValue Column ConstantValue Column collect meta data

Nodes

Extensions

Links