
Amazon Reviews Preprocessing

Pre-processing of Amazon reviews raw data (from Oct 1999 to Oct 2012) from one column with roughly 4.5 million rows, pivoted into 0.57 million rows with 8 columns.

Workflow nodes and their annotations:

File Reader: Read raw data
Column Expressions: Split into key and value pairs
Column Rename: Rename to more meaningful column names
Row Filter: IMPORTANT: Identify the key used to break up the long list into chunks of information
RowID: Duplicate Row ID
Joiner: Join by Row ID
Missing Value: Populate with previous values
Sorter: Sort by the Number
RowID: Duplicate Row ID
Cell Splitter By Position: Get the numbers from Row ID
String To Number: Convert the Number
Column Filter: Get rid of unwanted columns
Column Resorter: Rearrange the column
Pivoting: Node 21
CSV Writer: Generate the preprocessed data
Column Expressions: Relabel the rownum labels to Review labels for traceability
Column Filter: Get rid of unwanted columns
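
For readers who want to see the idea outside KNIME, the core of the workflow (split each raw line into a key and a value, use one key to mark where a new review starts, then pivot each chunk into a single row) might be sketched in plain Python as follows. This is only an illustration under assumptions, not the workflow's actual implementation: the "key: value" line layout, the field names in the sample data, the function name pivot_key_value_lines and the "Review<n>" label format are all hypothetical.

```python
from typing import Dict, Iterable, List


def pivot_key_value_lines(
    lines: Iterable[str],
    record_key: str,
    sep: str = ": ",
) -> List[Dict[str, str]]:
    """Group 'key<sep>value' lines into one dict per record.

    A new record starts each time `record_key` is seen, which mirrors
    the "identify the key used to break up the long list" step.
    """
    records: List[Dict[str, str]] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        key, found, value = line.partition(sep)
        if not found:
            continue  # no separator on this line; ignored in this sketch
        if key == record_key or not records:
            records.append({})  # start a new chunk (one future output row)
        records[-1][key] = value
    return records


# Hypothetical sample data; the real file's keys and layout are assumed.
raw = [
    "product/productId: B000000001",
    "review/score: 5.0",
    "review/summary: Great",
    "",
    "product/productId: B000000002",
    "review/score: 2.0",
    "review/summary: Not for me",
]

rows = pivot_key_value_lines(raw, record_key="product/productId")

# Label each pivoted row so it can be traced back (assumed label format).
labelled = {f"Review{i}": row for i, row in enumerate(rows)}
```

Here the seven input lines collapse into two records, one per review, each holding its keys as columns. The KNIME workflow reaches the same shape through Row ID joins, forward-filling of missing values and a Pivoting node rather than a single loop.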
