
20210513 Pikairos Big Data join 2 possible solutions

There has been no description set for this workflow's metadata.

1st option: Save partial results into KNIME tables, to be concatenated in a second step (less memory-consuming).

Both inputs hold 100000 random values. I chose here to split the data into 100 different chunks (choose what suits you). Configure the joiner as you need. The current chunk file name is set on each iteration, and the result is saved into a different table every time under knime://knime.workflow/_LOCAL_DATA/Current_chunk_*. The loop end is just there to close the loop: I chose here not to concatenate the results, to avoid clogging your computer's memory. Alternatively, you could try to directly use a generic Loop End, as in the 2nd option below, to concatenate the joined rows coming from the joiner. In this case you will not need to save the chunk results in different tables.

2nd option: Concatenate partial results at the end of the loop (memory-consuming).

Both inputs hold 100000 random values. I chose here to split the data into 100 different chunks (choose what suits you). Configure the joiner as you need. The Loop End concatenates the matched rows, giving the whole joining and concatenation results. This may end up clogging the 4 GB of memory. If so, use the 1st option and then concatenate the results in a second step.

Nodes used: Chunk Loop Start, Variable Loop End, Joiner (Labs), Table Writer, Random Numbers Generator, Java Edit Variable (simple), Interactive Table (local), Loop End.
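For readers who want to see the trade-off between the two options outside KNIME, here is a minimal Python sketch of the same idea. It is not the workflow itself: the function names, the dictionary-based inner join, and the .csv extension are assumptions made for illustration (the workflow writes KNIME tables with a Table Writer). Only the Current_chunk_ file prefix and the chunk count of 100 come from the annotations above.

```python
import csv
import random
import tempfile
from pathlib import Path


def chunks(rows, n_chunks):
    """Yield consecutive slices of rows, similar to a chunk loop with a fixed chunk count."""
    size = max(1, -(-len(rows) // n_chunks))  # ceiling division
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def build_lookup(rows):
    """Index the second table by key so each chunk can be joined against it."""
    lookup = {}
    for key, value in rows:
        lookup.setdefault(key, []).append(value)
    return lookup


def join_chunk(chunk, lookup):
    """Inner-join one chunk of (key, value) rows against the lookup."""
    return [(key, value, other) for key, value in chunk for other in lookup.get(key, [])]


def option_1(left, right, n_chunks, out_dir):
    """1st option: write each chunk's matches to its own file; no joined rows accumulate."""
    lookup = build_lookup(right)
    paths = []
    for i, chunk in enumerate(chunks(left, n_chunks)):
        path = Path(out_dir) / f"Current_chunk_{i}.csv"  # hypothetical file naming
        with open(path, "w", newline="") as fh:
            csv.writer(fh).writerows(join_chunk(chunk, lookup))
        paths.append(path)
    return paths


def option_2(left, right, n_chunks):
    """2nd option: collect every chunk's matches in one in-memory list."""
    lookup = build_lookup(right)
    joined = []
    for chunk in chunks(left, n_chunks):
        joined.extend(join_chunk(chunk, lookup))
    return joined


def concatenate(paths):
    """Second step of the 1st option: read the partial files back in order."""
    rows = []
    for path in paths:
        with open(path, newline="") as fh:
            rows.extend((int(k), int(v), int(o)) for k, v, o in csv.reader(fh))
    return rows


if __name__ == "__main__":
    rng = random.Random(42)
    left = [(rng.randrange(1000), rng.randrange(10**6)) for _ in range(10000)]
    right = [(rng.randrange(1000), rng.randrange(10**6)) for _ in range(10000)]
    with tempfile.TemporaryDirectory() as tmp:
        files = option_1(left, right, 100, tmp)
        print(len(files), "partial files written")
        print(concatenate(files) == option_2(left, right, 100))
```

In this sketch the second table is still held in memory as a lookup; what the 1st option avoids is accumulating the joined rows, which is usually the part that grows fastest when many keys match.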
