Icon

KNIME_​Applying an expression on multiple columns

KNIME_Applying an expression on multiple columns
My challenge - how many ways can I think of in Knime toperforming the same action on 3 columns and create 3 newcolums with the newly derived valuesMay 4th 2021 @takbb Brian Bates Option 1 - hand code it for each column, but almostas crazy as Option 0, except that at least it's in onenode Option 2 - use a Column Expressions node in aColumn List loop. The nodes increase over writing it ina single column expression, but once you've got thepattern, it's reasonably quick to do, and it would scale Option 3 - Math Formula acting the same on multiple-columns.Limitation is that the Math Formula node can only "see" and"use" the numeric columns in the formula. You also can't doconditionals so not as flexible as column expressions Option 4 - Multi Column string manipulation can also domathematical calculations if you are able to convert string to requireddatatype, and useful if youneed to do String handling or access other non-numeric columns too!BUT output columns with be of type String, so you need to convertthem back to required data type if required! Option 5 -Write some short (or in my case not as short as it shouldbe!) python script. Python scripts have access to the entire table and so the script can dothe loop on the columns as required Option 6 Java snippet? Not in the same way that the python script can beused. Java doesn't have access to the whole table, and cannot modify thetable in the same way that python can. A java snippet would have to be put into a loop as was done for theColumn Expressions node, and creating the required column requires aflow variable as does Column Expressions. Overall though ColumnExpressions is probably easier to use (and slightly friendlier once yourealise its power), although Java Snippet gives a larger edit windowwhich can be useful if it's complex. Option 0 - crazy talk - a node for eachcolumn! (ok with maybe these 3, butwould it scale if there were 20+ ?)And yes, we could also potentially useString Manipulation, but you get the idea! Rule Engine?Sadly a non-starter for this task as itcannot contain mathematical (or anyother) expressions. It can simply checksimple conditions and utilize an existingvalue from an available column orvariable. It cannot modify any existingvalue though so no concatenating strings,and no adding numbers together etc. Updated 5 May 2021In the loops, the flow variablecontaining the output columnname is passed to thecolumn filter to ensure thatonly the new column exits theloop.Previously I had hard-codeda wildcar *_out as the outputcolumn, but this way is moreflexible. Of course ifappending multiple columnson each iteration, you mightneed to go back to wildcards, or look at other waysof filtering the columns Node 1Node 2Specify the columns to include in the loopColumn expressionconfigured to multiply current column by 20and output to the column_name definedas a variablecreate variable with "_out" column name for this iterationEnd of loopConfigured to say itretains some rowidson eachiteration.Use the (Column Append) nodeif we are adding new columnsfilter out any columns not ending with _out as we don't want others appended in "wrong order", or duplicatedfor each iterationat the end of the loopJoin on rowid with original columnsCan work for relativelysimple formulaConvert value to Int and multiply by 20Make my output columnsnumeric again(Apologies for my horribly non-pythonicpython!)Specify the columns to include in the loopEnd of loopConfigured to say itretains some rowidson eachiteration.Use the (Column Append) nodeif we are adding new columnsJoin on rowid with original columnsfilter out any columns not ending with _out as we don't want others appended in "wrong order", or duplicatedfor each iterationat the end of the loopoutput column name setvia flow variableMuch the same principle as the Column Expressions nodecreate variable with "_out" column name for this iterationNode 20Node 21Node 22 Table Creator Column Expressions Column ListLoop Start Column Expressions String Manipulation(Variable) Loop End (ColumnAppend) Column Filter Joiner Math Formula(Multi Column) String Manipulation(Multi Column) Column AutoType Cast Python Script Column ListLoop Start Loop End (ColumnAppend) Joiner Column Filter Java Snippet String Manipulation(Variable) Math Formula Math Formula Math Formula My challenge - how many ways can I think of in Knime toperforming the same action on 3 columns and create 3 newcolums with the newly derived valuesMay 4th 2021 @takbb Brian Bates Option 1 - hand code it for each column, but almostas crazy as Option 0, except that at least it's in onenode Option 2 - use a Column Expressions node in aColumn List loop. The nodes increase over writing it ina single column expression, but once you've got thepattern, it's reasonably quick to do, and it would scale Option 3 - Math Formula acting the same on multiple-columns.Limitation is that the Math Formula node can only "see" and"use" the numeric columns in the formula. You also can't doconditionals so not as flexible as column expressions Option 4 - Multi Column string manipulation can also domathematical calculations if you are able to convert string to requireddatatype, and useful if youneed to do String handling or access other non-numeric columns too!BUT output columns with be of type String, so you need to convertthem back to required data type if required! Option 5 -Write some short (or in my case not as short as it shouldbe!) python script. Python scripts have access to the entire table and so the script can dothe loop on the columns as required Option 6 Java snippet? Not in the same way that the python script can beused. Java doesn't have access to the whole table, and cannot modify thetable in the same way that python can. A java snippet would have to be put into a loop as was done for theColumn Expressions node, and creating the required column requires aflow variable as does Column Expressions. Overall though ColumnExpressions is probably easier to use (and slightly friendlier once yourealise its power), although Java Snippet gives a larger edit windowwhich can be useful if it's complex. Option 0 - crazy talk - a node for eachcolumn! (ok with maybe these 3, butwould it scale if there were 20+ ?)And yes, we could also potentially useString Manipulation, but you get the idea! Rule Engine?Sadly a non-starter for this task as itcannot contain mathematical (or anyother) expressions. It can simply checksimple conditions and utilize an existingvalue from an available column orvariable. It cannot modify any existingvalue though so no concatenating strings,and no adding numbers together etc. Updated 5 May 2021In the loops, the flow variablecontaining the output columnname is passed to thecolumn filter to ensure thatonly the new column exits theloop.Previously I had hard-codeda wildcar *_out as the outputcolumn, but this way is moreflexible. Of course ifappending multiple columnson each iteration, you mightneed to go back to wildcards, or look at other waysof filtering the columns Node 1Node 2Specify the columns to include in the loopColumn expressionconfigured to multiply current column by 20and output to the column_name definedas a variablecreate variable with "_out" column name for this iterationEnd of loopConfigured to say itretains some rowidson eachiteration.Use the (Column Append) nodeif we are adding new columnsfilter out any columns not ending with _out as we don't want others appended in "wrong order", or duplicatedfor each iterationat the end of the loopJoin on rowid with original columnsCan work for relativelysimple formulaConvert value to Int and multiply by 20Make my output columnsnumeric again(Apologies for my horribly non-pythonicpython!)Specify the columns to include in the loopEnd of loopConfigured to say itretains some rowidson eachiteration.Use the (Column Append) nodeif we are adding new columnsJoin on rowid with original columnsfilter out any columns not ending with _out as we don't want others appended in "wrong order", or duplicatedfor each iterationat the end of the loopoutput column name setvia flow variableMuch the same principle as the Column Expressions nodecreate variable with "_out" column name for this iterationNode 20Node 21Node 22Table Creator Column Expressions Column ListLoop Start Column Expressions String Manipulation(Variable) Loop End (ColumnAppend) Column Filter Joiner Math Formula(Multi Column) String Manipulation(Multi Column) Column AutoType Cast Python Script Column ListLoop Start Loop End (ColumnAppend) Joiner Column Filter Java Snippet String Manipulation(Variable) Math Formula Math Formula Math Formula

Nodes

Extensions

Links