
KNIME_challenge9_solution

Challenge 9: Simple Anonymization
Level: Medium
Description: You would like to post a question on the KNIME forum, but you have confidential data that you cannot share. In this challenge you will create a workflow which removes (or transforms) any columns that reveal anything confidential in your data (such as location, name, gender, etc.). After that, you should shuffle the remaining columns' rows such that each numeric column maintains its original statistical distribution but does not have a relationship with any other column. Rename these columns as well, such that at the end of your workflow they do not have any specific meaning. Let's see an example:

1)
+----------+----------+----------+----------+-----+
| Column 1 | Column 2 | Column 3 | Column 4 | ... |
+----------+----------+----------+----------+-----+
| 16725    | 235046   | 22       | 56       |     |
| 11882    | 175316   | 30       | 64       |     |
| 6744     | 208675   | 24       | 68       |     |
| 17525    | 241401   | 18       | 54       |     |
| 2979     | 193945   | 27       | 73       |     |
| 1648     | 210214   | 27       | 75       |     |
| 11693    | 221355   | 23       | 64       |     |
| ....     |          |          |          |     |
+----------+----------+----------+----------+-----+

Workflow annotations:
Parse a few strings as numbers, as they could be interesting to do analytics on, e.g. height, weight, etc.
Parse monetary fields as numbers.
Mask string fields using MD5 hashing.
Impute empty strings and 0 instead of missing values. The console returns quite a few errors of the form: ERROR Column Expressions 0:79 Execute failed: ("NullPointerException"):null
Shuffle all rows per column to mask feature correlation, and mask headers.

Nodes in the workflow, in order (custom node labels in parentheses):
CSV Reader
Unpivoting
Column Splitter
Column Splitter
Column Expressions (str to int)
String to Date&Time (parse date fields)
Column Expressions
Unpivoting
Column Expressions
Missing Value
Pivoting
Pivoting
Column Appender
Extract Column Header
Missing Value
Column List Loop Start
Shuffle
Loop End (Column Append)
Column Filter
Column Filter
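For readers who want to see the idea outside KNIME, the masking, per-column shuffling, and header renaming described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the workflow's actual implementation: the helper names `mask_string` and `anonymize`, the sample headers (`name`, `salary`, `age`), and the sample values are all hypothetical, and the table is assumed to be a dictionary that maps each header to a list of cell values.

```python
import hashlib
import random


def mask_string(value):
    """Return the MD5 hex digest of a value; missing values become ''."""
    if value is None:
        return ""
    return hashlib.md5(str(value).encode("utf-8")).hexdigest()


def anonymize(columns, seed=None):
    """Mask string columns, shuffle each column on its own, rename headers.

    `columns` maps a header to a list of cell values (hypothetical layout).
    """
    rng = random.Random(seed)
    result = {}
    for index, values in enumerate(columns.values(), start=1):
        if any(isinstance(v, str) for v in values):
            # String column: replace every cell with its MD5 digest.
            cells = [mask_string(v) for v in values]
        else:
            # Numeric column: impute 0 for missing values.
            cells = [0 if v is None else v for v in values]
        # Shuffling each column independently keeps its distribution
        # but breaks any relationship with the other columns.
        rng.shuffle(cells)
        # Generic header, so the column name carries no meaning.
        result[f"Column {index}"] = cells
    return result


# Hypothetical sample data, for illustration only.
table = {
    "name": ["Ann", "Bob", None, "Dee"],
    "salary": [16725, 11882, 6744, None],
    "age": [22, 30, 24, 18],
}
anonymized = anonymize(table, seed=42)
```

Because every column is shuffled separately, sorting a numeric column before and after anonymization gives the same values, which is what "maintains its original statistical distribution" amounts to here.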
