
05_​Basic_​Data_​Preparation

Workflow

Basic Data Preparation

For building, using and testing GDPR Metanodes, you will need to create data that actually shows the required conditions. In our case, we are interested in personal data, discriminatory fields and pseudo-discriminatory fields.

To create such data, we use the classic adults.csv dataset. Each row represents an anonymous individual. We first add a unique identifier. There is already one "special category" field: race. We use one of those races and artificially create a high correlation between that race and a particular ZIP code, thereby introducing possible pseudo-discrimination. In addition, we randomly add another "special category" field for trade union membership.

Alternatively, the Model Data Generation nodes can be used to create such data. Further examples of data creation are available on the KNIME Public Example Server.

Nodes shown in the workflow, with their labels:

File Reader: Read Adults File
Insert Column Header: Replace with specific column names
Table Creator: Column Name table
File Reader: Read US Postal Codes
Table Writer: final table
Random Label Assigner: Randomly assign Union Membership
RowID: Create a "unique personal identifier" for each record
String Manipulation: Node 22
Linear Correlation: Cross Checking High Correlation
Create highly correlated ZIP Code to one Race
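For readers who want to reproduce similar test conditions outside KNIME, the preparation steps described in the annotation (unique identifier, one race tied to one ZIP code, random union membership, correlation cross-check) might look roughly like the Python sketch below. It is not the workflow's implementation. The function names, the column names `person_id`, `zip` and `union_member`, the race labels, the ZIP codes and the probabilities are all assumptions chosen for illustration, and the input rows are synthetic stand-ins rather than the real adults.csv file.

```python
import math
import random


def prepare(rows, target_race, target_zip, other_zips,
            p_match=0.9, p_union=0.1, seed=0):
    """Return copies of rows with an ID, a ZIP code and a union flag added.

    All parameter names and defaults are hypothetical.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    prepared = []
    for i, row in enumerate(rows):
        rec = dict(row)
        # Step 1: unique personal identifier for each record.
        rec["person_id"] = f"P{i:06d}"
        # Step 2: tie one race to one ZIP code with high probability;
        # everyone else gets a ZIP drawn from the remaining codes.
        if rec["race"] == target_race and rng.random() < p_match:
            rec["zip"] = target_zip
        else:
            rec["zip"] = rng.choice(other_zips)
        # Step 3: randomly assign the second "special category" field.
        rec["union_member"] = rng.random() < p_union
        prepared.append(rec)
    return prepared


def pearson(xs, ys):
    """Plain Pearson correlation; assumes neither input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


# Synthetic stand-in for the input table (placeholder race labels).
_rng = random.Random(42)
RACES = ["race_a", "race_b", "race_c", "race_d"]
rows = [{"race": _rng.choice(RACES)} for _ in range(2000)]

records = prepare(rows, target_race="race_a", target_zip="00001",
                  other_zips=["00002", "00003", "00004"])

# Step 4: cross-check that the artificial correlation is really there.
is_race = [1 if r["race"] == "race_a" else 0 for r in records]
is_zip = [1 if r["zip"] == "00001" else 0 for r in records]
r_value = pearson(is_race, is_zip)
print(f"correlation between race_a and ZIP 00001: {r_value:.3f}")
```

Encoding both fields as 0/1 indicators keeps the cross-check simple; a table with many ZIP codes and races would need one indicator per value of interest, or a different association measure.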

Download

Get this workflow from the following link: Download

Nodes

05_Basic_Data_Preparation consists of the following 19 node(s):

Plugins

05_​Basic_​Data_​Preparation contains nodes provided by the following 3 plugin(s):