
05_Basic_Data_Preparation

Basic Data Preparation
For building, using, and testing the GDPR Metanodes, you need data that actually exhibits the required conditions. In our case, we are interested in personal data, discriminatory fields, and pseudo-discriminatory fields.

To create such data, we use the classic adults.csv dataset. Each row represents an anonymous individual. We first add a unique identifier. The dataset already contains one "special category" field: race. We pick one of the races and artificially create a high correlation between that race and a particular ZIP code, thereby introducing possible pseudo-discrimination. In addition, we randomly add another "special category" field for trade union membership.

The Model Data Generation nodes can also be used to create such data. Further examples of data creation are available on the KNIME Public Example Server.

Workflow annotations: Read Adults File; Replace with specific column names; Column Name table; Read US Postal Codes; Final table; Randomly assign Union Membership; Create a "unique personal identifier" for each record; Create highly correlated ZIP Code to one Race; Cross Checking High Correlation.

Nodes used: File Reader (two instances), Insert Column Header, Table Creator, Table Writer, Random Label Assigner, RowID, String Manipulation, Linear Correlation.
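For readers who want to see the logic of the preparation steps outside KNIME, the following is a minimal Python sketch, not the workflow's actual implementation. It uses a small synthetic stand-in for adults.csv rather than the real file. The choice of target race, the ZIP codes, the 90% match probability, and the 10% union membership rate are all illustrative assumptions that do not come from the workflow.

```python
import random

# Toy stand-in for adults.csv: only the "race" column is modelled here.
RACES = ["White", "Black", "Asian-Pac-Islander", "Amer-Indian-Eskimo", "Other"]

# Illustrative assumptions (not taken from the workflow):
TARGET_RACE = "Other"                      # race that gets the correlated ZIP code
TARGET_ZIP = "99999"                       # placeholder ZIP code
OTHER_ZIPS = ["10001", "60601", "94105"]   # placeholder ZIP codes for everyone else


def prepare(rows, p_match=0.9, p_union=0.1, seed=42):
    """Add an identifier, a race-correlated ZIP code, and random union membership."""
    rng = random.Random(seed)
    prepared = []
    for i, row in enumerate(rows):
        rec = dict(row)
        # Unique personal identifier for each record.
        rec["person_id"] = f"P{i:06d}"
        # Most members of the target race get the target ZIP code;
        # nobody else does, which creates the high correlation.
        if rec["race"] == TARGET_RACE and rng.random() < p_match:
            rec["zip"] = TARGET_ZIP
        else:
            rec["zip"] = rng.choice(OTHER_ZIPS)
        # Second "special category" field, assigned at random.
        rec["union_member"] = rng.random() < p_union
        prepared.append(rec)
    return prepared


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


# Build synthetic rows, prepare them, and cross-check the correlation.
data_rng = random.Random(0)
rows = [{"race": data_rng.choice(RACES)} for _ in range(1000)]
prepared = prepare(rows)

is_race = [1.0 if rec["race"] == TARGET_RACE else 0.0 for rec in prepared]
is_zip = [1.0 if rec["zip"] == TARGET_ZIP else 0.0 for rec in prepared]
r = pearson(is_race, is_zip)
print(f"correlation between target race and target ZIP: {r:.2f}")
```

The final check plays the role of the cross-checking step: a coefficient close to 1 between the race indicator and the ZIP indicator confirms that the ZIP code now acts as a proxy for the race, which is the pseudo-discrimination condition the test data is meant to exhibit.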
