05_Persiapan_Data

Basic Data Preparation
For building, using and testing GDPR Metanodes, you will need to create data that actually shows the required conditions. In our case, we are interested in personal data, discriminatory fields and pseudo-discriminatory fields.

To create such data, we use the classic adults.csv dataset. Each row represents an individual who is anonymous. We first add a unique identifier. There is already one "special category" field - race. We pick one of the races and artificially create a high correlation between that race and a particular ZIP code, thereby introducing possible pseudo-discrimination. In addition, we randomly add another "special category" field for Trade Union Membership. Alternatively, the Model Data Generation nodes can be used to create such data. Further examples of data creation are available on the KNIME Public Example Server.

This workflow is used to artificially build in pseudo-discrimination as well as to introduce another "special category" field - Union Membership - into the classic adults.csv dataset. It is a useful technique, since any set of GDPR metanodes you create will need to be thoroughly tested. Another good approach is to use the Model Data Generation nodes to artificially create a dataset that can be used to test your nodes.
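Outside KNIME, the three preparation steps described above (unique identifier, a ZIP code highly correlated with one race, a randomly assigned union membership flag) can be sketched in plain Python. This is only an illustration under stated assumptions, not the workflow itself: the column names `person_id`, `zip_code` and `union_member`, the placeholder race labels, the ZIP codes, the probabilities and the tiny stand-in table are all hypothetical and do not come from adults.csv or from the KNIME nodes.

```python
import random


def prepare_test_data(rows, target_race, target_zip, other_zips,
                      correlation=0.9, union_rate=0.1, seed=42):
    """Return copies of rows with person_id, zip_code and union_member added.

    All column names and default probabilities are assumptions for this sketch.
    """
    rng = random.Random(seed)  # fixed seed keeps the test data reproducible
    prepared = []
    for index, row in enumerate(rows):
        new_row = dict(row)  # copy, so the input rows stay untouched
        # 1. Unique personal identifier for each (otherwise anonymous) record.
        new_row["person_id"] = f"P{index:06d}"
        # 2. Pseudo-discrimination: the target race mostly lands in one ZIP code.
        if row["race"] == target_race and rng.random() < correlation:
            new_row["zip_code"] = target_zip
        else:
            new_row["zip_code"] = rng.choice(other_zips)
        # 3. Additional "special category" field, assigned at random.
        new_row["union_member"] = rng.random() < union_rate
        prepared.append(new_row)
    return prepared


def share_in_zip(rows, race, zip_code):
    """Fraction of the records with the given race that fall in zip_code."""
    matching = [r for r in rows if r["race"] == race]
    if not matching:
        return 0.0
    return sum(r["zip_code"] == zip_code for r in matching) / len(matching)


# Tiny stand-in for adults.csv with placeholder race labels "A" and "B".
sample = [{"age": 20 + i % 40, "race": "A" if i % 4 == 0 else "B"}
          for i in range(400)]
data = prepare_test_data(sample, target_race="A", target_zip="90210",
                         other_zips=["10001", "60601", "73301"])

# Cross-check: race "A" should be concentrated in the target ZIP code,
# while race "B" never receives it in this sketch.
print(share_in_zip(data, "A", "90210"), share_in_zip(data, "B", "90210"))
```

The final cross-check plays the same role as a correlation check in the workflow: a metanode that is meant to detect pseudo-discriminatory fields should flag `zip_code` on data prepared this way.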
See the Node Guide and the KNIME Public Example Server for further examples.

Workflow steps (node labels): Read Adults File; Replace with specific column names; Column Name table; Read US Postal Codes; Create a "unique personal identifier" for each record; Randomly assign Union Membership; Create highly correlated ZIP Code to one Race; Cross Checking High Correlation; final table.

Nodes used: File Reader (deprecated), Insert Column Header, Table Creator, Table Writer (deprecated), Random Label Assigner, RowID, String Manipulation, Linear Correlation (deprecated).
