KN-121 Advanced Data Generation v05

[KNIME Nodes] KN-121 Advanced Data Generation

Generates Customer Profile sample data using advanced data generation techniques. All of the nodes used in this example come from KNIME. These nodes can be used as an alternative to the Market Simulation data generation capabilities.

A comprehensive description of this workflow with step-by-step instructions can be found at the Scientific Strategy website:
https://scientificstrategy.com/kn-121/

This example is originally based upon the KNIME workflow called "Scalable Data Generation" but has been cleaned up with step-by-step explanations. The required Extensions include: (1) KNIME Data Generation, and (2) KNIME JSON-Processing.

See Also: https://hub.knime.com/knime/workflows/Examples/50_Applications/53_Performance_and_Scalability/01_workflows/00_Scalable_Data_Generation

Advanced Data Generation Steps:
1. Generate list of 200 Male / Female Customers
2. Create a Population Age Pyramid using 6 x Pyramid Profiles
3. Assign all Customers an Occupation based upon Age Probability
4. Generate an appropriate Income for all Customers
5. Shred the Income based upon Probability that actively Working
6. Clean up the final Customer Profiles

Step 1: Create 200 Random Customers (48% Male / 52% Female) with Customer RowIDs running from c0 to c199.

Step 2: Use 6 Age Pyramid Profiles to randomly allocate the age of these Customers from between 17 years old and 100 years old.

Step 3: Assign an Occupation to each Customer based upon their Age and an Occupation Probability.

Step 4: Generate an appropriate Income for all Customers based upon their Age Generation and whether they are a Student.

Step 5: Shred the Income of some Customers based upon the Probability that they are not currently working.

Step 6: Clean up the final Customer Profiles by rounding the Income, removing extra columns, and sorting by CustomerID.

Workflow nodes (node type: label on the canvas):
Empty Table Creator: 200 Customers In Empty Table
Random Label Assigner: Random Gender, 48% Male / 52% Female
Random Label Assigner: Population Age Pyramid
Gaussian Distributed Assigner: Define Age From Pyramid
Double To Int: Round Age
Numeric Binner: Bin Customer Age into Generations
Conditional Label Assigner: Assign Occupation by Age Generation
Conditional Label Assigner: Assign Family by Age Generation
Nominal Value Row Filter: Occupation = Non-Student
Nominal Value Row Filter: Occupation = Student
Gamma Distributed Assigner: Generate Incomes by Age Generation
Gamma Distributed Assigner: Generate Random Student Income
Concatenate: Rejoin All Occupations
Conditional Label Assigner: Shred Income by Working Probability
Row Splitter: Top = Has Income, Bottom = No Income
Constant Value Column: Shred Income, Income = 0.0
Concatenate: Rejoin All Income
Double To Int: Round Income
Counter Generation: Generate Counter for CustomerID
Column Rename: Rename CustomerID
Column Filter: Remove Column Has Income
Column Filter: Remove Pyramid Column
Sorter: Sort By CustomerID
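For readers who want to see the logic of the six steps outside of KNIME, the following is a minimal Python sketch, not the workflow itself. Only the 200 customers, the 48% / 52% gender split, the 17 to 100 age range, the c0 to c199 IDs, and the use of Gaussian ages and Gamma incomes come from the workflow description. Every numeric parameter (pyramid means and weights, generation bins, student and working probabilities, Gamma shapes and scales) and every name in the code is a hypothetical placeholder, and clamping ages to the 17 to 100 range is an assumption.

```python
import random

# ALL numbers below are illustrative assumptions, not values from the workflow.
# Six hypothetical pyramid profiles: (mean age, std dev, relative weight).
AGE_PYRAMID = [(21, 3, 0.15), (30, 4, 0.20), (40, 4, 0.20),
               (52, 5, 0.20), (67, 5, 0.15), (82, 6, 0.10)]
STUDENT_PROB = {"Young": 0.60, "Middle": 0.05, "Senior": 0.01}
WORKING_PROB = {"Young": 0.70, "Middle": 0.90, "Senior": 0.30}
# Gamma (shape, scale) per generation for non-students, plus one for students.
INCOME_GAMMA = {"Young": (2.0, 15000.0), "Middle": (3.0, 20000.0),
                "Senior": (2.0, 18000.0)}
STUDENT_GAMMA = (2.0, 4000.0)


def generation(age):
    """Bin an age into a generation label (hypothetical bin edges)."""
    if age < 30:
        return "Young"
    if age < 60:
        return "Middle"
    return "Senior"


def generate_customers(n=200, seed=None):
    """Return a list of customer profile dicts following Steps 1 to 6."""
    rng = random.Random(seed)
    weights = [w for _, _, w in AGE_PYRAMID]
    customers = []
    for i in range(n):
        # Step 1: random gender, 48% Male / 52% Female.
        gender = "Male" if rng.random() < 0.48 else "Female"
        # Step 2: pick a pyramid profile by weight, draw a Gaussian age,
        # round it, and clamp to 17..100 (clamping is an assumption).
        mean, sd, _ = rng.choices(AGE_PYRAMID, weights=weights)[0]
        age = min(100, max(17, round(rng.gauss(mean, sd))))
        gen = generation(age)
        # Step 3: occupation depends on the age generation.
        is_student = rng.random() < STUDENT_PROB[gen]
        occupation = "Student" if is_student else "Non-Student"
        # Step 4: Gamma-distributed income by generation, or student income.
        shape, scale = STUDENT_GAMMA if is_student else INCOME_GAMMA[gen]
        income = rng.gammavariate(shape, scale)
        # Step 5: shred the income to 0.0 for customers who are not working.
        if rng.random() >= WORKING_PROB[gen]:
            income = 0.0
        # Step 6: round the income and keep only the final columns.
        customers.append({"CustomerID": f"c{i}", "Gender": gender, "Age": age,
                          "Occupation": occupation, "Income": round(income)})
    # Step 6 (continued): sort by the numeric part of the CustomerID.
    customers.sort(key=lambda c: int(c["CustomerID"][1:]))
    return customers
```

Calling `generate_customers(200, seed=42)` returns 200 dictionaries with the keys CustomerID, Gender, Age, Occupation, and Income. Passing a seed makes the run repeatable, which is convenient when comparing the sketch against the output table of the KNIME workflow.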
