
03_​Data_​Generation

This directory contains 11 workflows.

01_Generating_clusters_with_Gaussian_distribution

Generating Gaussian Clusters. In this workflow, each cluster is formed from three Gaussian-distributed values. The workflow […]
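
The idea behind this entry can be sketched outside the workflow tool. The Python below is only an illustration, not the workflow itself: the function name `gaussian_cluster`, the cluster centers, the standard deviation, and the point counts are all assumptions made for the example.

```python
import random


def gaussian_cluster(center, std_dev, n_points, label, rng):
    """Draw n_points around `center`; each coordinate is an independent Gaussian."""
    points = []
    for _ in range(n_points):
        coords = tuple(rng.gauss(mu, std_dev) for mu in center)
        points.append(coords + (label,))
    return points


# Hypothetical cluster centers in three dimensions (one Gaussian per dimension).
centers = {
    "cluster_0": (0.0, 0.0, 0.0),
    "cluster_1": (5.0, 5.0, 5.0),
    "cluster_2": (-5.0, 5.0, 0.0),
}

rng = random.Random(42)  # fixed seed so the sketch is reproducible
data = []
for label, center in centers.items():
    data.extend(gaussian_cluster(center, 1.0, 100, label, rng))
rng.shuffle(data)  # mix the clusters into one data set
```

Each row of `data` holds three coordinates followed by the cluster label, so the label can serve as ground truth when testing a clustering algorithm.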

02_Parallel_Generation_of_a_Data_Set_containing_Clusters

Generating Gaussian clusters in parallel. Similar to the workflow "Generating Gaussian Clusters", this workflow generates three clusters with three […]

03_Data_generation_model_example

Data Generation Example: Supermodels. This workflow generates a sample database of models. Each model is assigned a birthdate, hair color, and height. […]

04_Generating_random_missing_values_in_an_existing_data_set

Generating Missing Values in Existing Dataset. This workflow shows how missing values can be randomly added to the cells of a column. First […]
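
As a rough illustration of randomly adding missing values to one column, the Python sketch below blanks each cell with a fixed probability. It is not the workflow's actual method: the function `add_missing_values`, the column names, and the 20% rate are assumptions chosen for the example.

```python
import random


def add_missing_values(rows, column, fraction, rng):
    """Return a copy of rows with about `fraction` of `column` cells set to None."""
    result = []
    for row in rows:
        new_row = dict(row)  # copy so the input table stays untouched
        if rng.random() < fraction:
            new_row[column] = None  # None stands in for a missing value
        result.append(new_row)
    return result


rng = random.Random(0)  # fixed seed so the sketch is reproducible
table = [{"id": i, "value": float(i)} for i in range(1000)]
with_gaps = add_missing_values(table, "value", 0.2, rng)
missing_count = sum(1 for row in with_gaps if row["value"] is None)
```

Because every cell is blanked independently, `missing_count` only approximates the requested fraction; sampling a fixed number of row indices instead would give an exact count.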

05_Generation_of_data_set_with_more_complex_cluster_structure

Generating Artificial-shaped Clusters. This workflow generates three types of clusters with more complex structures (boomerang, T-shape, cup). […]

06_Random_combination_of_two_sets_of_data

Random combination of data tables. This workflow combines two tables randomly. Here, the rows of table 1 are randomly filled with rows from table […]

07_Splitting_data_and_rejoining_for_manipulating_only_subpart

Splitting and Rejoining. When manipulating data, some steps need to be done separately for each nominal value. This can be applied as demonstrated here: The […]

08_Generating_data_sets_containing_association_rules

Generating artificial association rules into existing shopping baskets. This workflow takes an existing set of shopping baskets. These baskets are […]

09_Generating_a_Shopping_Market_Data_Set

Shopping Basket Generation. This workflow shows an example of how to generate a whole shopping basket, how to use existing information for it, and how […]

10_Advantages_of_Quasi_Random_Sequence_Generation

This workflow shows the advantage of quasi-random sequences for generating multidimensional numerical data. As seen in the final output, […]
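
To make the contrast between quasi-random and pseudo-random points concrete, here is a small Python sketch. It uses a Halton sequence as one common quasi-random construction; whether the workflow uses Halton or another sequence is not stated here, so treat that choice, the helper names, and the grid-based coverage check as assumptions.

```python
import random


def van_der_corput(index, base):
    """Radical inverse of `index` in `base`, a value in [0, 1)."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def halton(n_points, bases=(2, 3)):
    """First n_points of a Halton sequence, one coprime base per dimension."""
    return [
        tuple(van_der_corput(i, b) for b in bases)
        for i in range(1, n_points + 1)
    ]


def empty_cells(points, grid=8):
    """Count cells of a grid x grid partition of the unit square with no point."""
    occupied = {(int(x * grid), int(y * grid)) for x, y in points}
    return grid * grid - len(occupied)


rng = random.Random(1)  # fixed seed so the sketch is reproducible
quasi = halton(64)
pseudo = [(rng.random(), rng.random()) for _ in range(64)]
quasi_gaps = empty_cells(quasi)
pseudo_gaps = empty_cells(pseudo)
```

Comparing `quasi_gaps` with `pseudo_gaps` gives a crude measure of how evenly each point set covers the unit square; quasi-random points typically leave fewer cells empty, although a single seed does not prove it.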