Data Generator

Creates random data containing clusters for Parallel Universes. The data consists of a certain fraction of noise patterns plus patterns that are assigned to clusters (all clusters have the same size). All values are normalized to the range [0, 1].

Options

Cluster count
Comma-separated list defining the number of clusters to generate in each universe (e.g., "2, 3" creates 2 clusters in the first universe and 3 in the second).
Universe sizes
Comma-separated list specifying the number of attributes (dimensions) in each universe (e.g., "2, 3" means the first universe has 2 attributes and the second has 3).
Pattern count
Total number of data patterns to generate across all clusters and universes.
Standard deviation
Controls the spread of data points within each cluster. Smaller values create tighter clusters.
Noise fraction
Proportion of patterns that are randomly distributed and do not belong to any cluster. Must be a value between 0 (no noise) and 1 (all noise).
Random seed
Fixed value to ensure reproducible data generation. Use the same seed to generate identical datasets.
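
The following is a minimal sketch of how the options above could fit together, not the node's actual implementation. The function names, the round-robin cluster assignment, the clamping used to keep values in [0, 1], the uniform values in universes other than the cluster's own, and the use of None as the label for noise patterns are all assumptions made for illustration.

```python
import random


def parse_int_list(text):
    """Parse a comma-separated option such as "2, 3" into [2, 3]."""
    return [int(part) for part in text.split(",")]


def generate(cluster_counts, universe_sizes, pattern_count,
             std_dev, noise_fraction, seed):
    """Hypothetical stand-in for the node: returns (rows, centers)."""
    rng = random.Random(seed)  # same seed -> identical data
    counts = parse_int_list(cluster_counts)
    sizes = parse_int_list(universe_sizes)
    if len(counts) != len(sizes):
        raise ValueError("need one cluster count per universe")
    if sum(counts) == 0:
        raise ValueError("need at least one cluster")

    # Column range [start, end) occupied by each universe.
    offsets, start = [], 0
    for size in sizes:
        offsets.append((start, start + size))
        start += size
    total_dims = start

    # One random center per cluster, located in exactly one universe.
    centers = []
    for universe, count in enumerate(counts):
        for _ in range(count):
            centers.append((universe, [rng.random() for _ in range(sizes[universe])]))

    noise_count = int(round(pattern_count * noise_fraction))
    clustered_count = pattern_count - noise_count

    rows = []
    for i in range(clustered_count):
        cluster_id = i % len(centers)  # round-robin keeps cluster sizes equal
        universe, center = centers[cluster_id]
        # Assumption: attributes outside the cluster's universe are uniform noise.
        row = [rng.random() for _ in range(total_dims)]
        lo, _ = offsets[universe]
        for j, c in enumerate(center):
            # Assumption: clamp Gaussian samples so values stay in [0, 1].
            row[lo + j] = min(1.0, max(0.0, rng.gauss(c, std_dev)))
        rows.append(row + [cluster_id])  # cluster id as last column

    for _ in range(noise_count):
        # Assumption: noise patterns carry no cluster id (None).
        rows.append([rng.random() for _ in range(total_dims)] + [None])
    return rows, centers


rows, centers = generate("2, 3", "2, 3", 100, 0.05, 0.2, 42)
```

With these settings the sketch produces 100 rows of 5 attributes plus a cluster id column, 20 of which are noise, and 5 cluster centers (2 in the first universe, 3 in the second).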

Input Ports

This node has no input ports.

Output Ports

Contains the generated data, with the cluster ID as the last column.
Contains the cluster centers. For each cluster, the attributes of every universe in which the cluster is not located are filled with missing values.
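
To illustrate the layout of the cluster-center output, here is a small sketch under stated assumptions: the helper name center_rows is hypothetical, universes are assumed to occupy consecutive column ranges in order, and None stands in for a missing value.

```python
def center_rows(centers, universe_sizes):
    """Expand (universe index, coordinates) pairs into full-width rows.

    Attributes of universes where the cluster is not located stay None,
    which stands in for a missing value here.
    """
    total = sum(universe_sizes)
    # First column of each universe, assuming universes are laid out in order.
    starts = [sum(universe_sizes[:u]) for u in range(len(universe_sizes))]
    rows = []
    for universe, coords in centers:
        row = [None] * total
        begin = starts[universe]
        row[begin:begin + len(coords)] = coords
        rows.append(row)
    return rows


# Two universes with 2 and 3 attributes; one example cluster in each.
centers = [(0, [0.2, 0.8]), (1, [0.5, 0.1, 0.9])]
table = center_rows(centers, [2, 3])
# table[0] is [0.2, 0.8, None, None, None]
# table[1] is [None, None, 0.5, 0.1, 0.9]
```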

Views

This node has no views.
