Synthetic Data Generator (Clustering)

This component generates example data for a clustering task based on the make_blobs() function in the Python scikit-learn library.

For more information, see the scikit-learn documentation:

scikit-learn.org/stable/modules/generated/sklearn.datasets.make_blobs.html

Note: This component requires a Python environment. This blog post explains how to set up the KNIME Python extension:

knime.com/blog/setting-up-the-knime-python-extension-revisited-for-python-30-and-20

Options

Cluster Standard Deviation: The standard deviation of each cluster. Smaller values produce more compact, better-separated clusters; larger values produce clusters that spread out and overlap.
Number of Features: The number of features to generate.
Number of Cluster Centers: The number of clusters to generate.
Random Seed: The seed for dataset creation to make the output reproducible.
Number of Samples: The number of samples to generate.
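
The options above appear to correspond to arguments of make_blobs(). The following is a minimal sketch of that mapping, not the component's actual script, which is not shown here. The function name generate_clustering_data and its default values are hypothetical and chosen only for illustration.

```python
from sklearn.datasets import make_blobs


def generate_clustering_data(n_samples=300, n_features=2, n_centers=3,
                             cluster_std=1.0, random_seed=42):
    """Hypothetical helper mapping the component options to make_blobs()."""
    X, y = make_blobs(
        n_samples=n_samples,       # Number of Samples
        n_features=n_features,     # Number of Features
        centers=n_centers,         # Number of Cluster Centers
        cluster_std=cluster_std,   # Cluster Standard Deviation
        random_state=random_seed,  # Random Seed
    )
    return X, y


# X holds the feature values, y the cluster label of each sample.
X, y = generate_clustering_data()
print(X.shape)                 # (300, 2)
print(sorted(set(y.tolist()))) # [0, 1, 2]
```

Calling the helper twice with the same random_seed returns identical data, which is what makes the output reproducible.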
