Synthetic Data Generator (Clustering)

This component generates example data for a clustering task using the make_blobs() function from the Python scikit-learn library.

For more information, see the scikit-learn documentation:

scikit-learn.org/stable/modules/generated/sklearn.datasets.make_blobs.html

Note: This component requires a Python environment. The following blog post explains how to set up the KNIME Python extension:

knime.com/blog/setting-up-the-knime-python-extension-revisited-for-python-30-and-20

Options

Cluster Standard Deviation
The standard deviation of each cluster, which determines how compact and well separated the clusters are. Smaller values produce tighter clusters.
Number of Features
The number of features to generate.
Number of Cluster Centers
The number of clusters to generate.
Random Seed
The seed for dataset creation to make the output reproducible.
Number of Samples
The number of samples to generate.
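
The options above correspond closely to arguments of make_blobs(). The following is a minimal sketch of such a call, not the component's actual script: the variable names and values are hypothetical, and the component's real defaults are not stated here.

```python
from sklearn.datasets import make_blobs

# Hypothetical option values chosen for illustration only.
n_samples = 300      # Number of Samples
n_features = 2       # Number of Features
n_centers = 3        # Number of Cluster Centers
cluster_std = 1.0    # Cluster Standard Deviation
random_seed = 42     # Random Seed

# X holds the feature values, y holds the integer cluster label of each row.
X, y = make_blobs(
    n_samples=n_samples,
    n_features=n_features,
    centers=n_centers,
    cluster_std=cluster_std,
    random_state=random_seed,
)

print(X.shape)  # (rows, features)
print(sorted(set(y.tolist())))  # distinct cluster labels
```

With a fixed seed, repeated calls return the same data, which is what makes the output reproducible.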

Input Ports

This component has no input ports.

Output Ports

Cluster data
