Synthetic Data Generator (Multilabel Classification)

This component generates synthetic example data for a multilabel classification task using the make_multilabel_classification() function from the Python scikit-learn library.
It generates class columns containing 0s and 1s that indicate the absence or presence of the respective label. The average number of labels assigned to each row can be configured.
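
As a rough illustration of the underlying call, the following minimal sketch invokes make_multilabel_classification() directly and inspects the result. The parameter values are arbitrary assumptions chosen for the example, and the component's internal script may differ.

```python
import numpy as np
from sklearn.datasets import make_multilabel_classification

# Arbitrary example values; not the component's defaults.
X, Y = make_multilabel_classification(
    n_samples=200,   # rows to generate
    n_features=10,   # feature columns
    n_classes=4,     # class (label indicator) columns
    n_labels=2,      # average number of labels per row
    random_state=0,  # fixed seed for reproducibility
)

print(X.shape)  # (200, 10)
print(Y.shape)  # (200, 4)

# Each class column holds 0 (label absent) or 1 (label present).
print(np.unique(Y))

# Observed average number of labels per row; this is only
# approximately equal to n_labels, not exactly.
print(Y.sum(axis=1).mean())
```

Each row of Y corresponds to one sample, and a row can have several 1s at once, which is what distinguishes multilabel from multi-class data.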

For more information, see the scikit-learn documentation:

scikit-learn.org/stable/modules/generated/sklearn.datasets.make_multilabel_classification.html

Note: This component requires a Python environment. In this blog post we explain how to set up the KNIME Python extension:

knime.com/blog/setting-up-the-knime-python-extension-revisited-for-python-30-and-20

Options

Number of Labels
The average number of labels per sample.
Number of Classes
The number of class columns to generate.
Number of Samples
The number of samples to generate.
Number of Features
The number of features to generate.
Random Seed
The seed for dataset creation to make the output reproducible.
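
The sketch below shows how these options could plausibly map to parameters of make_multilabel_classification(). The mapping and the values in the options dictionary are assumptions made for illustration, not taken from the component's source. It also checks that a fixed seed yields identical output across runs.

```python
import numpy as np
from sklearn.datasets import make_multilabel_classification

# Assumed mapping from the component's options to scikit-learn parameters.
# The values are illustrative only.
options = {
    "n_labels": 2,       # Number of Labels
    "n_classes": 5,      # Number of Classes
    "n_samples": 100,    # Number of Samples
    "n_features": 20,    # Number of Features
    "random_state": 42,  # Random Seed
}

# Generate the dataset twice with the same seed.
X1, Y1 = make_multilabel_classification(**options)
X2, Y2 = make_multilabel_classification(**options)

# With an integer seed, both runs produce the same features and labels.
same = np.array_equal(X1, X2) and np.array_equal(Y1, Y2)
print(same)
```

Changing only the seed produces a different dataset with the same shape, which is useful for repeating an experiment on several independent samples.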

Input Ports

This node has no input ports

Output Ports

Multi-class Classification Data
