This node is currently not available in KNIME v5.6. This page describes the node as provided for KNIME v4.2.

t-SNE

t-SNE is a manifold learning technique that learns low-dimensional embeddings for high-dimensional data. It is most often used for visualization because it exploits the local relationships between data points and can therefore capture non-linear structures in the data. Unlike other dimensionality reduction techniques such as PCA, a learned t-SNE model cannot be applied to new data. The t-SNE algorithm can be roughly summarized in two steps:

Create a probability distribution capturing the relationships between points in the high-dimensional space
Find a low-dimensional embedding whose probability distribution resembles the high-dimensional one as closely as possible
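
The two steps can be illustrated with a small sketch in plain Python. It is not the Smile implementation used by this node: it assumes a fixed Gaussian bandwidth `sigma` (the real algorithm tunes one bandwidth per point from the perplexity), it only evaluates how well a given embedding matches the data using the Kullback-Leibler divergence, and it does not perform the gradient-descent optimization. All function names are hypothetical.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def high_dim_affinities(points, sigma=1.0):
    """Step 1: joint probabilities p_ij from Gaussian kernels.
    Assumption: one fixed sigma for all points (a simplification)."""
    n = len(points)
    cond = []
    for i in range(n):
        weights = [0.0 if i == j
                   else math.exp(-sq_dist(points[i], points[j]) / (2 * sigma ** 2))
                   for j in range(n)]
        total = sum(weights)
        cond.append([w / total for w in weights])
    # Symmetrize the conditional probabilities into a joint distribution.
    return [[(cond[i][j] + cond[j][i]) / (2 * n) for j in range(n)]
            for i in range(n)]

def low_dim_affinities(embedding):
    """Joint probabilities q_ij from a Student-t kernel (one degree of freedom)."""
    n = len(embedding)
    weights = [[0.0 if i == j
                else 1.0 / (1.0 + sq_dist(embedding[i], embedding[j]))
                for j in range(n)] for i in range(n)]
    total = sum(sum(row) for row in weights)
    return [[w / total for w in row] for row in weights]

def kl_divergence(p, q):
    """Step 2 objective: KL(P || Q), lower means the embedding matches better."""
    return sum(pij * math.log(pij / qij)
               for prow, qrow in zip(p, q)
               for pij, qij in zip(prow, qrow) if pij > 0)

# Two tight clusters in 3D.
points = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (5.0, 5.0, 5.0), (5.1, 5.0, 5.0)]
p = high_dim_affinities(points)

good = [(0.0, 0.0), (0.1, 0.0), (10.0, 0.0), (10.1, 0.0)]  # keeps the clusters
bad = [(0.0, 0.0), (10.0, 0.0), (0.1, 0.0), (10.1, 0.0)]   # mixes the clusters

kl_good = kl_divergence(p, low_dim_affinities(good))
kl_bad = kl_divergence(p, low_dim_affinities(bad))
print(kl_good, kl_bad)
```

The embedding that keeps neighboring points together yields the lower divergence, which is the quantity t-SNE minimizes during its learning iterations.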

Because t-SNE operates directly on the data points, it is sensitive to the scale of the input features. For best results, normalize the features with the Normalizer node. For further details, see this great blog post or the original paper. The implementation of this node is based on Smile (Statistical Machine Intelligence and Learning Engine).
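
To see why scale matters, consider min-max scaling, one common normalization. The sketch below is a plain Python illustration with a hypothetical helper name, not the Normalizer node's implementation. It rescales every column to the range [0, 1] so that a feature measured in the thousands no longer dominates the distances t-SNE works with.

```python
def min_max_normalize(rows):
    """Rescale each column of a row-major table to [0, 1].
    Constant columns are mapped to 0.0 to avoid division by zero."""
    columns = list(zip(*rows))
    scaled_columns = []
    for col in columns:
        lo, hi = min(col), max(col)
        span = hi - lo
        scaled_columns.append([0.0 if span == 0 else (v - lo) / span for v in col])
    # Transpose back to rows.
    return [list(row) for row in zip(*scaled_columns)]

rows = [[1.0, 1000.0], [2.0, 3000.0], [3.0, 2000.0]]
normalized = min_max_normalize(rows)
print(normalized)  # [[0.0, 0.0], [0.5, 1.0], [1.0, 0.5]]
```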

Options

Columns: Select the columns used by t-SNE, i.e. the original features. Currently only numerical columns are supported.
Dimension(s) to reduce to: The number of dimensions of the target embedding (for visualization typically 2 or 3).
Iterations: The number of learning iterations to perform. Too few iterations may result in a poor embedding, while too many make training take a long time.
Learning rate: The learning rate to use, i.e. how much the embedding changes in one iteration. A learning rate that is too small requires more iterations to reach a good embedding, while one that is too large can result in unstable embeddings that change strongly between iterations.
Perplexity: Informally, the perplexity is the number of neighbors considered for each data point. Small perplexities focus more on local structure, while larger perplexities take more global relationships into account. Typical values for the perplexity lie between 5 and 50.
Remove original data columns: Check this box if you want to remove the columns used to learn the embedding.
Fail if missing values are encountered: If this box is checked, the node fails if it encounters a missing value in one of the columns used for learning. Otherwise, rows containing missing values in the learning columns are ignored during learning and their embedding consists of missing values.
Seed: Allows specifying a static seed for reproducible results. NOTE: The Smile library is reproducible if and only if the VM argument smile.threads is 1. This property is set at startup if it is not already set, but existing VM arguments are not overwritten. If you set smile.threads to anything other than 1 in your knime.ini, that value is kept and results will not be reproducible even if a static seed is provided.
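
The informal reading of perplexity as a "number of neighbors" can be made concrete. In the t-SNE formulation, the perplexity of a point's neighbor distribution is 2 raised to its Shannon entropy (in bits). The sketch below, with a hypothetical helper name and made-up example distributions, shows that a distribution spread evenly over five neighbors has perplexity 5, while one concentrated on a single neighbor has a much smaller effective neighbor count.

```python
import math

def perplexity(probabilities):
    """Perplexity = 2 ** H(P), with H the Shannon entropy in bits."""
    entropy = -sum(p * math.log2(p) for p in probabilities if p > 0)
    return 2 ** entropy

uniform_over_5 = [0.2] * 5                      # five equally likely neighbors
concentrated = [0.9, 0.05, 0.03, 0.01, 0.01]    # one dominant neighbor

print(perplexity(uniform_over_5))
print(perplexity(concentrated))
```

Setting the Perplexity option therefore controls how widely each point's neighborhood is spread when the high-dimensional probabilities are built.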

Input Ports

Input port for the data for which a low-dimensional embedding should be learned

Output Ports

The low-dimensional embedding

Views

This node has no views

Installation

To use this node in KNIME, install the extension KNIME SMILE Machine Learning Integration from the update site below, following the NodePit Product and Node Installation Guide:

v4.2

A zipped version of the software site can be downloaded here.

Plugin provider: KNIME AG, Zurich, Switzerland

Plugin version: 4.2.0.v202004222203

On NodePit since: 2020-07-16

Last update: 2025-08-06

KNIME versions: From v4.2 to v4.2
