K Nearest Neighbor (Distance Function)

Classifies the test data with the k Nearest Neighbor algorithm, using the training data as the reference set and the distance function supplied at the third input port.

Note that for the Euclidean distance on numeric columns, the other K Nearest Neighbor node performs better because it uses an efficient index structure. This node performs an exhaustive search, comparing every test point with every training point, and is therefore suited to small and medium-sized data sets only.
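The exhaustive search described above can be illustrated with a minimal Python sketch. This is not the node's implementation: the function names `knn_classify` and `manhattan`, the sample data, and the choice of Manhattan distance are assumptions made for illustration only. The point is that any distance function can be plugged in, at the cost of one distance evaluation per training row for every query.

```python
from collections import Counter


def manhattan(a, b):
    """Example distance function (hypothetical choice): sum of absolute differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_classify(train, labels, query, k, distance):
    """Classify one query row by exhaustive search over all training rows."""
    # One distance evaluation per training row; no index structure is used.
    dists = sorted((distance(row, query), i) for i, row in enumerate(train))
    # Labels of the k closest training rows.
    nearest = [labels[i] for _, i in dists[:k]]
    # Unweighted majority vote among the k neighbors.
    return Counter(nearest).most_common(1)[0][0]


train = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0), (5.0, 6.0)]
labels = ["a", "a", "b", "b", "b"]

print(knn_classify(train, labels, (0.5, 0.5), k=3, distance=manhattan))
```

For n training rows and m test rows this costs n times m distance evaluations, which is why the approach does not scale to large data sets.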

Options

Column with class labels
Select the column to be used as classification attribute. This column must contain nominal values.
Number of neighbors to consider (k)
The number of nearest neighbors used to classify a new instance. An odd number is recommended to reduce the chance of ties.
Weight neighbors by distance
If enabled, the distance of each neighbor to the query pattern influences its weight in the classification. Closer neighbors have greater influence on the result. Note: Only k neighbors are considered, regardless of weighting.
Output class probabilities
If enabled, additional columns containing the probability of each class are appended to the output.
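The effect of the weighting and probability options can be sketched in Python as follows. The exact weighting formula used by this node is not stated here, so the inverse-distance weight below is an assumption, as are the function name `weighted_class_probabilities`, the `eps` parameter, and the sample neighbors.

```python
from collections import defaultdict


def weighted_class_probabilities(neighbors, weight_by_distance=True, eps=1e-9):
    """Turn the k nearest neighbors into per-class probabilities.

    neighbors: list of (distance, label) pairs, already limited to k entries.
    """
    scores = defaultdict(float)
    for dist, label in neighbors:
        # Assumed scheme: inverse-distance weight; eps guards against division by zero.
        weight = 1.0 / (dist + eps) if weight_by_distance else 1.0
        scores[label] += weight
    total = sum(scores.values())
    # Normalize so the class probabilities sum to 1.
    return {label: score / total for label, score in scores.items()}


# Three nearest neighbors of a hypothetical query point (k = 3).
neighbors = [(1.0, "a"), (2.0, "b"), (4.0, "b")]

unweighted = weighted_class_probabilities(neighbors, weight_by_distance=False)
weighted = weighted_class_probabilities(neighbors)

print(max(unweighted, key=unweighted.get))  # majority class without weighting
print(max(weighted, key=weighted.get))      # class favored by the closest neighbor
```

In this sample the unweighted vote picks "b" (two of three neighbors), while the weighted vote picks "a" because the single "a" neighbor is much closer. In both cases only the k selected neighbors contribute.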

Input Ports

Input port for the training data
Input port for the test data
The distance function to use.

Output Ports

Output data with class labels

Views

This node has no views.
