K Nearest Neighbor (Distance Function)

Classifies a set of test data with the k Nearest Neighbor algorithm, using the training data as reference points and the distance function supplied at the third input port.

Note that for the Euclidean distance on numeric columns the other K Nearest Neighbor node performs better, as it uses an efficient index structure. This node performs an exhaustive search, comparing every test point against every training point, and is therefore suited for small to medium data sets only.
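The exhaustive search can be pictured with a minimal Python sketch. It is an illustration only, not the node's actual implementation: the names knn_classify and euclidean are hypothetical, and the unweighted majority vote is an assumption about how the k neighbors are combined.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Example distance function; any callable taking two rows would do."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_classify(train_rows, train_labels, query, k, distance=euclidean):
    """Exhaustive search: measure the query against every training row."""
    scored = sorted(
        ((distance(row, query), label) for row, label in zip(train_rows, train_labels)),
        key=lambda pair: pair[0],
    )
    # Unweighted majority vote among the k closest training rows (assumption).
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]

train_rows = [(1.0, 1.0), (1.2, 0.8), (5.0, 5.0), (5.2, 4.9)]
train_labels = ["a", "a", "b", "b"]
prediction = knn_classify(train_rows, train_labels, (1.1, 1.0), k=3)
```

Each query costs one distance evaluation per training row, so the total work grows with the product of the test and training set sizes. This is why the approach fits small to medium data sets only.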

Options

Column with class labels
Select the column to be used as the classification attribute.
Number of neighbours to consider (k)
Select the number of nearest neighbors used to classify a new instance. For two-class problems, an odd number is recommended to avoid ties.
Weight neighbours by distance
Incorporates the distance between the query pattern and the stored training patterns into the classification. Closer neighbors have greater influence on the resulting class than those farther away. Note that still only the k nearest neighbors are considered.
Output class probabilities
If this option is enabled, additional columns containing the class probabilities are appended to the output table.
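To illustrate how distance weighting and class probabilities could interact, here is a small Python sketch. The node's exact weighting formula is not stated in this description, so the inverse-distance weight used here is an assumption, and the function name class_probabilities is hypothetical.

```python
from collections import defaultdict

def class_probabilities(neighbors, weight_by_distance=False, eps=1e-9):
    """neighbors: list of (distance, label) pairs for the k nearest training rows."""
    scores = defaultdict(float)
    for dist, label in neighbors:
        # Assumed scheme: weight 1/distance when weighting is on, otherwise one vote each.
        scores[label] += 1.0 / (dist + eps) if weight_by_distance else 1.0
    total = sum(scores.values())
    # Normalize so the per-class scores sum to 1.
    return {label: score / total for label, score in scores.items()}

# Three nearest neighbors: one close "b", two more distant "a".
neighbors = [(0.5, "b"), (2.0, "a"), (2.0, "a")]
unweighted = class_probabilities(neighbors)
weighted = class_probabilities(neighbors, weight_by_distance=True)
```

In this example the unweighted vote favors class "a" (two of three neighbors), while the weighted vote favors class "b" because its single neighbor is much closer. In both cases only the k selected neighbors contribute.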

Input Ports

Input port for the training data
Input port for the test data
The distance function to use.

Output Ports

Output data with class labels

Views

This node has no views
