K Nearest Neighbor

Classifies a set of test data with the k Nearest Neighbor algorithm, using the training data as the reference set. The underlying implementation uses a KD tree and should therefore exhibit reasonable performance. However, this type of classifier is still only suited to training sets of a few thousand to roughly ten thousand instances. All numeric columns, and only those, are used in this implementation, and the distance measure is always the Euclidean distance. All other (non-numeric) columns in the test data are forwarded unchanged to the output.
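
The behaviour described above can be illustrated with a minimal Python sketch. It is not the node's implementation: for brevity it replaces the KD tree with a plain linear scan over the training rows, and the function names, the sample data, and the output column name `prediction` are hypothetical. It shows only the three points from the description: numeric columns drive the distance, the distance is Euclidean, and non-numeric test columns pass through untouched.

```python
import math
from collections import Counter

def numeric_columns(row):
    """Names of columns holding int or float values (bools excluded)."""
    return [name for name, value in row.items()
            if isinstance(value, (int, float)) and not isinstance(value, bool)]

def euclidean(a, b, cols):
    """Euclidean distance between two rows over the given columns."""
    return math.sqrt(sum((a[c] - b[c]) ** 2 for c in cols))

def knn_classify(train, test, class_col, k=3):
    """Label each test row by majority vote among its k nearest training rows.

    Assumption: a linear scan stands in for the KD tree to keep the sketch short.
    """
    cols = [c for c in numeric_columns(train[0]) if c != class_col]
    labelled = []
    for row in test:
        nearest = sorted(train, key=lambda t: euclidean(row, t, cols))[:k]
        votes = Counter(t[class_col] for t in nearest)
        # Copy the test row as-is (including non-numeric columns) and append the label.
        labelled.append({**row, "prediction": votes.most_common(1)[0][0]})
    return labelled

# Hypothetical sample data: "id" is non-numeric and is simply forwarded.
train = [
    {"x": 1.0, "y": 1.0, "label": "A"},
    {"x": 1.2, "y": 0.8, "label": "A"},
    {"x": 0.9, "y": 1.1, "label": "A"},
    {"x": 5.0, "y": 5.0, "label": "B"},
    {"x": 5.5, "y": 4.5, "label": "B"},
]
test = [{"id": "t1", "x": 1.1, "y": 1.0}, {"id": "t2", "x": 5.2, "y": 4.9}]

result = knn_classify(train, test, class_col="label", k=3)
for row in result:
    print(row)
```

A linear scan costs one distance computation per training row for every test row. A KD tree avoids much of that work in low dimensions, which is why the node can handle training sets in the thousands with reasonable performance.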

Options

Column with class labels
Select column to be used as classification attribute.
Number of neighbours to consider (k)
Select the number of nearest neighbors used to classify a new instance. An odd number is recommended to reduce the chance of ties.
Weight neighbours by distance
Takes the distance between the query pattern and the stored training patterns into account during classification. Closer neighbors have greater influence on the resulting class than those further away. Only the k nearest neighbors are considered in either case.
Output class probabilities
If this option is enabled, additional columns containing the class probabilities will be appended to the output table.
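
How the "Weight neighbours by distance" and "Output class probabilities" options interact can be sketched as follows. The exact weighting formula the node uses is not stated here, so the sketch assumes simple inverse-distance weights and derives class probabilities by normalising the per-class scores. The function name `weighted_knn` and the sample points are hypothetical.

```python
import math
from collections import defaultdict

def weighted_knn(train, query, k=3, weighted=True):
    """Classify `query` from its k nearest training points.

    train: list of (feature_tuple, label) pairs.
    Returns (predicted_label, {label: probability}).
    """
    # Only the k nearest neighbors take part, with or without weighting.
    nearest = sorted((math.dist(x, query), label) for x, label in train)[:k]
    scores = defaultdict(float)
    for dist, label in nearest:
        # Assumed scheme: inverse-distance weight; the epsilon guards against
        # division by zero when the query coincides with a training point.
        scores[label] += 1.0 / (dist + 1e-9) if weighted else 1.0
    total = sum(scores.values())
    probs = {label: score / total for label, score in scores.items()}
    return max(probs, key=probs.get), probs

# Hypothetical data: one close "A" neighbor, two more distant "B" neighbors.
train = [((0.0, 0.0), "A"), ((3.0, 0.0), "B"), ((4.0, 0.0), "B")]
query = (1.0, 0.0)

plain_label, plain_probs = weighted_knn(train, query, k=3, weighted=False)
dist_label, dist_probs = weighted_knn(train, query, k=3, weighted=True)
print(plain_label, plain_probs)
print(dist_label, dist_probs)
```

With these three points the unweighted vote picks B, because two of the three neighbors are B. The distance-weighted vote picks A, because the single A neighbor is much closer to the query than either B neighbor.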

Input Ports

Input port for the training data
Input port for the test data

Output Ports

Output data with class labels

Views

This node has no views
