Graph Density Initializer

Creates a density model of an n-dimensional vector space based on a kNN graph. The kNN graph is built by connecting each row to the k rows closest to it in the feature space, measured by Euclidean distance, where k is the specified number of neighbors. Each row therefore has at least k edges in the kNN graph. A row may have more than k edges in two cases:

  • It is among the k nearest neighbors of a row that is not among its own nearest neighbors.
  • There are multiple rows that would be the kth nearest neighbor because they have the same distance to the row in question.
Each edge in the kNN graph is weighted using a Gaussian kernel with standard deviation Sigma, applied to the distance between the two connected rows. The density of a row is the mean of the weights of all its edges. For more details, see the RALF paper by Ebert et al.
If the node fails to execute due to memory problems, this is usually because the number of neighbors is set too high.
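
The construction described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the node's implementation: the function name knn_graph_density is hypothetical, and the kernel form exp(-d^2 / (2 * Sigma^2)) is an assumed standard Gaussian that the node may parameterize differently. The sketch stores all pairwise distances, so it only suits small tables.

```python
import math


def knn_graph_density(rows, k, sigma):
    """Return one density per row: the mean Gaussian weight of its kNN-graph edges.

    Hypothetical helper for illustration; the kernel form is an assumption.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    if sigma <= 0:
        raise ValueError("sigma must be positive")

    n = len(rows)
    # Pairwise Euclidean distances between all rows.
    dist = [[math.dist(a, b) for b in rows] for a in rows]

    # Undirected edges: i and j are connected if either one is among
    # the k nearest neighbors of the other.
    neighbors = [set() for _ in range(n)]
    for i in range(n):
        others = sorted((dist[i][j], j) for j in range(n) if j != i)
        if not others:
            continue
        # Distance of the kth nearest neighbor (or the farthest row if n - 1 < k).
        kth_distance = others[min(k, len(others)) - 1][0]
        for d, j in others:
            if d <= kth_distance:  # rows tied at the kth distance are all kept
                neighbors[i].add(j)
                neighbors[j].add(i)

    densities = []
    for i in range(n):
        # Assumed Gaussian kernel over the edge length.
        weights = [
            math.exp(-dist[i][j] ** 2 / (2 * sigma ** 2)) for j in neighbors[i]
        ]
        densities.append(sum(weights) / len(weights) if weights else 0.0)
    return densities
```

For the rows (0, 0), (1, 0) and (10, 0) with k = 1, the middle row ends up with two edges, because the distant third row selects it as its nearest neighbor. This is the first of the two cases listed above.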

Options

Column Selection
The columns that make up the vector space.
Number of Neighbors
The number of nearest neighbors (k) that each row is connected to in the kNN graph.
Sigma
The standard deviation (Sigma) of the Gaussian kernel used to weight edges by distance.
Missing Value Handling
Missing values cannot be used to build the model, so the node offers two strategies for handling them. It can either fail when it encounters a missing value in one of the selected columns, or ignore the row that contains the missing value. Note that an ignored row is not part of the model, so the Density Scorer node will treat it as an unknown row.
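
The two strategies can be illustrated with a short Python sketch. The function name filter_missing, the strategy labels "fail" and "ignore", and the use of None for a missing value are all assumptions made for this example and do not reflect the node's internal API.

```python
def filter_missing(rows, strategy="fail"):
    """Apply a missing-value strategy and return (original_index, row) pairs.

    Hypothetical helper: None stands in for a missing value.
    """
    if strategy not in ("fail", "ignore"):
        raise ValueError(f"Unknown strategy: {strategy!r}")

    kept = []
    for index, row in enumerate(rows):
        if any(value is None for value in row):
            if strategy == "fail":
                # Strategy 1: stop as soon as a missing value is found.
                raise ValueError(f"Missing value in row {index}")
            # Strategy 2: skip the row; it will not be part of the model.
            continue
        kept.append((index, row))
    return kept
```

Keeping the original row index alongside each surviving row makes it possible to tell later which rows were left out of the model and would therefore be unknown to a downstream scorer.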

Input Ports

Table to build a density model for.

Output Ports

A Density Scorer Model that can be used with the Density Scorer node.

Views

This node has no views.
