Naive Bayes Learner

This node is deprecated. This version of the node has been replaced with a new and improved version. The old version is kept for backwards compatibility, but for all new workflows we suggest using the version linked below.
Suggested replacement: Naive Bayes Learner

The node creates a Bayesian model from the given training data. It calculates the number of rows per attribute value per class for nominal attributes and the Gaussian distribution for numerical attributes. The created model can be used in the Naive Bayes Predictor to predict the class membership of unclassified data. The node displays a warning message if any columns are ignored due to unsupported data types. For example, Bit Vector columns are ignored when the PMML compatibility flag is enabled, since they are not supported by the PMML standard.
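
The learning step described above can be sketched in a few lines of Python. This is a minimal illustration of the counting and Gaussian fitting, not the node's actual implementation. The function name `learn_naive_bayes`, the row format (a list of dicts), the example data, and the use of the sample standard deviation are all assumptions made for the example.

```python
from collections import Counter, defaultdict
from statistics import mean, stdev


def learn_naive_bayes(rows, class_col, nominal_cols, numeric_cols):
    """Collect per-class counts (nominal) and Gaussian parameters (numeric).

    Hypothetical helper for illustration; `rows` is a list of dicts.
    """
    class_counts = Counter(row[class_col] for row in rows)

    # nominal_counts[column][class_value][attribute_value] -> number of rows
    nominal_counts = {c: defaultdict(Counter) for c in nominal_cols}
    # numeric_values[column][class_value] -> list of observed values
    numeric_values = {c: defaultdict(list) for c in numeric_cols}

    for row in rows:
        cls = row[class_col]
        for col in nominal_cols:
            nominal_counts[col][cls][row[col]] += 1
        for col in numeric_cols:
            numeric_values[col][cls].append(row[col])

    # Summarise each numeric column per class as (mean, standard deviation).
    # Assumption: sample standard deviation, which needs >= 2 rows per class.
    gaussians = {
        col: {cls: (mean(vals), stdev(vals)) for cls, vals in per_class.items()}
        for col, per_class in numeric_values.items()
    }
    return class_counts, nominal_counts, gaussians


# Made-up training data with one nominal and one numerical attribute.
rows = [
    {"outlook": "sunny", "temp": 30.0, "play": "no"},
    {"outlook": "sunny", "temp": 28.0, "play": "no"},
    {"outlook": "rainy", "temp": 18.0, "play": "yes"},
    {"outlook": "sunny", "temp": 22.0, "play": "yes"},
]
class_counts, nominal_counts, gaussians = learn_naive_bayes(
    rows, "play", ["outlook"], ["temp"]
)
```

Here `nominal_counts["outlook"]["yes"]["sunny"]` holds the number of rows with class `yes` and outlook `sunny`, and `gaussians["temp"]["no"]` holds the mean and standard deviation of `temp` for class `no`.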

Options

Classification Column
The class value column.
Maximum number of unique nominal values per attribute
All nominal columns with more unique values than the defined number will be skipped during learning.
Default probability
A probability of zero for a given attribute/class value pair requires special attention. Without adjustment, a probability of zero would exercise an absolute veto over a likelihood in which that probability appears as a factor. Therefore, the Bayes model incorporates a default probability parameter that specifies a default (usually very small) probability to use in lieu of zero probability for a given attribute/class value pair. Set to zero for no correction.
Ignore missing values
By default, the node uses missing value information to improve the prediction result. Select this option to ignore missing values instead. Because the PMML standard ignores missing values, this option is disabled when the PMML compatibility option is selected, and missing values are always ignored in that case.
Create PMML 4.2 compatible model
Select this option to create a model that complies with the PMML 4.2 standard. The PMML 4.2 standard ignores missing values and does not support bit vectors. Therefore, bit vector columns and missing values are ignored during learning and prediction if this option is selected.

Even if this option is not selected, the node creates a valid PMML model. However, the model then contains KNIME-specific information to store missing value and bit vector information. The KNIME Naive Bayes Predictor uses this information to improve the prediction result, but any other PMML-compatible predictor ignores it, which might lead to different prediction results.
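
To illustrate the default probability option, the sketch below computes a class likelihood for nominal attributes as a product of conditional probabilities and substitutes a default value wherever a probability would be zero. It is a simplified model of the idea described above, not the node's exact formula. The function name `likelihood`, its parameters, and the example counts are hypothetical.

```python
def likelihood(prior, counts, class_total, default_probability=0.0):
    """Multiply the class prior by P(value | class) for each attribute.

    `counts` holds, per attribute, the number of training rows of this class
    that had the observed attribute value. Hypothetical helper for illustration.
    """
    result = prior
    for count in counts:
        p = count / class_total
        if p == 0.0:
            # Assumption: zero probabilities are replaced by the default value.
            p = default_probability
        result *= p
    return result


# One attribute value was never seen together with this class (count 0).
no_correction = likelihood(0.5, [3, 0], class_total=4)
with_default = likelihood(0.5, [3, 0], class_total=4, default_probability=1e-4)
```

Without correction, the single zero factor forces the whole likelihood to zero regardless of the other attributes. With a small default probability, the likelihood stays small but nonzero, so the remaining attributes still contribute to the comparison between classes.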

Input Ports

Training data
Optional PMML port object containing preprocessing operations.

Output Ports

Learned naive Bayes model. The model can be used to classify data with unknown target (class) attribute. To do so, connect the model out port to the "Naive Bayes Predictor" node.
Data table with attribute statistics, e.g. counts per attribute/class pair, mean, and standard deviation.

Views

Naive Bayes Learner View
The view displays the learned model with the number of rows per class value. It also shows the number of rows per attribute value per class for nominal attributes and the Gaussian distribution per class for numerical attributes.
