Naive Bayes

Synopsis

This Operator generates a Naive Bayes classification model.

Description

Naive Bayes is a high-bias, low-variance classifier, and it can build a good model even with a small data set. It is simple to use and computationally inexpensive. Typical use cases include text categorization (for example spam detection and sentiment analysis) and recommender systems.

The fundamental assumption of Naive Bayes is that, given the value of the label (the class), the value of any Attribute is independent of the value of any other Attribute. Strictly speaking, this assumption is rarely true (it's "naive"!), but experience shows that the Naive Bayes classifier often works well. The independence assumption vastly simplifies the calculations needed to build the Naive Bayes probability model.
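
To illustrate how the independence assumption simplifies the calculation, here is a minimal Python sketch: the score of each class is just its prior multiplied by one conditional probability per Attribute. The function name `posterior`, the attribute names, and all probabilities are hypothetical, invented for this illustration; this is not the Operator's implementation.

```python
from math import prod

def posterior(priors, conditionals, example):
    """Return P(class | example) under the Naive Bayes independence assumption.

    priors:       {class: P(class)}
    conditionals: {class: {attribute: {value: P(value | class)}}}
    example:      {attribute: value}
    """
    # Unnormalized score: prior times the product of per-attribute conditionals.
    scores = {
        c: priors[c] * prod(conditionals[c][a][v] for a, v in example.items())
        for c in priors
    }
    # Normalize so the scores sum to one.
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}

# Hypothetical probabilities, invented for illustration only.
priors = {"spam": 0.4, "ham": 0.6}
conditionals = {
    "spam": {"has_link": {"yes": 0.8, "no": 0.2},
             "all_caps": {"yes": 0.5, "no": 0.5}},
    "ham":  {"has_link": {"yes": 0.25, "no": 0.75},
             "all_caps": {"yes": 0.1, "no": 0.9}},
}
result = posterior(priors, conditionals, {"has_link": "yes", "all_caps": "yes"})
```

Without the independence assumption, the model would need a probability for every combination of Attribute values per class; with it, one small table per Attribute is enough.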

To complete the probability model, it is necessary to make some assumption about the conditional probability distributions for the individual Attributes, given the class. This Operator uses Gaussian probability densities to model the Attribute data.
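
As a rough sketch of what a Gaussian model for the Attributes can look like, the Python code below estimates a mean and a variance per class and Attribute, then scores a new example with log densities. The function names, the variance floor, and the toy data are assumptions made for this illustration; the Operator's internals may differ.

```python
from math import log, pi
from statistics import mean, pvariance

def fit_gaussian_nb(rows, labels):
    """Estimate a prior plus a per-attribute mean and variance for each class."""
    model = {}
    for c in set(labels):
        class_rows = [r for r, y in zip(rows, labels) if y == c]
        columns = list(zip(*class_rows))  # one tuple of values per attribute
        model[c] = {
            "prior": len(class_rows) / len(rows),
            "mean": [mean(col) for col in columns],
            # Small floor keeps the variance positive (an assumption of this sketch).
            "var": [max(pvariance(col), 1e-9) for col in columns],
        }
    return model

def log_gaussian(x, mu, var):
    """Log of the Gaussian density with mean mu and variance var at x."""
    return -0.5 * (log(2 * pi * var) + (x - mu) ** 2 / var)

def predict(model, row):
    """Return the class with the highest log prior plus summed log densities."""
    def score(c):
        p = model[c]
        return log(p["prior"]) + sum(
            log_gaussian(x, m, v) for x, m, v in zip(row, p["mean"], p["var"])
        )
    return max(model, key=score)

# Toy data, invented for illustration only.
rows = [[1.0, 2.1], [1.2, 1.9], [0.8, 2.0], [3.0, 4.2], [3.2, 3.8], [2.9, 4.0]]
labels = ["a", "a", "a", "b", "b", "b"]
model = fit_gaussian_nb(rows, labels)
```

Summing log densities instead of multiplying densities is a common way to avoid numerical underflow when there are many Attributes.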

Differentiation

Naive Bayes (Kernel)

The alternative Operator Naive Bayes (Kernel) is a variant of Naive Bayes where multiple Gaussians are combined, to create a kernel density.
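
The idea of combining multiple Gaussians can be sketched as a standard kernel density estimate: one Gaussian is centered on each training value and the results are averaged. This is only an illustration in Python; the names `kernel_density` and `bandwidth` and the sample values are assumptions, and the exact estimator used by Naive Bayes (Kernel) is not described here.

```python
from math import exp, pi, sqrt

def gaussian_pdf(x, mu, sigma):
    """Gaussian density with mean mu and standard deviation sigma at x."""
    return exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * sqrt(2 * pi))

def kernel_density(x, samples, bandwidth):
    """Average of Gaussians, one centered on each training value."""
    return sum(gaussian_pdf(x, s, bandwidth) for s in samples) / len(samples)

# Toy values forming two clusters, invented for illustration only.
samples = [1.0, 1.1, 0.9, 4.0, 4.2]
near_cluster = kernel_density(1.0, samples, bandwidth=0.3)
between_clusters = kernel_density(2.5, samples, bandwidth=0.3)
```

Unlike a single Gaussian, this estimate can follow data with several peaks: the density is high near each cluster and low in the gap between them.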

Input

training set

The input port expects an ExampleSet.

Output

model

The Naive Bayes classification model is delivered from this output port. The model can now be applied to unlabelled data to generate predictions.

example set

The ExampleSet that was given as input is passed through without changes.

Parameters

Laplace correction

The simplicity of Naive Bayes comes with a weakness: if a given Attribute value never occurs together with a given class in the training data, the estimated conditional probability is zero. When this zero is multiplied with the other probabilities, the whole product becomes zero, and the results will be misleading. Laplace correction is a simple trick to avoid this problem: it adds one to each count so that no zero values occur. For most training sets, adding one to each count has only a negligible effect on the estimated probabilities.
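
The effect of the correction can be shown with a small Python sketch. The function `conditional_probs`, the value names, and the counts are hypothetical; adding the number of possible values to the denominator, so that the probabilities still sum to one, is the usual add-one convention and is assumed here rather than taken from the Operator.

```python
from collections import Counter

def conditional_probs(values, domain, laplace=True):
    """Estimate P(value | class) from the attribute values observed in one class."""
    counts = Counter(values)
    add = 1 if laplace else 0
    # With the correction, every possible value contributes one extra count.
    total = len(values) + add * len(domain)
    return {v: (counts[v] + add) / total for v in domain}

# Values of one Attribute within one class; "rain" never occurs (toy data).
values_in_class = ["sunny"] * 6 + ["overcast"] * 4
domain = ["sunny", "overcast", "rain"]

raw = conditional_probs(values_in_class, domain, laplace=False)
corrected = conditional_probs(values_in_class, domain, laplace=True)
```

Here `raw["rain"]` is zero, which would wipe out the whole product for this class, while `corrected["rain"]` is 1/13. The other estimates shift only slightly, for example from 6/10 to 7/13 for "sunny".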