NaiveBayes
- class pyspark.mllib.classification.NaiveBayes
New in version 0.9.0.
Methods Documentation
- classmethod train(data, lambda_=1.0)
Train a Naive Bayes model given an RDD of (label, features) vectors.
This is multinomial Naive Bayes, which can handle all kinds of discrete data. For example, once documents are converted into TF-IDF vectors, it can be used for document classification. If every input vector is a 0-1 vector, it can also be used as Bernoulli Naive Bayes. The input feature values must be nonnegative.
- Parameters
data – RDD of LabeledPoint.
lambda_ – The smoothing parameter. (default: 1.0)
New in version 0.9.0.
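To make the role of the smoothing parameter concrete, here is a minimal pure-Python sketch of the textbook multinomial Naive Bayes estimate with additive smoothing. It is an illustration only and is not Spark's implementation: the helper names train_multinomial_nb and predict, and the plain list of (label, features) pairs standing in for an RDD of LabeledPoint, are assumptions made for this example.

```python
import math
from collections import defaultdict


def train_multinomial_nb(samples, lambda_=1.0):
    """Fit log-priors and per-class log feature weights.

    samples: list of (label, features) pairs (hypothetical stand-in for
    an RDD of LabeledPoint); feature values must be nonnegative.
    """
    num_features = len(samples[0][1])
    doc_counts = defaultdict(int)
    feature_sums = defaultdict(lambda: [0.0] * num_features)

    for label, features in samples:
        if any(value < 0 for value in features):
            raise ValueError("feature values must be nonnegative")
        doc_counts[label] += 1
        sums = feature_sums[label]
        for j, value in enumerate(features):
            sums[j] += value

    labels = sorted(doc_counts)
    total_docs = len(samples)
    log_prior, log_likelihood = {}, {}
    for label in labels:
        # Smoothed class prior: (n_c + lambda) / (N + lambda * num_classes)
        log_prior[label] = math.log(doc_counts[label] + lambda_) - math.log(
            total_docs + lambda_ * len(labels)
        )
        # Smoothed feature weights; with lambda_ > 0 a feature never seen
        # in a class still gets a finite log-probability instead of log(0).
        denom = sum(feature_sums[label]) + lambda_ * num_features
        log_likelihood[label] = [
            math.log(s + lambda_) - math.log(denom) for s in feature_sums[label]
        ]
    return log_prior, log_likelihood


def predict(model, features):
    """Return the label with the highest log-posterior score."""
    log_prior, log_likelihood = model

    def score(label):
        weights = log_likelihood[label]
        return log_prior[label] + sum(w * x for w, x in zip(weights, features))

    return max(log_prior, key=score)


samples = [
    (0.0, [0.0, 0.0]),
    (0.0, [0.0, 1.0]),
    (1.0, [1.0, 0.0]),
]
model = train_multinomial_nb(samples, lambda_=1.0)
print(predict(model, [0.0, 1.0]))
print(predict(model, [1.0, 0.0]))
```

Because every feature value in this toy data is 0 or 1, the example also shows the Bernoulli-style usage described above. Larger values of lambda_ pull the per-class feature weights toward a uniform distribution.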
-
classmethod