MulticlassMetrics

class pyspark.mllib.evaluation.MulticlassMetrics(predictionAndLabels)[source]

Evaluator for multiclass classification.

Parameters

predictionAndLabels – an RDD of (prediction, label) tuples, each optionally followed by a weight and then by a list of per-class probabilities.

>>> from pyspark.mllib.evaluation import MulticlassMetrics
>>> predictionAndLabels = sc.parallelize([(0.0, 0.0), (0.0, 1.0), (0.0, 0.0),
...     (1.0, 0.0), (1.0, 1.0), (1.0, 1.0), (1.0, 1.0), (2.0, 2.0), (2.0, 0.0)])
>>> metrics = MulticlassMetrics(predictionAndLabels)
>>> metrics.confusionMatrix().toArray()
array([[ 2.,  1.,  1.],
       [ 1.,  3.,  0.],
       [ 0.,  0.,  1.]])
>>> metrics.falsePositiveRate(0.0)
0.2...
>>> metrics.precision(1.0)
0.75...
>>> metrics.recall(2.0)
1.0...
>>> metrics.fMeasure(0.0, 2.0)
0.52...
>>> metrics.accuracy
0.66...
>>> metrics.weightedFalsePositiveRate
0.19...
>>> metrics.weightedPrecision
0.68...
>>> metrics.weightedRecall
0.66...
>>> metrics.weightedFMeasure()
0.66...
>>> metrics.weightedFMeasure(2.0)
0.65...
>>> predAndLabelsWithOptWeight = sc.parallelize([(0.0, 0.0, 1.0), (0.0, 1.0, 1.0),
...      (0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (1.0, 1.0, 1.0), (1.0, 1.0, 1.0), (1.0, 1.0, 1.0),
...      (2.0, 2.0, 1.0), (2.0, 0.0, 1.0)])
>>> metrics = MulticlassMetrics(predAndLabelsWithOptWeight)
>>> metrics.confusionMatrix().toArray()
array([[ 2.,  1.,  1.],
       [ 1.,  3.,  0.],
       [ 0.,  0.,  1.]])
>>> metrics.falsePositiveRate(0.0)
0.2...
>>> metrics.precision(1.0)
0.75...
>>> metrics.recall(2.0)
1.0...
>>> metrics.fMeasure(0.0, 2.0)
0.52...
>>> metrics.accuracy
0.66...
>>> metrics.weightedFalsePositiveRate
0.19...
>>> metrics.weightedPrecision
0.68...
>>> metrics.weightedRecall
0.66...
>>> metrics.weightedFMeasure()
0.66...
>>> metrics.weightedFMeasure(2.0)
0.65...
>>> predictionAndLabelsWithProbabilities = sc.parallelize([
...      (1.0, 1.0, 1.0, [0.1, 0.8, 0.1]), (0.0, 2.0, 1.0, [0.9, 0.05, 0.05]),
...      (0.0, 0.0, 1.0, [0.8, 0.2, 0.0]), (1.0, 1.0, 1.0, [0.3, 0.65, 0.05])])
>>> metrics = MulticlassMetrics(predictionAndLabelsWithProbabilities)
>>> metrics.logLoss()
0.9682...

New in version 1.4.0.

Methods Documentation

call(name, *a)

Call method of java_model

confusionMatrix()[source]

Returns the confusion matrix. Actual classes are in rows and predicted classes are in columns; both are ordered by class label ascending, as in “labels”.

New in version 1.4.0.
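The row and column layout can be checked without a Spark cluster. The following is a minimal pure-Python sketch, not the library's implementation; the helper name `confusion_matrix` is hypothetical, and it assumes the class list is taken from every value that appears in the pairs.

```python
from collections import Counter

def confusion_matrix(prediction_and_labels):
    """Count (actual, predicted) pairs; rows are actual labels, columns are predictions."""
    pairs = list(prediction_and_labels)
    # Assumption of this sketch: classes are all values seen in either position,
    # sorted ascending so that rows and columns share one ordering.
    classes = sorted({value for pair in pairs for value in pair})
    counts = Counter((label, prediction) for prediction, label in pairs)
    return [[counts[(actual, predicted)] for predicted in classes]
            for actual in classes]

# Same (prediction, label) pairs as the class-level example.
pairs = [(0.0, 0.0), (0.0, 1.0), (0.0, 0.0), (1.0, 0.0), (1.0, 1.0),
         (1.0, 1.0), (1.0, 1.0), (2.0, 2.0), (2.0, 0.0)]
matrix = confusion_matrix(pairs)
print(matrix)  # [[2, 1, 1], [1, 3, 0], [0, 0, 1]]
```

The first row reads: of the four instances whose actual label is 0.0, two were predicted as 0.0, one as 1.0 and one as 2.0.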

fMeasure(label, beta=None)[source]

Returns the F-measure for a given label (category). If beta is not given, the F1 score (beta = 1.0) is returned.

New in version 1.4.0.
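For reference, the standard F-beta definition combines precision and recall as (1 + beta²) · P · R / (beta² · P + R). The sketch below applies that textbook formula to label 0.0 of the class-level example (precision 2/3, recall 1/2); the function name `f_measure` is hypothetical and the zero-division guard is an assumption of the sketch.

```python
def f_measure(precision, recall, beta=1.0):
    """Textbook F-beta score: beta > 1 weights recall more, beta < 1 weights precision more."""
    if precision == 0.0 and recall == 0.0:
        return 0.0  # assumption: avoid division by zero when both inputs are zero
    beta_sq = beta * beta
    return (1.0 + beta_sq) * precision * recall / (beta_sq * precision + recall)

# Label 0.0 in the class-level example: precision = 2/3, recall = 2/4.
f2_label0 = f_measure(2.0 / 3.0, 0.5, beta=2.0)
f1_label0 = f_measure(2.0 / 3.0, 0.5)
print(round(f2_label0, 4))  # 0.5263
print(round(f1_label0, 4))  # 0.5714
```

The F2 value agrees with the `0.52...` shown for `metrics.fMeasure(0.0, 2.0)` in the example.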

falsePositiveRate(label)[source]

Returns false positive rate for a given label (category).

New in version 1.4.0.
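The false positive rate for a label can be read off the confusion matrix: false positives are the instances in that label's column that lie off the diagonal, and the denominator is every instance whose actual label differs. The sketch below assumes the rows-are-actual layout from the class-level example; `false_positive_rate` is a hypothetical helper that takes a class index rather than a label value.

```python
def false_positive_rate(matrix, i):
    """FP / (FP + TN) for class index i; rows are actual, columns are predicted."""
    total = sum(sum(row) for row in matrix)
    predicted_as_i = sum(row[i] for row in matrix)   # column sum
    false_positives = predicted_as_i - matrix[i][i]  # predicted i, actually another class
    actual_negatives = total - sum(matrix[i])        # all instances not labeled i
    return false_positives / actual_negatives if actual_negatives else 0.0

# Confusion matrix from the class-level example.
matrix = [[2, 1, 1],
          [1, 3, 0],
          [0, 0, 1]]
fpr_label0 = false_positive_rate(matrix, 0)
fpr_label2 = false_positive_rate(matrix, 2)
print(fpr_label0)  # 0.2
print(fpr_label2)  # 0.125
```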

logLoss(eps=1e-15)[source]

Returns the weighted log loss, computed from the probability assigned to each instance's actual label.

New in version 3.0.0.
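As an illustration of what this value measures, the sketch below computes a weighted log loss in plain Python for the four rows of the class-level example. It assumes the usual definition (weighted mean of the negative log of the probability given to the actual label) and assumes `eps` clips probabilities into [eps, 1 - eps]; `log_loss` is a hypothetical helper, not the Spark code.

```python
import math

def log_loss(rows, eps=1e-15):
    """Weighted mean of -log(p_actual), with p clipped to [eps, 1 - eps]."""
    loss_sum = 0.0
    weight_sum = 0.0
    for _prediction, label, weight, probabilities in rows:
        p = probabilities[int(label)]      # probability assigned to the actual class
        p = min(max(p, eps), 1.0 - eps)    # clipping keeps log() finite
        loss_sum += -math.log(p) * weight
        weight_sum += weight
    return loss_sum / weight_sum

# (prediction, label, weight, probabilities), as in the class-level example.
rows = [(1.0, 1.0, 1.0, [0.1, 0.8, 0.1]), (0.0, 2.0, 1.0, [0.9, 0.05, 0.05]),
        (0.0, 0.0, 1.0, [0.8, 0.2, 0.0]), (1.0, 1.0, 1.0, [0.3, 0.65, 0.05])]
loss = log_loss(rows)
print(round(loss, 4))  # 0.9682
```

Most of the loss comes from the second row, where the actual class 2.0 received a probability of only 0.05.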

precision(label)[source]

Returns precision for a given label (category).

New in version 1.4.0.

recall(label)[source]

Returns recall for a given label (category).

New in version 1.4.0.
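Per-label precision and recall differ only in which total the diagonal entry is divided by: the column total (everything predicted as the label) for precision, the row total (everything actually carrying the label) for recall. A small sketch on the example confusion matrix, using hypothetical helpers that take a class index:

```python
def precision(matrix, i):
    """True positives divided by everything predicted as class i (column total)."""
    predicted_as_i = sum(row[i] for row in matrix)
    return matrix[i][i] / predicted_as_i if predicted_as_i else 0.0

def recall(matrix, i):
    """True positives divided by everything actually in class i (row total)."""
    actual_i = sum(matrix[i])
    return matrix[i][i] / actual_i if actual_i else 0.0

# Confusion matrix from the class-level example (rows actual, columns predicted).
matrix = [[2, 1, 1],
          [1, 3, 0],
          [0, 0, 1]]
precision_label1 = precision(matrix, 1)
recall_label2 = recall(matrix, 2)
print(precision_label1)  # 0.75
print(recall_label2)     # 1.0
```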

truePositiveRate(label)[source]

Returns true positive rate for a given label (category).

New in version 1.4.0.

weightedFMeasure(beta=None)[source]

Returns the weighted average F-measure. If beta is not given, beta = 1.0 is used.

New in version 1.4.0.

Attributes Documentation

accuracy

Returns accuracy (the number of correctly classified instances divided by the total number of instances).

New in version 2.0.0.

weightedFalsePositiveRate

Returns weighted false positive rate.

New in version 1.4.0.

weightedPrecision

Returns weighted averaged precision.

New in version 1.4.0.
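The weighted metrics average a per-label value using each label's share of the actual instances as its weight. The sketch below shows this for precision on the example confusion matrix; `weighted_precision` is a hypothetical helper, and the weighting scheme is stated here as an assumption that happens to reproduce the `0.68...` shown in the class-level example.

```python
def weighted_precision(matrix):
    """Average per-class precision, weighted by each class's share of actual instances."""
    total = sum(sum(row) for row in matrix)
    result = 0.0
    for i, row in enumerate(matrix):
        predicted_as_i = sum(r[i] for r in matrix)             # column total
        per_label = matrix[i][i] / predicted_as_i if predicted_as_i else 0.0
        result += per_label * sum(row) / total                 # weight = row total / total
    return result

# Confusion matrix from the class-level example (rows actual, columns predicted).
matrix = [[2, 1, 1],
          [1, 3, 0],
          [0, 0, 1]]
value = weighted_precision(matrix)
print(round(value, 4))  # 0.6852
```

Here the three per-label precisions 2/3, 3/4 and 1/2 are weighted by 4/9, 4/9 and 1/9 respectively.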

weightedRecall

Returns the weighted average recall (equal to accuracy, because each label's recall is weighted by that label's share of the instances).

New in version 1.4.0.

weightedTruePositiveRate

Returns the weighted true positive rate (identical to the weighted recall, and therefore equal to accuracy).

New in version 1.4.0.