RegressionMetrics

class pyspark.mllib.evaluation.RegressionMetrics(predictionAndObservations)[source]

Evaluator for regression.

Parameters

predictionAndObservations – an RDD of (prediction, observation) tuples, or (prediction, observation, weight) tuples when the optional weight is supplied.

>>> predictionAndObservations = sc.parallelize([
...     (2.5, 3.0), (0.0, -0.5), (2.0, 2.0), (8.0, 7.0)])
>>> metrics = RegressionMetrics(predictionAndObservations)
>>> metrics.explainedVariance
8.859...
>>> metrics.meanAbsoluteError
0.5...
>>> metrics.meanSquaredError
0.37...
>>> metrics.rootMeanSquaredError
0.61...
>>> metrics.r2
0.94...
>>> predictionAndObservationsWithOptWeight = sc.parallelize([
...     (2.5, 3.0, 0.5), (0.0, -0.5, 1.0), (2.0, 2.0, 0.3), (8.0, 7.0, 0.9)])
>>> metrics = RegressionMetrics(predictionAndObservationsWithOptWeight)
>>> metrics.rootMeanSquaredError
0.68...

New in version 1.4.0.

Methods Documentation

call(name, *a)

Call the method called name on java_model, passing the remaining arguments a.

Attributes Documentation

explainedVariance

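Returns the variance explained by the regression, computed as the mean squared deviation of the predictions from the mean observation: explainedVariance = \(\frac{1}{n}\sum_i (\hat{y}_i - \bar{y})^2\). This is not the normalized score \(1 - \frac{variance(y - \hat{y})}{variance(y)}\), which cannot exceed 1; the class example above yields 8.859....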

New in version 1.4.0.
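
The 8.859... value in the class example can be reproduced in plain Python, without Spark. This is a minimal sketch that assumes the unweighted quantity is the mean squared deviation of the predictions from the mean observation; the helper name explained_variance is hypothetical and not part of pyspark.

```python
# Same (prediction, observation) pairs as the class example.
pairs = [(2.5, 3.0), (0.0, -0.5), (2.0, 2.0), (8.0, 7.0)]


def explained_variance(pairs):
    """Mean squared deviation of predictions from the mean observation (assumed formula)."""
    n = len(pairs)
    mean_obs = sum(obs for _, obs in pairs) / n
    return sum((pred - mean_obs) ** 2 for pred, _ in pairs) / n


print(explained_variance(pairs))  # 8.859375
```

Because the result is not divided by the variance of the observations, it is expressed in squared units of the label and is not bounded by 1.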

meanAbsoluteError

Returns the mean absolute error, which is a risk function corresponding to the expected value of the absolute error loss or l1-norm loss.

New in version 1.4.0.
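
As a rough illustration, the unweighted mean absolute error can be recomputed by hand from the pairs in the class example. The helper mean_absolute_error below is a hypothetical plain-Python function, not a pyspark API.

```python
# Same (prediction, observation) pairs as the class example.
pairs = [(2.5, 3.0), (0.0, -0.5), (2.0, 2.0), (8.0, 7.0)]


def mean_absolute_error(pairs):
    """Average of |prediction - observation| over all pairs."""
    return sum(abs(pred - obs) for pred, obs in pairs) / len(pairs)


print(mean_absolute_error(pairs))  # 0.5
```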

meanSquaredError

Returns the mean squared error, which is a risk function corresponding to the expected value of the squared error loss or quadratic loss.

New in version 1.4.0.
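
The 0.37... value in the class example follows directly from the definition. A minimal plain-Python sketch for the unweighted case, using a hypothetical helper mean_squared_error that is not part of pyspark:

```python
# Same (prediction, observation) pairs as the class example.
pairs = [(2.5, 3.0), (0.0, -0.5), (2.0, 2.0), (8.0, 7.0)]


def mean_squared_error(pairs):
    """Average of (prediction - observation)^2 over all pairs."""
    return sum((pred - obs) ** 2 for pred, obs in pairs) / len(pairs)


print(mean_squared_error(pairs))  # 0.375
```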

r2

Returns \(R^2\), the coefficient of determination.

New in version 1.4.0.
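
For the unweighted pairs in the class example, the 0.94... value is consistent with the usual definition \(R^2 = 1 - SS_{res} / SS_{tot}\) for a model with an intercept. The sketch below assumes that definition; the helper r2_score is hypothetical and not part of pyspark.

```python
# Same (prediction, observation) pairs as the class example.
pairs = [(2.5, 3.0), (0.0, -0.5), (2.0, 2.0), (8.0, 7.0)]


def r2_score(pairs):
    """1 - (residual sum of squares) / (total sum of squares about the mean observation)."""
    n = len(pairs)
    mean_obs = sum(obs for _, obs in pairs) / n
    ss_res = sum((obs - pred) ** 2 for pred, obs in pairs)
    ss_tot = sum((obs - mean_obs) ** 2 for _, obs in pairs)
    return 1.0 - ss_res / ss_tot


print(round(r2_score(pairs), 4))  # 0.9486
```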

rootMeanSquaredError

Returns the root mean squared error, which is defined as the square root of the mean squared error.

New in version 1.4.0.
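
Both rootMeanSquaredError values in the class example (0.61... unweighted, 0.68... weighted) can be reproduced in plain Python. This sketch assumes that weighted squared errors are normalized by the sum of the weights, which matches the example output; the helper root_mean_squared_error is hypothetical and not part of pyspark.

```python
import math

# Rows from the class example: without and with the optional weight.
unweighted = [(2.5, 3.0), (0.0, -0.5), (2.0, 2.0), (8.0, 7.0)]
weighted = [(2.5, 3.0, 0.5), (0.0, -0.5, 1.0), (2.0, 2.0, 0.3), (8.0, 7.0, 0.9)]


def root_mean_squared_error(rows):
    """Square root of the weighted mean squared error; a missing weight counts as 1.0."""
    total_weight = 0.0
    weighted_sq_err = 0.0
    for row in rows:
        pred, obs = row[0], row[1]
        weight = row[2] if len(row) > 2 else 1.0
        weighted_sq_err += weight * (pred - obs) ** 2
        total_weight += weight
    return math.sqrt(weighted_sq_err / total_weight)


print(round(root_mean_squared_error(unweighted), 4))  # 0.6124
print(round(root_mean_squared_error(weighted), 4))    # 0.6872
```

The weighted value is larger here because the row with the largest error, (8.0, 7.0), carries a weight of 0.9, close to the highest in the set.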