BinaryLogisticRegressionTrainingSummary

class pyspark.ml.classification.BinaryLogisticRegressionTrainingSummary(java_obj=None)

Binary logistic regression training results for a given model.

New in version 2.0.0.

Methods Documentation

fMeasureByLabel(beta=1.0)

Returns f-measure for each label (category).

New in version 2.3.0.
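To illustrate what beta controls, the sketch below computes the standard F-beta score from per-label precision and recall in plain Python. The helper name f_beta and the sample values are hypothetical; this follows the textbook definition and is not taken from Spark's implementation.

```python
def f_beta(precision, recall, beta=1.0):
    """Standard F-beta score; returns 0.0 when precision and recall are both 0."""
    b2 = beta * beta
    denom = b2 * precision + recall
    return 0.0 if denom == 0 else (1 + b2) * precision * recall / denom

# Hypothetical per-label precision and recall, ordered like `labels` (0.0, 1.0).
precision_by_label = [0.8, 0.5]
recall_by_label = [0.5, 0.8]

# beta = 1.0 weighs precision and recall equally.
f1_by_label = [f_beta(p, r) for p, r in zip(precision_by_label, recall_by_label)]

# beta = 2.0 gives recall more weight, so the high-recall label scores higher.
f2_by_label = [f_beta(p, r, beta=2.0) for p, r in zip(precision_by_label, recall_by_label)]

print(f1_by_label, f2_by_label)
```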

weightedFMeasure(beta=1.0)

Returns weighted averaged f-measure.

New in version 2.3.0.
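A plain-Python sketch of a weighted average, assuming each label's f-measure is weighted by that label's share of the instances. The counts and per-label values are hypothetical.

```python
# Hypothetical number of instances per true label.
label_counts = {0.0: 60, 1.0: 40}

# Hypothetical per-label f-measure values.
f_by_label = {0.0: 0.75, 1.0: 0.6}

total = sum(label_counts.values())

# Each label contributes in proportion to how often it occurs.
weighted_f = sum(f_by_label[label] * label_counts[label] / total for label in label_counts)

print(weighted_f)
```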

Attributes Documentation

accuracy

Returns accuracy (the number of correctly classified instances divided by the total number of instances).

New in version 2.3.0.

areaUnderROC

Computes the area under the receiver operating characteristic (ROC) curve.

Note

This ignores instance weights (setting all to 1.0) from LogisticRegression.weightCol. This will change in later Spark versions.

New in version 2.0.0.
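As an illustration of what this value measures, the sketch below integrates a list of ROC points with the trapezoidal rule in plain Python. The helper name area_under_curve and the sample points are hypothetical, and the trapezoidal rule is an assumption about how the area is approximated.

```python
def area_under_curve(points):
    """Trapezoidal area under a curve given as (x, y) points sorted by x."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

# Hypothetical (FPR, TPR) points, including the (0.0, 0.0) and (1.0, 1.0) end points.
roc_points = [(0.0, 0.0), (0.0, 0.5), (0.5, 1.0), (1.0, 1.0)]
auc = area_under_curve(roc_points)

# A classifier no better than chance follows the diagonal.
chance_auc = area_under_curve([(0.0, 0.0), (1.0, 1.0)])

print(auc, chance_auc)
```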

fMeasureByThreshold

Returns a DataFrame with two fields (threshold, F-Measure), giving the F-measure curve by threshold with beta = 1.0.

Note

This ignores instance weights (setting all to 1.0) from LogisticRegression.weightCol. This will change in later Spark versions.

New in version 2.0.0.
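The by-threshold curves can be pictured with a small plain-Python sketch that evaluates precision, recall, and F1 at every distinct score and then picks the threshold with the highest F1. The helper name metrics_by_threshold and the sample scores are hypothetical, and the sketch assumes an instance is predicted positive when its score is greater than or equal to the threshold.

```python
def metrics_by_threshold(scores, labels):
    """Return (threshold, precision, recall, f1) rows, one per distinct score."""
    total_pos = sum(1 for y in labels if y == 1.0)
    rows = []
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1.0)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y != 1.0)
        # tp + fp >= 1 because the instance whose score equals t is counted.
        precision = tp / (tp + fp)
        recall = tp / total_pos if total_pos else 0.0
        pr_sum = precision + recall
        f1 = 0.0 if pr_sum == 0 else 2 * precision * recall / pr_sum
        rows.append((t, precision, recall, f1))
    return rows

# Hypothetical positive-class probabilities and true labels.
scores = [0.9, 0.8, 0.6, 0.3]
labels = [1.0, 0.0, 1.0, 0.0]

rows = metrics_by_threshold(scores, labels)

# Threshold with the highest F1 among the candidates.
best_threshold = max(rows, key=lambda row: row[3])[0]

print(rows, best_threshold)
```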

falsePositiveRateByLabel

Returns false positive rate for each label (category).

New in version 2.3.0.
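A plain-Python sketch of the per-label false positive rate, assuming the usual definition: instances wrongly predicted as the label, divided by all instances whose true label is different. The confusion matrix values are hypothetical.

```python
# Hypothetical confusion matrix: confusion[i][j] counts instances with
# true label i that were predicted as label j.
confusion = [
    [50, 10],  # true label 0.0
    [5, 35],   # true label 1.0
]
n_labels = len(confusion)
total = sum(sum(row) for row in confusion)

fpr_by_label = []
for label in range(n_labels):
    # Instances predicted as `label` whose true label is something else.
    false_pos = sum(confusion[i][label] for i in range(n_labels) if i != label)
    # All instances whose true label is not `label`.
    negatives = total - sum(confusion[label])
    fpr_by_label.append(false_pos / negatives if negatives else 0.0)

print(fpr_by_label)
```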

featuresCol

Field in “predictions” which gives the features of each instance as a vector.

New in version 2.0.0.

labelCol

Field in “predictions” which gives the true label of each instance.

New in version 2.0.0.

labels

Returns the sequence of labels in ascending order. This order matches the order used in metrics which are specified as arrays over labels, e.g., truePositiveRateByLabel.

Note: In most cases, the labels will be the values {0.0, 1.0, …, numClasses-1}. However, if the training set is missing a label, then all of the arrays over labels (e.g., from truePositiveRateByLabel) will be of length numClasses-1 instead of the expected numClasses.

New in version 2.3.0.

objectiveHistory

Objective function (scaled loss + regularization) at each iteration.

New in version 2.0.0.

pr

Returns the precision-recall curve, which is a DataFrame containing two fields (recall, precision), with (0.0, 1.0) prepended to it.

Note

This ignores instance weights (setting all to 1.0) from LogisticRegression.weightCol. This will change in later Spark versions.

New in version 2.0.0.

precisionByLabel

Returns precision for each label (category).

New in version 2.3.0.

precisionByThreshold

Returns a DataFrame with two fields (threshold, precision), giving the precision curve by threshold. Every possible probability obtained in transforming the dataset is used as a threshold in calculating the precision.

Note

This ignores instance weights (setting all to 1.0) from LogisticRegression.weightCol. This will change in later Spark versions.

New in version 2.0.0.

predictionCol

Field in “predictions” which gives the predicted class of each instance.

New in version 2.3.0.

predictions

DataFrame output by the model’s transform method.

New in version 2.0.0.

probabilityCol

Field in “predictions” which gives the probability of each class as a vector.

New in version 2.0.0.

recallByLabel

Returns recall for each label (category).

New in version 2.3.0.

recallByThreshold

Returns a DataFrame with two fields (threshold, recall), giving the recall curve by threshold. Every possible probability obtained in transforming the dataset is used as a threshold in calculating the recall.

Note

This ignores instance weights (setting all to 1.0) from LogisticRegression.weightCol. This will change in later Spark versions.

New in version 2.0.0.

roc

Returns the receiver operating characteristic (ROC) curve, which is a DataFrame with two fields (FPR, TPR), with (0.0, 0.0) prepended and (1.0, 1.0) appended to it.

Note

This ignores instance weights (setting all to 1.0) from LogisticRegression.weightCol. This will change in later Spark versions.

New in version 2.0.0.
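A plain-Python sketch of how such a curve can be assembled: one (FPR, TPR) point per distinct score, with the two end points added. The helper name roc_points and the sample data are hypothetical, and the sketch assumes that both classes are present and that an instance is predicted positive when its score is greater than or equal to the threshold.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) points with (0.0, 0.0) prepended and (1.0, 1.0) appended."""
    pos = sum(1 for y in labels if y == 1.0)
    neg = len(labels) - pos  # assumes pos > 0 and neg > 0
    points = [(0.0, 0.0)]
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1.0)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y != 1.0)
        points.append((fp / neg, tp / pos))
    points.append((1.0, 1.0))
    return points

# Hypothetical positive-class probabilities and true labels.
scores = [0.9, 0.8, 0.6, 0.3]
labels = [1.0, 0.0, 1.0, 0.0]

curve = roc_points(scores, labels)
print(curve)
```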

totalIterations

Number of training iterations until termination.

New in version 2.0.0.

truePositiveRateByLabel

Returns true positive rate for each label (category).

New in version 2.3.0.

weightedFalsePositiveRate

Returns weighted false positive rate.

New in version 2.3.0.

weightedPrecision

Returns weighted averaged precision.

New in version 2.3.0.

weightedRecall

Returns weighted averaged recall. (Equal to the overall, micro-averaged precision, recall, and f-measure.)

New in version 2.3.0.
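A short plain-Python check, on a hypothetical confusion matrix, that recall weighted by each label's share of the instances comes out equal to accuracy. The sketch assumes unweighted instances.

```python
# Hypothetical confusion matrix: confusion[i][j] counts instances with
# true label i that were predicted as label j.
confusion = [
    [50, 10],  # true label 0.0
    [5, 35],   # true label 1.0
]
total = sum(sum(row) for row in confusion)

# Accuracy: correctly classified instances over all instances.
accuracy = sum(confusion[i][i] for i in range(len(confusion))) / total

# Per-label recall and the share of instances carrying each true label.
recall_by_label = [confusion[i][i] / sum(confusion[i]) for i in range(len(confusion))]
label_share = [sum(confusion[i]) / total for i in range(len(confusion))]

# The label counts cancel, leaving the total number of correct predictions over total.
weighted_recall = sum(w * r for w, r in zip(label_share, recall_by_label))

print(accuracy, weighted_recall)
```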

weightedTruePositiveRate

Returns weighted true positive rate. (Equal to the overall, micro-averaged precision, recall, and f-measure.)

New in version 2.3.0.