RankingMetrics

class pyspark.mllib.evaluation.RankingMetrics(predictionAndLabels)

Evaluator for ranking algorithms.
Parameters
    predictionAndLabels – an RDD of (predicted ranking, ground truth set) pairs.
>>> predictionAndLabels = sc.parallelize([
...     ([1, 6, 2, 7, 8, 3, 9, 10, 4, 5], [1, 2, 3, 4, 5]),
...     ([4, 1, 5, 6, 2, 7, 3, 8, 9, 10], [1, 2, 3]),
...     ([1, 2, 3, 4, 5], [])])
>>> metrics = RankingMetrics(predictionAndLabels)
>>> metrics.precisionAt(1)
0.33...
>>> metrics.precisionAt(5)
0.26...
>>> metrics.precisionAt(15)
0.17...
>>> metrics.meanAveragePrecision
0.35...
>>> metrics.meanAveragePrecisionAt(1)
0.3333333333333333...
>>> metrics.meanAveragePrecisionAt(2)
0.25...
>>> metrics.ndcgAt(3)
0.33...
>>> metrics.ndcgAt(10)
0.48...
>>> metrics.recallAt(1)
0.06...
>>> metrics.recallAt(5)
0.35...
>>> metrics.recallAt(15)
0.66...
New in version 1.4.0.
Methods Documentation
call(name, *a)

Call method of java_model.
meanAveragePrecisionAt(k)

Returns the mean average precision (MAP) at the first k ranking positions of all the queries. If a query has an empty ground truth set, its average precision will be zero and a log warning is generated.
New in version 3.0.0.
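The per-query computation can be sketched in plain Python. The helper name average_precision_at_k below is hypothetical, and normalizing each query's average precision by min(#(ground truth set), k) is an assumption chosen because it reproduces the doctest values above, not something this page states.

def average_precision_at_k(pred, labels, k):
    """Illustrative sketch: average precision of one (predicted ranking,
    ground truth set) pair, truncated at position k."""
    label_set = set(labels)
    if not label_set:
        return 0.0  # documented behavior for an empty ground truth set
    hits, prec_sum = 0, 0.0
    for i, item in enumerate(pred[:k]):
        if item in label_set:
            hits += 1
            prec_sum += hits / (i + 1)  # precision at each relevant hit
    return prec_sum / min(len(label_set), k)  # assumed normalization

queries = [
    ([1, 6, 2, 7, 8, 3, 9, 10, 4, 5], [1, 2, 3, 4, 5]),
    ([4, 1, 5, 6, 2, 7, 3, 8, 9, 10], [1, 2, 3]),
    ([1, 2, 3, 4, 5], []),
]
# Averaging over the doctest queries reproduces meanAveragePrecisionAt(2):
print(sum(average_precision_at_k(p, l, 2) for p, l in queries) / len(queries))
# 0.25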
ndcgAt(k)

Compute the average NDCG value of all the queries, truncated at ranking position k. The discounted cumulative gain at position k is computed as sum_{i=1}^{k} (2^{relevance of the i-th item} - 1) / log(i + 1), and the NDCG is obtained by dividing this DCG value by the ideal DCG computed on the ground truth set. In the current implementation, the relevance value is binary, so each relevant item contributes a gain of 2^1 - 1 = 1 and each irrelevant item contributes 0. If a query has an empty ground truth set, zero will be used as NDCG together with a log warning.
New in version 1.4.0.
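A single-query sketch of this binary-relevance computation follows. The helper name ndcg_at_k is hypothetical; the truncation rule n = min(max(len(pred), len(labels)), k) and the use of the natural log (whose base cancels in the DCG/IDCG ratio) are assumptions chosen to reproduce the doctest values, not statements from this page.

import math

def ndcg_at_k(pred, labels, k):
    """Illustrative sketch: binary-relevance NDCG for one query, truncated
    at ranking position k."""
    label_set = set(labels)
    if not label_set:
        return 0.0  # documented behavior for an empty ground truth set
    n = min(max(len(pred), len(label_set)), k)  # assumed truncation rule
    dcg = ideal_dcg = 0.0
    for i in range(n):
        gain = 1.0 / math.log(i + 2)  # position i+1 (1-based): 1 / log(i+2)
        if i < len(pred) and pred[i] in label_set:
            dcg += gain  # binary relevance: (2**1 - 1) == 1 per relevant item
        if i < len(label_set):
            ideal_dcg += gain  # ideal ranking puts all relevant items first
    return dcg / ideal_dcg

Averaging ndcg_at_k over the three doctest queries at k=3 gives roughly 0.33, matching metrics.ndcgAt(3) above.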
precisionAt(k)

Compute the average precision of all the queries, truncated at ranking position k.
If the ranking algorithm returns n results (n < k) for a query, the precision value will be computed as #(relevant items retrieved) / k. This formula also applies when the size of the ground truth set is less than k.
If a query has an empty ground truth set, zero will be used as precision together with a log warning.
New in version 1.4.0.
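Since the formula above is fully specified, a one-query sketch is direct; the helper name precision_at_k is hypothetical and this is illustrative code, not the library implementation.

def precision_at_k(pred, labels, k):
    """Illustrative sketch: precision of one query, truncated at position k,
    dividing by k even when fewer than k results are returned."""
    label_set = set(labels)
    if not label_set:
        return 0.0  # documented: empty ground truth set yields zero precision
    hits = sum(1 for item in pred[:k] if item in label_set)
    return hits / k  # denominator is always k, per the formula above

Averaging precision_at_k over the three doctest queries at k=5 gives roughly 0.27, matching metrics.precisionAt(5) above.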
recallAt(k)

Compute the average recall of all the queries, truncated at ranking position k.
If the ranking algorithm returns n results for a query, the recall value will be computed as #(relevant items retrieved) / #(ground truth set). This formula also applies when the size of the ground truth set is less than k.
If a query has an empty ground truth set, zero will be used as recall together with a log warning.
New in version 3.0.0.
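The recall counterpart differs from precision only in its denominator; the helper name recall_at_k below is hypothetical and the code is an illustrative sketch of the stated formula.

def recall_at_k(pred, labels, k):
    """Illustrative sketch: recall of one query, truncated at position k."""
    label_set = set(labels)
    if not label_set:
        return 0.0  # documented: empty ground truth set yields zero recall
    hits = sum(1 for item in pred[:k] if item in label_set)
    return hits / len(label_set)  # denominator is the ground truth set size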
Attributes Documentation
meanAveragePrecision

Returns the mean average precision (MAP) of all the queries. If a query has an empty ground truth set, its average precision will be zero and a log warning is generated.
New in version 1.4.0.
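For contrast with meanAveragePrecisionAt, here is an untruncated per-query sketch. The helper name average_precision is hypothetical, and normalizing by the full ground truth set size is an assumption that reproduces the doctest value 0.35... above.

def average_precision(pred, labels):
    """Illustrative sketch: untruncated average precision of one query."""
    label_set = set(labels)
    if not label_set:
        return 0.0  # documented behavior for an empty ground truth set
    hits, prec_sum = 0, 0.0
    for i, item in enumerate(pred):
        if item in label_set:
            hits += 1
            prec_sum += hits / (i + 1)  # precision at each relevant hit
    return prec_sum / len(label_set)  # assumed normalization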