Functional metrics
Classification Metrics
accuracy [func]
torchmetrics.functional.accuracy(preds, target, average='micro', mdmc_average='global', threshold=0.5, top_k=None, subset_accuracy=False, num_classes=None, multiclass=None, ignore_index=None)
Computes Accuracy:
$$\text{Accuracy} = \frac{1}{N}\sum_{i=1}^{N} 1(y_i = \hat{y}_i)$$
Where $y$ is a tensor of target values, and $\hat{y}$ is a tensor of predictions.
For multi-class and multi-dimensional multi-class data with probability predictions, the parameter top_k generalizes this metric to a Top-K accuracy metric: for each sample the top-K highest probability items are considered to find the correct label.
For multi-label and multi-dimensional multi-class inputs, this metric computes the “global” accuracy by default, which counts all labels or sub-samples separately. This can be changed to subset accuracy (which requires all labels or sub-samples in the sample to be correctly predicted) by setting subset_accuracy=True.
Accepts all input types listed in Input types.
- Parameters
preds (Tensor) – Predictions from model (probabilities, or labels)
average – Defines the reduction that is applied. Should be one of the following:
'micro' [default]: Calculate the metric globally, across all samples and classes.
'macro': Calculate the metric for each class separately, and average the metrics across classes (with equal weights for each class).
'weighted': Calculate the metric for each class separately, and average the metrics across classes, weighting each class by its support (tp + fn).
'none' or None: Calculate the metric for each class separately, and return the metric for every class.
'samples': Calculate the metric for each sample, and average the metrics across samples (with equal weights for each sample).
Note
What is considered a sample in the multi-dimensional multi-class case depends on the value of mdmc_average.
mdmc_average (Optional[str]) – Defines how averaging is done for multi-dimensional multi-class inputs (on top of the average parameter). Should be one of the following:
None [default]: Should be left unchanged if your data is not multi-dimensional multi-class.
'samplewise': In this case, the statistics are computed separately for each sample on the N axis, and then averaged over samples. The computation for each sample is done by treating the flattened extra axes ... (see Input types) as the N dimension within the sample, and computing the metric for the sample based on that.
'global': In this case the N and ... dimensions of the inputs (see Input types) are flattened into a new N_X sample axis, i.e. the inputs are treated as if they were (N_X, C). From here on the average parameter applies as usual.
num_classes (Optional[int]) – Number of classes. Necessary for 'macro', 'weighted' and None average methods.
threshold (float) – Threshold probability value for transforming probability predictions to binary (0,1) predictions, in the case of binary or multi-label inputs.
top_k – Number of highest probability predictions considered to find the correct label, relevant only for (multi-dimensional) multi-class inputs with probability predictions. The default value (None) will be interpreted as 1 for these inputs. Should be left at default (None) for all other types of inputs.
multiclass (Optional[bool]) – Used only in certain special cases, where you want to treat inputs as a different type than what they appear to be. See the parameter’s documentation section for a more detailed explanation and examples.
ignore_index (Optional[int]) – Integer specifying a target class to ignore. If given, this class index does not contribute to the returned score, regardless of reduction method. If an index is ignored, and average=None or 'none', the score for the ignored class will be returned as nan.
subset_accuracy – Whether to compute subset accuracy for multi-label and multi-dimensional multi-class inputs (has no effect for other input types).
For multi-label inputs, if the parameter is set to True, then all labels for each sample must be correctly predicted for the sample to count as correct. If it is set to False, then all labels are counted separately - this is equivalent to flattening inputs beforehand (i.e. preds = preds.flatten() and same for target).
For multi-dimensional multi-class inputs, if the parameter is set to True, then all sub-samples (on the extra axis) must be correct for the sample to be counted as correct. If it is set to False, then all sub-samples are counted separately - this is equivalent, in the case of label predictions, to flattening the inputs beforehand (i.e. preds = preds.flatten() and same for target). Note that the top_k parameter still applies in both cases, if set.
- Raises
ValueError – If threshold is not a float between 0 and 1.
ValueError – If top_k parameter is set for multi-label inputs.
ValueError – If average is none of "micro", "macro", "weighted", "samples", "none", None.
ValueError – If mdmc_average is not one of None, "samplewise", "global".
ValueError – If average is set but num_classes is not provided.
ValueError – If num_classes is set and ignore_index is not in the range [0, num_classes).
ValueError – If top_k is not an integer larger than 0.
Example
>>> import torch
>>> from torchmetrics.functional import accuracy
>>> target = torch.tensor([0, 1, 2, 3])
>>> preds = torch.tensor([0, 2, 1, 3])
>>> accuracy(preds, target)
tensor(0.5000)

>>> target = torch.tensor([0, 1, 2])
>>> preds = torch.tensor([[0.1, 0.9, 0], [0.3, 0.1, 0.6], [0.2, 0.5, 0.3]])
>>> accuracy(preds, target, top_k=2)
tensor(0.6667)
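To make the Top-K idea concrete, here is a minimal plain-Python sketch that counts a sample as correct when its true label is among the k highest-scoring classes. The function name `topk_accuracy` is hypothetical and this is not the library's implementation; it only reproduces the arithmetic of the second example above.

```python
# Illustrative sketch only: plain-Python top-k accuracy, not torchmetrics code.
def topk_accuracy(probs, target, k=1):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for row, label in zip(probs, target):
        # Indices of the k largest scores in this row.
        top = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        hits += label in top
    return hits / len(target)


probs = [[0.1, 0.9, 0.0], [0.3, 0.1, 0.6], [0.2, 0.5, 0.3]]
target = [0, 1, 2]
print(round(topk_accuracy(probs, target, k=2), 4))  # 0.6667
print(round(topk_accuracy(probs, target, k=1), 4))  # 0.0
```

With k=1 none of the three samples is predicted correctly, while with k=2 the first and third samples count as hits.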
- Return type
auc [func]
torchmetrics.functional.auc(x, y, reorder=False)
Computes Area Under the Curve (AUC) using the trapezoidal rule
- Parameters
- Return type
- Returns
Tensor containing AUC score (float)
- Raises
ValueError – If both x and y tensors are not 1d.
ValueError – If both x and y don’t have the same number of elements.
ValueError – If x tensor is neither increasing nor decreasing.
Example
>>> import torch
>>> from torchmetrics.functional import auc
>>> x = torch.tensor([0, 1, 2, 3])
>>> y = torch.tensor([0, 1, 2, 2])
>>> auc(x, y)
tensor(4.)
>>> auc(x, y, reorder=True)
tensor(4.)
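The trapezoidal rule used here can be sketched in a few lines of plain Python. The helper name `trapezoid_auc` is hypothetical, and the sketch assumes x is already sorted in increasing order; it is meant to show the arithmetic, not the library's code path.

```python
# Illustrative sketch only: trapezoidal rule over (x, y) points, assuming x is increasing.
def trapezoid_auc(x, y):
    area = 0.0
    for i in range(1, len(x)):
        width = x[i] - x[i - 1]
        mean_height = (y[i] + y[i - 1]) / 2.0
        area += width * mean_height
    return area


print(trapezoid_auc([0, 1, 2, 3], [0, 1, 2, 2]))  # 4.0
```

The three trapezoids contribute 0.5, 1.5 and 2.0, which sums to the value shown in the example above.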
auroc [func]
torchmetrics.functional.auroc(preds, target, num_classes=None, pos_label=None, average='macro', max_fpr=None, sample_weights=None)
Compute Area Under the Receiver Operating Characteristic Curve (ROC AUC)
- Parameters
preds (Tensor) – predictions from model (logits or probabilities)
num_classes (Optional[int]) – integer with number of classes. Not necessary to provide for binary problems.
pos_label (Optional[int]) – integer determining the positive class. Default is None, which for binary problems is translated to 1. For multiclass problems this argument should not be set, as it is iteratively changed in the range [0, num_classes-1].
average – Reduction to apply. One of the following:
'micro' computes metric globally. Only works for multilabel problems
'macro' computes metric for each class and uniformly averages them
'weighted' computes metric for each class and does a weighted-average, where each class is weighted by their support (accounts for class imbalance)
None computes and returns the metric per class
max_fpr (Optional[float]) – If not None, calculates standardized partial AUC over the range [0, max_fpr]. Should be a float between 0 and 1.
sample_weights (Optional[Sequence]) – sample weights for each data point
- Raises
ValueError – If max_fpr is not a float in the range (0, 1].
RuntimeError – If PyTorch version is below 1.6 since max_fpr requires torch.bucketize which is not available below 1.6.
ValueError – If max_fpr is not set to None and the mode is not binary since partial AUC computation is not available in multilabel/multiclass.
ValueError – If average is none of None, "macro" or "weighted".
- Example (binary case):
>>> import torch
>>> from torchmetrics.functional import auroc
>>> preds = torch.tensor([0.13, 0.26, 0.08, 0.19, 0.34])
>>> target = torch.tensor([0, 0, 1, 1, 1])
>>> auroc(preds, target, pos_label=1)
tensor(0.5000)
- Example (multiclass case):
>>> preds = torch.tensor([[0.90, 0.05, 0.05],
...                       [0.05, 0.90, 0.05],
...                       [0.05, 0.05, 0.90],
...                       [0.85, 0.05, 0.10],
...                       [0.10, 0.10, 0.80]])
>>> target = torch.tensor([0, 1, 1, 2, 2])
>>> auroc(preds, target, num_classes=3)
tensor(0.7778)
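One way to read the binary AUROC is as the probability that a randomly chosen positive sample is scored higher than a randomly chosen negative one. The sketch below uses that pairwise formulation (ties count as half a win) rather than integrating the ROC curve; the name `binary_auroc` is hypothetical and the code is not how the library computes the value.

```python
# Illustrative sketch only: binary AUROC as the fraction of (positive, negative)
# pairs in which the positive sample receives the higher score.
def binary_auroc(scores, labels):
    pos = [s for s, t in zip(scores, labels) if t == 1]
    neg = [s for s, t in zip(scores, labels) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


print(binary_auroc([0.13, 0.26, 0.08, 0.19, 0.34], [0, 0, 1, 1, 1]))  # 0.5
```

For the binary example above, 3 of the 6 positive/negative pairs are ordered correctly, giving 0.5.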
- Return type
average_precision [func]
torchmetrics.functional.average_precision(preds, target, num_classes=None, pos_label=None, sample_weights=None)
Computes the average precision score.
- Parameters
preds (Tensor) – predictions from model (logits or probabilities)
num_classes (Optional[int]) – integer with number of classes. Not necessary to provide for binary problems.
pos_label (Optional[int]) – integer determining the positive class. Default is None, which for binary problems is translated to 1. For multiclass problems this argument should not be set, as it is iteratively changed in the range [0, num_classes-1].
sample_weights (Optional[Sequence]) – sample weights for each data point
- Return type
- Returns
tensor with average precision. For multiclass inputs, a list of such tensors is returned, one for each class
- Example (binary case):
>>> import torch
>>> from torchmetrics.functional import average_precision
>>> pred = torch.tensor([0, 1, 2, 3])
>>> target = torch.tensor([0, 1, 1, 1])
>>> average_precision(pred, target, pos_label=1)
tensor(1.)
- Example (multiclass case):
>>> pred = torch.tensor([[0.75, 0.05, 0.05, 0.05, 0.05],
...                      [0.05, 0.75, 0.05, 0.05, 0.05],
...                      [0.05, 0.05, 0.75, 0.05, 0.05],
...                      [0.05, 0.05, 0.05, 0.75, 0.05]])
>>> target = torch.tensor([0, 1, 3, 2])
>>> average_precision(pred, target, num_classes=5)
[tensor(1.), tensor(1.), tensor(0.2500), tensor(0.2500), tensor(nan)]
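As a rough illustration of what the binary score measures, average precision can be computed by ranking samples by score and averaging the precision observed at each positive hit. The sketch below assumes no tied scores and at least one positive label; `binary_average_precision` is a hypothetical name, not a torchmetrics function.

```python
# Illustrative sketch only: binary average precision, assuming distinct scores
# and at least one positive label.
def binary_average_precision(scores, labels):
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    num_pos = sum(labels)
    tp = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            tp += 1
            total += tp / rank  # precision at the point where recall increases
    return total / num_pos


print(binary_average_precision([0, 1, 2, 3], [0, 1, 1, 1]))  # 1.0
```

In the binary example above every positive is ranked ahead of the single negative, so precision is 1 at each recall step.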
cohen_kappa [func]
torchmetrics.functional.cohen_kappa(preds, target, num_classes, weights=None, threshold=0.5)
Calculates Cohen’s kappa score that measures inter-annotator agreement. It is defined as
$$\kappa = \frac{p_o - p_e}{1 - p_e}$$
where $p_o$ is the empirical probability of agreement and $p_e$ is the expected agreement when both annotators assign labels randomly. Note that $p_e$ is estimated using a per-annotator empirical prior over the class labels.
- Parameters
preds (Tensor) – (float or long tensor), Either a (N, ...) tensor with labels or (N, C, ...) where C is the number of classes, tensor with labels/probabilities
target (Tensor) – target (long tensor), tensor with shape (N, ...) with ground truth labels
weights (Optional[str]) – Weighting type to calculate the score. Choose from
None or 'none': no weighting
'linear': linear weighting
'quadratic': quadratic weighting
threshold (float) – Threshold value for binary or multi-label probabilities. default: 0.5
Example
>>> import torch
>>> from torchmetrics.functional import cohen_kappa
>>> target = torch.tensor([1, 1, 0, 0])
>>> preds = torch.tensor([0, 1, 0, 0])
>>> cohen_kappa(preds, target, num_classes=2)
tensor(0.5000)
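The unweighted kappa can be worked through by hand: observed agreement is the fraction of matching labels, and expected agreement comes from the product of each annotator's empirical class frequencies. The following plain-Python sketch (hypothetical name `cohen_kappa_unweighted`, no 'linear' or 'quadratic' weighting) reproduces the example above under those assumptions.

```python
# Illustrative sketch only: unweighted Cohen's kappa from two label sequences.
from collections import Counter


def cohen_kappa_unweighted(preds, target):
    n = len(target)
    # Observed agreement: fraction of positions where both label sequences match.
    p_o = sum(p == t for p, t in zip(preds, target)) / n
    # Expected agreement: product of per-annotator class frequencies, summed over classes.
    pred_counts, target_counts = Counter(preds), Counter(target)
    classes = set(preds) | set(target)
    p_e = sum(pred_counts[c] * target_counts[c] for c in classes) / (n * n)
    return (p_o - p_e) / (1 - p_e)


print(cohen_kappa_unweighted([0, 1, 0, 0], [1, 1, 0, 0]))  # 0.5
```

Here the observed agreement is 0.75 and the expected agreement is 0.5, so kappa is 0.25 / 0.5 = 0.5.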
- Return type
confusion_matrix [func]
torchmetrics.functional.confusion_matrix(preds, target, num_classes, normalize=None, threshold=0.5, multilabel=False)
Computes the confusion matrix. Works with binary, multiclass, and multilabel data. Accepts probabilities from a model output or integer class values in prediction. Works with multi-dimensional preds and target, but it should be noted that additional dimensions will be flattened.
If preds and target are the same shape and preds is a float tensor, we use the threshold argument to convert into integer labels. This is the case for binary and multi-label probabilities.
If preds has an extra dimension as in the case of multi-class scores we perform an argmax on dim=1.
If working with multilabel data, setting the multilabel argument to True will make sure that a confusion matrix gets calculated per label.
- Parameters
preds (Tensor) – (float or long tensor), Either a (N, ...) tensor with labels or (N, C, ...) where C is the number of classes, tensor with labels/probabilities
target (Tensor) – target (long tensor), tensor with shape (N, ...) with ground truth labels
normalize – Normalization mode for confusion matrix. Choose from
None or 'none': no normalization (default)
'true': normalization over the targets (most commonly used)
'pred': normalization over the predictions
'all': normalization over the whole matrix
threshold (float) – Threshold value for binary or multi-label probabilities. default: 0.5
multilabel (bool) – determines if data is multilabel or not.
- Example (binary data):
>>> import torch
>>> from torchmetrics.functional import confusion_matrix
>>> target = torch.tensor([1, 1, 0, 0])
>>> preds = torch.tensor([0, 1, 0, 0])
>>> confusion_matrix(preds, target, num_classes=2)
tensor([[2., 0.],
        [1., 1.]])
- Example (multiclass data):
>>> target = torch.tensor([2, 1, 0, 0])
>>> preds = torch.tensor([2, 1, 0, 1])
>>> confusion_matrix(preds, target, num_classes=3)
tensor([[1., 1., 0.],
        [0., 1., 0.],
        [0., 0., 1.]])
- Example (multilabel data):
>>> target = torch.tensor([[0, 1, 0], [1, 0, 1]])
>>> preds = torch.tensor([[0, 0, 1], [1, 0, 1]])
>>> confusion_matrix(preds, target, num_classes=3, multilabel=True)
tensor([[[1., 0.], [0., 1.]],
        [[1., 0.], [1., 0.]],
        [[0., 1.], [0., 1.]]])
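For label inputs, the matrix layout in the examples (rows index the target class, columns the predicted class) and the 'true' normalization mode can be sketched in plain Python. The helper names are hypothetical, and returning 0.0 for an empty row is an assumption of this sketch rather than documented library behaviour.

```python
# Illustrative sketch only: confusion matrix with rows = target class, columns = predicted class.
def confusion_counts(preds, target, num_classes):
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(preds, target):
        matrix[t][p] += 1
    return matrix


def normalize_true(matrix):
    """Divide each row by its sum ('true' normalization); empty rows become 0.0 (assumption)."""
    out = []
    for row in matrix:
        total = sum(row)
        out.append([v / total if total else 0.0 for v in row])
    return out


counts = confusion_counts([0, 1, 0, 0], [1, 1, 0, 0], num_classes=2)
print(counts)                  # [[2, 0], [1, 1]]
print(normalize_true(counts))  # [[1.0, 0.0], [0.5, 0.5]]
```

After 'true' normalization each row sums to 1, so the diagonal entries can be read as per-class recall.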
- Return type
dice_score [func]
torchmetrics.functional.dice_score(preds, target, bg=False, nan_score=0.0, no_fg_score=0.0, reduction='elementwise_mean')
Compute dice score from prediction scores
- Parameters
bg (bool) – whether to also compute dice for the background
nan_score (float) – score to return, if a NaN occurs during computation
no_fg_score (float) – score to return, if no foreground pixel was found in target
reduction – a method to reduce metric score over labels.
'elementwise_mean': takes the mean (default)
'sum': takes the sum
'none': no reduction will be applied
- Return type
- Returns
Tensor containing dice score
Example
>>> import torch
>>> from torchmetrics.functional import dice_score
>>> pred = torch.tensor([[0.85, 0.05, 0.05, 0.05],
...                      [0.05, 0.85, 0.05, 0.05],
...                      [0.05, 0.05, 0.85, 0.05],
...                      [0.05, 0.05, 0.05, 0.85]])
>>> target = torch.tensor([0, 1, 3, 2])
>>> dice_score(pred, target)
tensor(0.3333)
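The value in the example can be reproduced by hand with the usual per-class Dice formula, 2·TP / (2·TP + FP + FN), averaged over the non-background classes. The sketch below assumes class 0 is the background when bg=False and returns 0.0 when a class has an empty denominator; the nan_score and no_fg_score handling of the real function is not modelled, and `dice_per_class` is a hypothetical name.

```python
# Illustrative sketch only: per-class Dice from arg-max labels.
# Assumption: class 0 is treated as background and skipped (bg=False).
def dice_per_class(pred_labels, target, cls):
    tp = sum(p == cls and t == cls for p, t in zip(pred_labels, target))
    fp = sum(p == cls and t != cls for p, t in zip(pred_labels, target))
    fn = sum(p != cls and t == cls for p, t in zip(pred_labels, target))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


probs = [[0.85, 0.05, 0.05, 0.05],
         [0.05, 0.85, 0.05, 0.05],
         [0.05, 0.05, 0.85, 0.05],
         [0.05, 0.05, 0.05, 0.85]]
target = [0, 1, 3, 2]
pred_labels = [max(range(len(row)), key=row.__getitem__) for row in probs]
scores = [dice_per_class(pred_labels, target, c) for c in (1, 2, 3)]
print(round(sum(scores) / len(scores), 4))  # 0.3333
```

Only class 1 is predicted correctly, so the foreground scores are 1, 0 and 0, and their mean is 1/3.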
f1 [func]
torchmetrics.functional.f1(preds, target, beta=1.0, average='micro', mdmc_average=None, ignore_index=None, num_classes=None, threshold=0.5, top_k=None, multiclass=None, multilabel=None)
Computes F1 metric. F1 is the harmonic mean of precision and recall, which weights the two scores equally.
Works with binary, multiclass, and multilabel data. Accepts probabilities from a model output or integer class values in prediction. Works with multi-dimensional preds and target.
If preds and target are the same shape and preds is a float tensor, we use the threshold argument to convert into integer labels. This is the case for binary and multi-label probabilities.
If preds has an extra dimension as in the case of multi-class scores we perform an argmax on dim=1.
The reduction method (how the per-class or per-sample scores are aggregated) is controlled by the average parameter, and additionally by the mdmc_average parameter in the multi-dimensional multi-class case. Accepts all inputs listed in Input types.
- Parameters
preds (Tensor) – Predictions from model (probabilities or labels)
average – Defines the reduction that is applied. Should be one of the following:
'micro' [default]: Calculate the metric globally, across all samples and classes.
'macro': Calculate the metric for each class separately, and average the metrics across classes (with equal weights for each class).
'weighted': Calculate the metric for each class separately, and average the metrics across classes, weighting each class by its support (tp + fn).
'none' or None: Calculate the metric for each class separately, and return the metric for every class.
'samples': Calculate the metric for each sample, and average the metrics across samples (with equal weights for each sample).
Note
What is considered a sample in the multi-dimensional multi-class case depends on the value of mdmc_average.
mdmc_average (Optional[str]) – Defines how averaging is done for multi-dimensional multi-class inputs (on top of the average parameter). Should be one of the following:
None [default]: Should be left unchanged if your data is not multi-dimensional multi-class.
'samplewise': In this case, the statistics are computed separately for each sample on the N axis, and then averaged over samples. The computation for each sample is done by treating the flattened extra axes ... (see Input types) as the N dimension within the sample, and computing the metric for the sample based on that.
'global': In this case the N and ... dimensions of the inputs (see Input types) are flattened into a new N_X sample axis, i.e. the inputs are treated as if they were (N_X, C). From here on the average parameter applies as usual.
ignore_index (Optional[int]) – Integer specifying a target class to ignore. If given, this class index does not contribute to the returned score, regardless of reduction method. If an index is ignored, and average=None or 'none', the score for the ignored class will be returned as nan.
num_classes (Optional[int]) – Number of classes. Necessary for 'macro', 'weighted' and None average methods.
threshold (float) – Threshold probability value for transforming probability predictions to binary (0,1) predictions, in the case of binary or multi-label inputs.
top_k – Number of highest probability entries for each sample to convert to 1s - relevant only for inputs with probability predictions. If this parameter is set for multi-label inputs, it will take precedence over threshold. For (multi-dim) multi-class inputs, this parameter defaults to 1. Should be left unset (None) for inputs with label predictions.
multiclass (Optional[bool]) – Used only in certain special cases, where you want to treat inputs as a different type than what they appear to be. See the parameter’s documentation section for a more detailed explanation and examples.
multilabel (Optional[bool]) – Deprecated since version 0.3: Argument will not have any effect and will be removed in v0.4, please use multiclass instead.
- Return type
- Returns
The shape of the returned tensor depends on the average parameter
If average in ['micro', 'macro', 'weighted', 'samples'], a one-element tensor will be returned
If average in ['none', None], the shape will be (C,), where C stands for the number of classes
Example
>>> import torch
>>> from torchmetrics.functional import f1
>>> target = torch.tensor([0, 1, 2, 0, 1, 2])
>>> preds = torch.tensor([0, 2, 1, 0, 0, 1])
>>> f1(preds, target, num_classes=3)
tensor(0.3333)
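The difference between the 'micro' and 'macro' reductions is easiest to see on the example data. The sketch below is plain Python with hypothetical helper names; it counts true positives, false positives and false negatives per class, then either pools the counts before computing F1 (micro) or averages the per-class F1 values (macro). Returning 0.0 for a class with no predictions and no targets is an assumption of the sketch.

```python
# Illustrative sketch only: micro vs. macro F1 for multi-class label inputs.
def per_class_counts(preds, target, num_classes):
    counts = []
    for c in range(num_classes):
        tp = sum(p == c and t == c for p, t in zip(preds, target))
        fp = sum(p == c and t != c for p, t in zip(preds, target))
        fn = sum(p != c and t == c for p, t in zip(preds, target))
        counts.append((tp, fp, fn))
    return counts


def f1_from_counts(tp, fp, fn):
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


target = [0, 1, 2, 0, 1, 2]
preds = [0, 2, 1, 0, 0, 1]
counts = per_class_counts(preds, target, num_classes=3)

# Micro: pool tp/fp/fn over all classes, then compute a single F1.
micro = f1_from_counts(*[sum(col) for col in zip(*counts)])
# Macro: compute F1 per class, then take the unweighted mean.
macro = sum(f1_from_counts(*c) for c in counts) / len(counts)

print(round(micro, 4))  # 0.3333
print(round(macro, 4))  # 0.2667
```

The micro value matches the default output shown above; the macro value is lower because two of the three classes have no correct predictions.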
fbeta [func]
torchmetrics.functional.fbeta(preds, target, beta=1.0, average='micro', mdmc_average=None, ignore_index=None, num_classes=None, threshold=0.5, top_k=None, multiclass=None, multilabel=None)
Computes f_beta metric.
Works with binary, multiclass, and multilabel data. Accepts probabilities from a model output or integer class values in prediction. Works with multi-dimensional preds and target.
If preds and target are the same shape and preds is a float tensor, we use the threshold argument to convert into integer labels. This is the case for binary and multi-label probabilities.
If preds has an extra dimension as in the case of multi-class scores we perform an argmax on dim=1.
The reduction method (how the per-class or per-sample scores are aggregated) is controlled by the average parameter, and additionally by the mdmc_average parameter in the multi-dimensional multi-class case. Accepts all inputs listed in Input types.
- Parameters
preds (Tensor) – Predictions from model (probabilities or labels)
average – Defines the reduction that is applied. Should be one of the following:
'micro' [default]: Calculate the metric globally, across all samples and classes.
'macro': Calculate the metric for each class separately, and average the metrics across classes (with equal weights for each class).
'weighted': Calculate the metric for each class separately, and average the metrics across classes, weighting each class by its support (tp + fn).
'none' or None: Calculate the metric for each class separately, and return the metric for every class.
'samples': Calculate the metric for each sample, and average the metrics across samples (with equal weights for each sample).
Note
What is considered a sample in the multi-dimensional multi-class case depends on the value of mdmc_average.
mdmc_average (Optional[str]) – Defines how averaging is done for multi-dimensional multi-class inputs (on top of the average parameter). Should be one of the following:
None [default]: Should be left unchanged if your data is not multi-dimensional multi-class.
'samplewise': In this case, the statistics are computed separately for each sample on the N axis, and then averaged over samples. The computation for each sample is done by treating the flattened extra axes ... (see Input types) as the N dimension within the sample, and computing the metric for the sample based on that.
'global': In this case the N and ... dimensions of the inputs (see Input types) are flattened into a new N_X sample axis, i.e. the inputs are treated as if they were (N_X, C). From here on the average parameter applies as usual.
ignore_index (Optional[int]) – Integer specifying a target class to ignore. If given, this class index does not contribute to the returned score, regardless of reduction method. If an index is ignored, and average=None or 'none', the score for the ignored class will be returned as nan.
num_classes (Optional[int]) – Number of classes. Necessary for 'macro', 'weighted' and None average methods.
threshold (float) – Threshold probability value for transforming probability predictions to binary (0,1) predictions, in the case of binary or multi-label inputs.
top_k (Optional[int]) – Number of highest probability entries for each sample to convert to 1s - relevant only for inputs with probability predictions. If this parameter is set for multi-label inputs, it will take precedence over threshold. For (multi-dim) multi-class inputs, this parameter defaults to 1. Should be left unset (None) for inputs with label predictions.
multiclass (Optional[bool]) – Used only in certain special cases, where you want to treat inputs as a different type than what they appear to be. See the parameter’s documentation section for a more detailed explanation and examples.
multilabel (Optional[bool]) – Deprecated since version 0.3: Argument will not have any effect and will be removed in v0.4, please use multiclass instead.
- Return type
- Returns
The shape of the returned tensor depends on the average parameter
If average in ['micro', 'macro', 'weighted', 'samples'], a one-element tensor will be returned
If average in ['none', None], the shape will be (C,), where C stands for the number of classes
Example
>>> import torch
>>> from torchmetrics.functional import fbeta
>>> target = torch.tensor([0, 1, 2, 0, 1, 2])
>>> preds = torch.tensor([0, 2, 1, 0, 0, 1])
>>> fbeta(preds, target, num_classes=3, beta=0.5)
tensor(0.3333)
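The effect of beta is easier to see when precision and recall differ. The sketch below uses the standard F-beta definition, (1 + β²)·P·R / (β²·P + R), in plain Python; `fbeta_score` is a hypothetical helper, and returning 0.0 for an empty denominator is an assumption of the sketch.

```python
# Illustrative sketch only: F-beta from a precision/recall pair.
def fbeta_score(precision, recall, beta=1.0):
    b2 = beta * beta
    denom = b2 * precision + recall
    return (1 + b2) * precision * recall / denom if denom else 0.0


# With precision 0.5 and recall 1.0, smaller beta favours precision, larger beta favours recall.
print(round(fbeta_score(0.5, 1.0, beta=1.0), 4))  # 0.6667
print(round(fbeta_score(0.5, 1.0, beta=0.5), 4))  # 0.5556
print(round(fbeta_score(0.5, 1.0, beta=2.0), 4))  # 0.8333
```

In the example above the micro-averaged precision and recall are both 1/3, which is why beta=0.5 gives the same 0.3333 as F1: when precision equals recall, every beta yields that common value.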
hamming_distance [func]
torchmetrics.functional.hamming_distance(preds, target, threshold=0.5)
Computes the average Hamming distance (also known as Hamming loss) between targets and predictions:
$$\text{Hamming distance} = \frac{1}{N \cdot L}\sum_{i=1}^{N}\sum_{l=1}^{L} 1(y_{il} \neq \hat{y}_{il})$$
Where $y$ is a tensor of target values, $\hat{y}$ is a tensor of predictions, and $\bullet_{il}$ refers to the $l$-th label of the $i$-th sample of that tensor.
This is the same as 1-accuracy for binary data, while for all other types of inputs it treats each possible label separately - meaning that, for example, multi-class data is treated as if it were multi-label.
Accepts all input types listed in Input types.
- Parameters
Example
>>> import torch
>>> from torchmetrics.functional import hamming_distance
>>> target = torch.tensor([[0, 1], [1, 1]])
>>> preds = torch.tensor([[0, 1], [0, 1]])
>>> hamming_distance(preds, target)
tensor(0.2500)
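For multi-label inputs that are already binary, the Hamming distance is simply the fraction of individual label slots that disagree. A minimal plain-Python sketch follows (hypothetical name `hamming_distance_sketch`, no thresholding of probabilities).

```python
# Illustrative sketch only: fraction of mismatching labels over all N * L label slots.
def hamming_distance_sketch(preds, target):
    mismatches = 0
    total = 0
    for pred_row, target_row in zip(preds, target):
        for p, t in zip(pred_row, target_row):
            mismatches += p != t
            total += 1
    return mismatches / total


print(hamming_distance_sketch([[0, 1], [0, 1]], [[0, 1], [1, 1]]))  # 0.25
```

One of the four label slots differs, which gives the 0.25 shown in the example.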
- Return type
hinge [func]
torchmetrics.functional.hinge(preds, target, squared=False, multiclass_mode=None)
Computes the mean Hinge loss, typically used for Support Vector Machines (SVMs). In the binary case it is defined as:
$$\text{Hinge loss} = \max(0, 1 - y \times \hat{y})$$
Where $y \in \{-1, 1\}$ is the target, and $\hat{y} \in \mathbb{R}$ is the prediction.
In the multi-class case, when multiclass_mode=None (default), multiclass_mode=MulticlassMode.CRAMMER_SINGER or multiclass_mode="crammer-singer", this metric will compute the multi-class hinge loss defined by Crammer and Singer as:
$$\text{Hinge loss} = \max\left(0, 1 - \hat{y}_y + \max_{i \neq y} \hat{y}_i\right)$$
Where $y \in \{0, \ldots, C - 1\}$ is the target class (where $C$ is the number of classes), and $\hat{y} \in \mathbb{R}^C$ is the predicted output per class.
In the multi-class case when multiclass_mode=MulticlassMode.ONE_VS_ALL or multiclass_mode='one-vs-all', this metric will use a one-vs-all approach to compute the hinge loss, giving a vector of C outputs where each entry pits that class against all remaining classes.
This metric can optionally output the mean of the squared hinge loss by setting squared=True.
Only accepts inputs with preds shape of (N) (binary) or (N, C) (multi-class) and target shape of (N).
- Parameters
preds (Tensor) – Predictions from model (as float outputs from decision function).
squared (bool) – If True, this will compute the squared hinge loss. Otherwise, computes the regular hinge loss (default).
multiclass_mode (Union[str, MulticlassMode, None]) – Which approach to use for multi-class inputs (has no effect in the binary case). None (default), MulticlassMode.CRAMMER_SINGER or "crammer-singer", uses the Crammer Singer multi-class hinge loss. MulticlassMode.ONE_VS_ALL or "one-vs-all" computes the hinge loss in a one-vs-all fashion.
- Raises
ValueError – If preds shape is not of size (N) or (N, C).
ValueError – If target shape is not of size (N).
ValueError – If multiclass_mode is not: None, MulticlassMode.CRAMMER_SINGER, "crammer-singer", MulticlassMode.ONE_VS_ALL or "one-vs-all".
- Example (binary case):
>>> import torch
>>> from torchmetrics.functional import hinge
>>> target = torch.tensor([0, 1, 1])
>>> preds = torch.tensor([-2.2, 2.4, 0.1])
>>> hinge(preds, target)
tensor(0.3000)
- Example (default / multiclass case):
>>> target = torch.tensor([0, 1, 2])
>>> preds = torch.tensor([[-1.0, 0.9, 0.2], [0.5, -1.1, 0.8], [2.2, -0.5, 0.3]])
>>> hinge(preds, target)
tensor(2.9000)
- Example (multiclass example, one vs all mode):
>>> target = torch.tensor([0, 1, 2])
>>> preds = torch.tensor([[-1.0, 0.9, 0.2], [0.5, -1.1, 0.8], [2.2, -0.5, 0.3]])
>>> hinge(preds, target, multiclass_mode="one-vs-all")
tensor([2.2333, 1.5000, 1.2333])
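Both the binary and the Crammer-Singer definitions can be checked by hand against the first two examples. The plain-Python sketch below uses hypothetical helper names and assumes that binary targets given as 0/1 are mapped to -1/+1 before the loss is applied, which is consistent with the binary example but is stated here as an assumption.

```python
# Illustrative sketch only: mean hinge loss in the binary and Crammer-Singer forms.
def binary_hinge(preds, target, squared=False):
    losses = []
    for score, label in zip(preds, target):
        y = 1.0 if label == 1 else -1.0  # assumption: 0/1 targets map to -1/+1
        loss = max(0.0, 1.0 - y * score)
        losses.append(loss * loss if squared else loss)
    return sum(losses) / len(losses)


def crammer_singer_hinge(preds, target):
    losses = []
    for row, label in zip(preds, target):
        # Highest score among the classes other than the target class.
        best_other = max(s for c, s in enumerate(row) if c != label)
        losses.append(max(0.0, 1.0 - row[label] + best_other))
    return sum(losses) / len(losses)


print(round(binary_hinge([-2.2, 2.4, 0.1], [0, 1, 1]), 4))  # 0.3
multi_preds = [[-1.0, 0.9, 0.2], [0.5, -1.1, 0.8], [2.2, -0.5, 0.3]]
print(round(crammer_singer_hinge(multi_preds, [0, 1, 2]), 4))  # 2.9
```

In the binary case only the third sample lies inside the margin (loss 0.9), and in the multi-class case every sample contributes a loss of 2.9.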
- Return type
iou [func]
torchmetrics.functional.iou(preds, target, ignore_index=None, absent_score=0.0, threshold=0.5, num_classes=None, reduction='elementwise_mean')
Computes Intersection over Union (IoU), also known as the Jaccard index:
$$J(A, B) = \frac{|A \cap B|}{|A \cup B|}$$
Where $A$ and $B$ are both tensors of the same size, containing integer class values. They may be subject to conversion from input data (see description below).
Note that it is different from box IoU.
If preds and target are the same shape and preds is a float tensor, we use the threshold argument to convert into integer labels. This is the case for binary and multi-label probabilities.
If preds has an extra dimension as in the case of multi-class scores we perform an argmax on dim=1.
- Parameters
preds (Tensor) – tensor containing predictions from model (probabilities, or labels) with shape [N, d1, d2, ...]
target (Tensor) – tensor containing ground truth labels with shape [N, d1, d2, ...]
ignore_index (Optional[int]) – optional int specifying a target class to ignore. If given, this class index does not contribute to the returned score, regardless of reduction method. Has no effect if given an int that is not in the range [0, num_classes-1], where num_classes is either given or derived from pred and target. By default, no index is ignored, and all classes are used.
absent_score (float) – score to use for an individual class, if no instances of the class index were present in pred AND no instances of the class index were present in target. For example, if we have 3 classes, [0, 0] for pred, and [0, 2] for target, then class 1 would be assigned the absent_score.
threshold (float) – Threshold value for binary or multi-label probabilities. default: 0.5
num_classes (Optional[int]) – Optionally specify the number of classes
reduction – a method to reduce metric score over labels.
'elementwise_mean': takes the mean (default)
'sum': takes the sum
'none': no reduction will be applied
- Returns
Tensor containing single value if reduction is ‘elementwise_mean’, or number of classes if reduction is ‘none’
- Return type
IoU score
Example
>>> import torch
>>> from torchmetrics.functional import iou
>>> target = torch.randint(0, 2, (10, 25, 25))
>>> pred = torch.tensor(target)
>>> pred[2:5, 7:13, 9:15] = 1 - pred[2:5, 7:13, 9:15]
>>> iou(pred, target)
tensor(0.9660)
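Because the example above relies on random data, a small deterministic case may be easier to follow. The plain-Python sketch below computes a per-class Jaccard index on flat label lists and averages it; `mean_iou` is a hypothetical name, and using `absent_score` for classes that appear in neither input mirrors the parameter description above without claiming to match the library's code.

```python
# Illustrative sketch only: mean per-class IoU (Jaccard index) on flat label lists.
def mean_iou(preds, target, num_classes, absent_score=0.0):
    scores = []
    for c in range(num_classes):
        intersection = sum(p == c and t == c for p, t in zip(preds, target))
        union = sum(p == c or t == c for p, t in zip(preds, target))
        # A class missing from both preds and target gets absent_score.
        scores.append(intersection / union if union else absent_score)
    return sum(scores) / len(scores)


print(round(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2), 4))  # 0.5833
```

Class 0 has intersection 1 and union 2, class 1 has intersection 2 and union 3, so the mean is (0.5 + 0.6667) / 2.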
matthews_corrcoef [func]
torchmetrics.functional.matthews_corrcoef(preds, target, num_classes, threshold=0.5)
Calculates Matthews correlation coefficient that measures the general correlation or quality of a classification. In the binary case it is defined as:
$$\text{MCC} = \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}$$
where TP, TN, FP and FN are respectively the true positives, true negatives, false positives and false negatives. Also works in the case of multi-label or multi-class input.
- Parameters
preds (Tensor) – (float or long tensor), Either a (N, ...) tensor with labels or (N, C, ...) where C is the number of classes, tensor with labels/probabilities
target (Tensor) – target (long tensor), tensor with shape (N, ...) with ground truth labels
threshold (float) – Threshold value for binary or multi-label probabilities. default: 0.5
Example
>>> import torch
>>> from torchmetrics.functional import matthews_corrcoef
>>> target = torch.tensor([1, 1, 0, 0])
>>> preds = torch.tensor([0, 1, 0, 0])
>>> matthews_corrcoef(preds, target, num_classes=2)
tensor(0.5774)
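The binary definition can be verified on the example by counting the four confusion-matrix cells directly. The sketch below is plain Python with a hypothetical helper name; returning 0.0 when the denominator is zero is an assumption of the sketch.

```python
# Illustrative sketch only: binary Matthews correlation coefficient from label lists.
import math


def binary_mcc(preds, target):
    tp = sum(p == 1 and t == 1 for p, t in zip(preds, target))
    tn = sum(p == 0 and t == 0 for p, t in zip(preds, target))
    fp = sum(p == 1 and t == 0 for p, t in zip(preds, target))
    fn = sum(p == 0 and t == 1 for p, t in zip(preds, target))
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


print(round(binary_mcc([0, 1, 0, 0], [1, 1, 0, 0]), 4))  # 0.5774
```

Here TP=1, TN=2, FP=0 and FN=1, so the score is 2 / sqrt(12).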
- Return type
roc [func]
torchmetrics.functional.roc(preds, target, num_classes=None, pos_label=None, sample_weights=None)
Computes the Receiver Operating Characteristic (ROC). Works with binary, multiclass and multilabel input.
- Parameters
preds (Tensor) – predictions from model (logits or probabilities)
num_classes (Optional[int]) – integer with number of classes. Not necessary to provide for binary problems.
pos_label (Optional[int]) – integer determining the positive class. Default is None, which for binary problems is translated to 1. For multiclass problems this argument should not be set, as it is iteratively changed in the range [0, num_classes-1].
sample_weights (Optional[Sequence]) – sample weights for each data point
- Return type
Union[Tuple[Tensor, Tensor, Tensor], Tuple[List[Tensor], List[Tensor], List[Tensor]]]
- Returns
3-element tuple containing
- fpr: tensor with false positive rates. If multiclass or multilabel, this is a list of such tensors, one for each class/label.
- tpr: tensor with true positive rates. If multiclass or multilabel, this is a list of such tensors, one for each class/label.
- thresholds: tensor with thresholds used for computing false and true positive rates. If multiclass or multilabel, this is a list of such tensors, one for each class/label.
- Example (binary case):
>>> import torch
>>> from torchmetrics.functional import roc
>>> pred = torch.tensor([0, 1, 2, 3])
>>> target = torch.tensor([0, 1, 1, 1])
>>> fpr, tpr, thresholds = roc(pred, target, pos_label=1)
>>> fpr
tensor([0., 0., 0., 0., 1.])
>>> tpr
tensor([0.0000, 0.3333, 0.6667, 1.0000, 1.0000])
>>> thresholds
tensor([4, 3, 2, 1, 0])
- Example (multiclass case):
>>> from torchmetrics.functional import roc
>>> pred = torch.tensor([[0.75, 0.05, 0.05, 0.05],
...                      [0.05, 0.75, 0.05, 0.05],
...                      [0.05, 0.05, 0.75, 0.05],
...                      [0.05, 0.05, 0.05, 0.75]])
>>> target = torch.tensor([0, 1, 3, 2])
>>> fpr, tpr, thresholds = roc(pred, target, num_classes=4)
>>> fpr
[tensor([0., 0., 1.]), tensor([0., 0., 1.]), tensor([0.0000, 0.3333, 1.0000]), tensor([0.0000, 0.3333, 1.0000])]
>>> tpr
[tensor([0., 1., 1.]), tensor([0., 1., 1.]), tensor([0., 0., 1.]), tensor([0., 0., 1.])]
>>> thresholds
[tensor([1.7500, 0.7500, 0.0500]),
 tensor([1.7500, 0.7500, 0.0500]),
 tensor([1.7500, 0.7500, 0.0500]),
 tensor([1.7500, 0.7500, 0.0500])]
- Example (multilabel case):
>>> from torchmetrics.functional import roc
>>> pred = torch.tensor([[0.8191, 0.3680, 0.1138],
...                      [0.3584, 0.7576, 0.1183],
...                      [0.2286, 0.3468, 0.1338],
...                      [0.8603, 0.0745, 0.1837]])
>>> target = torch.tensor([[1, 1, 0], [0, 1, 0], [0, 0, 0], [0, 1, 1]])
>>> fpr, tpr, thresholds = roc(pred, target, num_classes=3, pos_label=1)
>>> fpr
[tensor([0.0000, 0.3333, 0.3333, 0.6667, 1.0000]),
 tensor([0., 0., 0., 1., 1.]),
 tensor([0.0000, 0.0000, 0.3333, 0.6667, 1.0000])]
>>> tpr
[tensor([0., 0., 1., 1., 1.]),
 tensor([0.0000, 0.3333, 0.6667, 0.6667, 1.0000]),
 tensor([0., 1., 1., 1., 1.])]
>>> thresholds
[tensor([1.8603, 0.8603, 0.8191, 0.3584, 0.2286]),
 tensor([1.7576, 0.7576, 0.3680, 0.3468, 0.0745]),
 tensor([1.1837, 0.1837, 0.1338, 0.1183, 0.1138])]
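The shape of the binary output (one more point than there are distinct scores, with a leading threshold above the maximum score) can be understood from a small plain-Python sketch. It assumes distinct scores and that both classes are present; `roc_points` is a hypothetical name, and the choice of max score + 1 for the first threshold is inferred from the binary example rather than taken from the library source.

```python
# Illustrative sketch only: ROC points for binary labels, assuming distinct scores
# and at least one positive and one negative sample.
def roc_points(scores, labels):
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    num_pos = sum(labels)
    num_neg = len(labels) - num_pos
    # Start at (0, 0) with a threshold above every score, so nothing is predicted positive.
    fpr, tpr, thresholds = [0.0], [0.0], [scores[order[0]] + 1]
    tp = fp = 0
    for i in order:
        if labels[i] == 1:
            tp += 1
        else:
            fp += 1
        fpr.append(fp / num_neg)
        tpr.append(tp / num_pos)
        thresholds.append(scores[i])
    return fpr, tpr, thresholds


fpr, tpr, thresholds = roc_points([0, 1, 2, 3], [0, 1, 1, 1])
print(fpr)                         # [0.0, 0.0, 0.0, 0.0, 1.0]
print([round(v, 4) for v in tpr])  # [0.0, 0.3333, 0.6667, 1.0, 1.0]
print(thresholds)                  # [4, 3, 2, 1, 0]
```

Each step lowers the threshold past one more sample, moving the curve up for a positive sample and right for a negative one.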
precision [func]¶
-
torchmetrics.functional.
precision
(preds, target, average='micro', mdmc_average=None, ignore_index=None, num_classes=None, threshold=0.5, top_k=None, multiclass=None, multilabel=None, is_multiclass=None)[source] Computes Precision:
$$\text{Precision} = \frac{\text{TP}}{\text{TP} + \text{FP}}$$
where $\text{TP}$ and $\text{FP}$ represent the number of true positives and false positives, respectively. With the use of the top_k parameter, this metric can generalize to Precision@K.
The reduction method (how the precision scores are aggregated) is controlled by the average parameter, and additionally by the mdmc_average parameter in the multi-dimensional multi-class case. Accepts all inputs listed in Input types.
- Parameters
preds¶ (Tensor) – Predictions from model (probabilities or labels)
average¶ – Defines the reduction that is applied. Should be one of the following:
    'micro' [default]: Calculate the metric globally, across all samples and classes.
    'macro': Calculate the metric for each class separately, and average the metrics across classes (with equal weights for each class).
    'weighted': Calculate the metric for each class separately, and average the metrics across classes, weighting each class by its support (tp + fn).
    'none' or None: Calculate the metric for each class separately, and return the metric for every class.
    'samples': Calculate the metric for each sample, and average the metrics across samples (with equal weights for each sample).
Note
What is considered a sample in the multi-dimensional multi-class case depends on the value of mdmc_average.
mdmc_average¶ (Optional[str]) – Defines how averaging is done for multi-dimensional multi-class inputs (on top of the average parameter). Should be one of the following:
    None [default]: Should be left unchanged if your data is not multi-dimensional multi-class.
    'samplewise': In this case, the statistics are computed separately for each sample on the N axis, and then averaged over samples. The computation for each sample is done by treating the flattened extra axes ... (see Input types) as the N dimension within the sample, and computing the metric for the sample based on that.
    'global': In this case the N and ... dimensions of the inputs (see Input types) are flattened into a new N_X sample axis, i.e. the inputs are treated as if they were (N_X, C). From here on the average parameter applies as usual.
ignore_index¶ (Optional[int]) – Integer specifying a target class to ignore. If given, this class index does not contribute to the returned score, regardless of reduction method. If an index is ignored, and average=None or 'none', the score for the ignored class will be returned as nan.
num_classes¶ (Optional[int]) – Number of classes. Necessary for 'macro', 'weighted' and None average methods.
threshold¶ (float) – Threshold probability value for transforming probability predictions to binary (0,1) predictions, in the case of binary or multi-label inputs.
top_k¶ – Number of highest probability entries for each sample to convert to 1s - relevant only for inputs with probability predictions. If this parameter is set for multi-label inputs, it will take precedence over threshold. For (multi-dim) multi-class inputs, this parameter defaults to 1. Should be left unset (None) for inputs with label predictions.
multiclass¶ (Optional[bool]) – Used only in certain special cases, where you want to treat inputs as a different type than what they appear to be. See the parameter’s documentation section for a more detailed explanation and examples.
multilabel¶ (Optional[bool]) – Deprecated since version 0.3: Argument will not have any effect and will be removed in v0.4, please use multiclass instead.
is_multiclass¶ (Optional[bool]) – Deprecated since version 0.3: Argument will not have any effect and will be removed in v0.4, please use multiclass instead.
- Return type
Tensor
- Returns
The shape of the returned tensor depends on the average parameter:
    If average in ['micro', 'macro', 'weighted', 'samples'], a one-element tensor will be returned.
    If average in ['none', None], the shape will be (C,), where C stands for the number of classes.
- Raises
ValueError – If average is not one of "micro", "macro", "weighted", "samples", "none" or None.
ValueError – If mdmc_average is not one of None, "samplewise", "global".
ValueError – If average is set but num_classes is not provided.
ValueError – If num_classes is set and ignore_index is not in the range [0, num_classes).
Example
>>> import torch
>>> from torchmetrics.functional import precision
>>> preds = torch.tensor([2, 0, 2, 1])
>>> target = torch.tensor([1, 1, 2, 0])
>>> precision(preds, target, average='macro', num_classes=3)
tensor(0.1667)
>>> precision(preds, target, average='micro')
tensor(0.2500)
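To see where 0.1667 and 0.2500 come from, the sketch below recomputes macro and micro precision from per-class counts in plain Python. The helper names are hypothetical, and scoring a never-predicted class as 0 is an assumption of this sketch rather than a statement about the library.

```python
def per_class_counts(preds, target, num_classes):
    """Return per-class true positive and false positive counts for integer labels."""
    tp = [0] * num_classes
    fp = [0] * num_classes
    for p, t in zip(preds, target):
        if p == t:
            tp[p] += 1
        else:
            fp[p] += 1  # class p was predicted, but the true class differs
    return tp, fp


def precision_by_hand(preds, target, num_classes, average="micro"):
    tp, fp = per_class_counts(preds, target, num_classes)
    if average == "micro":
        # Pool the counts over all classes, then divide once.
        return sum(tp) / (sum(tp) + sum(fp))
    # 'macro': one score per class, then an unweighted mean.
    # Assumption: a class that is never predicted scores 0.
    scores = [t / (t + f) if (t + f) else 0.0 for t, f in zip(tp, fp)]
    return sum(scores) / num_classes


preds = [2, 0, 2, 1]
target = [1, 1, 2, 0]
macro = precision_by_hand(preds, target, 3, average="macro")  # (0 + 0 + 0.5) / 3
micro = precision_by_hand(preds, target, 3, average="micro")  # 1 / (1 + 3)
```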
precision_recall [func]¶
-
torchmetrics.functional.
precision_recall
(preds, target, average='micro', mdmc_average=None, ignore_index=None, num_classes=None, threshold=0.5, top_k=None, multiclass=None, multilabel=None, is_multiclass=None)[source] Computes Precision and Recall:
$$\text{Precision} = \frac{\text{TP}}{\text{TP} + \text{FP}}, \qquad \text{Recall} = \frac{\text{TP}}{\text{TP} + \text{FN}}$$
where $\text{TP}$, $\text{FN}$ and $\text{FP}$ represent the number of true positives, false negatives and false positives, respectively. With the use of the top_k parameter, this metric can generalize to Recall@K and Precision@K.
The reduction method (how the precision and recall scores are aggregated) is controlled by the average parameter, and additionally by the mdmc_average parameter in the multi-dimensional multi-class case. Accepts all inputs listed in Input types.
- Parameters
preds¶ (Tensor) – Predictions from model (probabilities or labels)
average¶ – Defines the reduction that is applied. Should be one of the following:
    'micro' [default]: Calculate the metric globally, across all samples and classes.
    'macro': Calculate the metric for each class separately, and average the metrics across classes (with equal weights for each class).
    'weighted': Calculate the metric for each class separately, and average the metrics across classes, weighting each class by its support (tp + fn).
    'none' or None: Calculate the metric for each class separately, and return the metric for every class.
    'samples': Calculate the metric for each sample, and average the metrics across samples (with equal weights for each sample).
Note
What is considered a sample in the multi-dimensional multi-class case depends on the value of mdmc_average.
mdmc_average¶ (Optional[str]) – Defines how averaging is done for multi-dimensional multi-class inputs (on top of the average parameter). Should be one of the following:
    None [default]: Should be left unchanged if your data is not multi-dimensional multi-class.
    'samplewise': In this case, the statistics are computed separately for each sample on the N axis, and then averaged over samples. The computation for each sample is done by treating the flattened extra axes ... (see Input types) as the N dimension within the sample, and computing the metric for the sample based on that.
    'global': In this case the N and ... dimensions of the inputs (see Input types) are flattened into a new N_X sample axis, i.e. the inputs are treated as if they were (N_X, C). From here on the average parameter applies as usual.
ignore_index¶ (Optional[int]) – Integer specifying a target class to ignore. If given, this class index does not contribute to the returned score, regardless of reduction method. If an index is ignored, and average=None or 'none', the score for the ignored class will be returned as nan.
num_classes¶ (Optional[int]) – Number of classes. Necessary for 'macro', 'weighted' and None average methods.
threshold¶ (float) – Threshold probability value for transforming probability predictions to binary (0,1) predictions, in the case of binary or multi-label inputs.
top_k¶ – Number of highest probability entries for each sample to convert to 1s - relevant only for inputs with probability predictions. If this parameter is set for multi-label inputs, it will take precedence over threshold. For (multi-dim) multi-class inputs, this parameter defaults to 1. Should be left unset (None) for inputs with label predictions.
multiclass¶ (Optional[bool]) – Used only in certain special cases, where you want to treat inputs as a different type than what they appear to be. See the parameter’s documentation section for a more detailed explanation and examples.
multilabel¶ (Optional[bool]) – Deprecated since version 0.3: Argument will not have any effect and will be removed in v0.4, please use multiclass instead.
is_multiclass¶ (Optional[bool]) – Deprecated since version 0.3: Argument will not have any effect and will be removed in v0.4, please use multiclass instead.
- Returns
precision and recall. Their shape depends on the average parameter:
    If average in ['micro', 'macro', 'weighted', 'samples'], they are single-element tensors.
    If average in ['none', None], they are tensors of shape (C, ), where C stands for the number of classes.
- Return type
The function returns a tuple with two elements
- Raises
ValueError – If average is not one of "micro", "macro", "weighted", "samples", "none" or None.
ValueError – If mdmc_average is not one of None, "samplewise", "global".
ValueError – If average is set but num_classes is not provided.
ValueError – If num_classes is set and ignore_index is not in the range [0, num_classes).
Example
>>> import torch
>>> from torchmetrics.functional import precision_recall
>>> preds = torch.tensor([2, 0, 2, 1])
>>> target = torch.tensor([1, 1, 2, 0])
>>> precision_recall(preds, target, average='macro', num_classes=3)
(tensor(0.1667), tensor(0.3333))
>>> precision_recall(preds, target, average='micro')
(tensor(0.2500), tensor(0.2500))
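The 'weighted' reduction described under the average parameter can be illustrated on the same data. This is a plain-Python sketch of the arithmetic only, with a hypothetical helper name; it weights each class score by its support (tp + fn) and assumes a score of 0 whenever a denominator is 0.

```python
def precision_recall_per_class(preds, target, num_classes):
    """Return a (precision, recall, support) triple for every class."""
    result = []
    for c in range(num_classes):
        tp = sum(1 for p, t in zip(preds, target) if p == c and t == c)
        fp = sum(1 for p, t in zip(preds, target) if p == c and t != c)
        fn = sum(1 for p, t in zip(preds, target) if p != c and t == c)
        # Assumption: an undefined ratio (zero denominator) is scored as 0.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        result.append((precision, recall, tp + fn))
    return result


scores = precision_recall_per_class([2, 0, 2, 1], [1, 1, 2, 0], num_classes=3)
total_support = sum(s for _, _, s in scores)
weighted_precision = sum(p * s for p, _, s in scores) / total_support
weighted_recall = sum(r * s for _, r, s in scores) / total_support
print(scores)  # [(0.0, 0.0, 1), (0.0, 0.0, 2), (0.5, 1.0, 1)]
```

The unweighted means of the first two columns give the 'macro' values shown in the example, 0.1667 and 0.3333.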
precision_recall_curve [func]¶
-
torchmetrics.functional.
precision_recall_curve
(preds, target, num_classes=None, pos_label=None, sample_weights=None)[source] Computes precision-recall pairs for different thresholds.
- Parameters
num_classes¶ (Optional[int]) – integer with number of classes. Not necessary to provide for binary problems.
pos_label¶ (Optional[int]) – integer determining the positive class. Default is None, which for binary problems is translated to 1. For multiclass problems this argument should not be set, as it is iteratively changed over the range [0, num_classes-1].
sample_weights¶ (Optional[Sequence]) – sample weights for each data point
- Return type
Union[Tuple[Tensor, Tensor, Tensor], Tuple[List[Tensor], List[Tensor], List[Tensor]]]
- Returns
3-element tuple containing
- precision:
tensor where element i is the precision of predictions with score >= thresholds[i] and the last element is 1. If multiclass, this is a list of such tensors, one for each class.
- recall:
tensor where element i is the recall of predictions with score >= thresholds[i] and the last element is 0. If multiclass, this is a list of such tensors, one for each class.
- thresholds:
Thresholds used for computing precision/recall scores
- Raises
ValueError – If preds and target don’t have the same number of dimensions, or one additional dimension for preds.
ValueError – If the number of classes deduced from preds is not the same as the num_classes provided.
- Example (binary case):
>>> import torch
>>> from torchmetrics.functional import precision_recall_curve
>>> pred = torch.tensor([0, 1, 2, 3])
>>> target = torch.tensor([0, 1, 1, 0])
>>> precision, recall, thresholds = precision_recall_curve(pred, target, pos_label=1)
>>> precision
tensor([0.6667, 0.5000, 0.0000, 1.0000])
>>> recall
tensor([1.0000, 0.5000, 0.0000, 0.0000])
>>> thresholds
tensor([1, 2, 3])
- Example (multiclass case):
>>> import torch
>>> from torchmetrics.functional import precision_recall_curve
>>> pred = torch.tensor([[0.75, 0.05, 0.05, 0.05, 0.05],
...                      [0.05, 0.75, 0.05, 0.05, 0.05],
...                      [0.05, 0.05, 0.75, 0.05, 0.05],
...                      [0.05, 0.05, 0.05, 0.75, 0.05]])
>>> target = torch.tensor([0, 1, 3, 2])
>>> precision, recall, thresholds = precision_recall_curve(pred, target, num_classes=5)
>>> precision
[tensor([1., 1.]), tensor([1., 1.]), tensor([0.2500, 0.0000, 1.0000]), tensor([0.2500, 0.0000, 1.0000]), tensor([0., 1.])]
>>> recall
[tensor([1., 0.]), tensor([1., 0.]), tensor([1., 0., 0.]), tensor([1., 0., 0.]), tensor([nan, 0.])]
>>> thresholds
[tensor([0.7500]), tensor([0.7500]), tensor([0.0500, 0.7500]), tensor([0.0500, 0.7500]), tensor([0.0500])]
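The binary example can be traced with a short plain-Python sketch. It is not the library's code: `pr_curve_by_hand` is a hypothetical helper that evaluates every distinct score as a threshold, keeps only the highest threshold that still reaches full recall, and appends the closing point (precision 1, recall 0) described in the Returns section. It assumes at least one positive label.

```python
def pr_curve_by_hand(scores, labels):
    """Return (precision, recall, thresholds) for binary labels, lowest threshold first."""
    n_pos = sum(labels)
    thresholds = sorted(set(scores))
    tps, precision, recall = [], [], []
    for thr in thresholds:
        # A sample is predicted positive when its score is >= the threshold.
        tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 1)
        predicted = sum(1 for s in scores if s >= thr)
        tps.append(tp)
        precision.append(tp / predicted)
        recall.append(tp / n_pos)
    # Lower thresholds that already have full recall add no new recall level; drop them.
    start = max(i for i, tp in enumerate(tps) if tp == n_pos)
    return precision[start:] + [1.0], recall[start:] + [0.0], thresholds[start:]


precision, recall, thresholds = pr_curve_by_hand([0, 1, 2, 3], [0, 1, 1, 0])
print(recall)      # [1.0, 0.5, 0.0, 0.0]
print(thresholds)  # [1, 2, 3]
```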
recall [func]¶
-
torchmetrics.functional.
recall
(preds, target, average='micro', mdmc_average=None, ignore_index=None, num_classes=None, threshold=0.5, top_k=None, multiclass=None, multilabel=None, is_multiclass=None)[source] Computes Recall:
$$\text{Recall} = \frac{\text{TP}}{\text{TP} + \text{FN}}$$
where $\text{TP}$ and $\text{FN}$ represent the number of true positives and false negatives, respectively. With the use of the top_k parameter, this metric can generalize to Recall@K.
The reduction method (how the recall scores are aggregated) is controlled by the average parameter, and additionally by the mdmc_average parameter in the multi-dimensional multi-class case. Accepts all inputs listed in Input types.
- Parameters
preds¶ (Tensor) – Predictions from model (probabilities or labels)
average¶ – Defines the reduction that is applied. Should be one of the following:
    'micro' [default]: Calculate the metric globally, across all samples and classes.
    'macro': Calculate the metric for each class separately, and average the metrics across classes (with equal weights for each class).
    'weighted': Calculate the metric for each class separately, and average the metrics across classes, weighting each class by its support (tp + fn).
    'none' or None: Calculate the metric for each class separately, and return the metric for every class.
    'samples': Calculate the metric for each sample, and average the metrics across samples (with equal weights for each sample).
Note
What is considered a sample in the multi-dimensional multi-class case depends on the value of mdmc_average.
mdmc_average¶ (Optional[str]) – Defines how averaging is done for multi-dimensional multi-class inputs (on top of the average parameter). Should be one of the following:
    None [default]: Should be left unchanged if your data is not multi-dimensional multi-class.
    'samplewise': In this case, the statistics are computed separately for each sample on the N axis, and then averaged over samples. The computation for each sample is done by treating the flattened extra axes ... (see Input types) as the N dimension within the sample, and computing the metric for the sample based on that.
    'global': In this case the N and ... dimensions of the inputs (see Input types) are flattened into a new N_X sample axis, i.e. the inputs are treated as if they were (N_X, C). From here on the average parameter applies as usual.
ignore_index¶ (Optional[int]) – Integer specifying a target class to ignore. If given, this class index does not contribute to the returned score, regardless of reduction method. If an index is ignored, and average=None or 'none', the score for the ignored class will be returned as nan.
num_classes¶ (Optional[int]) – Number of classes. Necessary for 'macro', 'weighted' and None average methods.
threshold¶ (float) – Threshold probability value for transforming probability predictions to binary (0,1) predictions, in the case of binary or multi-label inputs.
top_k¶ – Number of highest probability entries for each sample to convert to 1s - relevant only for inputs with probability predictions. If this parameter is set for multi-label inputs, it will take precedence over threshold. For (multi-dim) multi-class inputs, this parameter defaults to 1. Should be left unset (None) for inputs with label predictions.
multiclass¶ (Optional[bool]) – Used only in certain special cases, where you want to treat inputs as a different type than what they appear to be. See the parameter’s documentation section for a more detailed explanation and examples.
multilabel¶ (Optional[bool]) – Deprecated since version 0.3: Argument will not have any effect and will be removed in v0.4, please use multiclass instead.
is_multiclass¶ (Optional[bool]) – Deprecated since version 0.3: Argument will not have any effect and will be removed in v0.4, please use multiclass instead.
- Return type
Tensor
- Returns
The shape of the returned tensor depends on the average parameter:
    If average in ['micro', 'macro', 'weighted', 'samples'], a one-element tensor will be returned.
    If average in ['none', None], the shape will be (C,), where C stands for the number of classes.
- Raises
ValueError – If average is not one of "micro", "macro", "weighted", "samples", "none" or None.
ValueError – If mdmc_average is not one of None, "samplewise", "global".
ValueError – If average is set but num_classes is not provided.
ValueError – If num_classes is set and ignore_index is not in the range [0, num_classes).
Example
>>> import torch
>>> from torchmetrics.functional import recall
>>> preds = torch.tensor([2, 0, 2, 1])
>>> target = torch.tensor([1, 1, 2, 0])
>>> recall(preds, target, average='macro', num_classes=3)
tensor(0.3333)
>>> recall(preds, target, average='micro')
tensor(0.2500)
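The Recall@K generalization mentioned for top_k can be sketched for single-label multi-class data, where every sample has exactly one true class. The probabilities and the helper `recall_at_k` below are hypothetical, and the sketch covers only the micro-averaged case: a sample counts as a true positive when its target is among the k highest-scoring classes.

```python
def recall_at_k(prob_rows, target, k):
    """Fraction of samples whose true class is among the k highest-scoring classes."""
    hits = 0
    for row, t in zip(prob_rows, target):
        # Indices of the k largest probabilities in this row.
        top = sorted(range(len(row)), key=lambda j: row[j], reverse=True)[:k]
        hits += t in top
    return hits / len(target)


# Hypothetical probabilities for 3 samples and 3 classes.
probs = [[0.6, 0.3, 0.1],
         [0.2, 0.5, 0.3],
         [0.1, 0.4, 0.5]]
target = [0, 2, 1]
r1 = recall_at_k(probs, target, k=1)  # only the first sample is a hit
r2 = recall_at_k(probs, target, k=2)  # every target is in the top 2
```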
select_topk [func]¶
-
torchmetrics.utilities.data.
select_topk
(prob_tensor, topk=1, dim=1)[source] Convert a probability tensor to binary by selecting top-k highest entries.
- Parameters
- Return type
- Returns
A binary tensor of the same shape as the input tensor of type torch.int32
Example
>>> import torch
>>> from torchmetrics.utilities.data import select_topk
>>> x = torch.tensor([[1.1, 2.0, 3.0], [2.0, 1.0, 0.5]])
>>> select_topk(x, topk=2)
tensor([[0, 1, 1],
        [1, 1, 0]], dtype=torch.int32)
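The conversion can be pictured as marking the k largest entries of each row. A minimal plain-Python sketch for 2D input follows; `select_topk_by_hand` is a hypothetical helper, and breaking ties in favour of the earlier index is an assumption of the sketch.

```python
def select_topk_by_hand(rows, topk=1):
    """Mark the topk largest entries of every row with 1 and the rest with 0."""
    out = []
    for row in rows:
        # sorted() is stable, so on ties the earlier index is kept (assumption).
        top = sorted(range(len(row)), key=lambda j: row[j], reverse=True)[:topk]
        out.append([1 if j in top else 0 for j in range(len(row))])
    return out


binary = select_topk_by_hand([[1.1, 2.0, 3.0], [2.0, 1.0, 0.5]], topk=2)
print(binary)  # [[0, 1, 1], [1, 1, 0]]
```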
stat_scores [func]¶
-
torchmetrics.functional.
stat_scores
(preds, target, reduce='micro', mdmc_reduce=None, num_classes=None, top_k=None, threshold=0.5, multiclass=None, ignore_index=None, is_multiclass=None)[source] Computes the number of true positives, false positives, true negatives, false negatives. Related to Type I and Type II errors and the confusion matrix.
The reduction method (how the statistics are aggregated) is controlled by the reduce parameter, and additionally by the mdmc_reduce parameter in the multi-dimensional multi-class case. Accepts all inputs listed in Input types.
- Parameters
preds¶ (Tensor) – Predictions from model (probabilities or labels)
threshold¶ (float) – Threshold probability value for transforming probability predictions to binary (0 or 1) predictions, in the case of binary or multi-label inputs.
top_k¶ – Number of highest probability entries for each sample to convert to 1s - relevant only for inputs with probability predictions. If this parameter is set for multi-label inputs, it will take precedence over threshold. For (multi-dim) multi-class inputs, this parameter defaults to 1. Should be left unset (None) for inputs with label predictions.
reduce¶ – Defines the reduction that is applied. Should be one of the following:
    'micro' [default]: Counts the statistics by summing over all [sample, class] combinations (globally). Each statistic is represented by a single integer.
    'macro': Counts the statistics for each class separately (over all samples). Each statistic is represented by a (C,) tensor. Requires num_classes to be set.
    'samples': Counts the statistics for each sample separately (over all classes). Each statistic is represented by a (N, ) 1d tensor.
Note
What is considered a sample in the multi-dimensional multi-class case depends on the value of mdmc_reduce.
num_classes¶ (Optional[int]) – Number of classes. Necessary for (multi-dimensional) multi-class or multi-label data.
ignore_index¶ (Optional[int]) – Specify a class (label) to ignore. If given, this class index does not contribute to the returned score, regardless of reduction method. If an index is ignored, and reduce='macro', the class statistics for the ignored class will all be returned as -1.
mdmc_reduce¶ (Optional[str]) – Defines how the multi-dimensional multi-class inputs are handled. Should be one of the following:
    None [default]: Should be left unchanged if your data is not multi-dimensional multi-class (see Input types for the definition of input types).
    'samplewise': In this case, the statistics are computed separately for each sample on the N axis, and then the outputs are concatenated together. In each sample the extra axes ... are flattened to become the sub-sample axis, and statistics for each sample are computed by treating the sub-sample axis as the N axis for that sample.
    'global': In this case the N and ... dimensions of the inputs are flattened into a new N_X sample axis, i.e. the inputs are treated as if they were (N_X, C). From here on the reduce parameter applies as usual.
multiclass¶ (Optional[bool]) – Used only in certain special cases, where you want to treat inputs as a different type than what they appear to be. See the parameter’s documentation section for a more detailed explanation and examples.
is_multiclass¶ (Optional[bool]) – Deprecated since version 0.3: Argument will not have any effect and will be removed in v0.4, please use multiclass instead.
- Return type
- Returns
The metric returns a tensor of shape (..., 5), where the last dimension corresponds to [tp, fp, tn, fn, sup] (sup stands for support and equals tp + fn). The shape depends on the reduce and mdmc_reduce (in case of multi-dimensional multi-class data) parameters:
If the data is not multi-dimensional multi-class, then
    If reduce='micro', the shape will be (5, )
    If reduce='macro', the shape will be (C, 5), where C stands for the number of classes
    If reduce='samples', the shape will be (N, 5), where N stands for the number of samples
If the data is multi-dimensional multi-class and mdmc_reduce='global', then
    If reduce='micro', the shape will be (5, )
    If reduce='macro', the shape will be (C, 5)
    If reduce='samples', the shape will be (N*X, 5), where X stands for the product of sizes of all “extra” dimensions of the data (i.e. all dimensions except for C and N)
If the data is multi-dimensional multi-class and mdmc_reduce='samplewise', then
    If reduce='micro', the shape will be (N, 5)
    If reduce='macro', the shape will be (N, C, 5)
    If reduce='samples', the shape will be (N, X, 5)
- Raises
ValueError – If reduce is none of "micro", "macro" or "samples".
ValueError – If mdmc_reduce is none of None, "samplewise", "global".
ValueError – If reduce is set to "macro" and num_classes is not provided.
ValueError – If num_classes is set and ignore_index is not in the range [0, num_classes).
ValueError – If ignore_index is used with binary data.
ValueError – If inputs are multi-dimensional multi-class and mdmc_reduce is not provided.
Example
>>> import torch
>>> from torchmetrics.functional import stat_scores
>>> preds = torch.tensor([1, 0, 2, 1])
>>> target = torch.tensor([1, 1, 2, 0])
>>> stat_scores(preds, target, reduce='macro', num_classes=3)
tensor([[0, 1, 2, 1, 1],
        [1, 1, 1, 1, 2],
        [1, 0, 3, 0, 1]])
>>> stat_scores(preds, target, reduce='micro')
tensor([2, 2, 6, 2, 4])
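The rows in the example follow the [tp, fp, tn, fn, sup] layout described under Returns. The plain-Python sketch below, with a hypothetical helper name, counts the statistics per class in a one-vs-rest fashion and then sums the rows to obtain the 'micro' result.

```python
def stat_scores_by_hand(preds, target, num_classes):
    """Return one [tp, fp, tn, fn, support] row per class (one-vs-rest counting)."""
    n = len(preds)
    rows = []
    for c in range(num_classes):
        tp = sum(1 for p, t in zip(preds, target) if p == c and t == c)
        fp = sum(1 for p, t in zip(preds, target) if p == c and t != c)
        fn = sum(1 for p, t in zip(preds, target) if p != c and t == c)
        tn = n - tp - fp - fn  # every remaining sample is a true negative for class c
        rows.append([tp, fp, tn, fn, tp + fn])
    return rows


macro = stat_scores_by_hand([1, 0, 2, 1], [1, 1, 2, 0], num_classes=3)
micro = [sum(column) for column in zip(*macro)]  # sum each statistic over classes
print(macro)  # [[0, 1, 2, 1, 1], [1, 1, 1, 1, 2], [1, 0, 3, 0, 1]]
print(micro)  # [2, 2, 6, 2, 4]
```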
to_categorical [func]¶
-
torchmetrics.utilities.data.
to_categorical
(tensor, argmax_dim=1)[source] Converts a tensor of probabilities to a dense label tensor
- Parameters
- Return type
- Returns
A tensor with categorical labels [N, d2, …]
Example
>>> import torch
>>> from torchmetrics.utilities.data import to_categorical
>>> x = torch.tensor([[0.2, 0.5], [0.9, 0.1]])
>>> to_categorical(x)
tensor([1, 0])
to_onehot [func]¶
-
torchmetrics.utilities.data.
to_onehot
(label_tensor, num_classes=None)[source] Converts a dense label tensor to one-hot format
- Parameters
- Return type
- Returns
A sparse label tensor with shape [N, C, d1, d2, …]
Example
>>> import torch
>>> from torchmetrics.utilities.data import to_onehot
>>> x = torch.tensor([1, 2, 3])
>>> to_onehot(x)
tensor([[0, 1, 0, 0],
        [0, 0, 1, 0],
        [0, 0, 0, 1]])
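The two conversions are inverses of each other on label data: one-hot encoding places a single 1 at the label index, and taking the arg-max recovers the label. The plain-Python sketch below uses hypothetical helpers and handles only 1D labels and (N, C) rows; inferring the class count as the largest label plus one is an assumption that matches the four columns in the example.

```python
def to_onehot_by_hand(labels, num_classes=None):
    """Encode integer labels as one-hot rows."""
    if num_classes is None:
        num_classes = max(labels) + 1  # assumption: infer the class count from the largest label
    return [[1 if c == label else 0 for c in range(num_classes)] for label in labels]


def to_categorical_by_hand(rows):
    """Return the index of the largest entry of each row (arg-max over classes)."""
    return [max(range(len(row)), key=lambda j: row[j]) for row in rows]


onehot = to_onehot_by_hand([1, 2, 3])
labels = to_categorical_by_hand([[0.2, 0.5], [0.9, 0.1]])
print(onehot)  # [[0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
print(labels)  # [1, 0]
```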
Regression Metrics¶
explained_variance [func]¶
-
torchmetrics.functional.
explained_variance
(preds, target, multioutput='uniform_average')[source] Computes explained variance.
- Parameters
multioutput¶ – Defines aggregation in the case of multiple output scores. Can be one of the following strings (default is 'uniform_average'):
    'raw_values': returns full set of scores
    'uniform_average': scores are uniformly averaged
    'variance_weighted': scores are weighted by their individual variances
Example
>>> import torch
>>> from torchmetrics.functional import explained_variance
>>> target = torch.tensor([3, -0.5, 2, 7])
>>> preds = torch.tensor([2.5, 0.0, 2, 8])
>>> explained_variance(preds, target)
tensor(0.9572)
>>> target = torch.tensor([[0.5, 1], [-1, 1], [7, -6]])
>>> preds = torch.tensor([[0, 2], [-1, 2], [8, -5]])
>>> explained_variance(preds, target, multioutput='raw_values')
tensor([0.9677, 1.0000])
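Explained variance is commonly defined as one minus the variance of the residuals divided by the variance of the target. The sketch below applies that textbook definition in plain Python (hypothetical helper names, population variance) and reproduces the values printed in the examples; it is not the library's code.

```python
def variance(values):
    """Population variance (divide by n)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def explained_variance_by_hand(preds, target):
    residuals = [t - p for p, t in zip(preds, target)]
    return 1.0 - variance(residuals) / variance(target)


ev = explained_variance_by_hand([2.5, 0.0, 2, 8], [3, -0.5, 2, 7])
# First output column of the 'raw_values' example.
ev_col0 = explained_variance_by_hand([0, -1, 8], [0.5, -1, 7])
print(round(ev, 4))       # 0.9572
print(round(ev_col0, 4))  # 0.9677
```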
image_gradients [func]¶
-
torchmetrics.functional.
image_gradients
(img)[source] Computes the gradients of a given image using finite difference
- Parameters
img¶ (Tensor) – An (N, C, H, W) input tensor where C is the number of image channels
- Return type
- Returns
Tuple of (dy, dx) with each gradient of shape [N, C, H, W]
- Raises
TypeError – If img is not of the type <torch.Tensor>.
RuntimeError – If img is not a 4D tensor.
Example
>>> import torch
>>> from torchmetrics.functional import image_gradients
>>> image = torch.arange(0, 1*1*5*5, dtype=torch.float32)
>>> image = torch.reshape(image, (1, 1, 5, 5))
>>> dy, dx = image_gradients(image)
>>> dy[0, 0, :, :]
tensor([[5., 5., 5., 5., 5.],
        [5., 5., 5., 5., 5.],
        [5., 5., 5., 5., 5.],
        [5., 5., 5., 5., 5.],
        [0., 0., 0., 0., 0.]])
Note
The implementation follows the 1-step finite difference method used by the TensorFlow implementation. The values are organized such that the gradient I(x+1, y) - I(x, y) is stored at location (x, y).
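The one-step finite difference can be written out for a single 2D channel. This plain-Python sketch uses a hypothetical helper and fills the last row of dy and the last column of dx with zeros, which matches the dy output shown in the example; the zero last column of dx is an assumption based on the same convention.

```python
def image_gradients_2d(img):
    """Forward differences of a 2D image given as a list of rows."""
    h, w = len(img), len(img[0])
    # dy[i][j] = img[i + 1][j] - img[i][j]; the last row has no neighbour below, so it is 0.
    dy = [[img[i + 1][j] - img[i][j] if i + 1 < h else 0.0 for j in range(w)]
          for i in range(h)]
    # dx[i][j] = img[i][j + 1] - img[i][j]; the last column has no neighbour to the right.
    dx = [[img[i][j + 1] - img[i][j] if j + 1 < w else 0.0 for j in range(w)]
          for i in range(h)]
    return dy, dx


# Same 5x5 ramp as in the example: values 0..24 in row-major order.
image = [[float(5 * i + j) for j in range(5)] for i in range(5)]
dy, dx = image_gradients_2d(image)
print(dy[0])  # [5.0, 5.0, 5.0, 5.0, 5.0]
print(dx[0])  # [1.0, 1.0, 1.0, 1.0, 0.0]
```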
mean_absolute_error [func]¶
-
torchmetrics.functional.
mean_absolute_error
(preds, target)[source] Computes mean absolute error
- Parameters
- Return type
- Returns
Tensor with MAE
Example
>>> import torch
>>> from torchmetrics.functional import mean_absolute_error
>>> x = torch.tensor([0., 1, 2, 3])
>>> y = torch.tensor([0., 1, 2, 2])
>>> mean_absolute_error(x, y)
tensor(0.2500)
mean_squared_error [func]¶
-
torchmetrics.functional.
mean_squared_error
(preds, target)[source] Computes mean squared error
- Parameters
- Return type
- Returns
Tensor with MSE
Example
>>> import torch
>>> from torchmetrics.functional import mean_squared_error
>>> x = torch.tensor([0., 1, 2, 3])
>>> y = torch.tensor([0., 1, 2, 2])
>>> mean_squared_error(x, y)
tensor(0.2500)
mean_squared_log_error [func]¶
-
torchmetrics.functional.
mean_squared_log_error
(preds, target)[source] Computes mean squared log error
- Parameters
- Return type
- Returns
Tensor with MSLE
Example
>>> import torch
>>> from torchmetrics.functional import mean_squared_log_error
>>> x = torch.tensor([0., 1, 2, 3])
>>> y = torch.tensor([0., 1, 2, 2])
>>> mean_squared_log_error(x, y)
tensor(0.0207)
Note
Half precision is only supported on GPU for this metric
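The three error metrics above share the same inputs, so their arithmetic can be compared side by side. This is a plain-Python sketch with hypothetical helper names; the squared log error is taken between log(1 + x) values, the usual definition, which reproduces the 0.0207 in the example.

```python
import math


def mean_absolute_error_by_hand(preds, target):
    return sum(abs(p - t) for p, t in zip(preds, target)) / len(preds)


def mean_squared_error_by_hand(preds, target):
    return sum((p - t) ** 2 for p, t in zip(preds, target)) / len(preds)


def mean_squared_log_error_by_hand(preds, target):
    # log1p(v) = log(1 + v), so zero values are handled without error.
    return sum((math.log1p(p) - math.log1p(t)) ** 2
               for p, t in zip(preds, target)) / len(preds)


x = [0.0, 1.0, 2.0, 3.0]
y = [0.0, 1.0, 2.0, 2.0]
mae = mean_absolute_error_by_hand(x, y)      # 0.25
mse = mean_squared_error_by_hand(x, y)       # 0.25
msle = mean_squared_log_error_by_hand(x, y)  # about 0.0207
```

Only the last element differs between x and y, so MAE and MSE coincide here because the single error has magnitude 1.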
pearson_corrcoef [func]¶
-
torchmetrics.functional.
pearson_corrcoef
(preds, target)[source] Computes pearson correlation coefficient.
Example
>>> import torch
>>> from torchmetrics.functional import pearson_corrcoef
>>> target = torch.tensor([3, -0.5, 2, 7])
>>> preds = torch.tensor([2.5, 0.0, 2, 8])
>>> pearson_corrcoef(preds, target)
tensor(0.9849)
- Return type
psnr [func]¶
-
torchmetrics.functional.
psnr
(preds, target, data_range=None, base=10.0, reduction='elementwise_mean', dim=None)[source] Computes the peak signal-to-noise ratio
- Parameters
data_range¶ (Optional[float]) – the range of the data. If None, it is determined from the data (max - min). data_range must be given when dim is not None.
reduction¶ – a method to reduce metric score over labels.
    'elementwise_mean': takes the mean (default)
    'sum': takes the sum
    'none': no reduction will be applied
dim¶ (Union[int, Tuple[int, …], None]) – Dimensions to reduce PSNR scores over, provided as either an integer or a list of integers. Default is None, meaning scores will be reduced across all dimensions.
- Return type
- Returns
Tensor with PSNR score
- Raises
ValueError – If dim is not None and data_range is not provided.
Example
>>> import torch
>>> from torchmetrics.functional import psnr
>>> pred = torch.tensor([[0.0, 1.0], [2.0, 3.0]])
>>> target = torch.tensor([[3.0, 2.0], [1.0, 0.0]])
>>> psnr(pred, target)
tensor(2.5527)
Note
Half precision is only supported on GPU for this metric
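The PSNR value in the example follows from the usual definition, 10 * log_base(data_range^2 / MSE). The plain-Python sketch below uses a hypothetical helper and covers only the case where all dimensions are reduced; taking the missing data_range from the target values is an assumption of the sketch.

```python
import math


def psnr_by_hand(preds, target, data_range=None, base=10.0):
    flat_p = [v for row in preds for v in row]
    flat_t = [v for row in target for v in row]
    if data_range is None:
        # Assumption: the range is measured on the target (max - min).
        data_range = max(flat_t) - min(flat_t)
    mse = sum((p - t) ** 2 for p, t in zip(flat_p, flat_t)) / len(flat_p)
    return 10.0 * math.log(data_range ** 2 / mse, base)


score = psnr_by_hand([[0.0, 1.0], [2.0, 3.0]], [[3.0, 2.0], [1.0, 0.0]])
print(round(score, 4))  # 2.5527
```

Here the MSE is 5 and the range is 3, so the ratio inside the logarithm is 9 / 5.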
r2score [func]¶
-
torchmetrics.functional.
r2score
(preds, target, adjusted=0, multioutput='uniform_average')[source] Computes r2 score also known as coefficient of determination:
$$R^2 = 1 - \frac{SS_{res}}{SS_{tot}}$$
where $SS_{res} = \sum_i (y_i - f(x_i))^2$ is the sum of residual squares, and $SS_{tot} = \sum_i (y_i - \bar{y})^2$ is total sum of squares. Can also calculate adjusted r2 score given by
$$R^2_{adj} = 1 - \frac{(1 - R^2)(n - 1)}{n - k - 1}$$
where $n$ is the number of samples and the parameter $k$ (the number of independent regressors) should be provided as the adjusted argument.
- Parameters
adjusted¶ (int) – number of independent regressors for calculating adjusted r2 score. Default 0 (standard r2 score).
multioutput¶ – Defines aggregation in the case of multiple output scores. Can be one of the following strings (default is 'uniform_average'):
    'raw_values': returns full set of scores
    'uniform_average': scores are uniformly averaged
    'variance_weighted': scores are weighted by their individual variances
- Raises
ValueError – If both preds and target are not 1D or 2D tensors.
ValueError – If len(preds) is less than 2, since at least 2 samples are needed to calculate r2 score.
ValueError – If multioutput is not one of raw_values, uniform_average or variance_weighted.
ValueError – If adjusted is not an integer greater than or equal to 0.
Example
>>> import torch
>>> from torchmetrics.functional import r2score
>>> target = torch.tensor([3, -0.5, 2, 7])
>>> preds = torch.tensor([2.5, 0.0, 2, 8])
>>> r2score(preds, target)
tensor(0.9486)
>>> target = torch.tensor([[0.5, 1], [-1, 1], [7, -6]])
>>> preds = torch.tensor([[0, 2], [-1, 2], [8, -5]])
>>> r2score(preds, target, multioutput='raw_values')
tensor([0.9654, 0.9082])
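The coefficient of determination and its adjusted variant can be computed directly from the two sums of squares. The sketch below is plain Python with a hypothetical helper name and handles a single output only; the adjusted branch uses the standard formula with n samples and k regressors.

```python
def r2_by_hand(preds, target, adjusted=0):
    n = len(target)
    mean_t = sum(target) / n
    ss_res = sum((t - p) ** 2 for p, t in zip(preds, target))  # residual sum of squares
    ss_tot = sum((t - mean_t) ** 2 for t in target)            # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    if adjusted:
        # Penalise the score for the number of regressors k = adjusted.
        r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - adjusted - 1)
    return r2


target = [3, -0.5, 2, 7]
preds = [2.5, 0.0, 2, 8]
r2 = r2_by_hand(preds, target)                  # about 0.9486
r2_adj = r2_by_hand(preds, target, adjusted=1)  # lower than r2
```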
- Return type
spearman_corrcoef [func]¶
-
torchmetrics.functional.
spearman_corrcoef
(preds, target)[source] Computes spearmans rank correlation coefficient:
$$r_s = \frac{\operatorname{cov}(rg_x, rg_y)}{\sigma_{rg_x} \, \sigma_{rg_y}}$$
where $rg_x$ and $rg_y$ are the ranks associated with the variables x and y. Spearman's correlation coefficient corresponds to the standard Pearson correlation coefficient calculated on the rank variables.
Example
>>> import torch
>>> from torchmetrics.functional import spearman_corrcoef
>>> target = torch.tensor([3, -0.5, 2, 7])
>>> preds = torch.tensor([2.5, 0.0, 2, 8])
>>> spearman_corrcoef(preds, target)
tensor(1.0000)
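Since Spearman's coefficient is Pearson's coefficient applied to ranks, a sketch needs only a ranking step and a Pearson helper. The plain-Python code below uses hypothetical helper names and assumes there are no tied values, so no tie correction is applied.

```python
import math


def pearson_by_hand(x, y):
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def ranks(values):
    """Rank 1 for the smallest value; assumes all values are distinct."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = float(rank)
    return result


def spearman_by_hand(x, y):
    return pearson_by_hand(ranks(x), ranks(y))


target = [3, -0.5, 2, 7]
preds = [2.5, 0.0, 2, 8]
rho = spearman_by_hand(preds, target)  # both sequences rank as [3, 1, 2, 4]
r = pearson_by_hand(preds, target)     # about 0.9849 on the raw values
```

The two inputs are ordered identically, which is why the rank correlation is 1 even though the raw Pearson correlation is slightly lower.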
- Return type
ssim [func]¶
-
torchmetrics.functional.
ssim
(preds, target, kernel_size=(11, 11), sigma=(1.5, 1.5), reduction='elementwise_mean', data_range=None, k1=0.01, k2=0.03)[source] Computes Structural Similarity Index Measure
- Parameters
kernel_size¶ (Sequence[int]) – size of the gaussian kernel (default: (11, 11))
sigma¶ (Sequence[float]) – Standard deviation of the gaussian kernel (default: (1.5, 1.5))
reduction¶ – a method to reduce metric score over labels.
    'elementwise_mean': takes the mean (default)
    'sum': takes the sum
    'none': no reduction will be applied
data_range¶ (Optional[float]) – Range of the image. If None, it is determined from the image (max - min)
- Return type
- Returns
Tensor with SSIM score
- Raises
TypeError – If preds and target don’t have the same data type.
ValueError – If preds and target don’t have BxCxHxW shape.
ValueError – If the length of kernel_size or sigma is not 2.
ValueError – If one of the elements of kernel_size is not an odd positive number.
ValueError – If one of the elements of sigma is not a positive number.
Example
>>> import torch
>>> from torchmetrics.functional import ssim
>>> preds = torch.rand([16, 1, 16, 16])
>>> target = preds * 0.75
>>> ssim(preds, target)
tensor(0.9219)
NLP¶
bleu_score [func]¶
-
torchmetrics.functional.
bleu_score
(translate_corpus, reference_corpus, n_gram=4, smooth=False)[source] Calculate BLEU score of machine translated text with one or more references
- Parameters
- Return type
- Returns
Tensor with BLEU Score
Example
>>> from torchmetrics.functional import bleu_score
>>> translate_corpus = ['the cat is on the mat'.split()]
>>> reference_corpus = [['there is a cat on the mat'.split(), 'a cat is on the mat'.split()]]
>>> bleu_score(translate_corpus, reference_corpus)
tensor(0.7598)
Pairwise¶
embedding_similarity [func]¶
-
torchmetrics.functional.
embedding_similarity
(batch, similarity='cosine', reduction='none', zero_diagonal=True)[source] Computes representation similarity
Example
>>> import torch
>>> from torchmetrics.functional import embedding_similarity
>>> embeddings = torch.tensor([[1., 2., 3., 4.], [1., 2., 3., 4.], [4., 5., 6., 7.]])
>>> embedding_similarity(embeddings)
tensor([[0.0000, 1.0000, 0.9759],
        [1.0000, 0.0000, 0.9759],
        [0.9759, 0.9759, 0.0000]])
- Parameters
- Return type
- Returns
A square matrix (batch, batch) with the similarity scores between all elements. If sum or mean are used, then returns (b, 1) with the reduced value for each row.
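With the default similarity='cosine' and zero_diagonal=True, the matrix in the example is the pairwise cosine similarity with self-similarities set to zero. The following is a plain-Python sketch of that computation with a hypothetical helper name, not the library's implementation.

```python
import math


def cosine_similarity_matrix(batch, zero_diagonal=True):
    """Pairwise cosine similarity between the rows of batch."""
    norms = [math.sqrt(sum(v * v for v in row)) for row in batch]
    n = len(batch)
    sim = [[sum(a * b for a, b in zip(batch[i], batch[j])) / (norms[i] * norms[j])
            for j in range(n)] for i in range(n)]
    if zero_diagonal:
        for i in range(n):
            sim[i][i] = 0.0  # ignore the similarity of each row with itself
    return sim


embeddings = [[1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0], [4.0, 5.0, 6.0, 7.0]]
sim = cosine_similarity_matrix(embeddings)
print(round(sim[0][2], 4))  # 0.9759
```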
Retrieval¶
retrieval_average_precision [func]¶
-
torchmetrics.functional.retrieval_average_precision(preds, target)[source]
Computes average precision (for information retrieval), as explained here.
preds and target should be of the same shape and live on the same device. If no target is True, 0 is returned. target must be either bool or integers and preds must be float, otherwise an error is raised.
- Parameters
- Return type
- Returns
a single-value tensor with the average precision (AP) of the predictions preds w.r.t. the labels target.
Example
>>> from torch import tensor
>>> from torchmetrics.functional import retrieval_average_precision
>>> preds = tensor([0.2, 0.3, 0.5])
>>> target = tensor([True, False, True])
>>> retrieval_average_precision(preds, target)
tensor(0.8333)
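As a rough illustration of the definition, average precision for a single query can be sketched in plain Python: rank the documents by descending score, then average the precision measured at each relevant position. The function name `average_precision` is hypothetical; this is not the library implementation.

```python
def average_precision(preds, target):
    """AP for one query: mean of precision@rank over the relevant positions."""
    # Order the labels by descending predicted score.
    ranked = [t for _, t in sorted(zip(preds, target),
                                   key=lambda pair: pair[0], reverse=True)]
    if not any(ranked):
        return 0.0  # no relevant document, matching the documented behaviour
    hits = 0
    precisions = []
    for rank, relevant in enumerate(ranked, start=1):
        if relevant:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions)


print(round(average_precision([0.2, 0.3, 0.5], [True, False, True]), 4))  # 0.8333
```

Here the relevant documents land at ranks 1 and 3, so AP = (1/1 + 2/3) / 2.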
retrieval_reciprocal_rank [func]¶
-
torchmetrics.functional.retrieval_reciprocal_rank(preds, target)[source]
Computes reciprocal rank (for information retrieval), as explained here.
preds and target should be of the same shape and live on the same device. If no target is True, 0 is returned. target must be either bool or integers and preds must be float, otherwise an error is raised.
- Parameters
- Return type
- Returns
a single-value tensor with the reciprocal rank (RR) of the predictions preds w.r.t. the labels target.
Example
>>> import torch
>>> from torchmetrics.functional import retrieval_reciprocal_rank
>>> preds = torch.tensor([0.2, 0.3, 0.5])
>>> target = torch.tensor([False, True, False])
>>> retrieval_reciprocal_rank(preds, target)
tensor(0.5000)
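The reciprocal rank is the inverse of the position of the first relevant document in the ranking. A minimal sketch, assuming a single query and using the hypothetical name `reciprocal_rank`:

```python
def reciprocal_rank(preds, target):
    """1 / rank of the first relevant document, or 0.0 if there is none."""
    # Order the labels by descending predicted score.
    ranked = [t for _, t in sorted(zip(preds, target),
                                   key=lambda pair: pair[0], reverse=True)]
    for rank, relevant in enumerate(ranked, start=1):
        if relevant:
            return 1.0 / rank
    return 0.0


print(reciprocal_rank([0.2, 0.3, 0.5], [False, True, False]))  # 0.5
```

The only relevant document has the second-highest score, hence 1/2.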
retrieval_precision [func]¶
-
torchmetrics.functional.retrieval_precision(preds, target, k=None)[source]
Computes the precision metric (for information retrieval), as explained here. Precision is the fraction of relevant documents among all the retrieved documents.
preds and target should be of the same shape and live on the same device. If no target is True, 0 is returned. target must be either bool or integers and preds must be float, otherwise an error is raised. If you want to measure Precision@K, k must be a positive integer.
- Parameters
- Return type
- Returns
a single-value tensor with the precision (at k) of the predictions preds w.r.t. the labels target.
Example
>>> from torch import tensor
>>> from torchmetrics.functional import retrieval_precision
>>> preds = tensor([0.2, 0.3, 0.5])
>>> target = tensor([True, False, True])
>>> retrieval_precision(preds, target, k=2)
tensor(0.5000)
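Precision@K can be sketched as the share of relevant documents among the K highest-scored ones. The helper `precision_at_k` below is hypothetical, and treating `k=None` as "use every document" is an assumption of this sketch.

```python
def precision_at_k(preds, target, k=None):
    """Fraction of the top-k ranked documents that are relevant."""
    if k is None:
        k = len(preds)  # assumption: no k means the whole ranking
    if not isinstance(k, int) or k <= 0:
        raise ValueError("k must be a positive integer or None")
    # Order the labels by descending predicted score.
    ranked = [t for _, t in sorted(zip(preds, target),
                                   key=lambda pair: pair[0], reverse=True)]
    return sum(1 for t in ranked[:k] if t) / k


print(precision_at_k([0.2, 0.3, 0.5], [True, False, True], k=2))  # 0.5
```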
retrieval_recall [func]¶
-
torchmetrics.functional.retrieval_recall(preds, target, k=None)[source]
Computes the recall metric (for information retrieval), as explained here. Recall is the fraction of relevant documents retrieved among all the relevant documents.
preds and target should be of the same shape and live on the same device. If no target is True, 0 is returned. target must be either bool or integers and preds must be float, otherwise an error is raised. If you want to measure Recall@K, k must be a positive integer.
- Parameters
- Return type
- Returns
a single-value tensor with the recall (at k) of the predictions preds w.r.t. the labels target.
Example
>>> from torch import tensor
>>> from torchmetrics.functional import retrieval_recall
>>> preds = tensor([0.2, 0.3, 0.5])
>>> target = tensor([True, False, True])
>>> retrieval_recall(preds, target, k=2)
tensor(0.5000)
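Recall@K divides the number of relevant documents found in the top K by the total number of relevant documents. A minimal sketch with the hypothetical name `recall_at_k`, again assuming that `k=None` means the whole ranking:

```python
def recall_at_k(preds, target, k=None):
    """Relevant documents in the top k, divided by all relevant documents."""
    if k is None:
        k = len(preds)  # assumption: no k means the whole ranking
    total_relevant = sum(1 for t in target if t)
    if total_relevant == 0:
        return 0.0  # no relevant document, matching the documented behaviour
    # Order the labels by descending predicted score.
    ranked = [t for _, t in sorted(zip(preds, target),
                                   key=lambda pair: pair[0], reverse=True)]
    return sum(1 for t in ranked[:k] if t) / total_relevant


print(recall_at_k([0.2, 0.3, 0.5], [True, False, True], k=2))  # 0.5
```

Only one of the two relevant documents makes it into the top 2, so the recall is 1/2.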
retrieval_fall_out [func]¶
-
torchmetrics.functional.retrieval_fall_out(preds, target, k=None)[source]
Computes the Fall-out (for information retrieval), as explained here. Fall-out is the fraction of non-relevant documents retrieved among all the non-relevant documents.
preds and target should be of the same shape and live on the same device. If no target is True, 0 is returned. target must be either bool or integers and preds must be float, otherwise an error is raised. If you want to measure Fall-out@K, k must be a positive integer.
- Parameters
- Return type
- Returns
a single-value tensor with the fall-out (at k) of the predictions preds w.r.t. the labels target.
Example
>>> from torch import tensor
>>> from torchmetrics.functional import retrieval_fall_out
>>> preds = tensor([0.2, 0.3, 0.5])
>>> target = tensor([True, False, True])
>>> retrieval_fall_out(preds, target, k=2)
tensor(1.)
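Fall-out@K follows the definition given above: non-relevant documents in the top K, divided by all non-relevant documents. The sketch below uses the hypothetical name `fall_out_at_k`; returning 0.0 when there are no non-relevant documents at all is an assumption made here because the ratio is otherwise undefined.

```python
def fall_out_at_k(preds, target, k=None):
    """Non-relevant documents in the top k, divided by all non-relevant ones."""
    if k is None:
        k = len(preds)  # assumption: no k means the whole ranking
    total_non_relevant = sum(1 for t in target if not t)
    if total_non_relevant == 0:
        return 0.0  # assumption of this sketch: undefined ratio maps to 0
    # Order the labels by descending predicted score.
    ranked = [t for _, t in sorted(zip(preds, target),
                                   key=lambda pair: pair[0], reverse=True)]
    return sum(1 for t in ranked[:k] if not t) / total_non_relevant


print(fall_out_at_k([0.2, 0.3, 0.5], [True, False, True], k=2))  # 1.0
```

The single non-relevant document has the second-highest score, so it is retrieved within the top 2 and the fall-out is 1/1.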
retrieval_normalized_dcg [func]¶
-
torchmetrics.functional.retrieval_normalized_dcg(preds, target, k=None)[source]
Computes Normalized Discounted Cumulative Gain (for information retrieval), as explained here.
preds and target should be of the same shape and live on the same device. target must be either bool or integers and preds must be float, otherwise an error is raised.
- Parameters
- Return type
- Returns
a single-value tensor with the nDCG of the predictions preds w.r.t. the labels target.
Example
>>> import torch
>>> from torchmetrics.functional import retrieval_normalized_dcg
>>> preds = torch.tensor([.1, .2, .3, 4, 70])
>>> target = torch.tensor([10, 0, 0, 1, 5])
>>> retrieval_normalized_dcg(preds, target)
tensor(0.6957)
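The value in the example is consistent with the common nDCG formulation that uses a linear gain and a log2 position discount. The sketch below assumes exactly that formulation; the names `dcg` and `normalized_dcg` are illustrative, and the code is not taken from the library.

```python
import math


def dcg(relevances, k=None):
    """Discounted cumulative gain with linear gain and log2 discount."""
    if k is not None:
        relevances = relevances[:k]
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))


def normalized_dcg(preds, target, k=None):
    """DCG of the predicted ranking divided by the DCG of the ideal ranking."""
    # Labels ordered by descending predicted score.
    ranked = [t for _, t in sorted(zip(preds, target),
                                   key=lambda pair: pair[0], reverse=True)]
    # Labels ordered by descending relevance give the best achievable DCG.
    ideal_dcg = dcg(sorted(target, reverse=True), k)
    if ideal_dcg == 0:
        return 0.0  # assumption of this sketch: no relevance means a score of 0
    return dcg(ranked, k) / ideal_dcg


preds = [.1, .2, .3, 4, 70]
target = [10, 0, 0, 1, 5]
print(round(normalized_dcg(preds, target), 4))  # 0.6957
```

The most relevant document (label 10) receives the lowest score, so it is ranked last and discounted by log2(6), which is what pulls the result well below 1.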