capsa_torch.interpret¶
- class capsa_torch.interpret.ClassificationMetrics¶
‘Namespace’ for classification metric evaluation functions
- static accuracy(pred, label)¶
Given predictions and labels for N instances of a classification task with C classes, computes the fraction of predictions that match the label (over non-NaN labels).
- Parameters:
pred (Tensor) – (N,) with entries in [0, C-1]; NaNs are treated as incorrect predictions
label (Tensor) – (N,) with entries in [0, C-1]; NaNs ignored (not counted towards accuracy)
- Return type:
float
- Returns:
accuracy in [0, 1]
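The NaN handling described above can be illustrated with a minimal plain-Python sketch. It uses lists instead of tensors, and `accuracy_sketch` is a hypothetical helper, not the library's implementation. Returning NaN when every label is NaN is an assumption.

```python
import math

def accuracy_sketch(pred, label):
    """Fraction of non-NaN labels that the prediction matches."""
    # Instances with a NaN label are dropped entirely.
    pairs = [(p, l) for p, l in zip(pred, label) if not math.isnan(l)]
    if not pairs:
        return math.nan  # assumption: undefined when every label is NaN
    # NaN never compares equal, so a NaN prediction counts as an error.
    correct = sum(1 for p, l in pairs if p == l)
    return correct / len(pairs)

nan = math.nan
pred = [0.0, 2.0, nan, 1.0]
label = [0.0, 1.0, 1.0, nan]
acc = accuracy_sketch(pred, label)  # one match among three counted labels
```

The last instance has a NaN label and is not counted, while the NaN prediction in the third instance is counted as an error.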
- static errors(pred, label)¶
Given predictions and labels for N instances of a classification task with C classes, computes the number of errors (excluding NaN labels).
- Parameters:
pred (Tensor) – (N,) with entries in [0, C-1]; NaNs are treated as incorrect predictions
label (Tensor) – (N,) with entries in [0, C-1]; NaNs ignored (not counted)
- Return type:
float
- Returns:
number of errors >= 0
- static precision(pred, label)¶
Given predictions and labels for N instances of a binary classification task, computes the fraction of predicted positives that are correct over non-NaN labels. Any nonzero, non-NaN value is treated as positive.
- Parameters:
pred (Tensor) – (N,) with entries in {0, 1} (or nonzero = positive); NaNs ignored
label (Tensor) – (N,) with entries in {0, 1} (or nonzero = positive); NaNs ignored
- Return type:
float
- Returns:
precision in [0, 1]
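As a rough illustration of the semantics above, the following plain-Python sketch computes precision over pairs where neither value is NaN. `precision_sketch` is a hypothetical name, and returning NaN when there are no predicted positives is an assumption, not documented behaviour.

```python
import math

def precision_sketch(pred, label):
    """TP / (TP + FP) over pairs where neither value is NaN."""
    pairs = [(p, l) for p, l in zip(pred, label)
             if not (math.isnan(p) or math.isnan(l))]
    # Any nonzero value counts as a positive.
    labels_of_predicted_pos = [l for p, l in pairs if p != 0]
    if not labels_of_predicted_pos:
        return math.nan  # assumption: undefined without predicted positives
    true_pos = sum(1 for l in labels_of_predicted_pos if l != 0)
    return true_pos / len(labels_of_predicted_pos)

nan = math.nan
pred = [1.0, 1.0, 0.0, 2.0, nan]
label = [1.0, 0.0, 1.0, 1.0, 1.0]
prec = precision_sketch(pred, label)  # predicted positives at indices 0, 1, 3
```

Two of the three predicted positives have a positive label, so the sketch yields 2/3. The value 2.0 is treated as positive because it is nonzero.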
- static positive_predictive_value(pred, label)¶
Given predictions and labels for N instances of a binary classification task, computes the fraction of predicted positives that are correct over non-NaN labels. Any nonzero, non-NaN value is treated as positive.
- Parameters:
pred (Tensor) – (N,) with entries in {0, 1} (or nonzero = positive); NaNs ignored
label (Tensor) – (N,) with entries in {0, 1} (or nonzero = positive); NaNs ignored
- Return type:
float
- Returns:
positive predictive value (precision) in [0, 1]
- static negative_predictive_value(pred, label)¶
Given predictions and labels for N instances of a binary classification task, computes the fraction of predicted negatives that are correct over non-NaN labels. Zero is treated as negative; NaN predictions are excluded.
- Parameters:
pred (Tensor) – (N,) with entries in {0, 1} (zero = negative); NaNs ignored
label (Tensor) – (N,) with entries in {0, 1} (zero = negative); NaNs ignored
- Return type:
float
- Returns:
negative predictive value (NPV) in [0, 1]
- static true_positives(pred, label)¶
Given predictions and labels for N instances of a binary classification task, computes the number of true positives over non-NaN pairs. Any nonzero value is treated as positive.
- Parameters:
pred (Tensor) – (N,) with entries in {0, 1} (or nonzero = positive); NaNs ignored
label (Tensor) – (N,) with entries in {0, 1} (or nonzero = positive); NaNs ignored
- Return type:
float
- Returns:
count of true positives
- static true_negatives(pred, label)¶
Given predictions and labels for N instances of a binary classification task, computes the number of true negatives over non-NaN pairs. Zero is treated as negative.
- Parameters:
pred (Tensor) – (N,) with entries in {0, 1} (zero = negative); NaNs ignored
label (Tensor) – (N,) with entries in {0, 1} (zero = negative); NaNs ignored
- Return type:
float
- Returns:
count of true negatives
- static false_positives(pred, label)¶
Given predictions and labels for N instances of a binary classification task, computes the number of false positives over non-NaN pairs. Any nonzero value is treated as positive and zero as negative.
- Parameters:
pred (Tensor) – (N,) with entries in {0, 1}; NaNs ignored
label (Tensor) – (N,) with entries in {0, 1}; NaNs ignored
- Return type:
float
- Returns:
count of false positives
- static false_negatives(pred, label)¶
Given predictions and labels for N instances of a binary classification task, computes the number of false negatives over non-NaN pairs. Any nonzero value is treated as positive and zero as negative.
- Parameters:
pred (Tensor) – (N,) with entries in {0, 1}; NaNs ignored
label (Tensor) – (N,) with entries in {0, 1}; NaNs ignored
- Return type:
float
- Returns:
count of false negatives
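The four counting functions above (true_positives, true_negatives, false_positives, false_negatives) share the same conventions: pairs containing a NaN are skipped, nonzero means positive, and zero means negative. A minimal plain-Python sketch of those conventions follows. `confusion_counts_sketch` is a hypothetical helper that returns all four counts at once, which the library does not necessarily offer.

```python
import math

def confusion_counts_sketch(pred, label):
    """Return (TP, TN, FP, FN) as floats over non-NaN pairs."""
    tp = tn = fp = fn = 0.0
    for p, l in zip(pred, label):
        if math.isnan(p) or math.isnan(l):
            continue  # pairs containing a NaN are skipped
        pred_pos, label_pos = p != 0, l != 0  # nonzero means positive
        if pred_pos and label_pos:
            tp += 1
        elif not pred_pos and not label_pos:
            tn += 1
        elif pred_pos:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn

nan = math.nan
pred = [1.0, 0.0, 1.0, 0.0, nan, 1.0]
label = [1.0, 0.0, 0.0, 1.0, 1.0, nan]
counts = confusion_counts_sketch(pred, label)
```

In this example the first four pairs contribute one count to each cell, and the last two pairs are skipped because one side is NaN.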
- static specificity(pred, label)¶
Given predictions and labels for N instances of a binary classification task, computes specificity (true negative rate) TN/(TN+FP) over actual negatives where labels are not NaN.
- Parameters:
pred (Tensor) – (N,) with entries in {0, 1}; NaNs are treated as incorrect
label (Tensor) – (N,) with entries in {0, 1}; NaNs ignored
- Return type:
float
- Returns:
specificity in [0, 1]
- static recall(pred, label)¶
Given predictions and labels for N instances of a binary classification task, computes sensitivity/recall (true positive rate) TP/(TP+FN) over actual positives where labels are not NaN.
- Parameters:
pred (Tensor) – (N,) with entries in {0, 1}; NaNs are treated as incorrect
label (Tensor) – (N,) with entries in {0, 1}; NaNs ignored
- Return type:
float
- Returns:
recall (sensitivity) in [0, 1]
- static sensitivity(pred, label)¶
Given predictions and labels for N instances of a binary classification task, computes sensitivity/recall (true positive rate) TP/(TP+FN) over actual positives where labels are not NaN.
- Parameters:
pred (Tensor) – (N,) with entries in {0, 1}; NaNs are treated as incorrect
label (Tensor) – (N,) with entries in {0, 1}; NaNs ignored
- Return type:
float
- Returns:
sensitivity in [0, 1]
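Unlike precision, recall and sensitivity treat a NaN prediction as incorrect rather than ignoring it. The plain-Python sketch below illustrates that difference. `recall_sketch` is a hypothetical name, and returning NaN when there are no actual positives is an assumption.

```python
import math

def recall_sketch(pred, label):
    """TP / (TP + FN) over actual positives with a non-NaN label."""
    # Keep the predictions made on instances whose label is a non-NaN positive.
    preds_on_pos = [p for p, l in zip(pred, label)
                    if not math.isnan(l) and l != 0]
    if not preds_on_pos:
        return math.nan  # assumption: undefined without actual positives
    # A NaN prediction on an actual positive counts as a miss.
    hits = sum(1 for p in preds_on_pos if not math.isnan(p) and p != 0)
    return hits / len(preds_on_pos)

nan = math.nan
pred = [1.0, nan, 0.0, 1.0]
label = [1.0, 1.0, 1.0, 0.0]
rec = recall_sketch(pred, label)  # one hit among three actual positives
```

The NaN prediction at index 1 stays in the denominator, so the result is 1/3 rather than 1/2.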
- static f1_score(pred, label)¶
Given predictions and labels for N instances of a binary classification task, computes the F1 score using the NaN semantics of ClassificationMetrics.precision and ClassificationMetrics.recall.
- Parameters:
pred (Tensor) – (N,) with entries in {0, 1}; NaNs count against recall
label (Tensor) – (N,) with entries in {0, 1}; NaNs ignored
- Return type:
float
- Returns:
F1 score in [0, 1]
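The sketch below shows how the two NaN conventions might combine, assuming the F1 score is the usual harmonic mean 2PR/(P+R) of precision P and recall R. `f1_sketch` is a hypothetical helper in plain Python, and scoring degenerate cases as 0 is an assumption.

```python
import math

def f1_sketch(pred, label):
    """Harmonic mean of precision and recall with their NaN handling."""
    labelled = [(p, l) for p, l in zip(pred, label) if not math.isnan(l)]
    # Precision ignores NaN predictions; recall counts them as misses.
    tp = sum(1 for p, l in labelled
             if not math.isnan(p) and p != 0 and l != 0)
    predicted_pos = sum(1 for p, l in labelled
                        if not math.isnan(p) and p != 0)
    actual_pos = sum(1 for p, l in labelled if l != 0)
    if tp == 0 or predicted_pos == 0 or actual_pos == 0:
        return 0.0  # assumption: degenerate cases score 0
    precision = tp / predicted_pos
    recall = tp / actual_pos
    return 2 * precision * recall / (precision + recall)

nan = math.nan
pred = [1.0, nan, 0.0, 1.0]
label = [1.0, 1.0, 1.0, 0.0]
f1 = f1_sketch(pred, label)  # precision 1/2, recall 1/3
```

Here the NaN prediction lowers recall (1/3) but leaves precision (1/2) untouched, giving an F1 of 0.4.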
- static balanced_accuracy(pred, label)¶
Given predictions and labels for N instances of a binary classification task, computes the balanced accuracy using the NaN semantics of ClassificationMetrics.specificity and ClassificationMetrics.recall.
- Parameters:
pred (Tensor) – (N,) with entries in {0, 1}; NaNs are treated as incorrect
label (Tensor) – (N,) with entries in {0, 1}; NaNs ignored
- Return type:
float
- Returns:
Balanced accuracy in [0, 1]
- static multiclass_reward_score(pred, label, reward_matrix)¶
Given predictions and labels for N instances of a classification task with C classes, alongside a reward_matrix R such that R[i][j] describes the reward for predicting class i when the true class is j, evaluates the average reward per instance. Note: accuracy is equivalent to using the C×C identity matrix as the reward_matrix.
- Parameters:
pred – (N,) with entries in [0, C-1]
label – (N,) with entries in [0, C-1]
reward_matrix – (C, C) of floats. Diagonal entries should be the largest in their row and column
- Return type:
float
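To make the indexing convention R[i][j] (predicted class i, true class j) concrete, here is a small plain-Python sketch. `multiclass_reward_score_sketch` and both example matrices are illustrative assumptions, not taken from the library.

```python
def multiclass_reward_score_sketch(pred, label, reward_matrix):
    """Average of reward_matrix[predicted][true] over all instances."""
    rewards = [reward_matrix[p][l] for p, l in zip(pred, label)]
    return sum(rewards) / len(rewards)

pred = [0, 1, 2, 2]
label = [0, 1, 1, 2]

# With the identity matrix, the score reduces to plain accuracy.
identity = [[1.0, 0.0, 0.0],
            [0.0, 1.0, 0.0],
            [0.0, 0.0, 1.0]]
as_accuracy = multiclass_reward_score_sketch(pred, label, identity)

# A custom matrix can penalise specific confusions more heavily,
# here predicting class 2 when the true class is 1.
custom = [[1.0, -1.0, 0.0],
          [-0.5, 1.0, 0.0],
          [0.0, -2.0, 1.0]]
weighted = multiclass_reward_score_sketch(pred, label, custom)
```

Three of four predictions are correct, so the identity matrix gives 0.75. Under the custom matrix the single error costs 2.0, pulling the average down to 0.25.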
- class capsa_torch.interpret.RegressionMetrics¶
‘Namespace’ for regression metric evaluation functions
- static mean_absolute_error(pred, label, **kwargs)¶
Given predictions and labels for N instances of a regression task with D dimensions, computes the mean absolute error.
- Parameters:
pred (Tensor) – real-valued predictions; NaNs are treated as incorrect predictions
label (Tensor) – real-valued labels; NaNs ignored (not counted towards the error)
dim – the dimension or dimensions to reduce. Default: None, all dimensions are reduced.
keepdim – whether the output tensor has dim retained or not. Default: False.
- Return type:
float
- Returns:
mean absolute error
- static mae(pred, label, **kwargs)¶
Given predictions and labels for N instances of a regression task with D dimensions, computes the mean absolute error.
- Parameters:
pred (Tensor) – real-valued predictions; NaNs are treated as incorrect predictions
label (Tensor) – real-valued labels; NaNs ignored (not counted towards the error)
dim – the dimension or dimensions to reduce. Default: None, all dimensions are reduced.
keepdim – whether the output tensor has dim retained or not. Default: False.
- Return type:
float
- Returns:
mean absolute error
- static root_mean_square_error(pred, label, **kwargs)¶
Given predictions and labels for N instances of a regression task with D dimensions, computes the root mean square error.
- Parameters:
pred (Tensor) – real-valued predictions; NaNs are treated as incorrect predictions
label (Tensor) – real-valued labels; NaNs ignored (not counted towards the error)
dim – the dimension or dimensions to reduce. Default: None, all dimensions are reduced.
keepdim – whether the output tensor has dim retained or not. Default: False.
- Return type:
float
- Returns:
root mean square error
- static rmse(pred, label, **kwargs)¶
Given predictions and labels for N instances of a regression task with D dimensions, computes the root mean square error.
- Parameters:
pred (Tensor) – real-valued predictions; NaNs are treated as incorrect predictions
label (Tensor) – real-valued labels; NaNs ignored (not counted towards the error)
dim – the dimension or dimensions to reduce. Default: None, all dimensions are reduced.
keepdim – whether the output tensor has dim retained or not. Default: False.
- Return type:
float
- Returns:
root mean square error
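For reference, the two regression metrics can be sketched in plain Python for the fully reduced case (dim=None). `regression_errors_sketch` is a hypothetical helper. It only models the documented skipping of NaN labels and assumes all predictions are finite, because how NaN predictions are penalised is not specified here.

```python
import math

def regression_errors_sketch(pred, label):
    """Return (MAE, RMSE) over instances whose label is not NaN."""
    # Assumption: predictions are finite; only NaN labels are skipped.
    diffs = [p - l for p, l in zip(pred, label) if not math.isnan(l)]
    mae = sum(abs(d) for d in diffs) / len(diffs)
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return mae, rmse

nan = math.nan
pred = [1.0, 2.0, 4.0, 0.0]
label = [1.0, 4.0, 1.0, nan]
mae, rmse = regression_errors_sketch(pred, label)  # residuals 0, -2, 3
```

With residuals 0, -2 and 3, the MAE is 5/3 and the RMSE is sqrt(13/3). The RMSE is larger because squaring weights the biggest residual more heavily.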
- capsa_torch.interpret.aleatoric_misclassification_prob_binary(y_pred, y_sigma, risk_threshold=0.0)¶
Compute the misclassification probability using the CDF interpretation from y_pred, y_sigma, and risk_threshold values for multi-class binary label tasks (multiple labels can be positive). The CDF risk interpretation is defined as the probability of sampling from N(y_pred, y_sigma) across the risk_threshold (misclassifying).
- Parameters:
y_pred – The output LOGITS from model prediction
y_sigma – The risk values from model prediction
risk_threshold (Tensor | float) – Threshold for each class in LOGIT space (default: 0.0)
- Return type:
Tensor
- Returns:
The computed prob. of misclassification risk tensor
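The CDF interpretation above can be sketched with the standard normal CDF: the probability that a sample from N(y_pred, y_sigma) lands on the other side of the threshold is Φ(-|y_pred - threshold| / y_sigma). The plain-Python sketch below assumes y_sigma is a strictly positive standard deviation (not a variance), and `crossing_prob_sketch` is a hypothetical name rather than the library's implementation.

```python
import math

def normal_cdf(z):
    """Standard normal CDF, expressed through the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def crossing_prob_sketch(y_pred, y_sigma, risk_threshold=0.0):
    """Per-logit probability of sampling across the threshold."""
    probs = []
    for mu, sigma in zip(y_pred, y_sigma):
        # Distance from the logit to the decision threshold.
        margin = abs(mu - risk_threshold)
        # Assumption: sigma is a standard deviation and sigma > 0.
        probs.append(normal_cdf(-margin / sigma))
    return probs

y_pred = [2.0, 0.0, -1.0]
y_sigma = [1.0, 1.0, 0.5]
probs = crossing_prob_sketch(y_pred, y_sigma)
```

A logit sitting exactly on the threshold has a crossing probability of 0.5, while logits two standard deviations away (the first and last entries) both give roughly 0.023.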
- capsa_torch.interpret.aleatoric_misclassification_prob_categorical(y_pred, y_sigma, dim=-1)¶
Compute the aleatoric misclassification probability for categorical tasks (only one positive label per instance).
- Parameters:
y_pred (Tensor) – The output LOGITS from model prediction
y_sigma (Tensor) – The risk values from model prediction
dim (int) – The dimension which corresponds to the logits over the classes (default: -1)
- Return type:
Tensor
- Returns:
The computed aleatoric probability of misclassification risk tensor
- capsa_torch.interpret.epistemic_misclassification_prob_binary(y_pred, y_sigma, risk_threshold=0.0)¶
Compute the misclassification probability using the CDF interpretation from y_pred, y_sigma, and risk_threshold values for multi-class binary label tasks (multiple labels can be positive). The CDF risk interpretation is defined as the probability of sampling from N(y_pred, y_sigma) across the risk_threshold (misclassifying).
- Parameters:
y_pred – The output LOGITS from model prediction
y_sigma – The risk values from model prediction
risk_threshold (Tensor | float) – Threshold for each class in LOGIT space (default: 0.0)
- Return type:
Tensor
- Returns:
The computed prob. of misclassification risk tensor
- capsa_torch.interpret.epistemic_misclassification_prob_categorical(y_pred, y_sigma, dim=-1, num_points_integral=15)¶
Compute the misclassification probability using the CDF interpretation from y_pred and y_sigma for categorical tasks (only one positive label per instance). The CDF risk interpretation is defined as the probability that the largest element of y_pred (along dim) does NOT correspond to the true largest value of y ~ N(y_pred, y_sigma). It should go to 0 as the elements of y_sigma go to 0.
- Parameters:
y_pred (Tensor) – The output LOGITS from model prediction
y_sigma (Tensor) – The risk values from model prediction
num_points_integral (int) – The number of points over which to numerically calculate the integral (default: 15)
dim (int) – The dimension which corresponds to the logits over the classes (default: -1)
- Return type:
Tensor
- Returns:
The computed epistemic probability of misclassification risk tensor
- capsa_torch.interpret.epistemic_uncertainty_categorical(y_pred, y_sigma, dim=-1, num_points_sample=15)¶
Compute the epistemic uncertainty for categorical tasks (only one positive label per instance). It goes to 0 as the elements of y_sigma go to 0.
- Parameters:
y_pred (Tensor) – The output LOGITS from model prediction
y_sigma (Tensor) – The risk values from model prediction
dim (int) – The dimension which corresponds to the logits over the classes (default: -1)
- Return type:
Tensor
- Returns:
The computed epistemic risk tensor
- capsa_torch.interpret.misclassification_prob_binary(y_pred, y_sigma, risk_threshold=0.0, num_points_integral=15)¶
Compute the misclassification probability using the CDF interpretation from y_pred, y_sigma, and risk_threshold values for multi-class binary label tasks (multiple labels can be positive). The CDF risk interpretation is defined as the probability of sampling from N(y_pred, y_sigma) across the risk_threshold (misclassifying).
- Parameters:
y_pred – The output LOGITS from model prediction
y_sigma – The risk values from model prediction
risk_threshold (Tensor | float) – Threshold for each class in LOGIT space (default: 0.0)
num_points_integral (int) – The number of points over which to numerically calculate the integral (default: 15)
- Return type:
Tensor
- Returns:
The computed prob. of misclassification risk tensor
- capsa_torch.interpret.misclassification_prob_categorical(y_pred, y_sigma, dim=-1, num_points_integral=15, class_preds=None)¶
Compute the misclassification probability for categorical tasks (only one positive label per instance). The misclassification risk is defined as the complement of the maximum expected softmax value of y ~ N(y_pred, y_sigma). As y_sigma goes to 0, it converges to the softmax uncertainty (complement of the maximum expected softmax value of y_pred).
- Parameters:
y_pred (Tensor) – The output logits from the wrapped model
y_sigma (Tensor) – The risk values from the wrapped model
num_points_integral (int) – The number of points over which to numerically calculate the integral (default: 15)
dim (int) – The dimension which corresponds to the logits over the classes (default: -1)
class_preds (Tensor | int | None) – The index (or indices) of the predicted classes. If None, will default to the highest probability label for each input (y_pred.argmax(dim)). (default: None)
- Return type:
Tensor
- Returns:
Misclassification probability tensor with shape y_pred.shape but without the class dimension
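The definition above (one minus the largest expected softmax value of y ~ N(y_pred, y_sigma)) can be approximated for a single instance as follows. This sketch swaps the library's numerical integration for plain Monte Carlo sampling, treats y_sigma as a per-logit standard deviation, and uses hypothetical names (`softmax`, `misclassification_prob_sketch`, `num_samples`, `seed`). It is an illustration of the definition, not the library's algorithm.

```python
import math
import random

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def misclassification_prob_sketch(y_pred, y_sigma, num_samples=2000, seed=0):
    """1 - max_c E[softmax(y)_c] with y ~ N(y_pred, y_sigma), by sampling."""
    rng = random.Random(seed)
    expected = [0.0] * len(y_pred)
    for _ in range(num_samples):
        # Assumption: independent Gaussian noise per logit, sigma as std dev.
        sample = [rng.gauss(mu, s) for mu, s in zip(y_pred, y_sigma)]
        for c, prob in enumerate(softmax(sample)):
            expected[c] += prob / num_samples
    return 1.0 - max(expected)

y_pred = [3.0, 0.0, 0.0]
sharp = misclassification_prob_sketch(y_pred, [0.0, 0.0, 0.0])
noisy = misclassification_prob_sketch(y_pred, [2.0, 2.0, 2.0])
softmax_uncertainty = 1.0 - max(softmax(y_pred))
```

With all sigmas at zero every sample equals y_pred, so `sharp` matches `softmax_uncertainty` up to floating-point error, which mirrors the limiting behaviour described above.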
- capsa_torch.interpret.top_percent_risk_cut_accuracy(outputs, risks, gt, risk_thresholds)¶
- Return type:
torch.Tensor
- capsa_torch.interpret.top_percent_risk_cut_metric(outputs, risks, gt, risk_thresholds, metric_fn=<function ClassificationMetrics.accuracy>)¶
For multi-class classification problems. Returns the value of metric_fn (accuracy by default) on subsets of the dataset, cutting off the top x% highest-risk inputs.
- Parameters:
outputs (Tensor) – A tensor of output labels; typically integers in the range [1, n_classes]. Must be of shape [len(dataset)].
risks (Tensor) – Risk values corresponding to outputs; higher risk outputs are cut first. These should be produced with misclassification_prob_categorical(). Must be of shape [len(dataset)] with risks[i] corresponding to outputs[i].
gt (Tensor) – Ground truth target labels; typically integers in the range [1, n_classes], but can be anything that supports equality with the outputs. Must be of shape [len(dataset)] with gt[i] corresponding to outputs[i].
risk_thresholds (int | float | Sequence[float] | torch.Tensor) – If float(s), must lie in [0,1) and represent the risk quantile(s) at which to cut outputs. The accuracy will be reported only for outputs with risk below this quantile. If int n, behaves the same as passing torch.linspace(1 / (n + 1), 1., n) as a list.
metric_fn (Callable[[Tensor, Tensor], float]) – A function that computes a metric given predictions p and ground truth gt (default: ClassificationMetrics.accuracy)
- Return type:
tuple[torch.Tensor, torch.Tensor]
- Returns:
Value of the metric function for each cut percentage, as a tensor.
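The idea of a risk cut can be sketched in plain Python: rank instances by risk, drop the riskiest ones, and score what remains. `accuracy_after_risk_cut` and its `keep_fraction` argument are hypothetical. How the library maps each entry of risk_thresholds to the retained subset is not restated here, so this only illustrates the general mechanism.

```python
def accuracy_after_risk_cut(outputs, risks, gt, keep_fraction):
    """Accuracy on the lowest-risk `keep_fraction` of the dataset."""
    # Order instance indices from lowest to highest risk.
    order = sorted(range(len(risks)), key=lambda i: risks[i])
    # Assumption: always keep at least one instance.
    n_keep = max(1, round(keep_fraction * len(order)))
    kept = order[:n_keep]  # the highest-risk instances are cut
    return sum(1 for i in kept if outputs[i] == gt[i]) / n_keep

outputs = [1, 2, 3, 1, 2]
gt = [1, 2, 1, 1, 3]
risks = [0.1, 0.2, 0.9, 0.3, 0.8]
curve = [accuracy_after_risk_cut(outputs, risks, gt, f) for f in (1.0, 0.6)]
```

In this toy data the two errors carry the highest risk values, so accuracy rises from 0.6 on the full set to 1.0 once the riskiest 40% are removed. That is the pattern a well-calibrated risk estimate is expected to produce.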