Sigmoid function in PrecisionRecallCurve leads to information loss #1526
Comments
Hi! Thanks for your contribution! Great first issue!
Hi @djdameln, thanks for reporting this issue. Sorry for not getting back to you sooner. The idea would be to guard the input formatting behind a flag:

```python
if format_input:
    if not torch.all((preds >= 0) * (preds <= 1)):
        preds = preds.sigmoid()
    target = target.long()
```

thus in your case simply adding `format_input=False` would skip the re-mapping. On a note, you are completely right that we introduced this as our standard formatting of input to standardize all possible input for further processing. This was especially related to the inclusion of the new …
@SkafteNicki Thanks for clarifying. The proposed solution should solve our problem!
🐛 Bug
Hello, first of all, thank you for the awesome library! I am a maintainer of the Anomalib library, and we are using TorchMetrics extensively throughout our code base to evaluate our models.
The most recent version of TorchMetrics introduced some changes to the `PrecisionRecallCurve` metric, which are causing some problems in one of our components. The problems are related to the re-mapping of the prediction values to the [0, 1] range by applying a sigmoid function.

Some context
The goal of the models in our library is to detect anomalous samples in a dataset that contains both normal and anomalous samples. The task is similar to a classical binary classification problem, but instead of generating a class label and a confidence score, our models generate an anomaly score, which quantifies the distance of the sample to the distribution of normal samples seen during training. The range of possible anomaly score values is unbounded and may differ widely between models and/or datasets, which makes it tricky to set a good threshold for mapping the raw anomaly scores to a binary class label (normal vs. anomalous). This is why we apply an adaptive thresholding mechanism as a post-processing step. The adaptive threshold mechanism returns the threshold value that maximizes the F1 score over the validation set.
Our adaptive thresholding class inherits from TorchMetrics' `PrecisionRecallCurve` class. After TorchMetrics computes the precision and recall values, our class computes the F1 scores for the range of precision and recall values, and finally returns the threshold value that corresponds to the highest observed F1 score.
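Roughly, it looks like this (a simplified sketch rather than our exact implementation; the class name, the small eps, and the plain binary `PrecisionRecallCurve` constructor are illustrative):

```python
import torch
from torchmetrics import PrecisionRecallCurve


class F1AdaptiveThreshold(PrecisionRecallCurve):
    """Simplified sketch of an F1-optimal adaptive threshold metric."""

    def compute(self) -> torch.Tensor:
        # Let TorchMetrics compute the precision/recall pairs and the
        # corresponding threshold candidates.
        precision, recall, thresholds = super().compute()
        # precision/recall contain one more entry than thresholds (the final
        # (precision=1, recall=0) point), so drop it to keep indices aligned.
        f1_scores = 2 * precision[:-1] * recall[:-1] / (precision[:-1] + recall[:-1] + 1e-10)
        # Return the threshold that maximizes F1 on the accumulated data.
        return thresholds[torch.argmax(f1_scores)]
```

Because `thresholds` used to live in the same domain as the raw anomaly scores, the returned value could be applied directly to binarize new predictions.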
The problem

In the latest version of the `PrecisionRecallCurve` metric, the `update` method now re-maps the predictions to the [0, 1] range by applying a sigmoid function. As a result, the `thresholds` variable returned by `compute` is no longer in the same domain as the original predictions, and the values are not usable for our purpose of finding the optimal threshold value. In addition, the sigmoid function squeezes the higher and lower values, which leads to lower resolution at the extremes of the input range and in some cases even information loss.
To Reproduce
Here's an example to illustrate the problem. Let's say we have a set of binary targets and a set of model predictions in the range [12, 17]. Previously, the `PrecisionRecallCurve` metric would return the values of precision and recall for the different thresholds that occur naturally in the data.

v0.10.3
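A minimal example with made-up scores and labels (illustrative, not our exact data):

```python
import torch
from torchmetrics import PrecisionRecallCurve  # torchmetrics==0.10.3

preds = torch.tensor([12.0, 13.0, 14.0, 15.0, 16.0, 17.0])
target = torch.tensor([0, 0, 1, 0, 1, 1])

precision, recall, thresholds = PrecisionRecallCurve()(preds, target)
print(thresholds)
# On 0.10.3 the threshold candidates are drawn from the prediction values
# themselves, so they stay in the original [12, 17] domain and can be
# searched directly for the F1-optimal value.
```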
Given these outputs it is straightforward to obtain the F1 scores for the different threshold values and use this to find the optimal threshold that maximizes F1.
After the recent changes, the predictions are now re-mapped by the sigmoid function. While we can still compute the F1 scores, we can no longer find the value of the threshold that yields the highest F1 score, because the values of the `thresholds` variable are no longer in the same domain as the original predictions.

v0.11.1
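The same made-up data against the new binary interface (again illustrative):

```python
import torch
from torchmetrics.classification import BinaryPrecisionRecallCurve  # torchmetrics==0.11.1

preds = torch.tensor([12.0, 13.0, 14.0, 15.0, 16.0, 17.0])
target = torch.tensor([0, 0, 1, 0, 1, 1])

precision, recall, thresholds = BinaryPrecisionRecallCurve()(preds, target)
print(thresholds)
# preds are first passed through torch.sigmoid because they fall outside [0, 1];
# sigmoid(12) ≈ 0.999994 up to sigmoid(17) ≈ 0.9999999, so all thresholds print
# as 1.0000 and no longer correspond to the original anomaly scores.
```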
Note that the elements of the `thresholds` variable all appear as 1.0000 because the numerical differences between the threshold candidates are minimized due to the squeezing effect of the sigmoid.

It gets even worse when we increase the absolute values of the predictions to [22, 27]. The output of the sigmoid now evaluates to 1.0 for all predictions due to rounding, and the metric is not able to compute any meaningful precision and recall values.

v0.11.1
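With the larger scores, float32 rounding saturates the sigmoid completely (illustrative):

```python
import torch
from torchmetrics.classification import BinaryPrecisionRecallCurve  # torchmetrics==0.11.1

preds = torch.tensor([22.0, 23.0, 24.0, 25.0, 26.0, 27.0])
target = torch.tensor([0, 0, 1, 0, 1, 1])

print(torch.sigmoid(preds))  # tensor([1., 1., 1., 1., 1., 1.]) -- saturated in float32

precision, recall, thresholds = BinaryPrecisionRecallCurve()(preds, target)
# Every prediction is now exactly 1.0, so only a single threshold candidate
# remains and the curve can no longer separate the samples at all.
```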
I guess this change was made to accommodate classical binary classification problems, where the predictions are generally confidence scores in the [0, 1] range, but I feel this is too restrictive for other classes of problems. Mathematically there is no reason why the precision-recall curve cannot be computed from predictions that fall outside this range, since the curve only depends on how the thresholds order the predictions.
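For comparison (a side check, not something from our code base), scikit-learn's `precision_recall_curve` accepts non-thresholded scores and keeps the thresholds in the score domain:

```python
import numpy as np
from sklearn.metrics import precision_recall_curve

target = np.array([0, 0, 1, 0, 1, 1])
scores = np.array([12.0, 13.0, 14.0, 15.0, 16.0, 17.0])

precision, recall, thresholds = precision_recall_curve(target, scores)
print(thresholds)  # e.g. [14. 15. 16. 17.] -- raw score values, no re-mapping
```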
Expected behavior
The re-mapping of the prediction values to [0,1] by applying a sigmoid function should be optional.
Environment