[Bug] torchmetrics.functional.classification.binary_auroc gives wrong results when logits are large #2819
🐛 Bug
torchmetrics.functional.classification.binary_auroc always gives 0.5 when all logits are large. This seems to be caused by a floating-point precision error in sigmoid.
To Reproduce
Code sample
Output:
Expected behavior
The AUROC of the above example should be 0.9286, as computed by sklearn.
Output:
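The sklearn snippet and its output were also lost. A hypothetical equivalent, with illustrative logits and labels (my own choices: 7 positives, 2 negatives, 13 of 14 pairs ranked correctly, so the true AUROC is 13/14):

```python
from sklearn.metrics import roc_auc_score

target = [0, 1, 0, 1, 1, 1, 1, 1, 1]
logits = [20.0, 21.0, 21.5, 22.0, 23.0, 24.0, 25.0, 26.0, 27.0]

# roc_auc_score ranks the raw scores directly (no sigmoid is applied),
# so large logits lose no precision.
print(roc_auc_score(target, logits))  # 0.9285714285714286
```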
Environment
Additional context
This appears to be a problem of floating-point precision with sigmoid at line 185, in the function _binary_precision_recall_curve_format in file torchmetrics/src/torchmetrics/functional/classification/precision_recall_curve.py. I extracted all the necessary functions and made a miniature binary_auroc function that uses exactly the same algorithm (it works for the above example; I did not test other examples):
Output:
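The extracted miniature function and its output did not survive. As a stand-in (my own sketch, not the author's extracted code), the same failure can be demonstrated with only the standard library, using the rank definition of AUROC and a float32 sigmoid simulated by round-tripping through struct:

```python
import math
import struct

def to_float32(x):
    """Round a Python float (IEEE double) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def pairwise_auroc(scores, labels):
    """AUROC via its rank interpretation: the fraction of
    (positive, negative) pairs ranked correctly, ties counting half.
    This equals the area under the ROC curve."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Illustrative logits (not from the original report), all above the
# float32 saturation point of ~16.64.
logits = [20.0, 21.0, 21.5, 22.0, 23.0, 24.0, 25.0, 26.0, 27.0]
labels = [0, 1, 0, 1, 1, 1, 1, 1, 1]

# A float32 sigmoid collapses every score to exactly 1.0 ...
probs32 = [to_float32(1.0 / (1.0 + math.exp(-x))) for x in logits]
assert all(p == 1.0 for p in probs32)

# ... so every (positive, negative) pair is a tie and AUROC degenerates:
print(pairwise_auroc(probs32, labels))  # 0.5
# while ranking the raw logits gives the true AUROC of 13/14:
print(pairwise_auroc(logits, labels))   # 0.9285714285714286

# In double precision the cliff sits near 53 * log(2), about 36.74:
print(1.0 / (1.0 + math.exp(-36.0)))  # 0.9999999999999998
print(1.0 / (1.0 + math.exp(-37.0)))  # 1.0
```

Because every saturated pair is a tie, any tie-aware AUROC implementation degenerates to 0.5 on such inputs.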
preds = preds.sigmoid() converts all logits to 1, as if all the logits were the same, which is not the case. The magnitude of a logit must be less than 36.74 for double (or 16.64 for float32) to avoid being converted to exactly 1.
Suggested fix
It's probably a good idea to scale the raw logits before sigmoid, something like below:
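The suggested snippet itself was lost in extraction. One plausible rescaling, sketched below as an assumption (not necessarily what torchmetrics merged), divides the logits by their largest magnitude so every sigmoid input lies in [-1, 1], where rounding can never reach exactly 0 or 1; because AUROC depends only on the ranking, and division by a positive constant is strictly increasing, the metric is unchanged:

```python
import math

def scaled_sigmoid(logits):
    # Hypothetical fix: rescale into [-1, 1] before sigmoid, so outputs
    # stay strictly inside (0.26, 0.74) and can never round to 0 or 1.
    # Division by a positive constant preserves the ranking, which is
    # all a rank-based metric like AUROC uses.
    scale = max(1.0, max(abs(x) for x in logits))
    return [1.0 / (1.0 + math.exp(-x / scale)) for x in logits]

logits = [20.0, 21.0, 21.5, 22.0, 23.0, 24.0, 25.0, 26.0, 27.0]
probs = scaled_sigmoid(logits)  # strictly increasing, all below 1.0
```

In tensor form this would look something like `preds = (preds / preds.abs().max().clamp(min=1)).sigmoid()`.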
All functions that apply sigmoid to raw logits will need such a fix.