-
-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ENH] ROC analysis: show thresholds #3172
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I enable "Show convex ROC curves" and hover over the graph I get:
-------------------------- AttributeError Exception ---------------------------
Traceback (most recent call last):
File "/Users/aleserjavec/workspace/orange3/Orange/widgets/evaluate/owrocanalysis.py", line 614, in _on_mouse_moved
pts = sp.pointsAt(act_pos)
AttributeError: 'PlotCurveItem' object has no attribute 'pointsAt'
-------------------------------------------------------------------------------
@@ -561,6 +565,7 @@ def _setup_plot(self): | |||
graphics = curve.avg_threshold() | |||
self.plot.addItem(graphics.curve_item) | |||
self.plot.addItem(graphics.confint_item) | |||
graphics.curve_item.scene().sigMouseMoved.connect(self._on_mouse_moved) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Connect the signal once in the __init__
method:
self.plotview.scene().sigMouseMoved.connect(self._on_mouse_moved)
|
||
thresh = curve_pts.thresholds[idx_closest] | ||
if not numpy.isnan(thresh): | ||
item = pg.TextItem(text="{:.3f}".format(thresh), color=(0, 0, 0)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not QToolTip.showText
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mainly because I am new to QT and was just following a few resources online (I've changed it now).
I've fixed most of the things that you've mentioned and some other bugs.
Thank you for the helpful comments!
Codecov Report
@@ Coverage Diff @@
## master #3172 +/- ##
==========================================
+ Coverage 82.74% 82.76% +0.01%
==========================================
Files 344 344
Lines 59304 59397 +93
==========================================
+ Hits 49073 49158 +85
- Misses 10231 10239 +8 |
def _on_mouse_moved(self, pos): | ||
curves = [crv for crv in self.plot.curves if isinstance(crv, pg.PlotCurveItem)] | ||
for i in range(len(self.selected_classifiers)): | ||
sp = curves[i].childItems()[0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This assumes implicitly that the order of the curves in the self.plot.curves
corresponds to the self.selected_classifiers
. This seems fragile.
You can use:
curves = [(clsf_idx, self.plot_curves(self.target_index, clsf_idx), self.curve_data(self.target_index, clsf_idx))
for clsf_idx in self.selected_classifiers] # type: List[Tuple[int, plot_curves, ROCData]]
to retrieve triples of (classifier_index, plot_curves, curve_data). Use plot_curves
similarly as curve_data
in ROC averaging dispatch below, but use
plot_curves.merge().curve_item
plot_curves.avg_vertical().curve_item
plot_curves.avg_threshold().curve_item
to get the pg.PlotCurveItem
instances
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My bad - should be fixed now. I've also fixed some other bugs:
- if mouse was hovered over a point on curve for some classifier and then the same classifier's curve would be hidden and mouse hovered over the same point as before, the cache would not reset and therefore an invalid threshold would be shown,
- if tooltip was not being shown and the cache point was "valid" (i. e. same point as previously shown), a tooltip with message "" would be created, which resulted in hiding a tooltip.
I have also encountered the following exception (just once), which I can not seem to be able to reproduce anymore:
--------------------------- RuntimeError Exception ----------------------------
Traceback (most recent call last):
File "/home/matej/.local/lib/python3.5/site-packages/pyqtgraph/GraphicsScene/GraphicsScene.py", line 162, in mouseMoveEvent
self.sendHoverEvents(ev)
File "/home/matej/.local/lib/python3.5/site-packages/pyqtgraph/GraphicsScene/GraphicsScene.py", line 228, in sendHoverEvents
items = self.itemsNearEvent(event, hoverable=True)
File "/home/matej/.local/lib/python3.5/site-packages/pyqtgraph/GraphicsScene/GraphicsScene.py", line 424, in itemsNearEvent
if hoverable and not hasattr(item, 'hoverEvent'):
RuntimeError: wrapped C/C++ object of type ScatterPlotItem has been deleted
When browsing through old issues, I've found out that there's been some instances of similar issues (#1316), though I can't say I completely understand your fix for this.
# Find closest point on curve and display it | ||
idx_closest = numpy.argmin( | ||
[numpy.linalg.norm(mouse_pt - [curve_pts.fpr[idx], curve_pts.tpr[idx]]) | ||
for idx in range(len(curve_pts.thresholds))]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This loop could be vectorized
roc_points = numpy.column_stack((curve_pts.fpr, curve_pts.tpr))
query_pt = numpy.array([[mouse_pt.x(), mouse_pt.y()]])
diff = roc_points - query_pt
idx_closest = numpy.argmin(numpy.linalg.norm(diff, axis=1))
(... although a slightly moot point as the sp.pointsAt
call above already runs through all the points in a python loop)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True - fixed.
I've also made another minor change: now QCursor.pos() is directly used everywhere instead of using it directly in some cases and in the other cases saving it and then using the variable (just a consistency fix).
I think the hit radius is sufficient as it is. However it might be better to collect points from all the curves under the mouse and display a list of all thresholds with classifier names. Right now if two or more curves are close to each other it is not clear to which does the shown threshold belong to.
I would use QTest.mouseMove targeting the QGraphicsView's viewport ( |
I've improved the tooltips so that it is clear, which classifier a threshold belongs to. Also, the tooltip now shows thresholds for all overlapping points. I'll see what I can do about the tests. Edit: I'm having a bit of trouble creating the tests - so far this is what I've got: # setup
data_in = Orange.data.Table("titanic")
res = Orange.evaluation.TestOnTrainingData(
data=data_in,
learners=[Orange.classification.KNNLearner(),
Orange.classification.LogisticRegressionLearner()],
store_data=True
)
self.send_signal(self.widget.Inputs.evaluation_results, res)
self.widget.roc_averaging = OWROCAnalysis.Merge
self.widget.target_index = 0
self.widget.selected_classifiers = [0, 1]
self.widget._replot()
# curve data for classifier 0
curve = self.widget.plot_curves(self.widget.target_index, 0)
curve_merge = curve.merge()
QTest.mouseMove(self.widget.plotview.viewport(), curve_merge.curve_item.mapToScene(0, 0))
# example assertion
self.assertTrue(QToolTip.isVisible()) The |
Maybe QTBUG-5232. I swear I saw this when writing https://github.com/biolab/orange3/pull/3014/files#diff-da80849ceff100f955c9ec05920ba089R138. If I use that function and modify your test like this: ...
# curve data for classifier 0
curve = self.widget.plot_curves(self.widget.target_index, 0)
curve_merge = curve.merge()
view = self.widget.plotview
item = curve_merge.curve_item # type: pg.PlotCurveItem
xs, ys = item.getData()
x, y = xs[len(xs) // 2], ys[len(ys) // 2]
pos = item.mapToScene(x, y)
pos = view.mapFromScene(pos)
mouseMove(view.viewport(), pos)
# example assertion
self.assertTrue(QToolTip.isVisible()) then the test passes. Can you rebase on current master and use the mouseMove function from |
* fix crash when 'Show convex ROC curves' is enabled * register mouse event only once * use QToolTip instead of TextItem for displaying thresholds * fix newly caused pylint warning
* removed dangerous retrieval of ROC curve data * changed tooltip caching so that it works smoother * fixed bug where cache value was not correct and would not reset properly
…inor consistency fix
Thank you, that does fix the issue. I've stumbled upon another issue though - the points which I'm trying to move the mouse to are not being mapped correctly. For example, if I try to simulate a mouse move to (x=0.0, y=0.0) and (x=1.0, y=1.0), it produces the same threshold/tooltip. I've noticed that these two points get mapped to very similar positions inside the test ( |
Try adding
somewhere before mapping the points. |
7438477
to
099ee74
Compare
Thanks once again. 😃 |
@@ -336,6 +336,7 @@ def __init__(self): | |||
self._plot_curves = {} | |||
self._rocch = None | |||
self._perf_line = None | |||
self._tooltip_cache = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not understand why the tooltip cache is necessary. Can you add a comment explaining the reasoning for its use?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My reasoning behind it was that instead of checking all points of a curve to find the closest one, it would first check the tooltip cache, which contains at most 1 threshold (point) per (selected) classifier.
It's hard to say if it's worth the additional 20-ish lines of code though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But it already checks all points in pts = sp.pointsAt(act_pos)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, but it also does it again later (although it is vectorized): idx_closest = numpy.argmin(numpy.linalg.norm(diff, axis=1))
.
Considering your point, it does seem that the performance gain from tooltip cache might not be as big as I thought. Should I remove it and make the code clearer that way?
Edit: also, the tooltip cache provides a convenient way to test the contents of tooltips - the first alternative way I can think of would be to test the entire tooltip text, e.g. assertEqual("Tooltips:\n(#1) 0.400", QTooltip.text())
.
It does not matter really, you can leave it. |
Issue
Implements #3057.
Description of changes
When mouse is put over a point on ROC curve, the threshold that was used for calculation now appears. If threshold value is nan, the value does not get shown. Also, the displaying of thresholds is currently not implemented when showing individual curves.
My questions:
General comments are also welcome.
Includes