[FIX] Fitter: Fix infinite recursion in __getattr__ #1977
Conversation
Codecov Report

```diff
@@            Coverage Diff             @@
##           master    #1977       +/-   ##
===========================================
+ Coverage   70.22%   89.43%    +19.21%
===========================================
  Files         343       90       -253
  Lines       54092     9190     -44902
===========================================
- Hits        37984     8219     -29765
+ Misses      16108      971     -15137
```

Continue to review the full report at Codecov.
I guess the comment in
Force-pushed from 32963d2 to ac6761c
Wow, thank you very much. I don't think I would've figured that out by myself. That makes the fix much cleaner :)
Force-pushed from ac6761c to dac6438
So I noticed this was failing in some cases, so I tried to make things a bit more straightforward. Note: This is based on #1968 because I made some changes to the learner widget tests that enable testing fitters, which turned out to be useful here as well. I've made two rather big changes from last time:
This also fixes the error that @lanzagar pointed out last week.
Orange/classification/tree.py (Outdated)
```diff
@@ -64,6 +64,8 @@ def __init__(
         self.min_samples_split = min_samples_split
         self.sufficient_majority = sufficient_majority
         self.max_depth = max_depth
+        self.params = {k: v for k, v in vars().items()
+                       if k not in ('args', 'kwargs')}
```
What is in `vars()` -- except globals, which we don't like anyway?! I don't like this blind copying.
This was an attempt to make the Learner API more intuitive. Currently, Orange learners have their settings (or parameters) saved on the objects themselves, whereas sklearn learners have a special `params` field that stores all that data. This field is used heavily in the tests and only rarely in any other code, which makes dealing with the tests a bit confusing. It would probably be better to switch the field to be protected and then do some magic with `__getattr__` to make those fields accessible to tests. If this were done, the API of the learners would be a bit more straightforward, but if you think it's unnecessary, I can remove it altogether. This would probably be cleaner than the blind copying that I've done here.
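A minimal sketch of that idea, using a hypothetical protected `_params` dict rather than Orange's actual API. Note that `__getattr__` has to fetch `_params` without going through normal attribute lookup, otherwise a missing `_params` would recurse straight back into `__getattr__`, which is exactly the trap this PR fixes:

```python
class Learner:
    def __init__(self, **kwargs):
        self._params = dict(kwargs)  # sklearn-style parameters, protected

    def __getattr__(self, item):
        # __getattr__ runs only when normal lookup fails.  Fetching
        # _params with object.__getattribute__ raises AttributeError
        # directly if it is missing, instead of re-entering __getattr__.
        params = object.__getattribute__(self, "_params")
        try:
            return params[item]
        except KeyError:
            raise AttributeError(item)


learner = Learner(max_depth=3)
assert learner.max_depth == 3  # tests can still read params as attributes
```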
I still don't understand what the content of `vars()` is. It contains `binarize`, `max_depth`, `min_samples_leaf`, `min_samples_split` and `sufficient_majority`; if these need to be copied to `params`, I'd copy them explicitly. Besides, it probably contains `self`, and beyond that all the global names in the module, which you don't want to copy.
I am probably missing something here.
No, you're not missing anything at all; I simply tried to emulate what the constructors in the sklearn learners do, which use `vars()`. I'll change it so the params are explicitly copied. I suppose the same issue is then present in all the sklearn learners. I could fix that in a separate PR if you'd like; this one is kind of a mess already.
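For reference, inside a method `vars()` with no argument is equivalent to `locals()`, so at the top of `__init__` it holds `self` and the declared arguments, but not module-level globals. A quick illustration with a hypothetical class:

```python
class Example:
    def __init__(self, binarize=False, max_depth=None, *args, **kwargs):
        # vars() == locals() here: self, the parameters, args and kwargs,
        # but no module-level globals.
        print(sorted(vars()))


Example()  # prints ['args', 'binarize', 'kwargs', 'max_depth', 'self']
```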
You're right, it's uglier (although perhaps better).
Leave it like this for now, and perhaps change this in all SKL learners later, for instance by adding something like this to `SKLLearner`:

```python
from inspect import signature

def update_params(self, values):
    # Copy only the names declared in the subclass's __init__ signature;
    # "self" must be excluded too, since it appears in the signature and
    # in vars() alike.
    param_names = signature(type(self).__init__).parameters
    self.params.update({name: values[name]
                        for name in param_names
                        if name not in {"self", "args", "kwargs"}})
```

which would then be called from the `__init__` of derived classes with `self.update_params(vars())`. This would copy only the names that were given as arguments.
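A minimal runnable sketch of how that suggestion could look end to end, assuming a hypothetical `SKLLearner` base holding a `params` dict and an illustrative `SVMLearner` subclass:

```python
from inspect import signature


class SKLLearner:
    def __init__(self):
        self.params = {}

    def update_params(self, values):
        # Copy only the names declared in the subclass's __init__ signature.
        param_names = signature(type(self).__init__).parameters
        self.params.update({name: values[name]
                            for name in param_names
                            if name not in {"self", "args", "kwargs"}})


class SVMLearner(SKLLearner):
    def __init__(self, C=1.0, kernel="rbf"):
        super().__init__()
        self.update_params(vars())  # copies C and kernel into self.params


learner = SVMLearner(C=10.0)
assert learner.params == {"C": 10.0, "kernel": "rbf"}
```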
I see that there's already some similar magic in our SKL wrappers.
Not in this PR.
Sorry this has taken me so long to get to, but I'm at it again. I'm guessing you meant changes related to #1968. The only thing these two PRs have in common is some shared code that makes the tests work with fitters.
I will try to merge all your PRs today so they can be part of the next release, which is due soon. The first three commits of this PR are already in master, so if you just rebase onto master, they will disappear here and make the code easier to review.
This occurred when a large dataset was used in the Test & Score widget with cross validation. Changing the exception from AttributeError to TypeError fixed the problem.
Force-pushed from dac6438 to 928f3b5
Yeah, I'm sorry, I had forgotten to rebase; it's done now.
It is necessary to specify which problem type each param on the learner is valid for, and I had forgotten to do this for the SVM learner. This fixes that.
This fixes the issue I had when training a classification tree on employee attrition data.
Issue
Running fitters on large enough datasets through the Test & Score widget with cross validation enabled would result in a stack overflow error.
I am not entirely sure why this happened. My theory is that when Test & Score receives a large enough dataset, it runs cross validation in parallel, and the learner attributes were probably accessed in a way that failed and triggered the recursion.
Unfortunately, I have no idea what exactly caused this or how it was called, so I don't really know how to write a test against it. But since I will most likely change this code in the near future, I wouldn't spend too much time on it.
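To make the failure mode concrete, here is a rough sketch of the pattern (not the exact Orange code), together with the repair described under "Description of changes" below:

```python
class Fitter:
    def __getattr__(self, item):
        # `self.kwargs` here would re-enter __getattr__ whenever `kwargs`
        # is not set (e.g. before __init__ finishes or after unpickling),
        # recursing until the stack overflows.  A direct call to
        # __getattribute__ does not fall back to __getattr__, so a missing
        # `kwargs` raises a clean AttributeError instead.
        kwargs = self.__getattribute__("kwargs")
        try:
            return kwargs[item]
        except KeyError:
            raise AttributeError(item)


fitter = Fitter()  # `kwargs` was never set on this instance
try:
    fitter.max_depth
except AttributeError:
    print("AttributeError instead of RecursionError")
```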
Description of changes
No longer rely on `self.kwargs` being present on the object; instead, raise `AttributeError` by explicitly getting `kwargs` with `__getattribute__`.
Includes