[FIX] Scatterplot: Score plots crashed if multiple attributes have the same… #1535
Current coverage is 88.26% (diff: 100%)
@@             master   #1535   diff @@
==========================================
  Files            77      77
  Lines          7624    7624
  Methods           0       0
  Messages          0       0
  Branches          0       0
==========================================
  Hits           6729    6729
  Misses          895     895
  Partials          0       0
What if we make variables sortable, like by name?
@BlazZupan proposed that. I don't like it much, since we would probably also need to implement |
https://docs.python.org/3/library/functools.html#functools.total_ordering
In other words, it's high time we introduce more sane defaults into our data model! Thankfully, @sstanovnik took some care of that in his PR. 😄
In the long term I would rather see a crash in the widget that produced such data (or make it automatically rename the second occurrence of the variable to "name (2)"). If I remember correctly, pandas does not allow multiple variables with the same name.
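The renaming idea in the comment above can be sketched as follows. This is only an illustration: `deduplicate_names` is a hypothetical helper, not an Orange function, and the " (k)" suffix format is taken from the "name (2)" example in the comment.

```python
from collections import Counter


def deduplicate_names(names):
    """Return a new list in which repeated names get a ' (k)' suffix.

    Hypothetical helper for illustration; not part of Orange.
    """
    seen = Counter()
    taken = set(names)  # names that are already in use and must not be reused
    result = []
    for name in names:
        seen[name] += 1
        if seen[name] == 1:
            # First occurrence keeps its original name.
            result.append(name)
            continue
        k = seen[name]
        candidate = f"{name} ({k})"
        # Skip suffixes that would collide with an existing name.
        while candidate in taken:
            k += 1
            candidate = f"{name} ({k})"
        taken.add(candidate)
        seen[name] = k
        result.append(candidate)
    return result


print(deduplicate_names(["age", "height", "age"]))
```

The collision check matters when the data already contains a variable literally named "age (2)"; without it the renaming would create a new duplicate.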
@astaric, the problem was the same ReliefF score, not the same name. I agree that we should prohibit duplicate names, though. @kernc, what we have now is OK (comparison for equivalence, not magnitude). I said that introducing comparison operators would make it inconsistent (besides being semantically meaningless). I can't see how pandas would fix something that is not broken. Unless that's the whole idea of introducing it. :) Second, if you saw what I'm seeing these days - Orange crashing on students' computers more than it did a year ago on the same course - you'd be less eager to turn Orange upside down once again. You can advertise pandas as a magic bullet for everything, yet the most stable Orange ever was Orange 2, with its totally proprietary ad-hoc data model. By relying more and more on bs like scikit-learn, we're moving further in the opposite direction. How do we know that pandas won't cause just as many problems as skl, which we should get rid of as soon as we can? I wonder more and more about which of the advertised advantages of pandas - except for being several times slower than what we have now - are true and which are not. For instance, unless |
I tend to disagree on several points. The problems Orange is facing are a result of insufficient porting effort and the haste to deliver.
Variables in that branch extend
We just had a break during our hands-on lecture, and a line of students with computers formed to show us different ways in which Orange crashed during the lecture. I see it, in part, as a consequence of "simplifying" many things from Orange 2 that were perceived as too complicated. Now that we're fixing the bugs, we see that they were complicated for a reason. Your solution to this is that we throw everything away and replace it with something new once again. It's easy to reason like this if you don't need to deliver. Remove domain matching ... and we'll reintroduce it next year. It's easy to simplify if you haven't actually worked with students and end-users and you don't know how they actually use Orange. Deriving |
It's sane that variables have a defined default sort order as opposed to it being undefined. That the order is lexicographic on variables' names is also straightforward, simple, and not entirely unexpected. As I see it, Orange is still transitioning. Odd version numbers have historically denoted unstable, developmental releases. 😄
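For reference, a lexicographic default order on names could look roughly like the sketch below, using `functools.total_ordering` from the link earlier in the thread. The `Variable` class here is a hypothetical stand-in, not Orange's actual class, and the sketch does not settle the objection raised above that ordering variables may be semantically meaningless.

```python
from functools import total_ordering


@total_ordering
class Variable:
    """Hypothetical stand-in for a variable that sorts by name."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        if not isinstance(other, Variable):
            return NotImplemented
        return self.name == other.name

    def __lt__(self, other):
        if not isinstance(other, Variable):
            return NotImplemented
        return self.name < other.name

    def __hash__(self):
        # Defining __eq__ disables the default hash, so restore one
        # that is consistent with equality by name.
        return hash(self.name)

    def __repr__(self):
        return f"Variable({self.name!r})"


# total_ordering derives <=, > and >= from __eq__ and __lt__.
print(sorted([Variable("petal width"), Variable("age")]))
```

Note that equality by name is itself an assumption of this sketch; it only makes sense if duplicate names are prohibited, as suggested earlier in the discussion.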
Sort tried to compare two instances of ContinuousVariable.
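The diff itself is not shown here, so the following is an assumed reconstruction of the failure mode rather than the actual widget code: sorting (score, variable) pairs works until two scores tie, at which point tuple comparison falls through to the variables, which define no ordering. `ContinuousVariable` below is a hypothetical stand-in, and `rank_naive` and `rank_by_score` are illustrative names.

```python
class ContinuousVariable:
    """Hypothetical stand-in: defines no ordering operators."""

    def __init__(self, name):
        self.name = name


def rank_naive(scores, variables):
    # Tuples compare element by element, so tied scores force a
    # comparison of the variables themselves.
    return sorted(zip(scores, variables), reverse=True)


def rank_by_score(scores, variables):
    # Sorting on the score alone never compares the variables.
    return sorted(zip(scores, variables), key=lambda pair: pair[0], reverse=True)


a, b, c = (ContinuousVariable(n) for n in ("a", "b", "c"))
scores = [0.5, 0.5, 0.1]  # two attributes share the same score

try:
    rank_naive(scores, [a, b, c])
except TypeError as err:
    print("naive sort failed:", err)

print([v.name for _, v in rank_by_score(scores, [a, b, c])])
```

With a key function the sort stays stable, so tied attributes keep their original relative order and no ordering on variables is needed.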