Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FIX] Scatterplot: Score plots crashed if multiple attributes have the same… #1535

Merged
merged 1 commit into from
Sep 2, 2016

Conversation

janezd
Copy link
Contributor

@janezd janezd commented Sep 1, 2016

Sort tried to compare two instances of ContinuousVariable.

@codecov-io
Copy link

codecov-io commented Sep 1, 2016

Current coverage is 88.26% (diff: 100%)

Merging #1535 into master will not change coverage

@@             master      #1535   diff @@
==========================================
  Files            77         77          
  Lines          7624       7624          
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
  Hits           6729       6729          
  Misses          895        895          
  Partials          0          0          

Sunburst

Powered by Codecov. Last update a30e0c2...9cba5ac

@kernc
Copy link
Contributor

kernc commented Sep 1, 2016

What if we make variables sortable, like by name?

@janezd
Copy link
Contributor Author

janezd commented Sep 2, 2016

@BlazZupan proposed that. I don't like it much, since we would probably also need to implement <= and >=, and then also == ... which we already have and it's unrelated to attribute names. Even more, we could have var1 < var2 and var1 == var2 at the same time.

@kernc
Copy link
Contributor

kernc commented Sep 2, 2016

I don't like it much, since we would probably also need to implement <= and >=, and then also == ...

https://docs.python.org/3/library/functools.html#functools.total_ordering

... which we already have and it's unrelated to attribute names. Even more, we could have var1 < var2 and var1 == var2 at the same time.

In other words, it's high time we introduce more sane defaults into our data model! Thankfully, @sstanovnik took some care of that in his PR. 😄

@kernc kernc merged commit b050abc into biolab:master Sep 2, 2016
@astaric
Copy link
Member

astaric commented Sep 2, 2016

In the long term I would rather see a crash in the widget that produced such data (or make it automatically rename the second occurrence of the variable to "name (2)"). If I remember correctly, pandas does not allow multiple variables with same name.

@janezd
Copy link
Contributor Author

janezd commented Sep 2, 2016

@astaric, the problem was the same ReliefF, not same name. I agree that we should prohibit duplicated names, though.

@kernc, what we have now is OK (comparison for equivalence, not magnitude). I said that introducing comparison operators would make it inconsistent (besides being semantically meaningless). I can't see how pandas would fix something that is not broken. Unless that's the whole idea of introducing it. :)

Second, if you saw what I'm seeing these days - Orange crashing on students' computers more than it did a year ago on the same course - you'd be less eager to turn Orange upside down once again. You can advertise pandas as a magic bullet for everything, yet the most stable Orange ever was Orange 2, with its totally proprietary ad-hoc data model. By relying more and more on bs like scikit-learn, we're moving farther in the opposite direction. How do we know that pandas won't cause just as many problems as skl, which we should get rid of as soon as we can?

I wonder more and more about which of the advertised advantages of pandas - except for being several times slower than what we have now - are true and which are not. For instance, unless Domain and Variable are going away (which they don't), the suggestion that using pandas would in any way fix this bug is unfounded. Correct me if I'm wrong.

@kernc
Copy link
Contributor

kernc commented Sep 2, 2016

I tend to disagree on several points. The problems Orange is facing are a result of insufficient porting effort and the haste to deliver.

the suggestion that using pandas would in any way fix this bug is unfounded. Correct me if I'm wrong.

Variables in that branch extend str, so they are lexicographically sortable by default.

Simplicity is prerequisite for reliability.
— Edsger W. Dijkstra

@janezd
Copy link
Contributor Author

janezd commented Sep 2, 2016

We just had a break during our hands-on lecture, and a line of students with computers formed to show us different ways in which Orange crashed during the lecture.

I see it, in part, as a consequence of "simplifying" many things from Orange 2 that were perceived as too complicated. Now that we're fixing the bugs, we see that they were complicated for a reason. Your solution to this is that we throw everything away and replace it with something new once again. It's easy to reason like this if you don't need to deliver. Remove domain matching ... and we'll reintroduce it next year. It's easy to simplify if you haven't actually worked with students and end-users and you don't know how they actually use Orange.

Deriving Variable from str is semantically weird (variable is not a special kind of string, is it?), so we do it for practical reasons. Don't call it sane, though - it's not. As a side effect it indeed fixes this bug, but we've no idea about what other adverse effects will the inherited methods bring.

@kernc
Copy link
Contributor

kernc commented Sep 3, 2016

It's sane that variables have a defined default sort order as opposed to it being undefined. That the order is lexicographic on variables' names is also straightforward, simple, and not entirely unexpected.

As I see it, Orange is still transitioning. Odd version numbers have historically denoted unstable, developmental releases. 😄

@janezd janezd deleted the fix-vizrank-same-relieff branch April 5, 2019 17:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants