-
-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FIX] OWColor: Fix propagating changes to the output #2379
Conversation
Codecov Report
@@ Coverage Diff @@
## master #2379 +/- ##
==========================================
- Coverage 74.33% 74.32% -0.01%
==========================================
Files 321 321
Lines 55988 55990 +2
==========================================
Hits 41616 41616
- Misses 14372 14374 +2 |
Orange/widgets/data/owcolor.py
Outdated
@@ -350,6 +350,7 @@ def set_data(self, data): | |||
self._create_proxies(domain.metas)) | |||
self.openContext(data) | |||
self.data = data.transform(domain) | |||
self.data.domain = domain |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess we need a comment here, at the very least.
But there may be a more general problem here. One would expect that after self.data = data.transform(domain)
, we would have self.data.domain is domain
, while transform
only ensures self.data.domain == domain
. We should either change transform
to (always) compare domains with is
, not ==
, or we can add a flag to specify how to compare domains.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would change transform to compare with is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CMIIW, but changing Table.from_table
to compare domains with is
instead of ==
doesn't look like a trivial change. Also it by itself doesn't make OWColor
work — the problem then arises in DomainConversion, which doesn't seem to know how to do the transformation. As a result, we get a data table with all values missing.
Before, as we checked with ==
if domains matched we called from_table_rows
which just makes a selection of rows and keeps the provided domain. Now, if domain1 == domain2
but domain1 is not domain2
we need to make a domain conversion.
It seems like the core of the problem with all values missing is in how DomainConversion checks whether some variable is inside a domain. The relevant part of the code from domain.py
:
def __contains__(self, item):
"""
Return `True` if the item (`str`, `int`, :class:`Variable`) is
in the domain.
"""
return item in self._indices
It seems like item in self._indices == False
while item in list(self._indices.keys()) == True
. Is this a result of how we compare variables?
Should we maybe just revert to self.data.domain = domain
as it was before @janezd refactoring? If so, @janezd can you provide an explanation, since I'm not quite sure how variable's proxies work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The first problem sounds like a broken hash on Variable.
As for the fix, I would rather move the data.domain = domain to transform
method than have every caller do this explicitly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@astaric but shouldn't then we fix Table.from_table
as well to prevent from_table
and transform
to behave differently? Isn't data.transform
currently just a syntactic sugar?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is also possible. As we have changed all occurrences of Table.from_table with data.transform, I kind of consider it deprecated, but yeah, we can put the change there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... and risk @janezd changing it back to data.transform
the next time he sees this code? If data.transform
is syntactic sugar it has to have the same behaviour as the thing that this sugar is coating.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... and risk @janezd changing it back to
data.transform
the next time he sees this code?
@janezd change to data.transform
didn't break anything (transform
and to_table
behave the same way). Removing of the extra self.data.domain = domain
is the change that broke things.
If data.transform is syntactic sugar it has to have the same behaviour as the thing that this sugar is coating.
Agree. Hence, if we are going to fix this in the core we need to do it in from_table
. And while I'm all for changing to_table
to compare domains with is
instead of ==
, I'm afraid this might cause other issues all over the place. As it has revealed some problems in domain's __contains__
which I'm not sure how to solve. Any help appreciated 🙈
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And if anyone wishes to play around, here is the minimal example of the problem, after the change introduced in this PR:
import Orange
from Orange.data import Table
def _create_proxies(variables):
part_vars = []
for var in variables:
if var.is_discrete or var.is_continuous:
var = var.make_proxy()
if var.is_discrete:
var.values = var.values[:]
part_vars.append(var)
return part_vars
data = Table('iris')
domain = data.domain
domain = Orange.data.Domain(_create_proxies(domain.attributes),
_create_proxies(domain.class_vars),
_create_proxies(domain.metas))
new_data = data.transform(domain)
print(new_data)
The new_data
contains all missing values since variables from the destination domain are not in the source domain and compute values are not set.
This ensures that after one calls data.transform(domain) the domain of the resulting data is set to the instance that was passed to transform.
@astaric this should work now, please check. |
Issue
Fixes #2378
Description of changes
Assure that when domains are equal,
from_tables
sets the resulting table's domain to the instance that was passed on the input.Includes