Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FIX] OWTreeViewer: Fix trees being displayed differently for same tree object #2067

Merged
merged 4 commits into from
Mar 10, 2017

Conversation

pavlin-policar
Copy link
Collaborator

@pavlin-policar pavlin-policar commented Mar 2, 2017

Issue

OWTreeViewer was not deterministic in showing trees. This was not specific to trees where attributes had the same entropy or whatever the selection criteria was, but occured on any tree (even with iris).

Description of changes

This happened because the layout was updated w.r.t the graph node edges, which were stored in a set object, which is unordered.

I assume that a set was used because speed is important, so I've implemented edges as an ordered dict, to keep the speed benefits as well as getting an ordering.

I've written a test that draws the tree 10x and compares the node positions, but since the ordering of the nodes was completely random before, this is not guaranteed to fail in case the trees are drawn with random ordering. The ordering could happen to be the same 10x. This hasn't actually happened to me yet on iris, since there are more nodes that can fail there, but this could potentially happen.

I have no idea how to test this so that it would always fail, due to the random nature of the ordering.

Includes
  • Code changes
  • Tests
  • Documentation

@pavlin-policar
Copy link
Collaborator Author

pavlin-policar commented Mar 3, 2017

@astaric The tests are failing because apparently the data.tab file I added to Orange/widgets/visualize/tests/ doesn't get uploaded and fails because the file is not found (even though it is tracked by git. How to I get it to actually upload?

@astaric
Copy link
Member

astaric commented Mar 3, 2017

You need to include it in Manifest and setup.py. Maybe you could move the dataset to where the other datasets reside, where this is already done? (https://github.com/biolab/orange3/tree/master/Orange/widgets/tests/datasets)

OWTreeViewer was not deterministic in showing trees. This happened
because the layout was updated w.r.t the graph node edges, which were
stored in a set object, which is unordered.
I assume that a set was used because speed is important, so I've
implemented edges as an ordered dict, to keep the speed benefits as well
as an ordering.
@pavlin-policar
Copy link
Collaborator Author

Thanks, I've moved it to the widgets/tests/datasets folder.

@codecov-io
Copy link

codecov-io commented Mar 3, 2017

Codecov Report

Merging #2067 into master will increase coverage by 0.02%.
The diff coverage is 92.18%.

@@            Coverage Diff             @@
##           master    #2067      +/-   ##
==========================================
+ Coverage   69.81%   69.83%   +0.02%     
==========================================
  Files         315      315              
  Lines       53965    54015      +50     
==========================================
+ Hits        37678    37724      +46     
- Misses      16287    16291       +4

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 34b5e94...82a6959. Read the comment docs.

# Make sure trees are deterministic with data where some variables have
# the same entropy
data_same_entropy = Table(path.join(
path.dirname(__file__), "../../tests/datasets/same_entropy.tab"))
Copy link
Contributor

@kernc kernc Mar 3, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use os.path.join() everywhere to make the test pass on Windos.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a cleaner solution than path.join(path.dirname(path.dirname(path.dirname(__file__))), "tests", "datasets", "same_entropy.tab")? Could I maybe put in ".." instead of path.dirname?

@@ -19,6 +21,12 @@ def setUpClass(cls):
cls.signal_name = "Tree"
cls.signal_data = cls.model

# Load a dataset that contains two variables with the same entropy
data_same_entropy = Table(path.join(
path.dirname(__file__), "../../tests/datasets/same_entropy.tab"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And here.

first = next(iterator)
except StopIteration:
return True
return all(first == rest for rest in iterator)
Copy link
Contributor

@kernc kernc Mar 3, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would return len(set(data)) == 1 work?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately not, because I'm comparing QPointFs which are an unhashable type.

@astaric astaric merged commit 1f72515 into biolab:master Mar 10, 2017
@pavlin-policar pavlin-policar deleted the deterministic-trees branch March 10, 2017 09:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants