Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MDS: use Scatter Plot Graph instead of (Mds)plotutils #2448

Merged
merged 4 commits into from
Sep 28, 2017
Merged

MDS: use Scatter Plot Graph instead of (Mds)plotutils #2448

merged 4 commits into from
Sep 28, 2017

Conversation

jerneju
Copy link
Contributor

@jerneju jerneju commented Jul 3, 2017

Issue

Too much duplicated code.

Description of changes

MDS now uses Scatter Plot Graph instead of (Mds)plotutils.

Includes
  • Code changes
  • Tests
  • Documentation

@codecov-io
Copy link

codecov-io commented Jul 3, 2017

Codecov Report

Merging #2448 into master will increase coverage by 0.16%.
The diff coverage is 92.59%.

@@            Coverage Diff             @@
##           master    #2448      +/-   ##
==========================================
+ Coverage   75.01%   75.17%   +0.16%     
==========================================
  Files         331      331              
  Lines       58117    57769     -348     
==========================================
- Hits        43594    43429     -165     
+ Misses      14523    14340     -183

@jerneju jerneju changed the title Linear Projection, MDS: move Plotutils, Mdsplotutils, make_pen [WIP] Linear Projection, MDS: move Plotutils, Mdsplotutils, make_pen Jul 21, 2017
@jerneju jerneju changed the title [WIP] Linear Projection, MDS: move Plotutils, Mdsplotutils, make_pen [WIP] MDS: use Scatter Plot Graph instead of (Mds)plotutils Jul 31, 2017
@janezd
Copy link
Contributor

janezd commented Aug 22, 2017

Is this still WIP? Perhaps it would be better to finish the open PRs before starting new ones. Or do you have some open questions here? (If so, post them.)

@jerneju jerneju changed the title [WIP] MDS: use Scatter Plot Graph instead of (Mds)plotutils MDS: use Scatter Plot Graph instead of (Mds)plotutils Aug 24, 2017
@jerneju
Copy link
Contributor Author

jerneju commented Aug 24, 2017

Not WIP anymore.


def selection_changed(self):
self.commit()

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't all this same code appear also in the scatter plot widget? @astaric suggested that this code belongs to OWScatterPlotGraph.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. The same code appears there. So can I make a super class from which MDS and Scatter Plot are inherited?

@janezd
Copy link
Contributor

janezd commented Aug 25, 2017

I approve the idea, but some code should probably be moved to OWScatterPlot. Also, see whether there will be any conflict with #2536.

@jerneju
Copy link
Contributor Author

jerneju commented Sep 12, 2017

@janezd: There is no conflict with that PR.

@janezd
Copy link
Contributor

janezd commented Sep 13, 2017

Now that #2541 is merged, can you remove some of the code in this widget?

If you do, you can then squash the commits.

@jerneju
Copy link
Contributor Author

jerneju commented Sep 15, 2017

@janezd

self._legend_item = None
self._selection_mask = None
self._subset_mask = None # type: Optional[numpy.ndarray]
self._subset_mask = [] # type: Optional[np.ndarray]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This used to be self._subset_mask = None. Why change it to []? If this is to avoid having to test for None in line 649, you shouldn't set it to None in line 341. It is set to np.ndarray in 575.

The type hint, # type: Optional[np.ndarray], suggests that this can be None or ndarray. Not an (empty) list.

This is a mess. Please revert it to how it was before: None and ndarray, not a list.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@@ -187,135 +181,51 @@ def __init__(self):
)

form.addRow("Max iterations:",
gui.spin(box, self, "max_iter", 10, 10 ** 4, step=1))
gui.spin(box_optimization, self, "max_iter", 10, 10 ** 4, step=1))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can break this like this:

    form.addRow(
        "Max iterations:",
        gui.spin(box_optimization, self, "max_iter", 10, 10 ** 4, step=1))

to keep the lines after 80 characters. Do the same in the following lines for consistency. This format makes the code more readable in this particular case, IMHO.


form.addRow("Initialization:",
gui.radioButtons(box, self, "initialization",
gui.radioButtons(box_optimization, self, "initialization",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? IMHO, such temporary boxes can keep a short name box. We all understand it's a temporary into which a "paragraph" of a code puts the controls it's constructing. Making it more "descriptive" does not make the code cleaner, it just makes lines longer.

@@ -651,7 +509,7 @@ def __next_step(self):
self.unconditional_commit()
self.__draw_similar_pairs = True
self._update_plot()
self.plot.autoRange(padding=0.1, items=[self._scatter_item])
#.plot.autoRange(padding=0.1, items=[self._scatter_item])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this commented? If it is no longer needed, remove. If it needs to be replaced with something, replace.

@@ -664,7 +522,7 @@ def __next_step(self):
self.progressBarSet(100.0 * progress, processEvents=None)
self.embedding = embedding
self._update_plot()
self.plot.autoRange(padding=0.1, items=[self._scatter_item])
# self.plot.autoRange(padding=0.1, items=[self._scatter_item])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same.


self._update_plot()
self.plot.autoRange(padding=0.1, items=[self._scatter_item])
# self.plot.autoRange(padding=0.1, items=[self._scatter_item])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same.


self._update_plot()
self.plot.autoRange(padding=0.1)
# self.plot.autoRange(padding=0.1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same.


if self.connected_pairs and self.__draw_similar_pairs:
def connected_pairs():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename to connect_pairs. Perhaps move the initial if out of the function, so that if controls whether the function connect_pairs is called or not.

("Label", name(self.graph.attr_label)),
("Shape", name(self.graph.attr_shape)),
("Size", name(self.graph.attr_size)),
("Jittering", self.graph.jitter_size)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Color, label, shape and size are not reported when set to None (which is OK). Do the same for jittering. Don't report it when set to 0; add %, when non-zero. That is, do something like:

("Jittering", self.graph.jitter_size != 0 and "{} %".format(self.graph.jitter_size))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok.

@janezd
Copy link
Contributor

janezd commented Sep 15, 2017

Context settings do not work. Give it Iris, set some attribute for Labels, change to Zoo and back to Iris. The setting for labels is lost.

@janezd
Copy link
Contributor

janezd commented Sep 15, 2017

Remove coordinate axes.

self._invalidated = False
self._effective_matrix = None

self.variable_x = ContinuousVariable("x")
self.variable_y = ContinuousVariable("y")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename these to mds-x, mds-y.

self.graph.update_data(variable_x, variable_y, reset_view=reset_view)
for axis in ["left", "bottom"]:
self.plot.hideAxis(axis)
self.plot.setAspectLocked(True, 1)

def commit(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not a new problem, but this PR is a good place to solve it since this code will be touched anyway.

What if the data already contains attributes "x" and "y" (or "mds-x", "mds-y", as suggested in another comment)? I think the function needs to be changed so that in this case it creates two new attributes with unique names (see Orange.widgets.utils.annotated_data.get_next_name).

box.layout().addStretch()

self.controlArea.layout().addWidget(box)
self.graph.box_zoom_select(self.controlArea)

box = gui.vBox(self.controlArea, "Output")
self.output_combo = gui.comboBox(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We just discussed this with @BlazZupan. There is no universal place for such columns. E.g. in PCA, the components should appear as attributes, and the original attributes as metas. Here, however, x and y can be meta attributes. So: in the intereset of space, remove this combo box and always put coordinates as meta attributes.

Meta attributes are first-class citizens in all visualization widgets. In rare cases, Select columns can be used to move them to features.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed. However, many widgets has an option where one can select an output.

Copy link
Contributor

@lanzagar lanzagar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When jittering is enabled, the drawn lines of similar pairs do not match new point positions.

@@ -30,6 +30,11 @@ def add_columns(domain, attributes=(), class_vars=(), metas=()):
return Domain(attributes, class_vars, metas)


def get_indexes(names, name):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docstring please.

if name not in names and not indexes:
return name
return "{} ({})".format(name, max(indexes, default=1) + 1)


def get_unique_names(names, proposed):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docstring please.

@@ -4,7 +4,7 @@

ANNOTATED_DATA_SIGNAL_NAME = "Data"
ANNOTATED_DATA_FEATURE_NAME = "Selected"

REGULAR_EXPRESSION = "(^{} \()(\d{{1,}})(\)$)"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a very informative variable name. Maybe something like RE_FIND_INDEX or RE_INDEX_TEMPLATE, ...
Also, another blank line after this.

max_index = max([max(get_indexes(names, name),
default=1) for name in proposed], default=1)
for i, name in enumerate(proposed):
proposed[i] = "{} ({})".format(name, max_index + 1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not convinced that using the same max index for all new names makes the most sense, but it probably doesn't matter that much.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now smallest possible index.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lanzagar, max index makes sense. If I already have variables foo, foo (1), foo (3) and foo (4), I want the new one to be foo (5), not foo (2).

You're right that it won't matter much, but I'd follow this logic. get_next_name already does so. @jerneju, sorry to push you back and forth. :|

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I have reverted it back.

Copy link
Contributor

@lanzagar lanzagar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding and removing Data Subset clears the shown similar pairs...

@jerneju
Copy link
Contributor Author

jerneju commented Sep 28, 2017

Well, it is true, but this is a separate issue. It happens in a original version too. Thanks for noticing.

@lanzagar
Copy link
Contributor

We can leave the fix of similar pairs disappearing for a later PR.
Squash the commits please, and I will merge.

@lanzagar lanzagar dismissed their stale review September 28, 2017 13:49

for separate PR

@lanzagar lanzagar merged commit 328b847 into biolab:master Sep 28, 2017
@jerneju jerneju deleted the move-plotutils branch September 28, 2017 14:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants