Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add automated summaries #136

Merged
merged 9 commits into from
Mar 12, 2021
Merged

Add automated summaries #136

merged 9 commits into from
Mar 12, 2021

Conversation

janezd
Copy link
Contributor

@janezd janezd commented Mar 5, 2021

Issue

Provide automated summaries for inputs and outputs. Ref: biolab/orange3#5108.

For proper effect, must be merged with biolab/orange3#5308, but nothing should break if not.

Also: resolves #109.

Description of changes

  • Outputs are intercepted on calls to send (only new-style signals are supported; this is intentionally as the "old-style" are deprecated for many many years)
  • Inputs are intercepted by using the existing decorator that marks inputs

Auto-summary is generated if

  • the signal's auto_summary flag is not explicitly set to False,
  • the signal type has a registered summarizing function.

Summarizing functions are registered by adding another type to single-dispatched orangewidget.utils.signals.summarize, which must return an instance of orangewidget.utils.signals.PartialSummary, e.g.

@summarize.register(Table)
def summarize(data):
    return PartialSummary(data.approx_len(), format_summary_details(data))

The summary (first argument), can be None, int or rich text str (it has to be html-safe!), and the second is None (if also the first is None) or a rich text str.

(Non-consequential?) imcompatibility: if a widget already implement sets the output summary, the one who acts last will override the one who's first. That is, either nothing will change or the summary will (probably) improve.

Includes
  • Code changes
  • Tests
  • Documentation

@janezd
Copy link
Contributor Author

janezd commented Mar 5, 2021

@irgolic, if you confirm the design (you'll need this PR and https://github.com/biolab/orange3/pull/5308/files, but you can easily try other widgets, too), I'll remove summaries from other widgets, too.

@irgolic
Copy link
Member

irgolic commented Mar 5, 2021

-------------------------- AttributeError Exception ---------------------------
Traceback (most recent call last):
  File "/Users/rafael/orange/orange3/Orange/widgets/data/owfile.py", line 339, in load_data
    error = self._try_load()
  File "/Users/rafael/orange/orange3/Orange/widgets/data/owfile.py", line 384, in _try_load
    self.apply_domain_edit()  # sends data
  File "/Users/rafael/orange/orange3/Orange/widgets/data/owfile.py", line 518, in apply_domain_edit
    self.Outputs.data.send(table)
  File "/Users/rafael/orange/orange-widget-base/orangewidget/utils/signals.py", line 204, in send
    self.widget.set_partial_output_summary(summarize(value), self.name, id)
  File "/Users/rafael/orange/orange-widget-base/orangewidget/utils/signals.py", line 327, in set_partial_output_summary
    self._update_summary(
  File "/Users/rafael/orange/orange-widget-base/orangewidget/utils/signals.py", line 358, in _update_summary
    if partial_summary.summary is None:
AttributeError: 'tuple' object has no attribute 'summary'
-------------------------------------------------------------------------------

This is what I get upon opening any widget (tried File, Concatenate, Select Rows).

Glad to hear it looks good. :D

@janezd
Copy link
Contributor Author

janezd commented Mar 6, 2021

Oops, I forgot to push the PR in orange3 repo. Please pull biolab/orange3@087a2ae again.

@janezd
Copy link
Contributor Author

janezd commented Mar 6, 2021

Please check the tooltip for multiple inputs in Concatenate. It is ... acceptable, but any suggestions are welcome.

@janezd
Copy link
Contributor Author

janezd commented Mar 6, 2021

I actually solved it with a bit of HTML.

Screenshot 2021-03-06 at 10 38 24

Screenshot 2021-03-06 at 10 38 39

Screenshot 2021-03-06 at 11 21 29

Edit: This is better:

Screenshot 2021-03-06 at 13 04 01

@irgolic
Copy link
Member

irgolic commented Mar 6, 2021

That looks amazing actually.

Screenshot 2021-03-06 at 18 39 22

Now that you've changed it to a form-type layout, the indents on a MultiInput (dots) may be redundant.

Also, the top line in the message is redundant. That might be tricky to change though, as warnings use the same MessageWidget, and there, the top line is not redundant.

And, can you , delimit the MultiInput, not just ,? (add a space)

@janezd
Copy link
Contributor Author

janezd commented Mar 6, 2021

I've made a few more changes to the tooltip. As you've seen, it's html with css. Please edit at will, you're much better at this than I am.

Skipping space after comma was intentional; though grammatically incorrect, I wanted to keep those numbers closer. But feel free to add space.

I like this so much that I have already changed all widgets (= removed the summary-related code). The only widget I left is Table, which currently shows the input related to the currently shown table. We can decide to keep it, or remove this as well. All other widget look well (in my opinion) with automated summaries.

@janezd
Copy link
Contributor Author

janezd commented Mar 6, 2021

Also, the top line in the message is redundant. That might be tricky to change though,

No, I add it myself, specifically in the tool tip. I think it looks nicer, but you can remove it if you think it's better without.

@irgolic
Copy link
Member

irgolic commented Mar 9, 2021

What do you think of delimiting with just a space?
Screenshot 2021-03-09 at 17 38 22

I've come up with two variants:

V1

Screenshot 2021-03-09 at 19 19 29

Screenshot 2021-03-09 at 19 19 20

V2

Screenshot 2021-03-09 at 19 43 35

Screenshot 2021-03-09 at 19 43 29

Thoughts?

Also, the top line in the message is redundant. That might be tricky to change though,

No, I add it myself, specifically in the tool tip. I think it looks nicer, but you can remove it if you think it's better without.

I would prefer to remove this (the brief message never elides, and the in/out icon doesn't render well in text). But, I can't find where you add this, could you point me to it (if it's simple to remove)?

Also, something else I'd love to do but can't figure out how to do properly is add locale-dependent thousands separators (10242 -> 10,242). It should be as simple as f{len(data):n}, but my locale defaults to C/UTF-8/C/C/C/C instead of something like en_US.UTF-8. What's our stance on locale-dependent formatting, should I just use commas for separators?

@irgolic
Copy link
Member

irgolic commented Mar 9, 2021

Or maybe even this:

Screenshot 2021-03-09 at 20 39 22

Now imagine the entries being selectable. On click, a transient Data Table window opens.

@janezd, let me know which one of these I should push into the PR.

@janezd
Copy link
Contributor Author

janezd commented Mar 9, 2021

Oh, I'm sorry about the line. I thought that you disliked the <hr>, not what's above it. :))) No, this is added elsewhere. And, yes, it's redundant, the icon doesn't look nice, and the line would better be removed.

I really like V1, but bear in mind that the table's name can be "untitled", in which case it's currently not shown. Perhaps show it as (untitled)? Bullets in V2 are redundant, and I don't line centering in combination with right-aligned left column.

As for locale separators, I've no strong opinion, but leaning towards commas because Orange is in English.

Spaces instead of commas in the summary: if you like it better, do it. It's not much of an issue, it's not very common.

@irgolic
Copy link
Member

irgolic commented Mar 9, 2021

This is what I've settled on:

Screenshot 2021-03-09 at 21 51 02

Also, can I suggest trying 'Squash and merge' for this PR? The team behind pandas default to that in their workflow.

Does this version include the thousands separator for the number of instances on your machine?

@janezd
Copy link
Contributor Author

janezd commented Mar 10, 2021

I left separate commits in case we'd like to revert some. But at the end, yes, squash without mercy. (My last commit's message included "to be squashed".)

Does this version include the thousands separator for the number of instances on your machine?

Which version? With this PR, as currently on github (just my commits, without any of your changes), it doesn't show any separators.

@irgolic
Copy link
Member

irgolic commented Mar 10, 2021

Does this version include the thousands separator for the number of instances on your machine?

Which version? With this PR, as currently on github (just my commits, without any of your changes), it doesn't show any separators.

Ah sorry, I've pushed now.

@janezd
Copy link
Contributor Author

janezd commented Mar 10, 2021

Separators are still not shown. But I guess you've done all you can -- :n should do it and probably does it on some platforms/locales. For me, it's:

>>> f"{10000000:n}"
'10000000'

I pushed some more changes (simplified arguments). Please pull before changing further or squashing.

@irgolic
Copy link
Member

irgolic commented Mar 11, 2021

I can get the separator to work if I set the locale within python code, e.g.

import locale
locale.setlocale(locale.LC_ALL, 
                 locale.getdefaultlocale())

I suspect locale is only broken in our dev environments, where it's set to C.UTF-8. Even so, I think I'd prefer to set it to en_US.UTF-8 for everyone, for now.

@janezd janezd changed the title [RFC] Add automated summaries Add automated summaries Mar 11, 2021
@janezd janezd added the needs discussion Core developers need to discuss the issue label Mar 11, 2021
@janezd janezd removed the needs discussion Core developers need to discuss the issue label Mar 12, 2021
@janezd janezd merged commit 37af799 into biolab:master Mar 12, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

set_output_summary: Use summary when no description
2 participants