[FIX] t-SNE speed-ups #3592

Merged
merged 5 commits into from
Feb 11, 2019

pavlin-policar
Collaborator

Issue

t-SNE was slow, especially on larger data sets and when selecting data.

Description of changes
  • The multiscale option is slow on larger data sets, so turn it off by default.
  • Use a faster approximation setting for Barnes-Hut.
  • Update the t-SNE plot only at the start and at the end, avoiding potentially many redraws.
  • Selecting data in t-SNE was slow because transform was being called; avoid that call.
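
For readers unfamiliar with the Barnes-Hut trade-off mentioned above, here is a minimal sketch. It assumes the "approximation setting" is the usual Barnes-Hut threshold, commonly called theta; the PR text does not name the parameter or its value. The function names and the sample cell sizes are hypothetical and not taken from the Orange code base. A cell of points is treated as a single summary point when it looks small from the query point, so a larger theta summarises more cells and does less work, at the cost of accuracy.

```python
def summarize_cell(cell_width, distance, theta):
    """Barnes-Hut acceptance test: treat a cell as a single point
    when its width, relative to its distance, is below theta."""
    return cell_width / distance < theta


# Illustrative (cell_width, distance) pairs; made-up values.
CELLS = [(1.0, 1.5), (1.0, 3.0), (0.5, 0.8), (2.0, 10.0)]


def count_summarized(theta, cells=CELLS):
    """Count how many cells would be approximated by their summary."""
    return sum(summarize_cell(w, d, theta) for w, d in cells)


if __name__ == "__main__":
    for theta in (0.5, 0.8):
        print(theta, count_summarized(theta))
```

With these made-up cells, raising theta from 0.5 to 0.8 moves more cells onto the cheap summary path, which is the general mechanism behind a "faster" Barnes-Hut setting.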
Includes
  • Code changes
  • Tests
  • Documentation

@@ -258,6 +265,9 @@ def pca_preprocessing(self):
     def __start(self):
         self.pca_preprocessing()

         # Dirty flag indicating whether the embedding has been drawn at least once
         self.needs_to_draw = True
Contributor

This can probably be removed now. In any case the comment is no longer true.

Collaborator Author

This cannot be removed. At the start of the optimization we have to draw something, because the graph widget needs some data to show, so we draw random points. We'd like to replace this random data with something more meaningful as soon as possible, so setting this to True here ensures that, once the first batch of iterations is done, the widget displays something better than random data.
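
The behaviour described in this reply can be sketched roughly as follows. This is a simplified stand-in, not the widget's actual code: only the flag name `needs_to_draw` comes from the diff, while `EmbeddingView`, `draw`, `on_batch_done`, and `run` are hypothetical names used for illustration.

```python
import random


class EmbeddingView:
    """Hypothetical stand-in for the widget; not the actual Orange code."""

    def __init__(self, n_points):
        self.n_points = n_points
        self.needs_to_draw = False
        self.draw_count = 0
        self.shown = None

    def draw(self, points):
        # Stand-in for an expensive graph redraw.
        self.shown = points
        self.draw_count += 1

    def start(self):
        # The graph needs some data right away, so show random placeholders.
        rng = random.Random(0)
        placeholder = [(rng.random(), rng.random()) for _ in range(self.n_points)]
        self.draw(placeholder)
        # Mark the placeholder as stale so the first real result replaces it.
        self.needs_to_draw = True

    def on_batch_done(self, embedding, finished):
        # Redraw only if the placeholder is still shown or the run is over.
        if self.needs_to_draw or finished:
            self.draw(embedding)
            self.needs_to_draw = False


def run(view, n_batches):
    """Simulate an optimization with n_batches batches of iterations."""
    view.start()
    for i in range(n_batches):
        embedding = [(float(i), float(j)) for j in range(view.n_points)]
        view.on_batch_done(embedding, finished=(i == n_batches - 1))
    return view.draw_count
```

In this sketch a run of ten batches triggers three draws: the random placeholder, the result of the first batch, and the final embedding. The intermediate batches are skipped.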

@codecov

codecov bot commented Feb 11, 2019

Codecov Report

Merging #3592 into master will increase coverage by <.01%.
The diff coverage is 93.33%.

@@            Coverage Diff             @@
##           master    #3592      +/-   ##
==========================================
+ Coverage   83.98%   83.98%   +<.01%     
==========================================
  Files         370      370              
  Lines       66976    66984       +8     
==========================================
+ Hits        56249    56256       +7     
- Misses      10727    10728       +1

@lanzagar lanzagar merged commit 8058443 into biolab:master Feb 11, 2019
@pavlin-policar pavlin-policar deleted the tsne-faster branch February 11, 2019 13:40
pavlin-policar pushed a commit to pavlin-policar/orange3 that referenced this pull request Feb 11, 2019
pavlin-policar pushed a commit to pavlin-policar/orange3 that referenced this pull request Feb 11, 2019