Learning / Tweaking UMAP

Created in part to better understand the algorithm. I tweaked it to be more Pythonic and replaced the nearest-neighbour search with the version suggested in the LargeVis paper (Tang et al., 2016).
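For readers unfamiliar with the LargeVis search, the sketch below illustrates its "neighbor exploring" refinement step: each point's candidate set is its current neighbours plus their neighbours, and the k closest candidates are kept. This is not the code from this repo. The function names (`random_knn`, `neighbor_explore`, `recall`) are made up for illustration, and the starting graph here is purely random, whereas the paper builds it from random projection trees.

```python
import numpy as np


def brute_force_knn(X, k):
    """Exact k nearest neighbours (self excluded), used only as ground truth."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d2, np.inf)
    return np.argsort(d2, axis=1)[:, :k]


def random_knn(n, k, rng):
    """Random starting graph: k distinct neighbours per point, never the point itself.

    Assumption: stands in for the random-projection-tree initialisation of the paper.
    """
    knn = np.empty((n, k), dtype=np.int64)
    for i in range(n):
        c = rng.choice(n - 1, size=k, replace=False)
        knn[i] = c + (c >= i)  # shift indices >= i up by one to skip i
    return knn


def neighbor_explore(X, knn, n_iter=1):
    """Refine a kNN graph: a neighbour of my neighbour is likely my neighbour."""
    n, k = knn.shape
    for _ in range(n_iter):
        new_knn = np.empty_like(knn)
        for i in range(n):
            # Candidates: current neighbours plus all of their neighbours.
            cand = np.unique(np.concatenate([knn[i], knn[knn[i]].ravel()]))
            cand = cand[cand != i]
            d2 = ((X[cand] - X[i]) ** 2).sum(axis=1)
            new_knn[i] = cand[np.argsort(d2)[:k]]
        knn = new_knn
    return knn


def recall(approx, exact):
    """Average fraction of the true neighbours that the approximate graph found."""
    return float(np.mean([len(set(a) & set(e)) / len(e) for a, e in zip(approx, exact)]))


rng = np.random.default_rng(0)
X = rng.normal(size=(300, 10))
exact = brute_force_knn(X, k=10)
start = random_knn(len(X), 10, rng)
refined = neighbor_explore(X, start, n_iter=3)
print("recall before:", recall(start, exact))
print("recall after: ", recall(refined, exact))
```

Because the current neighbours always stay in the candidate set, a refinement pass can only keep or improve each point's list, so recall never goes down from one iteration to the next.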

See the original UMAP repo for the multithreaded, numba-accelerated code. This version is not noticeably slower on small datasets, but at around 10,000 points in 50 dimensions a single-threaded implementation can no longer keep up. I do find that minibatches improve convergence tremendously, so I may come back to this code to parallelize it or port it to the GPU.

McInnes, L., Healy, J., & Melville, J. (2018). UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction. http://arxiv.org/abs/1802.03426

Tang, J., Liu, J., Zhang, M., & Mei, Q. (2016). Visualizing Large-scale and High-dimensional Data. Proceedings of the 25th International Conference on World Wide Web, 287–297. https://doi.org/10.1145/2872427.2883041
