Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve performance with fixedrate=true #47

Open
heliosdrm opened this issue Jan 17, 2019 · 4 comments
Open

Improve performance with fixedrate=true #47

heliosdrm opened this issue Jan 17, 2019 · 4 comments

Comments

@heliosdrm
Copy link
Collaborator

heliosdrm commented Jan 17, 2019

In the calculation of recurrence matrices with fixed recurrence rate there is the issue commented in #46 (indirectly calling distancematrix, which is expensive):

if fixedrate
sfun = (m) -> quantile(m[:], ε)
return recurrence_matrix(x, y, 1; scale=sfun, fixedrate=false, metric=metric)

A function that gives the same result as quantile(distancematrix(x,y), ε) without calling distancematrix would help.


Want to back this issue? Post a bounty on it! We accept bounties via Bountysource.

@heliosdrm
Copy link
Collaborator Author

Hint: the fixed recurrence rate ε is expected to be a small number (order of magnitude 10^-2). I guess that this may help to find a fast method using only a small fraction of the memory taken by the full distance matrix.

@heliosdrm
Copy link
Collaborator Author

This might be a solution: https://discourse.julialang.org/t/package-for-approximate-small-quantiles/20046/2

Drawbacks: possibly heavy dependencies, instability (the Windows build of OnlineStats is persistently failing for 32-bits); in addition the calculation of the quantiles is not exact but approximate, and needs to set a parameter for the approximation, so the implementation is not straightforward.

So, although it is a solution that may deserve to be taken into account, it needs time to be analyzed. So we can leave it for version 2.0 if suitable (it would imply breaking changes in my opinion).

@Datseris
Copy link
Member

Datseris commented Jan 24, 2019

Hm,,,, I must admit I find it difficult justify adding OnlineStats just for this. I'd much rather just document the possibility.

Our heaviest dependency so far is NearestNeighbors which only depends on StaticArrays and is pure Julia (so it is in fact a quite light dependency). I have to say I am very happy with how low we have kept the dependencies on this package. I think this is a great benefit and I vote to keep it this way.

I have been following some discussions in the Julia forums and some parties have made it clear that to install packages they have to pass through a screening of some sorts. And for that case each dependency matters significantly.

@pucicu
Copy link
Contributor

pucicu commented Jan 25, 2019 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants