Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handling Optional dependencies for the analysis module #1159

Closed
kain88-de opened this issue Jan 13, 2017 · 20 comments
Closed

Handling Optional dependencies for the analysis module #1159

kain88-de opened this issue Jan 13, 2017 · 20 comments

Comments

@kain88-de
Copy link
Member

It has come up recently how optional dependencies are treated and how we should deal with dependencies in mda.analysis. The current situation seems to be that require any package that is imported in core or analysis with the exception of a few (matplotlib/scipy/sklearn/netcdf4) which are deemed to be hard to install.

Personally I don't think the installation is a problem anymore today. Either the packages are in the distribution package manager or you can use conda or even pip which installs wheels so no compilation should be needed. A short google search also doesn't list any installation problem posts/blogs on the web that are younger then 2014.

I would say that the if one installed MDAnalysis everything should just work (that happens with the conda installation). This will also mean we can get rid of the optional imports and include guards.

I don't like the current approach. Where the library will work 90% but I may not be able to use a feature in the moment when I really need it but have to internet connection. It's also not really clear to a user what parts won't work. The best we do right now is to give warnings when an analysis module is imported.

I would like to hear other opinions as well.

@jbarnoud
Copy link
Contributor

I do have mixed feeling on the question. On one hand, I like the battery included view you are pushing for. On the other end, having dependencies for analyses automatically added to the main dependencies means we have to be more careful about what analyses we accept in the library.

For instance, ENCORE was the first module to introduce a dependency to sklearn. Because sklearn was just an optional requirement, we did not really bothered about it. Would it be introduced as a main requirement for the library, we would have had to weight it the new dependency.

@richardjgowers
Copy link
Member

Yeah I was thinking the same as @jbarnoud , it's nice to be able to import whatever in analysis and not worry about what it is because it's only optional.

Maybe we can reassess what's considered "simple" to find/install and expand the whitelist of what's required, something like either the anaconda package list or the scipy stack

@kain88-de
Copy link
Member Author

Would it be introduced as a main requirement for the library, we would have had to weight it the new dependency.

I understand that. But with packages from the standard scientific python stack I'm pretty OK. Odds are one either already uses them or has to use them in the future anyway.

I'm mostly put off by our arbitrary decision to say that some dependencies are hard to install and should be optional and others are just fine (even though they have dependencies of their own that might pull in other libraries, see griddataformats eg #1147)

@kain88-de
Copy link
Member Author

This has also come up in #1145. Using a batteries included approach I wouldn't need to add joblib as a dependency. With the current approach it would be better to require joblib (pure python) to use 95% of encore and say if sklearn is installed you can use the last 5% as well.

@orbeckst
Copy link
Member

A fundamental problem is that of dependencies of dependencies as in gridDataFormats. I agree with @kain88-de that it's not straightforward to assess what should count as simple. (gridDataFormats has essentially been grand-fathered in... and I am not overly happy about its scipy dependency but that's a different thing.)

My gut feeling was summarized by @jbarnoud #1159 (comment) – but maybe that really harks back to the olden pre-whl/pre-conda days.

@kain88-de
Copy link
Member Author

As a clear set of rules for development how about the following if a dependency should be optional?

  • pure python packages are fine
  • packages in the anaconda distribution are fine
  • packages with compiled content should be made optional

@jbarnoud
Copy link
Contributor

I would be fine with what @kain88-de suggests. Like @orbeckst I mostly form my opinion on dependencies pre-wheel and lot of things are easier now.

As we account more and more for anaconda when making decisions, I would just like us to keep in mind that not everybody uses conda and that it shall not become mandatory.

@kain88-de
Copy link
Member Author

We could also say that all packages in the scipy-stack are fine. That are less but they include the most prominent packages that one will use (matplotlib / scipy / scikits). See https://www.scipy.org/about.html

@orbeckst
Copy link
Member

The scipy-stack is not really that well defined: the core packages are clear enough

  • python
  • numpy
  • scipy
  • matplotlib
  • pandas
  • sympy
  • ipython
  • nose

but the other packages are "too many to list here". Although h5py, pytables, cython, scikit-image, scikit-learn all are well established and (I think) fairly widely used, I am less sure about chaco and mayavi (which is cool, but depends on VTK and is more niche).

I think we can easily agree on the core packages being dependencies for the analysis module.

For the rest I don't really see the SciPy: other packages giving very useful guidance: ultimately we have to decide what we will consider canonical analysis modules.

@jbarnoud
Copy link
Contributor

In any case, that means that scipy is an acceptable main dependancy, doesn't it? Shall we upgrade it from the full to the minimal install?

@orbeckst
Copy link
Member

In my opinion, we can upgrade scipy and matplotlib to full dependencies.

matplotlib is a bit of an odd case due to some of the convenience plotting functions; perhaps at some point we decide to get rid of these convenience functions but until then we are probably serving our users (and downstream integrators #1361 ) better if we included matplotlib.

@orbeckst
Copy link
Member

@MDAnalysis/coredevs can we decide to upgrade the following packages to required dependencies (this will solve #1361):

  • scipy
  • matplotlib

We can decide on other packages to add later.

Please vote with 👍 or 👎 .

@orbeckst
Copy link
Member

@richardjgowers can you explain your 😕 "confused" comment?

@richardjgowers
Copy link
Member

@orbeckst just 50/50 about it, so I've read it and have no strong opinion

@jbarnoud
Copy link
Contributor

I agree for scipy. But where is matplotlib used?

@orbeckst
Copy link
Member

But where is matplotlib used?

git grep matplotlib:

  • analysis/hole.py
  • analysis/legacy/x3dna.py
  • analysis/psa.py
  • visualization/streamlines.py

All four do somewhat complicated/idiosyncratic plotting.

Perhaps we can get rid of these plotting functions in the future or better shield them from top-level imports but until then I would take the path of least resistance and just include matplotlib, given that it is without a doubt a core scientific python stack package and because it does install easily with whl.

@jbarnoud
Copy link
Contributor

Good for me then.

@mnmelo
Copy link
Member

mnmelo commented Jun 14, 2017

Also, I didn't forget that idea of adapting peak.util.imports for lazy/optional importing. Should have a PR soon.

@orbeckst
Copy link
Member

For right now I am preparing a PR that upgrades scipy and matplotlib to full dependencies. That does not preclude any other solutions further down the line although I am pretty confident that we will keep scipy as a dependency.

@orbeckst orbeckst self-assigned this Jun 16, 2017
@orbeckst
Copy link
Member

I updated the List of core module dependencies (and removed any mentioning of scipy as optional).

orbeckst added a commit that referenced this issue Jun 16, 2017
- updated all modules
- removed any code that guards against scipy or matplotlib import
- conforms to style guide https://github.com/MDAnalysis/mdanalysis/wiki/Style-Guide#module-imports-in-mdanalysisanalysis
- fixes #1159
- fixes #1361
orbeckst added a commit that referenced this issue Jun 16, 2017
- updated all modules
- removed any code that guards against scipy or matplotlib import
- conforms to style guide https://github.com/MDAnalysis/mdanalysis/wiki/Style-Guide#module-imports-in-mdanalysisanalysis
- fixes #1159
- fixes #1361
kain88-de pushed a commit that referenced this issue Jun 17, 2017
- updated all modules
- removed any code that guards against scipy or matplotlib import
- conforms to style guide https://github.com/MDAnalysis/mdanalysis/wiki/Style-Guide#module-imports-in-mdanalysisanalysis
- fixes #1159
- fixes #1361
orbeckst added a commit that referenced this issue Jun 18, 2017
- require scipy and matplotlib in setup.py (fixes #1361)
- scipy and matplotlib are imported at top in analysis
  - updated all modules
  - removed any code that guards against scipy or matplotlib import
  - conforms to style guide https://github.com/MDAnalysis/mdanalysis/wiki/Style-Guide#module-imports-in-mdanalysisanalysis
  - fixes #1159
- removed conditional skipping of tests when scipy or matplotlib are missing
@orbeckst orbeckst mentioned this issue Jun 20, 2017
11 tasks
kain88-de pushed a commit that referenced this issue Jun 21, 2017
- updated all modules
- removed any code that guards against scipy or matplotlib import
- conforms to style guide https://github.com/MDAnalysis/mdanalysis/wiki/Style-Guide#module-imports-in-mdanalysisanalysis
- fixes #1159
- fixes #1361
kain88-de pushed a commit that referenced this issue Jun 21, 2017
- updated all modules
- removed any code that guards against scipy or matplotlib import
- conforms to style guide https://github.com/MDAnalysis/mdanalysis/wiki/Style-Guide#module-imports-in-mdanalysisanalysis
- fixes #1159
- fixes #1361
mnmelo added a commit that referenced this issue Jun 21, 2017
This now addresses issues #577 (facultative imports), #1361 (cleaner
optional dependencies) and #1159 (optional dependencies in the analysis
module)
kain88-de pushed a commit that referenced this issue Jun 22, 2017
- updated all modules
- removed any code that guards against scipy or matplotlib import
- conforms to style guide https://github.com/MDAnalysis/mdanalysis/wiki/Style-Guide#module-imports-in-mdanalysisanalysis
- fixes #1159
- fixes #1361
kain88-de added a commit that referenced this issue Jun 22, 2017
make scipy and matplotlib required dependencies (#1159)
utkbansal pushed a commit to utkbansal/mdanalysis that referenced this issue Jun 24, 2017
utkbansal pushed a commit to utkbansal/mdanalysis that referenced this issue Jun 24, 2017
- updated all modules
- removed any code that guards against scipy or matplotlib import
- conforms to style guide https://github.com/MDAnalysis/mdanalysis/wiki/Style-Guide#module-imports-in-mdanalysisanalysis
- fixes MDAnalysis#1159
- fixes MDAnalysis#1361
utkbansal pushed a commit to utkbansal/mdanalysis that referenced this issue Jun 24, 2017
utkbansal pushed a commit to utkbansal/mdanalysis that referenced this issue Jun 24, 2017
- updated all modules
- removed any code that guards against scipy or matplotlib import
- conforms to style guide https://github.com/MDAnalysis/mdanalysis/wiki/Style-Guide#module-imports-in-mdanalysisanalysis
- fixes MDAnalysis#1159
- fixes MDAnalysis#1361
utkbansal pushed a commit to utkbansal/mdanalysis that referenced this issue Jun 24, 2017
Converted test_units to use parametrize

Some hacking of test_distances to use pytest features

Completed moving test_distances to pytest style

rewrote test_transformations into pytest style

make scipy and matplotlib full dependencies (MDAnalysis#1159)

scipy and matplotlib are imported at top in analysis

- updated all modules
- removed any code that guards against scipy or matplotlib import
- conforms to style guide https://github.com/MDAnalysis/mdanalysis/wiki/Style-Guide#module-imports-in-mdanalysisanalysis
- fixes MDAnalysis#1159
- fixes MDAnalysis#1361

removed conditional skipping of tests when scipy or matplotlib are missing

minor clean ups
utkbansal pushed a commit to utkbansal/mdanalysis that referenced this issue Jun 24, 2017
Converted test_units to use parametrize

Some hacking of test_distances to use pytest features

Completed moving test_distances to pytest style

rewrote test_transformations into pytest style

make scipy and matplotlib full dependencies (MDAnalysis#1159)

scipy and matplotlib are imported at top in analysis

- updated all modules
- removed any code that guards against scipy or matplotlib import
- conforms to style guide https://github.com/MDAnalysis/mdanalysis/wiki/Style-Guide#module-imports-in-mdanalysisanalysis
- fixes MDAnalysis#1159
- fixes MDAnalysis#1361

removed conditional skipping of tests when scipy or matplotlib are missing

minor clean ups
utkbansal pushed a commit to utkbansal/mdanalysis that referenced this issue Jun 24, 2017
Converted test_units to use parametrize

Some hacking of test_distances to use pytest features

Completed moving test_distances to pytest style

rewrote test_transformations into pytest style

make scipy and matplotlib full dependencies (MDAnalysis#1159)

scipy and matplotlib are imported at top in analysis

- updated all modules
- removed any code that guards against scipy or matplotlib import
- conforms to style guide https://github.com/MDAnalysis/mdanalysis/wiki/Style-Guide#module-imports-in-mdanalysisanalysis
- fixes MDAnalysis#1159
- fixes MDAnalysis#1361

removed conditional skipping of tests when scipy or matplotlib are missing

minor clean ups
PicoCentauri pushed a commit to PicoCentauri/mdanalysis that referenced this issue Jan 22, 2019
PicoCentauri pushed a commit to PicoCentauri/mdanalysis that referenced this issue Jan 22, 2019
- updated all modules
- removed any code that guards against scipy or matplotlib import
- conforms to style guide https://github.com/MDAnalysis/mdanalysis/wiki/Style-Guide#module-imports-in-mdanalysisanalysis
- fixes MDAnalysis#1159
- fixes MDAnalysis#1361
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants