Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

quick proof of concept of down #3802

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

quick proof of concept of down #3802

wants to merge 1 commit into from

Conversation

KristofferC
Copy link
Member

@KristofferC KristofferC commented Feb 19, 2024

Very quick implementation of #1062 (comment), (with some debug logging in it to show what happens), specifically:

On the other hand, I wonder whether it would be best to simply call the solver with the versions of the explicitly required dependencies fixed to their declared lower bounds. In my mind, that would be simpler, safer, and more precise.
There would be no need to further increase the complexity of the resolver code, and it would actually test that the declared lower bounds work. After all, if the resolver finds that it cannot set a package version to its declared lower bound due to an incompatibility with some other required package, then the lower bound should be raised, no? Or am I missing something?

An example of its usage:

(PGFPlotsX) pkg> st
Project PGFPlotsX v1.6.1
...

julia> Pkg.API.down()
[ Info: Fixing OrderedCollections to version 1.4.0
[ Info: Fixing Tables to version 1.0.0
[ Info: Fixing DefaultApplication to version 1.0.0
[ Info: Fixing Parameters to version 0.12.0
[ Info: Fixing ArgCheck to version 1.0.0
[ Info: Fixing DocStringExtensions to version 0.8.0
[ Info: Fixing Requires to version 0.5.0
[ Info: Fixing MacroTools to version 0.5.0

    Updating `~/JuliaPkgs/PGFPlotsX.jl/Project.toml`
⌃ [dce04be8] ↓ ArgCheck v2.3.0 ⇒ v1.0.0
⌃ [3f0dd361] ↓ DefaultApplication v1.1.0 ⇒ v1.0.0
⌃ [ffbed154] ↓ DocStringExtensions v0.9.3 ⇒ v0.8.0
⌃ [1914dd2f] ↓ MacroTools v0.5.13 ⇒ v0.5.0
⌃ [bac558e1] ↓ OrderedCollections v1.6.3 ⇒ v1.4.0
⌃ [d96e819e] ↓ Parameters v0.12.3 ⇒ v0.12.0
⌃ [ae029012] ↓ Requires v1.3.0 ⇒ v0.5.0
⌃ [bd369af6] ↓ Tables v1.11.1 ⇒ v1.0.0
    Updating `~/JuliaPkgs/PGFPlotsX.jl/Manifest.toml`
⌃ [dce04be8] ↓ ArgCheck v2.3.0 ⇒ v1.0.0
⌅ [00ebfdb7] + CSTParser v0.6.2
⌅ [34da2185] + Compat v2.2.1
⌅ [864edb3b] + DataStructures v0.17.20
⌃ [3f0dd361] ↓ DefaultApplication v1.1.0 ⇒ v1.0.0
  [8bb1440f] + DelimitedFiles v1.9.1
⌃ [ffbed154] ↓ DocStringExtensions v0.9.3 ⇒ v0.8.0
⌃ [1914dd2f] ↓ MacroTools v0.5.13 ⇒ v0.5.0
⌃ [bac558e1] ↓ OrderedCollections v1.6.3 ⇒ v1.4.0
⌃ [d96e819e] ↓ Parameters v0.12.3 ⇒ v0.12.0
⌃ [ae029012] ↓ Requires v1.3.0 ⇒ v0.5.0
  [10745b16] + Statistics v1.11.1
⌃ [bd369af6] ↓ Tables v1.11.1 ⇒ v1.0.0
  [0796e94c] + Tokenize v0.5.28
  [3a884ed6] - UnPack v1.0.2

Trying this on Plots.jl we get resolver errors (as @oxinabox predicted) due to Scratch.jl being lower bounded to 1.0.0 while one of our dependencies (RelocatableFolders has a lower bound of 1.1):

pkg> activate ../Plots.jl/
  Activating project at `~/JuliaPkgs/Plots.jl`

julia> Pkg.API.down()
[ Info: Fixing Showoff to version 0.3.1
[ Info: Fixing GR to version 0.69.5
[ Info: Fixing Statistics to version 1.11.0
[ Info: Fixing FixedPointNumbers to version 0.6.0
[ Info: Fixing JLFzf to version 0.1.0
[ Info: Fixing PrecompileTools to version 1.0.0
[ Info: Fixing Unzip to version 0.1.0
[ Info: Fixing UnitfulLatexify to version 1.0.0
[ Info: Fixing RecipesPipeline to version 0.6.10
[ Info: Fixing LaTeXStrings to version 1.0.1
[ Info: Fixing PlotUtils to version 1.0.0
[ Info: Fixing JSON to version 0.21.0
[ Info: Fixing StatsBase to version 0.33.0
[ Info: Fixing RelocatableFolders to version 0.3.0
[ Info: Fixing Scratch to version 1.0.0
[ Info: Fixing Latexify to version 0.14.0
[ Info: Fixing FFMPEG to version 0.2.0
[ Info: Fixing Measures to version 0.3.0
[ Info: Fixing RecipesBase to version 1.3.1
[ Info: Fixing UnicodeFun to version 0.4.0
[ Info: Fixing PlotThemes to version 2.0.0
[ Info: Fixing Contour to version 0.5.0
[ Info: Fixing Reexport to version 0.2.0
[ Info: Fixing Requires to version 1.0.0
[ Info: Fixing NaNMath to version 0.3.2
ERROR: Unsatisfiable requirements detected for package Scratch [6c6a2e73]:
 Scratch [6c6a2e73] log:
 ├─possible versions are: 1.0.0 - 1.2.1 or uninstalled
 ├─restricted to versions 1 by Plots [91a5bcdd], leaving only versions: 1.0.0 - 1.2.1
 │ └─Plots [91a5bcdd] log:
 │   ├─possible versions are: 1.40.1 or uninstalled
 │   └─Plots [91a5bcdd] is fixed to version 1.40.1
 ├─restricted to versions 1.0.0 by an explicit requirement, leaving only versions: 1.0.0
 └─restricted by compatibility requirements with RelocatableFolders [05181044] to versions: 1.1.0 - 1.2.1 — no versions left
   └─RelocatableFolders [05181044] log:
     ├─possible versions are: 0.1.0 - 1.0.1 or uninstalled
     ├─restricted to versions [0.3, 1] by Plots [91a5bcdd], leaving only versions: 0.3.0 - 1.0.1
     │ └─Plots [91a5bcdd] log: see above
     └─restricted to versions 0.3.0 by an explicit requirement, leaving only versions: 0.3.0

cc @carlobaldassi, @oxinabox

@KristofferC
Copy link
Member Author

Also cc @ChrisRackauckas who has thought about these stuff.

@MasonProtter
Copy link
Contributor

MasonProtter commented Feb 19, 2024

Trying this on Plots.jl we get resolver errors (as @oxinabox predicted) due to Scratch.jl being lower bounded to 1.0.0 while one of our dependencies (RelocatableFolders has a lower bound of 1.1):

Hm, shouldn't there be a way to work around that though? Like if we flip the situation and look at up, if Plots.jl upper bounded Scratch.jl to 1.0.0, and RelocatableFolders upper bounded it to 1.1, then the resolver would just install it at version 1.0.0 instead of erroring right?

I might be missing something, but I think it should be possible for down to be fully symmetric with up.

@KristofferC
Copy link
Member Author

KristofferC commented Feb 19, 2024

I might be missing something, but I think it should be possible for down to be fully symmetric with up.

Yes, but as I wrote in #1062 (comment), the whole reason to do this is to exactly test that your code works with the given lower bounds. So if you aren't hitting them exactly you are just testing some semi-random version configuration that have low versions "on average". Like in this case, if we install Scratch to 1.1, is the Scratch 1.0 lower bounds correct? You cannot really say anything about that but that is the actual question we wanted to answer by running our tests with down.

I guess you could say that you could in theory minimize the version for one dependency at a time and run the tests for that but that would:

  1. Take forever
  2. Be unclear if the result of that would be representative for running with something where all dependencies have low versions at the same time,

@MasonProtter
Copy link
Contributor

MasonProtter commented Feb 19, 2024

Ah okay, sorry I missed that comment, so this is a (nicer implementation of) the idea in https://github.com/julia-actions/julia-downgrade-compat

I'll say that I didn't really find downgrade-compat very useful for packages with many deps because

  1. the complaints from the resolver when it fails are indecipheralbe and very hard to turn into something actionable if you're not an expert

  2. If the lower bounds of your package isn't actually installable due to being un-resolvable, then I don't see why that's actually problematic. It's not a state a user would end up in and so it doesn't really need to be tested. What I do want to be able to test is the oldest set of compatible packages.

@KristofferC
Copy link
Member Author

KristofferC commented Feb 19, 2024

What I do want to be able to test is the oldest set of compatible packages.

Okay, I think this can be extended to do that. I will try...

Edit: Thinking about this some more, I think it requires basically a full resolver since there is for example not a unique answer to this question of the "oldest set of compatible packages".

@MasonProtter
Copy link
Contributor

MasonProtter commented Feb 19, 2024

Edit: Thinking about this some more, I think it requires basically a full resolver since there is for example not a unique answer to this question of the "oldest set of compatible packages".

Yeah, I was imagining the exact same behaviour as up, but instead of seeking newer versions it seeks older versions. It won't catch everything since the set of versions isn't unique, but that seems "good enough" for trying to keep track of appropriate lower bounds, and much less annoying than having the resolver simply error out because of a incompatible set of lower bounds (at least currently, since it is so hard to know how to fix the resolver issues when the lower bounds aren't compatible)

@MasonProtter
Copy link
Contributor

Afterall, we also think it's okay that when I ]test a package with a given set of upper bounds (either explicit or implicit), that we don't end up testing the unique newest set of packages, and it'd be really aggravating if the resolver threw an error just because it wasn't able to install the maximally newest compatible version of every dependency.

@KristofferC
Copy link
Member Author

Yeah, I was imagining the exact same behaviour as up, but instead of seeking newer versions it seeks older versions.

Again, I am not sure that is useful in answering the question of if our lower bounds are correct. If you run the tests with this version of down you might have the following situation:

[compat]
A = "0.5.1"
B = "0.7.1"

and you get when testing A=v0.5.1 and B=v0.7.3. However, you could in another environment end up with another pareto optimal, say A=v0.5.3 and B=v0.7.1 (the resolver is free to give any pareto optimal resolution). So the fact that we have multiple pareto optimal means that our testing will be insufficient to say that our lower bounds are correct.

@KristofferC
Copy link
Member Author

KristofferC commented Feb 19, 2024

Afterall, we also think it's okay that when I ]test a package with a given set of upper bounds (either explicit or implicit), that we don't end up testing the unique newest set of packages,

I think this is different because when we are normally testing our package we are not explicitly testing that our upper bounds are correct because these are talking about an unknown future which can only be discussed from a "contract" p.o.v like semver. That's why I think that this is not symmetric.

For normal testing we are saying:

  • I want to get a set of compatible packages. If people use semver properly other pareto optimal points will also work fine (in any other environment) since they shouldn't break old code. If my code starts breaking it is other packages' fault.

For down testing we are saying:

  • I am relying on actual features that were introduced in these lower bounds. If I get this lower bound wrong my code has a chance to not work in some environment and that would be my own fault for getting the set of features / bug fixes I need wrong. In order to test this I actually need to run my code with these concrete lowest versions.

@MasonProtter
Copy link
Contributor

MasonProtter commented Feb 19, 2024

That's fair.

I guess though I feel that having a ]down that acts like ]up at least seems like an easier path forward to getting people (imperfectly) testing their lower bounds than having something very strict and fussy.

If we do go with something as strict and fussy as pinning all the lower bounds, then I think nobody is going to use it unless we get much more informative and actionable error messages than we currently have, but making those error messages informative and actionable seems like a very difficult task. I don't know enough about this to have a strong opinion though.

@KristofferC
Copy link
Member Author

KristofferC commented Feb 19, 2024

I'm not sure if the error messages will be so bad. They will basically always be of the form that one of the direct dependencies limits another of the direct dependencies to something higher than the lower bounds that the package itself sets on it. For example in the Plots example the relevant stuff are:

ERROR: Unsatisfiable requirements detected for package Scratch [6c6a2e73]:
 Scratch [6c6a2e73] log:
 ├─possible versions are: 1.0.0 - 1.2.1 or uninstalled
 ├─restricted to versions 1.0.0 by an explicit requirement, leaving only versions: 1.0.0
 └─restricted by compatibility requirements with RelocatableFolders [05181044] to versions: 1.1.0 - 1.2.1 — no versions left

@MasonProtter
Copy link
Contributor

That said, maybe with this implementation it's not so critical that the error messages are good since the feedback loop is so much faster than with the downgrade-ci. Looking back, the main reason I gave up after trying to use it was that I'd need to go and semi-blindly adjust my bounds, run the ci process, get a new error, adjust them again, get a new error, etc.

The process here would be fundamentally the same, but in practice not so painful because I just need the Project.toml file open and a repl, and I can just tweak the bounds and continue hitting ]down until I get something resolveable...

I'm not sure if the error messages will be so bad. They will basically always be of the form that one of the direct dependencies limits another of the direct dependencies to something higher than the lower bounds that the package itself sets on it. For example in the Plots example the relevant stuff are:

ERROR: Unsatisfiable requirements detected for package Scratch [6c6a2e73]:
 Scratch [6c6a2e73] log:
 ├─possible versions are: 1.0.0 - 1.2.1 or uninstalled
 ├─restricted to versions 1.0.0 by an explicit requirement, leaving only versions: 1.0.0
 └─restricted by compatibility requirements with RelocatableFolders [05181044] to versions: 1.1.0 - 1.2.1 — no versions left

In my experience, I couldn't figure out what was going on. Sure, it's not so bad when direct dependancies start getting angry at eachother, but once you have nested indirect dependancies fighting, I get thoroughly confused. But as said above, maybe it's not so bad with this implementation.

@KristofferC
Copy link
Member Author

In https://github.com/julia-actions/julia-downgrade-compat, why does it set ~ on the post 1.0 pacakge:

This action will modify the compat to:

[compat]
julia = "1.6"
Foo = "~1.2.3"
Bar = "=0.1.2"

What if you require a bugfix in Foo 1.2.4 (so the Foo = "1.2.3" entry is incorrect).

@MasonProtter
Copy link
Contributor

MasonProtter commented Feb 19, 2024

You can set which mode it uses with a command line arg: https://github.com/julia-actions/julia-downgrade-compat/blob/main/downgrade.jl#L41-L47, but yeah I can't comment on the specific choices made there.

@aplavin
Copy link
Contributor

aplavin commented Feb 19, 2024

Just my 2c: I would definitely expect down to be just like up, but in the opposite direction. That is, prefer oldest possible versions, but otherwise always resolve the env successfully – just like up does.

More strict behavior could be enabled by a switch similar to the existing test(force_latest_compatible_version=true). This would make everything nice an consistent.

@KristofferC
Copy link
Member Author

Ok you expect this but why would it be useful?

@carlobaldassi
Copy link
Member

I guess that the name down will (rightfully) generate the expectation of an opposite operation to up. Maybe something like pin_down?

@3f6a
Copy link

3f6a commented Feb 20, 2024

but I think it should be possible for down to be fully symmetric with up.

That can't be true in general (unless I'm misunderstanding precisely what you mean). E.g.,

[compat]
PkgA = 1.2.3

Means I'm compatible with v1.2.4, v1.2.5, ..., v1.3.1, ..., but I'm incompatible with v1.2.2 or anything lower. So, it's not symmetric, right?

@3f6a
Copy link

3f6a commented Feb 20, 2024

I guess that the name down will (rightfully) generate the expectation of an opposite operation to up. Maybe something like pin_down?

Since this is intended to be used only for testing, why not make that explicit in the name? E.g., pkg> test down, or pkg> downgrade test ?

@MasonProtter
Copy link
Contributor

MasonProtter commented Feb 20, 2024

Ok you expect this but why would it be useful?

Having a less fussy ]down that gives a reasonable set of old versions I can test on without having to comb through and meticulously purge my Project.toml of incompatible lower bounds would give most of the testing utility of the more fussy ]bottom or whatever, while wasting less of my time playing compat-bound whack-a-mole.

I definitely see the value of an ]bottom functionality, but for the most part I'd honestly just rather use a ]down that's symmetric with ]up.

@MasonProtter
Copy link
Contributor

That can't be true in general (unless I'm misunderstanding precisely what you mean). E.g.,

[compat]
PkgA = 1.2.3

Means I'm compatible with v1.2.4, v1.2.5, ..., v1.3.1, ..., but I'm incompatible with v1.2.2 or anything lower. So, it's not symmetric, right?

Where is the asymmetry you're pointing to? There's only a finite number of versions of any package in the general registry, so you can't end up with v1.2.∞. That means there is some N where the versions are bounded between v1.2.3 and v1.2.N.

Kristoffer is right in that there's an asymmetry induced by the fact that new features can only appear in one direction, so it's good to have a function that demands the absolute oldest version set, whereas it makes less sense to demand the absolute newest version set, but even then I have some doubts.

You could say that it's important that you get the absolute newest version set if you want to test for potentially breaking changes in your dependencies. But we're okay with testing on just a reasonably new set of compatible packages, so I'd also be okay with testing on a reasonably old set.

@KristofferC
Copy link
Member Author

Having a less fussy ]down that gives a reasonable set of old versions I can test on without having to comb through and meticulously purge my Project.toml of incompatible lower bounds would give most of the testing utility..

I disagree, you would have failed to verify your lower bounds so even if your tests pass your lower bounds can still be buggy and give runtime errors for users (which is what you wanted to avoid in the first place).

@oxinabox
Copy link
Contributor

oxinabox commented Feb 20, 2024

Yes, but perfect can be the enemy of good.

There are also other positions in the compat interval not tested.
Like if you declare compat for 1,2,3 but actually v2 removed a feature that v1 had and that v3 brought back.
Testing both top and bottom isn't going to be enough.

Testing (potential one of multiple) lowest possible version that can be achieved in practice -- that a user might actually see,
seems like a good point in the trade-off space of time spent chasing dependencies (which becomes zero as it isn't required to transcribe stuff from your dependency's lower bounded lower bounds wherever higher than your own) vs coverage of potential mistakes and problems.

@adrhill
Copy link

adrhill commented Jul 25, 2024

Again, I am not sure that is useful in answering the question of if our lower bounds are correct.

I understand the argument that ideally, we should test every point of the "Pareto front" of resolvable lower bounds. But realistically speaking, this front can be very large and very expensive to test.

As far as I can tell, julia-downgrade-compat only works if the exact lower bounds specified in the compat entries are resolvable. This is equivalent to the "Pareto front" being a singular point.

From the perspective of a package developer with multiple dependencies (and possibly hundreds of indirect dependencies), this means that julia-downgrade-compat doesn't work out of the box. Developers must first manually set a resolvable set of lower compat entries. This manual resolving must be done by developers every time a lower bound of a compat entry is updated.

I therefore have to agree with @oxinabox on "perfect can be the enemy of good".
Personally, I would be happy with either of these options:

  1. Have down resolve an arbitrary lower bound environment in the same way that up resolves a somewhat arbitrary upper bound environment. Being able to test a single point on the "Pareto front" is already better than not testing any point at all.
  2. Have down resolve an arbitrary lower bound environment and update the lower-bound compat entries such that the "Pareto front" is reduced to a single point and can be fully tested in CI. This is most likely overly restrictive, but it is what package developers currently have to do manually.

@KristofferC
Copy link
Member Author

KristofferC commented Oct 21, 2024

ideally, we should test every point of the "Pareto front" of resolvable lower bounds

That is not needed, you just want to test your lower bounds here and with the assumption that packages follow semver you are good to go (unless you try to support multiple breaking releases).

From the perspective of a package developer with multiple dependencies (and possibly hundreds of indirect dependencies), this means that julia-downgrade-compat doesn't work out of the box.

It works, you just have to bump the lower bounds so they agree with your dependencies so you can actually run your tests with the lower bounds you claim to support.

@adrhill
Copy link

adrhill commented Oct 21, 2024

It works, you just have to bump the lower bounds so they agree with your dependencies so you can actually run your tests with the lower bounds you claim to support.

In theory you are right, but in practice, this manual bisection can take hours – I've tried and given up before.
And this soul-crushing task must be repeated every time one of the compat entries has to be changed, e.g. due to a new breaking release in one of the dependencies.

@KristofferC
Copy link
Member Author

It could maybe be automated somehow?

@aplavin
Copy link
Contributor

aplavin commented Oct 21, 2024

must be repeated every time one of the compat entries has to be changed

Not just that: a new version of a dependency can restrict versions of its dependencies further. So, a test run that relies on the lowest version from direct compats being instantiable can become failing at any time whenever a dependency is updated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants