Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enhance requires with version information from the build root. #2372

Draft
wants to merge 14 commits into
base: master
Choose a base branch
from

Conversation

gordonmessmer
Copy link
Contributor

For shared objects without versioned symbols, this additional information allows rpm to track minor-version dependencies.

The initial version of this PR updates the external dependency generator, so it will only affect packages that set %global _use_internal_dependency_generator 0, so it's mostly here to facilitate discussion and testing.

@hroncok
Copy link
Contributor

hroncok commented Jan 29, 2023

I just want to mention that:

  • calling rpm from within a generator has always been considered a bad thing to do
  • rpm -q --whatprovides may return multiple things -- it might be safer to rpm -qf the actual .so file

scripts/enhance-requires Outdated Show resolved Hide resolved
scripts/enhance-requires Outdated Show resolved Hide resolved
scripts/enhance-requires Outdated Show resolved Hide resolved
@gordonmessmer
Copy link
Contributor Author

calling rpm from within a generator has always been considered a bad thing to do

Understood. I don't think this PR is the right way to solve this problem. It's just a quick mock-up; I expect to throw it away and do something "the right way" later.

My hope is that we can discuss whether rpm should attempt to provide minor-version information for dependencies that don't provide versioned symbols.

@keszybz
Copy link
Contributor

keszybz commented Jan 29, 2023

A different approach, which wouldn't require calling into rpm and dealing with multiple rpms, would be to:

  1. change the existing Provides generator for SONAME to use a version: libcurl.so.4()(64bit)libcurl.so.4()(64bit) = 4.8.0. I think this would be backwards compatible: because the Provides was unversioned, all existing Requires on that provides must also be unversioned, and would still be satisfied after we add a version. The generator works by examining a file and extracting the SONAME, and it could look for a .so.<major>.<minor>.<patch> suffix on the file and use that, if found, for the version.
  2. change the existing generator for Requires to do essentially the same in reverse: right now it just uses the stored SONAME, but it could also resolve the SONAME in the file system (like ldd does), and add a versioned requires. (libcurl.so.4()(64bit)libcurl.so.4()(64bit) >= 4.8.0).

This would have the following advantages:

  • it operates entirely on the file system, doesn't use rpm database
  • the logic is similar to how we handle SONAME with a level of indirection
  • the requirement is actually better, more precise. Instead of tying to the name of the providing package, we actually tie to the library name. If the library is moved to a different package, or there are multiple packages which have compatible libraries, thing will just work.

@gordonmessmer
Copy link
Contributor Author

Looking at the contents of /usr/lib64 on my local system, I see some cases that look like developers not versioning their so at all (e.g. libasprintf.so.0.0.0, and 144 others). But that's about 7% of all shared object files.

Generally: that sounds like it'd be a major improvement, with coverage of > 90% of a typical system's dependencies. I like it, but I would hope that we could try to address the remainder, further up the stack in Fedora. Possibly: either tooling to detect shared objects that don't increment even their so version and raise an issue with the upstream project, or by modifying dnf so that its default behavior is to update dependencies on update or install transactions.

@keszybz
Copy link
Contributor

keszybz commented Jan 30, 2023

Oh, one more caveat: if the shared library uses symbol versioning, exclude it from the new generator.
This is important because of at least one important package: glibc. Without this, every build would depend on the latest version of glibc. glibc spends a lot of effort making sure that this is not needed, and this spurious dependency would be very annoying to users. The same argument applies to other cases, e.g. systemd-libs provides libsystemd.so.0, which is linked-to by many packages, and libsystemd uses symbol versioning and maintains strict backwards-compat. But there's a strict version dependency between systemd-libs and systemd, and people would be very annoyed if updates of unrelated packages also forced systemd updates.

modifying dnf so that its default behavior is to update dependencies on update or install transactions

I'd be very much enforcing this as a general rule. Maintainers are supposed to express package dependencies, so that updates and fresh installs "just work". If we started enforcing automatic upgrades of all dependencies even without package requirements would mean that routine installs of leaf packages would pull in updates of hundreds of other packages. In many scenarios this would be very annoying and constraining.

I think in the particular case of shared libraries, maintainers expect that the built-in generators handle this for them, but in reality those generators only works for some subset of libraries. If you can make it work for another (large) subset of packages, things will just improve overall. There might be some cases which are not handled, but I think it's better to handle the 90% in a way that has a good technical underpinning, than try for a more questionable approach in an attempt to cover "everything".

@pmatilai
Copy link
Member

pmatilai commented Jan 30, 2023

add a versioned requires. (libcurl.so.4()(64bit) → libcurl.so.4()(64bit) >= 4.8.0).

The whole point of sonames is that it's NOT tied to versions but the soname abstraction, and this kind of dependency breaks that point.

@gordonmessmer
Copy link
Contributor Author

Oh, one more caveat: if the shared library uses symbol versioning, exclude it from the new generator.

Understood. This mock-up only applies the externally-enhanced version info to shared objects that don't provide versioned symbols, and any future work would do the same.

@gordonmessmer
Copy link
Contributor Author

gordonmessmer commented Jan 30, 2023

The whole point of sonames is that it's NOT tied to versions but the soname abstraction, and this kind of dependency breaks that point.

I think "the point" is that binaries should not need to re-link when a shared object is updated with a backward-compatible version.

However, both the soname abstraction and rpm's current handling of it are currently slightly broken, in that there is a real existent dependency on a minimum version within the major version expressed by the soname abstraction, which rpm doesn't detect at all, and which ld.so detects by reporting unresolved symbols at startup and failing to launch a binary.

Because rpm currently only tracks the major-version of the soname, it's possible for rpm (and dnf) to install a package set that doesn't actually satisfy the underlying dependencies. I provided an example in https://lists.fedoraproject.org/archives/list/[email protected]/thread/WE5UTPZ6EIMTSNUX6ILODAHRYNH6J6TM/ (see the second reference in the original email). The dependency on the updated version exists regardless of whether rpm knows about it. By improving rpm's expression of dependencies, we can make the soname abstraction more reliable.

@pmatilai
Copy link
Member

Yup, the soname abstraction and the libtool style major.minor.micro version in the filename don't exactly meet in practise.

So reading through the comments more throroughly, turning the version in the filename (or symlink) into a version as suggested by @keszybz. That'd be enforcing the libtool style versioning, something which people rely on but nothing in the tooling actually cares. And yes, libraries doing symbol version need to be excempt from that.

@pmatilai
Copy link
Member

Oh and BTW, even for a proof-of-concept, you really need to start with the internal dependency generator. In the external one, the whole world-iview is upside down, and starting there is likely to run into conceptual problems elsewhere.

@Conan-Kudo
Copy link
Member

add a versioned requires. (libcurl.so.4()(64bit) → libcurl.so.4()(64bit) >= 4.8.0).

The whole point of sonames is that it's NOT tied to versions but the soname abstraction, and this kind of dependency breaks that point.

I can think of one real-world case this would have broke: the SDL -> sdl12-compat transition. While now the version numbers are ahead, they weren't initially. This would have caused way too much pain to deal with.

@gordonmessmer
Copy link
Contributor Author

I can think of one real-world case this would have broke: the SDL -> sdl12-compat transition

Would that have required Fedora to do more than either bump the so version or, worst case, manually add a Provides: line to the spec file?

@gordonmessmer
Copy link
Contributor Author

(libcurl.so.4()(64bit) → libcurl.so.4()(64bit) >= 4.8.0)

I'm going to start work on an implementation of this soon. I've been looking over elfdeps.c and thinking through implementation details, yesterday.

One thing that comes to mind is that while the Provides: change would be backward compatible, the Requires: change would not. New packages would be generated with Requirements that existing packages wouldn't satisfy, which would require a "rebuild the world" deployment that obsoletes most existing rpm packages. I'm sure there will be push-back against that.

Deployment of that change in a distribution might be slightly less painful if done in phases, where enhanced Provides is enabled first, and Requires in some later release.

...another thing to think about, I guess.

@Conan-Kudo
Copy link
Member

I can think of one real-world case this would have broke: the SDL -> sdl12-compat transition

Would that have required Fedora to do more than either bump the so version or, worst case, manually add a Provides: line to the spec file?

The version scheme was changed to match SDL 1.2 later to deal with weaknesses in the Debian's packaging system. It was originally not that way.

Bumping the soversion would have been a big deal, since it's an ABI break. I'm not really sure this is a good idea.

@gordonmessmer
Copy link
Contributor Author

to deal with weaknesses in the Debian's packaging system

I'm curious... do you have a reference that describes those weaknesses? (I have very little experience with dpkg)

Bumping the soversion would have been a big deal, since it's an ABI break

I'm not suggesting bumping the major version.

@Conan-Kudo
Copy link
Member

Ah, I misremembered when it came to where the weakness was. It had to do with how the Steam Runtime "judges" library versions: libsdl-org/sdl12-compat#53 (comment)

@pmatilai
Copy link
Member

One thing that comes to mind is that while the Provides: change would be backward compatible, the Requires: change would not. New packages would be generated with Requirements that existing packages wouldn't satisfy, which would require a "rebuild the world" deployment that obsoletes most existing rpm packages. I'm sure there will be push-back against that.

That's the annoying thing with any dependency improvements, you need to do them in phases - provides pre-populated preferably on a mass-rebuild and only then enable the require-side. So there will need to be some kind of disable/enable control for the thing.

@gordonmessmer
Copy link
Contributor Author

gordonmessmer commented Feb 1, 2023

I've started work on elfdeps.c... It doesn't resolve symlinks to real paths, so it doesn't get the version number we're actually after yet, and it segfaults if it calls dlmopen() on libgobject, so that might not be a viable approach to resolving a dependency to a path. But if anyone wants to look and express any opinions (especially if there is an easier approach that I have overlooked), please do.

@gordonmessmer gordonmessmer force-pushed the enhance-requires branch 2 times, most recently from 398827f to 41da0a2 Compare February 1, 2023 08:30
@gordonmessmer
Copy link
Contributor Author

This version seems to work as intended. It's more complex than I'd like, but still probably better than parsing the output of ldd.

$ elfdeps --requires --libtool-version-fallback /lib64/libsoup-3.0.so.0
libz.so.1(ZLIB_1.2.0)(64bit)
libgssapi_krb5.so.2(gssapi_krb5_2_MIT)(64bit)
libc.so.6(GLIBC_2.14)(64bit)
libc.so.6(GLIBC_2.4)(64bit)
libc.so.6(GLIBC_2.6)(64bit)
libc.so.6(GLIBC_2.7)(64bit)
libc.so.6(GLIBC_2.3.4)(64bit)
libc.so.6(GLIBC_2.17)(64bit)
libc.so.6(GLIBC_2.34)(64bit)
libc.so.6(GLIBC_2.2.5)(64bit)
libglib-2.0.so.0()(64bit) >= 0.7400.1
libgmodule-2.0.so.0()(64bit) >= 0.7400.1
libgobject-2.0.so.0()(64bit) >= 0.7400.1
libgio-2.0.so.0()(64bit) >= 0.7400.1
libsqlite3.so.0()(64bit) >= 0.8.6
libpsl.so.5()(64bit) >= 5.3.3
libbrotlidec.so.1()(64bit) >= 1.0.9
libgssapi_krb5.so.2()(64bit) >= 2.2
libz.so.1()(64bit) >= 1.2.12
libnghttp2.so.14()(64bit) >= 14.24.1
libc.so.6()(64bit)
rtld(GNU_HASH)

@gordonmessmer
Copy link
Contributor Author

libgssapi_krb5.so.2()(64bit) >= 2.2
libz.so.1()(64bit) >= 1.2.12

Well... almost as intended. Those two are unnecessary since the libraries provide versioned symbols, and shouldn't have additional version information. I'll figure out how to detect and skip that case. In the mean time, opinions on whether the rest of the approach is acceptable are welcome.

@gordonmessmer
Copy link
Contributor Author

If we assumed that the SHT_GNU_verneed header appeared before SHT_DYNAMIC, then when processing the latter, we could loop over the existing ei->requires and look for elements that start with the same filename, and skip the libtool version lookup if one is found.

The only problem is that I don't know if that order is guaranteed.

@gordonmessmer
Copy link
Contributor Author

$ elfdeps --requires --libtool-version-fallback /lib64/libsoup-3.0.so.0
libz.so.1(ZLIB_1.2.0)(64bit)
libgssapi_krb5.so.2(gssapi_krb5_2_MIT)(64bit)
libc.so.6(GLIBC_2.14)(64bit)
libc.so.6(GLIBC_2.4)(64bit)
libc.so.6(GLIBC_2.6)(64bit)
libc.so.6(GLIBC_2.7)(64bit)
libc.so.6(GLIBC_2.3.4)(64bit)
libc.so.6(GLIBC_2.17)(64bit)
libc.so.6(GLIBC_2.34)(64bit)
libc.so.6(GLIBC_2.2.5)(64bit)
libglib-2.0.so.0()(64bit) >= 0.7400.1
libgmodule-2.0.so.0()(64bit) >= 0.7400.1
libgobject-2.0.so.0()(64bit) >= 0.7400.1
libgio-2.0.so.0()(64bit) >= 0.7400.1
libsqlite3.so.0()(64bit) >= 0.8.6
libpsl.so.5()(64bit) >= 5.3.3
libbrotlidec.so.1()(64bit) >= 1.0.9
libgssapi_krb5.so.2()(64bit)
libz.so.1()(64bit)
libnghttp2.so.14()(64bit) >= 14.24.1
libc.so.6()(64bit)
rtld(GNU_HASH)

@gordonmessmer
Copy link
Contributor Author

The last commit includes macros that can be used to enable the fallback dependency versions, which are off by default to provide a phased rollout of the feature.

tools/elfdeps.c Outdated
filename = dest;
}
// Start from the end of the string. Verify that it ends with
// numbers and dots, preceded by ".so.".
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On a purely stylistical nit, // comments do not belong in rpm coding style. Always use /* ... */ and for multiline comments,

/*
 * ...
 * ...
 */

/me realizes this isn't even explained in contributing.md, sigh...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll follow up on each of those items tomorrow. Thanks for the feedback!

These tests are not compatible with fakechroot, for two reasons.
While fakechroot supports dlmopen(), it will convert relative paths
to absolute paths, which prevents library path searching.  The test
could be run from the data directory so that the relative path's
absolute path expansion was correct, and dlmopen() will succeed.
However, linkmap->l_name will have the real path, not a chrooted
path, and when that is passed to readlink(), fakechroot will try
to expand a path that already includes the chroot prefix.  So,
even when the library can be loaded with dlmopen(), the symlink
can't be resolved to find the path that contains the version.
@gordonmessmer
Copy link
Contributor Author

Oh. Yes, I'm subscribed there, so by "other distro folks" I meant non-Fedora people 😄

@pmatilai , is there anything I can do to move that conversation forward?

I've written a description of these changes in the form of a design proposal: https://docs.google.com/document/d/1tnEdxsh95iGkuedJ0VCWXFOawxlrD4xIqMiDqVESk9c/edit?usp=sharing

... and in the form of a Fedora change proposal: https://docs.google.com/document/d/1PcnuoW_DlE4TRqcMJPNqcp0YktO7DX34YobslnT68TI/edit?usp=sharing

(Both are open for comments until spam becomes a problem.)

from the symlink target, generate ".elf-version/<so>" files
containing the value of the _elf_so_version macro.  These files
are then used by the elfdeps tool to generate the provides and
requires.
@gordonmessmer
Copy link
Contributor Author

I've added a commit to illustrate a slightly different approach. Code quality and error handling need more work, and there are no new tests, but I'd like to ignore that for the moment and discuss only whether this approach is worth developing further.

This change adds another file to packages that provide ELF shared objects, and that file contains their version. By default, the version is that of the RPM package, but package maintainers could override the macro so that the maintainers of alternate library implementations can coordinate an ABI version between packages whose version does not match.

This would also support a gradual rollout. Worst-case, it's still possible that any given package might not have enhanced "requires" information until two mass rebuilds were done, but the macros can be enabled at the same time and enhanced information will be added as it becomes available. One of the SUSE list members mentioned that they want to continue to be able to use LEAP packages as dependencies, and I think this approach addresses that concern.

The trade-off is the proliferation of small hidden text files in the lib directories.

Briefly, for example, the libnghttp2 package currently includes:

lrwxrwxrwx. 1 root root 21 Oct 10 17:00 /usr/lib64/libnghttp2.so.14 -> libnghttp2.so.14.24.3*

The previous version of this PR would read the version "14.24.3" from that symlink and change one of the capabilities the package provides from libnghttp2.so.14()(64bit) to libnghttp2.so.14()(64bit) = 14.24.3

This version of the PR would add a file, /usr/lib64/.elf-version/libnghttp2.so.14, which contained a single line reading "1.55.1", which is the version of the package that provides it. It will change one of the capabilities the package provides from libnghttp2.so.14()(64bit) to libnghttp2.so.14()(64bit) = 1.55.1. This was the source of the version that I'd originally proposed, and has the advantage that it also addresses the ~ 7% of libraries that end in ".so.0.0.0".

@Vogtinator
Copy link
Contributor

One of the SUSE list members mentioned that they want to continue to be able to use LEAP packages as dependencies, and I think this approach addresses that concern.

Is it about using library packages from Leap (old) on Tumbleweed (new)? If so, I'd argue that is simply invalid and basically what this approach should protect against.

@pmatilai
Copy link
Member

@mlschroe , thoughts on this?

@gordonmessmer
Copy link
Contributor Author

gordonmessmer commented Mar 20, 2024

Is it about using library packages from Leap (old) on Tumbleweed (new)? If so, I'd argue that is simply invalid and basically what this approach should protect against.

It's not necessarily invalid.

In Fedora, packages that fail to build from source are eventually retired along with all of their dependencies, and (basically) everything is rebuilt for every release, so we wouldn't expect old libraries to mix with new applications, there. But if SUSE doesn't retire those packages, then they could continue using an older library package in a new build root, when they build a new application that depends on an old library.

In the previous version of this PR, the new binary would express a dependency that the library's package doesn't provide. In this version, packages won't express a versioned dependency on libraries that don't provide it, because they won't have a ".elf-version" file.

@Vogtinator
Copy link
Contributor

In Fedora, packages that fail to build from source are eventually retired along with all of their dependencies, and (basically) everything is rebuilt for every release, so we wouldn't expect old libraries to mix with new applications, there. But if SUSE doesn't retire those packages, then they could continue using an older library package in a new build root, when they build a new application that depends on an old library.

For Leap vs. Tumbleweed there's a ~6y gap for some packages so I don't expect any compatibility there.

What can happen is that e.g Leap 15.6 with this PR builds an application against a library from 15.5 built without this PR. That still needs to be installable, either using your .elf-version approach or maybe some other that inspects which capabilities the library provides.

@gordonmessmer
Copy link
Contributor Author

What can happen is that e.g Leap 15.6 with this PR builds an application against a library from 15.5 built without this PR. That still needs to be installable

/me nods

With the .elf-versions file approach, it would be. The library from 15.5 wouldn't have the .elf-versions file, so any application built in a 15.6 build root where that package was present would simply continue to express the same type of un-versioned dependency that it does today.

@gordonmessmer
Copy link
Contributor Author

The other thing I'm wondering is whether we can (or should) label the .elf-version files, so that package managers can skip installing them, the same way that documentation is not installed in container images (RPMTRANS_FLAG_NODOCS). The .elf-version files aren't needed at runtime, they're only needed in build roots.

@Vogtinator
Copy link
Contributor

Vogtinator commented Mar 20, 2024

The other thing I'm wondering is whether we can (or should) label the .elf-version files, so that package managers can skip installing them, the same way that documentation is not installed in container images (RPMTRANS_FLAG_NODOCS). The .elf-version files aren't needed at runtime, they're only needed in build roots.

It's arguably package metadata and doesn't belong in the filelist at all if possible IMO

@gordonmessmer
Copy link
Contributor Author

It's arguably package metadata and doesn't belong in the filelist at all if possible IMO

Generally, I agree, but I'm operating on the principal that elfdeps shouldn't access the RPM DB during the build.

(And if we did access the DB for metadata, I'm not sure how we'd allow packagers to override the version. Would we need to introduce a new field?)

@mcatanzaro
Copy link

One possible disadvantage: you wouldn't be able to e.g. dnf downgrade xz* without also downgrading everything that was built against xz. (You might also consider that an advantage, but most users probably wouldn't.)

@hroncok
Copy link
Contributor

hroncok commented Apr 3, 2024

That's currently possible and can lead to various subtle runtime failures instead.

@gordonmessmer
Copy link
Contributor Author

One possible disadvantage: you wouldn't be able to e.g. dnf downgrade xz*

I think it's important to differentiate the real binary dependencies from RPM's knowledge of those dependencies.

In Fedora 40, it was safe to downgrade xz because libsystemd had been built before xz 5.6. If it had been built after, that might not have been true.

xz 5.6 introduced at least one new symbol: lzma_mt_block_size. It's not certain that libsystemd would have had a reference to this symbol if it had been built in a build root with xz 5.6, but it could have. And if it did, then reverting to xz-5.4 would have caused everything linked to libsystemd to fail to start. Fedora's maintainers would surely have noticed that, and would have rebuild systemd in a build root with xz-5.4, and pushed that update along with the xz-5.4 downgrade, but hypothetically, users might have decided that only xz was worth updating. If they'd updated only the revert-to-5.4 xz packages, the result would have been a disaster on those systems.

But as you point out, if this change was already deployed, then downgrading xz would also downgrade everything that was built against xz-5.6, which would have avoided any users mistakenly breaking everything linked to a hypothetical libsystemd that required symbols in 5.6.

So as I see it, the xz downgrade is a really good illustration of why this change is needed. We were lucky that we were able to safely downgrade that package. We should not rely on luck.

@gordonmessmer
Copy link
Contributor Author

gordonmessmer commented Apr 3, 2024

As far as I know, the blocking issue here is simply a decision about where to get the version of the library. Among others, options include:

1: the rpm version of the package that owns the library. Not a good solution because I think the maintainers don't want elfdeps to access the RPM DB during the build, and because it'd make interchangeable library implementations much harder.

2: a version string derived from the target of the symlink that ELF files refer to. This requires some coordination for interchangeable libraries. It also requires a two-phase rollout in which packages "provide" new capabilities and then later "require" those new capabilities. And once that two-phase rollout is complete, old builds (which might be present in SUSE) can't be used any more.

3: a version string stored in a text file, defaulting to the version of the package that provides the library, but optionally defined by the packagers. Requires no coordination upstream, and supports organic rollout.

4: the same version stored somewhere else. Maybe in an extended attribute. Do extended attributes work properly in environments like mock container builds?

5: do what Debian does, and make package maintainers maintain versioned symbol lists. This supports interchangeable library implementations, and organic roll-out. It also requires a lot of effort on package maintainers' parts, and I expect that the implementation would be significantly more work than the other options, unless we adapted their dep generator scripts.

@gordonmessmer
Copy link
Contributor Author

Periodic check in:

Please let me know if I can do anything to move this forward.

@Conan-Kudo
Copy link
Member

As I think about this, I come back to my long-ago closed ticket #362. Fundamentally, we need stronger symbol dependencies represented somehow, and we need it in a way that we don't bind things to specific implementations so that switching is still workable.

@gordonmessmer
Copy link
Contributor Author

I'm hoping to come back to this soon. I'm moving tomorrow after a couple of months of working on that, and I have a vacation with family almost immediately after that. I might be able to put forward one more experimental proposal in between.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants