Potential fd leak with clang LTO, causing too many open files #1362

Open
karuboniru opened this issue Oct 21, 2024 · 0 comments · May be fixed by #1363

karuboniru commented Oct 21, 2024

My Environment

$ ld --version     
mold 2.34.1 (compatible with GNU ld)

$ clang++ --version                                                                                                                   
clang version 20.0.0pre20241012.gb1746894deebe3 (Fedora 20.0.0~pre20241012.gb1746894deebe3-2.fc41)
Target: x86_64-redhat-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
Configuration file: /etc/clang/x86_64-redhat-linux-gnu-clang++.cfg

$ ulimit -a 
-t: cpu time (seconds)              unlimited
-f: file size (blocks)              unlimited
-d: data seg size (kbytes)          unlimited
-s: stack size (kbytes)             8192
-c: core file size (blocks)         unlimited
-m: resident set size (kbytes)      unlimited
-u: processes                       511419
-n: file descriptors                1024
-l: locked-in-memory size (kbytes)  8192
-v: address space (kbytes)          unlimited
-x: file locks                      unlimited
-i: pending signals                 511419
-q: bytes in POSIX msg queues       819200
-e: max nice                        0
-r: max rt priority                 0
-N 15: rt cpu time (microseconds)   unlimited

  • Project being built: https://github.com/Geant4/geant4/releases/tag/v11.2.2
  • Error message:
    /usr/bin/clang++ -fPIC -W -Wall -pedantic -Wno-non-virtual-dtor -Wno-long-long -Wwrite-strings -Wpointer-arith -Woverloaded-virtual -Wno-variadic-macros -Wshadow -pipe -Qunused-arguments -DGL_SILENCE_DEPRECATION -pthread  -O2 -g -DNDEBUG -flto=thin   -shared -Wl,-soname,libG4processes.so -o BuildProducts/lib64/libG4processes.so @CMakeFiles/G4processes.rsp 
    mold: fatal: opening source/CMakeFiles/G4processes.dir/processes/hadronic/models/im_r_matrix/src/G4CollisionNNToDeltaDelta.cc.o failed: Too many open files
    clang++: error: linker command failed with exit code 1 (use -v to see invocation)
    
  • Related input to clang (one object file per line):
    $ cat CMakeFiles/G4processes.rsp |wc -l              
    1710
    

Only reproducible with:

  • LTO enabled
  • mold is used
    So I am not reporting this to the LLVM project, since lld and bfd can complete the link with the same ulimit.
  • Can be worked around by raising ulimit -n
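
For reference, the `ulimit -n` workaround has a programmatic equivalent: a process can raise its own soft descriptor limit up to the hard limit. The following is only a minimal sketch assuming Linux and POSIX `getrlimit`/`setrlimit`; the function name `raise_fd_soft_limit` is made up for illustration and this is not taken from mold's source.

```cpp
#include <sys/resource.h>
#include <cassert>

// Hypothetical helper (not from mold): raise the soft RLIMIT_NOFILE of the
// current process to its hard limit.
// Returns the resulting soft limit, or 0 if a syscall failed.
rlim_t raise_fd_soft_limit() {
  struct rlimit lim;
  if (getrlimit(RLIMIT_NOFILE, &lim) != 0)
    return 0;
  assert(lim.rlim_cur <= lim.rlim_max);  // soft limit never exceeds hard limit

  if (lim.rlim_cur < lim.rlim_max) {
    lim.rlim_cur = lim.rlim_max;  // no privileges needed up to the hard limit
    if (setrlimit(RLIMIT_NOFILE, &lim) != 0)
      return 0;
  }
  return lim.rlim_cur;
}
```

This only hides the symptom: with one descriptor retained per input, a large enough link can still hit the hard limit.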

strace -f (excerpt; full strace result (gzipped)):

[pid 320263] openat(AT_FDCWD, "source/CMakeFiles/G4processes.dir/processes/hadronic/models/im_r_matrix/src/G4CollisionMesonBaryon.cc.o", O_RDONLY) = 1014
[pid 320263] openat(AT_FDCWD, "source/CMakeFiles/G4processes.dir/processes/hadronic/models/im_r_matrix/src/G4XNDeltaTable.cc.o", O_RDONLY) = 1015
[pid 320263] fstat(1015, {st_mode=S_IFREG|0644, st_size=54628, ...}) = 0
[pid 320263] mmap(NULL, 54628, PROT_READ|PROT_WRITE, MAP_PRIVATE, 1015, 0) = 0x7f08aec9b000
[pid 320263] close(1015)                = 0
[pid 320263] openat(AT_FDCWD, "source/CMakeFiles/G4processes.dir/processes/hadronic/models/im_r_matrix/src/G4XNDeltaTable.cc.o", O_RDONLY) = 1015
[pid 320263] openat(AT_FDCWD, "source/CMakeFiles/G4processes.dir/processes/hadronic/models/im_r_matrix/src/G4CollisionMesonBaryonElastic.cc.o", O_RDONLY) = 1016
[pid 320263] fstat(1016, {st_mode=S_IFREG|0644, st_size=174592, ...}) = 0
[pid 320263] mmap(NULL, 174592, PROT_READ|PROT_WRITE, MAP_PRIVATE, 1016, 0) = 0x7f08aec70000
[pid 320263] close(1016)                = 0
[pid 320263] openat(AT_FDCWD, "source/CMakeFiles/G4processes.dir/processes/hadronic/models/im_r_matrix/src/G4CollisionMesonBaryonElastic.cc.o", O_RDONLY) = 1016
[pid 320263] openat(AT_FDCWD, "source/CMakeFiles/G4processes.dir/processes/hadronic/models/im_r_matrix/src/G4XNNElastic.cc.o", O_RDONLY) = 1017
[pid 320263] fstat(1017, {st_mode=S_IFREG|0644, st_size=161368, ...}) = 0
[pid 320263] mmap(NULL, 161368, PROT_READ|PROT_WRITE, MAP_PRIVATE, 1017, 0) = 0x7f08aec48000
[pid 320263] close(1017)                = 0

This follows a pattern like: [open an object file] -> [fstat the file] -> [mmap the file] -> [close the file] (at this point I would expect it to move on to the next file) -> [open the same file again] (the leak is here)
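
The pattern can be reproduced outside the linker with a few syscalls. The sketch below is only an illustration of the sequence seen in the trace, not mold's actual code; the names `Input` and `load_like_trace` are hypothetical. It shows that the mapping stays usable after the first `close`, and that the second `open` leaves one descriptor held per input file.

```cpp
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>
#include <cassert>
#include <cstddef>

// Hypothetical record for one input: the mapping plus the extra descriptor
// that stays open afterwards.
struct Input {
  void *data = nullptr;  // private mapping of the file contents
  size_t size = 0;
  int retained_fd = -1;  // second descriptor, kept open
};

// Mirrors the traced sequence: open, fstat, mmap, close, then open the same
// path again and keep that descriptor. Returns false if any step fails
// (for example EMFILE on the second open once the fd limit is reached).
bool load_like_trace(const char *path, Input &out) {
  assert(path != nullptr);
  int fd = open(path, O_RDONLY);
  if (fd < 0)
    return false;

  struct stat st;
  if (fstat(fd, &st) != 0 || st.st_size == 0) {
    close(fd);
    return false;
  }

  size_t len = static_cast<size_t>(st.st_size);
  void *p = mmap(nullptr, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
  close(fd);  // the mapping remains valid after the descriptor is closed
  if (p == MAP_FAILED)
    return false;

  int fd2 = open(path, O_RDONLY);  // never closed here: one fd per input
  if (fd2 < 0) {
    munmap(p, len);
    return false;
  }
  out.data = p;
  out.size = len;
  out.retained_fd = fd2;
  return true;
}
```

Calling this once per line of a 1710-entry response file with `ulimit -n` at 1024 would run out of descriptors in the same way as the link above.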


By attaching gdb and reading the code, it seems the files are opened from mold::create_plugin_input_file, and the comment at

mold/src/lto-unix.cc

Lines 640 to 646 in 0841ffc

// It looks like GCC doesn't need fd after claim_file_hook() while
// LLVM needs it and takes the ownership of fd. To prevent "too many
// open files" issue, we close fd only for GCC. This is ugly, though.
if (!is_llvm(ctx)) {
  MappedFile *mf2 = mf->parent ? mf->parent : mf;
  mf2->close_fd();
}
explains why the opened files should not be closed for llvm lto. 🤔

karuboniru linked a pull request Oct 21, 2024 that will close this issue