Translation of execve()
To load and run an executable, a process typically uses the `execve()` system call.
int execve(const char *pathname, char *const argv[], char *const envp[]);
The meaning of the arguments is as follows.
- `pathname`: the path of the file to be executed.
- `argv[]`: an array of pointers to strings passed to the new program as its command-line arguments.
  - By convention, `argv[0]` is the program name.
- `envp[]`: an array of pointers to strings, conventionally of the form `key=value`.
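To make the shape of `argv[]` and `envp[]` concrete, here is a minimal sketch of how such arrays can be built in Rust using only the standard library. Per execve(2), each array is terminated by a null pointer. The helper names `to_c_strings` and `to_ptr_array` are made up for this illustration and are not part of proot-rs.

```rust
use std::ffi::CString;
use std::os::raw::c_char;
use std::ptr;

/// Converts Rust strings into owned, NUL-terminated C strings.
/// (Hypothetical helper, for illustration only.)
pub fn to_c_strings(items: &[&str]) -> Vec<CString> {
    items
        .iter()
        .map(|s| CString::new(*s).expect("string contains an interior NUL byte"))
        .collect()
}

/// Builds the `char *const []` layout expected by `execve()`:
/// one pointer per string, followed by a terminating null pointer.
/// The returned pointers are only valid while `strings` is alive.
pub fn to_ptr_array(strings: &[CString]) -> Vec<*const c_char> {
    let mut ptrs: Vec<*const c_char> = strings.iter().map(|s| s.as_ptr()).collect();
    ptrs.push(ptr::null());
    ptrs
}
```

The same two helpers work for `envp[]`, with each input string written as `key=value`.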
The kernel is responsible for parsing and loading the target executable and its dynamic linker/loader (if one exists).
Unlike other system calls, `execve()` needs more than path translation: proot-rs must also load the executable itself instead of leaving that to the kernel. This is because we need to consider the following scenarios.
- If the executable is an ELF file that contains a `PT_INTERP` segment, it is a dynamically linked executable. When the kernel loads this program, it also loads the dynamic linker/loader at the path specified by `PT_INTERP`. This happens inside the kernel, which means that the path to the dynamic linker/loader is not translated. Refer to LWN - How programs get run: ELF binaries.
- If the executable is a script file that starts with `#!interpreter [optional-arg]` (a shebang), the kernel replaces the command line with `interpreter [optional-arg] script-file arg ...` when loading the program. This also happens inside the kernel, which means that the path to `interpreter` is not translated. Refer to the man page execve(2).
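The shebang rewrite described in the second scenario can be sketched as follows. This is an illustrative approximation, not the proot-rs implementation: the names `Shebang`, `parse_shebang`, and `rewrite_command_line` are assumptions, and the sketch follows the Linux convention of treating everything after the interpreter path as a single optional argument.

```rust
/// Result of parsing a `#!interpreter [optional-arg]` line.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, PartialEq)]
pub struct Shebang {
    pub interpreter: String,
    pub optional_arg: Option<String>,
}

/// Parses the first line of a script; returns `None` if it is not a shebang line.
pub fn parse_shebang(content: &str) -> Option<Shebang> {
    let first_line = content.lines().next()?;
    let rest = first_line.strip_prefix("#!")?.trim();
    if rest.is_empty() {
        return None;
    }
    // Everything after the interpreter path is kept as ONE optional argument.
    match rest.split_once(|c: char| c == ' ' || c == '\t') {
        Some((interpreter, arg)) => Some(Shebang {
            interpreter: interpreter.to_string(),
            optional_arg: Some(arg.trim().to_string()),
        }),
        None => Some(Shebang {
            interpreter: rest.to_string(),
            optional_arg: None,
        }),
    }
}

/// Builds `interpreter [optional-arg] script-file arg ...` from the original argv.
pub fn rewrite_command_line(shebang: &Shebang, script_path: &str, argv: &[&str]) -> Vec<String> {
    let mut new_argv = vec![shebang.interpreter.clone()];
    if let Some(arg) = &shebang.optional_arg {
        new_argv.push(arg.clone());
    }
    new_argv.push(script_path.to_string());
    // The original argv[0] is dropped; the remaining arguments are kept in order.
    new_argv.extend(argv.iter().skip(1).map(|s| s.to_string()));
    new_argv
}
```

Doing this in user space gives a tracer the chance to translate the interpreter path before the kernel ever sees it, which is exactly what is lost when the kernel performs the rewrite itself.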
To avoid missing path translations, proot-rs implements a loading process similar to the one in the kernel, together with a custom `loader-shim` that loads the required files. Currently, both ELF and shebang executables are supported.
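As an illustration of the ELF side of such a user-space loader, the sketch below looks up the `PT_INTERP` path in an in-memory ELF image. It assumes a 64-bit little-endian ELF file and uses the field offsets from the ELF64 format; the function names are made up for this example, and the real proot-rs loader does considerably more.

```rust
const PT_INTERP: u64 = 3; // program header type of the interpreter segment

/// Reads a little-endian unsigned integer of `len` bytes (at most 8) at `off`.
fn read_le(image: &[u8], off: usize, len: usize) -> Option<u64> {
    let bytes = image.get(off..off.checked_add(len)?)?;
    Some(bytes.iter().rev().fold(0u64, |acc, &b| (acc << 8) | b as u64))
}

/// Returns the path stored in the `PT_INTERP` segment, if there is one.
/// Only ELF64 little-endian images are handled in this sketch.
pub fn find_interp(image: &[u8]) -> Option<String> {
    if !image.starts_with(b"\x7fELF") {
        return None;
    }
    // e_ident[4] == 2 means 64-bit; e_ident[5] == 1 means little-endian.
    if *image.get(4)? != 2 || *image.get(5)? != 1 {
        return None;
    }
    let phoff = read_le(image, 0x20, 8)? as usize; // e_phoff
    let phentsize = read_le(image, 0x36, 2)? as usize; // e_phentsize
    let phnum = read_le(image, 0x38, 2)? as usize; // e_phnum
    for i in 0..phnum {
        let ph = phoff.checked_add(i.checked_mul(phentsize)?)?;
        if read_le(image, ph, 4)? != PT_INTERP {
            continue; // p_type does not match
        }
        let offset = read_le(image, ph + 8, 8)? as usize; // p_offset
        let filesz = read_le(image, ph + 32, 8)? as usize; // p_filesz
        let raw = image.get(offset..offset.checked_add(filesz)?)?;
        // The path is NUL-terminated inside the segment.
        let path = raw.split(|&b| b == 0).next()?;
        return String::from_utf8(path.to_vec()).ok();
    }
    None
}
```

A statically linked executable has no `PT_INTERP` entry, so `find_interp` returns `None` for it; the path it returns for a dynamically linked one is the path that still needs translating.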
Here's a diagram that describes the translation of execve().
When the tracee invokes the `execve()` system call, proot-rs enters this phase (syscall-enter-stop). The kernel has not yet executed the logic of `execve()`, so its arguments can still be modified.
proot-rs first tries to load the target executable based on the command-line arguments provided to `execve()`. The command line may be updated in the process (for example, when the target is a script with a shebang), which triggers a retry. The number of retries is limited; when the limit is exceeded, an `ELOOP` error is returned.
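The bounded retry described above can be sketched as a simple loop. Everything named here is an assumption made for illustration: `MAX_RETRIES` is an arbitrary limit (the actual limit in proot-rs is not stated here), `LoadOutcome` and `load_with_retries` are hypothetical, and `ELOOP` is hard-coded to its value on Linux x86-64.

```rust
/// `ELOOP` as defined on Linux x86-64 (assumed target for this sketch).
pub const ELOOP: i32 = 40;
/// Hypothetical retry limit; the real value may differ.
pub const MAX_RETRIES: usize = 8;

/// What a single load attempt can produce.
pub enum LoadOutcome<T> {
    /// The executable was loaded; carries the resulting load information.
    Loaded(T),
    /// The command line was rewritten (e.g. by a shebang) and must be retried.
    Retry(Vec<String>),
}

/// Runs `try_load` until it succeeds, fails, or the retry budget is exhausted.
pub fn load_with_retries<T, F>(mut argv: Vec<String>, mut try_load: F) -> Result<T, i32>
where
    F: FnMut(&[String]) -> Result<LoadOutcome<T>, i32>,
{
    for _ in 0..MAX_RETRIES {
        match try_load(&argv)? {
            LoadOutcome::Loaded(info) => return Ok(info),
            LoadOutcome::Retry(new_argv) => argv = new_argv,
        }
    }
    // Too many rewrites, e.g. a script whose interpreter is itself.
    Err(ELOOP)
}
```

Without the limit, a script that names itself as its interpreter would keep the loader rewriting the command line forever.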
Our goal is to generate `LoadInfo`, which will be used in syscall-exit-stop.
Most importantly, we need to replace the first argument of `execve()` with the path to `loader-shim`. This ensures that code we control runs first after `execve()` returns.
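A rough sketch of this swap is shown below. Both types are hypothetical stand-ins invented for the example: `LoadInfoSketch` is not the real `LoadInfo` (whose fields are not described here), and `ExecveArgs` is only a simplified view of the arguments available at syscall-enter-stop.

```rust
/// Hypothetical stand-in for `LoadInfo`; the real type has different fields.
#[derive(Debug, Clone, PartialEq)]
pub struct LoadInfoSketch {
    /// Translated path of the program the tracee actually asked for.
    pub target_path: String,
    /// Translated path of the dynamic linker/loader, if the target needs one.
    pub interp_path: Option<String>,
}

/// Hypothetical view of the `execve()` arguments at syscall-enter-stop.
#[derive(Debug)]
pub struct ExecveArgs {
    pub pathname: String,
    pub argv: Vec<String>,
}

/// Points `execve()` at the loader shim and remembers the real target,
/// so that it can still be loaded once the shim is running.
pub fn redirect_to_shim(
    args: &mut ExecveArgs,
    shim_path: &str,
    interp_path: Option<String>,
) -> LoadInfoSketch {
    // Swap the path in place and keep the original for later.
    let target_path = std::mem::replace(&mut args.pathname, shim_path.to_string());
    LoadInfoSketch { target_path, interp_path }
}
```

Only the first argument changes; `argv` is left untouched, so the program that is eventually started still sees the command line it expects.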
In this phase (syscall-exit-stop), `loader-shim` has been loaded into the tracee process. We need to generate the load script from the `LoadInfo` produced in the previous phase and write it into the tracee's memory space. The load script, a collection of `LoadStatements` defined in `loader-shim/src/script.rs`, tells `loader-shim` how to load the real dynamic linker/loader and the ELF file to be executed. Once that is done, `loader-shim` jumps to the program entry point, and its job is finished.
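To give a feel for what a load script might look like, here is a deliberately simplified sketch that flattens a list of statements into a byte buffer suitable for writing into another process's memory. The `Statement` variants, their fields, and the tag values are all invented for this illustration; the real definitions are in `loader-shim/src/script.rs` and differ from these.

```rust
/// Hypothetical, simplified statement kinds (not the real `LoadStatement`).
#[derive(Debug, Clone, PartialEq)]
pub enum Statement {
    /// Open the file whose NUL-terminated path sits at `path_offset` in the script.
    Open { path_offset: u64 },
    /// Map `length` bytes of the open file at `addr`, starting at `file_offset`.
    Mmap { addr: u64, length: u64, prot: u64, file_offset: u64 },
    /// Set the stack pointer and jump to the entry point.
    Start { stack_pointer: u64, entry_point: u64 },
}

impl Statement {
    /// Flattens a statement into machine words: a tag followed by its fields.
    fn words(&self) -> Vec<u64> {
        match *self {
            Statement::Open { path_offset } => vec![0, path_offset],
            Statement::Mmap { addr, length, prot, file_offset } => {
                vec![1, addr, length, prot, file_offset]
            }
            Statement::Start { stack_pointer, entry_point } => {
                vec![2, stack_pointer, entry_point]
            }
        }
    }
}

/// Serializes the statements back to back, in native byte order,
/// so a shim running in the same process can read them as plain words.
pub fn serialize_script(statements: &[Statement]) -> Vec<u8> {
    let mut buf = Vec::new();
    for statement in statements {
        for word in statement.words() {
            buf.extend_from_slice(&word.to_ne_bytes());
        }
    }
    buf
}
```

A flat, word-based encoding like this keeps the reader on the shim side trivial, which matters because the shim runs before any dynamic linker or runtime is available.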