-
Notifications
You must be signed in to change notification settings - Fork 199
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Draft: Update core assignment algorithm in benchexec/resources.py #892
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a preliminary review with some hints, mostly on code style and documentation. Please fix these issues and provide documentation, such that a full review becomes possible and makes sense.
As part of this (as first step actually), please format the source code with the formatter black and make sure to use it in the future for each commit. Consistent code formatting is a big help for readability.
And please also have a look at all the other CI failures. After each commit these checks will run automatically, so always check whether CI is green for your most recent commit. The check check-format
is fixed automatically if you use the code formatter, and reuse
only complains about the missing copyright header. But flake8
and pytype
provide hints about potential errors in your code. And of course the unit-tests
checks just execute our tests.
This method adds a "root" hierarchy level, if the system topology doesnt have one. Necessary for iterating through the whole topology.
now chooses the right starting core for the next thread
The new get_cpu_distribution method has no information about partial physical cores anymore, this is checked outside of it.
It is better to have CI green to be able to notice further regressions.
Also add unit tests for this function.
Also do not use "raise Exception", use assert to encode coding assumptions.
Only the standard format is lazy.
- Function name starting with "read" to indicate it reads from kernel. - Parameters in better order. - Identifier naming according to Python standard. - Actually use generic identifiers in a generic function and not names that are specific to one use case. - Also replace all trivial callers with a single function.
- Crucial constants should be present only once, documented, and defined in a central place. - Reading from the system and logic should be separate such that the latter is testable. - For reading from the system we can use an existing helper method. - Add tests.
Uses of plain dicts may catch errors in callers earlier. Furthermore, some of the functions even returned a defaultdict in some cases and a plain dict in other cases. The return type should be consistent. With dict.setdefault() the use of a plain dict is almost as convenient as a defaultdict.
…lity It is not really necessary.
We always want the user to allow us to use entire physical cores. This check was broken, because forbidden sibling cores were already removed from the data structure before the check. Furthermore, cores forbidden via cgroups and via the --allowedCores parameter were treated somehow differently, but the effect should be exactly the same.
So far we read the information about the hyperthreading hierarchy level differently from the other levels. This made the code more difficult to understand, and the way how the ids in the hierarchy_levels[0] dict were chosen differed from the other levels. But we can also read this information in the same way as for the other levels, so let's do this. We still also need to use the previous way of reading all siblings from a given list of cores, but we can also simplify that and the separation of concerns still provides an understandability benefit.
The allocation algorithm already supports an arbitrary number of levels, so we can future proof the allocation and read all information about cache levels that the kernel provides. We can also use the assumption that caches are named the same across all cores, and read the cache names only once instead of separately for every core.
This method actually has nothing to do with "sub" units (children), it just takes a set of cores and a level and groups the cores as appropriate for the level. So the names should reflect that.
To allow easier generation of new tests (where we ideally can automatically generate tests for a large number of (also weird) CPU configurations), it's desireable to be able to specify arbitrary layers more or less directly, without having to create new test classes. The final goal is to be able to generate machine configurations given a single argument (or two arguments: layer configuration as a list and total core count), so we can utilize pytest to write easily maintainable test cases.
As the new method already keeps the layers in the correct order and doesn't create duplicate layers, we can remove those parts from the code
After showing that the new layer generation code is equivalent to the existing one, we can remove the old code with the hardcoded layers and use the new code.
Referring to issue #748, the core assignment can now handle additional hierarchy layers (such as a shared L3 cache).
The addition of further layers can be implemented without knowing the exact topology of a machine - the hierarchy of the layers (CPUs, NUMA nodes, L3 caches, hyperthreading, etc) is determined by the algorithm.
Fixes #748
Fixes #850
Kernel documentation: