Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

issue: 3884801 Check for resident hugepages to avoid SIGBUS #204

Open
wants to merge 2 commits into
base: vNext
Choose a base branch
from

Conversation

pasis
Copy link
Member

@pasis pasis commented Jul 29, 2024

Description

In a containerized environment mmap() beyond the hugepages limit can be successful, but a further memory access triggers SIGBUS and terminates the process. Check that all the allocated hugepages are resident with mincore() to avoid the SIGBUS issue.

Cost of the feature depends on the allocation size and hugepage size:

  • 32GB with 2MB pages takes 4400 us
  • 32GB with 1GB pages takes 11 us
  • 32MB with 2MB pages takes 5-6 us

We expect a big memory preallocation at the start (2GB by default) and
rare 32MB allocations during warmup period. Therefore, this feature can
add several additional 5us latency spikes during warmup period.

What

Check for resident hugepages after allocation.

Why ?

Avoid SIGBUS in containerized environments.

Change type

What kind of change does this PR introduce?

  • Bugfix
  • Feature
  • Code style update
  • Refactoring (no functional changes, no api changes)
  • Build related changes
  • CI related changes
  • Documentation content changes
  • Tests
  • Other

Check list

  • Code follows the style de facto guidelines of this project
  • Comments have been inserted in hard to understand places
  • Documentation has been updated (if necessary)
  • Test has been added (if possible)

@pasis pasis added the draft Not to review yet label Jul 29, 2024
@pasis pasis changed the title issue: Check for resident hugepages to avoid SIGBUS issue: 3884801 Check for resident hugepages to avoid SIGBUS Jul 29, 2024
@pasis pasis added enhancement New feature or request and removed draft Not to review yet labels Jul 29, 2024
In a containerized environment mmap() beyond the hugepages limit can be
successful, but a further memory access triggers SIGBUS and terminates
the process. Check that all the allocated hugepages are resident with
mincore() to avoid the SIGBUS issue.

Cost of the feature depends on the allocation size and hugepage size:
 - 32GB with 2MB pages takes 4400 us
 - 32GB with 1GB pages takes 11 us
 - 32MB with 2MB pages takes 5-6 us

We expect a big memory preallocation at the start (2GB by default) and
rare 32MB allocations during warmup period. Therefore, this feature can
add several additional 5us latency spikes during warmup period.

Signed-off-by: Dmytro Podgornyi <[email protected]>
The parameter disables the check for resident hugepages. The check can
be expensive and, on the other hand, can be irrelevant based on the
system configuration. Therefore, disabling it reduces the initialization
time which can be crucial in some scenarios.

The name for the parameter is chosen to extend the idea of reducing
initialization time in the future.

Signed-off-by: Dmytro Podgornyi <[email protected]>
for (size_t i = 0; rc == 0 && i < pages_nr; ++i) {
unsigned char vec;
rc = mincore(page_ptr, 1, &vec);
resident_nr += rc == 0 ? (vec & 1U) : 0;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need to count or we can just fail on the first bad hugepage ptr?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can stop, however, this is expected to be very unlikely case, so no point in this optimization and I'd suggest choosing the more readable variant (not necessarily current variant)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think that stopping on the first one may create simpler impl, maybe even without many different dbg_logs.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also we can loop while ptr < end_ptr ... and avoid many different vars and stuff

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request to_merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants