Run dotCMS with jemalloc #30619
Parent Issue
No response
Task
Running dotCMS in our cloud infra, we have seen many pods getting OOM-killed at the container level. This happens even when the JVM appears to have plenty of heap headroom. In a more perfect world, I would expect the JVM to die with an internal OOM, at which point we would know that `java` needs a bigger `-Xmx`, but this is not what is happening. Instead, these containers appear to be using untracked/off-heap/system memory that grows in some cases and results in the containers getting killed.

Right now, our best guidance for sizing dotCMS's JVM in a container with a large heap is to run with `-Xmx` set to ~65% of available memory. This means that if we want to run with a 10GB heap, we need a 16GB RAM limit on the pod. 4GB of overhead for the underlying OS is kind of nuts and leads to resource over-allocation and excessive costs. It would be ideal (and more $$ efficient) if we could tighten that up and, say, run `-Xmx10g` in a pod with a 12GB RAM limit.

It seems that in some cases libraries that rely on JNI and "unsafe" off-heap memory allocations can cause system memory usage to leak/grow in a way that is very difficult to track. Apparently glibc's default memory allocator, `malloc`, can fragment memory in a way that makes it impossible for the system to reclaim it. A fix for this is to replace the glibc `malloc` implementation with one that does a better job of allowing memory to be reclaimed: `jemalloc` is a memory allocator implementation that prevents memory fragmentation and allows system memory to be reclaimed.

I was looking into implementing a new image filter using `libvips`, which is a high-performance image library that we would call through JNI. From reading about `libvips`, running it can cause memory usage to grow unbounded unless you use `jemalloc` or some other non-default memory allocator. This got me thinking that this might be part of our problem too. We use JNI in a number of places, including our image resizing libraries and the `sass` compiler, and with all the libs we include, we are probably using a bunch of "unsafe" operations in a number of places. I don't have a smoking-gun test case, but my gut is that moving to `jemalloc` has very little downside and a very real possibility of improving our container memory usage profile.
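As a rough illustration of what this could look like in practice, here is a minimal sketch of a container entrypoint fragment that opts the JVM into jemalloc via `LD_PRELOAD`. The switch name (`DOTCMS_USE_JEMALLOC`), the library path, and the hand-off command are assumptions for illustration, not the actual dotCMS image layout:

```bash
#!/bin/bash
# Minimal sketch of an entrypoint fragment that opts the JVM into jemalloc.
# Assumptions (not the real dotCMS image): the switch is called DOTCMS_USE_JEMALLOC,
# libjemalloc is installed at the Debian/Ubuntu path below, and the original
# container command is passed through as "$@".
set -e

if [[ "${DOTCMS_USE_JEMALLOC:-false}" == "true" ]]; then
  # Preload jemalloc so every malloc/free in the JVM and its JNI libraries goes through it.
  export LD_PRELOAD="/usr/lib/x86_64-linux-gnu/libjemalloc.so.2${LD_PRELOAD:+:$LD_PRELOAD}"

  # Unless the caller supplied their own MALLOC_CONF, return dirty pages to the OS more
  # aggressively and print allocator stats at exit so the effect on resident memory is
  # observable. These are standard jemalloc options.
  export MALLOC_CONF="${MALLOC_CONF:-background_thread:true,dirty_decay_ms:10000,muzzy_decay_ms:10000,stats_print:true}"
fi

# Hand off to the normal dotCMS startup command (e.g. the Tomcat/catalina launcher).
exec "$@"
```

Gating the preload behind an env var keeps jemalloc available as an opt-in rather than silently changing the default allocator for every deployment.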
Proposed Objective
Application Performance
Proposed Priority
Priority 2 - Important
Acceptance Criteria
Download the dotCMS Docker image and run it; you should see jemalloc stat output printed out.
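A hedged sketch of how this could be verified, assuming the image is published as `dotcms/dotcms` and exposes an opt-in switch like the hypothetical `DOTCMS_USE_JEMALLOC` above (the real tag and flag name may differ):

```bash
# Start the container with the (hypothetical) jemalloc switch turned on and ask
# jemalloc to print its statistics when the process exits.
docker run --rm --name dotcms-jemalloc \
  -e DOTCMS_USE_JEMALLOC=true \
  -e MALLOC_CONF=stats_print:true \
  dotcms/dotcms:latest

# In a second shell, confirm jemalloc is actually mapped into the Java process
# (PID 1 here assumes the JVM is the container's main process).
docker exec dotcms-jemalloc sh -c 'grep -m1 jemalloc /proc/1/maps'

# With stats_print:true, jemalloc dumps its statistics (allocated/active/resident
# bytes, per-arena details) to stderr when the JVM exits, so the output shows up
# in the foreground `docker run` session.
```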
External Links... Slack Conversations, Support Tickets, Figma Designs, etc.
References:
Assumptions & Initiation Needs
No response
Quality Assurance Notes & Workarounds
No response
Sub-Tasks & Estimates
No response