Possible memory leak when collecting events for profiling #625
Comments
Can something like valgrind maybe provide details on where those allocations are taking place?
Here are the
I ran it several more times, and it looks like profiling in OpenCL prevents the memory from being freed.
So I tried to collect only the timestamps for each event instead of the complete process. The memory profile now looks like this. One would have expected 10 memory releases (since 10 files are processed), but fewer are visible.
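A minimal sketch of what "collecting only timestamps" could look like, assuming the events are PyOpenCL events created on a queue with profiling enabled. The names EventTimes and to_timestamps are hypothetical and are not taken from silx; a stand-in object replaces the real event so the snippet runs without a GPU.

```python
from collections import namedtuple
from types import SimpleNamespace

# Hypothetical record type: only the kernel name and two timestamps are kept.
EventTimes = namedtuple("EventTimes", ["name", "start", "stop"])


def to_timestamps(name, event):
    """Wait for the event, copy its profiling timestamps (in ns), keep nothing else."""
    event.wait()
    return EventTimes(name, event.profile.start, event.profile.end)


# Stand-in mimicking the attributes of a pyopencl.Event used above (assumption).
fake_event = SimpleNamespace(
    wait=lambda: None,
    profile=SimpleNamespace(start=1000, end=4500),
)

record = to_timestamps("integrate", fake_event)
elapsed_ms = 1e-6 * (record.stop - record.start)
print(f"{record.name}: {elapsed_ms:.4f} ms")
```

With this approach the list holds only small Python objects, so the event itself can be released as soon as its timestamps have been read.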
I got struck by something similar in another project ... but profiling was not involved this time.
Describe the bug
Very large (host) memory consumption has been observed when running an OpenCL application in profiling mode.
Example: Processing 10000 4 Mpix images (int32) with ~6 kernels per image on an Nvidia Tesla A40 gets (OOM-)killed on a computer with 200 GB of memory. The computer could hold all images, uncompressed, in memory.
I used the tracemalloc tool from Python on the application and found no noticeable leak at the Python level, indicating that the leak comes from malloc calls performed outside the scope of Python. I investigated a possible leak coming from HDF5 via h5py, since all data were read and written in this format, but this was not the case.
When profiling is disabled, the memory consumption does not exceed a few percent of the total memory.
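For reference, a Python-level check of this kind can be sketched with the standard-library tracemalloc module as below. The helper python_level_growth and the toy workload are hypothetical and are not the code used in the investigation. tracemalloc only traces allocations made through Python's own allocators, so memory allocated natively by a driver or a C library would not show up in such a comparison.

```python
import tracemalloc


def python_level_growth(workload, top=5):
    """Run workload() and report the largest Python-level allocation differences."""
    tracemalloc.start()
    before = tracemalloc.take_snapshot()
    result = workload()  # keep the result alive so its memory is still allocated
    after = tracemalloc.take_snapshot()
    tracemalloc.stop()
    stats = after.compare_to(before, "lineno")
    return result, stats[:top]


def workload():
    # Toy stand-in for the real processing loop (assumption): about 1 MB of bytes objects.
    return [bytes(1000) for _ in range(1000)]


kept, top_stats = python_level_growth(workload)
for stat in top_stats:
    print(stat)
```

If the process memory keeps growing while a comparison like this stays flat, the growth is likely happening outside the Python heap.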
To Reproduce
Investigated in:
silx-kit/pyFAI#1744
Expected behavior
Some memory growth is expected from keeping the list of all events, but it should not exceed 3.4 MB for 60000 kernels (when stored as a 2-namedtuple).
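A rough way to reproduce this kind of estimate, assuming each record is a two-field namedtuple and counting only the tuple container itself (not the objects it references). The field names below are an assumption for illustration, and the per-record size depends on the Python build.

```python
import sys
from collections import namedtuple

# Hypothetical two-field record, one per enqueued kernel.
EventDescription = namedtuple("EventDescription", ["name", "event"])

n_kernels = 60000
# Size of the tuple container only; the name string and event object are not included.
per_record = sys.getsizeof(EventDescription("kernel", None))
total_mb = n_kernels * per_record / 1e6
print(f"{per_record} bytes per record, {total_mb:.1f} MB for {n_kernels} records")
```

The point of the estimate is the order of magnitude: a few megabytes of bookkeeping, far below the hundreds of gigabytes observed.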
Environment (please complete the following information):
Additional context
The list of events is handled at https://github.com/silx-kit/silx/blob/master/src/silx/opencl/processing.py#L288