-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Serialize GAP calls from within Julia. #505
base: master
Are you sure you want to change the base?
Conversation
Codecov Report
@@ Coverage Diff @@
## master #505 +/- ##
==========================================
- Coverage 72.35% 72.31% -0.04%
==========================================
Files 71 72 +1
Lines 5266 5461 +195
==========================================
+ Hits 3810 3949 +139
- Misses 1456 1512 +56
|
src/sync.jl
Outdated
@inline function lock() | ||
tid = Threads.threadid() | ||
if sync_level[tid] == 0 | ||
Base.lock(mutex) | ||
end | ||
sync_level[tid] += 1 | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand: We are using a ReentrantLock
. Why do we need to keep track on every thread whether we already locked the lock? I.e. why do we need sync_level
?
Is it because locks are per-task but you want per-thread locks? But why? Shouldn't we prevent any attempts to invoke GAP simultaneously from two different tasks, even if they happen to run on the same thread? I mean, GAP still won't expect two concurrent executions and things can and will break in various ways, no?
(BTW, note that at least in some build configurations, tasks can migrate between threads... however, I have no idea if this can happen while the task holds a lock is being held... hopefully not).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is something that's still a bit of a work in progress.
Julia has ReentrantLock
, which is reentrant on a per-task basis. It also has SpinLock
, which works on a per-thread basis, but will eat CPU while blocking. So, in a way, that's a placeholder implementation, and I think I may actually have to wrap a pthread mutex for this to work properly.
Even if ReentrantLock
works, the lock()
method for ReentrantLock()
is comparatively expensive, which is why I tried to optimize it for the nested case.
(ReentrantLock
working on a per-task basis is also what I suspect may be one of the causes for the crashes I'm seeing.)
With the Julia @sync macros now doing the actual synchronization work, the C implementation is no longer needed and can be removed.
Locking relied on tasks not being migrated between threads; we now associate the lock status with the locking task.
Macros and precompilation do not necessarily mix in Julia. If macro evaluation is dependent on program state and the program state changes later, the macro does not get reevaluated. However, Julia does track function dependencies, so we can force recompilation by altering any underlying regular functions.
e8e3cdf
to
b4125ba
Compare
Did you have a look at replacing |
I have played around with various implementations, but haven't really seen any performance differences, especially as there is also the try/catch overhead, which appears to be of about the same magnitude (and also potentially disrupts optimizations). I actually removed the check to see if the lock is already being held, as it performed slightly worse and added unnecessary complexity. A reentrant thread lock might still be useful in allowing several tasks to run GAP code on the same thread, but the logic then becomes a bit tricky. I may be looking into biased locking, especially as it might allow us to avoid exception management in some cases. This would also be useful for the Singular.jl code. |
We allow for two alternative synchronization modes. One uses a standard mutex, the other pins GAP to a specific thread (the main thread by default). When pinned to a thread, GAP can only be called from that thread. Calling it from another thread will result in an error. While pinning is more restrictive than locking, it is also considerably faster, incurring very little overhead.
src/GAP.jl
Outdated
@@ -383,10 +384,10 @@ function prompt() | |||
# save the current SIGINT handler | |||
# HACK: the hardcoded value for SIG_DFL is not portable, revise this | |||
# install GAP's SIGINT handler | |||
old_sigint = ccall(:signal, Ptr{Cvoid}, (Cint, Ptr{Cvoid}), Base.SIGINT, C_NULL) | |||
old_sigint = @sync ccall(:signal, Ptr{Cvoid}, (Cint, Ptr{Cvoid}), Base.SIGINT, C_NULL) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does that make sense for a call into the C library? Anyway, perhaps we can just restrict GAP.prompt() to the main thread or so somehow? If we ever end up racing this function from multiple threads, all hell will break loose anyway...
end)) | ||
end | ||
|
||
function switch(mode::SyncMode) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this function exist only for debugging / benchmarking? Or whatever else is its purpose? In either case, a brief source comment indicating its purpose would be nice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The goal here is to let an (experienced) user switch between locking modes. By default, this pins the GAP interpreter to the main thread. However, one can then also switch to full mutex mode in case GAP needs to be accessed from more than one thread, or to turn off locking in case somebody wants to avoid the overhead involved and takes care of synchronization in a different fashion, e.g. via using @sync
directly.
I agree that there should be a comment and will be adding one. I'll probably also rename @sync
, since Julia's Base.@sync
has different semantics and that would be confusing.
# by installing the proper version during __init__(), we force | ||
# selective recompilation of the affected functions as needed. | ||
|
||
function _mode_mutex() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is unused in practice, it seems (as nothing calls switch()
, it's single caller). It should either be removed or else a brief source comment be added which explains why it's here.
This needs to be rebased, and we need to figure out what to do with it... |
This is a draft version for ensuring that GAP is not called from more than one Julia thread at a time.
It currently works with
Threads.nthreads() == 1
, but not if you, say,export JULIA_NUM_THREADS=4
and then run the following test:Most likely, there is still a call to GAP that is not properly synced and causing the resultant crashes.
Closes #483