-
Notifications
You must be signed in to change notification settings - Fork 51
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cancellation fixes and tests. #445
Conversation
chronos/asyncfutures2.nim
Outdated
## done e.g. changes its state (become completed, failed or cancelled). | ||
## | ||
## If ``fut`` is already finished (completed, failed or cancelled) result | ||
## Future[void] object will be returned complete. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
void?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cancelAndWait()
returns Future[void]
which will become complete when future: FutureBase
become finished.
chronos/asyncfutures2.nim
Outdated
|
||
proc checkedCancelAndWait*(fut: FutureBase): Future[bool] = | ||
## Initiate cancellation process for Future ``fut`` and wait until ``fut`` is | ||
## done e.g. changes its state (become completed, failed or cancelled). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
## done e.g. changes its state (become completed, failed or cancelled). | |
## finished e.g. changes its state (become completed, failed or cancelled). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Initially it was supposed that cancel()
should return boolean flag which indicates:
true
if cancellation was successful - e.g. last Future in sequence was moved from Pending
to Cancelled
, all future callbacks got scheduled.
false
if cancellation was unsuccessfull, e.g. we encounter any Future in sequence which are not Pending
. It means that this sequence of Futures are moving towards finish and we can't cancel it.
But somehow this flag was discarded and not used, so i expect there present a lot of source code which do not respect boolean result and do not expect cancel(): boolean
. So checkCancelAndWait
it is same as cancelAndWait
but it returns flag which indicates is future was actually cancelled or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what does the boolean mean though? It feels like it's too weak to be useful - an async operation is typically made up of many sub-operations and when we start cancelling things, some of these may have completed and some may be cancelled - I'm not sure a boolean alone can capture this complexity in a useful way and this might be why everyone discards it.
Basically, there are three options: "it was cancelled", "it finished" and "don't know/partial result" and the boolean doesn't represent this well enough to be useful practically.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cancellation could be compared to "message delivery", so this boolean flag indicates whether message was delivered to the bottom layer of target Future
or it was not delivered.
what does the boolean mean though? It feels like it's too weak to be useful - an async operation is typically made up of many sub-operations and when we start cancelling things, some of these may have completed and some may be cancelled - I'm not sure a boolean alone can capture this complexity in a useful way and this might be why everyone discards it.
chronos tracks dependent Futures
and cancellation using this tracking mechanism to deliver cancellation message to the bottom layer. So if bottom layer or any other Future before bottom layer will have status != Pending
it will stop at this moment and returns false
. But if the bottom layer Future has status == Pending
, it will become Cancelled
. And in this case cancel()
will return true
.
chronos/asyncfutures2.nim
Outdated
res = fut.checkedCancel() | ||
retFuture | ||
|
||
proc noCancelWait*[T](future: Future[T]): Future[T] = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here - when should this be used? we have 4 cancellation-related functions now and the precise conditions where one should be used over the other is .. confusing (checkedCancelAndWait
, cancelAndWait
, cancel
and noCancelWait
)
if this function does what I think it does, await withoutCancellation(...)
is probably a better name - this function does no "waiting" on its own
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This procedure will create a Future[T]
object which will have OwnCancelSchedule
and it acts like a proxy for any Future[T]. In this case cancellation
mechanism unable to go through this proxy Future[T]
(it will not see any children behind it) and because it only see proxy Future[T]
it is unable to schedule cancellation callbacks because of OwnCancelSchedule
flag.
This procedure should be used only for closing procedures which are usually used to free resources. This procedures should not be cancelled to avoid resource leaks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This procedure should be used only for closing procedures which are usually used to free resources. This procedures should not be cancelled to avoid resource leaks.
this needs to be documented clearly, something like "This function introduces a cancellation barrier that prevents cancellations from propagating to the given future
thus ensuring that even during cancellation, the work of future
is finished.
Disabling cancellation is useful during resource cleanup to avoid leaks, but may also lead to delays or hangs when future
takes long to finish, thus breaking expectations of downstream code on cancellation being fast."
We'll have to be pretty careful about this - ie if a cancellation suddenly takes several second, for example to complete a "proper" ssl shutdown, this will certainly mess up any code that waits for the cancellation to finish - imagine a timeout on sending an attestation via rest: if we block everything until close is done on one of N connections, this can seriously mess up progress on the next attestation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cancellation process is always fast as lightning, it just follows recursively through Future's linked list, searching for the bottom Future. So lets call cancel()
- cancellation, because cancelAndWait
is not a cancellation process. At the end of cancelAndWait()
you could get Future at any state except Pending
. Because some Future[T] code could catch CancelledError
and convert this signal to either error or value.
When you talking about freezes or blocks you need to understand that OS could block you, or your closeWait
procedure could wait for event/signal/mutex
and become blocking. Nothing we can do at this moment.
What we trying to do with this primitive is to avoid cancellation which could come while we already have "result" but perform resources cleanup in defer
or finally
branch.
But of course people could exploit this procedure usage and start using it everywhere. In this case such loops, will never be interrupted.
while true:
await noCancelWait sleepAsync(1.seconds)
echo "Slot start"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case such loops, will never be interrupted.
yes - this is what I mean when I say it might hang: the calling code wants to stop ASAP but the function itself blocks cancellation.
It applies to await closeWait
in particular - if closeWait
, like in ssl, sends some shutdown marker to close the stream gracefully, await closeWait
might take long if connectivity is bad - if we force waiting for closeWait
we get hangs and this is risky.
Perhaps another way to avoid the closeWait
leak in particular is to allow cancellation to propagate but change its handling: for example, if closeWait
receives a cancellation request, it would release the resource it's holding but do so "abortively" without waiting further - this would work well for ssl and the like at least - a database connection could do the same, ie optimistically send a "rollback" command and close the connection without waiting for a response. In this sense, closeWait
means "try to close gracefully" and cancelling closeWait
means "close forcefully" - this strategy also avoids a leak.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
closeWait
could catch CancelledError
and perform any type of cancellation behavior it wants.
And about blocking calls in async, we do cooperative multitasking and we do not have thread pool execution so yes - we have a flaw and anybody could run blocking function and do something bad to every other actors in async loop.
I do not care and nobody actually should care about performance of cancelAndWait
. If you need performance and do not care about result, you can use cancel()
. If you want to avoid some re-circulation issues you can do cancelAndWait
. But you should not expected that cancelAndWait
will be fast or slow.
chronos/asyncfutures2.nim
Outdated
|
||
future.internalMustCancel = true | ||
return true | ||
if FutureFlag.OwnCancelSchedule notin future.internalFlags: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead of a flag, could the cancelcb
return a boolean?
The code is a bit confusing because both OwnCancelSchedule
and an empty cancelcb is set for this to work - we could perhaps avoid the state and associated complexity by making this a local decision in the cb.
we can emulate the old behavior in the cancelcb=
function by wrapping it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cancelcb
could not return a flag, because it will introduce a lot of breaking changes for all projects which has Future[T] closure procedures with cancellation callbacks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
breaking changes
as mentioned, we can maybe work around this with a wrapper, ie
proc cancelCb(fut: auto, cb: OldCbType) =
fut.internalcancelcb = proc(): bool =
cb()
true
as is, it's .. difficult to write correct callbacks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this would also go in the 4.0 branch so it's a good opportunity
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not like and i will not agree on any changes with callbacks, having multiple callback types just complicate problem solving. So now i prefer Future flags in here, because it
- Extensible, i can always add/remove flags at any moment without breaking old code. While introducing new callbacks requires deprecation path and so on. More dirty solution as for me.
- Almost not visible for end users of library.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be hard to explain why this result
is needed and what it achieves. And probably we will see a lot of ugly
results in the code.
Avoid `break` in problematic test.
Fix closeWait() calls to use noCancelWait() predicate. Adding sleep to flaky MacOS test.
…el target Future (mustCancel behavior).
Address review comments.
## The procedure then has a chance to clean up or even deny the request | ||
## using `try/except/finally`. | ||
## NOTE: This procedure does not guarantee that cancellation will actually | ||
## happened. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
happen.
No description provided.