Cancellation fixes and tests. #445

cheatfate · 2023-08-18T01:04:23Z

No description provided.

arnetheduck · 2023-09-05T12:29:39Z

chronos/asyncfutures2.nim

+  ## done e.g. changes its state (become completed, failed or cancelled).
+  ##
+  ## If ``fut`` is already finished (completed, failed or cancelled) result
+  ## Future[void] object will be returned complete.


cancelAndWait() returns Future[void] which will become complete when future: FutureBase become finished.

arnetheduck · 2023-09-05T12:32:27Z

chronos/asyncfutures2.nim

+
+proc checkedCancelAndWait*(fut: FutureBase): Future[bool] =
+  ## Initiate cancellation process for Future ``fut`` and wait until ``fut`` is
+  ## done e.g. changes its state (become completed, failed or cancelled).


Suggested change

## done e.g. changes its state (become completed, failed or cancelled).

## finished e.g. changes its state (become completed, failed or cancelled).

Initially it was supposed that cancel() should return boolean flag which indicates:
true if cancellation was successful - e.g. last Future in sequence was moved from Pending to Cancelled, all future callbacks got scheduled.
false if cancellation was unsuccessfull, e.g. we encounter any Future in sequence which are not Pending. It means that this sequence of Futures are moving towards finish and we can't cancel it.

But somehow this flag was discarded and not used, so i expect there present a lot of source code which do not respect boolean result and do not expect cancel(): boolean. So checkCancelAndWait it is same as cancelAndWait but it returns flag which indicates is future was actually cancelled or not.

what does the boolean mean though? It feels like it's too weak to be useful - an async operation is typically made up of many sub-operations and when we start cancelling things, some of these may have completed and some may be cancelled - I'm not sure a boolean alone can capture this complexity in a useful way and this might be why everyone discards it.

Basically, there are three options: "it was cancelled", "it finished" and "don't know/partial result" and the boolean doesn't represent this well enough to be useful practically.

Cancellation could be compared to "message delivery", so this boolean flag indicates whether message was delivered to the bottom layer of target Future or it was not delivered.

what does the boolean mean though? It feels like it's too weak to be useful - an async operation is typically made up of many sub-operations and when we start cancelling things, some of these may have completed and some may be cancelled - I'm not sure a boolean alone can capture this complexity in a useful way and this might be why everyone discards it.

chronos tracks dependent Futures and cancellation using this tracking mechanism to deliver cancellation message to the bottom layer. So if bottom layer or any other Future before bottom layer will have status != Pending it will stop at this moment and returns false. But if the bottom layer Future has status == Pending, it will become Cancelled. And in this case cancel() will return true.

chronos/asyncfutures2.nim

arnetheduck · 2023-09-05T12:40:39Z

chronos/asyncfutures2.nim

+    res = fut.checkedCancel()
+  retFuture
+
+proc noCancelWait*[T](future: Future[T]): Future[T] =


same here - when should this be used? we have 4 cancellation-related functions now and the precise conditions where one should be used over the other is .. confusing (checkedCancelAndWait, cancelAndWait, cancel and noCancelWait)

if this function does what I think it does, await withoutCancellation(...) is probably a better name - this function does no "waiting" on its own

This procedure will create a Future[T] object which will have OwnCancelSchedule and it acts like a proxy for any Future[T]. In this case cancellation mechanism unable to go through this proxy Future[T] (it will not see any children behind it) and because it only see proxy Future[T] it is unable to schedule cancellation callbacks because of OwnCancelSchedule flag.

This procedure should be used only for closing procedures which are usually used to free resources. This procedures should not be cancelled to avoid resource leaks.

This procedure should be used only for closing procedures which are usually used to free resources. This procedures should not be cancelled to avoid resource leaks.

this needs to be documented clearly, something like "This function introduces a cancellation barrier that prevents cancellations from propagating to the given future thus ensuring that even during cancellation, the work of future is finished.

Disabling cancellation is useful during resource cleanup to avoid leaks, but may also lead to delays or hangs when future takes long to finish, thus breaking expectations of downstream code on cancellation being fast."

We'll have to be pretty careful about this - ie if a cancellation suddenly takes several second, for example to complete a "proper" ssl shutdown, this will certainly mess up any code that waits for the cancellation to finish - imagine a timeout on sending an attestation via rest: if we block everything until close is done on one of N connections, this can seriously mess up progress on the next attestation.

Cancellation process is always fast as lightning, it just follows recursively through Future's linked list, searching for the bottom Future. So lets call cancel() - cancellation, because cancelAndWait is not a cancellation process. At the end of cancelAndWait() you could get Future at any state except Pending. Because some Future[T] code could catch CancelledError and convert this signal to either error or value.

When you talking about freezes or blocks you need to understand that OS could block you, or your closeWait procedure could wait for event/signal/mutex and become blocking. Nothing we can do at this moment.

What we trying to do with this primitive is to avoid cancellation which could come while we already have "result" but perform resources cleanup in defer or finally branch.

But of course people could exploit this procedure usage and start using it everywhere. In this case such loops, will never be interrupted.

while true: await noCancelWait sleepAsync(1.seconds) echo "Slot start"

In this case such loops, will never be interrupted.

yes - this is what I mean when I say it might hang: the calling code wants to stop ASAP but the function itself blocks cancellation.

It applies to await closeWait in particular - if closeWait, like in ssl, sends some shutdown marker to close the stream gracefully, await closeWait might take long if connectivity is bad - if we force waiting for closeWait we get hangs and this is risky.

Perhaps another way to avoid the closeWait leak in particular is to allow cancellation to propagate but change its handling: for example, if closeWait receives a cancellation request, it would release the resource it's holding but do so "abortively" without waiting further - this would work well for ssl and the like at least - a database connection could do the same, ie optimistically send a "rollback" command and close the connection without waiting for a response. In this sense, closeWait means "try to close gracefully" and cancelling closeWait means "close forcefully" - this strategy also avoids a leak.

closeWait could catch CancelledError and perform any type of cancellation behavior it wants.

And about blocking calls in async, we do cooperative multitasking and we do not have thread pool execution so yes - we have a flaw and anybody could run blocking function and do something bad to every other actors in async loop.

I do not care and nobody actually should care about performance of cancelAndWait. If you need performance and do not care about result, you can use cancel(). If you want to avoid some re-circulation issues you can do cancelAndWait. But you should not expected that cancelAndWait will be fast or slow.

arnetheduck · 2023-09-05T12:53:44Z

chronos/asyncfutures2.nim

-
-  future.internalMustCancel = true
-  return true
+      if FutureFlag.OwnCancelSchedule notin future.internalFlags:


instead of a flag, could the cancelcb return a boolean?

The code is a bit confusing because both OwnCancelSchedule and an empty cancelcb is set for this to work - we could perhaps avoid the state and associated complexity by making this a local decision in the cb.

we can emulate the old behavior in the cancelcb= function by wrapping it

cancelcb could not return a flag, because it will introduce a lot of breaking changes for all projects which has Future[T] closure procedures with cancellation callbacks.

breaking changes

as mentioned, we can maybe work around this with a wrapper, ie

proc cancelCb(fut: auto, cb: OldCbType) = fut.internalcancelcb = proc(): bool = cb() true

as is, it's .. difficult to write correct callbacks

this would also go in the 4.0 branch so it's a good opportunity

I do not like and i will not agree on any changes with callbacks, having multiple callback types just complicate problem solving. So now i prefer Future flags in here, because it

Extensible, i can always add/remove flags at any moment without breaking old code. While introducing new callbacks requires deprecation path and so on. More dirty solution as for me.

Almost not visible for end users of library.

It would be hard to explain why this result is needed and what it achieves. And probably we will see a lot of ugly results in the code.

Avoid `break` in problematic test.

Fix closeWait() calls to use noCancelWait() predicate. Adding sleep to flaky MacOS test.

…el target Future (mustCancel behavior).

chronos/asyncfutures2.nim

chronos/asyncsync.nim

Address review comments.

diegomrsantos · 2023-09-27T12:10:08Z

chronos/asyncfutures2.nim

-  ## The procedure then has a chance to clean up or even deny the request
-  ## using `try/except/finally`.
+  ## NOTE: This procedure does not guarantee that cancellation will actually
+  ## happened.


cheatfate marked this pull request as draft August 18, 2023 03:44

cheatfate changed the title ~~Add callTick and stream cancellation tests.~~ Cancellation fixes and tests. Aug 25, 2023

cheatfate marked this pull request as ready for review August 25, 2023 10:23

arnetheduck reviewed Sep 5, 2023

View reviewed changes

cheatfate force-pushed the calltick branch from d9df6ef to e5e3968 Compare September 5, 2023 16:08

cheatfate added 14 commits September 9, 2023 21:31

Add callTick and stream cancellation tests.

0e2c198

Fix stepsAsync() test.

2c1e665

Cancellation changes.

4d2f8d7

Update and add more cancellation tests.

b6597eb

Fix Posix shutdown call to handle ENOTCONN error.

17a6f58

With new changes to to cancellation its now possible.

b16d4dc

Refactor testsoon.nim to not produce artifacts after tests are finished.

2c66cca

Debugging MacOS issue.

49ba589

Adjust flaky test times.

9efed95

Fix issue.

d3f45d7

Add test for issue #334 which was also addressed in this PR.

adda289

Avoid `break` in problematic test.

Add noCancelWait() call which prohibits cancellation.

2f5e823

Fix closeWait() calls to use noCancelWait() predicate. Adding sleep to flaky MacOS test.

Remove all debugging echoes.

9c63541

Fix cancelAndWait() which now could perform multiple attempts to canc…

772a7d0

…el target Future (mustCancel behavior).

cheatfate force-pushed the calltick branch from e5e3968 to 772a7d0 Compare September 9, 2023 18:34

cheatfate added 3 commits September 10, 2023 05:56

Fix issues revealed by switch to different cancelAndWait().

e8995ea

Address review comments.

6731bcd

Fix testutils compilation warning.

7e3682f

arnetheduck approved these changes Sep 14, 2023

View reviewed changes

cheatfate added 3 commits September 14, 2023 19:31

Rename callTick() to internalCallTick().

c2807cc

Add some documentation comments.

2231ca1

Disable flaky ratelimit test.

04d8353

arnetheduck reviewed Sep 15, 2023

View reviewed changes

chronos/asyncfutures2.nim Outdated Show resolved Hide resolved

arnetheduck reviewed Sep 15, 2023

View reviewed changes

chronos/asyncfutures2.nim Outdated Show resolved Hide resolved

arnetheduck reviewed Sep 15, 2023

View reviewed changes

chronos/asyncsync.nim Show resolved Hide resolved

arnetheduck approved these changes Sep 15, 2023

View reviewed changes

Rename noCancelWait() to noCancel().

947847c

Address review comments.

cheatfate merged commit 2e8551b into master Sep 15, 2023
12 checks passed

cheatfate deleted the calltick branch September 15, 2023 16:38

Menduist mentioned this pull request Sep 26, 2023

Update README regarding cancellation #450

Merged

diegomrsantos reviewed Sep 27, 2023

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Cancellation fixes and tests. #445

Cancellation fixes and tests. #445

cheatfate commented Aug 18, 2023

arnetheduck Sep 5, 2023

cheatfate Sep 14, 2023 •

edited

Loading

arnetheduck Sep 5, 2023

cheatfate Sep 5, 2023

arnetheduck Sep 6, 2023

cheatfate Sep 8, 2023

arnetheduck Sep 5, 2023

cheatfate Sep 5, 2023

arnetheduck Sep 6, 2023

cheatfate Sep 7, 2023

arnetheduck Sep 7, 2023

cheatfate Sep 7, 2023

arnetheduck Sep 5, 2023

cheatfate Sep 5, 2023

arnetheduck Sep 5, 2023

arnetheduck Sep 5, 2023

cheatfate Sep 7, 2023

cheatfate Sep 7, 2023

diegomrsantos Sep 27, 2023

	## done e.g. changes its state (become completed, failed or cancelled).
	## finished e.g. changes its state (become completed, failed or cancelled).

Cancellation fixes and tests. #445

Cancellation fixes and tests. #445

Conversation

cheatfate commented Aug 18, 2023

Choose a reason for hiding this comment

cheatfate Sep 14, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

cheatfate Sep 14, 2023 •

edited

Loading