ignore_after cannot be cancelled sometimes (race?) #44
I believe this is related to Lines 373 to 377 in e55950f
|
Ok, so the issue is that Lines 318 to 323 in e55950f
Lines 373 to 384 in e55950f
TimeoutAfter.__aexit__ tries to tell where the cancellation came from (internal or external) based on the value of timed_out_deadline. The problem is a race: when an internal and an external cancellation both happen at around the same time, __aexit__ treats the result as an internal cancellation (a timeout), due to line 378, and suppresses the CancelledError.
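The ambiguity described above can be reproduced with plain asyncio. The following is a minimal sketch, not aiorpcx's code: `NaiveTimeout` and its `fire()` method are hypothetical stand-ins for a timeout block and its timer callback, and the demo calls `fire()` by hand so that both cancellations are guaranteed to land in the same event loop iteration.

```python
import asyncio


class NaiveTimeout:
    """Hypothetical, simplified stand-in for a timeout context manager."""

    def __init__(self):
        self.timed_out = False
        self._task = None

    def fire(self):
        # What a timer callback would do once the deadline passes:
        # record the timeout, then cancel the task (internal cancellation).
        self.timed_out = True
        self._task.cancel()

    async def __aenter__(self):
        self._task = asyncio.current_task()
        return self

    async def __aexit__(self, exc_type, exc_value, traceback):
        # The flag is the only evidence available here. It cannot reveal
        # that somebody else also called task.cancel() in the meantime.
        if exc_type is asyncio.CancelledError and self.timed_out:
            return True   # treated as a timeout: CancelledError is suppressed
        return False


async def demo():
    guard = NaiveTimeout()

    async def worker():
        async with guard:
            await asyncio.Event().wait()   # blocks until cancelled
        return "survived cancellation"

    task = asyncio.ensure_future(worker())
    await asyncio.sleep(0)   # let the worker enter the block and start waiting
    guard.fire()             # internal cancellation (the "timeout")
    task.cancel()            # external cancellation in the same loop iteration
    return await task


result = asyncio.run(demo())
print(result)
```

Only one CancelledError reaches the worker, the block classifies it as a timeout, and the external `task.cancel()` is lost, which is the behaviour a TaskGroup waiting on that task would then hang on.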
In particular, in my example in the OP, as
To further support the above explanation, observe that the following extremely hackish patch (
diff --git a/aiorpcx/curio.py b/aiorpcx/curio.py
index 296023e..40825a7 100755
--- a/aiorpcx/curio.py
+++ b/aiorpcx/curio.py
@@ -259,6 +259,7 @@ class TaskGroup:
         '''Cancel the passed set of tasks. Wait for them to complete.'''
         for task in tasks:
             task.cancel()
+            task._really_cancel = True
         if tasks:
             def pop_task(task):
@@ -372,6 +373,8 @@ class TimeoutAfter:
     async def __aexit__(self, exc_type, exc_value, traceback):
         timed_out_deadline, uncaught = _unset_task_deadline(self._task)
+        if getattr(self._task, "_really_cancel", False):
+            return False
         if exc_type not in (CancelledError, TaskTimeout,
                             TimeoutCancellationError):
             return False
|
Here is a more generic, albeit even more hackish, patch (
diff --git a/aiorpcx/curio.py b/aiorpcx/curio.py
index 296023e..ac8b814 100755
--- a/aiorpcx/curio.py
+++ b/aiorpcx/curio.py
@@ -318,8 +318,14 @@ class UncaughtTimeoutError(Exception):
 def _set_new_deadline(task, deadline):
     def timeout_task():
         # Unfortunately task.cancel is all we can do with asyncio
-        task.cancel()
+        task._orig_cancel()
         task._timed_out = deadline
+    def mycancel():
+        task._orig_cancel()
+        task._really_cancel = True
+    if not hasattr(task, "_orig_cancel"):
+        task._orig_cancel = task.cancel
+        task.cancel = mycancel
     task._deadline_handle = task._loop.call_at(deadline, timeout_task)
@@ -372,6 +378,8 @@ class TimeoutAfter:
     async def __aexit__(self, exc_type, exc_value, traceback):
         timed_out_deadline, uncaught = _unset_task_deadline(self._task)
+        if getattr(self._task, "_really_cancel", False):
+            return False
         if exc_type not in (CancelledError, TaskTimeout,
                             TimeoutCancellationError):
             return False
|
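The idea shared by both patches, tagging cancellations that come from outside so that `__aexit__` can give them priority, can be sketched with plain asyncio. This is an illustration under stated assumptions rather than the aiorpcx implementation: `tag_external_cancels`, `PriorityTimeout` and `fire()` are hypothetical names, and `fire()` is called by hand in place of a real timer so the ordering is deterministic.

```python
import asyncio


def tag_external_cancels(task):
    """Hypothetical helper: wrap task.cancel so outside callers leave a mark."""
    if hasattr(task, "_orig_cancel"):
        return  # already wrapped
    task._orig_cancel = task.cancel

    def marked_cancel(*args, **kwargs):
        task._externally_cancelled = True
        return task._orig_cancel(*args, **kwargs)

    task.cancel = marked_cancel


class PriorityTimeout:
    """Hypothetical, simplified timeout block that lets external cancels win."""

    def __init__(self):
        self.timed_out = False
        self._task = None

    def fire(self):
        # Stand-in for the timer callback: cancel through the unwrapped
        # method so the timeout itself is never tagged as external.
        self.timed_out = True
        self._task._orig_cancel()

    async def __aenter__(self):
        self._task = asyncio.current_task()
        tag_external_cancels(self._task)
        return self

    async def __aexit__(self, exc_type, exc_value, traceback):
        if exc_type is not asyncio.CancelledError:
            return False
        if getattr(self._task, "_externally_cancelled", False):
            return False       # external cancellation has priority: re-raise
        return self.timed_out  # plain timeout: suppress the CancelledError


async def run_case(external):
    guard = PriorityTimeout()

    async def worker():
        async with guard:
            await asyncio.Event().wait()   # blocks until cancelled
        return "timed out"

    task = asyncio.ensure_future(worker())
    await asyncio.sleep(0)   # let the worker enter the block
    guard.fire()             # internal cancellation
    if external:
        task.cancel()        # external cancellation in the same iteration
    await asyncio.wait([task])
    return task.cancelled()


both_cancelled = asyncio.run(run_case(external=True))
timeout_only_cancelled = asyncio.run(run_case(external=False))
print(both_cancelled, timeout_only_cancelled)
```

With both cancellations the task ends up cancelled, while a timeout alone is still swallowed by the block. Replacing `task.cancel` on the instance is as hackish here as it is in the patches; the sketch only shows why the tag resolves the ambiguity.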
(python 3.9+ only) related kyuupichan#44 (this does not fix the race condition fully, only ~1 out of 2 orderings)
The example2
import asyncio
from aiorpcx import TaskGroup
from async_timeout import timeout

async def raise_exc():
    await asyncio.sleep(0.5)
    raise Exception("asd")

async def f():
    event = asyncio.Event()
    while True:
        try:
            async with timeout(0.001):
                async with TaskGroup() as group:
                    await group.spawn(event.wait())
        except asyncio.TimeoutError:
            pass

async def main():
    async with TaskGroup() as group:
        await group.spawn(f)
        await group.spawn(raise_exc)
    print(f"taskgroup exited.")

asyncio.run(main())
As mentioned in that thread, even
Also see
I intend to monkey-patch
diff --git a/electrum/util.py b/electrum/util.py
index 1dc02ca6eb..f73447eca7 100644
--- a/electrum/util.py
+++ b/electrum/util.py
@@ -1246,6 +1246,37 @@ class OldTaskGroup(aiorpcx.TaskGroup):
if self.completed:
self.completed.result()
+# We monkey-patch aiorpcx TimeoutAfter (used by timeout_after and ignore_after API),
+# to fix a timing issue present in asyncio as a whole re timing out tasks.
+# To see the issue we are trying to fix, consider example:
+#     async def outer_task():
+#         async with timeout_after(0.1):
+#             await inner_task()
+# When the 0.1 sec timeout expires, inner_task will get cancelled by timeout_after (=internal cancellation).
+# If around the same time (in terms of event loop iterations) another coroutine
+# cancels outer_task (=external cancellation), there will be a race.
+# Both cancellations work by propagating a CancelledError out to timeout_after, which then
+# needs to decide (in TimeoutAfter.__aexit__) whether it's due to an internal or external cancellation.
+# AFAICT asyncio provides no reliable way of distinguishing between the two.
+# This patch tries to always give priority to external cancellations.
+# see https://github.com/kyuupichan/aiorpcX/issues/44
+# see https://github.com/aio-libs/async-timeout/issues/229
+# see https://bugs.python.org/issue42130 and https://bugs.python.org/issue45098
+def _aiorpcx_monkeypatched_set_new_deadline(task, deadline):
+    def timeout_task():
+        task._orig_cancel()
+        task._timed_out = None if getattr(task, "_externally_cancelled", False) else deadline
+    def mycancel(*args, **kwargs):
+        task._orig_cancel(*args, **kwargs)
+        task._externally_cancelled = True
+        task._timed_out = None
+    if not hasattr(task, "_orig_cancel"):
+        task._orig_cancel = task.cancel
+        task.cancel = mycancel
+    task._deadline_handle = task._loop.call_at(deadline, timeout_task)
+
+aiorpcx.curio._set_new_deadline = _aiorpcx_monkeypatched_set_new_deadline
+
class NetworkJobOnDefaultServer(Logger, ABC):
"""An abstract base class for a job that runs on the main network
I think |
If you want, I could open a PR with some variant of the above. |
I've tested now with example snippet with `asyncio.timeout`
import asyncio
from aiorpcx import TaskGroup

async def raise_exc():
    await asyncio.sleep(0.5)
    raise Exception("asd")

async def f():
    event = asyncio.Event()
    while True:
        try:
            # note: requires python 3.11
            async with asyncio.timeout(0.001):
                await event.wait()
        except asyncio.TimeoutError:
            pass

async def main():
    async with TaskGroup() as group:
        await group.spawn(f)
        await group.spawn(raise_exc)
    print(f"taskgroup exited. {group.exception=}")

asyncio.run(main())
|
wasted some time because asyncio.wait_for() was suppressing cancellations. [0][1][2]
deja vu... [3]
Looks like this is finally getting fixed in cpython 3.12 [4]
So far away...
In attempt to avoid encountering this again, let's try using asyncio.timeout in 3.11, which is how upstream reimplemented wait_for in 3.12 [4], and aiorpcx.timeout_after in 3.8-3.10.
[0] python/cpython#86296
[1] https://bugs.python.org/issue42130
[2] https://bugs.python.org/issue45098
[3] kyuupichan/aiorpcX#44
[4] python/cpython#98518
(re git tag `0.22.1`)
Sometimes `ignore_after` ignores cancellation, which means if it is used in a task as part of a TaskGroup, the group's join will never finish.
Consider the following example, which I would expect to print `taskgroup exited.` after <1 second:
The same example always finishes on aiorpcx `0.18.7`.
Note that increasing the timeout constant of ignore_after seems to decrease the chance the issue occurs; try running the example with e.g. `0.05` seconds instead - with that on my machine it prints `taskgroup exited.` around half the time.