fix(cron): cancel orphan coroutine on delivery timeout before standalone fallback (salvage #13495)#13517

Merged
teknium1 merged 3 commits into main from hermes/hermes-7d482fb6
Apr 21, 2026

Conversation

@teknium1
Contributor

On cron delivery timeout, the live-adapter coroutine is now cancelled. Previously the orphan coroutine could still complete on the event loop after the standalone fallback had already delivered, producing a duplicate send.

Salvage of #13495 by @VTRiot. Cherry-picked onto current main; tests rewritten to invoke the real functions.

Changes

  • cron/scheduler.py: _deliver_result (timeout=60) and _send_media_via_adapter (timeout=30) wrap future.result() in try/except TimeoutError: future.cancel(); raise. Mirrors the existing precedent in tools/mcp_tool.py:1513.
  • tests/cron/test_scheduler.py: rewritten. The original tests replicated the try/except/cancel pattern inline against a mocked future — testing Python's semantics, not the scheduler. Replaced with tests that call _deliver_result and _send_media_via_adapter directly using a real concurrent.futures.Future whose .result() raises TimeoutError, and assert cancel() fires plus downstream behavior (standalone fallback delivers, next media file still dispatched).
  • scripts/release.py: register VTRiot in AUTHOR_MAP (VTRiot's own commit).
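
The cancel-on-timeout wrapper described in the first bullet can be sketched roughly as follows. This is a minimal illustration, not the code from cron/scheduler.py: `result_or_cancel`, `run_demo`, and `slow_send` are hypothetical names, and the timeouts are shortened so the demo finishes quickly. The sketch imports `TimeoutError` from `concurrent.futures` so it also works on Python versions older than 3.11, where that class is distinct from the builtin.

```python
import asyncio
import threading
from concurrent.futures import TimeoutError as FutureTimeoutError


def result_or_cancel(future, timeout):
    """Wait for a cross-thread future; cancel it if the wait times out."""
    try:
        return future.result(timeout=timeout)
    except FutureTimeoutError:
        # Without this call the coroutine keeps running on the event loop.
        future.cancel()
        raise


def run_demo():
    """Return (timed_out, coroutine_saw_cancel, message_sent)."""
    loop = asyncio.new_event_loop()
    thread = threading.Thread(target=loop.run_forever, daemon=True)
    thread.start()

    saw_cancel = threading.Event()
    sent = []

    async def slow_send():  # hypothetical stand-in for a live-adapter send
        try:
            await asyncio.sleep(0.5)
            sent.append("live")
        except asyncio.CancelledError:
            saw_cancel.set()
            raise

    future = asyncio.run_coroutine_threadsafe(slow_send(), loop)
    timed_out = False
    try:
        result_or_cancel(future, timeout=0.05)
    except FutureTimeoutError:
        timed_out = True

    # Cancellation is forwarded to the task on the loop thread.
    coroutine_saw_cancel = saw_cancel.wait(timeout=2)
    loop.call_soon_threadsafe(loop.stop)
    thread.join(timeout=2)
    loop.close()
    return timed_out, coroutine_saw_cancel, bool(sent)
```

Re-raising after `future.cancel()` keeps the caller's existing timeout handling unchanged, so whatever fallback already ran on timeout still runs.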

Validation

Mutation test: reverted both try/except wrappers in cron/scheduler.py → both new tests fail with assert [] == [True]. Restored → pass.

scripts/run_tests.sh tests/cron/test_scheduler.py
============================== 88 passed in 1.97s ==============================

Why it matters more after #13021

#13021 made cron jobs run in parallel via ThreadPoolExecutor. With multiple delivery coroutines in flight on the same event loop, exposure to this orphan-coroutine bug increases.

Closes #13495.

VTRiot and others added 3 commits April 21, 2026 05:44
…one fallback

When the live adapter delivery path (_deliver_result) or media send path
(_send_media_via_adapter) times out at future.result(timeout=N), the
underlying coroutine scheduled via asyncio.run_coroutine_threadsafe can
still complete on the event loop, causing a duplicate send after the
standalone fallback runs.

Cancel the future on TimeoutError before re-raising, so the standalone
fallback is the sole delivery path.

Adds TestDeliverResultTimeoutCancelsFuture and
TestSendMediaTimeoutCancelsFuture.
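
The duplicate send described in the commit message above can be reproduced in isolation. The sketch below is an assumption-laden toy, not the scheduler: `deliver` and `live_send` are hypothetical, the sends are just strings appended to a list, and the delays are far shorter than the real 60 s and 30 s timeouts.

```python
import asyncio
import threading
import time
from concurrent.futures import TimeoutError as FutureTimeoutError


def deliver(cancel_on_timeout):
    """Return the sends recorded for one delivery attempt."""
    sends = []
    loop = asyncio.new_event_loop()
    thread = threading.Thread(target=loop.run_forever, daemon=True)
    thread.start()

    async def live_send():  # hypothetical slow live-adapter send
        await asyncio.sleep(0.2)
        sends.append("live adapter")

    future = asyncio.run_coroutine_threadsafe(live_send(), loop)
    try:
        future.result(timeout=0.05)
    except FutureTimeoutError:
        if cancel_on_timeout:
            future.cancel()
        sends.append("standalone fallback")

    time.sleep(0.5)  # give any orphan coroutine time to finish
    loop.call_soon_threadsafe(loop.stop)
    thread.join(timeout=2)
    loop.close()
    return sends
```

With `cancel_on_timeout=False` the list ends up holding both the fallback send and the late live-adapter send; with `True` only the fallback send remains.
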
…ctly for timeout-cancel

The original tests replicated the try/except/cancel/raise pattern inline with
a mocked future, which tested Python's try/except semantics rather than the
scheduler's behavior. Rewrite them to invoke _deliver_result and
_send_media_via_adapter end-to-end with a real concurrent.futures.Future
whose .result() raises TimeoutError.

Mutation-verified: both tests fail when the try/except wrappers are removed
from cron/scheduler.py, pass with them in place.
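
The testing approach described above (a real `concurrent.futures.Future` whose `.result()` raises `TimeoutError`, with `cancel()` calls recorded) might look roughly like this. Everything here is illustrative: `deliver_with_fallback` is a hypothetical stand-in for the function under test, not the real `_deliver_result`, and the actual tests in tests/cron/test_scheduler.py may be structured differently.

```python
import concurrent.futures


def deliver_with_fallback(schedule_live, standalone_send, timeout=60):
    """Hypothetical stand-in for a delivery path; not the real _deliver_result."""
    try:
        future = schedule_live()
        try:
            future.result(timeout=timeout)
        except concurrent.futures.TimeoutError:
            future.cancel()  # the behaviour under test
            raise
        return "live"
    except Exception:
        standalone_send()
        return "standalone"


def make_timed_out_future(cancel_calls):
    """Real Future whose result() always times out and whose cancel() is recorded."""
    future = concurrent.futures.Future()
    real_cancel = future.cancel

    def result(timeout=None):
        raise concurrent.futures.TimeoutError()

    def cancel():
        cancel_calls.append(True)
        return real_cancel()

    future.result = result
    future.cancel = cancel
    return future


def test_timeout_cancels_future_and_falls_back():
    cancel_calls, fallback_calls = [], []
    future = make_timed_out_future(cancel_calls)

    outcome = deliver_with_fallback(
        schedule_live=lambda: future,
        standalone_send=lambda: fallback_calls.append("sent"),
    )

    assert cancel_calls == [True]
    assert future.cancelled()
    assert fallback_calls == ["sent"]
    assert outcome == "standalone"
```

Because the test drives the function under test rather than an inline copy of the pattern, deleting the `future.cancel()` line leaves `cancel_calls` empty and the first assertion fails, which is the kind of mutation check the commit message reports.
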
@teknium1 teknium1 merged commit 267b2fa into main Apr 21, 2026
11 of 12 checks passed
@teknium1 teknium1 deleted the hermes/hermes-7d482fb6 branch April 21, 2026 12:52
@alt-glitch alt-glitch added the type/bug (Something isn't working) and comp/cron (Cron scheduler and job management) labels Apr 22, 2026
Labels

comp/cron Cron scheduler and job management type/bug Something isn't working

3 participants