fix(gateway): prevent duplicate final send when only cosmetic edit failed #13542

Open
VTRiot wants to merge 1 commit into NousResearch:main from VTRiot:fix/gateway-prevent-duplicate-final-send

@VTRiot VTRiot commented Apr 21, 2026

fix(gateway): prevent duplicate final send when only cosmetic edit failed

What does this PR do?

In the Telegram delivery path, GatewayStreamConsumer.got_done can successfully deliver the final response content via _send_or_edit and then fail on the subsequent cosmetic edit (clearing the typing cursor / streaming marker). In that case final_response_sent remains False even though the user has already received the final answer in their chat.

The gateway's _run_agent then takes the fallback path and re-delivers the same content as a fresh message. The user sees the response twice.

This PR introduces a new flag _final_content_delivered on GatewayStreamConsumer, set by got_done as soon as the final content has reached the user (before the cosmetic edit is attempted). The suppression logic in _run_agent now treats this flag as an additional signal — alongside final_response_sent and response_previewed — that final delivery is already complete.

Why not just use already_sent?

already_sent is set by every successful chunk send, including intermediate text such as "Let me search for that..." emitted alongside a tool call. Using it for suppression would re-introduce the bug that tests/gateway/test_duplicate_reply_suppression.py::test_partial_stream_output_does_not_set_already_sent explicitly guards against:

previously this promoted any partial send (already_sent=True) to final_response_sent — which suppressed the gateway's fallback send even when only intermediate text (e.g. 'Let me search...') had been delivered, not the real answer.

The new _final_content_delivered flag is set only in got_done, so it fires only when the final response content reached the user — never when only intermediate text was delivered. This preserves the upstream's intentional design while closing the duplicate-send gap for the specific case where the cosmetic final edit fails after the content has already been displayed.

Changes Made

gateway/stream_consumer.py

  • New instance variable _final_content_delivered (initialized False alongside _final_response_sent).
  • Public @property final_content_delivered for read access.
  • Set to True in two got_done paths:
    • Chunked message path: after a split-and-send completes and _already_sent is true.
    • Main path: when current_update_visible is true and there is accumulated text, recorded before the cosmetic final edit is attempted — so a failure of that edit does not undo the fact that the content reached the user.
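The ordering in the main path can be sketched roughly as follows. This is a minimal, synchronous illustration and not the actual GatewayStreamConsumer code: the class name SketchConsumer, the two callables passed to the constructor, and the exception handling are assumptions made for the example. Only the flag names come from this PR.

```python
class SketchConsumer:
    """Hypothetical stand-in for the stream consumer (illustration only)."""

    def __init__(self, send_or_edit, cosmetic_edit):
        self._send_or_edit = send_or_edit    # assumed: returns True on success
        self._cosmetic_edit = cosmetic_edit  # assumed: raises on failure
        self._final_response_sent = False
        self._final_content_delivered = False

    @property
    def final_response_sent(self):
        return self._final_response_sent

    @property
    def final_content_delivered(self):
        return self._final_content_delivered

    def got_done(self, text):
        # Nothing to deliver, or delivery itself failed: leave both flags False.
        if not text or not self._send_or_edit(text):
            return
        # The content is now visible to the user. Record that fact BEFORE the
        # cosmetic edit, so a failure below cannot undo it.
        self._final_content_delivered = True
        try:
            self._cosmetic_edit()
        except Exception:
            return  # final_response_sent stays False
        self._final_response_sent = True


def failing_edit():
    # Simulates a rate limit or stale message id on the cursor-removal edit.
    raise RuntimeError("simulated cosmetic edit failure")


sent = []
consumer = SketchConsumer(lambda text: sent.append(text) or True, failing_edit)
consumer.got_done("final answer")
print(consumer.final_content_delivered, consumer.final_response_sent)
```

In this sketch the content is sent exactly once, final_content_delivered ends up True, and final_response_sent stays False, which is the state the suppression logic in run.py needs to recognise.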

gateway/run.py

  • _already_streamed check (post-execution): now also OR-ed with final_content_delivered.
  • _run_agent final suppression: new _content_delivered local, combined into the existing not empty_sentinel and (streamed or previewed or content_delivered) predicate. Log line extended to include the new signal for diagnosability.
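The combined predicate described above can be written out as a small pure function. The function name and parameter names here are hypothetical and chosen for the example; the boolean shape is taken from the description in this PR, and the real code in gateway/run.py may structure it differently.

```python
def should_suppress_fallback(empty_sentinel, streamed, previewed, content_delivered):
    """Return True when the gateway's fallback send should be skipped.

    Hypothetical helper mirroring the predicate
    `not empty_sentinel and (streamed or previewed or content_delivered)`.
    """
    return (not empty_sentinel) and (streamed or previewed or content_delivered)


# Scenario labels are illustrative, not taken from the code base.
scenarios = {
    "full done-sequence succeeded": (False, True, False, False),
    "content delivered, cosmetic edit failed": (False, False, False, True),
    "intermediate text only": (False, False, False, False),
    "empty sentinel": (True, True, True, True),
}

for label, args in scenarios.items():
    print(f"{label}: suppress={should_suppress_fallback(*args)}")
```

Note that already_sent is deliberately not an input: in the "intermediate text only" scenario it would be True, yet the fallback send must still happen.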

tests/gateway/test_duplicate_reply_suppression.py

  • New TestFinalContentDeliveredSuppression class with two cases:
    • test_content_delivered_but_final_edit_failed_suppresses: the new branch — final_content_delivered=True with final_response_sent=False must suppress.
    • test_intermediate_text_only_does_not_suppress: the existing contract — already_sent=True with final_content_delivered=False must not suppress.

How to Test

Reproduction scenario (conceptual)

  1. Start a Telegram conversation that triggers a streaming response.
  2. Let the stream reach got_done and successfully send the final content via _send_or_edit.
  3. Induce a failure on the subsequent cosmetic edit (e.g. the "remove typing cursor" edit) — transient network failure, Telegram rate-limit, stale message id, etc.
  4. Before this PR: final_response_sent stays False and response["already_sent"] is unset, so _run_agent takes the fallback path and re-sends the full response. The user sees the answer twice.
  5. After this PR: _final_content_delivered=True was set before the failed edit; _run_agent suppresses the fallback send. The user sees the answer once.

Automated tests

pytest tests/gateway/test_duplicate_reply_suppression.py::TestFinalContentDeliveredSuppression -v
pytest tests/gateway/test_duplicate_reply_suppression.py -v

Both new tests pass locally. Full test_duplicate_reply_suppression.py suite: 21 passed / 3 pre-existing unrelated failures (same set as before this change; environment-dependent TestBaseInterruptSuppression).

Notably, test_partial_stream_output_does_not_set_already_sent continues to pass — the upstream-intended behavior for intermediate-text-only sends is preserved.

Implementation Notes

  • _final_content_delivered is not reset by _reset_segment_state(), mirroring the existing behavior of _already_sent. Once final content has reached the user in a given agent turn, that fact should not be unset by subsequent segment boundaries.
  • getattr(_sc, "final_content_delivered", False) is used at both run.py call sites so that older GatewayStreamConsumer instances (e.g. in tests that build a SimpleNamespace stand-in) remain compatible.
  • The new flag is deliberately orthogonal to final_response_sent. final_response_sent means "the full done-sequence succeeded, including the cosmetic edit", while _final_content_delivered means "the content reached the user, regardless of whether the cosmetic edit succeeded". Treating them as independent signals is what lets the suppression logic cover the failed-edit case without regressing the partial-stream case.
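The getattr-based compatibility read mentioned in the notes above behaves as in the following sketch. The helper name read_content_delivered and the two stand-in objects are assumptions for illustration; only the getattr(..., "final_content_delivered", False) pattern is taken from this PR.

```python
from types import SimpleNamespace


def read_content_delivered(stream_consumer):
    """Hypothetical helper: read the new flag, defaulting to False if absent."""
    return bool(getattr(stream_consumer, "final_content_delivered", False))


# A stand-in built before the flag existed (e.g. in an older test fixture).
legacy_consumer = SimpleNamespace(final_response_sent=False)

# A stand-in that exposes the new flag.
current_consumer = SimpleNamespace(
    final_response_sent=False,
    final_content_delivered=True,
)

print(read_content_delivered(legacy_consumer))
print(read_content_delivered(current_consumer))
```

Defaulting to False means an object without the attribute simply keeps the pre-PR behaviour: the new signal never fires, and the fallback send is decided by the existing flags alone.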

Checklist

Code

  • I've read the Contributing Guide
  • My commit message follows Conventional Commits
  • I searched for existing PRs to make sure this isn't a duplicate
  • My PR contains only changes related to this fix
  • I've run pytest tests/gateway/test_duplicate_reply_suppression.py -v with no regression vs. baseline
  • I've added tests for my changes
  • I've tested on my platform: macOS (Apple Silicon), Python 3.14

Documentation & Housekeeping

  • N/A for all documentation items

Related

…iled

When the stream consumer's got_done handler successfully delivers the
final response content via _send_or_edit but the subsequent edit
(e.g. cursor removal) fails, final_response_sent remains False even
though the user has already received the final answer. The gateway's
fallback send path then re-delivers the same content, causing the
user to see the response twice on Telegram.

Introduce a new _final_content_delivered flag on the stream consumer,
set by the got_done handler when the final content has reached the
user. The _run_agent suppression logic now treats this flag as an
additional signal (alongside final_response_sent and
response_previewed) that final delivery is already complete.

This preserves the existing behavior for intermediate-text-only
streams (where already_sent=True but no final content has been
delivered) — those still receive the gateway's fallback send, matching
the test expectation in test_partial_stream_output_does_not_set_already_sent.

Adds TestFinalContentDeliveredSuppression with two cases covering
both the suppression (content delivered + edit failed) and the
non-suppression (intermediate text only) branches.

VTRiot commented Apr 21, 2026

Quick note on the failing Tests / test (pull_request) check: the failures appear to be pre-existing on upstream main, not introduced by this PR.

  • 8 test failures observed in the CI run; 7 of them fail identically on upstream/main at the base commit.
  • The remaining 1 (test_accretion_caps) is under tests/tools/ and is unrelated to the files this PR changes (gateway/stream_consumer.py, gateway/run.py, tests/gateway/test_duplicate_reply_suppression.py).
  • The 2 new tests added in this PR (TestFinalContentDeliveredSuppression) and the full tests/gateway/test_duplicate_reply_suppression.py suite pass locally (21 passed / 3 pre-existing env-dependent failures, same as baseline before this change).

Happy to share the local reproduction log if that helps. This may warrant a separate issue to track the upstream failures, but I didn't want to widen the scope of this PR — let me know if you'd like me to open one.

@alt-glitch alt-glitch added type/bug Something isn't working comp/gateway Gateway runner, session dispatch, delivery platform/telegram Telegram bot adapter labels Apr 21, 2026
