fix(gateway): prevent duplicate final send when only cosmetic edit failed#13542
Open
VTRiot wants to merge 1 commit intoNousResearch:mainfrom
Open
fix(gateway): prevent duplicate final send when only cosmetic edit failed#13542VTRiot wants to merge 1 commit intoNousResearch:mainfrom
VTRiot wants to merge 1 commit intoNousResearch:mainfrom
Conversation
…iled When the stream consumer's got_done handler successfully delivers the final response content via _send_or_edit but the subsequent edit (e.g. cursor removal) fails, final_response_sent remains False even though the user has already received the final answer. The gateway's fallback send path then re-delivers the same content, causing the user to see the response twice on Telegram. Introduce a new _final_content_delivered flag on the stream consumer, set by the got_done handler when the final content has reached the user. The _run_agent suppression logic now treats this flag as an additional signal (alongside final_response_sent and response_previewed) that final delivery is already complete. This preserves the existing behavior for intermediate-text-only streams (where already_sent=True but no final content has been delivered) — those still receive the gateway's fallback send, matching the test expectation in test_partial_stream_output_does_not_set_already_sent. Adds TestFinalContentDeliveredSuppression with two cases covering both the suppression (content delivered + edit failed) and the non-suppression (intermediate text only) branches.
Contributor
Author
|
Quick note on the failing
Happy to share the local reproduction log if that helps. This may warrant a separate issue to track the upstream failures, but I didn't want to widen the scope of this PR — let me know if you'd like me to open one. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
fix(gateway): prevent duplicate final send when only cosmetic edit failed
What does this PR do?
In the Telegram delivery path, when
GatewayStreamConsumer.got_donesuccessfully delivers the final response content via_send_or_edit, but the subsequent cosmetic edit (clearing the typing cursor / streaming marker) fails,final_response_sentremainsFalseeven though the user has already received the final answer on their chat.The gateway's
_run_agentthen takes the fallback path and re-delivers the same content as a fresh message. The user sees the response twice.This PR introduces a new flag
_final_content_deliveredonGatewayStreamConsumer, set bygot_doneas soon as the final content has reached the user (before the cosmetic edit is attempted). The suppression logic in_run_agentnow treats this flag as an additional signal — alongsidefinal_response_sentandresponse_previewed— that final delivery is already complete.Why not just use
already_sent?already_sentis set by every successful chunk send, including intermediate text such as"Let me search for that..."emitted alongside a tool call. Using it for suppression would re-introduce the bug thattests/gateway/test_duplicate_reply_suppression.py::test_partial_stream_output_does_not_set_already_sentexplicitly guards against:The new
_final_content_deliveredflag is set only ingot_done, so it fires only when the final response content reached the user — never when only intermediate text was delivered. This preserves the upstream's intentional design while closing the duplicate-send gap for the specific case where the cosmetic final edit fails after the content has already been displayed.Changes Made
gateway/stream_consumer.py_final_content_delivered(initializedFalsealongside_final_response_sent).@property final_content_deliveredfor read access.Truein twogot_donepaths:_already_sentis true.current_update_visibleis true and there is accumulated text, recorded before the cosmetic final edit is attempted — so a failure of that edit does not undo the fact that the content reached the user.gateway/run.py_already_streamedcheck (post-execution): now also OR-ed withfinal_content_delivered._run_agentfinal suppression: new_content_deliveredlocal, combined into the existingnot empty_sentinel and (streamed or previewed or content_delivered)predicate. Log line extended to include the new signal for diagnosability.tests/gateway/test_duplicate_reply_suppression.pyTestFinalContentDeliveredSuppressionclass with two cases:test_content_delivered_but_final_edit_failed_suppresses: the new branch —final_content_delivered=Truewithfinal_response_sent=Falsemust suppress.test_intermediate_text_only_does_not_suppress: the existing contract —already_sent=Truewithfinal_content_delivered=Falsemust not suppress.How to Test
Reproduction scenario (conceptual)
got_doneand successfully send the final content via_send_or_edit.final_response_sentstaysFalse;_run_agentfalls back toresponse["already_sent"]being unset and re-sends the full response. The user sees the answer twice._final_content_delivered=Truewas set before the failed edit;_run_agentsuppresses the fallback send. The user sees the answer once.Automated tests
Both new tests pass locally. Full
test_duplicate_reply_suppression.pysuite: 21 passed / 3 pre-existing unrelated failures (same set as before this change; environment-dependentTestBaseInterruptSuppression).Notably,
test_partial_stream_output_does_not_set_already_sentcontinues to pass — the upstream-intended behavior for intermediate-text-only sends is preserved.Implementation Notes
_final_content_deliveredis not reset by_reset_segment_state(), mirroring the existing behavior of_already_sent. Once final content has reached the user in a given agent turn, that fact should not be unset by subsequent segment boundaries.getattr(_sc, "final_content_delivered", False)is used at bothrun.pycall sites so that olderGatewayStreamConsumerinstances (e.g. in tests that build aSimpleNamespacestand-in) remain compatible.final_response_sent: the latter implies "the full done-sequence succeeded including the cosmetic edit", the former implies "the content reached the user, regardless of whether the cosmetic edit succeeded". Treating them as independent signals is what lets the suppression logic cover the failed-edit case without regressing the partial-stream case.Checklist
Code
pytest tests/gateway/test_duplicate_reply_suppression.py -vwith no regression vs. baselineDocumentation & Housekeeping
Related
fix(cron): cancel orphan coroutine on delivery timeout before standalone fallback) addresses the same class of user-observed bug — duplicate delivery — on the cron path. This PR addresses the gateway/Telegram path. They are independent in scope, root cause, and modified files.