feat(worker): cooperative finisher merging from Redis processed set#730
Open
thomasrockhu-codecov wants to merge 2 commits intotom/fix-finisher-blocking-timeoutfrom
Open
feat(worker): cooperative finisher merging from Redis processed set#730thomasrockhu-codecov wants to merge 2 commits intotom/fix-finisher-blocking-timeoutfrom
thomasrockhu-codecov wants to merge 2 commits intotom/fix-finisher-blocking-timeoutfrom
Conversation
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Instead of each finisher task only merging its own chord's 1-2 uploads, the lock holder now queries the full Redis "processed" set and merges ALL pending uploads in one lock acquisition. This means one finisher does the work of many, and redundant finishers exit immediately. Changes: - Add get_all_processed_uploads() to ProcessingState (unbounded smembers) - _process_reports_with_lock: after acquiring lock, reconstruct processing_results from Redis instead of using chord arguments - Early exit in run_impl when Redis processed set is empty (another finisher already merged our uploads) - Reduce base_retry_countdown to 30s (from 200s default) so retried finishers discover "nothing to merge" sooner and exit faster Made-with: Cursor
- Skip the Redis-based early exit when processing_results were reconstructed from DB (Redis TTL may have expired while uploads still need merging). - Add _setup_mock_redis_for_processing helper so tests exercising the full merge flow correctly bypass the cooperative early exit. - Update _reconstruct_processing_results callers to use get_all_processed_uploads instead of get_uploads_for_merging. Made-with: Cursor
716ae09 to
3e575fe
Compare
67bc08f to
977b1fa
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Stacked on #729.
base_retry_countdownfrom 200s to 30s so retried finishers check back sooner, discover there's nothing to merge, and exit quickly.Before vs After
For a commit with 100+ parallel CI uploads (e.g., stacks-network with 167 finisher tasks):
Key changes
ProcessingState.get_all_processed_uploads(): new method usingsmembers(noMERGE_BATCH_SIZEcap)_process_reports_with_lock: inside the lock, calls_reconstruct_processing_resultsto get ALL pending uploads from Redis instead of using the chord'sprocessing_resultsrun_impl: early exit whenProcessingState.get_upload_numbers()shows nothing pendingLockManagerinstances usebase_retry_countdown=30for faster retry cyclingTest plan
test_get_all_processed_uploads_returns_full_setfor the new ProcessingState methodtest_early_exit_when_redis_processed_set_emptyverifying early exit pathtest_cooperative_merge_uses_all_redis_uploadsverifying lock holder merges all uploadstest_cooperative_merge_exits_when_lock_holder_already_mergedverifying graceful exit on raceMade with Cursor
Note
Medium Risk
Changes core upload-finishing/merge concurrency behavior (lock usage + Redis-driven selection), which could affect report completeness or notification timing if Redis state is inconsistent or races occur. Added coverage reduces risk but this is still a production-critical workflow change.
Overview
Cooperative finisher merging: the lock-holder finisher now reconstructs merge inputs from the full Redis
processedset (via newProcessingState.get_all_processed_uploads()usingsmembers) and merges/cleans up all pending uploads at once, instead of only its chord’s subset.Less redundant work:
run_impladds an early-exit path when Redis shows bothprocessingandprocessedare empty (another finisher already merged), and both lock acquisitions now use a shorterbase_retry_countdown=30. Tests were updated/added to cover the full-set fetch, early-exit, and cooperative-merge race scenarios.Written by Cursor Bugbot for commit 977b1fa. This will update automatically on new commits. Configure here.