
fix: read exactly M values from channel #5447

Open
sbackend123 wants to merge 1 commit into master from fix/testGetterRACE

Conversation

@sbackend123
Contributor

Checklist

  • I have read the coding guide.
  • My change requires a documentation update, and I have done it.
  • I have added tests to cover my changes.
  • I have filled out the description and linked the related issues.

Description

Read exactly M values from the channel instead of reading "forever". The unbounded read can deadlock in rare cases, which occur mostly in CI.

Open API Spec Version Changes (if applicable)

Motivation and Context (Optional)

runStrategy assumes that while consuming results from c, one of the two exit conditions will always become true before reads are exhausted. Because of that assumption, the previous loop used for range c (which only terminates when c is closed).
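The difference between the two loop shapes can be sketched as follows. This is a minimal illustration, not the code from this PR: `consumeBounded`, `m`, and `done` are hypothetical names, and it assumes that exactly `m` results are ever written to `c` and that `c` is never closed.

```go
package main

import "fmt"

// consumeBounded reads at most m results from c and stops early once done()
// reports that an exit condition holds. All names here are hypothetical.
// It returns the number of values read and whether an exit condition was met.
func consumeBounded(c <-chan error, m int, done func() bool) (int, bool) {
	for i := 0; i < m; i++ {
		<-c // assumption: one result is written per started fetch
		if done() {
			return i + 1, true
		}
	}
	// All m results were consumed without an exit condition being observed.
	// A `for range c` loop would block on the next read here, because c is
	// never closed and no writers remain.
	return m, false
}

func main() {
	const m = 3
	c := make(chan error, m)
	for i := 0; i < m; i++ {
		c <- nil // simulate m finished fetches; c is deliberately left open
	}
	never := func() bool { return false } // models a stale counter: no exit fires
	reads, ok := consumeBounded(c, m, never)
	fmt.Println(reads, ok) // prints "3 false"
}
```

With a bounded loop, the caller regains control after the last expected value and can re-check its counters, rather than depending on the channel being closed.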

Why this can deadlock
In the DATA -> RACE fallback path, some DATA-owned shard fetches may still be finishing while RACE starts.
RACE can include data shard indices whose waits[i] is not closed yet; for those indices, fetch takes the fly=false path and waits on waits[i].
For a successful DATA fetch, the operation order is:

close(g.waits[i])       // 1) unblocks waiting fly=false RACE goroutine
g.fetchedCnt.Add(1)     // 2) increments success counter

There is a small window between (1) and (2):

  • the waiting RACE goroutine can unblock and push its result into c,
  • runStrategy can consume that last value from c,
  • but fetchedCnt may still be stale for that check.

If this happens on the last available message in c, neither exit condition may trigger in that iteration, and the next read blocks forever (no more writers, channel not closed).
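The window between (1) and (2) can be made deterministic in a small sketch. The names below are stand-ins, not the getter's real fields: `wait` plays the role of `waits[i]`, `cnt` the role of `fetchedCnt`, and the `seen` channel exists only to force the interleaving. It assumes Go 1.19 or later for `atomic.Int32`.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// observeWindow reproduces the close-then-increment ordering with
// hypothetical stand-ins and reports what a waiter sees inside the window.
// The seen channel is an illustration aid and not part of the real code.
func observeWindow() (during, after int32) {
	var cnt atomic.Int32
	wait := make(chan struct{})
	seen := make(chan int32)

	// Waiting goroutine: unblocks on close(wait) and reports the counter
	// value it observes at that moment.
	go func() {
		<-wait
		seen <- cnt.Load()
	}()

	close(wait)     // 1) unblocks the waiter
	during = <-seen // the waiter runs before the counter is updated
	cnt.Add(1)      // 2) the success counter is incremented only now
	return during, cnt.Load()
}

func main() {
	during, after := observeWindow()
	fmt.Println(during, after) // prints "0 1"
}
```

In the real code nothing forces this interleaving, which matches the description: the stale read is possible but rare.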

Related Issue (Optional)

Screenshots (if appropriate):

AI Disclosure

  • This PR contains code that has been generated by an LLM.
  • I have reviewed the AI generated code thoroughly.
  • I possess the technical expertise to responsibly review the code generated in this PR.

@sbackend123 sbackend123 marked this pull request as ready for review April 26, 2026 18:47
Comment thread pkg/file/redundancy/getter/getter.go


5 participants