Skip to content

surgeon: detect cycles in XRay.traverse#1193

Merged
ahrtr merged 2 commits intoetcd-io:mainfrom
ivangsm:surgeon/xray-detect-cycles
Apr 22, 2026
Merged

surgeon: detect cycles in XRay.traverse#1193
ahrtr merged 2 commits intoetcd-io:mainfrom
ivangsm:surgeon/xray-detect-cycles

Conversation

@ivangsm
Copy link
Copy Markdown
Contributor

@ivangsm ivangsm commented Apr 18, 2026

What does this PR do?

XRay.traverse (in internal/surgeon/xray.go) recursively follows branch and bucket-root pgids read straight from the file being inspected. A corrupted db whose pages form a cycle causes the recursion to run until the goroutine stack is exhausted.

This PR tracks visited pgids and returns an explicit cycle detected error when a pgid reappears, so the CLI exits cleanly with a diagnostic instead of crashing on exactly the files these tools are meant to diagnose.

Motivation

This was explicitly called out as a follow-up by @tjungblu on #701:

BTW, the forEachPageInternal itself runs out of stack space at some point, so we might also add some cycle detection as a follow-up.

XRay.traverse has the same structural issue as forEachPageInternal and is reachable from the bbolt surgery / inspection tooling, which is specifically meant to operate on potentially corrupted db files (see #701 for the power-off corruption pattern that produced a real page cycle).

A separate follow-up will address forEachPageInternal itself.

Testing

Added TestFindPathsToKey_CycleDetected: fills a db, uses surgeon.CopyPage to overwrite a leaf with its branch ancestor (so the rewritten page references its own pgid through the ancestor's element list), and asserts that FindPathsToKey returns an error containing cycle detected instead of blowing the stack.

Without the fix the new test never returns — the traversal recursed ~955k times and grew the pgid stack slice to a similar size before Go's test timeout fired. With the fix it returns in ~30ms with a clear error.

Ran locally:

```
BBOLT_VERIFY=all TEST_FREELIST_TYPE=array go test ./internal/surgeon/... -count=1
BBOLT_VERIFY=all TEST_FREELIST_TYPE=hashmap go test ./internal/surgeon/... -count=1
BBOLT_VERIFY=all go test ./cmd/bbolt/... -count=1
```

All pass.

XRay.traverse recursively follows branch and bucket-root pgids read
straight from the file being inspected. A corrupted db whose pages form
a cycle (as reported in etcd-io#701, caused by a forced power off on Hyper-V /
VMware) causes the recursion to run until the process exhausts its
goroutine stack.

Track visited pgids and return an explicit error when a cycle is hit so
the CLI (bbolt surgery / inspect) exits cleanly rather than crashing on
exactly the files these tools are meant to diagnose.

Signed-off-by: Iván Salazar <ivangio.salazar@gmail.com>
@ivangsm ivangsm force-pushed the surgeon/xray-detect-cycles branch from f6e679f to b06ba6a Compare April 18, 2026 07:23
@tjungblu
Copy link
Copy Markdown
Contributor

/ok-to-test

looks good to me, thanks for adding this!

@tjungblu
Copy link
Copy Markdown
Contributor

/lgtm

Comment thread internal/surgeon/xray.go

func (n XRay) traverse(stack []common.Pgid, callback func(page *common.Page, stack []common.Pgid) error) error {
p, data, err := guts_cli.ReadPage(n.path, uint64(stack[len(stack)-1]))
func (n XRay) traverse(stack []common.Pgid, visited map[common.Pgid]struct{}, callback func(page *common.Page, stack []common.Pgid) error) error {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The visited map is shared across all paths. It may result in false alarm. Multiple keys with the same value might be included in multiple buckets/pages.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how can the same page be reused in different buckets?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The visited caches all pages it ever visited, including the root page. Note the root page is the starting page to access any keys. Some branch pages might be also the shared path prefix for some keys.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-read the traverse, it just traverses the whole B+tree top down, so normally it will never revisit page, no matter how many keys match. The only possible reason is the db is corrupted.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a test case something like below?

  • create two buckets, and add the same key into both buckets. call FindPathsToKey with the key, it should return two stacks.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for re-reading it — agreed that traverse is top-down, so in a well-formed tree no pgid is revisited and a revisit only happens under real corruption.

Added TestFindPathsToKey_MultipleBuckets in internal/surgeon/xray_test.go covering the case you described: two sibling top-level buckets, each filled with 500 entries plus the same "shared" key (the fill ensures neither bucket is inlined, so their own pgids actually pass through the visited map). The test asserts that FindPathsToKey("shared") returns exactly 2 paths and that the two terminal leaf pgids are distinct — which directly exercises the concern that a shared visited could otherwise suppress the second hit.

Follow-up to the review on etcd-io#1193: cover the case where the same key
lives in more than one bucket. The shared visited map in traverse
could, in theory, suppress legitimate revisits; in practice each
top-level bucket owns a disjoint page sub-tree, so both occurrences
must be returned.

Each bucket is filled with 500 entries so it has its own (non-inline)
root and leaf pages, and the test asserts two paths are returned on
two distinct leaf pgids.

Signed-off-by: Iván Salazar <ivangio.salazar@gmail.com>
@k8s-ci-robot
Copy link
Copy Markdown

New changes are detected. LGTM label has been removed.

@ahrtr ahrtr added this to the v1.5.0 milestone Apr 22, 2026
@k8s-ci-robot
Copy link
Copy Markdown

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ahrtr, ivangsm

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ahrtr ahrtr merged commit cd4bafc into etcd-io:main Apr 22, 2026
48 of 49 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

4 participants