[Access & Execution] Add flag to disable bitswap bloom cache#8227

Open
zhangchiqing wants to merge 13 commits into master from leo/disable-bitswap-bloom-cache

@zhangchiqing
Member

@zhangchiqing zhangchiqing commented Dec 4, 2025

Add an experimental bitswap-bloom-cache-enabled flag (default: true) that controls whether the Bitswap bloom cache is used for Access/Execution/Observer nodes. When disabled, the blob service uses a new WithSkipBloomCache option to skip the cached blockstore and instead use a plain Pebble-backed blockstore, avoiding the CPU cost of building bloom filters at startup while still relying on Pebble’s SSTable bloom filters.
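
A minimal sketch of how such a toggle could select between the two blockstores. Only the option name `WithSkipBloomCache` and the "cache on by default" behaviour come from the description above; `Blockstore`, `plainStore`, `cachedStore`, `config`, and `NewStore` are hypothetical stand-ins, not the flow-go or Bitswap types.

```go
package main

import "fmt"

// Blockstore is a hypothetical stand-in for the real blockstore interface.
type Blockstore interface {
	Name() string
}

// plainStore stands in for the plain Pebble-backed blockstore.
type plainStore struct{}

func (plainStore) Name() string { return "plain" }

// cachedStore stands in for the cached (bloom cache) wrapper.
type cachedStore struct{ inner Blockstore }

func (c cachedStore) Name() string { return "cached(" + c.inner.Name() + ")" }

// config is a hypothetical options struct; its zero value keeps the cache on.
type config struct{ skipBloomCache bool }

// Option mutates the config before the store is built.
type Option func(*config)

// WithSkipBloomCache borrows its name from the PR description; the body is assumed.
func WithSkipBloomCache(skip bool) Option {
	return func(c *config) { c.skipBloomCache = skip }
}

// NewStore applies all options first, then wraps only when the cache is not skipped.
func NewStore(opts ...Option) Blockstore {
	cfg := config{}
	for _, opt := range opts {
		opt(&cfg)
	}
	var bs Blockstore = plainStore{}
	if !cfg.skipBloomCache {
		bs = cachedStore{inner: bs}
	}
	return bs
}

func main() {
	bloomCacheEnabled := false // e.g. the node was started with the flag set to false
	fmt.Println(NewStore().Name())
	fmt.Println(NewStore(WithSkipBloomCache(!bloomCacheEnabled)).Name())
}
```

Keeping the zero value of the config equal to the default behaviour means callers that pass no option are unaffected by the new flag.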

Summary by CodeRabbit

  • New Features

    • Added configurable Bloom-cache support for the blob service, controllable via a new CLI flag (enabled by default); disabling it avoids the CPU cost of building bloom filters at startup.
  • Chores

    • Enhanced debug/informational logging for execution data synchronization.

@github-actions
Contributor

github-actions Bot commented Dec 4, 2025

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Scanned Files

None

@codecov-commenter

codecov-commenter commented Dec 4, 2025

Codecov Report

❌ Patch coverage is 25.00000% with 21 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| network/p2p/blob/blob_service.go | 0.00% | 14 Missing ⚠️ |
| cmd/scaffold.go | 0.00% | 4 Missing ⚠️ |
| cmd/access/node_builder/access_node_builder.go | 0.00% | 1 Missing ⚠️ |
| cmd/execution_builder.go | 0.00% | 1 Missing ⚠️ |
| cmd/observer/node_builder/observer_builder.go | 0.00% | 1 Missing ⚠️ |

@zhangchiqing zhangchiqing marked this pull request as ready for review December 8, 2025 23:32
@zhangchiqing zhangchiqing requested a review from a team as a code owner December 8, 2025 23:32
Comment thread cmd/node_builder.go
ComplianceConfig: compliance.DefaultConfig(),
DhtSystemEnabled: true,
BitswapReprovideEnabled: true,
BitswapBloomCacheEnabled: true, // default: use cached blockstore TODO leo: change default to false
Member Author

The test cases also passed when I changed the default to false.

Contributor

@fxamacker fxamacker left a comment

The condition for creating cachedBlockStore appears to be inverted. Specifically, we should check !SkipBloomCache before creating cachedBlockStore.

Also, it might be easier to follow if we use a positive flag UseBloomCache instead of a negative flag SkipBloomCache. This would make it more consistent with the new node flag BitswapBloomCacheEnabled.
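
The inversion described above can be illustrated with a small sketch. The three helper functions are hypothetical and only model the wrap/no-wrap decision; they are not taken from `blob_service.go`.

```go
package main

import "fmt"

// shouldWrapInverted reproduces the suspected bug: it wraps when the cache should be skipped.
func shouldWrapInverted(skipBloomCache bool) bool { return skipBloomCache }

// shouldWrapNegative is the corrected check on the negative flag.
func shouldWrapNegative(skipBloomCache bool) bool { return !skipBloomCache }

// shouldWrapPositive makes the same decision with a positive flag; no negation is needed.
func shouldWrapPositive(useBloomCache bool) bool { return useBloomCache }

func main() {
	for _, enabled := range []bool{true, false} {
		skip := !enabled // negative option derived from the positive node flag
		fmt.Printf("enabled=%t inverted=%t negative=%t positive=%t\n",
			enabled, shouldWrapInverted(skip), shouldWrapNegative(skip), shouldWrapPositive(enabled))
	}
}
```

With a positive flag the node setting can be passed through unchanged, which removes the two places (caller and callee) where a negation could be dropped.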

Comment thread network/p2p/blob/blob_service.go Outdated
Comment thread cmd/execution_builder.go Outdated
Comment thread cmd/access/node_builder/access_node_builder.go Outdated
Comment thread network/p2p/blob/blob_service.go Outdated
zhangchiqing and others added 4 commits December 12, 2025 14:43
Co-authored-by: Faye Amacker <33205765+fxamacker@users.noreply.github.com>
Co-authored-by: Faye Amacker <33205765+fxamacker@users.noreply.github.com>
@coderabbitai
Contributor

coderabbitai Bot commented Jan 2, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5d47b715-804c-4c1a-9404-d31ccd853f26

📥 Commits

Reviewing files that changed from the base of the PR and between 4d370bf and 1edcaf2.

📒 Files selected for processing (4)
  • cmd/access/node_builder/access_node_builder.go
  • cmd/execution_builder.go
  • cmd/node_builder.go
  • cmd/observer/node_builder/observer_builder.go
🚧 Files skipped from review as they are similar to previous changes (2)
  • cmd/node_builder.go
  • cmd/execution_builder.go

Walkthrough

Adds a Bitswap bloom-cache toggle: new BitswapBloomCacheEnabled config/CLI flag and propagation of blob.WithUseBloomCache(...) into blob service initialization across node builders; also introduces UseBloomCache option in the blob service implementation and small test logging additions.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Configuration & CLI: cmd/node_builder.go, cmd/scaffold.go | Added BitswapBloomCacheEnabled bool to BaseConfig (default true) and CLI flag --bitswap-bloom-cache-enabled bound to it. |
| Blob Service Implementation: network/p2p/blob/blob_service.go | Added UseBloomCache to BlobServiceConfig, new option WithUseBloomCache(use bool), and altered NewBlobService to conditionally wrap the blockstore with a cached blockstore when enabled. |
| Node Builder Integrations: cmd/access/node_builder/access_node_builder.go, cmd/execution_builder.go, cmd/observer/node_builder/observer_builder.go | Appended blob.WithUseBloomCache(builder.BitswapBloomCacheEnabled) to blob service option lists used when registering/exposing execution-data blob services. |
| Tests / Logging: integration/tests/access/cohort3/execution_state_sync_test.go | Added debug logging in executionDataForHeight for fetch failures and successful conversions (height, block_id, chunk count). |
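
As a rough illustration of the flag binding summarized in the table, the sketch below binds a boolean flag with a default of true to a config field. It uses the standard library `flag` package as a stand-in, which is an assumption; the actual binding in `cmd/scaffold.go` may use a different flag library, and `parseFlags` plus the help text are hypothetical.

```go
package main

import (
	"flag"
	"fmt"
)

// BaseConfig is a reduced, hypothetical version holding only the new field.
type BaseConfig struct {
	BitswapBloomCacheEnabled bool
}

// DefaultBaseConfig returns the defaults; the bloom cache is enabled by default.
func DefaultBaseConfig() BaseConfig {
	return BaseConfig{BitswapBloomCacheEnabled: true}
}

// parseFlags binds the flag to the config field, using the config default as the flag default.
func parseFlags(args []string) (BaseConfig, error) {
	cfg := DefaultBaseConfig()
	fs := flag.NewFlagSet("node", flag.ContinueOnError)
	fs.BoolVar(&cfg.BitswapBloomCacheEnabled, "bitswap-bloom-cache-enabled",
		cfg.BitswapBloomCacheEnabled, "[experimental] use the bitswap bloom cache (illustrative help text)")
	if err := fs.Parse(args); err != nil {
		return BaseConfig{}, err
	}
	return cfg, nil
}

func main() {
	cfg, err := parseFlags([]string{"--bitswap-bloom-cache-enabled=false"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("bloom cache enabled:", cfg.BitswapBloomCacheEnabled)
}
```

The parsed value would then be handed to the blob service through an option such as `blob.WithUseBloomCache(...)`, as the node builder rows in the table describe.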

Sequence Diagram(s)

(omitted)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested reviewers

  • tim-barry
  • m-Peter

Poem

🐇 I nibble code in moonlit stacks,
A bloom-cache sown along the tracks,
Bitswap hums with fewer calls,
Pebble petals line the walls,
Hop, hop—performance, here it packs! 🌿

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 75.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The pull request title clearly and specifically describes the main change: adding a flag to disable the bitswap bloom cache for Access and Execution nodes. |

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
cmd/node_builder.go (1)

189-195: BitswapBloomCacheEnabled field and default look consistent (minor doc nit)

The new BitswapBloomCacheEnabled flag and its default of true line up with the blob service’s UseBloomCache default and how builders consume it. One small suggestion: observers also read this flag via their builders, so consider updating the comment (“only meaningful to Access and Execution nodes”) to include Observer nodes to avoid future confusion. Optional only.

Also applies to: 307-314

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e0a94e9 and 4d370bf.

📒 Files selected for processing (7)
  • cmd/access/node_builder/access_node_builder.go
  • cmd/execution_builder.go
  • cmd/node_builder.go
  • cmd/observer/node_builder/observer_builder.go
  • cmd/scaffold.go
  • integration/tests/access/cohort3/execution_state_sync_test.go
  • network/p2p/blob/blob_service.go
🧰 Additional context used
📓 Path-based instructions (5)
**/*.go

📄 CodeRabbit inference engine (.cursor/rules/coding_conventions.mdc)

Follow Go coding conventions as documented in @docs/agents/CodingConventions.md

Follow Go coding standards and conventions as documented in @docs/agents/GoDocs.md

**/*.go: Follow the existing module structure in /module/, /engine/, /model/ and use dependency injection patterns for component composition
Implement proper interfaces before concrete types
Follow Go naming conventions and the project's coding style defined in /docs/CodingConventions.md
Use mock generators: run make generate-mocks after interface changes
All inputs must be considered potentially byzantine; error classification is context-dependent and no code path is safe unless explicitly proven and documented
Use comprehensive error wrapping for debugging; avoid fmt.Errorf, use irrecoverable package for exceptions
NEVER log and continue on best effort basis; ALWAYS explicitly handle errors
Uses golangci-lint with custom configurations (.golangci.yml) and custom linters for Flow-specific conventions (struct write checking)

Files:

  • network/p2p/blob/blob_service.go
  • cmd/access/node_builder/access_node_builder.go
  • cmd/execution_builder.go
  • integration/tests/access/cohort3/execution_state_sync_test.go
  • cmd/node_builder.go
  • cmd/scaffold.go
  • cmd/observer/node_builder/observer_builder.go
{network,engine,consensus}/**/*.go

📄 CodeRabbit inference engine (AGENTS.md)

Network messages must be authenticated and validated

Files:

  • network/p2p/blob/blob_service.go
{module,engine,cmd}/**/*.go

📄 CodeRabbit inference engine (AGENTS.md)

All major processing components must implement the Component interface from /module/component/component.go to ensure consistent lifecycle management and graceful shutdown patterns

Files:

  • cmd/access/node_builder/access_node_builder.go
  • cmd/execution_builder.go
  • cmd/node_builder.go
  • cmd/scaffold.go
  • cmd/observer/node_builder/observer_builder.go
**/*_test.go

📄 CodeRabbit inference engine (AGENTS.md)

**/*_test.go: Unit tests should be co-located with the code they test
Follow the existing pattern of *_test.go files for test naming
Use fixtures for realistic test data as defined in /utils/unittest/

Files:

  • integration/tests/access/cohort3/execution_state_sync_test.go
integration/tests/**/*.go

📄 CodeRabbit inference engine (AGENTS.md)

Integration tests should go in /integration/tests/

Files:

  • integration/tests/access/cohort3/execution_state_sync_test.go
🧠 Learnings (1)
📚 Learning: 2025-12-23T00:28:41.005Z
Learnt from: CR
Repo: onflow/flow-go PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-23T00:28:41.005Z
Learning: Applies to {storage,ledger,execution,fvm}/**/*.go : State consistency is paramount; use proper synchronization primitives

Applied to files:

  • integration/tests/access/cohort3/execution_state_sync_test.go
🧬 Code graph analysis (4)
cmd/access/node_builder/access_node_builder.go (1)
network/p2p/blob/blob_service.go (1)
  • WithUseBloomCache (107-111)
cmd/execution_builder.go (1)
network/p2p/blob/blob_service.go (1)
  • WithUseBloomCache (107-111)
cmd/scaffold.go (1)
cmd/node_builder.go (1)
  • BaseConfig (140-197)
cmd/observer/node_builder/observer_builder.go (1)
network/p2p/blob/blob_service.go (1)
  • WithUseBloomCache (107-111)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (37)
  • GitHub Check: Lint (./insecure/)
  • GitHub Check: Lint (./)
  • GitHub Check: Lint (./integration/)
  • GitHub Check: Unit Tests (network/alsp)
  • GitHub Check: Unit Tests (others)
  • GitHub Check: Unit Tests (network/p2p/scoring)
  • GitHub Check: Unit Tests (network/p2p/node)
  • GitHub Check: Unit Tests (engine/common)
  • GitHub Check: Unit Tests (module)
  • GitHub Check: Unit Tests (network/test/cohort1)
  • GitHub Check: Unit Tests (engine/consensus)
  • GitHub Check: Unit Tests (engine/access)
  • GitHub Check: Unit Tests (network/p2p/connection)
  • GitHub Check: Unit Tests (engine/execution/computation)
  • GitHub Check: Unit Tests (engine/execution)
  • GitHub Check: Unit Tests (engine/verification)
  • GitHub Check: Unit Tests (module/dkg)
  • GitHub Check: Unit Tests (engine)
  • GitHub Check: Unit Tests (storage)
  • GitHub Check: Unit Tests (network)
  • GitHub Check: Unit Tests (network/test/cohort2)
  • GitHub Check: Unit Tests (network/p2p)
  • GitHub Check: Unit Tests (engine/execution/ingestion)
  • GitHub Check: Unit Tests (engine/collection)
  • GitHub Check: Unit Tests (utils)
  • GitHub Check: Unit Tests (cmd)
  • GitHub Check: Unit Tests (admin)
  • GitHub Check: Unit Tests (fvm)
  • GitHub Check: Unit Tests (consensus)
  • GitHub Check: Unit Tests (state)
  • GitHub Check: Unit Tests (ledger)
  • GitHub Check: Integration Tests Others (integration)
  • GitHub Check: Unit Tests Insecure (insecure)
  • GitHub Check: Unit Tests Insecure (insecure/integration/functional/test/gossipsub/scoring)
  • GitHub Check: Unit Tests Insecure (insecure/integration/functional/test/gossipsub/rpc_inspector)
  • GitHub Check: analyze-code (go)
  • GitHub Check: Docker Build
🔇 Additional comments (5)
integration/tests/access/cohort3/execution_state_sync_test.go (1)

200-216: ✅ Well-placed structured logging for better test observability.

The logging additions capture relevant details (height, block ID, chunk count, errors) and are appropriately positioned:

  • Error logging occurs before retry, aiding visibility into transient failures
  • Success logging occurs after successful data conversion, confirming the full flow succeeded

The structured fields using zerolog are idiomatic for Flow Go. Since these logs are within the retryNotFound callback, they may repeat on retries, which is expected and helpful for debugging integration test behavior.

cmd/observer/node_builder/observer_builder.go (1)

1195-1205: Observer execution data service correctly wired to Bloom cache flag

Appending blob.WithUseBloomCache(builder.BitswapBloomCacheEnabled) to the options here aligns observer behavior with access and execution nodes and keeps the toggle centralized in BaseConfig. No issues spotted with this wiring.

cmd/execution_builder.go (1)

398-418: Execution blob service now respects Bloom cache configuration

Appending blob.WithUseBloomCache(node.BitswapBloomCacheEnabled) ensures execution nodes honor the shared Bloom cache flag, consistent with other builders. The option placement alongside existing bitswap and reprovide options looks correct.

cmd/access/node_builder/access_node_builder.go (1)

646-666: Access execution data service correctly wired to Bloom cache flag

Using blob.WithUseBloomCache(builder.BitswapBloomCacheEnabled) here makes the execution data blob service on access nodes respect the shared Bloom cache toggle, mirroring execution and observer builders. The option ordering and usage look sound.

cmd/scaffold.go (1)

216-219: Approved. The flag binding is correct and properly uses DefaultBaseConfig() which initializes BitswapBloomCacheEnabled to true, matching the PR intent. The implementation is consistent with other bitswap-related flags and the help text clearly documents the experimental feature.

Comment on lines 55 to 59
type BlobServiceConfig struct {
	ReprovideInterval time.Duration    // the interval at which the DHT provider entries are refreshed
	BitswapOptions    []bitswap.Option // options to pass to the Bitswap service
	UseBloomCache     bool             // if true, use the bloom cache (cached blockstore), otherwise use plain blockstore
}
Contributor

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

cd network/p2p/blob && wc -l blob_service.go

Repository: onflow/flow-go

Length of output: 77


🏁 Script executed:

cat -n network/p2p/blob/blob_service.go | head -160

Repository: onflow/flow-go

Length of output: 6605


🏁 Script executed:

# Verify the exact line where blockStore is used in CachedBlockstore call
sed -n '141,151p' network/p2p/blob/blob_service.go

Repository: onflow/flow-go

Length of output: 354


🏁 Script executed:

# Double-check WithRateLimit implementation to confirm it modifies bs.blockStore
sed -n '95,100p' network/p2p/blob/blob_service.go

Repository: onflow/flow-go

Length of output: 303


🏁 Script executed:

# Verify WithParentBlobService also modifies bs.blockStore
sed -n '75,80p' network/p2p/blob/blob_service.go

Repository: onflow/flow-go

Length of output: 334


Fix CachedBlockstore to wrap the final blockstore after applying options

When UseBloomCache is true, CachedBlockstore is called with the local blockStore variable created on line 125, instead of bs.blockStore. Since options like WithRateLimit and WithParentBlobService mutate only bs.blockStore (not the local variable), their effects are silently discarded. For example:

  • If WithRateLimit is applied, it wraps bs.blockStore, but CachedBlockstore then wraps the unwrapped local blockStore, and the rate limiter is lost.
  • If WithParentBlobService is applied, it replaces bs.blockStore with the parent's store, but CachedBlockstore wraps the new local blockStore instead, breaking the parent relationship.

Change line 144 to use bs.blockStore instead of blockStore:

Proposed fix
 	if bs.config.UseBloomCache {
 		cachedBlockStore, err := blockstore.CachedBlockstore(
 			context.Background(),
-			blockStore,
+			bs.blockStore,
 			blockstore.DefaultCacheOpts(),
 		)
 		if err != nil {
 			return nil, fmt.Errorf("failed to create cached blockstore: %w", err)
 		}
 		bs.blockStore = cachedBlockStore
 	}

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In network/p2p/blob/blob_service.go around lines ~125-145, the CachedBlockstore
call wraps the local variable blockStore instead of bs.blockStore, causing any
option-wrapping (e.g. WithRateLimit, WithParentBlobService) applied to
bs.blockStore to be ignored; change the CachedBlockstore invocation to use
bs.blockStore (the final, possibly-wrapped store) so that all option mutations
are preserved and the parent/RateLimit wrappers remain effective.
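
The effect described in this comment can be reproduced in isolation. The sketch below is a simplified model with hypothetical types (`baseStore`, `rateLimited`, `cached`, `service`) and a parameterless `WithRateLimit`, which differs from the real option; it only shows why wrapping the local variable discards wrappers that options applied to the struct field.

```go
package main

import "fmt"

// Blockstore is a hypothetical stand-in that reports its wrapper chain.
type Blockstore interface{ Describe() string }

type baseStore struct{}

func (baseStore) Describe() string { return "base" }

type rateLimited struct{ inner Blockstore }

func (r rateLimited) Describe() string { return "rateLimited(" + r.inner.Describe() + ")" }

type cached struct{ inner Blockstore }

func (c cached) Describe() string { return "cached(" + c.inner.Describe() + ")" }

type service struct{ blockStore Blockstore }

// Option mutates the service, in particular its blockStore field.
type Option func(*service)

// WithRateLimit wraps the field; the real option takes rate parameters, omitted here.
func WithRateLimit() Option {
	return func(s *service) { s.blockStore = rateLimited{inner: s.blockStore} }
}

// newService wraps either the field (fixed) or the stale local variable (buggy).
func newService(wrapField bool, opts ...Option) *service {
	local := baseStore{}
	s := &service{blockStore: local}
	for _, opt := range opts {
		opt(s)
	}
	if wrapField {
		s.blockStore = cached{inner: s.blockStore} // keeps wrappers added by options
	} else {
		s.blockStore = cached{inner: local} // drops wrappers added by options
	}
	return s
}

func main() {
	fmt.Println("buggy:", newService(false, WithRateLimit()).blockStore.Describe())
	fmt.Println("fixed:", newService(true, WithRateLimit()).blockStore.Describe())
}
```

In the buggy variant the rate limiter never appears in the final chain, which matches the "silently discarded" behaviour the comment warns about.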

@github-actions github-actions Bot added the Stale label (used when marking an issue stale) Mar 4, 2026
@github-actions github-actions Bot closed this Mar 12, 2026
auto-merge was automatically disabled March 12, 2026 02:19

Pull request was closed

@fxamacker fxamacker added the Preserve label (Stale Bot repellent) and removed the Stale label (used when marking an issue stale) Mar 12, 2026
@fxamacker fxamacker reopened this Mar 12, 2026
@blacksmith-sh

blacksmith-sh Bot commented Mar 21, 2026

Found 20 test failures on Blacksmith runners:

Failures

| Test | Listed failures |
| --- | --- |
| github.com/onflow/flow-go/integration/tests/access/cohort3/TestExecutionStateSync | 5 |
| github.com/onflow/flow-go/integration/tests/access/cohort3/TestExecutionStateSync/TestBadgerDBHappyPath | 5 |
| github.com/onflow/flow-go/integration/tests/access/cohort4/TestExecutionDataPruning | 5 |
| github.com/onflow/flow-go/integration/tests/access/cohort4/TestExecutionDataPruning/TestHappyPath | 5 |

Labels

Preserve (Stale Bot repellent)

6 participants