consensus/bor, internal/cli: full grpc implementation#2194
consensus/bor, internal/cli: full grpc implementation#2194marcello33 wants to merge 8 commits intodevelopfrom
Conversation
There was a problem hiding this comment.
Claude Code Review
This repository is configured for manual code reviews. Comment @claude review to trigger a review and subscribe this PR to future pushes, or @claude review once for a one-time review.
Tip: disable this comment in your organization's Code Review settings.
|
@claude review. |
|
Claude finished @marcello33's task in 3m 47s —— View job Code Review: gRPC Implementation
Overall this is a solid WIP. The auth interceptor, API caching, and header expansion are well-designed. I found one security-sensitive bug and a few smaller items. Finding 1 — uint64 overflow in
|
There was a problem hiding this comment.
Pull request overview
This WIP PR extends the gRPC implementation by adding bearer-token authentication and expanding the Bor gRPC API surface, while aligning Bor’s RPC API lifecycle with gRPC’s per-call access pattern.
Changes:
- Add optional gRPC bearer-token auth + unified unary/stream interceptors (also gating reflection).
- Expand BorApi gRPC handlers and improve header→proto conversion + batch endpoint size gating.
- Cache Bor’s
*APIinstance acrossAPIs()calls to preserve per-API state; update deps/tests accordingly.
Reviewed changes
Copilot reviewed 11 out of 13 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
| internal/cli/server/server.go | Add combined unary/stream interceptors and metadata-based bearer auth. |
| internal/cli/server/config.go | Add GRPC.Token config field. |
| internal/cli/server/flags.go | Add --grpc.token CLI flag. |
| internal/cli/server/grpc_auth_test.go | Add unit tests for auth + interceptors. |
| internal/cli/server/api_service.go | Add/extend BorApi gRPC handlers, header conversion, and batch endpoint with size gate. |
| internal/cli/server/api_service_test.go | Add tests for header round-trip and batch-range guards. |
| consensus/bor/bor.go | Cache the Bor *API returned by APIs() across calls. |
| consensus/bor/bor_test.go | Add test asserting APIs() returns the same service instance across calls. |
| go.mod / go.sum | Bump github.com/0xPolygon/polyproto to a newer pseudo-version. |
| cmd/keeper/go.mod / cmd/keeper/go.sum | Bump indirect golang.org/x/sys. |
| eth/tracers/data.csv | Reorder/update CSV rows. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Code ReviewBug: uint64 overflow causes infinite loop in GetBlockInfoInBatch (DoS)File: internal/cli/server/api_service.go L197-L218 The for loop The size gate does not prevent this. For example, StartBlockNumber = MaxUint64, EndBlockNumber = MaxUint64 has a difference of 0, which passes the >= 256 check. Why fetchBlockInfo doesn't save us: rpc.BlockNumber(math.MaxUint64) casts to int64(-1) which is PendingBlockNumber — a valid block on mining/validator nodes. After wrapping to 0, the loop fetches from genesis through every block in the chain, causing unbounded memory allocation and CPU consumption. Suggested fix: Restructure the loop to count iterations instead of iterating on the block number directly: count := req.EndBlockNumber - req.StartBlockNumber + 1
for j := uint64(0); j < count; j++ {
info, ok := s.fetchBlockInfo(ctx, req.StartBlockNumber+j)
if !ok {
break
}
out.Blocks = append(out.Blocks, info)
}This avoids the i <= MaxUint64 trap since j < count where count is bounded by the size gate. Additional concern: fetchBlockInfo casts uint64 block numbers to rpc.BlockNumber (int64) without checking for overflow. Large uint64 values silently map to negative int64 values corresponding to special block numbers (-1 = Pending, -2 = Latest), which could return unexpected blocks. |
Codecov Report❌ Patch coverage is ❌ Your patch check has failed because the patch coverage (50.19%) is below the target coverage (90.00%). You can increase the patch coverage or adjust the target coverage. Additional details and impacted files@@ Coverage Diff @@
## develop #2194 +/- ##
===========================================
- Coverage 51.99% 51.99% -0.01%
===========================================
Files 884 884
Lines 155503 155735 +232
===========================================
+ Hits 80854 80973 +119
- Misses 69448 69552 +104
- Partials 5201 5210 +9
... and 18 files with indirect coverage changes
🚀 New features to boost your workflow:
|
|
@claude addressed your points. Review again. |
|
Claude finished @marcello33's task in 2m 4s —— View job Re-review: gRPC Implementation (post
|
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 13 out of 15 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
@claude addressed your points from the previous run. |
|
Claude finished @marcello33's task in 1m 53s —— View job Re-review: gRPC Implementation (3rd pass — post
|
|
@claude addressed latest comments. Review. |
|
Claude finished @marcello33's task in 1m 16s —— View job Re-review: gRPC Implementation (4th pass — post
|
Code reviewNo issues found. Checked for bugs and CLAUDE.md compliance. |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 13 out of 15 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Code review3 issues found. Checked for bugs and CLAUDE.md compliance. 1. Bug: File: Go's Fix: use a generic helper 2. CLAUDE.md violation: File: The 3. CLAUDE.md violation: Bearer token exposed via CLI flag (visible in File: The |
|
@claude wrt #2194 (comment) |
|
Claude finished @marcello33's task in 54s —— View job Re-review: Verifying latest fixes (5th pass — post
|
|

Summary
In this PR we implement the server-side half of the heimdall-bor gRPC migration started by polyproto's
GetBlockInfoInBatch,GetTdByHash,GetTdByNumber,GetAuthormessages (see this PR). It provides:Results
GetBorChainBlockInfoInBatchwith 77% latency reduction — the call used in the consensus-criticalExtendVoteHandlerpath.Latency (ns per op)
Allocations (bytes and count per op)
Ambient system-level resource usage (per phase, 5 s interval across 1000 blocks)
CPU samples during bench window (pprof, 120 s each, aligned to bench run)
Heimdall CPU is ~17% higher on gRPC — consistent with the client doing proto decode per call instead of JSON-RPC decode.
Details
GetBlockInfoInBatch,GetTdByHash,GetTdByNumber,GetAuthor.Headerfidelity:headerToProtoBorHeadernow emits every 21 header fields.--grpc.tokenflag (orBOR_GRPC_TOKENenv var) + unary/stream interceptors. If the token is empty, auth is disabled. This matches the RFC 6750 §2.1 case-insensitive scheme.Bor.APIs()caches a single*APIinstance viasync.Once, so the gRPC backend shares the samerootHashCacheas the JSON-RPC registration path. Without this, I detected a 2.1×GetRootHashregression.GetBlockInfoInBatchsafety: size gate (≤ 256 blocks),end > MaxInt64guard, count-based loop (wrap-safe), per-iterationctx.Err()check.fetchBlockInforeturns(info, ok, err)so real failures propagate instead of silently truncating batches.InvalidArgumentfor malformed requests,NotFoundfor missing blocks/TD,OutOfRangefor numeric-overflow cases. Callers can branch on canonical codes instead of everything collapsing tocodes.Unknown.What this enables on the heimdall side
GetRootHash,GetVoteOnHash,HeaderByNumber,BlockByNumber,TransactionReceipt,BorBlockReceipt).x/milestone/abci) can issue a singleGetBlockInfoInBatchround-trip instead of 3×N JSON-RPC batch elements — up to 4.4× faster per call and 66% fewer bytes allocated per request on the kurtosis devnet benchmark.Info about the gRPC authentication
gRPC authentication via bearer token is not strictly needed when bor and heimdall run on the same host with bor bound to 127.0.0.1:3131 (the post-PR-#2078 default). This is because:
The token becomes meaningful in three cases:
Enabling it
gRPC is opt-in. Default config has bor listening on
127.0.0.1:3131and heimdall dialing over HTTP JSON-RPC — no behavior change for existing operators.Step 1: Bor side
[grpc]loopback default). Required — heimdall's clients assume both.config.tomlto opt into gRPC on the address heimdall can reach:ps aux/ shell history):bor serverreads it at startup; the flag stays as a fallback.Step 2: Heimdall side
Flip the config in
app.toml:Restart heimdall; the startup parity check runs one
HeaderByNumberover both transports and logs a Warning / Fatal if the hashes diverge (guard against a stale bor that doesn't populate the full proto header).Rollout recommendation
addr=127.0.0.1:3131, token empty or set viaBOR_GRPC_TOKEN. No TLS needed on loopback.addr=0.0.0.0:3131+https://URL + matching token. Heimdall refuses to send a token over plaintext to a non-loopback peer.Downgrade path
Set
bor_grpc_flag = "false"in heimdall and restart. Bor doesn't need to be downgraded — it keeps serving both transports regardless of the flag.Tests
GetBlockInfoInBatch4.4× faster on gRPC, 50–67% allocation reduction across all methods,GetRootHashregression closed via the cached API.Dependencies
GetBlockInfoInBatch/GetTdByHash/GetTdByNumber/GetAuthorRPCs and the expandedHeadermessage.[grpc]port, threadcl_bor_grpc_flag+el_grpc_urlthrough the CL launcher template) if you want to run the end-to-end bench on a devnet. The changes are hereOut of scope
BloomandLogsare intentionally not populated yet. No heimdall consumer reads them today; will be a small follow-up when needed.GetTdByNumbertag resolution (latest/pending) viaHeaderByNumber → GetTd(hash)is scoped to this gRPC handler; the same defect in upstreameth/api_backend.go:GetTdByNumberaffects the JSON-RPC path and is tracked separately.