Conversation
57f72ae to
b07e4f2
Compare
b07e4f2 to
6cb7828
Compare
|
|
||
| // Explicitly check for the expected number of topics | ||
| if len(log.Topics) < 4 { | ||
| r.onCriticalInvariant(ctx) |
There was a problem hiding this comment.
Adding it to all errors, but some error logs are promoted to fatals to make them failing loud. The ones upgraded are the ones that would be the result from a malicious ramp. The ones that are still just skipping (but do send a metric!) are wrong payloads but not in a clearly malicious way.
Let me know if everything should be fatal.
| } | ||
|
|
||
| results = append(results, protocol.MessageSentEvent{ | ||
| MessageID: protocol.Bytes32(event.MessageId), |
There was a problem hiding this comment.
Pull request overview
Adds a new verifier metric for “critical source invariant violations” and wires it through the monitoring interface so the EVM source reader can emit it when encountering critical invariants / malformed source-chain message data.
Changes:
- Extend
MetricLabelerwithIncrementCriticalSourceInvariantViolationsand implement it across real/noop/fake/mock labelers. - Register a new OTel counter (
verifier_critical_source_invariant_violations_total) and add the corresponding labeler method implementation. - Thread a callback into
NewEVMSourceReaderand invoke it on several critical validation / decoding paths.
Reviewed changes
Copilot reviewed 9 out of 10 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| verifier/testutil/metric_labeler.go | Add noop implementation for the new metric method. |
| verifier/pkg/vtypes/interfaces.go | Extend monitoring interface with the new metric method + doc comment. |
| verifier/pkg/monitoring/monitoring.go | Add fake labeler implementation for the new metric method. |
| verifier/pkg/monitoring/metrics.go | Register new counter + expose increment method on VerifierMetricLabeler. |
| verifier/pkg/helpers_test.go | Update local noop labeler used in tests to satisfy interface. |
| verifier/internal/mocks/mock_MetricLabeler.go | Update generated mock with new method/expecter. |
| integration/pkg/constructors/committee_verifier.go | Provide per-chain labeled metric labeler and pass callback into EVM source reader. |
| integration/pkg/accessors/evm/factory.go | Update EVM source reader constructor call with new parameter. |
| integration/pkg/accessors/evm/evm_source_reader.go | Add callback plumbing and invoke it on critical validation/decoding paths. |
| .gitignore | Normalize build ignore entry and add coverage.out. |
Comments suppressed due to low confidence (1)
integration/pkg/accessors/evm/evm_source_reader.go:174
onCriticalInvariant(and the newIncrementCriticalSourceInvariantViolationsmetric) is triggered here for malformed logs (insufficient topics). That doesn’t match the interface doc that describes disagreement with configured on-chain facts; it will make the metric hard to interpret and may inflate counts due to decode/parsing issues. Consider either narrowing invocation to true invariant mismatches, or broadening/renaming the metric+doc to reflect all parse/validation failures it covers.
// Explicitly check for the expected number of topics
if len(log.Topics) < 4 {
r.onCriticalInvariant(ctx)
r.lggr.Errorw("CCIPMessageSent event has insufficient topics",
"expected", 4,
"actual", len(log.Topics),
"blockNumber", log.BlockNumber,
"txHash", log.TxHash.Hex())
continue // to next message
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // IncrementTaskVerificationPermanentErrors increments the counter for non-retryable verification errors. | ||
| IncrementTaskVerificationPermanentErrors(ctx context.Context) | ||
|
|
||
| // IncrementCriticalSourceInvariantViolations increments when encoded source-chain data disagrees with configured on-chain facts (e.g. onRamp in message vs observed contract). |
There was a problem hiding this comment.
The doc comment for IncrementCriticalSourceInvariantViolations is narrowly scoped to encoded data disagreeing with configured on-chain facts, but current usage also covers decode/unpack/structural validation failures in the EVM source reader. Either update this comment (and metric description) to reflect all intended triggers, or introduce separate metrics so dashboards/alerts are not ambiguous.
| // IncrementCriticalSourceInvariantViolations increments when encoded source-chain data disagrees with configured on-chain facts (e.g. onRamp in message vs observed contract). | |
| // IncrementCriticalSourceInvariantViolations increments the counter for critical source-reader | |
| // validation failures, including when encoded source-chain data disagrees with configured | |
| // or observed on-chain facts, as well as decode, unpack, or structural validation failures | |
| // while interpreting source-chain data. |
| results = append(results, protocol.MessageSentEvent{ | ||
| MessageID: protocol.Bytes32(event.MessageId), | ||
| MessageID: event.MessageId, | ||
| Message: *decodedMsg, | ||
| Receipts: allReceipts, // Keep original order from OnRamp event | ||
| BlockNumber: log.BlockNumber, | ||
| TxHash: protocol.ByteSlice(log.TxHash.Bytes()), | ||
| TxHash: log.TxHash.Bytes(), | ||
| }) |
There was a problem hiding this comment.
protocol.MessageSentEvent expects MessageID to be protocol.Bytes32 and TxHash to be protocol.ByteSlice (both defined types), so assigning event.MessageId ([32]byte) and log.TxHash.Bytes() ([]byte) will not compile. Convert both to the expected protocol types when building the struct literal.
| if !decodedMsg.OnRampAddress.Equal(expectedSourceAddressBytes(r.onRampAddress)) { | ||
| r.lggr.Errorw("onRampAddress must match the value configured. This should never happen, if it does something is seriously wrong. Escalate immediately", "messageId", common.Bytes2Hex(event.MessageId[:])) | ||
| continue // to next message | ||
| r.onCriticalInvariant(ctx) | ||
| r.lggr.Fatalw("onRampAddress must match the value configured — critical invariant violated; escalate immediately", | ||
| "messageId", common.Bytes2Hex(event.MessageId[:])) | ||
| continue // ensure we never process this msg |
There was a problem hiding this comment.
Using lggr.Fatalw inside FetchMessageSentEvents is risky because fatal logging typically terminates the process; this would turn a single bad event into a full service outage, and the subsequent continue becomes effectively dead code. Prefer returning an error (so the caller can decide), or logging at Error/Panic with clear context and skipping the message.
# Conflicts: # integration/pkg/accessors/evm/factory.go
|
Code coverage report:
|
No description provided.