fix(op-node): probe engine finalized head on EL-sync startup by sebastianst · Pull Request #20139 · ethereum-optimism/optimism

sebastianst · 2026-04-17T12:40:15Z

Summary

Fixes the stall described in #18468. When op-node is configured with --syncmode=execution-layer, NewEngineController unconditionally initializes syncStatus = syncStatusWillStartEL. After a restart against an already-synced engine, op-node stays in that state forever: SyncStep backs off while isEngineInitialELSyncing() is true, and the only code path that transitions out of syncStatusWillStartEL is insertUnsafePayload, which requires a fresh unsafe payload to arrive. If the sequencer is only gossiping blocks the engine already has, op-node and reth deadlock: reth logs "no consensus updates received for a while", and op-node logs unsafe=0 safe=0 el_syncing=true indefinitely.
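To make the stall concrete, here is a toy model of the state machine described above. It is a sketch, not the op-node code: `toyNode`, `onGossip`, and `syncStep` are hypothetical names, and it assumes that gossiped payloads the engine already has never trigger the transition, which is how the summary describes the failure.

```go
package main

import "fmt"

type syncStatus int

const (
	syncStatusWillStartEL syncStatus = iota
	syncStatusStartedEL
	syncStatusFinishedEL
)

// toyNode is a hypothetical, heavily simplified stand-in for the engine controller.
type toyNode struct {
	status     syncStatus
	engineHead uint64 // highest block the engine already has
}

// isEngineInitialELSyncing mirrors the check SyncStep backs off on.
func (n *toyNode) isEngineInitialELSyncing() bool {
	return n.status == syncStatusWillStartEL || n.status == syncStatusStartedEL
}

// syncStep reports whether derivation was allowed to run.
func (n *toyNode) syncStep() bool {
	return !n.isEngineInitialELSyncing()
}

// onGossip models the only exit from syncStatusWillStartEL in this toy:
// a payload that is newer than what the engine already has (assumption).
func (n *toyNode) onGossip(blockNum uint64) {
	if blockNum <= n.engineHead {
		return // not a fresh payload, nothing changes
	}
	if n.status == syncStatusWillStartEL {
		n.status = syncStatusStartedEL
	}
}

func main() {
	n := &toyNode{engineHead: 1200}
	steps := 0
	for b := uint64(1198); b <= 1200; b++ {
		n.onGossip(b) // sequencer gossips blocks the engine already has
		if n.syncStep() {
			steps++
		}
	}
	fmt.Println("still in WillStartEL:", n.status == syncStatusWillStartEL, "derivation steps:", steps)
}
```

In this model the node only leaves syncStatusWillStartEL when a block above `engineHead` is gossiped, so a restart against a synced engine with no new blocks never makes progress.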

Approach

Add MaybeSkipELSyncIfEngineAlreadySynced on EngineController: while syncStatus == syncStatusWillStartEL, probe the engine's finalized head. If it's a non-genesis block (and SupportsPostFinalizationELSync is not set), transition directly to syncStatusFinishedEL and emit ResetEngineRequestEvent so op-node's in-memory heads are populated via FindL2Heads. The emit happens after the mutex is released, because ResetEngineRequestEvent's handler re-acquires the same lock via OnEvent.
Call it at the top of SyncStep before the EL-sync backoff. It's a no-op once syncStatus has transitioned, so it runs at most a few times on startup.
Factor the finalized-head check (previously inline in insertUnsafePayload's syncStatusWillStartEL branch) into checkEngineAlreadySynced, used by both call sites.
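The three steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: the `engineClient` interface, the `controller` fields, and the `emitReset` callback are hypothetical stand-ins, and "non-genesis" is modelled as a finalized block number greater than the genesis number. The point it shows is the ordering: decide and update state under the mutex, release it, then emit.

```go
package main

import (
	"fmt"
	"sync"
)

type syncStatus int

const (
	syncStatusWillStartEL syncStatus = iota
	syncStatusFinishedEL
)

// engineClient is a hypothetical stand-in for the engine API client.
type engineClient interface {
	FinalizedBlockNumber() (uint64, error)
}

// controller is a simplified, hypothetical model of EngineController.
type controller struct {
	mu                             sync.Mutex
	status                         syncStatus
	engine                         engineClient
	genesisNumber                  uint64
	supportsPostFinalizationELSync bool
	emitReset                      func() // stand-in for emitting ResetEngineRequestEvent
}

// checkEngineAlreadySynced reports whether the engine has a non-genesis
// finalized head. The caller must hold c.mu.
func (c *controller) checkEngineAlreadySynced() (bool, error) {
	if c.supportsPostFinalizationELSync {
		return false, nil
	}
	n, err := c.engine.FinalizedBlockNumber()
	if err != nil {
		return false, err
	}
	return n > c.genesisNumber, nil
}

// maybeSkipELSync probes the engine while in syncStatusWillStartEL and is a
// no-op in every other state.
func (c *controller) maybeSkipELSync() error {
	c.mu.Lock()
	if c.status != syncStatusWillStartEL {
		c.mu.Unlock()
		return nil
	}
	synced, err := c.checkEngineAlreadySynced()
	if err != nil || !synced {
		c.mu.Unlock()
		return err
	}
	c.status = syncStatusFinishedEL
	c.mu.Unlock()
	// Emit only after unlocking: the handler takes c.mu again.
	c.emitReset()
	return nil
}

// fixedEngine is a test double that always reports the same finalized number.
type fixedEngine uint64

func (f fixedEngine) FinalizedBlockNumber() (uint64, error) { return uint64(f), nil }

func main() {
	c := &controller{engine: fixedEngine(1200)}
	resets := 0
	c.emitReset = func() {
		c.mu.Lock() // would deadlock if emitted while c.mu is still held
		defer c.mu.Unlock()
		resets++
	}
	_ = c.maybeSkipELSync()
	_ = c.maybeSkipELSync() // second call is a no-op
	fmt.Println(c.status == syncStatusFinishedEL, resets) // prints "true 1"
}
```

Go's sync.Mutex is not reentrant, so calling the handler while still holding the lock would block forever; that is the reason the sketch unlocks explicitly on every path instead of using a single deferred unlock.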

Scope notes

The original concern that motivated this investigation was that the FCU reduction from #19638 (feat(op-node): batch safe-head FCU calls to one per derived L1 block) might affect syncing engines. Once traced, that concern reduces to this startup bug. In CLSync mode, syncStatus never transitions into any EL-syncing state, so isEngineInitialELSyncing() is always false there and the FCU-reduction changes don't alter the gossip→FCU path. The real bug is the startup stall in ELSync mode, which this PR addresses.
Not in scope here: engines that internally decide to snap-sync (e.g. op-reth with a blank DB) while op-node runs in CLSync mode. op-node has no direct signal for that today. A follow-up would need op-node to inspect engine FCU responses (SYNCING vs VALID) to track engine-initiated sync state.

Test plan

go build ./op-node/... — clean
go vet ./op-node/... — clean
go test ./op-node/rollup/engine/... — pass (new TestMaybeSkipELSyncIfEngineAlreadySynced with four sub-tests)
go test ./op-node/rollup/{driver,derive,attributes,sequencing,finality}/... — pass
op-e2e/actions/sync/... — requires forge-artifacts build; deferred to CI

Fixes #18468

🤖 Generated with Claude Code

When op-node is configured with --syncmode=execution-layer, NewEngineController unconditionally initializes syncStatus to syncStatusWillStartEL. After a restart against an already-synced engine, op-node stalls there indefinitely: SyncStep backs off on isEngineInitialELSyncing(), and the only place that transitions out of syncStatusWillStartEL is insertUnsafePayload, which requires a fresh unsafe payload that may not arrive (the sequencer gossips blocks the engine already has).

Add MaybeSkipELSyncIfEngineAlreadySynced: a startup guard that queries the engine's finalized head while in syncStatusWillStartEL. If the engine is already synced (non-genesis finalized head, and SupportsPostFinalizationELSync is not set), transition directly to syncStatusFinishedEL and emit ResetEngineRequestEvent so op-node's in-memory heads are populated via FindL2Heads. The emit happens after the mutex is released to avoid re-entering OnEvent under the lock.

Call it at the top of SyncStep before the EL-sync backoff. It's a no-op once syncStatus has transitioned, so it runs at most a few times on startup.

The finalized-head check that drives both startup probing and the existing transition inside insertUnsafePayload is factored out into checkEngineAlreadySynced.

Fixes #18468

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

sebastianst force-pushed the seb/fix/fcu-nudge-during-el-sync branch from e451773 to 90669a8 Compare April 17, 2026 13:37

sebastianst changed the title ~~fix(op-node): nudge engine with FCU on gossip during initial EL sync~~ fix(op-node): probe engine finalized head on EL-sync startup Apr 17, 2026

sebastianst wants to merge 1 commit into develop from seb/fix/fcu-nudge-during-el-sync
