Enhance DomainAdapter to retrieve missing UTXOs from archive store#962
Enhance DomainAdapter to retrieve missing UTXOs from archive store#962nicolasLuduena wants to merge 2 commits intomainfrom
Conversation
📝 WalkthroughWalkthroughAfter a state-store lookup, Changes
Sequence DiagramsequenceDiagram
participant Caller as Caller
participant StateStore as StateStore
participant Indexes as Indexes
participant Archive as Archive
participant Decoder as BlockDecoder
participant Result as ResultMap
Caller->>StateStore: get_utxos(dolos_refs)
StateStore-->>Caller: initial UtxoMap
Caller->>Caller: compute missing refs
alt no missing refs
Caller-->>Result: return UtxoMap
else missing refs exist
Caller->>Caller: group refs by tx_hash
loop per tx_hash
Caller->>Indexes: slot_by_tx_hash(tx_hash)
Indexes-->>Caller: slot
Caller->>Archive: get_block_by_slot(slot)
Archive-->>Caller: block_bytes
Caller->>Decoder: MultiEraBlock::decode(block_bytes)
Decoder-->>Caller: decoded_block (era, txs...)
Caller->>Caller: find tx where tx.hash() == tx_hash
Caller->>Result: insert (tx_output.index -> (block.era, output.encode()))
end
Caller-->>Result: return populated UtxoMap
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/adapters/mod.rs`:
- Around line 209-216: The current pattern using let Ok(Some(...)) =
self.indexes().slot_by_tx_hash(&tx_hash_bytes) else { continue } (and similarly
for self.archive().get_block_by_slot and MultiEraBlock::decode) collapses Err
into “not found”; instead match each call explicitly: handle Ok(Some(...)) as
before, treat Ok(None) as a genuine not-found (continue), and propagate or
return Err for any Err results so archive/index/decoding failures are not
swallowed. Update the call sites (self.indexes().slot_by_tx_hash,
self.archive().get_block_by_slot, MultiEraBlock::decode) to use match or ? to
return errors, and adjust the enclosing function’s return type if needed to
return Result<Option<...>, E> rather than silently continuing on Err.
- Around line 198-235: DomainAdapter::get_utxos currently reconstructs missing
outputs from ArchiveStore (via indexes().slot_by_tx_hash,
archive().get_block_by_slot, MultiEraBlock::decode) and returns them into the
current-state API; change this so the fallback does not expose historical/spent
outputs: either remove the ArchiveStore reconstruction entirely (leave missing
refs absent) or, if you keep reconstruction, validate each reconstructed output
against the current StateStore (use StateStore APIs / the same lookup used for
present UTXOs) to ensure the output is still unspent before inserting into
result; update get_utxos accordingly and keep get_historical_utxos as the sole
path for historical-only reads.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
| let missing: Vec<_> = refs.iter().filter(|r| !result.contains_key(r)).collect(); | ||
| if missing.is_empty() { | ||
| return Some(result); | ||
| } | ||
|
|
||
| let some = dolos_core::StateStore::get_utxos(self.state(), refs) | ||
| .ok()? | ||
| .into_iter() | ||
| .map(|(k, v)| { | ||
| let era = v.0.try_into().expect("era out of range"); | ||
| (k.into(), (era, v.1.clone())) | ||
| }) | ||
| .collect(); | ||
|
|
||
| Some(some) | ||
| let mut by_tx: HashMap<Vec<u8>, Vec<&pallas::interop::utxorpc::TxoRef>> = HashMap::new(); | ||
| for txo_ref in &missing { | ||
| by_tx.entry(txo_ref.0.to_vec()).or_default().push(txo_ref); | ||
| } | ||
|
|
||
| for (tx_hash_bytes, txo_refs) in by_tx { | ||
| let Ok(Some(slot)) = self.indexes().slot_by_tx_hash(&tx_hash_bytes) else { | ||
| continue; | ||
| }; | ||
| let Ok(Some(block_bytes)) = self.archive().get_block_by_slot(&slot) else { | ||
| continue; | ||
| }; | ||
| let Ok(block) = MultiEraBlock::decode(&block_bytes) else { | ||
| continue; | ||
| }; | ||
|
|
||
| let block_txs = block.txs(); | ||
| let Some(tx) = block_txs | ||
| .iter() | ||
| .find(|tx| tx.hash().as_ref() == tx_hash_bytes.as_slice()) | ||
| else { | ||
| continue; | ||
| }; | ||
|
|
||
| let outputs = tx.outputs(); | ||
| let era = block.era(); | ||
|
|
||
| for txo_ref in txo_refs { | ||
| let Some(output) = outputs.get(txo_ref.1 as usize) else { | ||
| continue; | ||
| }; | ||
| result.insert(*txo_ref, (era, output.encode())); | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
printf '\n== LedgerContext implementations ==\n'
rg -nP --type=rust -C3 'impl\s+pallas::interop::utxorpc::LedgerContext\s+for\s+\w+|fn\s+get_utxos\s*\('
printf '\n== Historical lookup definitions / call sites ==\n'
rg -nP --type=rust -C3 '\bfn\s+get_historical_utxos\s*\(|(?<!fn\s)\bget_historical_utxos\s*\('
printf '\n== Non-definition get_utxos call sites ==\n'
rg -nP --type=rust -C3 '(?<!fn\s)\bget_utxos\s*\('Repository: txpipe/dolos
Length of output: 13638
🏁 Script executed:
# Check comments and context around the fallback code in DomainAdapter
sed -n '150,250p' src/adapters/mod.rs
# Look for any documentation or comments about why ArchiveStore fallback exists
rg -B5 -A5 'fn get_utxos' src/adapters/mod.rs
# Check if there are tests that verify this fallback behavior
rg -B3 -A3 'get_utxos.*ArchiveStore|ArchiveStore.*get_utxos|historical.*fallback|fallback.*historical' --type=rustRepository: txpipe/dolos
Length of output: 3910
🏁 Script executed:
# Check git history for when this fallback was added and any commit messages
git log --oneline -p -- src/adapters/mod.rs | grep -A 20 "get_utxos" | head -100
# Look for any documentation or design notes about LedgerContext
find . -name "*.md" -o -name "*.txt" | xargs grep -l "LedgerContext\|get_utxos\|historical.*utxo" 2>/dev/null | head -5
# Check if StateStore.get_utxos is designed to handle spent outputs (i.e., does it return None for spent?)
rg -B5 -A15 'fn get_utxos.*StateStore|impl.*StateStore.*{' crates/core/src/state.rs | head -80
# Look for comments or documentation explaining StateStore behavior
rg -B5 -A10 'pub trait StateStore|// StateStore|StateStore.*current' crates/core/src/state.rsRepository: txpipe/dolos
Length of output: 1875
🏁 Script executed:
# Search for StateStore trait definition and documentation
rg -A30 'pub trait StateStore' crates/core/src/state.rs
# Look for comments in the StateStore implementation files
rg 'StateStore.*current|current.*state|unspent|spent' crates/core/src/state.rs -B2 -A2
# Check AGENTS.md or other docs that were mentioned in learnings
cat AGENTS.md 2>/dev/null | grep -A10 -B10 "StateStore\|current.*state\|historical"
# Look for whether get_utxos is expected to return spent outputs or only current state
rg -B5 -A5 'spent|unspent' src/adapters/mod.rs crates/cardano/src/estart/loading.rsRepository: txpipe/dolos
Length of output: 10288
Confirm: this fallback should not return historical/spent outputs.
The fallback mechanism in DomainAdapter::get_utxos violates the documented storage semantics. Per AGENTS.md, StateStore represents "current ledger state (the 'world view')" while ArchiveStore is strictly "historical block storage". The fallback reconstructs missing outputs directly from ArchiveStore without validating whether they remain in the current unspent set—a call site in estart/loading.rs explicitly expects "outputs that are still unspent" (line 70).
The separate get_historical_utxos method confirms this is an intentional API boundary. Either:
- Remove the
ArchiveStorefallback entirely (missing outputs stay missing), or - Add validation that reconstructed outputs are still present in the current
StateStorebefore returning them
The fallback as currently written allows spent outputs to leak into the current-state API.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/adapters/mod.rs` around lines 198 - 235, DomainAdapter::get_utxos
currently reconstructs missing outputs from ArchiveStore (via
indexes().slot_by_tx_hash, archive().get_block_by_slot, MultiEraBlock::decode)
and returns them into the current-state API; change this so the fallback does
not expose historical/spent outputs: either remove the ArchiveStore
reconstruction entirely (leave missing refs absent) or, if you keep
reconstruction, validate each reconstructed output against the current
StateStore (use StateStore APIs / the same lookup used for present UTXOs) to
ensure the output is still unspent before inserting into result; update
get_utxos accordingly and keep get_historical_utxos as the sole path for
historical-only reads.
| let Ok(Some(slot)) = self.indexes().slot_by_tx_hash(&tx_hash_bytes) else { | ||
| continue; | ||
| }; | ||
| let Ok(Some(block_bytes)) = self.archive().get_block_by_slot(&slot) else { | ||
| continue; | ||
| }; | ||
| let Ok(block) = MultiEraBlock::decode(&block_bytes) else { | ||
| continue; |
There was a problem hiding this comment.
Don’t collapse archive/index failures into “not found”.
This fallback currently returns a partial Some(result) when the index lookup, archive read, or block decode errors out. That makes backend corruption/transient outages indistinguishable from a genuine missing ref.
Proposed fix
for (tx_hash_bytes, txo_refs) in by_tx {
- let Ok(Some(slot)) = self.indexes().slot_by_tx_hash(&tx_hash_bytes) else {
- continue;
- };
- let Ok(Some(block_bytes)) = self.archive().get_block_by_slot(&slot) else {
- continue;
- };
- let Ok(block) = MultiEraBlock::decode(&block_bytes) else {
- continue;
- };
+ let Some(slot) = self.indexes().slot_by_tx_hash(&tx_hash_bytes).ok()? else {
+ continue;
+ };
+ let Some(block_bytes) = self.archive().get_block_by_slot(&slot).ok()? else {
+ continue;
+ };
+ let block = MultiEraBlock::decode(&block_bytes).ok()?;🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/adapters/mod.rs` around lines 209 - 216, The current pattern using let
Ok(Some(...)) = self.indexes().slot_by_tx_hash(&tx_hash_bytes) else { continue }
(and similarly for self.archive().get_block_by_slot and MultiEraBlock::decode)
collapses Err into “not found”; instead match each call explicitly: handle
Ok(Some(...)) as before, treat Ok(None) as a genuine not-found (continue), and
propagate or return Err for any Err results so archive/index/decoding failures
are not swallowed. Update the call sites (self.indexes().slot_by_tx_hash,
self.archive().get_block_by_slot, MultiEraBlock::decode) to use match or ? to
return errors, and adjust the enclosing function’s return type if needed to
return Result<Option<...>, E> rather than silently continuing on Err.
This pull request enhances the
LedgerContextimplementation forDomainAdapterby improving the retrieval of UTXO (Unspent Transaction Output) data. The main improvement is a fallback mechanism that attempts to fetch missing UTXOs directly from archived blocks if they are not found in the state store. This makes UTXO lookups more robust and comprehensive.Key improvements to UTXO retrieval:
Supporting changes:
HashMap,MultiEraBlock, and trait imports forArchiveStoreandIndexStoreto support the new logic.Summary by CodeRabbit
Bug Fixes
Chores