fix: improve debug var loc tracking#2955
Conversation
|
cc @bitwalker |
| debug_var_mappings.sort_by_key(|(node_id, _)| *node_id); | ||
| // Keep only the first registration per node_id (duplicates come from deduplication). | ||
| let mut seen_debug_var_node_ids = BTreeSet::new(); | ||
| debug_var_mappings.retain(|(node_id, _)| seen_debug_var_node_ids.insert(*node_id)); |
There was a problem hiding this comment.
I think this changes semantics for deduplicated blocks, not just the crash behavior. add_debug_var treats debug vars as per-occurrence metadata, but this path now keeps only the first debug-var registration for a MastNodeId. Since debug vars are not part of the block fingerprint, two identical blocks from different source scopes can legitimately have different live-variable mappings. After this change they no longer crash, but the debugger can show the first block's vars for every deduplicated copy.
There was a problem hiding this comment.
@huitseeker thanks for the comment!
Hmm. Yes. A proper fix would require either (a) including debug var info in the block fingerprint to prevent dedup when vars differ, or (b) supporting multiple debug var registrations per node in the CSR. Please let me know what you think.
There was a problem hiding this comment.
Or a third/(c) one can be a fix that merges debug vars from all deduplicated copies instead of keeping only the first. In ensure_block, the is_new guard can be removed for debug vars (kept for asm_ops), so all registrations are collected. In build(), mappings for the same MastNodeId would be merged into a single vector rather than discarded. This way the debugger sees variables from all source scopes that map to the deduplicated block.
There was a problem hiding this comment.
Yeah, so rephrasing: either include debug-var metadata in the key when debug info is enabled, skip block deduplication when debug vars are present, or store debug-var mappings per block occurrence rather than per deduplicated MastNodeId. In this case, I'd just use debug vars in the fingerprint, I think. Simple, and lands the proper semantics under all branches.
|
@huitseeker please let me know if there is any other comment. |
| let debug_var_mappings = | ||
| deduplicate_debug_var_mappings(core::mem::take(&mut self.pending_debug_var_mappings)); | ||
|
|
||
| for (node_id, debug_vars) in debug_var_mappings { |
There was a problem hiding this comment.
Even with the cleaner design where debug vars stay external and only contribute to the dedup helper, I think this build path still needs one more step after registration: remove_nodes() has to remap op_debug_var_storage the same way it already remaps asm-op storage.
So I would separate the two concerns:
- dedup sensitivity: make the fingerprint helper consult external debug-var mappings
- final forest correctness: remap the debug-var CSR when merged nodes are compacted away
That keeps the metadata external, but still fixes the post-compaction lookup bug. We should probably talk about this offline as well.
There was a problem hiding this comment.
I see, thank you! I agree that this is the approach! Will try to handle it within this PR.
Let me check how to reflect this to debugger and compiler as well.
| /// | ||
| /// Debug vars are included in the block fingerprint, so blocks with identical operations | ||
| /// but different debug variables will not be deduplicated. | ||
| pub fn with_debug_vars(mut self, debug_vars: Vec<(usize, DebugVarId)>) -> Self { |
There was a problem hiding this comment.
I think we may not want debug_vars to live on BasicBlockNodeBuilder at all.
The thing we need for dedup is not the serialized BasicBlockNode, it is the dedup key that ensure_node_exists() uses before the node is inserted. Those are different concerns. One way to keep them separate would be:
- Keep the actual debug-var storage external, the same way decorators/debug info already live in
debug_info. - In
ensure_block, compute the dedup fingerprint from the normal block data plus a separate debug-var contribution passed alongside the block, instead of storing that contribution on the builder itself. - For already-inserted nodes, when we need to recompute the cached dedup fingerprint after
append_before_enter/append_after_exit, read the debug-var mappings back fromdebug_infoand feed them into the same fingerprint helper.
That avoids making BasicBlockNode::to_builder() responsible for round-tripping debug vars just so dedup stays correct, and it keeps the storage model aligned with the rest of the debug metadata. The builder would stay about block structure, while debug-var mappings remain external metadata that still participates in dedup.
| /// Note: AssemblyOps are only registered for newly created nodes. If a duplicate node already | ||
| /// exists in the forest (deduplication), the asm_ops are ignored since the existing node | ||
| /// already has its metadata registered. | ||
| /// The `debug_vars` parameter contains debug variable metadata for operations in this block. |
There was a problem hiding this comment.
If the intent here is only to make debug vars affect dedup, in line with the above comment, I think there is a cleaner shape than storing them on BasicBlockNodeBuilder.
A possible split is:
- Keep
pending_debug_var_mappingsas the source of truth for the per-node CSR registration path. - Add a helper that computes the basic-block dedup fingerprint from
(operations, decorators, before/after decorators, assert data, debug-var mappings). - Call that helper from
ensure_blockusing the incomingdebug_vars, without moving those mappings onto the block builder itself. - When a block already exists and later gets rebuilt through
append_before_enter/append_after_exit, recompute its cached dedup fingerprint by reading the node's debug-var mappings back out ofdebug_infoand feeding them into the same helper.
That keeps the debug-var metadata external, like the rest of the debug-info storage, while still making dedup sensitive to it. It also avoids having to thread debug vars through to_builder() just to preserve fingerprint semantics.
|
@huitseeker I have addressed the comments:
|
huitseeker
left a comment
There was a problem hiding this comment.
Thanks for the change! I appreciate that on top of the DebugVarLocation::FrameBase/DebugVarInfo, you're having to fix a real assembler/debug-info bug around deduplicated blocks.
You may want to read what #2958 is doing as a side quest. Our CONTRIBUTING.md mentions some contribs may end up being rolled up in our PRs.
I think this PR is aiming at the right semantics for deduped debug vars, but the current diff appears to miss at least one path: merge_basic_blocks() forwards pending asm-op mappings but not pending debug-var mappings in mast_forest_builder.rs.
| // Blocks with different debug vars have different augmented fingerprints, | ||
| // so deduplication only occurs when vars are truly identical. | ||
| if is_new && !debug_vars.is_empty() { | ||
| self.pending_debug_var_mappings.push((node_id, debug_vars)); |
There was a problem hiding this comment.
This pushes debug vars onto the original block id, but merge_basic_blocks() only carries pending_asm_op_mappings into the merged block. If a mergeable block has debug vars, the old node is removed later and remove_nodes() drops that debug-var row instead of moving it to the merged node. A small test that merges two blocks with push_debug_var() would catch this.
There was a problem hiding this comment.
I see. It now handles both asm_ops and debug_vars. Thanks a lot!
huitseeker
left a comment
There was a problem hiding this comment.
Thanks! This is getting really close now
| // Re-augment the fingerprint with this node's debug vars so the dedup key | ||
| // stays consistent with what ensure_block() originally computed. | ||
| let debug_vars_data = | ||
| serialize_debug_vars_for_node(&self.pending_debug_var_mappings, node_id); |
There was a problem hiding this comment.
I think this still leaves one dedup path open. append_before_enter now re-augments the fingerprint for an existing node, but the repeat path in assembler.rs still clones a node with .to_builder(...).with_before_enter(...) and sends it through generic ensure_node().
For a repeated basic block that carries debug vars, that route still fingerprints only the block/decorator structure, so it can alias a same-op/same-decorator block with different debug vars.
A small repro is:
let block = builder.ensure_block(vec![Operation::Add], Vec::new(), vec![], vec![(0, lhs_var)], vec![], vec![])?;
let rebuilt = builder[block].clone().to_builder(builder.mast_forest()).with_before_enter(vec![before_enter]);
let node = builder.ensure_node(rebuilt)?;After build(), debug_vars_for_node(node) is empty. But the direct path ensure_block(..., vec![(0, lhs_var)], vec![before_enter], ...) keeps the debug var.
I think the fix is to make this rebuilt basic-block path use the same debug-var-aware fingerprint and pending-registration flow as ensure_block, instead of only handling it in append_before_enter / append_after_exit. And perhaps adding a regression test that exercises repeat with debug vars.
There was a problem hiding this comment.
wow, nice catch!!! yes, it was not handled.
huitseeker
left a comment
There was a problem hiding this comment.
Besides the issue I pointed to inline, there's a few more gaps. Two nodes with the same ops but different debug vars must not collapse to one node. That rule applies to any node type that can carry debug vars. We should include serialized debug-var data in the dedup key whenever a node has debug vars.
Ideally, I'd see in a test that two basic blocks with the same ops but different vars stay distinct after merge, and that a non-basic-block node with different vars also stays distinct (or better, that a non-basic block node with debug vars is not representable, but that may be much to ask).
Finally, two things we should want to tackle in a posterior PR (because this isn't strictly limited to debug vars):
- metadata from an external placeholder root must not override metadata from the concrete root it points to: If the concrete root has asm-op or debug-var metadata, that concrete metadata should win. Procedure names need the same rule, except that a placeholder name should stay as fallback when the concrete root has no name.
- determinism. If two real roots give different names for the same digest, the result must not depend on merge order. A test should show that merging the same inputs in both orders picks the same final name.
| merged_debug_vars: &mut Vec<(usize, miden_core::mast::DebugVarId)>, | ||
| ) { | ||
| let (matched, rest): (Vec<_>, Vec<_>) = | ||
| core::mem::take(&mut self.pending_debug_var_mappings) |
There was a problem hiding this comment.
This still looks incomplete.
This helper still removes source mappings too early. A block can be merged into a larger block and still stay alive as a root or as a shared node, and in that case the source block still needs its own asm-op and debug-var metadata. The fix is to read the pending mappings for the merged block here, but leave the source mappings in place until the source node is actually removed.
A test should cover that when a small root block is folded into a larger block, the small root still remains in the forest, and both nodes keep the right metadata.
Another gap is in the merge path. After merge_roots(), the merged forest still needs procedure_names, asm_op_storage, and op_debug_var_storage rebuilt from the remapped node ids. The same rule needs to hold for compact(). We should test merging two forests with procedure names, asm-op locations, and debug vars keeps that metadata on the final nodes and that compacting such a forest keeps the metadata matched to the compacted nodes.
Hmm got it. I will try to come up with more tests for various cases and to see if something else is not caught with this. Thanks a lot for the help here! |
Thanks a lot once again @huitseeker! I tried to add more tests and to cover them all in the code. |
| } | ||
|
|
||
| /// Serializes debug variable mappings into bytes for fingerprint augmentation. | ||
| fn serialize_debug_vars(debug_vars: &[(usize, miden_core::mast::DebugVarId)]) -> Vec<u8> { |
There was a problem hiding this comment.
I could reproduce an issue locally with a very small test: two blocks with the same Operation::Add and the same DebugVarInfo::new("x", DebugVarLocation::Stack(0)), but different allocated DebugVarIds, do not dedup. My test fails with left: MastNodeId(0) and right: MastNodeId(1).
The root cause looks to be that the dedup bytes are built from the allocated ID here:
for (op_idx, var_id) in debug_vars {
data.extend_from_slice(&op_idx.to_le_bytes());
data.extend_from_slice(&var_id.as_u32().to_le_bytes());
}That makes builder-time dedup depend on allocation order instead of the underlying DebugVarInfo payload. merge() and compact() already compare resolved debug-var content, so the builder and merger now disagree about what counts as the same block.
I think the fix is to serialize (op_idx, DebugVarInfo) content here instead of (op_idx, DebugVarId), then add a regression test for same payload plus different ID still deduping to one node. We've discussed this direction before.
There was a problem hiding this comment.
I created some tests locally that I did not include into PR (from compiler level) and it seems that I have not triggered these cases at all properly, since all seemed well handled for me.
Yes, it seems that we should go with (op_idx, DebugVarId) serialisation.
|
|
||
| // Augment with the source node's debug vars so same-ops/different-vars | ||
| // blocks from different forests are not collapsed. | ||
| let debug_var_data = |
There was a problem hiding this comment.
I could reproduce a merge-time source-mapping alias here with two forests that have the same block ops and the same debug-var payload, but different AssemblyOp contexts. This fails because forest B's mapped root reads back ctx_a instead of ctx_b.
The gap seems to be that the dedup key is augmented only with debug vars:
let debug_var_data =
serialize_debug_var_content_for_node(original_forests[forest_idx], merging_id);
let node_fingerprint = base_fingerprint.augment_with_data(&debug_var_data);but later the asm-op table is copied with a first-wins rule:
if let alloc::collections::btree_map::Entry::Vacant(e) = asm_entries.entry(new_id) {
let ops = forest.debug_info.asm_ops_for_node(old_id);So two nodes can collapse even when their source mappings differ, and the merged node keeps only the first forest's diagnostics metadata. I think you either need asm-op content in the dedup key too, or you need to skip dedup when asm-op metadata differs. A regression test with two identical blocks and different AssemblyOp contexts would lock this down.
|
Automated check (CONTRIBUTING.md) Recommendations:
Next steps:
|
| // Procedure names are keyed by digest, so no node-ID remapping is needed here. | ||
| // Note: later inserts currently overwrite earlier names for the same digest. | ||
| for forest in forests.iter() { | ||
| self.mast_forest.debug_info.extend_procedure_names( |
There was a problem hiding this comment.
This merge can rename a deduped procedure (see #2982). extend_procedure_names is keyed only by digest, so the later forest overwrites the earlier alias here.
#[test]
fn repro_merge_procedure_names_last_name_wins() {
let mut forest_a = MastForest::new();
let block_a = block_foo().add_to_forest(&mut forest_a).unwrap();
forest_a.make_root(block_a);
let digest = forest_a[block_a].digest();
forest_a.insert_procedure_name(digest, Arc::from("alias_a"));
let mut forest_b = MastForest::new();
let block_b = block_foo().add_to_forest(&mut forest_b).unwrap();
forest_b.make_root(block_b);
assert_eq!(forest_b[block_b].digest(), digest);
forest_b.insert_procedure_name(digest, Arc::from("alias_b"));
let (merged, root_map) = MastForest::merge([&forest_a, &forest_b]).unwrap();
let new_a = root_map.map_root(0, &block_a).unwrap();
let new_b = root_map.map_root(1, &block_b).unwrap();
assert_eq!(new_a, new_b);
assert_eq!(merged.procedure_name(&digest), Some("alias_b"));
}First name wins, or keeping all aliases, would avoid the rename.
| let debug_vars_data = | ||
| serialize_debug_vars_from_forest_node(&self.statically_linked_mast, source_node_id); | ||
| let dedup_fingerprint = base_fingerprint | ||
| .augment_with_data(&asm_ops_data) |
There was a problem hiding this comment.
This change lets two same-digest roots stay distinct by metadata, but the static-link path still picks the source root by digest only. That makes the second alias copy the first alias's metadata.
#[test]
fn repro_statically_linked_same_digest_root_uses_first_alias_metadata() {
let mut source_builder = MastForestBuilder::new(&[]).unwrap();
let alias_a = source_builder
.ensure_block(
vec![Operation::Add],
Vec::new(),
vec![(0, AssemblyOp::new(None, "alias_a".into(), 1, "add".into()))],
vec![],
vec![],
vec![],
)
.unwrap();
let alias_b = source_builder
.ensure_block(
vec![Operation::Add],
Vec::new(),
vec![(0, AssemblyOp::new(None, "alias_b".into(), 1, "add".into()))],
vec![],
vec![],
vec![],
)
.unwrap();
source_builder.mast_forest.make_root(alias_a);
source_builder.mast_forest.make_root(alias_b);
let (static_forest, _remapping) = source_builder.build();
assert_eq!(static_forest[alias_a].digest(), static_forest[alias_b].digest());
let mut exact_builder = MastForestBuilder::new([&static_forest]).unwrap();
let exact_alias_b = {
let node = exact_builder.statically_linked_mast[alias_b].clone();
let builder = exact_builder.build_with_remapped_ids(alias_b, node).unwrap();
exact_builder
.ensure_node_from_statically_linked_source(builder, alias_b)
.unwrap()
};
let (exact_forest, _) = exact_builder.build();
let mut digest_builder = MastForestBuilder::new([&static_forest]).unwrap();
let linked_alias_b = digest_builder
.ensure_external_link(static_forest[alias_b].digest())
.unwrap();
let (linked_forest, _) = digest_builder.build();
assert_eq!(
exact_forest.debug_info().first_asm_op_for_node(exact_alias_b).unwrap().context_name(),
"alias_b"
);
assert_eq!(
linked_forest.debug_info().first_asm_op_for_node(linked_alias_b).unwrap().context_name(),
"alias_b",
"linking the second same-digest root by digest should preserve that root's metadata",
);
}The static-link path needs the exact source root here, not just the digest.
There was a problem hiding this comment.
Ah, yes.... This one will be a slightly bigger change I think, let me check what should be done in the assembler linking flow for this (for example we will need source node ID from the linker etc)...
There was a problem hiding this comment.
I think that ProcedureInfo would need a source_node_id: Option<MastNodeId> field, populated during library packaging, and preserved through the linker. That's a cross-crate schema change touching assembly-syntax, mast-package, the linker, and the assembler. Should we postpone that for another PR and fill out TODO here? @huitseeker
There was a problem hiding this comment.
Yep, we should probably open an issue for this indeed.
There was a problem hiding this comment.
|
@huitseeker It looks like the WarpBuild test job is stuck, and I don't see any way to restart/cancel it, or even see the status. Is that something we have to be granted access to? |
|
@bitwalker @djolertrk The branch should pick up the updated CI with a rebase (it ran the old job on ubuntu-latest). |
Just to add that this is already rebased on top of latest |
|
@djolertrk Please run |
When `ensure_block()` deduplicates a basic block (returning an existing MastNodeId with is_new=false), calling `register_debug_vars_for_node()` on the returned ID violates the CSR structure's sequential ordering requirement, causing a crash.
- add coverage tests
0175cff to
637c51a
Compare
done. thanks! |
* fix: defer debug variable registration to build() to avoid dedup crash When `ensure_block()` deduplicates a basic block (returning an existing MastNodeId with is_new=false), calling `register_debug_vars_for_node()` on the returned ID violates the CSR structure's sequential ordering requirement, causing a crash. * debuginfo: add FrameBase variant and set_value_location to DebugVarInfo * fixup: add FrameBase to serialization round-trip test * chore: fix formatting and add changelog entry * fix: include debug vars in block fingerprint to prevent incorrect dedup * refactor: keep debug vars external to builder, fix remove_nodes remap * fix: transfer debug vars during merge_basic_blocks * fix: preserve debug vars in repeat path's cloned-block dedup * fix: keep source block metadata during merge_basic_blocks * fix: transfer debug metadata during MastForest merge and compact * fix: include debug vars in merger dedup key - add coverage tests * fix: preserve metadata in MAST dedup * fix: preserve asm metadata in builder dedup * fix: preserve metadata-sensitive assembler dedup * doc: Update documentation * test: add more tests for merger * fix: first-name-wins for procedure names, exact static-link resolution * fix: skip metadata fingerprint augmentation in stripped builds * fix: import only selected alias in ensure_external_link * chore: bump rand to fix * fix: normalize padded static-link asm-op metadata for padded blocks --------- Co-authored-by: djole <djolertrk@gmail.com>
This adds var location type to represent
DW_OP_fbreg, that we were missing in the initial implementation. It is used by both compiler and debugger later to print RUST variables values at certain program point.Also, fix a bug in registration of variables per node.
Rationale: Blocks with identical operations but different debug variables were being deduplicated, causing crash in compiler and incorrect variable scoping in the debugger
Fixes #2961