Conversation
|
Goes on top of #2443. |
e52ad69 to
da325b4
Compare
|
In order to try everything within compiler, I have used the |
bitwalker
left a comment
There was a problem hiding this comment.
I think this looks pretty good! See my comments for more specific feedback on things I think we may need to change.
One thing that I think may be worth adding to miden_mast_package as part of this, is a higher-level type or set of types which reify the encoded representation of the debug information into usable form that has been validated and can be directly queried by tools such as the debugger.
As it stands, what you get from deserializing the section is still a very raw form (i.e. references to other data in the section is not materialized/validated). That is useful as a low-level primitive, but means that any tooling that wants to actually use the information encoded in the package has to build that high-level representation themselves. While in some cases that may be necessary/desirable, I suspect that in practice, a single canonical representation defined in miden_mast_package would be sufficient for the uses we have in mind.
My only other note is that I'd like to see a bit more documentation on what the various bits of information are used for, how to put them together. This would of course be one thing solved by providing a higher-level representation of the debug info, but even then there are some bits of information that are somewhat unclear on what their usage should be (e.g. my comment on scope_depth).
Ultimately, one of the things I'd want the high-level representation to provide, would be a set of API functions that correspond to important debugger queries, without the debugger needing to know too much about the internal encoding of the debug info provided by a Package.
Thanks a lot for your comments! I have rebased this on top of the latest next branch, and addressed all comments. My comment regarding API for the debugger -- I have left a comment with a TODO marker, so we can pick it up once compiler side is ready and when we can have hands-on on the consumer/debugger side. Please do note that I have added a couple of documents in the 0xMiden/compiler#820 that describe the debug info design at both IRs level and in the final package. |
3289e5e to
193b063
Compare
|
@djolertrk I don't have access to your fork, so I can't rebase this PR for you. I switched the base branch to |
7f621e3 to
07b33cb
Compare
Done. Thanks! |
- Add DebugVarInfo and DebugVarLocation
- Add Debug info types:
- DebugInfoSection (main container)
- DebugTypeInfo - Type definitions (Primitive, Pointer, Array, Struct, Function, Unknown)
- DebugPrimitiveType - 16 primitive types (Void through Word)
- DebugFileInfo
- DebugFunctionInfo
- DebugVariableInfo
- DebugInlinedCallInfo (for Inlined call site tracking)
- debug_info/serialization.rs - Full serialization/deserialization of
the debug info section
- Delete accidentally committed ._target file, add ._* to .gitignore - Change DebugVarInfo.name from String to Arc<str> for sharing references - Change DebugVarInfo.arg_index to Option<NonZeroU32> - Change DebugVarLocation::Const to use Felt instead of u64 - Add Instruction::has_textual_representation() method for DebugVar - Update print.rs to return Document::Empty for non-textual instructions - Add documentation explaining location should only differ from AssemblyOp - Add documentation for directory_idx explaining relative path usage - Use LineNumber/ColumnNumber types in mast-package debug_info types - Document scope_depth usage with detailed examples - Update processor callback comments for DebugVar decorator - Add miden-debug-types dependency to mast-package - Simplify DebugFileInfo to use full paths only
Add a new callback mechanism for DebugVar decorators to enable debuggers to track source-level variable information during execution. This allows debuggers like miden-debug to receive notifications when variable location information changes, enabling features like displaying local variables and their values during debugging sessions.
- Changed Local(u16) to Local(i16) to store signed FMP offset directly - The offset is computed as: idx - num_locals - Updated Display format - Updated serialization to handle i16 - Address is computed as: FMP + offset
Add separate sections for types, sources, function declarations.
This fixes CI.
A few comments addressed: - Use Vec<Arc<str>> instead of Vec<String> for string tables - Add strongly-typed DebugTypeIdx newtype for type table indices - Box checksum in DebugFileInfo - Use Word type for mast_root instead of [u8; 32] - Make on_debug_var infalliblecand add #[inline(always)] - Remove explicit discriminant values from DebugPrimitiveType enum
07b33cb to
308e043
Compare
308e043 to
7dea9b5
Compare
|
@bobbinth This is ready to merge whenever you get a chance to give it a look |
bobbinth
left a comment
There was a problem hiding this comment.
Looks good! Thank you! I left some comments inline - the main one is about potentially removing the on_debug_var() call back. Basically, if possible, I'd love not to introduce it.
Other comments for the future:
- There are a lot of parallels between debug vars and assembly ops - especially once we are past the AST stage. I wonder if there is a way to unify them somehow, or at least, abstract away some common patters to reduce code duplication.
- It feels a bit weird to me that we now have some debug info in MAST forest and some debug info in the package. I guess it is necessary to keep some debug info in the MAST forest because that's what the processor works with, but it is not super clear to make how we make a decision about what goes where.
| // Re-export serialization traits for use by consumers | ||
| // (the impl blocks are in serialization.rs) |
There was a problem hiding this comment.
I'm not sure I understood this comment. AFAICT, we are not re-exporting serialization traits from here. Is this comment obsolete?
There was a problem hiding this comment.
Ah, yes, removing it.
| } | ||
|
|
||
| /// Returns the debug variable with the given ID, if it exists. | ||
| pub fn debug_var(&self, debug_var_id: DebugVarId) -> Option<&crate::operations::DebugVarInfo> { |
There was a problem hiding this comment.
nit: here and in other places in this file, instead of using absolute path crate::operations::DebugVarInfo we usually import items at the top of the file and then here it would be just DebugVarInfo.
There was a problem hiding this comment.
I did not review this file in detail, but it feels like there is some code duplication between this and other storages (e.g., AsmOp storage). I wonder if there is a good way to avoid this. Let's create an issue for this.
Also, cc @huitseeker for visibility.
| target.write_u16(bytes.len() as u16); | ||
| target.write_bytes(bytes); |
There was a problem hiding this comment.
Could we not do this as:
bytes.write_into(target);| let mut bytes = Vec::with_capacity(len); | ||
| for _ in 0..len { | ||
| bytes.push(source.read_u8()?); | ||
| } |
There was a problem hiding this comment.
Related to the above comments - we could deserialize this as a vector of bytes directly rather than using a manual loop.
In general, we should try to avoid using the pattern of reading the number of elements and then allocating a vector with this number of elements - as this could lead to a security vulnerability. Instead, we should be using ByteReader::read_many_iter() method.
There was a problem hiding this comment.
The Vec::<u8>::read_from is actually good fit here.
| /// Debug variables attached to operations in this block. | ||
| /// Each entry is (op_index, debug_var_id) similar to decorators. | ||
| debug_vars: Vec<(usize, DebugVarId)>, |
There was a problem hiding this comment.
nit: could we move this under asm_ops (i.e., under line 65 below).
| /// operation, they carry heap-allocated [`DebugVarInfo`] payloads, and they are infallible | ||
| /// (the host may not reject them). This separation keeps the hot path for `on_debug` lean | ||
| /// while still allowing the debugger to track source-level variable locations. | ||
| fn on_debug_var(&mut self, _process: &ProcessorState, _var_info: &DebugVarInfo) {} |
There was a problem hiding this comment.
Thinking about this more, it is not clear to me if we need this callback. Could we not do what @bitwalker mentioned in #2471 (comment)?
I'm assuming the debugger steps through the instructions one-by-one, and if so, it should be able to look up DebugVarInfos for a given operation using MastForest::debug_vars_for_operation() method. But maybe that's not how it works?
There was a problem hiding this comment.
But when using the FastProcessor, we still need to be notified in a way that the location has changed, so I think we do need the callback. (cc @bitwalker)
There was a problem hiding this comment.
I think the intent was to execute the FastProcessor in a mode where we move though execution step-by-step, and so, at each cycle, we should be able to get the debug var info from the current MAST forest. @bitwalker - is that how you were thinking about it?
There was a problem hiding this comment.
I talked to @bitwalker, and he confirmed that we can remove the callback. Thanks @bobbinth :)
| if self.in_debug_mode() { | ||
| for &debug_var_id in current_forest.debug_vars_for_operation(node_id, op_idx_in_block) { | ||
| if let Some(var_info) = current_forest.debug_var(debug_var_id) { | ||
| let process = &mut self.state(); | ||
| host.on_debug_var(process, var_info); | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
As mentioned in the previous comment, I'm hoping we can get rid of the on_debug_var() callback, but if not, I think we can move this loop inside the if-statement on line 119 as that one already checks if we are in debug mode.
| let mut strings = alloc::vec::Vec::with_capacity(strings_len); | ||
| for _ in 0..strings_len { | ||
| strings.push(read_string(source)?); | ||
| } |
There was a problem hiding this comment.
As mentioned in one of the other comments, here and in other places in this file, we should either try to use "native" methods to deserialize vectors or use ByteReader::read_many_iter().
There was a problem hiding this comment.
Yes, lets use read_many_iter here.
2e37aca to
17a9fe3
Compare
|
@bobbinth This is ready to merge now |
Handle
DebugVardecorator and newdebug_infosection in the final .masp.