perf(assembly): optimize call graph toposort from O(V*E) to O(V+E)#2942
Open
giwaov wants to merge 2 commits into0xMiden:nextfrom
Open
perf(assembly): optimize call graph toposort from O(V*E) to O(V+E)#2942giwaov wants to merge 2 commits into0xMiden:nextfrom
giwaov wants to merge 2 commits into0xMiden:nextfrom
Conversation
huitseeker
reviewed
Apr 1, 2026
Collaborator
huitseeker
left a comment
There was a problem hiding this comment.
Overall LGTM, though this will need a rebase.
| .any(|(n, out_edges)| !output.contains(n) || !out_edges.is_empty()); | ||
| if has_cycle { | ||
| // If not all nodes were visited, the remaining nodes participate in cycles | ||
| if output.len() != num_nodes { |
Collaborator
There was a problem hiding this comment.
Could we add a small regression test for the whole-graph cycle case with no initial roots? That was the main shape I wanted pinned down around this new cycle path.
#[test]
fn callgraph_toposort_whole_graph_cycle_without_roots() {
let graph = callgraph_cycle_without_roots();
let err = graph.toposort().expect_err(
"expected topological sort to fail when every node is blocked behind a cycle",
);
assert_eq!(err.0.into_iter().collect::<Vec<_>>(), &[A1, A2, A3]);
}
Contributor
Author
There was a problem hiding this comment.
Done. added callgraph_toposort_whole_graph_cycle_without_roots test and rebased onto next.
…xMiden#2830) Replace the O(V*E) Kahn's algorithm implementation in CallGraph with a proper O(V+E) version: - Pre-compute in-degree map instead of rescanning all edges via num_predecessors() on each edge removal - Eliminate graph clone in toposort() by tracking in-degrees rather than destructively removing edges - Eliminate graph clone in toposort_caller() using the same technique - Build reverse adjacency map in reverse_reachable() instead of scanning all edges per BFS step - Use BTreeSet for cycle detection instead of Vec::contains()
9178273 to
d77844b
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #2830
Summary
The
CallGraph::toposort()andtoposort_caller()implementations used nested loops that resulted in O(V*E) complexity for large call graphs. This PR replaces them with proper O(V+E) Kahn's algorithm.Problem
num_predecessors()scanned all nodes/edges to count predecessors for a single node, and was called inside the main Kahn's loop for every edge removal O(E) per call, O(V*E) total.toposort()cloned the entire graph just to destructively remove edges during the sort.Vec::contains()(O(n) per check).reverse_reachable()scanned all edges per BFS step O(V*E).Fix
num_predecessors().BTreeSetfor visited-node lookups in cycle detection O(log n) vs. O(n).reverse_reachable()single O(V+E) pass instead of O(V*E) scanning.Testing
All 9 existing callgraph unit tests pass unchanged. Full
miden-assemblytest suite (180 tests) passes with 0 failures.