pulse: LCM-merge stream-axis dims + slope-based per-pulse for Range #2202
Open
JulienBalianSonos wants to merge 2 commits into
kali
requested changes
May 11, 2026
- Rename TDim::pulse_lcm -> lcm. The function is a pure integer LCM (with overflow guard and None for symbolic / non-positive operands); the pulse-meet-point framing belongs at the call site, not on the math primitive. Updates the call in stream_axis_lcm and the lcm_basic unit test.
- Drop the module-level doc block from pulse/src/lib.rs. The merge problem isn't the right anchor for a pulsification overview, and a proper overview is bigger than this PR's scope -- left for a separate doc pass.
- Move the stream-axis merge-semantics doc onto stream_axis_lcm in pulse/src/model.rs, where the mechanism actually lives.
- Refactor stream_axis_lcm to try_fold: removes the Vec allocation and the mutable accumulator; same short-circuit semantics.

Tests still green: tract-data 133, tract-pulse 34, core-proptest-pulse 55.
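For readers following the review, the shape of an overflow-guarded integer LCM folded with try_fold can be sketched as below. This is a minimal illustration, not the code in this PR: it works on plain `i64` instead of `TDim`, uses `Option<i64>` with `None` standing in for a symbolic dim, and the two function bodies are assumptions that only borrow the names `lcm` and `stream_axis_lcm` from the commit message.

```rust
/// Least common multiple of two strictly positive integers.
/// Returns `None` for non-positive operands or if the result overflows `i64`.
/// (Illustrative stand-in; the real helper operates on `TDim`.)
fn lcm(a: i64, b: i64) -> Option<i64> {
    if a <= 0 || b <= 0 {
        return None;
    }
    // Euclid's algorithm for the GCD; both operands are positive here.
    let (mut x, mut y) = (a, b);
    while y != 0 {
        let r = x % y;
        x = y;
        y = r;
    }
    // Divide before multiplying so the intermediate stays small,
    // then let `checked_mul` report overflow as `None`.
    (a / x).checked_mul(b)
}

/// Fold per-input stream dims into a single LCM without allocating.
/// A `None` item models a symbolic dim and short-circuits the whole fold.
/// An empty input yields `Some(1)`, the identity for LCM.
fn stream_axis_lcm<I>(dims: I) -> Option<i64>
where
    I: IntoIterator<Item = Option<i64>>,
{
    dims.into_iter().try_fold(1i64, |acc, d| lcm(acc, d?))
}
```

With the stride and kernel from the PR description, `stream_axis_lcm(vec![Some(16), Some(32)])` gives `Some(32)`, while any `None` input makes the result `None` so a caller can fall back to its previous behavior.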
When two parallel pulse paths converge at an elementwise op, the merged tensor's stream-axis dim is the Least Common Multiple (LCM) of the inputs' per-pulse stream dims, not the NumPy-style broadcast. Two semantics get conflated otherwise:
- Broadcast. Equal, or one is 1.
- LCM. `K_a` and `K_b` are compatible at any pulse that is a multiple of `LCM(K_a, K_b)`.

`PulseWrappingOp::pulsed_output_facts` was using the typed op's `output_facts` for shape merging, which falls through to `multi_broadcast` and produces `Broadcast([K_a, K_b])` on the stream axis when paths produce different per-pulse sizes (e.g. ConvTranspose with kernel > stride at steady-state vs initial-state phases). Downstream pulse-divisibility checks then bail.

Changes
- `TDim::pulse_lcm` helper (pure-integer LCM; returns `None` for symbolic, caller falls back).
- `PulseWrappingOp::pulsed_output_facts` post-processes typed `output_facts`: when the stream-axis dim is a `Broadcast`, replace with LCM of all stream-bearing inputs' stream dims. Non-stream axes keep `Broadcast` semantics (correct for runtime shape compatibility).
- `Range::pulsify` (`pulse/src/ops/array/range.rs`): replace `stream_dim.substitute(symbol, pulse).to_usize()` (full evaluated length, includes constant tail from upstream overlap-add) with slope-based `guess_slope(symbol).0 × pulse_int` (steady-state increment per pulse). The substitute form was forcing per-pulse to the kernel-state size when downstream paths were producing the steady-state size, surfacing exactly as the `Broadcast(stride, kernel)` we were merging.
- `TDim::guess_slope` made `pub` (was `pub(super)`) so pulse op-pulsifiers outside `tract-data` can use it.

Module-level doc
`pulse/src/lib.rs` now documents the stream-axis vs non-stream-axis merge contract so future maintainers don't reintroduce the conflation.

Surfaced by
Pocket-TTS Mimi audio decode (`upsample[stride=16, kernel=32] → linear → view → unbind → arange-based mask → SDPA → linear → conv`). Pre-fix: failed at `Pulsification requires pulse Broadcast(16, 32) to be a stride (1) multiple`. Post-fix: pulsifies cleanly through optimize at pulse=1, 2, 4, 8.

Depends on
#2201 (`tdim: PartialEq second-chance`). Needed for the LayerNorm-after-ConvTranspose case where the GCD-factoring rewrite produces algebraically-equal-but-structurally-different canonical forms.

Tests
- `pulse_lcm_basic` in `tract-data`.
- `test_pulse_meet_with_arange_branch_types_through` in `tract-pulse`: minimal repro of the upsample-then-arange-mask pattern; pulsifies cleanly post-fix.
- `core-proptest-pulse` tests green (no regressions in existing pad+conv, deconv, conv+conv, etc. proptests).

Known limitations / follow-ups
- `pulse_lcm` returns `None` for symbolic operands; callers fall back to existing `Broadcast` behavior (still safe). A symbolic LCM is worth following up.
- […] `PulseWrappingOp`; they should follow the same slope-based convention. This PR fixes `Range`; quick audit of others recommended.
- `Deconv` pulse runtime semantics (output diverges at chunk boundaries for kernel > stride). Exposed by `tract compare --stream` once the type-level fix in this PR is in. Filed as #2203.
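To make the `Range::pulsify` change above concrete, here is a small sketch of why the evaluated length and the slope-based increment disagree. It assumes the upstream stream dim is affine in the stream symbol, `slope * T + offset`, and that an unpadded transposed convolution produces `(T - 1) * stride + kernel` samples; `AffineDim`, `eval`, `per_pulse` and `conv_transpose_len` are hypothetical names for illustration and are not part of tract.

```rust
/// Hypothetical stand-in for a stream dim of the form `slope * T + offset`.
#[derive(Clone, Copy, Debug, PartialEq)]
struct AffineDim {
    slope: i64,
    offset: i64,
}

impl AffineDim {
    /// Full evaluated length at `T = t`, including the constant tail.
    /// This mirrors what substituting the pulse for the symbol computes.
    fn eval(self, t: i64) -> i64 {
        self.slope * t + self.offset
    }

    /// Steady-state increment contributed by `pulse` new input frames.
    /// This mirrors the slope-times-pulse form.
    fn per_pulse(self, pulse: i64) -> i64 {
        self.slope * pulse
    }
}

/// Output length of an unpadded transposed convolution, as an affine dim:
/// `(T - 1) * stride + kernel == stride * T + (kernel - stride)`.
/// (Assumed formula for illustration: no padding, dilation 1.)
fn conv_transpose_len(stride: i64, kernel: i64) -> AffineDim {
    AffineDim { slope: stride, offset: kernel - stride }
}
```

For `stride=16, kernel=32` at pulse 1, `eval(1)` is 32 (the kernel-state size) while `per_pulse(1)` is 16 (the steady-state size), which under these assumptions matches the `Broadcast(16, 32)` reported in the pre-fix error.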