For comparison with the numbers in #2869 for the recursive verifier (verifying a program executing in 2^20 cycles): this is a 4x improvement in the above case.
da140ac to 77773d9
let w1: AB::Expr = local.chiplets[MEMORY_WORD_ADDR_HI_COL_IDX - CHIPLETS_OFFSET].clone().into();
let w1_mul4: AB::Expr = w1.clone() * AB::Expr::from_u16(4);

let den0: AB::ExprEF = alpha.clone() + Into::<AB::ExprEF>::into(w0);
Should this add protocol-level domain separation before v_wiring can safely carry ACE wires, raw memory range-check values, and the new hasher perm-link messages together? Right now the memory side uses plain alpha + w0/w1/4*w1, ACE uses encode([clk, ctx, id, ...]), and the perm-link uses encode([0|1, h0..h11]) on the same LogUp column.
If any of those encodings were to alias, could one subsystem cancel another on the shared sum? #1614 explicitly called out adding an op-label when reusing the wiring bus, and I don't see that namespace implemented here yet.
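To make the aliasing concern concrete, here is a toy sketch in a small prime field. Everything in it is an assumption for illustration: the modulus, the `encode` shape (`alpha + sum_i beta^i * elems[i]`), the label values, and all function names are hypothetical and are not taken from the PR.

```rust
/// Toy prime modulus (2^61 - 1), a stand-in for the VM's real field.
const P: u128 = (1u128 << 61) - 1;

fn add(a: u128, b: u128) -> u128 {
    (a % P + b % P) % P
}

fn mul(a: u128, b: u128) -> u128 {
    (a % P) * (b % P) % P
}

/// Hypothetical message encoding: alpha + sum_i beta^i * elems[i].
fn encode(alpha: u128, beta: u128, elems: &[u128]) -> u128 {
    let mut acc = alpha % P;
    let mut pow = 1u128;
    for &e in elems {
        acc = add(acc, mul(pow, e));
        pow = mul(pow, beta);
    }
    acc
}

/// Hypothetical labels, one per subsystem sharing the column.
const LABEL_RANGE_CHECK: u128 = 1;
const LABEL_PERM_LINK: u128 = 2;

/// Unlabeled range-check denominator, mirroring the plain `alpha + w0` form.
fn range_check_raw(alpha: u128, value: u128) -> u128 {
    add(alpha, value)
}

/// Unlabeled perm-link denominator over [flag, h0..h11].
fn perm_link_raw(alpha: u128, beta: u128, flag: u128, state: &[u128; 12]) -> u128 {
    let mut msg = vec![flag];
    msg.extend_from_slice(state);
    encode(alpha, beta, &msg)
}

/// Labeled range-check denominator: the label takes the first coefficient.
fn range_check_labeled(alpha: u128, beta: u128, value: u128) -> u128 {
    encode(alpha, beta, &[LABEL_RANGE_CHECK, value])
}

/// Labeled perm-link denominator over [label, flag, h0..h11].
fn perm_link_labeled(alpha: u128, beta: u128, flag: u128, state: &[u128; 12]) -> u128 {
    let mut msg = vec![LABEL_PERM_LINK, flag];
    msg.extend_from_slice(state);
    encode(alpha, beta, &msg)
}
```

Under these assumptions, a raw range check of the value 1 and an unlabeled perm-link message with flag 1 and an all-zero state produce the same denominator `alpha + 1`, so they could cancel each other on a shared sum. With distinct labels in the first coefficient, the two denominators differ as polynomials in `beta` and only collide for an unlucky challenge.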
Nashtare left a comment
I would need to do another pass because this is pretty dense, but I left a couple of comments while familiarizing myself with it.
I am merging this so we can conclude #2856 and #2962 and proceed with the multi-table migration. I will comment in the referenced PRs with my take on the approaches there, but broadly: we should avoid changes that are not clear wins or do not have major perf improvements.
There are many opportunities for factoring out and centralizing computations, but it is plausible that once we have automated witness generation for the auxiliary trace, these optimizations could be handled on the backend side. It may therefore make more sense to prioritize changes that improve auditability, readability, and soundness (e.g., domain separation). The same applies to constraints, particularly auxiliary constraints, where simplifications from the unified bus architecture would create easier optimization opportunities on the backend side later.
In other words, we should prioritize reaching the multi-table migration as soon as possible. The work in the referenced PRs should take this into account. Each PR from here through the multi-table milestone should justify its existence with this goal in mind:
Incorporate the hasher chiplet redesign (#2927) into the constraint simplification branch.
The hasher now uses a 16-row packed cycle with a controller/permutation split architecture, replacing the previous 32-row cycle.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This PR bundles several closely related changes in the hasher / chiplets area:
- A new mrupdate_id column domain-separates sibling-table entries, preventing cross-operation sibling swapping.
- New w0 / w1 address-limb columns, with 16-bit range-check lookups routed through the wiring bus.
These changes all touch the chiplets trace layout, bus plumbing, and AIR structure, so landing them together keeps the transition coherent.
Why
1. Deduplicate repeated permutations
The old monolithic hasher consumed 32 rows per permutation request, even if the same input state appeared repeatedly.
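With the new design:
- each permutation request takes one compact controller (input, output) row pair,
- each unique input state takes one packed 16-row permutation cycle.
For M requests with U unique input states, the rough row cost changes from 32M to 2M + pad_to_16 + 16U.
This is a clear win whenever states repeat (Merkle workloads, identical MAST roots, ...).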
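The cost comparison above can be sketched as a small calculator. This is an illustration only: the function names are hypothetical, the 12-element state width is taken from the `h0..h11` message mentioned in review, and `pad_to_16` is assumed to mean padding the controller region up to the next multiple of 16 rows.

```rust
use std::collections::HashSet;

/// Rows used by the old monolithic hasher: 32 per permutation request.
fn old_rows(m: u64) -> u64 {
    32 * m
}

/// Rows used by the new layout: 2M + pad_to_16 + 16U.
/// Assumption: pad_to_16 rounds the 2M controller rows up to a multiple of 16.
fn new_rows(m: u64, u: u64) -> u64 {
    let controller = 2 * m;
    let pad = (16 - controller % 16) % 16;
    controller + pad + 16 * u
}

/// Counts total requests (M) and unique input states (U).
fn count_requests(states: &[[u64; 12]]) -> (u64, u64) {
    let unique: HashSet<&[u64; 12]> = states.iter().collect();
    (states.len() as u64, unique.len() as u64)
}
```

For example, 100 requests over 10 unique input states cost 3200 rows under the old layout and 368 rows under this reading of the new one (200 controller rows, 8 rows of padding, 160 permutation rows).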
2. Fix sibling-table soundness
The old sibling-table encoding was vulnerable to cross-operation sibling reuse. Adding mrupdate_id domain-separates entries so sibling-table balance is enforced per MRUPDATE instance, not globally across unrelated operations.
3. Add memory address decomposition checks
The memory chiplet now decomposes word addresses into two 16-bit limbs and proves the decomposition using range-check lookups. This closes an important missing piece in memory soundness while reusing the existing wiring-bus infrastructure.
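A minimal sketch of the decomposition, under two assumptions that are not confirmed by the description: that the word address is recomposed as `w0 + 2^16 * w1`, and that the `4 * w1` value seen on the memory side of the wiring bus is an extra 16-bit lookup that bounds the high limb below 2^14. The function names are hypothetical.

```rust
/// Splits a word address into (w0, w1) with word_addr = w0 + 2^16 * w1.
/// Returns None when the high limb would fail the assumed `4 * w1` range check.
fn decompose_word_addr(word_addr: u32) -> Option<(u16, u16)> {
    let w0 = (word_addr & 0xFFFF) as u16;
    let w1 = (word_addr >> 16) as u16;
    // Assumption: 4 * w1 must itself fit in 16 bits, so w1 < 2^14.
    let w1_mul4 = u32::from(w1) * 4;
    if w1_mul4 > u32::from(u16::MAX) {
        return None;
    }
    Some((w0, w1))
}

/// Recomposes the word address from its two limbs.
fn recompose_word_addr(w0: u16, w1: u16) -> u32 {
    u32::from(w0) + (u32::from(w1) << 16)
}
```

In the AIR, the limbs are witness columns and the bounds come from range-check lookups rather than from branches like the one above; the sketch only mirrors the arithmetic relation.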
Design
Hasher: two-region trace
The hasher trace is split into two contiguous regions:
- Controller (perm_seg = 0): compact input/output row pairs, one pair per permutation request.
- Permutation segment (perm_seg = 1): one packed 16-row Poseidon2 cycle per unique input state.
A LogUp permutation-link on the shared V_WIRING auxiliary column ties controller requests to the corresponding permutation cycles.
Packed 16-row Poseidon2 schedule
The 31-step Poseidon2 schedule is packed as:
- init + ext1
- ext2..ext4
- 7 × (3 packed internal rounds)
- int22 + ext5
- ext6..ext8
Packed internal rows use s0/s1/s2 as witness columns on permutation rows in order to keep constraints degree bounded. Unused witness slots are explicitly zero-constrained (out of caution), though this could be relaxed.
Column layout
Hasher: 16 -> 20
New / newly significant columns:
- mrupdate_id -- domain separator for sibling-table entries
- is_boundary -- marks first controller input / last controller output
- direction_bit -- propagated Merkle routing bit on controller rows
- perm_seg -- explicit controller vs permutation-region flag
Memory: 15 -> 17
Two new columns:
- w0
- w1
These decompose the word address into 16-bit limbs. The wiring bus carries the corresponding range-check lookups.
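As a rough illustration of how such lookups balance on a LogUp column, the sketch below sums `1 / (alpha + value)` terms in a toy prime field. It assumes the plain `alpha + w0`, `alpha + w1`, `alpha + 4 * w1` denominators discussed in review and a standard multiplicity-weighted response side; the modulus and all function names are hypothetical and do not come from the PR.

```rust
/// Toy prime modulus (2^61 - 1), a stand-in for the VM's real field.
const P: u128 = (1u128 << 61) - 1;

fn add(a: u128, b: u128) -> u128 {
    (a % P + b % P) % P
}

fn sub(a: u128, b: u128) -> u128 {
    (a % P + P - b % P) % P
}

fn mul(a: u128, b: u128) -> u128 {
    (a % P) * (b % P) % P
}

/// Modular exponentiation by squaring.
fn pow(mut base: u128, mut exp: u128) -> u128 {
    let mut acc = 1u128;
    base %= P;
    while exp > 0 {
        if exp & 1 == 1 {
            acc = mul(acc, base);
        }
        base = mul(base, base);
        exp >>= 1;
    }
    acc
}

/// Field inverse via Fermat's little theorem (input must be nonzero mod P).
fn inv(a: u128) -> u128 {
    pow(a, P - 2)
}

/// One LogUp term: 1 / (alpha + value).
fn term(alpha: u128, value: u128) -> u128 {
    inv(add(alpha, value))
}

/// Requests made by one memory row for w0, w1 and 4 * w1.
fn memory_requests(alpha: u128, w0: u16, w1: u16) -> u128 {
    let t0 = term(alpha, u128::from(w0));
    let t1 = term(alpha, u128::from(w1));
    let t2 = term(alpha, 4 * u128::from(w1));
    add(add(t0, t1), t2)
}

/// Responses from the range-check table: sum of multiplicity / (alpha + value).
fn table_responses(alpha: u128, entries: &[(u128, u128)]) -> u128 {
    entries
        .iter()
        .fold(0, |acc, &(value, mult)| add(acc, mul(mult, term(alpha, value))))
}
```

The bus balances when the requests minus the responses equal zero; a response table that is missing one of the requested values leaves a nonzero remainder for this choice of `alpha`.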
Constraints
Hasher constraints now total 100.
Constraint group breakdown
- s0, s1, s2 binary on controller rows
- perm_seg confinement, booleanity, monotonicity, cycle alignment
- is_boundary / direction_bit restricted to valid row types
- node_index rule
Trace width impact
The new main trace width is 72.
No new auxiliary columns were added: the new lookups share the existing V_WIRING column.