add caching in hashTreeRoot #62
Conversation
gballet left a comment
After reviewing, I think that there is a fundamental problem with how the cached hashing has been designed:

- There is no reason to make it Sha256-specific; it should be easy to make it agnostic.
- It should be possible not to use caching, which isn't the case here.
- imo, the cached hashing should be a wrapper around the regular root-hashing function, e.g. by introducing a `TreeHasher` interface, with a `hashTreeRoot` function, which is `Hasher`-agnostic (and passed as a parameter). If caching is active, then the wrapper is used; otherwise the regular object is used.
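Concretely, the suggested wrapper could look something like the sketch below. All names here are hypothetical (`plainHashTreeRoot`, `cachedHashTreeRoot`, and `MerkleCache` are illustrative placeholders, not the actual API) — the point is only the shape: both variants take the `Hasher` as a comptime parameter and expose the same `hashTreeRoot` signature, so callers don't care whether caching is active.

```zig
// Sketch only: a Hasher-agnostic tree-hashing interface.
// The plain variant always recomputes; the caching variant wraps it.
fn TreeHasher(comptime Hasher: type) type {
    return struct {
        const Self = @This();

        // Cache-free hashing: recomputes the full tree every call.
        pub fn hashTreeRoot(self: *Self, value: anytype, out: *[32]u8) !void {
            _ = self;
            try plainHashTreeRoot(Hasher, value, out); // hypothetical helper
        }
    };
}

fn CachedTreeHasher(comptime Hasher: type) type {
    return struct {
        const Self = @This();
        cache: MerkleCache, // hypothetical cache type, not Sha256-specific

        // Same signature as the plain hasher; only dirty subtrees
        // are rehashed, the rest is served from the cache.
        pub fn hashTreeRoot(self: *Self, value: anytype, out: *[32]u8) !void {
            try cachedHashTreeRoot(Hasher, &self.cache, value, out);
        }
    };
}
```

The caller then picks at comptime: `CachedTreeHasher(Sha256)` when caching is enabled, `TreeHasher(Sha256)` otherwise — and swapping in a different hasher is a one-line change.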
```zig
// Tests
const Sha256 = std.crypto.hash.sha2.Sha256;
const lib = @import("./lib.zig");
```
put this at the start of the file, or just import it inside the tests that need it
```zig
/// Hasher-agnostic cache type. We use SHA256 as the default since that's the
/// standard SSZ hasher, but the cache is only used when hashTreeRoot is called
/// with a matching hasher.
```
This comment doesn't look correct. Either it's Sha256-specific, or it's hasher-agnostic. Looks like slop, I suggest removing it.
```zig
const std = @import("std");
const zeros = @import("./zeros.zig");

const BYTES_PER_CHUNK = 32;
```
this is already defined in src/utils.zig
```zig
const BYTES_PER_CHUNK = 32;
const chunk = [BYTES_PER_CHUNK]u8;
const zero_chunk: chunk = [_]u8{0} ** BYTES_PER_CHUNK;
```
```zig
/// Node 1 is the root. Node i has children 2i and 2i+1.
/// Leaves occupy indices [capacity .. 2*capacity).
```
stop the slop, it is obvious from reading the code.
```zig
.int => blk: {
    const bytes_per_item = @sizeOf(Item);
    const items_per_chunk = BYTES_PER_CHUNK / bytes_per_item;
    break :blk element_index / items_per_chunk;
}
```
```zig
/// Free the Merkle cache without freeing the list itself.
/// Called by lib.hashTreeRoot to clean up caches on value copies.
```
```zig
    }
    cache.nodes[cache.capacity + chunk_idx] = leaf;
}
// Zero out chunks beyond data
```
```zig
const end_item = @min(start_item + items_per_chunk, items.len);
for (start_item..end_item) |item_i| {
    const pos = (item_i % items_per_chunk) * bytes_per_item;
    // SSZ requires little-endian encoding
```
```zig
// Composite types: each item is its own chunk (hash tree root of item)
// Composite items may contain pointers to mutable data that
// can change without going through set(), so we must always
// recompute all item hashes. We compare against cached leaves
// to only mark actually-changed nodes dirty for tree rehashing.
```
Is the cache bringing any performance benefit, then? Most of the types we want are composite types, so if they are recomputed no matter what, that's a lot of performance left on the table.
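For reference, the scheme the code comment describes amounts to something like the sketch below (names hypothetical, e.g. `markDirty`): the per-item root is recomputed on every call, and the cache only saves the upper-tree rehashing when the recomputed leaf matches the cached one — which is exactly the limitation this question is about.

```zig
// Sketch: recompute each composite item's root unconditionally,
// but only mark the leaf dirty (forcing parent rehashing) when
// the recomputed hash differs from the cached one.
for (items, 0..) |item, i| {
    var leaf: [32]u8 = undefined;
    try hashTreeRoot(Hasher, item, &leaf); // always recomputed
    const node = cache.capacity + i;
    if (!std.mem.eql(u8, &cache.nodes[node], &leaf)) {
        cache.nodes[node] = leaf;
        cache.markDirty(node); // hypothetical helper
    }
}
```

So for lists of composite items, the cache skips only the O(n) internal-node hashing above unchanged leaves, not the per-item hashing itself.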
Adds caching to the hashing path for List and BitList.