
add caching in hashTreeRoot #62

Open

anshalshukla wants to merge 1 commit into master from anshalshukla/merkle-cache

Conversation

@anshalshukla (Contributor)

Adds caching in the hashing path for List and BitList

  • Supports caching only when hashing with Sha256
  • The init API doesn't take a Hasher as input, so CacheType cannot be Hasher-agnostic without API changes
  • Multiple Hashers could be supported in CacheType either by updating the init API or by storing a list of CacheTypes, appended to on demand based on the Hasher passed to hashTreeRoot
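The second option (a lazily appended list of per-hasher caches) could look roughly like the following Zig sketch. CacheSet, Entry, and getOrCreate are illustrative names, and MerkleCache stands in for the PR's cache type; none of this is the actual PR API.

```zig
const std = @import("std");

const MerkleCache = struct {}; // stand-in for the PR's cache type

// One cache per hasher, keyed by @typeName so init() needs no Hasher
// parameter; a slot is appended the first time a given hasher is used.
const CacheSet = struct {
    const Entry = struct {
        hasher_name: []const u8,
        cache: MerkleCache,
    };

    entries: std.ArrayListUnmanaged(Entry) = .{},

    fn getOrCreate(self: *CacheSet, comptime Hasher: type, alloc: std.mem.Allocator) !*MerkleCache {
        const name = @typeName(Hasher);
        for (self.entries.items) |*e| {
            if (std.mem.eql(u8, e.hasher_name, name)) return &e.cache;
        }
        try self.entries.append(alloc, .{ .hasher_name = name, .cache = .{} });
        return &self.entries.items[self.entries.items.len - 1].cache;
    }
};
```

hashTreeRoot would call getOrCreate with whatever Hasher it was invoked with, so the cache set grows only as new hashers are actually used.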

@anshalshukla anshalshukla requested a review from gballet as a code owner April 28, 2026 06:14
@gballet (Collaborator) left a comment


After reviewing, I think that there is a fundamental problem with how the cached hashing has been designed:

  • There is no reason to make it Sha256-specific; it should be easy to make it agnostic
  • It should be possible not to use caching, which isn't the case here
  • imo, the cached hashing should be a wrapper around the regular root-hashing function, e.g. by introducing a TreeHasher interface with a hashTreeRoot function, which is Hasher-agnostic (the Hasher being passed as a parameter). If caching is active, the wrapper is used; otherwise the regular object is used.
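That wrapper shape could be sketched as follows in Zig; PlainTreeHasher, CachedTreeHasher, and the single-chunk body are illustrative assumptions, not the PR's actual API.

```zig
const std = @import("std");
const Sha256 = std.crypto.hash.sha2.Sha256;

const MerkleCache = struct {}; // stand-in for the PR's cache type

// Hasher-agnostic "TreeHasher": the hash function is a comptime parameter.
fn PlainTreeHasher(comptime Hasher: type) type {
    return struct {
        pub fn hashTreeRoot(_: @This(), data: []const u8, out: *[32]u8) void {
            // Regular, uncached root hashing; the degenerate single-chunk
            // case is shown so the sketch stays concrete.
            Hasher.hash(data, out, .{});
        }
    };
}

// Caching is an opt-in wrapper with the same interface, not baked in.
fn CachedTreeHasher(comptime Hasher: type) type {
    return struct {
        cache: MerkleCache = .{},
        inner: PlainTreeHasher(Hasher) = .{},

        pub fn hashTreeRoot(self: *@This(), data: []const u8, out: *[32]u8) void {
            // Consult self.cache first; on a miss, delegate to the wrapped
            // plain hasher (the cache lookup is elided in this sketch).
            self.inner.hashTreeRoot(data, out);
        }
    };
}
```

Callers that don't want caching instantiate PlainTreeHasher(Sha256) directly; nothing about either type is tied to Sha256.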

Comment thread src/merkle_cache.zig
// Tests
const Sha256 = std.crypto.hash.sha2.Sha256;
const lib = @import("./lib.zig");

Collaborator


put this at the start of the file, or just import it inside the tests that need it

Comment thread src/utils.zig
Comment on lines +36 to +38
/// Hasher-agnostic cache type. We use SHA256 as the default since that's the
/// standard SSZ hasher, but the cache is only used when hashTreeRoot is called
/// with a matching hasher.
Collaborator


This comment doesn't look correct. Either it's Sha256-specific, or it's hasher-agnostic. Looks like slop, I suggest removing it.

Comment thread src/merkle_cache.zig
const std = @import("std");
const zeros = @import("./zeros.zig");

const BYTES_PER_CHUNK = 32;
Collaborator


this is already defined in src/utils.zig

Comment thread src/merkle_cache.zig

const BYTES_PER_CHUNK = 32;
const chunk = [BYTES_PER_CHUNK]u8;
const zero_chunk: chunk = [_]u8{0} ** BYTES_PER_CHUNK;
Collaborator


same

Comment thread src/merkle_cache.zig
Comment on lines +9 to +10
/// Node 1 is the root. Node i has children 2i and 2i+1.
/// Leaves occupy indices [capacity .. 2*capacity).
Collaborator


stop the slop, it is obvious from reading the code.

Suggested change
/// Node 1 is the root. Node i has children 2i and 2i+1.
/// Leaves occupy indices [capacity .. 2*capacity).

Comment thread src/utils.zig
.int => blk: {
    const bytes_per_item = @sizeOf(Item);
    const items_per_chunk = BYTES_PER_CHUNK / bytes_per_item;
    break :blk element_index / items_per_chunk;
Collaborator


What of u384? Its @sizeOf exceeds BYTES_PER_CHUNK, so items_per_chunk comes out as 0 and the division below it is a division by zero.
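The failure mode can be shown with a small hypothetical Zig test (not part of the PR):

```zig
const std = @import("std");

test "u384 collapses items_per_chunk to zero" {
    const BYTES_PER_CHUNK = 32;
    const bytes_per_item = @sizeOf(u384); // wider than one 32-byte chunk
    const items_per_chunk = BYTES_PER_CHUNK / bytes_per_item;
    try std.testing.expectEqual(@as(usize, 0), items_per_chunk);
    // element_index / items_per_chunk would now divide by zero.
}
```

Any integer type wider than a chunk hits the same path; the code would need a separate branch that spreads one item across multiple chunks.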

Comment thread src/utils.zig
}

/// Free the Merkle cache without freeing the list itself.
/// Called by lib.hashTreeRoot to clean up caches on value copies.
Collaborator


Suggested change
/// Called by lib.hashTreeRoot to clean up caches on value copies.

Comment thread src/utils.zig
}
cache.nodes[cache.capacity + chunk_idx] = leaf;
}
// Zero out chunks beyond data
Collaborator


Suggested change
// Zero out chunks beyond data

Comment thread src/utils.zig
const end_item = @min(start_item + items_per_chunk, items.len);
for (start_item..end_item) |item_i| {
    const pos = (item_i % items_per_chunk) * bytes_per_item;
    // SSZ requires little-endian encoding
Collaborator


Suggested change
// SSZ requires little-endian encoding

Comment thread src/utils.zig
Comment on lines +289 to +293
// Composite types: each item is its own chunk (hash tree root of item)
// Composite items may contain pointers to mutable data that
// can change without going through set(), so we must always
// recompute all item hashes. We compare against cached leaves
// to only mark actually-changed nodes dirty for tree rehashing.
Collaborator


Is the cache bringing any performance benefit, then? Most of the types we want are composite types, so if they are recomputed no matter what, that's a lot of performance left on the table.
