Skip to content

Add fuzz targets for package and project coverage#3015

Merged
huitseeker merged 2 commits intonextfrom
huitseeker/add-project-fuzzing
Apr 23, 2026
Merged

Add fuzz targets for package and project coverage#3015
huitseeker merged 2 commits intonextfrom
huitseeker/add-project-fuzzing

Conversation

@huitseeker
Copy link
Copy Markdown
Collaborator

Add cargo-fuzz coverage for package semantic deserialization and Miden project manifests, loading, and assembly.

The package semantic target starts from binary Package deserialization, then exercises debug-section decoding, runtime dependency checks, embedded kernel package decoding, and program/kernel conversion paths.

The TOML target exercises MidenProject::parse and direct AST TOML parsing. The loader target writes fuzz input into temporary project trees and exercises Package::load, Project::load, and load_project_reference.

The assembly target builds tiny temporary projects and exercises ProjectAssembler for library, executable, kernel, and kernel-linked executable targets.

Add CI matrix entries and seed corpora for the new targets.

@huitseeker huitseeker force-pushed the huitseeker/add-project-fuzzing branch from bc83a2a to 3b58d14 Compare April 17, 2026 15:39
@huitseeker huitseeker marked this pull request as ready for review April 17, 2026 15:39
Copy link
Copy Markdown
Collaborator

@bitwalker bitwalker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not as familiar with the ins and outs of cargo fuzz, but this looks fine to me. I think the main questions I have are:

  1. When the workflow fails, do we have a seed value that we can pass to cargo fuzz locally to run the exact same fuzz inputs? I assume yes, but not super clear on where that would appear (presumably in the log output of the GitHub Action job).
  2. This necessarily only covers assembly of MASM projects, not projects authored in some other language, e.g. Rust. It's probably impractical to add fuzz targets for project assembly where the sources are provided manually, rather than read from disk, but without coverage there, we may miss things, or we may make changes that break stuff for such projects. I guess this isn't so much a question, but an issue of how best to ensure we don't end up over-indexing on MASM source projects, when those are going to be much less common in practice.

Marking this approved, as I think it is mergeable as-is, but I just wanted to mention the points above as part of my review.

Add cargo-fuzz coverage for package semantic deserialization and Miden project manifests, loading, and assembly.

The package semantic target starts from binary Package deserialization, then exercises debug-section decoding, runtime dependency checks, embedded kernel package decoding, and program/kernel conversion paths.

The TOML target exercises MidenProject::parse and direct AST TOML parsing. The loader target writes fuzz input into temporary project trees and exercises Package::load, Project::load, and load_project_reference.

The assembly target builds tiny temporary projects and exercises ProjectAssembler for library, executable, kernel, and kernel-linked executable targets.

Add CI matrix entries and seed corpora for the new targets.
@huitseeker huitseeker force-pushed the huitseeker/add-project-fuzzing branch from 3b58d14 to 3f86e44 Compare April 22, 2026 12:06
@huitseeker
Copy link
Copy Markdown
Collaborator Author

huitseeker commented Apr 22, 2026

For failures, libFuzzer should print the crashing input path and usually a base64 form in the job log. With this workflow we run the fuzz binary directly, so the crash file is not uploaded today (it's on the CI machine). I think that is enough to reproduce from logs, but I agree it would be nicer to persist crash-/timeout- files as GitHub Action artifacts on failure.

I can add that in a follow-up, or in this PR if you prefer. But please note that fuzzer runs are meant to be done on a beefy machine for a while, what we're doing in CI is really non-regression under a tight computational budget. If one finds something broken in such a constrained run, it's usually trivial to reproduce.

On project coverage, I agree this target is biased toward MASM source projects. I see this PR as covering the miden-vm surfaces we own directly: package decoding, project manifest/load paths, and MASM project assembly from disk.

  • The later is a very secondary goal, this is meant to be mostly a coverage of (de)serialization.
  • It is not meant to cover Rust-to-MASM or other frontend behavior, where the benefits of fuzzing are a bit less clear: the inputs aren't unstructured bytes, and something like proptest makes a ton more sense. Those should probably live in the compiler repo (or as separate integration fuzzing using compiler-emitted MASM/package seeds).

The package semantic target should still catch many artifact-level issues after frontend output exists, but it will not replace frontend-specific fuzzing.

@bitwalker
Copy link
Copy Markdown
Collaborator

Thanks! I think once this is rebased, it's good to merge unless @bobbinth has any questions/concerns

@huitseeker huitseeker merged commit 918342c into next Apr 23, 2026
23 checks passed
@huitseeker huitseeker deleted the huitseeker/add-project-fuzzing branch April 23, 2026 16:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants