Add fuzz targets for package and project coverage#3015
Conversation
bc83a2a to
3b58d14
Compare
bitwalker
left a comment
There was a problem hiding this comment.
I'm not as familiar with the ins and outs of cargo fuzz, but this looks fine to me. I think the main questions I have are:
- When the workflow fails, do we have a seed value that we can pass to
cargo fuzzlocally to run the exact same fuzz inputs? I assume yes, but not super clear on where that would appear (presumably in the log output of the GitHub Action job). - This necessarily only covers assembly of MASM projects, not projects authored in some other language, e.g. Rust. It's probably impractical to add fuzz targets for project assembly where the sources are provided manually, rather than read from disk, but without coverage there, we may miss things, or we may make changes that break stuff for such projects. I guess this isn't so much a question, but an issue of how best to ensure we don't end up over-indexing on MASM source projects, when those are going to be much less common in practice.
Marking this approved, as I think it is mergeable as-is, but I just wanted to mention the points above as part of my review.
Add cargo-fuzz coverage for package semantic deserialization and Miden project manifests, loading, and assembly. The package semantic target starts from binary Package deserialization, then exercises debug-section decoding, runtime dependency checks, embedded kernel package decoding, and program/kernel conversion paths. The TOML target exercises MidenProject::parse and direct AST TOML parsing. The loader target writes fuzz input into temporary project trees and exercises Package::load, Project::load, and load_project_reference. The assembly target builds tiny temporary projects and exercises ProjectAssembler for library, executable, kernel, and kernel-linked executable targets. Add CI matrix entries and seed corpora for the new targets.
3b58d14 to
3f86e44
Compare
|
For failures, libFuzzer should print the crashing input path and usually a base64 form in the job log. With this workflow we run the fuzz binary directly, so the crash file is not uploaded today (it's on the CI machine). I think that is enough to reproduce from logs, but I agree it would be nicer to persist crash-/timeout- files as GitHub Action artifacts on failure. I can add that in a follow-up, or in this PR if you prefer. But please note that fuzzer runs are meant to be done on a beefy machine for a while, what we're doing in CI is really non-regression under a tight computational budget. If one finds something broken in such a constrained run, it's usually trivial to reproduce. On project coverage, I agree this target is biased toward MASM source projects. I see this PR as covering the miden-vm surfaces we own directly: package decoding, project manifest/load paths, and MASM project assembly from disk.
The package semantic target should still catch many artifact-level issues after frontend output exists, but it will not replace frontend-specific fuzzing. |
|
Thanks! I think once this is rebased, it's good to merge unless @bobbinth has any questions/concerns |
Add cargo-fuzz coverage for package semantic deserialization and Miden project manifests, loading, and assembly.
The package semantic target starts from binary Package deserialization, then exercises debug-section decoding, runtime dependency checks, embedded kernel package decoding, and program/kernel conversion paths.
The TOML target exercises MidenProject::parse and direct AST TOML parsing. The loader target writes fuzz input into temporary project trees and exercises Package::load, Project::load, and load_project_reference.
The assembly target builds tiny temporary projects and exercises ProjectAssembler for library, executable, kernel, and kernel-linked executable targets.
Add CI matrix entries and seed corpora for the new targets.