fix(snapshot): harden --json output for CI consumers#186
Merged
Conversation
Follow-up to Matt's --json flag:
- Keep stdout clean in JSON mode by plumbing json_output into
_execute_snapshot_tests and running without the Rich live spinner.
Previously per-test prints and spinner frames leaked into the
payload, making stdout unparseable.
- Track the real set of saved tests via a {name -> Path} map so per-test
`saved` and `golden_file` reflect actual disk writes — previously any
passing test was reported as saved whenever at least one sibling
saved, and the path guessed `.yaml` instead of the variant-aware
`.golden.json` GoldenStore actually writes.
- Reject `--preview --json` upfront with a JSON error and non-zero exit
rather than silently dropping --json.
- Pretty-print JSON (indent=2) to match `skill`/`model-check` output,
tidy the --json help text to document the suppression and
auto-approve behavior, and keep `import json` with the other stdlib
imports.
- Cover the contract with tests/test_snapshot_json_output.py:
parseable payload, accurate per-test saved/golden_file tracking
(including partial save failures), variant-aware paths, --preview
rejection, and the empty-suite error shape.
https://claude.ai/code/session_01YPVyciLBGFEpKoMV7y8zFQ
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Follow-up to #182. Tightens the edges on
evalview snapshot --jsonso the payload is actually consumable by CI.json_outputinto_execute_snapshot_testsand skiprun_with_spinner. Previously per-test prints and Rich spinner frames wrote to the same stream as the JSON payload, soevalview snapshot --json | jqwould fail on real runs.saved/golden_file—_save_snapshot_resultsnow returns a{name -> Path}map; the JSON builder keys off that instead of a global count. Before, a passing test whosesave_goldenraised would still appear assaved: trueas long as at least one sibling saved.Pathreturned byGoldenStore.save_golden(which is variant-aware and ends in.golden.json) instead of guessing{test_case}.yaml.--preview --jsonrejected upfront — emits a JSON error +ctx.exit(2)instead of silently dropping--json.jsonimport grouped with stdlib,indent=2for consistency withskill/model-check --json, and the--jsonhelp text now documents the suppression + auto-approve behavior.No change to the JSON schema.
Test plan
New
tests/test_snapshot_json_output.pycovers the contract:--jsonemits a single parseable JSON document on stdout with no Rich markup / banner / spinner leakagesaved/golden_filereflect actual disk writes even when some saves raisegolden_fileuses the variant-awareGoldenStorepath--preview --jsonexits non-zero with a JSON error payload{"error": "no tests found"}test_snapshot_generated_workflow.py,test_e2e_snapshot_check.py) still pass — 25/25https://claude.ai/code/session_01YPVyciLBGFEpKoMV7y8zFQ