fix(snapshot): harden --json output for CI consumers #186

Merged
hidai25 merged 1 commit into main from claude/review-pr-182-r840b on Apr 20, 2026
@hidai25 hidai25 commented Apr 20, 2026

Summary

Follow-up to #182. Tightens the edges on evalview snapshot --json so the payload is actually consumable by CI.

  • Clean stdout in JSON mode: plumb json_output into _execute_snapshot_tests and skip run_with_spinner. Previously, per-test prints and Rich spinner frames wrote to the same stream as the JSON payload, so evalview snapshot --json | jq would fail on real runs.
  • Accurate per-test saved / golden_file: _save_snapshot_results now returns a {name -> Path} map, and the JSON builder keys off that instead of a global count. Before, a passing test whose save_golden raised would still appear as saved: true as long as at least one sibling saved.
  • Real golden paths: use the Path returned by GoldenStore.save_golden (which is variant-aware and ends in .golden.json) instead of guessing {test_case}.yaml.
  • --preview --json rejected upfront: emits a JSON error and calls ctx.exit(2) instead of silently dropping --json.
  • Style: json import grouped with stdlib, indent=2 for consistency with skill/model-check --json, and the --json help text now documents the suppression and auto-approve behavior.

No change to the JSON schema.
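
To illustrate the per-test tracking described above, here is a minimal sketch of a builder that derives saved and golden_file from a {name -> Path} map rather than a global count. The function name build_snapshot_payload, the top-level "tests" key, the "name" field, and the goldens/ directory are assumptions for illustration only; the actual evalview builder and schema may differ.

```python
import json
from pathlib import Path
from typing import Dict, List, Optional


def build_snapshot_payload(test_names: List[str], saved_paths: Dict[str, Path]) -> dict:
    """Hypothetical builder: one entry per test, keyed off the saved-path map."""
    tests = []
    for name in test_names:
        # A test counts as saved only if a golden was actually written for it.
        path: Optional[Path] = saved_paths.get(name)
        tests.append({
            "name": name,
            "saved": path is not None,
            "golden_file": path.as_posix() if path is not None else None,
        })
    return {"tests": tests}


# Two tests passed, but only one golden reached disk (the other save raised).
saved = {"login": Path("goldens/login.golden.json")}  # hypothetical path
payload = build_snapshot_payload(["login", "checkout"], saved)
print(json.dumps(payload, indent=2))
```

With a map lookup per test, a failed save for one test cannot be masked by a sibling that saved successfully.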

Test plan

New tests/test_snapshot_json_output.py covers the contract:

  • --json emits a single parseable JSON document on stdout with no Rich markup / banner / spinner leakage
  • saved / golden_file reflect actual disk writes even when some saves raise
  • golden_file uses the variant-aware GoldenStore path
  • --preview --json exits non-zero with a JSON error payload
  • Empty suite emits {"error": "no tests found"}
  • Existing snapshot tests (test_snapshot_generated_workflow.py, test_e2e_snapshot_check.py) still pass — 25/25
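
The first item in the list above (a single parseable document on stdout) could be checked along these lines. This is a stdlib-only sketch, not the contents of tests/test_snapshot_json_output.py: run_snapshot is a hypothetical stand-in for the command body, and the payload shape is assumed.

```python
import contextlib
import io
import json


def run_snapshot(json_output: bool) -> None:
    """Hypothetical stand-in for the snapshot command body."""
    if not json_output:
        # Human-readable progress is printed only outside JSON mode.
        print("running test: login")
        print("saved 1 golden")
        return
    # In JSON mode the payload is the only thing written to stdout.
    print(json.dumps({"tests": [{"name": "login", "saved": True}]}, indent=2))


def capture_stdout(json_output: bool) -> str:
    """Run the stand-in command and return everything it wrote to stdout."""
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        run_snapshot(json_output)
    return buf.getvalue()


# json.loads raises if anything other than one JSON document is present.
document = json.loads(capture_stdout(json_output=True))
```

Parsing the whole captured stream, rather than searching it for a JSON fragment, is what makes the check sensitive to stray banner or spinner output.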

https://claude.ai/code/session_01YPVyciLBGFEpKoMV7y8zFQ

Follow-up to Matt's --json flag:

- Keep stdout clean in JSON mode by plumbing json_output into
  _execute_snapshot_tests and running without the Rich live spinner.
  Previously per-test prints and spinner frames leaked into the
  payload, making stdout unparseable.
- Track the real set of saved tests via a {name -> Path} map so per-test
  `saved` and `golden_file` reflect actual disk writes — previously any
  passing test was reported as saved whenever at least one sibling
  saved, and the path guessed `.yaml` instead of the variant-aware
  `.golden.json` GoldenStore actually writes.
- Reject `--preview --json` upfront with a JSON error and non-zero exit
  rather than silently dropping --json.
- Pretty-print JSON (indent=2) to match `skill`/`model-check` output,
  tidy the --json help text to document the suppression and
  auto-approve behavior, and keep `import json` with the other stdlib
  imports.
- Cover the contract with tests/test_snapshot_json_output.py:
  parseable payload, accurate per-test saved/golden_file tracking
  (including partial save failures), variant-aware paths, --preview
  rejection, and the empty-suite error shape.

https://claude.ai/code/session_01YPVyciLBGFEpKoMV7y8zFQ
@hidai25 hidai25 merged commit c0d5758 into main Apr 20, 2026
7 checks passed
@hidai25 hidai25 deleted the claude/review-pr-182-r840b branch April 20, 2026 14:52