feat(cli-pr-monitor): priority 85+86 verdict guard + matrix tests (Bundle g-1 / §A-2 P-1) by aloekun · Pull Request #125 · aloekun/claude-code-hook-test

aloekun · 2026-05-07T12:27:33Z

Summary

§A-2 Phase 5 dogfood P-1 (Bundle g-1): fixes the false "approved" verdict in cli-pr-monitor that surfaced across three consecutively observed PRs (#119/#120/#121)
Priority 85 (T1): add a review_state guard to compute_verdict(), so the verdict is withheld while CodeRabbit is not_found / pending
Priority 86 (T2): cover the (action, review_state, findings) → verdict transition matrix with 12 unit tests
The classifier on master (enabled in P-0) will run classification in this PR's post-pr-monitor → the dogfood result is to be recorded in the §A-2 measurement log

Changes

`src/cli-pr-monitor/src/stages/monitor.rs`

Priority 85 fix: add an early-return guard to compute_verdict():
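```
// Withhold the verdict while the CodeRabbit review is missing or still running.
if let Some(cr) = &result.coderabbit {
    if cr.review_state == "not_found" || cr.review_state == "pending" {
        // Message: "The CodeRabbit review is not complete, so the verdict is withheld"
        return "CodeRabbit review が未完了のため、判定を保留します";
    }
}
```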
Location: after the parked_* match arms and before the severity-based verdict. When coderabbit is None (initial state / skip_coderabbit), the guard is passed through and the existing behavior is preserved
Priority 86 transition matrix tests: add 12 unit tests at the end of mod tests:
- parked_rate_limit / parked_review_recheck (action precedence)
- not_found + empty / pending + empty / not_found + findings (priority 85 fix)
- success + empty / minor / critical / high / major (severity classification)
- skipped (skip_coderabbit path)
- coderabbit field None (initial state)
These coexist with the 7 existing should_resume_wakeup_* tests; no regressions
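
The precedence described above (parked actions first, then the review_state guard, then the severity-based verdict) can be sketched as a small self-contained model. This is a sketch under assumptions, not the code in monitor.rs: the struct fields, the `poll` helper, the placeholder action `"idle"`, the English verdict constants, and the choice of critical / high / major as blocking severities are all hypothetical. Only `compute_verdict`, `PollResult`, `Finding`, `review_state`, and the state and action names appear in the PR.

```rust
// Sketch only: field names, the "idle" action, the English messages, and the
// blocking-severity set are assumptions, not the repository's actual code.
struct Finding {
    severity: String,
}

struct CodeRabbit {
    review_state: String,
    findings: Vec<Finding>,
}

struct PollResult {
    action: String,
    coderabbit: Option<CodeRabbit>,
}

const PARKED: &str = "monitoring is parked";
const PENDING: &str = "CodeRabbit review is not complete; verdict withheld";
const NEEDS_FIX: &str = "there are findings that need fixing";
const NO_PROBLEMS: &str = "no problems were found";

fn compute_verdict(result: &PollResult) -> &'static str {
    // 1. Parked actions win over everything else.
    if matches!(result.action.as_str(), "parked_rate_limit" | "parked_review_recheck") {
        return PARKED;
    }
    // 2. Guard: an unfinished review never produces an approval or a fix request.
    if let Some(cr) = &result.coderabbit {
        if cr.review_state == "not_found" || cr.review_state == "pending" {
            return PENDING;
        }
    }
    // 3. Severity-based verdict (assumed: critical / high / major are blocking).
    let blocking = result
        .coderabbit
        .iter()
        .flat_map(|cr| cr.findings.iter())
        .any(|f| matches!(f.severity.as_str(), "critical" | "high" | "major"));
    if blocking { NEEDS_FIX } else { NO_PROBLEMS }
}

// Helper constructor: `state == None` models the initial state with no coderabbit field.
fn poll(action: &str, state: Option<&str>, severities: &[&str]) -> PollResult {
    PollResult {
        action: action.to_string(),
        coderabbit: state.map(|s| CodeRabbit {
            review_state: s.to_string(),
            findings: severities
                .iter()
                .map(|sev| Finding { severity: sev.to_string() })
                .collect(),
        }),
    }
}

fn main() {
    // (action, review_state, findings) -> expected verdict
    let matrix: Vec<(&str, Option<&str>, Vec<&str>, &str)> = vec![
        ("parked_rate_limit", Some("success"), vec![], PARKED),
        ("parked_review_recheck", Some("not_found"), vec![], PARKED),
        ("idle", Some("not_found"), vec![], PENDING),
        ("idle", Some("pending"), vec![], PENDING),
        ("idle", Some("not_found"), vec!["major"], PENDING),
        ("idle", Some("success"), vec![], NO_PROBLEMS),
        ("idle", Some("success"), vec!["major"], NEEDS_FIX),
        ("idle", Some("skipped"), vec![], NO_PROBLEMS),
        ("idle", None, vec![], NO_PROBLEMS),
    ];
    for (action, state, severities, expected) in &matrix {
        let verdict = compute_verdict(&poll(action, *state, severities));
        assert_eq!(verdict, *expected, "{action} / {state:?} / {severities:?}");
    }
    println!("all {} matrix rows hold", matrix.len());
}
```

Writing the matrix as data rows keeps each (action, review_state, findings) combination on one line, so a missing combination is easy to spot; the real test suite may well use one `#[test]` function per row instead.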

`.claude/cli-pr-monitor.exe`

Regenerated the release build with pnpm build:cli-pr-monitor and deployed it

Bug reproduction conditions

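| review_state | findings | Old behavior | New behavior (priority 85 fix) |
|---|---|---|---|
| `not_found` | `[]` | ❌ "No problems were found" (false approved) | ✅ "The review is not complete, so the verdict is withheld" |
| `pending` | `[]` | ❌ "No problems were found" | ✅ Verdict withheld |
| `not_found` | `[major]` | "There are findings that need fixing" | ✅ Verdict withheld (incomplete review takes precedence) |
| `success` | `[]` | "No problems were found" | (no change, a legitimate approved) |
| `skipped` | `[]` | "No problems were found" | (no change, user opt-out) |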

§A-2 dogfood (P-1) observation points

This is the first dogfood cycle in which [classifier] (mistral:7b) classifies CodeRabbit findings in this PR's post-pr-monitor. Metrics to measure:

agreement rate: match rate between my (Claude's) independent evaluation and the classifier's action; target ≥80%
classifier latency: seconds per finding; target ≤5 s per finding
fallback rate: share of human_review fallbacks; target ≤20%
normalized_issue language-constraint violation rate: detection of 8 or more consecutive English characters; target ≤10%
session token Δ: input-token comparison against PRs #119/#120/#121 (classifier OFF); target ≥10% reduction
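
Once per-finding records exist, the targets listed above can be checked mechanically. The following is a minimal sketch under assumptions: the `Record` and `Summary` layouts, the function names, and the action value `"fix"` are hypothetical, and "8 or more consecutive English characters" is read here as a run of at least 8 ASCII letters. The session token Δ target is not modeled.

```rust
// Hypothetical per-finding record; the layout and field names are assumptions.
struct Record {
    reviewer_action: &'static str,   // independent evaluation of the finding
    classifier_action: &'static str, // action chosen by the classifier
    latency_secs: f64,               // classifier time for this finding
    fallback: bool,                  // true when it fell back to human_review
    normalized_issue: &'static str,  // classifier's normalized summary text
}

struct Summary {
    agreement: f64,
    mean_latency_secs: f64,
    fallback: f64,
    violation: f64,
}

/// True when `text` contains a run of at least `min_run` ASCII letters.
fn has_long_ascii_run(text: &str, min_run: usize) -> bool {
    let mut run = 0;
    for ch in text.chars() {
        if ch.is_ascii_alphabetic() {
            run += 1;
            if run >= min_run {
                return true;
            }
        } else {
            run = 0;
        }
    }
    false
}

/// Fraction of records satisfying `pred` (0.0 for an empty slice).
fn rate(records: &[Record], pred: impl Fn(&Record) -> bool) -> f64 {
    if records.is_empty() {
        return 0.0;
    }
    records.iter().filter(|r| pred(*r)).count() as f64 / records.len() as f64
}

fn summarize(records: &[Record]) -> Summary {
    let total_latency: f64 = records.iter().map(|r| r.latency_secs).sum();
    let n = records.len() as f64;
    Summary {
        agreement: rate(records, |r| r.reviewer_action == r.classifier_action),
        mean_latency_secs: if records.is_empty() { 0.0 } else { total_latency / n },
        fallback: rate(records, |r| r.fallback),
        violation: rate(records, |r| has_long_ascii_run(r.normalized_issue, 8)),
    }
}

/// Thresholds taken from the target list; everything else is illustrative.
fn meets_targets(s: &Summary) -> bool {
    s.agreement >= 0.80 && s.mean_latency_secs <= 5.0 && s.fallback <= 0.20 && s.violation <= 0.10
}

fn main() {
    // Illustrative values only, not measured data.
    let records = [
        Record {
            reviewer_action: "fix",
            classifier_action: "fix",
            latency_secs: 3.0,
            fallback: false,
            normalized_issue: "null チェック漏れ",
        },
        Record {
            reviewer_action: "fix",
            classifier_action: "human_review",
            latency_secs: 5.0,
            fallback: true,
            normalized_issue: "unchecked dereference",
        },
    ];
    let s = summarize(&records);
    println!(
        "agreement={:.2} latency={:.1}s fallback={:.2} violation={:.2}",
        s.agreement, s.mean_latency_secs, s.fallback, s.violation
    );
    println!("targets met: {}", meets_targets(&s));
}
```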

After the PR is merged, findings + classified will be saved to .takt/dogfood-pr-NNN-classified.json and appended to the §A-2 measurement log in docs/local-llm-offload-analysis.md.

Test plan

cargo test -p cli-pr-monitor verdict --quiet: 12/12 pass
cargo test -p cli-pr-monitor --quiet: 163 active passed (7 ignored, pre-existing), no regressions
pre-push-review APPROVE (1 iter / 2m 41s, no findings)
release build + deploy (.claude/cli-pr-monitor.exe)
(post-merge) confirm classifier startup + record in the §A-2 measurement log

Summary by CodeRabbit

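Release notes

Improvements
- Strengthened the logic that evaluates the CodeRabbit review state. When a review is pending or not found, the verdict is now withheld.
Tests
- Added verification tests for the verdict logic, covering multiple scenarios.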

…g + transition matrix tests (priority 85+86 / Bundle g-1)

§A-2 Phase 5 dogfood P-1 (Bundle g-1). Fixes the monitor's false approved verdict that surfaced across three consecutively observed PRs (#119/#120/#121).

Priority 85 (Tier 1, T1-1):
- Add a review_state guard to compute_verdict(). When CodeRabbit has not posted yet (review_state: not_found) or is still in progress (pending), the verdict is withheld regardless of whether findings exist. This prevents the false negative of equating empty findings with 'no problems'.

Priority 86 (Tier 2, T2-4):
- Cover the (action, review_state, findings) → verdict transition matrix in mod tests with 12 unit tests:
  - parked_rate_limit / parked_review_recheck (action takes precedence)
  - not_found + empty / pending + empty / not_found + findings (priority 85 fix)
  - success + empty / minor / critical / high / major
  - skipped (skip_coderabbit path)
  - coderabbit field None (initial state)
- Coexists with the 7 existing should_resume_wakeup_* tests, no regression

Build + deploy done (.claude/cli-pr-monitor.exe regenerated as a release build).

coderabbitai · 2026-05-07T12:27:47Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: a38e06cd-b375-4a51-93cd-35c482a74a4f

📥 Commits

Reviewing files that changed from the base of the PR and between 229356c and 78c3a8f.

📒 Files selected for processing (1)

src/cli-pr-monitor/src/stages/monitor.rs

📝 Walkthrough

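The compute_verdict function in the PR monitoring logic has been extended to take CodeRabbit's review_state field into account. When review_state is "not_found" or "pending", it returns a new "pending" verdict. The existing parked action logic keeps its precedence, and the test suite verifies the verdict precedence rules and the new behavior.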

Changes

Review State Verdict Logic

| Layer / File(s) | Summary |
|---|---|
| Verdict Logic with Review State `src/cli-pr-monitor/src/stages/monitor.rs` | The compute_verdict function now checks result.coderabbit.review_state and returns a "pending" verdict when it is "not_found" or "pending". The parked action rules keep their existing precedence. |
| Tests and Constants `src/cli-pr-monitor/src/stages/monitor.rs` | Adds helper constructors for PollResult and Finding and verdict message constants, plus multiple unit tests that verify the precedence between parked actions and review_state and the pending-verdict behavior across various combinations. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Possibly related PRs

aloekun/claude-code-hook-test#122: a related PR that implements handling of review_state "not_found" / "pending" and review_state-based verdict tests
aloekun/claude-code-hook-test#113: a related PR that fixed the verdict computation in monitor.rs and introduced parked_rate_limit handling

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title, "verdict guard + matrix tests", matches the implementation and accurately summarizes the PR's main changes (adding verdict-withholding logic for the CodeRabbit review_state and adding tests). |
| Docstring Coverage | ✅ Passed | Docstring coverage is 81.25% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


…P-1/P-2 ledger (priority 7 / §A-2 P-3) (#127)

* feat(hooks-post-tool-linter): PowerShell (?i) flag validation + §A-2 P-1/P-2 ledger entries (priority 7 / §A-2 P-3)

§A-2 Phase 5 dogfood P-3 (priority 7). Mechanically enforces against recurrence of the structural pitfall with PowerShell case-insensitive regexes found in PR #91 (no-empty-powershell-catch / no-silent-error-action lacked (?i) → flagged as CR Major).

Priority 7 (Tier 1, T1-1):
- Add find_powershell_rules_missing_case_insensitive_flag(): returns, as a list of IDs, the rules whose extensions include ps1 and whose pattern has no (?i)
- Warn at startup in load_custom_rules() (production layer)
- 7 unit tests (heterogeneous violators / valid / mixed-ext / case-insensitive ext / multi-violator / non-ps1 ignore) + 1 test validating every rule in the deployed config (CI detection layer)
- Add (?i) to the pattern of the existing no-ephemeral-todo-reference rule (passes deployed validation, with the side benefit of handling mixed-case Windows file paths)
- Add § Custom lint rule patterns to ~/.claude/rules/common/code-review.md (background of the PR #91 a15b263 fix + after-the-fact reference for the priority 7 mechanical enforcement)

§A-2 measurement log (P-1/P-2) entries:
- P-1 (PR #125): findings: 0 (CR APPROVE no comments), classifier not started → dogfood did not fire
- P-2 (PR #126): findings: 1 (CR Nitpick, inside a <details> block in the review body, missed by check-ci-coderabbit extraction); classifier run on a manual synthetic finding → action=human_review / action_confidence=0.0 / fallback=length_contract, agreement: 1/1 (100%), latency: 6.4 s per finding (above the 5 s target), fallback: 1/1
- Known system gap: check-ci-coderabbit does not extract Nitpicks inside <details> in the review body (a gap in the input path through which post-pr-monitor feeds the classifier)

Build + deploy done (.claude/hooks-post-tool-linter.exe). All 82 tests pass, no regressions.

* docs(todo): removed following completion of priority 7 (automatic PowerShell (?i) flag validation)
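
The check named in the commit message above (rules that target ps1 files but whose pattern lacks (?i)) could look roughly like the sketch below. Only the function name and the ps1 / (?i) criteria come from the commit message; the `CustomRule` layout, the `rule` helper, the rule IDs, and the patterns are hypothetical. The sketch also assumes that a literal `(?i)` anywhere in the pattern is sufficient, which would miss combined flag groups such as `(?im)`.

```rust
// Hypothetical rule shape; only the function name and the ps1 / (?i) criteria
// come from the commit message.
struct CustomRule {
    id: String,
    pattern: String,
    extensions: Vec<String>,
}

/// IDs of rules that target ps1 files but whose pattern lacks `(?i)`.
fn find_powershell_rules_missing_case_insensitive_flag(rules: &[CustomRule]) -> Vec<String> {
    rules
        .iter()
        // The extension comparison is itself case-insensitive ("PS1" counts as ps1).
        .filter(|r| r.extensions.iter().any(|e| e.eq_ignore_ascii_case("ps1")))
        // Assumption: a literal "(?i)" anywhere in the pattern is sufficient.
        .filter(|r| !r.pattern.contains("(?i)"))
        .map(|r| r.id.clone())
        .collect()
}

// Illustrative constructor, not part of the original tool.
fn rule(id: &str, pattern: &str, extensions: &[&str]) -> CustomRule {
    CustomRule {
        id: id.to_string(),
        pattern: pattern.to_string(),
        extensions: extensions.iter().map(|e| e.to_string()).collect(),
    }
}

fn main() {
    // Made-up rules and patterns for illustration.
    let rules = vec![
        rule("rule-a", r"catch\s*\{\s*\}", &["ps1"]),
        rule("rule-b", r"(?i)-ErrorAction\s+SilentlyContinue", &["PS1", "psm1"]),
        rule("rule-c", r"TODO", &["rs", "ts"]),
    ];
    let missing = find_powershell_rules_missing_case_insensitive_flag(&rules);
    for id in &missing {
        // A startup warning along these lines is what the commit message describes.
        eprintln!("warning: rule '{id}' targets ps1 but its pattern has no (?i) flag");
    }
    assert_eq!(missing, vec!["rule-a".to_string()]);
}
```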

aloekun merged commit be5ee6f into master May 7, 2026
1 check passed

aloekun deleted the feature/p1-bundle-g1-verdict-guard branch May 7, 2026 13:51

aloekun mentioned this pull request May 7, 2026

feat(hooks-post-tool-linter): PowerShell (?i) flag validation + §A-2 P-1/P-2 ledger (priority 7 / §A-2 P-3) #127

Merged

7 tasks

This was referenced May 7, 2026

test(cli-pr-monitor): cross-module overflow safety + 2100 baseline (priority 76+77 / §A-2 P-4) #128

Merged

feat(cli-pr-monitor): rate-limit Posted park reservation + ADR-018 transient failure made explicit (priority 80+82 / §A-2 P-5) #129

Merged

aloekun merged 1 commit into master from feature/p1-bundle-g1-verdict-guard
