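📝 Walkthrough
This PR introduces, as one coherent change, the eval fixture convention added via ADR, a deterministic lint-screen prompt, agreement metrics (strict/normalized) with rule normalization, 6 synthetic fixtures, and the implementation of a test runner and aggregate report. The docs now record the Phase b decision as conditional GO (9/12 = 75.0%).
Changes
Phase b' evaluation pipeline integration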
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ 5 passed
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/local-llm-offload-analysis.md`:
- Around line 26-29: The opening status text "Phase b/c/d は未着手" ("Phase b/c/d
not started") contradicts the heading further down, "Phase b — 判定 GO/NO-GO ✅
**GO 達成 (2026-05-08)**" ("GO achieved"). Update the opening status to reflect
that Phase b has reached GO (for example, "Phase b: GO(2026-05-08) / Phase c/d:
未着手"), so that the status description at the top and that heading agree.
In `@src/cli-finding-classifier/evals/files/eval8-large-refactor.diff`:
- Around line 31-42: The private field user of the Session struct is only
written and never read, which produces a dead_code warning; fix it together
with Session::new. Either (A) add a read accessor pub fn user(&self) -> &str
(or &String) to Session so the field is actually referenced, or (B) if the
field is intentionally unused, put #[allow(dead_code)] on the user field to
suppress the warning. The relevant places are the user field in the Session
struct definition and the impl Session block (Session::new).
- Around line 49-51: is_expired always returns false, so sessions never expire.
Fix this without breaking the "single issue_pattern" intent of ADR-038. Either
replace is_expired(&self, _now: Duration) with a proper TTL-based
implementation (for example, add an expires_at or ttl field to the struct and
compare it against now), or add a comment/documentation stating that the
unconditional false is an intentional stub and not yet implemented. Target
symbols: is_expired, issue_pattern (update the session struct name/field names
as needed).
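The two eval8 findings above could be addressed along the following lines. This is a minimal sketch, not the fixture's actual code: the `created_at` and `ttl` fields and the three-argument `Session::new` signature are assumptions introduced here for illustration; only `Session`, `user`, and `is_expired` come from the review comments.

```rust
use std::time::Duration;

/// Hypothetical reconstruction of the fixture's `Session`.
/// `created_at` and `ttl` are assumed fields, not taken from the fixture.
pub struct Session {
    user: String,
    created_at: Duration,
    ttl: Duration,
}

impl Session {
    pub fn new(user: impl Into<String>, created_at: Duration, ttl: Duration) -> Self {
        Self { user: user.into(), created_at, ttl }
    }

    /// Option (A): a read accessor, so `user` is no longer write-only.
    /// Option (B) would instead put `#[allow(dead_code)]` on the field.
    pub fn user(&self) -> &str {
        &self.user
    }

    /// TTL-based expiry: the session is expired once `now` reaches
    /// `created_at + ttl` (saturating, so the addition cannot overflow).
    pub fn is_expired(&self, now: Duration) -> bool {
        now >= self.created_at.saturating_add(self.ttl)
    }
}
```

If the unconditional `false` is meant to stay in the fixture as a stub, the alternative named in the comment (documenting it as intentionally unimplemented) leaves the diff smaller.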
In `@src/cli-finding-classifier/prompts/lint-screen.txt`:
- Line 65: Examples A–H are missing unified-diff headers which contradicts the
schema that says the "file" field must be copied from the +++ b/<path> header;
update each Example A–H Input diff to include both --- a/<path> and +++ b/<path>
unified-diff headers so the examples match the real fixtures and the "file"
value is the verbatim +++ b/<path> header, or if you prefer the other approach,
relax the schema to accept either +++ b/<path> or diff --git ... b/<path> when
populating the "file" field (recommended: add the headers to the examples).
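To make the two options in the lint-screen.txt comment concrete, here is a small sketch of how a "file" value could be derived from a diff. Both function names are hypothetical, and it assumes the "file" value is the path that follows `+++ b/`; the prompt's schema may intend something slightly different.

```rust
/// Strict variant: take the path from the first `+++ b/<path>` header line.
/// Returns `None` when the diff carries no such header.
pub fn file_from_diff(diff: &str) -> Option<&str> {
    diff.lines()
        .find_map(|line| line.strip_prefix("+++ b/"))
        .map(str::trim_end)
}

/// Relaxed variant: fall back to the `b/<path>` part of a
/// `diff --git a/<path> b/<path>` line when no `+++ b/` header exists.
pub fn file_from_diff_relaxed(diff: &str) -> Option<&str> {
    file_from_diff(diff).or_else(|| {
        diff.lines()
            .filter_map(|line| line.strip_prefix("diff --git "))
            .find_map(|rest| rest.rsplit_once(" b/").map(|(_, path)| path.trim_end()))
    })
}
```

With the strict variant, examples that lack the `+++ b/<path>` header yield no "file" value at all, which is why adding the headers to Examples A–H is the simpler of the two fixes.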
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: a18da21d-b8bc-41a4-9df9-7463e9af4f65
📒 Files selected for processing (11)
- docs/adr/adr-038-local-llm-finding-classification.md
- docs/local-llm-offload-analysis.md
- src/cli-finding-classifier/evals/files/eval10-nesting-boundary.diff
- src/cli-finding-classifier/evals/files/eval11-comment-only.diff
- src/cli-finding-classifier/evals/files/eval12-test-cfg.diff
- src/cli-finding-classifier/evals/files/eval7-style-only.diff
- src/cli-finding-classifier/evals/files/eval8-large-refactor.diff
- src/cli-finding-classifier/evals/files/eval9-multi-import-leak.diff
- src/cli-finding-classifier/evals/lint-screen-evals.json
- src/cli-finding-classifier/prompts/lint-screen.txt
- src/cli-finding-classifier/tests/lint_screen_evals.rs
…ti-metrics + prompt v3 (agreement 75% conditional GO, temperature=0)
d08faa3 to 485e636
Actionable comments posted: 1
♻️ Duplicate comments (1)
docs/local-llm-offload-analysis.md (1)
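5-5: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win
Unify the Phase b decision wording to a single criterion.
Within the same page, "GO 達成" (GO achieved), "conditional GO (75.0%)", and "agreement ≥80% 達成済" (agreement ≥80% already achieved) are mixed, so the gate decision is unclear when work resumes. In particular, Line 52 and Line 85 are inconsistent with the 75.0% on Line 28.
Suggested fix (unify on one of these)
- > **Status**: Trial operation (Phase a complete = PR `#130` landed / Phase b complete = GO achieved 2026-05-08, PR `#131`; Phase c/d not started).
+ > **Status**: Trial operation (Phase a complete = PR `#130` landed / Phase b' is conditional GO (75.0%, 2026-05-08, PR `#131`) / Phase c/d not started).
- These leave room for asymptotic improvement, but they are out of scope because the precondition for starting Phase c (agreement ≥ 80%) has already been met. Real-world observation in Phase d (PR-based dogfood) is needed.
+ These leave room for asymptotic improvement, but since agreement currently stands at 75.0%, §8.E is handled within the scope of conditional GO (auto_fix lane only). The full GO decision will be re-evaluated after further improvement.
- - **Depends on**: **agreement rate ≥ 80% achieved in Phase b** (the old dependency on the §A-2 dogfood is retired and replaced by the Phase a evals)
+ - **Depends on**: **conformance to the Phase b' decision criteria** (whether operating under conditional GO or under a full GO at agreement rate ≥ 80%, keep this consistent with the final decision at the top of this section)
Also applies to: 26-29, 52-53, 85-85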
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/cli-finding-classifier/evals/files/eval8-large-refactor.diff`:
- Around line 2-4: The fixture metadata is inaccurate: update the issue_pattern
and verification_purpose in the diff header to reflect the actual added lines
(mod.rs ~9 + session.rs ~23 + token.rs ~12 ≈ 44 lines). Replace "3 file / 80+ 行の
architectural addition" with something like "3 file / 40+ 行の architectural
addition (約44行)" and change verification_purpose from "大規模 context 内で
magic-number を取りこぼさず拾えるか" to a wording that matches the actual scale (e.g., "中規模
context (約44行) 内で magic-number を取りこぼさず拾えるか"); keep expected_screen_decision and
other fields unchanged. Ensure the textual counts in the header match the sum of
added lines so downstream threshold/compare logic uses correct metadata.
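One way to keep the header's line count honest is to derive it from the diff rather than estimate it. The helper below is a rough sketch with a hypothetical name; it assumes a plain unified diff where added lines start with `+` and file headers start with `+++`, so an added line whose own content begins with `++` would be skipped.

```rust
/// Counts added lines in a unified diff: lines starting with `+`,
/// excluding the `+++ b/<path>` file headers.
pub fn count_added_lines(diff: &str) -> usize {
    diff.lines()
        .filter(|line| line.starts_with('+') && !line.starts_with("+++"))
        .count()
}
```

Running this over the fixture and comparing the result with the figure quoted in `issue_pattern` would catch a mismatch such as "80+" versus roughly 44.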
---
Duplicate comments:
In `@docs/local-llm-offload-analysis.md`:
- Line 5: The Phase b acceptance criteria are inconsistent across the
document—instances of "GO 達成", "conditional GO (75.0%)", and "agreement ≥80%"
must be unified to a single, explicit criterion; pick which rule to use (e.g.,
"conditional GO (75.0%)" or "agreement ≥80%"), then update every occurrence of
those phrases (including the header phrase "試験運用 (Phase a 完了... Phase b 完了 = GO
達成 ...)" and the in-body mentions currently showing "GO 達成", "conditional GO
(75.0%)", and "agreement ≥80%") to the chosen wording and numeric threshold, and
make sure any adjacent explanatory text or summary sections reflect the same
numeric threshold and wording so the restart/gate decision is unambiguous.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 4a20cb27-e979-4172-9da9-e0eb4094028d
📒 Files selected for processing (11)
- docs/adr/adr-038-local-llm-finding-classification.md
- docs/local-llm-offload-analysis.md
- src/cli-finding-classifier/evals/files/eval10-nesting-boundary.diff
- src/cli-finding-classifier/evals/files/eval11-comment-only.diff
- src/cli-finding-classifier/evals/files/eval12-test-cfg.diff
- src/cli-finding-classifier/evals/files/eval7-style-only.diff
- src/cli-finding-classifier/evals/files/eval8-large-refactor.diff
- src/cli-finding-classifier/evals/files/eval9-multi-import-leak.diff
- src/cli-finding-classifier/evals/lint-screen-evals.json
- src/cli-finding-classifier/prompts/lint-screen.txt
- src/cli-finding-classifier/tests/lint_screen_evals.rs
✅ Files skipped from review due to trivial changes (3)
- src/cli-finding-classifier/evals/files/eval7-style-only.diff
- docs/adr/adr-038-local-llm-finding-classification.md
- src/cli-finding-classifier/evals/files/eval11-comment-only.diff
🚧 Files skipped from review as they are similar to previous changes (6)
- src/cli-finding-classifier/evals/files/eval10-nesting-boundary.diff
- src/cli-finding-classifier/evals/lint-screen-evals.json
- src/cli-finding-classifier/evals/files/eval12-test-cfg.diff
- src/cli-finding-classifier/evals/files/eval9-multi-import-leak.diff
- src/cli-finding-classifier/prompts/lint-screen.txt
- src/cli-finding-classifier/tests/lint_screen_evals.rs
@coderabbitai review
✅ Actions performed
Review triggered.
@coderabbitai review
✅ Actions performed
Review triggered.
Summary
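Ran the Phase b (`docs/local-llm-offload-analysis.md` §1) GO/NO-GO decision and reached GO for starting §8.E with an agreement rate of 10/12 = 83.3%. As Phase b':
- output via `tests/lint_screen_evals.rs`
- added the `# SYNTHETIC FIXTURE` standard header convention and applied it to the 6 new fixtures
Iteration history
v2 → v3 improved the result by +33pt. The LLM can now select the `informational` column (confusion matrix 0/6 → 4/6).
Remaining 2 cases (LLM-side limitations)
The v4 prompt iteration is planned to continue in Phase c (§8.E implementation).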
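The agreement figures quoted in this thread (10/12 = 83.3% in the summary, 9/12 = 75.0% in the walkthrough) are plain ratios of agreeing cases to total cases. A minimal sketch of that arithmetic follows; the function names are hypothetical and are not taken from `tests/lint_screen_evals.rs`.

```rust
/// Agreement rate as a percentage, or `None` when there are no cases.
pub fn agreement_rate_percent(agreed: usize, total: usize) -> Option<f64> {
    if total == 0 {
        return None;
    }
    Some(agreed as f64 / total as f64 * 100.0)
}

/// Formats the rate in the style used in this PR, e.g. "10/12 = 83.3%".
pub fn format_agreement(agreed: usize, total: usize) -> Option<String> {
    agreement_rate_percent(agreed, total)
        .map(|pct| format!("{}/{} = {:.1}%", agreed, total, pct))
}
```

Printing the numerator and denominator next to the percentage makes it easier to spot when two passages in the docs quote different runs.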
Final confusion matrix (v3 + baseline fix)
Test plan
Reflecting the adopted items from post-merge-feedback (PR #130)
- Changed `(file, rule, line ±2)` → `(file, rule)` (`finding_matches`)
- Enable the `#[ignore]` test as an optional CI step: not yet addressed (planned for the start of Phase c)
- codify in `docs-governance.md`)
Out of scope
Summary by CodeRabbit
Documentation
Tests
Chores
Style
New Features