Conversation
- Synced 4 locations: status banner / §1 Phase c+ Bundle i / §2 §8.E / §4 resume checklist
- Recorded the dogfood results (73.3% / fallback 2/15) and the status of the Phase d kickoff prerequisites
- Laser-focused the next final gate on "§8.D v4 prompt revision as a first-line fix for the eval13/15 JSON completeness problem"
…use: prompt truncation)

## Background

In the PR #135 (Bundle i) dogfood, eval13 (5 files / 280 lines) and eval15 (1 file / 208 lines) hit JSON schema breakdown:

- eval13: 'missing field screen_decision' (top-level field omitted)
- eval15: 'missing field severity at line 38' (nested field omitted)

The todo was originally named "§8.D v4 prompt revision loop", but empirical root-cause verification (raw Ollama output dump) showed the cause was not prompt design but client configuration.

## Empirical verification (raw output dump)

- eval13: prompt_eval_count=4096 (Ollama default num_ctx ceiling reached) → the schema definition section was truncated out of the context window, and the model generated unrelated directory-tree JSON.
- eval15: prompt_eval_count=4096 → the schema partially survived, and the model emitted near-miss field names (rule_id / message / start_line).

prompt_eval_count is identical for both evals (= the context window cap), so a prompt revision cannot fix this.

## Fix

- Add a num_ctx field to OllamaClient + DEFAULT_NUM_CTX = 8192 (mistral:7b theoretically supports 32K; 8192 balances safety margin against inference cost)
- Serialize num_ctx into GenerateOptions
- Add a with_num_ctx builder (for future, even longer prompts)
- tests: num_ctx_defaults_and_overrides_apply / num_ctx_is_serialized_into_request_body (mockito asserts that the request body contains num_ctx:8192)

## Dogfood results

| Metric | Before (default) | After (num_ctx=8192) |
|---|---|---|
| agreement rate | 11/15 = 73.3% | 13/15 = 86.7% |
| eval13 fallback | screen_decision missing | none (decision match) |
| eval15 fallback | severity missing | none (decision match) |
| verdict | CONDITIONAL-GO | GO (start §8.E) |

The remaining 2 disagreements (eval5 / eval10) are boundary judgments known since Phase b' and unrelated to num_ctx (= an LLM-side limitation with asymptotic room for improvement).

## Phase d kickoff prerequisites now met

Bundle i + this commit complete the following 4 items:

- (a) [lint_screen] config silent-failure prevention (PR #135, rank 91)
- (b) reproducible measurement via scale-aware fixtures (PR #135, rank 92)
- (c) global rule for the cross-file partial-fix anti-pattern (PR #135, rank 93)
- (d) first-line fix for the JSON completeness problem (this commit, num_ctx 8192)

Next: the go/no-go decision for Phase d (PR-based real-environment dogfood).
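The fix above can be sketched as a minimal, dependency-free illustration. The names `OllamaClient`, `DEFAULT_NUM_CTX`, and `with_num_ctx` come from the commit text; the struct layout and the `options_json` helper are assumptions (the real client presumably serializes `GenerateOptions` via serde and sends it with an HTTP client):

```rust
// Sketch only: field layout and options_json are hypothetical stand-ins
// for the real GenerateOptions serialization described in the commit.

const DEFAULT_NUM_CTX: u32 = 8192;

#[allow(dead_code)]
struct OllamaClient {
    base_url: String,
    model: String,
    num_ctx: u32,
}

impl OllamaClient {
    fn new(base_url: impl Into<String>, model: impl Into<String>) -> Self {
        Self {
            base_url: base_url.into(),
            model: model.into(),
            // 8192: safety margin for large diffs vs. inference cost.
            num_ctx: DEFAULT_NUM_CTX,
        }
    }

    /// Builder-style override for prompts that need an even larger window.
    fn with_num_ctx(mut self, num_ctx: u32) -> Self {
        self.num_ctx = num_ctx;
        self
    }

    /// The options fragment as it would appear in the /api/generate body.
    fn options_json(&self) -> String {
        format!(r#"{{"options":{{"num_ctx":{}}}}}"#, self.num_ctx)
    }
}

fn main() {
    let client = OllamaClient::new("http://localhost:11434", "mistral:7b");
    assert_eq!(client.options_json(), r#"{"options":{"num_ctx":8192}}"#);

    let wide = OllamaClient::new("http://localhost:11434", "mistral:7b")
        .with_num_ctx(16384);
    assert_eq!(wide.options_json(), r#"{"options":{"num_ctx":16384}}"#);
    println!("num_ctx wiring ok");
}
```

This mirrors the shape of the two tests named above: one asserting the default, one asserting that an override is actually wired through to the request body.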
📝 Walkthrough

PR changes:
- Phase c+ documentation update
- Ollama client num_ctx feature

🎯 Estimated code review effort: 2 (Simple) | ⏱️ ~12 min

Possibly related PRs
🚥 Pre-merge checks — ✅ Passed checks (5 passed)
Actionable comments posted: 2
🧹 Nitpick comments (1)
src/lib-ollama-client/src/lib.rs (1)
306-328: ⚡ Quick win
A test verifying that the num_ctx override value is reflected in the request body would also be welcome. Currently only serialization of the default value (8192) is verified, so a wiring regression in with_num_ctx(...) could slip through.

Example addition:

```diff
 #[test]
 fn num_ctx_is_serialized_into_request_body() {
 @@
     let mock = server
         .mock("POST", "/api/generate")
         .match_body(mockito::Matcher::PartialJsonString(
-            r#"{"options":{"num_ctx":8192}}"#.to_string(),
+            r#"{"options":{"num_ctx":16384}}"#.to_string(),
         ))
 @@
-    let client = OllamaClient::new(server.url(), "mistral:7b");
+    let client = OllamaClient::new(server.url(), "mistral:7b")
+        .with_num_ctx(16384);
     let _: TestPayload = generate_json(&client, "test prompt").unwrap();
```

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/lib-ollama-client/src/lib.rs` around lines 306 - 328, Add a test that verifies a non-default num_ctx is serialized into the request body: duplicate the existing num_ctx_is_serialized_into_request_body test but configure the request with a different value (e.g. 4096) via the API used to set the option (call the builder/option method that sets num_ctx on OllamaClient or the request payload—refer to with_num_ctx or the request/options builder you have), update the mock to match PartialJsonString r#"{"options":{"num_ctx":4096}}"#, send the request using generate_json(&client, ...) (or the corresponding method that uses the overridden option), and assert the mock; this ensures the override wiring for with_num_ctx is tested instead of only the default 8192.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/local-llm-offload-analysis.md`:
- Around line 182-184: Replace the hard-coded pass counts after the cargo test
commands with non-fixed wording so the docs don't rot; locate the two lines
containing "cargo test -p cli-finding-classifier --test lint_screen_evals" and
"cargo test -p cli-push-runner" and remove the "20 件 pass" and "53+ 件 pass"
suffixes, replacing them with a generic phrase such as "tests pass" or "確認済み"
(or omit counts entirely) so the commands remain accurate without brittle
numeric assertions.
In `@src/lib-ollama-client/src/lib.rs`:
- Around line 87-90: The builder method with_num_ctx currently allows zero which
leads to runtime errors; add an input guard in with_num_ctx to reject 0
immediately (e.g., assert!/debug_assert! or explicit panic with a clear message)
so callers cannot set self.num_ctx to 0, and then continue to set self.num_ctx =
num_ctx and return self; reference the with_num_ctx method and the num_ctx field
when making this change.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: ad313ece-60b6-4f95-abf1-07cf213536c6
📒 Files selected for processing (2)
docs/local-llm-offload-analysis.md
src/lib-ollama-client/src/lib.rs
```
cargo test -p cli-finding-classifier --test lint_screen_evals # schema validation: 20 pass (12→20 in Bundle i)
cargo test -p cli-push-runner # 53+ pass (+5 lint_screen config tests in Bundle i)
```
Better not to hard-code test "pass counts" in the docs.

The "20 pass" / "53+ pass" on lines 182-184 will go stale under routine future changes and become noise during resume checks.

Suggested fix:

```diff
-cargo test -p cli-finding-classifier --test lint_screen_evals # schema validation: 20 pass (12→20 in Bundle i)
-cargo test -p cli-push-runner # 53+ pass (+5 lint_screen config tests in Bundle i)
+cargo test -p cli-finding-classifier --test lint_screen_evals # schema validation: all tests must pass
+cargo test -p cli-push-runner # all tests must pass, including lint_screen config tests
```
…377294)

num_ctx = 0 is a runtime error in the Ollama API, so fail fast with assert! at the builder stage.

- Add a Panics note to the with_num_ctx rustdoc
- Seal with a #[should_panic(expected = "...")] test (with_num_ctx_panics_on_zero)

Adopted from CodeRabbit PR #136 review #r3213377294. The same review's #r3213377290 (fixed test pass counts in docs) is optional per user judgment → skipped.
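The fail-fast guard can be illustrated in isolation. `validate_num_ctx` below is a hypothetical stand-in for the check inside `with_num_ctx`; the real code panics via `assert!` in the builder itself, and the sealed unit test uses `#[should_panic(expected = "...")]`:

```rust
use std::panic;

// Hypothetical stand-in for the guard inside with_num_ctx: reject 0 at
// builder time instead of letting the Ollama API fail at request time.
fn validate_num_ctx(num_ctx: u32) -> u32 {
    assert!(num_ctx > 0, "num_ctx must be greater than 0");
    num_ctx
}

fn main() {
    // A positive value passes through unchanged.
    assert_eq!(validate_num_ctx(8192), 8192);

    // Zero panics immediately; in a test suite this is what
    // #[should_panic(expected = "num_ctx must be greater than 0")] seals.
    let outcome = panic::catch_unwind(|| validate_num_ctx(0));
    assert!(outcome.is_err());
    println!("zero rejected at builder time");
}
```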
…eanup (#137)

* docs: reflect PR #135/#136 landing — add rank 97 + stale-doc cleanup for the resolved Phase d gate

Consolidates into one PR the doc updates following the post-merge feedback / state reconciliation after PR #135 (Bundle i) and PR #136 (§8.D / num_ctx 8192) merged.

## Rank 97 added (PR #136 T2-#1 adopted)

Registers in the todo the Tier 2 #1 item ✅ adopted in the PR #136 post-merge feedback (a test verifying serialization of the `with_num_ctx(X)` override value; Effort S / Adoption Risk None).

- docs/todo-summary.md: add the rank 97 row to the table (Tier 2 / S / no dependencies)
- docs/todo6.md: add the detailed entry (motivation / design decisions / work plan / completion criteria)

This is a test gap found independently by the post-merge-feedback agent, not a CodeRabbit nitpick. The existing num_ctx_is_serialized_into_request_body only verifies the default value (8192) and cannot catch a silent degrade if the with_num_ctx(X) wiring breaks.

## Stale-doc cleanup for the resolved Phase d gate

With §8.D (= num_ctx 8192) landed in PR #136, the places in analysis.md describing it as "the final gate before Phase d" went stale. Updated the following locations:

- L5 (status banner): add the §8.D completion note; rewrite 'final gate' to 'awaiting kickoff'
- L129 (§1 Phase c+ remaining final gate): rewrite to 'landed' + preserve the root-cause pivot history
- L153 (§2 §8.E Phase d prerequisites): ✅ completion mark + pivot history
- L185 (§4 resume checklist): '§8.D v4 revision loop' → 'Phase d dogfood'
- L200 (§4 what to do next): restructure into 3 stages: Phase d kickoff prep / real dogfood / result aggregation

## Phase d prerequisites confirmed

(a) [lint_screen] config silent-failure prevention ✅ PR #135
(b) reproducible measurement via scale-aware fixtures ✅ PR #135
(c) global rule for the cross-file partial-fix anti-pattern ✅ PR #135
(d) first-line fix for the JSON completeness problem ✅ PR #136 (num_ctx 8192)

→ Phase d kickoff is possible. The dogfood itself is long-running across 3-5 real PRs.

* docs(todo-summary): sync the recommended-execution-order summary's update date to 2026-05-10 (CodeRabbit Minor #r3213575272)

Rank 97 was added on 2026-05-10, but the heading's "(updated 2026-04-29)" had gone stale; fixed. External references go through the adjacent `<a id="recommended-order-summary"></a>`, so the anchor does not break (per coding-style.md § dated-heading anchors — prefer stable identifiers).

Adopted from CodeRabbit PR #137 review #r3213575272.
Summary
Lands §8.D (the first-line fix for the JSON completeness problem on large diffs), the final gate before starting Phase d.

The todo was originally named "§8.D v4 prompt revision loop", but root-cause verification via raw Ollama output dumps of the JSON schema breakdown observed in the PR #135 (Bundle i) dogfood showed the cause was not prompt design but exceeding the client-side `num_ctx` default (4096). We pivoted and completed §8.D by extending `num_ctx` in `lib-ollama-client` to 8192.

Changes
- docs/local-llm-offload-analysis.md: reflect Bundle i completion (PR "feat(cli-finding-classifier, cli-push-runner): Bundle i — required pre-Phase-d follow-up (ranks 91 + 92 + 93)" #135 landed); synced 4 locations: status banner / §1 Phase c+ / §2 §8.E / §4 resume checklist
- src/lib-ollama-client/src/lib.rs:
  - Add OllamaClient.num_ctx field + DEFAULT_NUM_CTX = 8192 (mistral:7b theoretically supports 32K; 8192 balances safety margin against inference cost)
  - Serialize num_ctx into GenerateOptions
  - Add the with_num_ctx builder (for future, even longer prompts)
  - Tests: num_ctx_defaults_and_overrides_apply / num_ctx_is_serialized_into_request_body (mockito asserts on the request body)

Empirical verification (raw output dump)
For eval13 / eval15, which fell back in PR #135, raw mistral output was captured with __dump_raw_ollama.sh (scratch, gitignored):

- eval13: prompt_eval_count=4096 (default num_ctx ceiling) → the schema definition section was truncated out of the context window and the model generated unrelated directory-tree JSON
- eval15: prompt_eval_count=4096 → the schema partially survived and the model emitted near-miss field names (rule_id / message / start_line)

prompt_eval_count is identical for both evals (= the context window cap), confirming that prompt-design improvements cannot fix this.

Dogfood results (mistral:7b / temperature=0)

| Eval | Before (default) | After (num_ctx=8192) |
|---|---|---|
| eval13 | missing field 'screen_decision' | decision match |
| eval15 | missing field 'severity' at line 38 | decision match |

The remaining 2 disagreements (eval5: multi-issue / eval10: nesting boundary) are LLM-side limitations known since the Phase b' v3 prompt was adopted, unrelated to num_ctx. Asymptotic improvement room remains, but the Phase d kickoff prerequisites are met.
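The raw-dump check above can be mirrored programmatically: Ollama's /api/generate response reports prompt_eval_count, and when that value pins at the configured num_ctx, the prompt tail (here, the schema section) was silently dropped before generation. The helper below is hypothetical, not part of the PR:

```rust
// Hypothetical diagnostic: prompt_eval_count pinned at num_ctx means the
// context window was exhausted and the end of the prompt was truncated.
fn prompt_truncated(prompt_eval_count: u64, num_ctx: u64) -> bool {
    prompt_eval_count >= num_ctx
}

fn main() {
    // eval13 / eval15 before the fix: both pinned at the default 4096 cap.
    assert!(prompt_truncated(4096, 4096));
    // With num_ctx raised to 8192, the same prompts fit with headroom.
    assert!(!prompt_truncated(4096, 8192));
    println!("truncation check ok");
}
```

A check like this could turn a silent schema breakdown into an explicit warning before JSON parsing even runs.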
Phase d kickoff prerequisites now met

PR #135 (Bundle i) + this PR complete the following 4 items:

- (a) [lint_screen] config silent-failure prevention
- (b) reproducible measurement via scale-aware fixtures
- (c) global rule for the cross-file partial-fix anti-pattern
- (d) first-line fix for the JSON completeness problem (this PR, num_ctx 8192)

Next: the go/no-go decision for Phase d (PR-based real-environment dogfood).
Test plan
- cargo test -p lib-ollama-client — 10 tests pass (including the 2 new tests)
- cargo test -p cli-finding-classifier --test lint_screen_evals -- --ignored — 1 test pass / agreement 13/15 = 86.7% (verdict GO)
- pnpm build:all — all exes build

Out of scope
Summary by CodeRabbit
Release notes

New features

Documentation