feat(lib-ollama-client): num_ctx overflow detection diagnostic warn log (Phase A / rank 98) #142

Merged
aloekun merged 3 commits into master from feature/phase-a-num-ctx-overflow-detection
May 11, 2026

aloekun (Owner) commented May 11, 2026

Summary

Lands Phase A (rank 98: num_ctx overflow detection diagnostic warn log), which sits on the critical path to Phase d retirement. Dogfooding alongside the Phase A implementation confirmed the Phase B root cause (num_ctx truncation, reaching 100%). The next action is settled as Phase C (raise DEFAULT_NUM_CTX from 8192 to 16384).

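Changes

Commit 1: docs only (carry-over from the previous turn)

Registers ranks 106-108 (items adopted from the PR #141 post-merge feedback) in docs/todo5.md and docs/todo-summary.md, and rewrites the "What to do next" section of analysis.md into a critical-path-to-retirement structure (Phases A-E). States the retirement condition (deleting that file) explicitly, compresses the old dogfood outcomes into a one-line table, and moves Bundle c-1 and similar items into a section outside the critical path.

Commit 2: Phase A implementation + rank 98 cleanup + analysis.md updated with the Phase A/B results

src/lib-ollama-client/src/lib.rs (+198 lines)

New API: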

#[derive(Debug, Clone, Default)]
pub struct OllamaMetadata {
    pub prompt_eval_count: Option<u32>,
    pub eval_count: Option<u32>,
    pub num_ctx: Option<u32>,
}

pub trait OllamaApi {
    fn generate_raw_json(&self, prompt: &str) -> Result<String, OllamaError>;

    // Default implementation, so existing stub impls need no changes
    fn generate_with_metadata(&self, prompt: &str) -> Result<(String, OllamaMetadata), OllamaError> {
        let raw = self.generate_raw_json(prompt)?;
        Ok((raw, OllamaMetadata::default()))
    }
}
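
To illustrate why existing stubs keep compiling, here is a minimal sketch of a stub that implements only generate_raw_json and inherits the default generate_with_metadata. The `OllamaError` newtype, the `canned` field on `StubOllama`, and the extra `PartialEq` derive are assumptions made for this example; the crate's real definitions are not shown in this PR.

```rust
use std::fmt;

// Hypothetical stand-in: the crate's real `OllamaError` is not shown in this PR.
#[derive(Debug)]
pub struct OllamaError(pub String);

impl fmt::Display for OllamaError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "ollama error: {}", self.0)
    }
}

impl std::error::Error for OllamaError {}

// `PartialEq` is added here only so the example can compare values.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct OllamaMetadata {
    pub prompt_eval_count: Option<u32>,
    pub eval_count: Option<u32>,
    pub num_ctx: Option<u32>,
}

pub trait OllamaApi {
    fn generate_raw_json(&self, prompt: &str) -> Result<String, OllamaError>;

    // Default implementation: wraps the raw call and reports empty metadata.
    fn generate_with_metadata(
        &self,
        prompt: &str,
    ) -> Result<(String, OllamaMetadata), OllamaError> {
        let raw = self.generate_raw_json(prompt)?;
        Ok((raw, OllamaMetadata::default()))
    }
}

// Hypothetical test stub; only the required method is implemented.
pub struct StubOllama {
    pub canned: String,
}

impl OllamaApi for StubOllama {
    fn generate_raw_json(&self, _prompt: &str) -> Result<String, OllamaError> {
        Ok(self.canned.clone())
    }
}
```

Calling `generate_with_metadata` on such a stub returns the canned string together with an all-`None` metadata value, which is the behavior the stub_trait_default_returns_empty_metadata test checks.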

The generate_json helper (typed parse) now goes through generate_with_metadata. On a parse error it calls emit_overflow_diagnostic, and overflow_hint (a pure, testable function) returns a hint when prompt_eval_count >= 90% of num_ctx.
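
The threshold check can be sketched as a pure function. This is an illustration only: the signature (two `Option<u32>` arguments), the constant name `OVERFLOW_RATIO_THRESHOLD`, and the English hint wording are assumptions, and the real overflow_hint may differ.

```rust
// Threshold taken from the PR description: hint once the prompt
// fills at least 90% of the context window.
// The constant name is hypothetical.
const OVERFLOW_RATIO_THRESHOLD: f64 = 0.90;

/// Returns a hint when `prompt_eval_count` is at or above 90% of `num_ctx`.
/// Returns `None` when either value is missing, so stubs that report
/// empty metadata never produce a hint.
pub fn overflow_hint(prompt_eval_count: Option<u32>, num_ctx: Option<u32>) -> Option<String> {
    let used = prompt_eval_count?;
    let cap = num_ctx?;
    if cap == 0 {
        // Avoid dividing by zero on a nonsensical setting.
        return None;
    }
    let ratio = f64::from(used) / f64::from(cap);
    if ratio >= OVERFLOW_RATIO_THRESHOLD {
        // Illustrative wording, not the crate's actual message.
        Some(format!(
            "prompt_eval_count is at {:.1}% of num_ctx; consider raising num_ctx via `with_num_ctx`",
            ratio * 100.0
        ))
    } else {
        None
    }
}
```

With the values used in the tests, 7400 / 8192 is about 90.3% and yields a hint, while missing metadata yields none. Keeping the check free of I/O is what makes it unit-testable without capturing stderr.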

OllamaClient overrides generate_with_metadata: the request_envelope helper fetches the whole envelope and passes prompt_eval_count / eval_count into the metadata. num_ctx is filled in from the client's setting.
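
The mapping from envelope to metadata might look like the sketch below. The `GenerateResponse` field layout and the helper name `metadata_from_envelope` are assumptions for illustration, and the real code deserializes the envelope from the HTTP response rather than constructing it by hand.

```rust
#[derive(Debug, Clone, Default, PartialEq)]
pub struct OllamaMetadata {
    pub prompt_eval_count: Option<u32>,
    pub eval_count: Option<u32>,
    pub num_ctx: Option<u32>,
}

// Assumed shape of the deserialized /api/generate envelope.
// Token counts are optional because the server may omit them.
#[derive(Debug, Clone)]
pub struct GenerateResponse {
    pub response: String,
    pub prompt_eval_count: Option<u32>,
    pub eval_count: Option<u32>,
}

/// Hypothetical helper: copies token counts from the envelope and takes
/// `num_ctx` from the client's own setting, since the envelope does not
/// echo it back.
pub fn metadata_from_envelope(envelope: &GenerateResponse, client_num_ctx: u32) -> OllamaMetadata {
    OllamaMetadata {
        prompt_eval_count: envelope.prompt_eval_count,
        eval_count: envelope.eval_count,
        num_ctx: Some(client_num_ctx),
    }
}
```

Missing counts stay `None` while num_ctx is always populated, matching the behavior described for the two metadata tests.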

warn log format (stderr):

[lib-ollama-client] WARN: Ollama JSON output may be truncated.
  parse_error: <serde error>
  prompt_eval_count: <N> (vs num_ctx: <M>)
  eval_count: <K>, response_length: <L> chars
  hint: prompt_eval_count has reached <ratio>% of num_ctx. Raising num_ctx should resolve this (override with `with_num_ctx`)
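
A formatter producing this layout could be sketched as follows. The function name, its parameter list, and the "unknown" placeholder for missing counts are assumptions; the PR only shows the output format, not how emit_overflow_diagnostic builds it.

```rust
// Renders an optional count; "unknown" for a missing value is an assumption.
fn show(value: Option<u32>) -> String {
    value.map_or_else(|| "unknown".to_string(), |n| n.to_string())
}

/// Hypothetical formatter for the warn log. Building the text separately
/// from printing it keeps the layout testable; a caller would pass the
/// result to `eprint!`.
pub fn format_overflow_diagnostic(
    parse_error: &str,
    prompt_eval_count: Option<u32>,
    num_ctx: Option<u32>,
    eval_count: Option<u32>,
    response: &str,
    hint: Option<&str>,
) -> String {
    let mut out = String::from("[lib-ollama-client] WARN: Ollama JSON output may be truncated.\n");
    out.push_str(&format!("  parse_error: {parse_error}\n"));
    out.push_str(&format!(
        "  prompt_eval_count: {} (vs num_ctx: {})\n",
        show(prompt_eval_count),
        show(num_ctx)
    ));
    // Length is counted in characters, not bytes, to match the "chars" label.
    out.push_str(&format!(
        "  eval_count: {}, response_length: {} chars\n",
        show(eval_count),
        response.chars().count()
    ));
    // The hint line is omitted entirely when no hint applies.
    if let Some(h) = hint {
        out.push_str(&format!("  hint: {h}\n"));
    }
    out
}
```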

Tests (16/16 pass)

5 new + 11 existing:

  • metadata_carries_prompt_eval_count_when_provided: uses mockito to serve an envelope containing prompt_eval_count: 1234 and asserts that it is propagated to the metadata
  • metadata_handles_missing_eval_counts_gracefully: asserts that a missing prompt_eval_count becomes None, and that num_ctx reflects the client setting
  • stub_trait_default_returns_empty_metadata: asserts that the trait's default impl yields Default::default() metadata (StubOllama compatibility)
  • emit_overflow_diagnostic_includes_hint_when_prompt_eval_count_near_cap: confirms the hint is emitted at prompt_eval_count 7400 / num_ctx 8192 (90.3%)
  • emit_overflow_diagnostic_skips_hint_when_metadata_absent: no hint with default metadata

todo cleanup

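  • docs/todo6.md: removed the rank 98 detail entry (-54 lines)
  • docs/todo-summary.md: removed the rank 98 row

analysis.md updated: Phase A complete + Phase B root cause confirmed

  • Marked the Phase A row ✅ complete, with a one-paragraph summary of the implementation
  • Marked the Phase B row ✅ complete, stating that the root cause is decisively num_ctx truncation
  • Phase C is settled on the C-1 route (raising num_ctx); C-2 (model swap etc.) is removed as out of scope

Phase B dogfood result (verified on a real machine, captured in the smoke run after landing commit 2 of this PR)

Replayed PR #141 (P-3 = 187-line mixed diff) against the deployed cli-finding-classifier: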

[lib-ollama-client] WARN: Ollama JSON output may be truncated.
  parse_error: missing field `screen_decision` at line 20 column 1
  prompt_eval_count: 8192 (vs num_ctx: 8192)
  eval_count: 199, response_length: 479 chars
  hint: prompt_eval_count has reached 100% of num_ctx. Raising num_ctx should resolve this (override with `with_num_ctx`)

Decisive root-cause finding: full truncation occurs at prompt_eval_count = num_ctx = 8192, so mistral cannot finish its JSON output and the screen_decision field goes missing. The lint-screen prompt plus the 187-line diff fills the context completely.

Test plan

  • cargo test -p lib-ollama-client: 16/16 pass (5 new + 11 existing)
  • cargo build --release -p lib-ollama-client clean
  • cargo build --release -p cli-finding-classifier clean (dependent crate)
  • pnpm deploy:hooks to update the linter and related hooks
  • Redeployed cli-finding-classifier.exe to .claude/ (target/release/cli-finding-classifier.exe → .claude/cli-finding-classifier.exe)
  • Replayed PR #141 (P-3, "test(custom-lint): rule⑥ self-exclusion regression test + Bundle g-1 stale cleanup (Phase d P-3 moved up)"): the warn log is emitted to stderr, and prompt_eval_count: 8192 (vs num_ctx: 8192) = 100% was confirmed on a real machine
  • takt pre-push-review pass (12m 41s)
  • markdownlint clean (checked by the PostToolUse hook after each edit)

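Out of scope (handled in Phase C and later)

  • Phase C (raise DEFAULT_NUM_CTX from 8192 to 16384): with the root cause confirmed, this is done separately from this PR. It covers a RAM impact assessment, a regression test over the 15 lint-screen evals, and a smoke dogfood to confirm the fallback rate drops
  • Capturing stderr in the push-runner log: lint_screen.rs (src/cli-push-runner/src/stages/lint_screen.rs) currently does not write stderr to .takt/lint-screen-report.md when the classifier exits 0. Seeing the warn log via push-runner needs a separate PR (here the diagnostic was verified by invoking the classifier manually)
  • Deploying to derived projects: lib-ollama-client is specific to this repo, so no deploy is needed for now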
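
For the Phase C item above, the per-client override that the hint points to can be sketched as a builder method. The names `DEFAULT_NUM_CTX`, `OllamaClient`, and `with_num_ctx` come from this PR, but the constructor signature, the fields, and the accessors below are assumptions for illustration.

```rust
// Current default per this PR; Phase C proposes raising it.
pub const DEFAULT_NUM_CTX: u32 = 8192;

// Hypothetical, reduced client: the real struct also carries
// connection settings that are not shown in this PR.
#[derive(Debug, Clone)]
pub struct OllamaClient {
    model: String,
    num_ctx: u32,
}

impl OllamaClient {
    /// Creates a client that uses the default context window.
    pub fn new(model: impl Into<String>) -> Self {
        Self {
            model: model.into(),
            num_ctx: DEFAULT_NUM_CTX,
        }
    }

    /// Overrides the context window for this client only.
    pub fn with_num_ctx(mut self, num_ctx: u32) -> Self {
        self.num_ctx = num_ctx;
        self
    }

    pub fn num_ctx(&self) -> u32 {
        self.num_ctx
    }

    pub fn model(&self) -> &str {
        &self.model
    }
}
```

A caller that hits the overflow warning could construct its client as `OllamaClient::new("mistral:7b").with_num_ctx(16384)` without waiting for the default to change.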

Related

Summary by CodeRabbit

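Release notes

  • New Features

    • Generation can now capture metadata, detect overflow based on token counts, and emit a detailed diagnostic log.
  • Documentation

    • Reorganized and updated the analysis flow along the Phase A-E critical path.
    • Updated the execution-order summary and added priority tasks (106-108). The old num_ctx overflow detection entry has been removed.
    • Made minor documentation formatting additions.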

aloekun added 2 commits May 11, 2026 14:20
…og + rank 98 cleanup + analysis.md Phase A complete / Phase B root cause confirmed (Phase A)

coderabbitai Bot commented May 11, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 5c795f34-3cc6-47c9-9829-6e995f0d1f50

📥 Commits

Reviewing files that changed from the base of the PR and between 1c2054b and 229c532.

📒 Files selected for processing (4)
  • docs/local-llm-offload-analysis.md
  • docs/todo-summary.md
  • docs/todo5.md
  • docs/todo6.md
✅ Files skipped from review due to trivial changes (2)
  • docs/todo5.md
  • docs/todo6.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • docs/local-llm-offload-analysis.md

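📝 Walkthrough

This PR adds types, methods, and logic to the Ollama client that diagnose num_ctx overflow when JSON parsing fails, and documents the staged Phase A-E remediation schedule. It computes a threshold-based hint from the raw response metadata, prints it to stderr, and registers the related tasks in the task roadmap.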

Changes

Overflow Diagnosis & Repair

| Layer | File(s) | Summary |
| --- | --- | --- |
| Metadata & Public Types | src/lib-ollama-client/src/lib.rs | Adds the OllamaMetadata struct (prompt_eval_count, eval_count, num_ctx) and OllamaApi::generate_with_metadata. |
| Response Envelope Schema | src/lib-ollama-client/src/lib.rs | Adds optional prompt_eval_count / eval_count to GenerateResponse to capture token information from the Ollama response. |
| Request & Envelope Helper | src/lib-ollama-client/src/lib.rs | Implements the request_envelope helper, unifying the /api/generate request and GenerateResponse deserialization. |
| Overflow Diagnostic Logic | src/lib-ollama-client/src/lib.rs | Adds overflow_hint / emit_overflow_diagnostic, printing a truncation hint to stderr when the token ratio exceeds the threshold (e.g. 90%). |
| OllamaClient Implementation | src/lib-ollama-client/src/lib.rs | Refactors the trait implementation around generate_raw_json / generate_with_metadata. |
| JSON Parse Integration | src/lib-ollama-client/src/lib.rs | Rewrites generate_json on top of generate_with_metadata and calls emit_overflow_diagnostic on JSON parse failure to print the diagnostic log. |
| Tests | src/lib-ollama-client/src/lib.rs | Adds and extends unit tests covering metadata extraction (present/missing), hint behavior around the threshold, and the trait default stub. |
| Critical Path Documentation | docs/local-llm-offload-analysis.md | Describes Phases A-E (diagnosis → root-cause identification → fix → verification → adoption decision / retirement) and updates the dogfood signal log and parallel tasks. |
| Task Roadmap | docs/todo-summary.md, docs/todo5.md, docs/todo6.md | Adds PR #141 related tasks (self-exclusion guard, anti-pattern addendum to development-workflow, publishing the Tier 2 disguise detection table in CLAUDE.md) and removes the num_ctx overflow detection task from todo6.md. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • aloekun/claude-code-hook-test#138: closely related to the "num_ctx overflow detection" implemented by this PR (capturing prompt_eval_count/eval_count and generate_with_metadata).
  • aloekun/claude-code-hook-test#136: a PR that modifies the num_ctx settings/serialization in src/lib-ollama-client/src/lib.rs; related at the code level.
  • aloekun/claude-code-hook-test#133: overlapping edits to the docs/todo-* files; related from the task-list update perspective.
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title directly reflects the Phase A num_ctx overflow detection diagnostic and accurately matches the PR content. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/local-llm-offload-analysis.md`:
- Line 200: The markdown link
"[docs-governance.md](https://github.com/aloekun/claude-code-hook-test/blob/master/docs/adr/)"
is pointing to the wrong URL; update the link in
docs/local-llm-offload-analysis.md (the retirement workflow line) so the link
target matches the link text (docs-governance.md) and points to the actual
governance document location (replace the "/docs/adr/" URL with the correct path
to docs-governance.md in the repo).

In `@docs/todo-summary.md`:
- Around line 75-77: The table rows currently point to todo5.md but the
file-wide policy in the intro says new details should go to docs/todo6.md; pick
one canonical target and make the file consistent: either change the three table
entries that reference "todo5.md" to "docs/todo6.md" (or vice‑versa), and update
the introductory policy text to match that choice; also search for other
occurrences of "todo5.md" or "docs/todo6.md" in this document and adjust
cross‑references so all links and the intro policy are consistent with the
selected canonical file.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 2ec3bd1e-646f-481f-9d43-565d4df2da72

📥 Commits

Reviewing files that changed from the base of the PR and between 967ab9e and 1c2054b.

📒 Files selected for processing (5)
  • docs/local-llm-offload-analysis.md
  • docs/todo-summary.md
  • docs/todo5.md
  • docs/todo6.md
  • src/lib-ollama-client/src/lib.rs

Comment thread docs/local-llm-offload-analysis.md Outdated
Comment thread docs/todo-summary.md Outdated

aloekun commented May 11, 2026

@coderabbitai review

coderabbitai Bot commented May 11, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@aloekun aloekun merged commit 7b9e64d into master May 11, 2026
1 check passed
@aloekun aloekun deleted the feature/phase-a-num-ctx-overflow-detection branch May 11, 2026 08:41
aloekun added a commit that referenced this pull request May 11, 2026
…t cause fix) (#143)

* docs(todo): add ranks 109-113 (adopting PR #142 T2-#1+#3 and T3-#1+#3+#4)

* feat(lib-ollama-client): DEFAULT_NUM_CTX 8192 → 32768: Phase C fully resolves the context overflow (rank 98 root cause fix)

* docs: adopt CR PR #143 Minor: fix the rank 109 Tier label from 🚀Tier1 to 🔧Tier2 (consistent with the todo6.md details)
aloekun added a commit that referenced this pull request May 11, 2026
…4) + reflect workflow gap

First Phase D dogfood PR (D-1). Bundles three ADR adoptions, landing the Phase D plan, and reflecting the workflow gap.

Rank 112 (PR #142 T3-#3): add 2 sections to ADR-038
  - Diagnostic logging scope: eprintln (assumes a CLI) and the conditions for migrating to structured logging
  - 90% threshold rationale: basis for the conservative choice + plan to evaluate precision/recall when Phase D completes

Rank 113 (PR #142 T3-#4): add metrics override decision criteria to ADR-027
  - Drawing the line between an incidental change and a responsibility change
  - Override description format (three items: Override / Reason / Rationale)

Rank 114 (PR #143 T3-#1): create ADR-040
  - Measured mistral:7b context sizes (8K 512MB 5-20s ↔ 32K 2GB 30-90s)
  - Rationale for the 3.33x step_timeout proportionality factor
  - num_ctx selection flow chart
  - Added to the CLAUDE.md ADR index
  - Shortened the dogfood evolution comment at lib-ollama-client/src/lib.rs L128-139 to a 4-line reference

Phase D plan landed + workflow gap (new):
  - Added the D-1/D-2/D-3 PR structure, measurement procedure, and anticipated risks to analysis.md
  - Reflected the workflow gap found when starting D-1 (jj auto-snapshot vs session-only opt-in) in the measurement procedure
  - Registered the env var override (LINT_SCREEN_ENABLED) in the backlog in todo8.md / todo-summary.md (rank 115)
  - D-1 itself skips dogfood; real dogfood runs in D-2 / D-3 after the env var override lands

D-1 lint_screen status: skipped. The session-only opt-in workflow in Phase D guide §1 fundamentally conflicts with jj auto-snapshot,
so the cli-push-runner implementation of the array env var override (rank 115) must land before D-2.