aloekun · May 4, 2026 · May 3, 2026
diff --git a/.takt/facets/instructions/fix.md b/.takt/facets/instructions/fix.md
@@ -78,3 +78,24 @@ New files (post-only) are reported `metrics_check: skipped` and do not block fix
 | reopened (recurrence fixed) | {N} |
 | persists (carried over, not addressed this iteration) | {N} |
 | misdirected (suggestion pointed at a read-only zone, skipped) | {N} |
+
+## Convergence verdict (REQUIRED — Phase 3 / #C-2 fix-trust shortcut)
+
+After completing fixes, evaluate the gate above and emit one of two verdicts. The next workflow step is selected from this verdict, so it must accurately reflect the gate state.
+
+- **fully_resolved** — `persists == 0` AND `misdirected == 0`. All findings of this iteration were either fixed or correctly skipped. No remaining work for the analyze step to re-examine.
+- **partial** — `persists > 0` OR `misdirected > 0`. Some findings carried over (still need fixing in a later iteration) or were skipped due to misdirection (and need to be reported). Re-analysis is required.
+
+Place the verdict at the **end of your report** as a single bare line in this exact form (no surrounding quotes, no trailing punctuation):
+
+```text
+convergence_verdict: fully_resolved
+```
+
+or:
+
+```text
+convergence_verdict: partial
+```
+
+**Honesty constraint**: This verdict gates whether the analyze step runs again. Reporting `fully_resolved` while leaving findings unaddressed bypasses the safety re-check. If you are uncertain whether a finding was truly resolved (e.g., you applied a fix but did not verify the build passes), emit `partial` so the analyze step can re-evaluate.
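For illustration only (not part of this patch): a minimal Python sketch of how the verdict rule and the bare-line format described above could be checked mechanically. The names `derive_verdict` and `extract_verdict` are hypothetical, and tolerating surrounding whitespace on the final line is an assumption; the patch itself states the rule only in prose.

```python
import re
from typing import Optional

# Exact bare-line form required by the instruction: no quotes, no trailing punctuation.
VERDICT_RE = re.compile(r"convergence_verdict: (fully_resolved|partial)")


def derive_verdict(persists: int, misdirected: int) -> str:
    """Map the two gate counters to a verdict (hypothetical helper)."""
    if persists == 0 and misdirected == 0:
        return "fully_resolved"
    return "partial"


def extract_verdict(report: str) -> Optional[str]:
    """Return the verdict only if it is the last non-empty line of the report.

    Assumption: leading/trailing whitespace on that line is tolerated.
    Anything else (quotes, trailing punctuation, text after it) yields None.
    """
    lines = [line for line in report.splitlines() if line.strip()]
    if not lines:
        return None
    match = VERDICT_RE.fullmatch(lines[-1].strip())
    return match.group(1) if match else None
```

A caller could compare `extract_verdict(report)` against `derive_verdict(persists, misdirected)` computed from the gate table and treat any mismatch as `partial`, which is in the spirit of the honesty constraint.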
diff --git a/.takt/facets/instructions/review-security.md b/.takt/facets/instructions/review-security.md
@@ -1,10 +1,4 @@
-Review the changes from a security perspective. Check for the following vulnerabilities:
-- Injection attacks (SQL, command, XSS)
-- Authentication and authorization flaws
-- Data exposure risks (hardcoded secrets, API keys, tokens)
-- Cryptographic weaknesses
-- Unsafe code without safety comments (Rust)
-- Path traversal risks
+Focus on **security anomaly detection** in the changed diff. Categorical vulnerability classes (injection / auth flaws / data exposure / crypto weakness / unsafe code / path traversal) remain in scope, but the bar for raising a finding is the same as the simplicity facet: an articulable concern with a concrete exploit path, not a checklist tick.
 
 ## Obtaining the diff
 
@@ -21,16 +15,38 @@ Before evaluating the change, **read the following project documents** using the
 
 **Important:**
 - Do not treat documented precedence rules, extension points, or configuration override behavior as vulnerabilities by themselves.
-- To issue a blocking finding, make the exploit path concrete: who controls what input, and what newly becomes possible.
+- To raise a blocking finding, make the exploit path concrete: who controls what input, and what newly becomes possible.
 
-## Judgment Procedure
+## Vulnerability dimensions (use as memory aid, not a checklist)
 
-1. Review the change diff and extract issue candidates
-2. For each candidate, verify the concrete exploit path
-   - Which actor controls the input or configuration
-   - Whether the change enables new privilege, data access, code execution, or prompt modification
-3. For each detected issue, classify it as blocking or non-blocking
-4. If there is even one blocking issue, judge as REJECT
+The following classes remain reviewable, but flag them only when you can construct a concrete exploit path:
+
+- **Injection attacks**: SQL, command, XSS — actor-controlled input flowing into an interpreter without escaping
+- **Authentication and authorization flaws**: Missing checks, scope confusion, privilege escalation paths
+- **Data exposure risks**: Hardcoded secrets, API keys, tokens, sensitive logs
+- **Cryptographic weaknesses**: Broken algorithms, missing randomness, weak key handling
+- **Unsafe code without safety comments** (Rust): `unsafe` blocks lacking `// SAFETY:` justification
+- **Path traversal**: Unsanitized file paths reaching filesystem APIs
+
+## Anomaly mode (preferred entry point)
+
+Read the diff once, end-to-end, before consulting the dimensions list. If a pattern reads as **unusual / unexplained / hard to justify** from a security standpoint, that is your primary signal. The dimensions above are a memory aid for what to do with that signal, not a substitute for it.
+
+For each finding, articulate:
+
+- **What is unusual or risky**
+- **Who controls the relevant input or configuration**
+- **What newly becomes possible** (data access, privilege, code execution, prompt modification)
+
+If you cannot articulate the third bullet, the finding is speculative — downgrade or drop it.
+
+## Judgment procedure
+
+1. Read the diff from `.takt/review-diff.txt`
+2. Read straight through. Note any pattern that triggers a security concern
+3. For each candidate, verify the concrete exploit path (input control, what becomes possible)
+4. Classify each verified concern as blocking or non-blocking
+5. If there is even one blocking concern with a concrete exploit path, judge as REJECT
 
 ## Docs-only changes: trust boundary criterion
 

diff --git a/.takt/facets/instructions/review-simplicity.md b/.takt/facets/instructions/review-simplicity.md
@@ -1,43 +1,60 @@
-Focus on reviewing **code simplicity** within the changed diff only.
+Focus on **anomaly detection** in the changed diff -- patterns that look unusual, unexplained, or out of step with the surrounding codebase. Do NOT enumerate against a fixed checklist; the deterministic layer already handles structural metrics.
 
 ## Obtaining the diff
 
 The diff has been pre-collected by push-runner (Rust exe) and saved to `.takt/review-diff.txt`.
 **Read this file first** using the Read tool. This is the authoritative review target.
 Do NOT run `git diff` or `jj diff` yourself -- the file already contains the correct diff scope.
 
-## Scope constraint
+## Determinism layer guarantees (do NOT duplicate)
+
+The following dimensions are enforced by deterministic hooks at write time and by `fix-metrics-check.ps1` during fix iterations. Skip them — flagging them duplicates the deterministic layer and produces noise:
+
+- **Comment policy** (Bundle Z #B-α / `hooks-post-tool-comment-lint-rust`): Non-doc comments are blocked at PostToolUse. Existing comments in the diff have already passed the allowlist (`// SAFETY:` / `// TODO:` / rustdoc etc.).
+- **Function length** (rank 48, same hook): Functions longer than 50 lines are blocked at write time (touch-trigger ratchet; existing long functions are grandfathered until touched). New functions over 50 lines, or growth past 50 lines, cannot land in changed regions.
+- **Function metrics during fix** (Bundle Z #B-β / `fix-metrics-check.ps1`): non-doc comment count, function length, max nesting depth cannot increase per function during fix iterations. Pre/post comparison enforces this structurally.
+
+Reviewing these dimensions is duplicative. Skip them.
+
+## Anomaly criteria (subjective judgment required)
 
-Review ONLY the lines changed in the diff. Do NOT explore cross-file dependencies, call chains, or project-wide architecture. Every finding must be traceable to a specific hunk in the diff.
+Read the diff straight through. Note any pattern that prompts the reaction "this looks unusual / unexpected / hard to explain"; these are patterns that deterministic checks cannot catch:
 
-## Review criteria (all diff-local)
+- **Unexplained complexity**: Logic choices with no obvious motivation given the surrounding code; algorithm complexity that seems disproportionate to the problem
+- **Inconsistent style**: Naming or structural patterns that diverge from neighboring code without rationale
+- **Dead-on-arrival code**: Branches, parameters, or abstractions with no apparent caller or use site
+- **Hidden coupling**: Changes that silently depend on global state, environment, ordering, or undocumented invariants
+- **Missing failure paths**: Operations that can fail (I/O, parse, network, optional unwrap) with no visible error handling
+- **Non-obvious magic values**: Numeric or string literals whose meaning isn't clear from context
 
-- **Nesting depth**: Flag blocks nested >4 levels; suggest flattening via early returns or extraction
-- **Function length**: Flag functions exceeding 50 lines
-- **Early return opportunities**: Identify guard clauses that would reduce nesting
-- **Redundant / duplicate code**: Flag copy-paste patterns or unnecessarily verbose logic within the diff
-- **Magic numbers**: Flag unexplained numeric or string literals; suggest named constants
-- **YAGNI violations**: Flag speculative abstractions, unused parameters, or over-engineered patterns that serve no current need
-- **Naming clarity**: Flag ambiguous variable/function names that obscure intent
+For each anomaly, articulate **what looks unusual**, **why it caught your attention**, and **what alternative would be expected**. If you cannot articulate the "why", it likely isn't an anomaly worth flagging.
+
+## Scope constraint
+
+Review primarily within the changed diff. **Limited** cross-file lookups are permitted only to *verify* an anomaly already raised by the diff (e.g., confirming a hidden coupling, checking whether a referenced symbol exists, distinguishing dead-on-arrival code from a legitimate caller elsewhere). Do NOT use this allowance to expand into project-wide architecture review, unrelated call chains, or speculative exploration. Every anomaly finding must still be traceable to a specific hunk in the diff — cross-file evidence supports the finding, it does not become its own finding.
 
 ## Scope of DRY / YAGNI (do NOT raise findings outside this scope)
 
-The DRY and YAGNI criteria above apply **only to executable code logic**.
+DRY and YAGNI concerns raised during anomaly detection apply **only to executable code logic**.
 
 - **DRY scope**: Flag duplicated *code logic* (copy-paste functions, repeated control flow, redundant computations). Do NOT flag:
   - Documentation hierarchies that intentionally restate context (e.g., a summary table followed by detailed bullet points)
   - Repetition between docs and code (docs explain, code executes — they serve different audiences)
   - Test code mirroring production code structure (test independence > test DRY)
 - **YAGNI scope**: Flag *speculative code abstractions* (unused parameters, premature interfaces, over-engineered patterns in production code). Do NOT flag:
-  - Planning documents listing "future candidates", "Phase 2 検討", or "out of scope but worth considering" sections — these capture design intent for shared understanding, not speculative implementation
-  - ADR alternatives sections describing rejected options — these document the decision rationale
-  - Comments documenting *known constraints or limitations* of the current implementation (these are not speculation; they are recorded reality)
+  - Planning documents listing "future candidates", "Phase 2 検討", or "out of scope but worth considering" sections
+  - ADR alternatives sections describing rejected options
+  - Comments documenting *known constraints or limitations* of the current implementation
+
+If a finding cannot be tied to executable code logic, it is out of scope.
+
+## Calibration: avoid over-narrowing
 
-If a finding cannot be tied to executable code logic, it is out of scope — do not raise it.
+The shift to anomaly detection is meant to remove the duplicative checklist work, not to skip review. If reading the diff leaves you with a concrete unease that you can articulate, raise it — even if it doesn't fit a named criterion. Conversely, if you can only flag something by mechanically applying a rule, the deterministic layer already handles that case.
 
 ## Judgment procedure
 
 1. Read the diff from `.takt/review-diff.txt`
-2. For each changed hunk, check against the 7 criteria above
-3. For each detected issue, classify as blocking (significantly harms readability/maintainability) or non-blocking (minor suggestion)
-4. If there is even one blocking issue, judge as REJECT
+2. Read straight through. After the first pass, list any pattern that read as "unusual / unexpected / hard to explain"
+3. For each anomaly, classify as blocking (significant unexplained risk) or non-blocking (worth raising but not a blocker)
+4. If there is even one blocking anomaly, judge as REJECT
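For illustration only (not part of this patch): a minimal Python sketch of the judgment procedure above, assuming each anomaly is held as a simple record. `Anomaly`, `keep_articulated`, `judge`, and the `APPROVE` label are hypothetical; the instruction file names only the REJECT outcome.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Anomaly:
    hunk: str        # the diff hunk the finding is traceable to
    what: str        # what looks unusual
    why: str         # why it caught the reviewer's attention
    expected: str    # what alternative would be expected
    blocking: bool   # significant unexplained risk?


def keep_articulated(anomalies: List[Anomaly]) -> List[Anomaly]:
    """Drop findings whose 'why' cannot be articulated (empty or whitespace)."""
    return [a for a in anomalies if a.why.strip()]


def judge(anomalies: List[Anomaly], approve_label: str = "APPROVE") -> str:
    """REJECT if at least one articulated anomaly is blocking.

    The approve label is an assumption, not taken from the instruction file.
    """
    kept = keep_articulated(anomalies)
    return "REJECT" if any(a.blocking for a in kept) else approve_label
```

Filtering on the "why" field before judging mirrors the rule that an anomaly without an articulable reason is not worth flagging, so such a finding cannot trigger a REJECT on its own.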
diff --git a/.takt/workflows/post-pr-review.yaml b/.takt/workflows/post-pr-review.yaml
@@ -93,7 +93,17 @@ steps:
     pass_previous_response: false
     instruction: fix
     rules:
-      - condition: Fixes complete
+      # Phase 3 / #C-2 fix-trust shortcut: when the fix step reports the
+      # Convergence verdict `convergence_verdict: fully_resolved`, skip re-running
+      # analyze and go straight to COMPLETE. Compresses a 3-iter run into 2 iters.
+      - condition: |
+          Report ends with `convergence_verdict: fully_resolved` (Convergence
+          gate shows persists: 0 AND misdirected: 0). All findings of this
+          iteration are fully resolved with no remaining work.
+        next: COMPLETE
+      - condition: |
+          Fixes applied but `convergence_verdict: partial` (some findings
+          persist or were misdirected, re-analysis needed).
         next: analyze
       - condition: Unable to proceed with fixes
         next: supervise

diff --git a/.takt/workflows/pre-push-review.yaml b/.takt/workflows/pre-push-review.yaml
@@ -111,7 +111,17 @@ steps:
     pass_previous_response: false
     instruction: fix
     rules:
-      - condition: Fixes complete
+      # Phase 3 / #C-2 fix-trust shortcut: when the fix step reports the
+      # Convergence verdict `convergence_verdict: fully_resolved`, skip re-running
+      # reviewers and go straight to COMPLETE. Compresses the multi-iter review-fix loop.
+      - condition: |
+          Report ends with `convergence_verdict: fully_resolved` (Convergence
+          gate shows persists: 0 AND misdirected: 0). All findings of this
+          iteration are fully resolved with no remaining work.
+        next: COMPLETE
+      - condition: |
+          Fixes applied but `convergence_verdict: partial` (some findings
+          persist or were misdirected, re-review needed).
         next: reviewers
       - condition: Unable to proceed with fixes
         next: supervise
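For illustration only (not part of this patch): the rule sets in the two workflow files above are natural-language conditions, but the intended routing could be sketched deterministically as follows. `next_step` and its parameters are hypothetical, and treating a missing or unrecognized verdict as `partial` is an assumption made to stay on the conservative side.

```python
from typing import Optional


def next_step(verdict: Optional[str], recheck_step: str, can_proceed: bool = True) -> str:
    """Pick the next workflow step after the fix step.

    recheck_step is "analyze" for post-pr-review and "reviewers" for
    pre-push-review, matching the `next:` targets in the two YAML files.
    """
    if not can_proceed:
        # "Unable to proceed with fixes"
        return "supervise"
    if verdict == "fully_resolved":
        # Shortcut: skip the re-check and finish.
        return "COMPLETE"
    # "partial", or (by assumption) a missing / unrecognized verdict.
    return recheck_step
```

Falling back to the re-check step for anything other than an explicit `fully_resolved` keeps the shortcut opt-in, so a malformed report cannot skip the safety re-check.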

diff --git a/docs/pipeline-token-efficiency.md b/docs/pipeline-token-efficiency.md
@@ -454,11 +454,11 @@ Run the "Verification method" section of docs/pipeline-token-efficiency.md.
 
 | Improvement | Placement | Status | Adopted | Completing PR | Notes |
 |---|---|---|---|---|---|
-| #C-3 rate-limit skip | **PR 1** | Planned | - | - | Reuses the rate-limit detection from PR #97 |
-| #A-2 trivial PR skip | **PR 1** | Planned | - | - | Can be done standalone |
-| #B-β constrained fix instruction | **PR 2** (Bundle Z Phase 2) | Planned | 2026-05-01 | - | Must stay in sync with the exception-marker constants from PR #99 |
-| #B-γ reviewer role change | **PR 3** (Bundle Z Phase 3) | Planned | 2026-05-01 | - | Requires PR 2 dogfooding to be complete |
-| #C-2 Iter 3 short-circuit | **PR 3** (coupled with fix-trust) | Planned | - | - | Bundled with #B-γ in PR 3 |
+| #C-3 rate-limit skip | **PR 1** | Done | 2026-05-03 | #102 | Reuses the rate-limit detection from PR #97 |
+| #A-2 trivial PR skip | **PR 1** | Done | 2026-05-03 | #102 | Can be done standalone |
+| #B-β constrained fix instruction | **PR 2** (Bundle Z Phase 2) | Done | 2026-05-03 | #103 | Must stay in sync with the exception-marker constants from PR #99 |
+| #B-γ reviewer role change | **PR 3** (Bundle Z Phase 3) | In progress | 2026-05-04 | (in progress) | PR #103/104/105 dogfooding complete; switched to anomaly mode |
+| #C-2 Iter 3 short-circuit | **PR 3** (coupled with fix-trust) | In progress | 2026-05-04 | (in progress) | Bundled with #B-γ in PR 3; introduces the convergence_verdict marker |
 | #A-3 transcript filter | **Extra** | Planned | - | - | Standalone PR in spare time, or bundled when #D-4 is re-evaluated |
 | #D-4 response style simplification | (on hold) | On hold (ADR-034) | 2026-05-02 | - | Re-evaluate after Bundle Z (PR 2 + PR 3) completes |
 

diff --git a/docs/todo.md b/docs/todo.md
@@ -69,6 +69,8 @@
 | 54 | 🔧 Tier 2 | **Convert review-completion waiting to CronCreate + retire observer (Bundle b PR-2) ★ Bundle b** | todo5.md | M | After rank 53 lands (extend the Cron mechanism introduced in Bb-1 to review-completion waiting, fully eliminate the 45s polling + 5s observer polling, switch to fixed-value wakeups) |
 | 55 | 💎 Tier 3 | **Config extension + SessionStart catch-up (Bundle b PR-3) ★ Bundle b** | todo5.md | S | After ranks 53 / 54 land (move fixed values into `monitor.toml` + catch up at SessionStart on wakeups that fired while Claude Code was absent, preventing silent loss when the AI is absent) |
 | 56 | 🔧 Tier 2 | **Expand comment-lint hook tests (PR #104 T2-1+T2-2 bundle)** | todo5.md | S | None (systematize 5 UTF-8 multi-byte patterns + 6 block-comment boundary patterns as regression tests for `locate_string_line_ranges` / `span_overlaps_ranges`, locking in the PR #104 Critical/Minor fixes) |
+| 57 | 🔧 Tier 2 | **Aggregation cap integration test (adopted from PR #105 T2-1)** | todo5.md | S | None (turn the MAX_VIOLATIONS contract of `collect_all_violations` into a test; an explicit safety net against a regression that deletes `truncate(MAX)` when lints are added in the future) |
+| 58 | 🔧 Tier 2 | **Extend the post-merge-feedback findings table format (make Severity / Frequency / Adoption Risk / Recommendation required columns)** | todo5.md | S | None (the PR #105 evaluation confirmed that Effort + Rationale alone does not give stable AI adoption decisions; fixing a rubric-based format reduces evaluation cost and puts rejection rationale into words) |
 
 **Strategy**: Clear Tier 1 in 2-3 sessions → in Tier 2, advance the ADR-032 prerequisites + rate-limit + convergence cost reduction → in Tier 3, land ADR-032 + tidy up documentation. Tiers 4-5 are cleanup / external rollout and have little direct effect on daily efficiency.