Merged
21 changes: 21 additions & 0 deletions .takt/facets/instructions/fix.md
@@ -78,3 +78,24 @@ New files (post-only) are reported `metrics_check: skipped` and do not block fix
| reopened (recurrence fixed) | {N} |
| persists (carried over, not addressed this iteration) | {N} |
| misdirected (suggestion pointed at a read-only zone, skipped) | {N} |

## Convergence verdict (REQUIRED — Phase 3 / #C-2 fix-trust shortcut)

After completing fixes, evaluate the gate above and emit one of two verdicts. The next workflow step is selected from this verdict, so it must accurately reflect the gate state.

- **fully_resolved** — `persists == 0` AND `misdirected == 0`. All findings of this iteration were either fixed or correctly skipped. No remaining work for the analyze step to re-examine.
- **partial** — `persists > 0` OR `misdirected > 0`. Some findings carried over (still need fixing in a later iteration) or were skipped due to misdirection (and need to be reported). Re-analysis is required.

Place the verdict at the **end of your report** as a single bare line in this exact form (no surrounding quotes, no trailing punctuation):

```text
convergence_verdict: fully_resolved
```

or:

```text
convergence_verdict: partial
```

**Honesty constraint**: This verdict gates whether the analyze step runs again. Reporting `fully_resolved` while leaving findings unaddressed bypasses the safety re-check. If you are uncertain whether a finding was truly resolved (e.g., you applied a fix but did not verify the build passes), emit `partial` so the analyze step can re-evaluate.
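
The gate rule and the verdict line format above can be sketched in a few lines of Python. This is only an illustration, assuming the `persists` and `misdirected` counts are already available as integers; the function names and the `verified` flag are hypothetical and are not part of takt itself.

```python
import re
from typing import Optional

# Only the two verdict strings and the line format come from the instruction text;
# every name below is hypothetical.
VERDICT_LINE = re.compile(r"convergence_verdict: (fully_resolved|partial)")


def convergence_verdict(persists: int, misdirected: int, verified: bool = True) -> str:
    """Return fully_resolved only when both counters are zero and the fixes were verified."""
    if persists == 0 and misdirected == 0 and verified:
        return "fully_resolved"
    # Any carried-over or misdirected finding, or any doubt, falls back to partial.
    return "partial"


def render_verdict_line(verdict: str) -> str:
    """Format the bare final line of the report (no quotes, no trailing punctuation)."""
    return f"convergence_verdict: {verdict}"


def parse_verdict(report: str) -> Optional[str]:
    """Read the verdict from the last non-empty line; None if it is absent or malformed."""
    lines = [line.strip() for line in report.splitlines() if line.strip()]
    if not lines:
        return None
    match = VERDICT_LINE.fullmatch(lines[-1])
    return match.group(1) if match else None
```

Treating an unverified fix as `partial` mirrors the honesty constraint: when in doubt, the sketch prefers the path that triggers re-analysis.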
46 changes: 31 additions & 15 deletions .takt/facets/instructions/review-security.md
@@ -1,10 +1,4 @@
Review the changes from a security perspective. Check for the following vulnerabilities:
- Injection attacks (SQL, command, XSS)
- Authentication and authorization flaws
- Data exposure risks (hardcoded secrets, API keys, tokens)
- Cryptographic weaknesses
- Unsafe code without safety comments (Rust)
- Path traversal risks
Focus on **security anomaly detection** in the changed diff. Categorical vulnerability classes (injection / auth flaws / data exposure / crypto weakness / unsafe code / path traversal) remain in scope, but the bar for raising a finding is the same as the simplicity facet: an articulable concern with a concrete exploit path, not a checklist tick.

## Obtaining the diff

@@ -21,16 +15,38 @@ Before evaluating the change, **read the following project documents** using the

**Important:**
- Do not treat documented precedence rules, extension points, or configuration override behavior as vulnerabilities by themselves.
- To issue a blocking finding, make the exploit path concrete: who controls what input, and what newly becomes possible.
- To raise a blocking finding, make the exploit path concrete: who controls what input, and what newly becomes possible.

## Judgment Procedure
## Vulnerability dimensions (use as memory aid, not a checklist)

1. Review the change diff and extract issue candidates
2. For each candidate, verify the concrete exploit path
- Which actor controls the input or configuration
- Whether the change enables new privilege, data access, code execution, or prompt modification
3. For each detected issue, classify it as blocking or non-blocking
4. If there is even one blocking issue, judge as REJECT
The following classes remain reviewable, but flag them only when you can construct a concrete exploit path:

- **Injection attacks**: SQL, command, XSS — actor-controlled input flowing into an interpreter without escaping
- **Authentication and authorization flaws**: Missing checks, scope confusion, privilege escalation paths
- **Data exposure risks**: Hardcoded secrets, API keys, tokens, sensitive logs
- **Cryptographic weaknesses**: Broken algorithms, missing randomness, weak key handling
- **Unsafe code without safety comments** (Rust): `unsafe` blocks lacking `// SAFETY:` justification
- **Path traversal**: Unsanitized file paths reaching filesystem APIs

## Anomaly mode (preferred entry point)

Read the diff once, end-to-end, before consulting the dimensions list. If a pattern reads as **unusual / unexplained / hard to justify** from a security standpoint, that is your primary signal. The dimensions above are a memory aid for what to do with that signal, not a substitute for it.

For each finding, articulate:

- **What is unusual or risky**
- **Who controls the relevant input or configuration**
- **What newly becomes possible** (data access, privilege, code execution, prompt modification)

If you cannot articulate the third bullet, the finding is speculative — downgrade or drop it.

## Judgment procedure

1. Read the diff from `.takt/review-diff.txt`
2. Read straight through. Note any pattern that triggers a security concern
3. For each candidate, verify the concrete exploit path (input control, what becomes possible)
4. Classify each verified concern as blocking or non-blocking
5. If there is even one blocking concern with a concrete exploit path, judge as REJECT
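
As a rough illustration of the revised procedure, the sketch below models a finding with the three articulation fields and applies the "one blocking concern with a concrete exploit path means REJECT" rule. The `Finding` class, its field names, and the `APPROVE` outcome are assumptions made for this example; the instruction file itself only names REJECT.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Finding:
    # Hypothetical record mirroring the three bullets each finding must articulate.
    what_is_unusual: str
    who_controls_input: str
    what_becomes_possible: str  # left empty when the reviewer cannot articulate it
    blocking: bool = False


def has_concrete_exploit_path(finding: Finding) -> bool:
    """A finding counts as verified only if the actor and the new capability are both stated."""
    return bool(finding.who_controls_input.strip()) and bool(
        finding.what_becomes_possible.strip()
    )


def judge(findings: List[Finding]) -> str:
    """Drop speculative findings, then reject if any verified finding is blocking."""
    verified = [f for f in findings if has_concrete_exploit_path(f)]
    if any(f.blocking for f in verified):
        return "REJECT"
    return "APPROVE"  # assumed name for the non-reject outcome
```

In this sketch, a blocking finding with an empty `what_becomes_possible` is treated as speculative and does not trigger a rejection on its own.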

## Docs-only changes: trust boundary criterion

55 changes: 36 additions & 19 deletions .takt/facets/instructions/review-simplicity.md
@@ -1,43 +1,60 @@
Focus on reviewing **code simplicity** within the changed diff only.
Focus on **anomaly detection** in the changed diff -- patterns that look unusual, unexplained, or out of step with the surrounding codebase. Do NOT enumerate against a fixed checklist; the deterministic layer already handles structural metrics.

## Obtaining the diff

The diff has been pre-collected by push-runner (Rust exe) and saved to `.takt/review-diff.txt`.
**Read this file first** using the Read tool. This is the authoritative review target.
Do NOT run `git diff` or `jj diff` yourself -- the file already contains the correct diff scope.

## Scope constraint
## Determinism layer guarantees (do NOT duplicate)

The following dimensions are enforced by deterministic hooks at write time and by `fix-metrics-check.ps1` during fix iterations. Skip them — flagging them duplicates the deterministic layer and produces noise:

- **Comment policy** (Bundle Z #B-α / `hooks-post-tool-comment-lint-rust`): Non-doc comments are blocked at PostToolUse. Existing comments in the diff have already passed the allowlist (`// SAFETY:` / `// TODO:` / rustdoc etc.).
- **Function length** (rank 48, same hook): Functions longer than 50 lines are blocked at write time (touch-trigger ratchet, grandfathered until touched). New functions over 50 lines, or growth past 50 lines, cannot land in changed regions.
- **Function metrics during fix** (Bundle Z #B-β / `fix-metrics-check.ps1`): non-doc comment count, function length, max nesting depth cannot increase per function during fix iterations. Pre/post comparison enforces this structurally.

Reviewing these dimensions is duplicative. Skip them.
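
To make the "cannot increase per function" guarantee concrete, here is a minimal Python sketch of a pre/post comparison. It is not the logic of `fix-metrics-check.ps1`, whose implementation is not shown in this diff. The `FunctionMetrics` type, the `passed`/`failed` status names, and the choice to ignore functions that are new in an existing file are all assumptions. Only `skipped` for new, post-only files comes from the fix instruction.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

METRIC_FIELDS = ("non_doc_comments", "length", "max_nesting")


@dataclass(frozen=True)
class FunctionMetrics:
    # Hypothetical per-function snapshot of the three ratcheted metrics.
    non_doc_comments: int
    length: int
    max_nesting: int


def check_metrics(
    pre: Optional[Dict[str, FunctionMetrics]],
    post: Dict[str, FunctionMetrics],
) -> Tuple[str, List[str]]:
    """Compare per-function metrics before and after a fix iteration."""
    if pre is None:
        # New file (post-only): nothing to compare against.
        return "skipped", []
    regressions: List[str] = []
    for name in sorted(post):
        before = pre.get(name)
        if before is None:
            continue  # assumption: functions new to an existing file are not compared
        after = post[name]
        for metric in METRIC_FIELDS:
            old, new = getattr(before, metric), getattr(after, metric)
            if new > old:
                regressions.append(f"{name}: {metric} {old} -> {new}")
    return ("failed" if regressions else "passed"), regressions
```

Because a check of this kind is mechanical, a reviewer who repeats it adds no signal, which is the reason the instruction tells reviewers to skip these dimensions.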

## Anomaly criteria (subjective judgment required)

Review ONLY the lines changed in the diff. Do NOT explore cross-file dependencies, call chains, or project-wide architecture. Every finding must be traceable to a specific hunk in the diff.
Read the diff straight through. Note any pattern that prompted "this looks unusual / unexpected / hard to explain" — patterns deterministic checks cannot catch:

## Review criteria (all diff-local)
- **Unexplained complexity**: Logic choices with no obvious motivation given the surrounding code; algorithm complexity that seems disproportionate to the problem
- **Inconsistent style**: Naming or structural patterns that diverge from neighboring code without rationale
- **Dead-on-arrival code**: Branches, parameters, or abstractions with no apparent caller or use site
- **Hidden coupling**: Changes that silently depend on global state, environment, ordering, or undocumented invariants
- **Missing failure paths**: Operations that can fail (I/O, parse, network, optional unwrap) with no visible error handling
- **Non-obvious magic values**: Numeric or string literals whose meaning isn't clear from context

- **Nesting depth**: Flag blocks nested >4 levels; suggest flattening via early returns or extraction
- **Function length**: Flag functions exceeding 50 lines
- **Early return opportunities**: Identify guard clauses that would reduce nesting
- **Redundant / duplicate code**: Flag copy-paste patterns or unnecessarily verbose logic within the diff
- **Magic numbers**: Flag unexplained numeric or string literals; suggest named constants
- **YAGNI violations**: Flag speculative abstractions, unused parameters, or over-engineered patterns that serve no current need
- **Naming clarity**: Flag ambiguous variable/function names that obscure intent
For each anomaly, articulate **what looks unusual**, **why it caught your attention**, and **what alternative would be expected**. If you cannot articulate the "why", it likely isn't an anomaly worth flagging.

## Scope constraint

Review primarily within the changed diff. **Limited** cross-file lookups are permitted only to *verify* an anomaly already raised by the diff (e.g., confirming a hidden coupling, checking whether a referenced symbol exists, distinguishing dead-on-arrival code from a legitimate caller elsewhere). Do NOT use this allowance to expand into project-wide architecture review, unrelated call chains, or speculative exploration. Every anomaly finding must still be traceable to a specific hunk in the diff — cross-file evidence supports the finding, it does not become its own finding.

## Scope of DRY / YAGNI (do NOT raise findings outside this scope)

The DRY and YAGNI criteria above apply **only to executable code logic**.
The DRY and YAGNI dimensions in anomaly detection apply **only to executable code logic**.

- **DRY scope**: Flag duplicated *code logic* (copy-paste functions, repeated control flow, redundant computations). Do NOT flag:
- Documentation hierarchies that intentionally restate context (e.g., a summary table followed by detailed bullet points)
- Repetition between docs and code (docs explain, code executes — they serve different audiences)
- Test code mirroring production code structure (test independence > test DRY)
- **YAGNI scope**: Flag *speculative code abstractions* (unused parameters, premature interfaces, over-engineered patterns in production code). Do NOT flag:
- Planning documents listing "future candidates", "Phase 2 検討" (Phase 2 considerations), or "out of scope but worth considering" sections; these capture design intent for shared understanding, not speculative implementation
- ADR alternatives sections describing rejected options — these document the decision rationale
- Comments documenting *known constraints or limitations* of the current implementation (these are not speculation; they are recorded reality)
- Planning documents listing "future candidates", "Phase 2 検討" (Phase 2 considerations), or "out of scope but worth considering" sections
- ADR alternatives sections describing rejected options
- Comments documenting *known constraints or limitations* of the current implementation

If a finding cannot be tied to executable code logic, it is out of scope.

## Calibration: avoid over-narrowing

If a finding cannot be tied to executable code logic, it is out of scope — do not raise it.
The shift to anomaly detection is meant to remove the duplicative checklist work, not to skip review. If reading the diff leaves you with a concrete unease that you can articulate, raise it — even if it doesn't fit a named criterion. Conversely, if you can only flag something by mechanically applying a rule, the deterministic layer already handles that case.

## Judgment procedure

1. Read the diff from `.takt/review-diff.txt`
2. For each changed hunk, check against the 7 criteria above
3. For each detected issue, classify as blocking (significantly harms readability/maintainability) or non-blocking (minor suggestion)
4. If there is even one blocking issue, judge as REJECT
2. Read straight through. After the first pass, list any pattern that read as "unusual / unexpected / hard to explain"
3. For each anomaly, classify as blocking (significant unexplained risk) or non-blocking (worth raising but not a blocker)
4. If there is even one blocking anomaly, judge as REJECT
12 changes: 11 additions & 1 deletion .takt/workflows/post-pr-review.yaml
@@ -93,7 +93,17 @@ steps:
pass_previous_response: false
instruction: fix
rules:
- condition: Fixes complete
# Phase 3 / #C-2 fix-trust shortcut: when the fix step reports the
# Convergence verdict `convergence_verdict: fully_resolved`, skip re-running
# analyze and go straight to COMPLETE. This compresses a 3-iter run to 2 iters.
- condition: |
Report ends with `convergence_verdict: fully_resolved` (Convergence
gate shows persists: 0 AND misdirected: 0). All findings of this
iteration are fully resolved with no remaining work.
next: COMPLETE
- condition: |
Fixes applied but `convergence_verdict: partial` (some findings
persist or were misdirected, re-analysis needed).
next: analyze
- condition: Unable to proceed with fixes
next: supervise
12 changes: 11 additions & 1 deletion .takt/workflows/pre-push-review.yaml
@@ -111,7 +111,17 @@ steps:
pass_previous_response: false
instruction: fix
rules:
- condition: Fixes complete
# Phase 3 / #C-2 fix-trust shortcut: when the fix step reports the
# Convergence verdict `convergence_verdict: fully_resolved`, skip re-running
# reviewers and go straight to COMPLETE. This compresses the multi-iter review-fix loop.
- condition: |
Report ends with `convergence_verdict: fully_resolved` (Convergence
gate shows persists: 0 AND misdirected: 0). All findings of this
iteration are fully resolved with no remaining work.
next: COMPLETE
- condition: |
Fixes applied but `convergence_verdict: partial` (some findings
persist or were misdirected, re-review needed).
next: reviewers
- condition: Unable to proceed with fixes
next: supervise
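
The routing that the two workflow files express through natural-language conditions can be summarized with a small Python sketch. It illustrates the intended transitions only and is not how takt evaluates rules. The function name, the `can_proceed` flag, and the conservative handling of a missing verdict are assumptions.

```python
from typing import Optional

# Step that re-examines the change when the verdict is partial, per workflow file.
RE_ENTRY_STEP = {
    "post-pr-review": "analyze",
    "pre-push-review": "reviewers",
}


def next_step(workflow: str, verdict: Optional[str], can_proceed: bool = True) -> str:
    """Pick the step that follows `fix` in the given workflow."""
    if not can_proceed:
        # Corresponds to the "Unable to proceed with fixes" rule.
        return "supervise"
    if verdict == "fully_resolved":
        # Fix-trust shortcut: skip the re-check and finish.
        return "COMPLETE"
    # `partial`, or (by assumption) a missing or malformed verdict: run the re-check.
    return RE_ENTRY_STEP[workflow]
```

Sending an unreadable verdict back through re-analysis keeps the shortcut opt-in: only an explicit `fully_resolved` skips the safety re-check.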
10 changes: 5 additions & 5 deletions docs/pipeline-token-efficiency.md
@@ -454,11 +454,11 @@ Run the "Verification method" section of docs/pipeline-token-efficiency.md.

| Improvement | Placement | Status | Adopted | Completion PR | Notes |
|---|---|---|---|---|---|
| #C-3 rate-limit skip | **PR 1** | Planned | - | - | Reuses the rate-limit detection from PR #97 |
| #A-2 trivial PR skip | **PR 1** | Planned | - | - | Can be done standalone |
| #B-β constrained fix instruction | **PR 2** (Bundle Z Phase 2) | Planned | 2026-05-01 | - | Must stay in sync with the exception-marker constants from PR #99 |
| #B-γ reviewer role change | **PR 3** (Bundle Z Phase 3) | Planned | 2026-05-01 | - | Requires PR 2 dogfooding to be complete |
| #C-2 Iter 3 short-circuit | **PR 3** (coupled with fix-trust) | Planned | - | - | Bundled with #B-γ in PR 3 |
| #C-3 rate-limit skip | **PR 1** | Done | 2026-05-03 | #102 | Reuses the rate-limit detection from PR #97 |
| #A-2 trivial PR skip | **PR 1** | Done | 2026-05-03 | #102 | Can be done standalone |
| #B-β constrained fix instruction | **PR 2** (Bundle Z Phase 2) | Done | 2026-05-03 | #103 | Must stay in sync with the exception-marker constants from PR #99 |
| #B-γ reviewer role change | **PR 3** (Bundle Z Phase 3) | In progress | 2026-05-04 | (in progress) | Dogfooding on PR #103/104/105 complete; switched to anomaly mode |
| #C-2 Iter 3 short-circuit | **PR 3** (coupled with fix-trust) | In progress | 2026-05-04 | (in progress) | Bundled with #B-γ in PR 3; introduces the convergence_verdict marker |
| #A-3 transcript filter | **Extra** | Planned | - | - | Standalone PR in spare time, or bundled when #D-4 is re-evaluated |
| #D-4 response style simplification | (on hold) | On hold (ADR-034) | 2026-05-02 | - | Re-evaluate after Bundle Z (PR 2 + PR 3) completes |

2 changes: 2 additions & 0 deletions docs/todo.md
@@ -69,6 +69,8 @@
| 54 | 🔧 Tier 2 | **Move review-completion waiting to CronCreate + retire the observer (Bundle b PR-2) ★ Bundle b** | todo5.md | M | After rank 53 lands (extend the Cron mechanism introduced in Bb-1 to review-completion waiting, eliminate the 45s polling + 5s observer polling entirely, switch to fixed-value wakeups) |
| 55 | 💎 Tier 3 | **Config extension + SessionStart catch-up (Bundle b PR-3) ★ Bundle b** | todo5.md | S | After ranks 53 / 54 land (move the fixed values into `monitor.toml` + catch up at SessionStart on wakeups that fired while Claude Code was absent, preventing silent loss while the AI is absent) |
| 56 | 🔧 Tier 2 | **Expand comment-lint hook tests (PR #104 T2-1+T2-2 bundle)** | todo5.md | S | None (systematize 5 UTF-8 multi-byte patterns + 6 block-comment boundary patterns as regression tests for `locate_string_line_ranges` / `span_overlaps_ranges`, locking in the PR #104 Critical/Minor fixes) |
| 57 | 🔧 Tier 2 | **Aggregation cap integration test (adopted from PR #105 T2-1)** | todo5.md | S | None (cover the MAX_VIOLATIONS contract of `collect_all_violations` with a test; an explicit safety net against a regression that removes `truncate(MAX)` when lints are added in the future) |
| 58 | 🔧 Tier 2 | **Extend the post-merge-feedback findings table format (make Severity / Frequency / Adoption Risk / Recommendation required columns)** | todo5.md | S | None (the PR #105 evaluation confirmed that Effort + Rationale alone do not give stable AI adoption decisions; fixing a rubric-based format reduces evaluation cost and puts rejection rationale into words) |
Comment on lines +72 to +73

@coderabbitai coderabbitai Bot May 3, 2026

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

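The new tasks were added in the wrong place.

This file's operating rule assumes that no new tasks are added here, so the additions on lines 72-73 conflict with its role. Put the actual task content in docs/todo5.md and keep this file as a reference summary.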

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/todo.md` around lines 72 - 73, the two new task rows you added (titles
"Aggregation cap integration test (adopted from PR `#105` T2-1)" and "Extend the
post-merge-feedback findings table format (make Severity / Frequency / Adoption
Risk / Recommendation required columns)") were created in the wrong document;
remove these detailed rows from this summary file and instead add them to the
dedicated todo5 document as full tasks, and in this file keep only a compact
reference/summary line pointing to the dedicated todo5 entry so the summary
stays read-only.

Owner Author

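I am not adopting this finding. The docs/todo.md operating rule "do not add new tasks" is a convention that task bodies (detail sections) go in todoN.md (currently todo5.md); adding rows to the priority table is existing practice (ranks 1-56 all follow the same pattern, and ranks 57/58 in this PR were added the same way at the user's direction). The ambiguity in the wording is already known and is registered as rank 32 (★ Bundle V): "align the 'do not add new tasks' wording in the docs/todo.md header with actual practice (PR #95 T3-3)". This PR leaves it unchanged; the wording will be updated when rank 32 lands.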

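@aloekun, thank you for the explanation. Understood. The "do not add new tasks" rule in docs/todo.md is a convention that task bodies (detail sections) go in todoN.md, and adding rows to the priority table follows existing practice (ranks 1-56 all follow the same pattern). Adding ranks 57/58 in this PR is the correct operation.

Noted that the wording ambiguity is scheduled to be aligned when rank 32 (★ Bundle V) lands.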


✏️ Learnings added
Learnt from: aloekun
Repo: aloekun/claude-code-hook-test PR: 106
File: docs/todo.md:72-73
Timestamp: 2026-05-03T20:41:26.405Z
Learning: In the repository aloekun/claude-code-hook-test, docs/todo.md has a header saying "新規タスクは追加しない" (don't add new tasks), but this rule applies only to task body/detail sections (which go into todoN.md, currently todo5.md). Adding new rows to the priority summary table at the top of docs/todo.md IS the established convention — all rows 1-56 follow this pattern. The wording ambiguity is tracked as item `#32` (Bundle V, PR `#95` T3-3) to be fixed when that task lands.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.


🧠 Learnings used
Learnt from: aloekun
Repo: aloekun/claude-code-hook-test PR: 64
File: docs/adr/adr-022-automation-responsibility-separation.md:128-128
Timestamp: 2026-04-21T09:08:59.321Z
Learning: For Markdown linting in this repository, do not treat MD038 (no-space-in-code) as an error for lines inside fenced code blocks (```...``` or ~~~...~~~). MD038 is intended for inline backtick code spans, so lint warnings should only be enforced outside fenced code blocks.


**Strategy**: Clear Tier 1 in 2-3 sessions → in Tier 2, advance the ADR-032 prerequisites + rate-limit + convergence cost reduction → in Tier 3, land ADR-032 + tidy up the documentation. Tiers 4-5 are cleanup / external rollout, with little direct effect on daily efficiency.
