diff --git a/.claude/commands/review-pr.md b/.claude/commands/review-pr.md new file mode 100644 index 0000000000..b94f88f14c --- /dev/null +++ b/.claude/commands/review-pr.md @@ -0,0 +1,135 @@ +--- +description: Produce a structured PR review for cowprotocol/services. Invoked locally as `/review-pr` (diff mode, against current branch vs main) or `/review-pr ` (PR mode). Same command also runs in CI via `.github/workflows/claude-code-review.yml`, where it posts a single review comment instead of printing to terminal. Read-only in local modes; the user posts any comments manually. +--- + +Review PR: $ARGUMENTS + +Follow the instructions in `./docs/COW_PR_REVIEW_SKILL.md` to produce the review report. + +## Prologue (execute in order; abort on any failure) + +### 1. Detect mode + +The skill runs in one of three modes. Detection: + +- If the environment variable `$GITHUB_ACTIONS == "true"` → `mode = "pr-ci"`. + - `$ARGUMENTS` MUST be a PR number, URL, or `owner/repo#N` form (the workflow passes it). +- Else if `$ARGUMENTS` is non-empty → `mode = "pr-local"`. Parse the argument (see [step 2](#2-parse-arguments-pr-modes-only)). +- Else (`$ARGUMENTS` empty, not in CI) → `mode = "diff"`. No `gh` calls needed; source is `git diff $(git merge-base HEAD origin/main)..HEAD` (the actual command runs in step 3 below, after fetching `origin/main`). + +### 2. Parse $ARGUMENTS (PR modes only) + +*(Skip in `diff` mode.)* + +Accept any of: + +- A PR number: `4267` +- A full URL: `https://github.com/cowprotocol/services/pull/4267` +- An `owner/repo#N` form: `cowprotocol/services#4267` + +Default `owner/repo` to `cowprotocol/services` when only a number is given. + +Extract: ``, ``, ``. + +If unparseable, print and abort: + +``` +Usage: /review-pr # diff mode (current branch vs main) + /review-pr # PR mode + /review-pr https://github.com/owner/repo/pull/ + /review-pr owner/repo# +``` + +### 3. Diff-mode preflight + +*(Only in `mode == "diff"`.)* + +Run: + +```bash +git fetch origin main --quiet +BASE=$(git merge-base HEAD origin/main) +git diff --stat "$BASE..HEAD" +``` + +If `git diff "$BASE..HEAD"` is empty, print `No diff vs main — nothing to review.` and exit clean (not an error). + +There is **no** clean-tree check, **no** rebase, and **no** `git pull` of main. Diff scope comes from the fetched `origin/main` merge-base. + +### 4. PR-mode preflight + +*(Only in `mode == "pr-local"` or `mode == "pr-ci"`.)* + +#### 4a. Working tree (pr-local only) + +Run `git status --porcelain`. If non-empty, print it plus: + +``` +Working tree is dirty. Stash or commit your changes, then re-run. + + git stash # temporary + git stash pop # to restore later +``` + +Then **abort**. Never auto-stash. + +#### 4b. Save the current branch (pr-local only) + +Save the current branch name to `` so the report's NEXT STEPS footer can suggest `git switch ` when the review is done. + +#### 4c. Fetch base ref + +Run `git fetch origin --quiet`. This makes the diff comparable to base without rebasing or modifying the user's branch. + +#### 4d. Checkout the PR + +Run `gh pr checkout -R /`. Failure handling: + +- **`gh` not installed** → print `Install gh: https://cli.github.com/` and abort. +- **Auth error** → print `gh auth status` output plus `Run: gh auth login` and abort. +- **PR doesn't exist / wrong repo** → surface `gh`'s error verbatim and abort. +- **Fork without checkout permission** → switch to **degraded static-diff mode**: + - Set `mode_qualifier = "degraded static-diff"`. + - In the reference doc's §2, replace `gh pr diff ` with `gh pr diff --patch -R /`. + - Flag the qualifier in the report header's `Mode:` line. +- **Any other error** → surface verbatim and abort. + +In `pr-ci`, the workflow has already checked out the PR branch — skip 4d and just verify `HEAD` matches the expected ref. + +### 5. Optional-tooling probe + +Detect which optional accelerators are available in the current session: Serena MCP (`mcp__plugin_serena_serena__*`), `actionbook/rust-skills` (`rust-call-graph`, `rust-symbol-analyzer`, `rust-trait-explorer`, `rust-code-navigator`), `ra-qm-skills`, the `m04`/`m06`/`m07`/`m15` modules, `unsafe-checker`. + +Build a `loaded_context` list of whichever ones resolved. Pass it through to the reference doc; it prints verbatim in the report header's `Loaded context:` line. + +Do not abort if a skill is missing. Do not print install banners. + +### 6. Noise filter (before handoff) + +Classify each changed file: + +**Review surface (read fully):** + +- Anything under `crates/*/src/**/*.rs` (excluding `contracts/generated/**`). +- `crates/e2e/tests/**/*.rs`. +- `contracts/solidity/**/*.sol` (authored Solidity). +- Config files: `**/openapi.yml`, `database/sql/**`, `configs/**`, semantically interesting `*.toml`. + +**Noise (skip or skim):** + +- `Cargo.lock` (any). CI validates the resolution; reviewing the lockfile diff is high-cost, low-signal. +- `contracts/generated/**` and `contracts/artifacts/**` — machine-generated bindings and ABI JSON. +- Auto-generated `Cargo.toml` entries from contract-binding crates. +- Binary fixtures, ABI blobs, snapshot files. + +Report the filter in the report's `Scope:` line as `+X −Y across Z files (~N LOC human-written; rest generated/lockfile, filtered)`. + +### 7. Handoff + +Read `docs/COW_PR_REVIEW_SKILL.md` and follow it from §2 (Metadata Fetch) onward, passing through: + +- `mode` (`diff` / `pr-local` / `pr-ci`) and `mode_qualifier` if set. +- ``, ``, `` (PR modes only). +- `prior_branch` (`pr-local` only). +- `loaded_context` (optional skills detected in step 5). +- The noise-filter classification from step 6 — the reference doc uses it to decide what to read. diff --git a/.github/workflows/claude-code-review.yml b/.github/workflows/claude-code-review.yml index 5a1d3228c8..8623219db7 100644 --- a/.github/workflows/claude-code-review.yml +++ b/.github/workflows/claude-code-review.yml @@ -18,14 +18,17 @@ jobs: - name: Checkout repository uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2 with: - fetch-depth: 1 + # Full history is needed so the skill's `git blame`-based + # historic-context step can resolve any line in the diff. + fetch-depth: 0 - name: Run Claude Code Review id: claude-review uses: anthropics/claude-code-action@e0f2d99545298b87c2f984ab534af3a6534142ae # v1 with: anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }} - plugin_marketplaces: 'https://github.com/anthropics/claude-code.git' - plugins: 'code-review@claude-code-plugins' - prompt: '/code-review:code-review ${{ github.repository }}/pull/${{ github.event.pull_request.number }} --comment' - + # Invoke the in-repo slash command. The skill detects + # $GITHUB_ACTIONS=true and switches to `pr-ci` mode, posting + # a single consolidated review comment instead of streaming + # to a terminal. Skill source of truth: docs/COW_PR_REVIEW_SKILL.md + prompt: '/review-pr ${{ github.repository }}#${{ github.event.pull_request.number }}' diff --git a/docs/COW_PR_REVIEW_SKILL.md b/docs/COW_PR_REVIEW_SKILL.md new file mode 100644 index 0000000000..0a81bfa5ed --- /dev/null +++ b/docs/COW_PR_REVIEW_SKILL.md @@ -0,0 +1,383 @@ +# CoW Services PR Review Skill + +This document instructs Claude how to produce a PR review for cowprotocol/services. It is invoked by `.claude/commands/review-pr.md` (locally) or by `.github/workflows/claude-code-review.yml` (in CI). One skill, three operating modes. + +At the point this document is read, the entry-point has already determined: + +- **`mode`** — one of: + - `diff` — local, no PR yet. Source is `git diff $(git merge-base HEAD origin/main)..HEAD` (the whole feature-branch worth of work, computed against the freshly-fetched remote `main` to avoid a stale local `main`). No PR metadata, no `gh` calls. Output to terminal. + - `pr-local` — local, PR exists. Source is `gh pr diff ` plus PR metadata. Output to terminal. + - `pr-ci` — running inside `.github/workflows/claude-code-review.yml`. Source is `gh pr diff ` plus PR metadata. Output is a single review comment posted to the PR. +- **``, ``, ``** (only in `pr-local` / `pr-ci`). +- **`prior_branch`** (only in `pr-local` — the branch to print at the end). + +The mode shapes which steps run and how the report is delivered, but the *content* of the review is the same in all three. + +--- + +## Core Principles (read before executing) + +- **Signal over noise.** Report genuine concerns only. LGTM is a perfectly valid verdict and is the correct one whenever the PR is clean. The goal is not to maximise finding count — it is to be worth a senior reviewer's attention. +- **Local modes never post to GitHub.** In `diff` and `pr-local`, output is strictly terminal. No `gh pr review`, no `gh pr comment`. The user posts whatever they choose. In `pr-ci`, post exactly one consolidated review comment — no per-line spam. +- **Code is the primary source of truth.** `CLAUDE.md`, design docs in `docs/`, and this skill's own sibling docs can go stale. When a finding turns on *"X is called from Y"* or *"this field is read by Z"*, verify with `git grep` / `rg` / LSP — not by citing a doc. +- **Inverted: this PR can make existing docs / comments / its own description stale.** If a code change makes a comment, a `docs/` page, or the PR's own description no longer match the diff's current state, that is itself a finding (`Action:` → update X). +- **`git blame` before flagging code that looks unusual.** Often code looks weird because it had to. Before suggesting a "cleanup", blame the affected lines, read the originating commit message and (if any) linked PR. A comment that says *"this looks accidental, did you mean X?"* without that step risks asking the author to undo a hard-won fix. +- **Explain, don't just flag.** Each finding must give the reviewer enough context to understand *and defend* the point — not just forward AI-generated text. +- **One framing per finding:** end with either `Action:` (concrete task) or `Question:` (clarification needed). Never both. +- **Token discipline.** Don't read whole files when grep or LSP suffices. Build a codemap before reading file bodies. + +--- + +## Universal Guardrails + +Apply these as the default lens for every change. Pull in CoW-specific siblings ([§3](#3-conditional-context)) only when the diff warrants them. + +1. **Keep the public API surface minimal.** A new `pub fn`, `pub struct`, or `pub mod` that isn't required by an external caller is a Medium finding asking why it isn't `pub(crate)` / `pub(super)` / private. Smaller surface = fewer downstream breakages = freer refactor for the next person. +2. **Avoid rightward drift.** Code that's deeply nested (4+ levels, especially `match` inside `if let` inside `for` inside `async`) is hard to read and usually hides a missing extraction. Suggest an early-return, a helper, or `let-else`. +3. **One responsibility per component.** A function, struct, or module that does two unrelated things (validates *and* persists; parses *and* renders) is harder to test and to reuse. Flag with a `Question:` if you're not sure the split is artificial. +4. **Split big files.** A new file pushing past ~500 lines, or a touched file growing past ~1000 lines, is worth flagging. Suggest a sensible split (often by responsibility from #3). +5. **Avoid argument bloat.** A function taking 6+ positional arguments is a code smell — usually missing a config struct, a builder, or a method on a context object. Especially flag if the arguments are mostly being threaded through unchanged. +6. **Errors carry context.** `?` propagating a low-level error to a high-level boundary without enrichment loses the *what was the caller trying to do* information. `anyhow!("{err}")` flattens cause chains. Both are findings; severity depends on the path. + +--- + +## Execution Flow + +Steps run in this order. `diff` mode skips PR-metadata steps; `pr-ci` swaps the report sink at the end. + +1. Fetch PR metadata and linked issue(s) — [§2](#2-metadata-fetch). *(`pr-local` / `pr-ci` only.)* +2. Classify diff paths and load conditional context — [§3](#3-conditional-context). +3. Build a targeted codemap — [§4](#4-codemap-phase). +4. Synthesize the context block — [§5](#5-context-synthesis). +5. Review and produce findings — [§6](#6-review-and-severity). +6. Emit the report — [§7](#7-report-templates). Sink depends on mode. +7. Offer verification (background) — [§8](#8-verification-offer). *(Local modes only.)* +8. Print cleanup hint — [§9](#9-cleanup). *(`pr-local` only.)* + +Error behaviour is consolidated in [§10](#10-error-playbook). + +--- + +## 2. Metadata Fetch + +*(Skip in `diff` mode — there is no PR yet.)* + +Run in **parallel** (single message, multiple Bash tool calls): + +```bash +gh pr view -R / \ + --json title,body,author,labels,files,baseRefName,headRefName,additions,deletions,commits,reviewDecision,isDraft,state + +gh pr diff -R / +``` + +In **degraded static-diff mode** (fork without checkout permission, only relevant in `pr-local`), replace `gh pr diff` with: + +```bash +gh pr diff --patch -R / +``` + +### Linked issues + +Parse the PR body for `Fixes #`, `Closes #`, `Resolves #` (case-insensitive). Fetch each in parallel with the above: + +```bash +gh issue view -R / --json title,body,labels,state +``` + +If no linked issue is referenced, proceed without one. Do not manufacture one. + +### State handling + +- `state == "CLOSED"` or `"MERGED"` → proceed; prepend a one-line warning to the report. +- `isDraft == true` → proceed; prepend `Draft — author may still be iterating.` + +--- + +## 3. Conditional Context + +For each changed file, evaluate this table and accumulate a `context_docs` list. Read each matched doc once. + +| Match | Load | +|---|---| +| Any file under `database/sql/**` | `docs/review-context/database-migrations.md` | + +Add a new sibling only when: + +- A real review surfaced a CoW-specific concern the AI consistently missed, **and** +- That concern can't reasonably be inferred from the [Universal Guardrails](#universal-guardrails) plus general Rust judgment, **and** +- It can be expressed as a tight checklist (≤30 lines), not a sprawling rulebook. + +### One inline guardrail worth keeping + +When a PR touches **any `openapi.yml`**: scan for breaking changes (removed/renamed/typed-changed fields, new required request fields, narrowed enums, changed auth or HTTP method). If any are present, ask whether the goal could be achieved non-breakingly (additive field, new optional, deprecation window) and whether the affected consumer teams (Frontend, SAFE, etc.) have been notified. Severity: High when undisclosed in the PR description; Medium otherwise. + +--- + +## 4. Codemap Phase + +**Purpose:** before reading file bodies, map the symbols the diff touches, their callers, and their call sites. A codemap turns a 1000-line diff into a ~20-line mental model and catches findings that only become visible at the *shape* level (caller-count inconsistencies, dead abstractions, leaky public APIs). + +### What to map + +For each non-trivial symbol the diff adds, modifies, or deletes: + +1. **New public types / traits / functions** — fields, methods, signatures. +2. **Modified function signatures** — caller count, were all sites updated? +3. **New trait impls** — which types implement the trait? Is the trait used outside this PR? +4. **Error-type changes** — where do callers match on this error? + +### Tools (cheapest viable option first) + +| Tool | Status | When to use | +|---|---|---| +| `Grep` / `rg` with `-n` on a symbol name | Always available | Caller counts and basic location lookups. Example: `rg 'OrderValidator::new\b' crates/`. | +| `gh api` / `git blame` | Always available | Historic context on suspicious-looking lines. | +| `mcp__plugin_serena_serena__find_symbol` / `find_referencing_symbols` / `get_symbols_overview` | Optional (LSP-backed; available when Serena MCP is configured) | Precise location + kind + signature without reading the full file. | +| Skills from `actionbook/rust-skills` (`rust-call-graph`, `rust-symbol-analyzer`, `rust-trait-explorer`, `rust-code-navigator`) | Optional (installed via `npx skills add actionbook/rust-skills`) | Richer cross-crate analysis. Not present in CI by default. | +| `Read` of a full file | Last resort | Only when the diff hunks plus the cheaper tools don't pin down what you need. | + +**Fallback rule:** in CI and in `pr-local` (full checkout), every codemap step still works with `rg` and `git` against the local working tree. The optional tools are accelerators, not requirements. + +**Exception — degraded static-diff mode.** When `pr-local` falls back to static-diff (a fork the user can't `gh pr checkout`), the working tree is the *prior* branch, not the PR's. `rg crates/`, `Read` of repo files, and `git blame` on the working tree all reflect the wrong code. In this mode: + +- Take the diff content from `gh pr diff --patch -R /` (already arranged in [`commands/review-pr.md` §4d](../.claude/commands/review-pr.md)). +- Read individual files at the PR's head SHA via `gh api /repos///contents/?ref=` instead of `Read`. +- Skip `git blame`-based historic context for lines that only exist on the PR branch (the local repo doesn't know about them); blame against `origin/main` is still fine for *unchanged* surrounding code. +- Caller-count searches via `rg` are usable only for symbols that exist on `origin/main`; new symbols introduced by the PR must be looked up via the patch and `gh api`. + +The report header's `Mode:` line already flags `degraded static-diff` — a reviewer reading the output knows which constraint applied. + +### What the codemap produces + +A short block in the report that looks like: + +``` +Codemap +─────────────────────────────────────────────────────────── +New symbols (crate::module): + (kind, key fact) + ... + +Callers of : sites ( real, test). All updated ✓ +``` + +This is the raw material §5 (synthesis) and §6 (findings) work from. + +### When to skip + +Trivial PRs (docs-only, single-line bump, pure test addition) — skip. Pure refactors with no added public API — skim. Everything else — do it. + +--- + +## 5. Context Synthesis + +Produce 1–3 paragraphs combining the PR title, description, linked issue(s), file scope, and intent (feature, bugfix, refactor, dep bump, docs, test). + +### Rules + +1. **Synthesize, don't copy-paste.** If the description is five words, say so plainly: *"description is minimal; intent inferred from diff"*. +2. **Watch for description-vs-diff drift.** A PR description must describe the diff's *current* state, not the author's iteration history. If a claim in the description is no longer true of the diff, raise a finding with `Action: update the PR description to match the current diff`. Do not flag the absence of a changelog of removed/superseded behaviour — that belongs in commit history, not the description. +3. **No vague verbs.** *"This PR updates something"* is a failure. Name the component, the change, and the mechanism. + +### Shape + +- **Paragraph 1** — *what* changed. +- **Paragraph 2** — *why* (description + linked issue). +- **Paragraph 3** (if warranted) — *how* (the approach, not a line-by-line walkthrough). + +--- + +## 6. Review and Severity + +Apply, in order: + +1. The [Universal Guardrails](#universal-guardrails). +2. The conditional context from [§3](#3-conditional-context), if any was loaded. +3. CoW-services conventions from `CLAUDE.md`. +4. Optional skills from [§4 → tools table](#tools-cheapest-viable-option-first), activated by what the diff actually contains. If installed, invoke `m06-error-handling` for `Result`/`Option`/`?` changes, `m07-concurrency` for `tokio::*` / async / locking, `m04-zero-cost` for new generics or trait objects, `m15-anti-pattern` for general sanity, `unsafe-checker` for any `unsafe` (mandatory High). If they aren't installed, reason from general Rust knowledge plus the [Universal Guardrails](#universal-guardrails). + +### Use `git blame` for historic context + +Before flagging code that looks unusual, redundant, or "easy to clean up": + +```bash +git blame -L , -- # who/what/when +git log --format='%H %s' -n 1 # commit message +gh pr view <#> -R cowprotocol/services # if commit message links a PR +``` + +If the originating commit message or PR explains *why* the code is shaped that way, factor that into your finding. A "this looks accidental" comment is much weaker when blame shows a deliberate fix from six months ago. Mention what blame revealed in the finding's Explanation so the reviewer can defend the point. + +### Severity Rubric + +| Severity | Meaning | +|---|---| +| **High** | Merging as-is is a real risk: correctness bug, data loss, security issue, incompatible DB migration, auction/settlement invariant broken, likely panic, unsound `unsafe`. | +| **Medium** | Worth fixing before merge — won't break prod but will cost later. Missing error context, public-API ergonomics, unhandled edge case, n-1 rollout incompatibility, undisclosed breaking API change. | +| **Small / QoL** | Would genuinely improve the code. **Not a nit.** | + +### Anti-nit Rule (mandatory) + +- If the only reason to change it is taste or stylistic preference, **do not report it**. +- Formatting belongs to `rustfmt` / CI. Never raise a finding whose fix is "run `cargo +nightly fmt`". +- Clippy lints are a CI concern by default. **Exception:** a clippy warning inside the new code may be reported as **Small** if and only if it improves correctness or clarity — never style. +- If you're uncertain whether something is a nit, omit it. LGTM is the right verdict when the PR is clean. +- **Don't inflate severity** to look thorough. Each finding's severity is what a senior reviewer would actually call it on GitHub. Either it's worth discussing or it's omitted. + +### Per-finding shape + +1. **Title** — short noun phrase (≤ 8 words). +2. **Location** — `path/to/file.ext:line` or `path:start-end`. +3. **Explanation** — mechanism, impact, and (if relevant) what `git blame` / the codemap revealed. +4. **`Action:` OR `Question:`** — exactly one. + +When a finding has a clear mechanical fix, phrase the `Action:` so the reviewer can paste it as a GitHub *Suggested change* block. + +--- + +## 7. Report Templates + +### Terminal form (`diff` and `pr-local`) + +``` +═══════════════════════════════════════════════════════════ +PR # (or "Diff review — <branch>" in diff mode) +═══════════════════════════════════════════════════════════ +Author: @<author> (omit in diff mode) +Scope: +<add> −<del> across <N> files + (~X LOC human-written; rest generated/lockfile, filtered) +Labels: <labels or "—"> (omit in diff mode) +Base/Head: <baseRef> ← <headRef> (or "main ← <branch>" in diff mode) +Linked issue: #<N> — <title> (omit if none) +Mode: diff | pr-local | pr-ci ; full checkout | degraded static-diff + +Codemap +─────────────────────────────────────────────────────────── +<from §4 — omit if skipped> + +─────────────────────────────────────────────────────────── +CONTEXT +─────────────────────────────────────────────────────────── +<synthesis from §5> + +─────────────────────────────────────────────────────────── +VERDICT: <LGTM | Changes requested | Needs clarification> +─────────────────────────────────────────────────────────── + +FINDINGS + [High] <count> + [Medium] <count> + [Small] <count> (QoL-only; nits omitted) + +───── High ───────────────────────────────────────────────── +1. <title> + Location: <path>:<line> + + <explanation> + + Action: <task> OR + Question: <question> + +───── Medium ─────────────────────────────────────────────── +<same shape> + +───── Small / QoL ───────────────────────────────────────── +<same shape> + +─────────────────────────────────────────────────────────── +VERIFICATION (only if user opted in) +─────────────────────────────────────────────────────────── +cargo check <status> +cargo clippy <status> +cargo +nightly fmt <status> + +─────────────────────────────────────────────────────────── +NEXT STEPS (pr-local only) +─────────────────────────────────────────────────────────── +Currently on: <current branch> +Return with: git switch <prior_branch> +``` + +#### LGTM short form + +When there are zero findings at all severities, collapse everything between `VERDICT:` and `NEXT STEPS` to a single line: + +``` +VERDICT: LGTM — no blocking or notable issues. +``` + +Header, CONTEXT, and (in `pr-local`) NEXT STEPS still print. + +### Comment form (`pr-ci`) + +In CI, post **one** review comment via the action. Use the same body shape as the terminal form, minus: + +- The NEXT STEPS section (no local branch state to print). +- The VERIFICATION block (CI runs its own checks separately). +- ANSI box-drawing characters (Markdown headings instead). + +GitHub-render the body using `##`/`###` headings and fenced code blocks. Keep findings collapsible with `<details>` if there are more than ~5. + +### Verdict selection + +- **LGTM** — no High or Medium findings. +- **Changes requested** — any High, or Medium that blocks safety/correctness. +- **Needs clarification** — no High, but one or more findings use the `Question:` form. + +--- + +## 8. Verification Offer + +*(Local modes only. Skip in `pr-ci` — CI runs its own check/clippy/fmt jobs in parallel.)* + +After printing the report, ask the user: + +> "Run local verification in the background? Reply with one of: +> `check` — `cargo check --locked --workspace --all-features --all-targets` +> `clippy` — `cargo clippy --locked --workspace --all-features --all-targets -- -D warnings` +> `fmt` — `cargo +nightly fmt --all -- --check` +> `all` — run all three +> `skip` — don't run anything" + +Dispatch each selected command as a **background** Bash invocation. On completion, append the result to a VERIFICATION block. + +If `cargo check` or `cargo clippy` surfaces issues **inside files this PR modified**, fold them into Findings (compile errors → High, correctness clippy → Medium, clarity clippy → Small subject to the Anti-nit rule). Do not surface warnings from files the PR didn't touch. `fmt --check` failures are a status, never a finding. + +Tests are intentionally not in the menu — services' suite is too long-running to gate every review on. + +--- + +## 9. Cleanup + +*(`pr-local` only.)* + +The NEXT STEPS footer names the current branch and the `git switch` to return to. Print it; never run it. The user may want to stay on the PR branch. + +--- + +## 10. Error Playbook + +| Condition | Behaviour | +|---|---| +| `gh` not installed (pr modes) | Print `Install gh: https://cli.github.com/`, abort. | +| `gh` not authenticated | Print `gh auth status` output + `Run: gh auth login`, abort. | +| PR number not parseable | Print usage, abort. | +| PR doesn't exist / wrong repo | Surface `gh` error verbatim, abort. | +| PR closed/merged | Prepend warning, proceed. | +| PR is draft | Prepend warning, proceed. | +| Working tree dirty (pr-local) | Print `git status --porcelain`, instruct stash/commit, abort. **No auto-stash.** | +| `gh pr checkout` fails (fork permission) | Degrade to static-diff mode, flag in report header. | +| Optional skill not installed | Continue using `rg` / general Rust knowledge. Do **not** abort. | +| `diff` mode but no diff (branch == main) | Print `No diff vs main — nothing to review.`, exit clean. | +| Verification command fails | Surface output in VERIFICATION block; raise findings only for issues inside changed files. | + +**Rule of thumb:** never silently degrade. If a tool was missing or a step was skipped, the report's Mode/Header line should reflect it. + +--- + +## 11. Maintenance Notes + +- When the AI consistently misses a CoW-specific concern across multiple reviews, first try expressing it as one more bullet in [Universal Guardrails](#universal-guardrails). Only carve a sibling doc if it can't be expressed generically. +- When the skill produces a false-positive finding, add a one-line counter-example to the [Anti-nit Rule](#anti-nit-rule-mandatory). +- Keep this document under 500 lines. diff --git a/docs/review-context/database-migrations.md b/docs/review-context/database-migrations.md new file mode 100644 index 0000000000..c12626356b --- /dev/null +++ b/docs/review-context/database-migrations.md @@ -0,0 +1,54 @@ +# Review Context — Database Migrations + +Loaded by `COW_PR_REVIEW_SKILL.md §3` when a PR touches `database/sql/**`. + +This file extends — does not replace — the reminder in `.github/nitpicks.yml`. The nitpick bot posts the generic warning automatically; the reviewer applies judgment using the checks below. + +The checks are dictated by *how* we roll out — not by SQL itself. + +## Rollout shape + +Two facts shape every check below: + +1. **k8s rolls pods, not the cluster.** During a release, the new autopilot starts and runs Flyway *before* the old pod is shut down. For a non-trivial overlap window the previous version is processing requests against the new schema. Anything that breaks that path breaks production. +2. **Staging and production are independent and roll out at different times.** A migration lands in staging first; production follows on its own cadence. While the gap is open, our shadow environment can be running an *older* code version against the *newer* schema (or vice versa, briefly). Any change that breaks that combination breaks shadow without breaking the obvious path the author tested. + +Both points generalise beyond DB — config schema changes, request/response formats, and message-queue payloads have the same n-1 / staging-vs-production caveat. When a PR touches any of those *and* the DB, mention the cross-cutting risk in the synthesis. + +## Non-negotiables + +Each item carries its own severity. Items 1–3 are correctness/availability concerns and default to **High**; items 4–5 are author hygiene and default to **Medium**. + +1. **Reversibility.** State whether the migration is reversible. If yes, include or link the rollback script. If no, the PR explains *why* irreversibility is acceptable. Missing either → **High**. +2. **n-1 schema compatibility.** The previous app version must still function against the new schema. A migration that drops or renames a column, narrows a type, or adds a `NOT NULL` constraint without code already tolerating both shapes → **High** until the rollout plan is spelled out (typically: ship change in three releases — add → migrate code → drop). +3. **Blocking index creation on hot tables.** Use `CREATE INDEX CONCURRENTLY` on anything in the auction/settlement critical path (`orders`, `trades`, `auctions`, `settlements`, `order_events`, `auction_orders`, `quotes`, `order_quotes`). A blocking `CREATE INDEX` on one of these → **High**. +4. **Authoritative table list.** New tables must appear in `crates/database/src/lib.rs` (search for the top-level table list). Missing → **Medium** (CI may also catch it; still the author's responsibility). +5. **README.** Schema changes update the SQL or DB README, whichever is the convention at review time. Missing → **Medium**. + +## Other shapes that usually warrant **High** + +- `ALTER TABLE ... ADD COLUMN NOT NULL` on a multi-million-row table without a default — table lock plus slow backfill. Remedy: add nullable, batched backfill, then `NOT NULL` in a later migration. +- Renaming a column in a single migration rather than the three-release add → migrate → drop pattern. +- Adding a `UNIQUE` constraint without first verifying current data is unique (the migration will fail in prod if duplicates exist). +- Type changes in place on a large table without a plan for the rewrite cost. + +## Usually worth flagging as **Medium** + +- Migration scripts without comments when the *what* is non-obvious. +- New foreign keys without an explicit `ON DELETE` clause (default `NO ACTION` is often surprising). +- New tables without indexes on the columns obvious queries will filter on. + +## Not findings + +- SQL style — trailing commas, keyword casing, indentation. +- Whether the migration could be one statement instead of three. If correct, accept it. +- Migration filename style — Flyway naming is enforced mechanically. + +## Questions worth asking the author + +When in doubt, prefer a `Question:` over a flagged `Action:`: + +- *"What's the expected row count in `<table>` at rollout time?"* — drives `CONCURRENTLY` and batched-backfill decisions. +- *"Which release pairs this migration with the code change?"* — makes the n-1 reasoning explicit. +- *"How does this look on shadow during the staging→production gap?"* — surfaces cross-environment compatibility before the author finds out by paging. +- *"Has the rollback script been tested against a production-sized dataset?"* — only relevant if a rollback was included.