diff --git a/AGENTS.md b/AGENTS.md
index 586a1fd941..61bc38306f 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -175,6 +175,26 @@ If you touched `ansible/`, also follow <ansible/AGENTS.md>.
 
 `plans/`: for future work or work in progress. Once a plan is fully completed, remove it from `plans/` (delete, or squash into short tombstone/summary elsewhere).
 
+### SPEC.md — High-level component specifications
+
+`<subproject>/SPEC.md`: high-level, user-facing specification of what a
+component guarantees to its users. An outside observer should be able to read
+SPEC.md to understand what behaviors they can rely on, without having to read
+the implementation. Example: <devinfra/claude/hook_daemon/SPEC.md> describes
+what the Claude Code hook daemon provides to every session, and the
+`/web_selfcheck` skill runs the acceptance tests derived from it.
+
+SPEC.md files **must** be updated when the high-level requirements of the
+thing they cover change — a new class of credential gets injected, a new
+shim behavior is added, a new profile lands, a new promise is made to the
+agent, etc.
+
+SPEC.md files **must not** record low-level implementation details that an
+outside observer would not notice. "Credentials are refreshed regularly by
+the backend service" belongs in SPEC.md; "credentials live in
+`<session_dir>/creds.json` and rotate every 300s via RPC to
+`rotate.example.com`" does not — that belongs in README.md or in the code.
+
 ### TODO Tracking
 
 Subprojects use `TODO.md` for persistent TODO tracking. TODOs local to a specific code location are fine as inline comments; cross-cutting or project-level TODOs belong in `TODO.md`.
diff --git a/devinfra/claude/README.md b/devinfra/claude/README.md
index 97baf3db84..491157ac69 100644
--- a/devinfra/claude/README.md
+++ b/devinfra/claude/README.md
@@ -45,6 +45,13 @@ By preserving the original proxy env vars:
 - JWT token refreshes are automatically picked up
 - The bazelisk shim sends fresh credentials to the daemon on each invocation
 
+## Specification
+
+See <hook_daemon/SPEC.md> for the high-level, user-facing specification of
+what the hook daemon guarantees to every Claude Code session (on CLI and on
+web). Read that first if you want to know **what** the daemon does for the
+agent — this README covers **how** those behaviors are implemented.
+
 ## Components
 
 - **Session Start Hook**: Sets up the development environment for Claude Code web sessions
diff --git a/devinfra/claude/hook_daemon/SPEC.md b/devinfra/claude/hook_daemon/SPEC.md
new file mode 100644
index 0000000000..95fccb7e93
--- /dev/null
+++ b/devinfra/claude/hook_daemon/SPEC.md
@@ -0,0 +1,275 @@
+# Hook Daemon Specification
+
+See @README.md for architectural and implementation details.
+
+## Overview
+
+Every Claude Code session — whether it is running in Claude Code CLI on a
+developer workstation or in Claude Code on the web inside a sandboxed container
+— is paired with a **session-scoped hook daemon**. The daemon is launched by
+Claude Code's `SessionStart` hook and lives for the duration of the session.
+
+Its job is to make every session look the same to the agent:
+
+- Bazel (via `bbr` / `bazelisk` / `bb`) is wired up to BuildBuddy and works
+  out of the box: plain `bazelisk build <target>` or `bb build <target>`
+  automatically uses BuildBuddy remote execution and remote cache, with no
+  extra flags from the agent.
+- Credentials the agent needs (BuildBuddy, GitHub, Kubernetes, tracing) are
+  available in the environment without the agent having to fetch or decrypt
+  them.
+- Dangerous or footgun git operations are blocked by a PATH shim (CLI only).
+- Pre-commit lint/format hooks run automatically on Edit/Write and their
+  failures are reported back to the agent.
+- The `claude-sandbox-kubectl` MCP server is configured to talk to the
+  cluster as the expected Claude identity.
+- Hook activity is traced to the central OpenTelemetry collector.
+
+The daemon exposes two **profiles** — `cli` and `web` — that differ both in
+what the surrounding environment is expected to provide and in which
+behaviors are enabled (e.g., the git safety shim and direnv bridge are
+CLI-only; egress proxy handling, mkcert, tmpfs, managed credentials, and
+idle shutdown are web-only).
+
+## Common Behaviors (CLI and Web)
+
+These guarantees hold in every session, regardless of profile.
+
+### Credentials in the agent's environment
+
+Every Bash tool call sees:
+
+- A valid `BUILDBUDDY_API_KEY`.
+- A valid `GITHUB_TOKEN` (on web this is the PAT of the `agentydragon-agent`
+  machine user; on CLI this is whatever token the user's outer shell
+  already exposes).
+- `DUCKTAPE_OTEL_BEARER_TOKEN` for tracing.
+
+The agent should never need to decrypt SOPS files or run `gh auth login`
+manually — if a credential is missing, the daemon is broken.
+
+### Bazel / BuildBuddy
+
+- `bazelisk`, `bb`, and `bbr` on `PATH` are wired to BuildBuddy.
+- A plain `bazelisk build <target>` or `bb build <target>` automatically
+  uses BuildBuddy remote execution **and** remote cache out of the box.
+  The agent does not need to pass `--config=rbe`, `--remote_executor=...`,
+  `--remote_cache=...`, or any authentication flags.
+- BuildBuddy invocations are tagged with the session ID so they can be
+  filtered later via `bbapi invocation list --tag session:<id>`.
+- **`bbr` preserves the Bazel analysis cache across invocations, at least
+  mostly.** Running `bbr` a second time with the same inputs should
+  usually land on a warm BuildBuddy runner that has the analysis cache
+  already populated, so the second build is substantially faster than a
+  cold one. This is best-effort, not a hard guarantee — today runners are
+  shared across all concurrent sessions and may be evicted or rotated,
+  so an occasional cold hit is acceptable. A session where _every_ `bbr`
+  call is cold is broken.
+
+### Pre-commit lint & format on Edit/Write
+
+When the agent edits a file via the Edit or Write tool, the daemon runs the
+project's `pre-commit` configuration against the touched files as a
+`PostToolUse` hook:
+
+- Pure format/whitespace hooks (e.g. `ruff-format`) are **auto-applied**
+  and the fixed file is kept. See the profile YAMLs under <profiles/> for
+  the full auto-apply list.
+- Any other hook that fails blocks the edit: changes made by that hook are
+  reverted and the failure is reported back to Claude as a `PostToolUse`
+  block, so Claude can fix and retry.
+
+### OpenTelemetry tracing
+
+- Every hook invocation (SessionStart, PreToolUse, PostToolUse, background
+  tasks) is traced to the central OTLP collector with a bearer token.
+- Traces are keyed by session ID so they can be retrieved per session for
+  debugging.
+
+### MCP servers
+
+- The `claude-sandbox-kubectl` MCP server is configured and authenticated so
+  that `kubectl`-equivalent calls act as the cluster's designated Claude
+  identity (see <../../../cluster/k8s/agents/claude-rbac/>). The agent
+  should always prefer it over raw `Bash(kubectl ...)` for `claude-sandbox`
+  operations.
+
+### Observability
+
+- Hook daemon logs are available on disk under the session directory for the
+  duration of the session (exact path documented in <README.md>).
+- A session context banner surfaces warnings from setup and background tasks
+  to the agent at SessionStart.
+
+## CLI Profile
+
+The CLI profile targets a developer workstation where the user is already
+logged in and has a `nix`/`direnv`-managed devshell. The daemon therefore
+relies on the outer environment for most credentials and focuses on safety
+rails.
+
+### What the surrounding environment provides
+
+- **Credentials come from `.envrc`** (via `direnv`), which sources the
+  repo's encrypted CLI env script. `BUILDBUDDY_API_KEY`, `GITHUB_TOKEN`, and
+  `DUCKTAPE_OTEL_BEARER_TOKEN` are expected to already be in the process
+  environment when Claude Code launches. They reflect the **user's own**
+  identity (the developer's GitHub PAT, the user's own BuildBuddy key).
+- **Kubeconfig comes from `~/.kube/config`** — the user's personal cluster
+  access. The daemon does not write its own kubeconfig; MCP and `kubectl`
+  use whatever the user has.
+- **The devshell provides `bazelisk`, `bb`, `sops`, `gh`, etc.** on PATH via
+  Nix home-manager.
+
+The daemon's job is to propagate those env vars into every Bash tool call
+(since Claude Code's Bash tool does not automatically run through direnv) and
+to layer the shims on top.
+
+### CLI-specific guarantees
+
+- **Git safety shim.** A `git` wrapper on PATH blocks footgun commands:
+  - `git commit --amend` (prevents rewriting shared history)
+  - `git add -A` / `git add .` (forces explicit file listing)
+  - `git stash` (prevents accidental stash-and-forget)
+
+  Blocked commands exit non-zero with a clear error and are never run.
+  Read-only operations (`git stash list`, `git stash show`) are allowed.
+
+- **direnv bridge.** Every Bash tool call sees the env exported by the
+  nearest `.envrc`, so `cd`-ing between subprojects picks up the right
+  devshell environment.
+
+### What CLI does NOT do
+
+- Does not configure an egress proxy.
+- Does not set up tmpfs, mkcert, docker, or supervisor.
+- Does not write a kubeconfig — the user provides one.
+- Does not idle-shutdown.
+
+## Web Profile
+
+The Web profile targets Claude Code on the web, running inside a sandboxed
+container with TLS-inspecting network egress. The surrounding environment
+provides almost nothing beyond a SOPS age key and the agent's container
+identity; the daemon is responsible for standing up everything else.
+
+### What the surrounding environment provides
+
+- The **`web_setup.sh`** bootstrap script has already run and installed Nix,
+  devtools, skills, and a `settings.local.json` containing secrets needed by
+  MCP servers.
+- A **SOPS age key** (`SOPS_AGE_KEY`) that can decrypt the repo's
+  `claude-web` secrets.
+- A **TLS-inspecting egress proxy** via `HTTPS_PROXY` / `HTTP_PROXY` with a
+  periodically-refreshed JWT. The pre-installed TLS inspection CA is present
+  on the container filesystem.
+
+### Web-specific guarantees
+
+- **Managed credentials.** The daemon decrypts SOPS secrets at startup and
+  injects them into the agent's environment:
+  - `BUILDBUDDY_API_KEY` — shared BuildBuddy key for the claude-web identity.
+  - `GITHUB_TOKEN` — the **`agentydragon-agent` machine user PAT**, not a
+    personal token. The agent commits and pushes as that identity.
+  - `DUCKTAPE_OTEL_BEARER_TOKEN` — for tracing.
+  - Kubernetes service account token (see below).
+
+  Credentials that the cluster rotates (e.g., the k8s service account token)
+  are refreshed regularly so that long-running sessions keep working. The
+  agent should never see a stale token as a session drags on.
+
+- **Kubernetes access as `claude-code-web` ServiceAccount.** The daemon
+  writes a kubeconfig pointing at the cluster API, authenticated as the
+  `claude-code-web` ServiceAccount. `KUBECONFIG` is exported into the
+  agent's environment, and the `claude-sandbox-kubectl` MCP server uses the
+  same identity. Both `kubectl` and MCP calls land with the RBAC documented
+  in <../../../AGENTS.md>.
+
+- **GitHub fork remote.** If the machine user has a fork of the repo, the
+  daemon configures it as a `fork` remote with push credentials, so that
+  `git push -u fork <branch>` works without further setup.
+
+- **Network to BuildBuddy works out of the box.** `bazelisk`, `bb`, and
+  `bbr` reach BuildBuddy and the Bazel Central Registry successfully on
+  the first invocation. The agent never has to configure CA bundles,
+  truststores, proxy env vars, or `--remote_proxy` flags to get builds
+  working over the container's constrained egress.
+
+- **Container runtime.** Docker, supervisor, and mkcert are set up so that
+  integration tests that need a local container runtime work.
+
+- **Tmpfs caching.** Performance-sensitive caches (Bazel output base, Docker
+  storage when the container root is slow) are backed by tmpfs. From the
+  agent's perspective this is invisible except that Bazel is not absurdly
+  slow.
+
+- **Idle shutdown.** The daemon auto-exits after a period of inactivity so
+  stale containers don't accumulate.
+
+### What Web does NOT do
+
+- Does **not** install the git safety shim. (Web sessions push to a fork,
+  not to `devel`, so `git commit --amend` and `git add -A` are less
+  dangerous. If this ever changes, update this file.)
+
+## Observable Acceptance Criteria
+
+These are the checks that the `/web_selfcheck` skill effectively runs as an
+acceptance test against a live session. A healthy session satisfies all
+applicable criteria for its profile.
+
+### Common
+
+1. `echo $BUILDBUDDY_API_KEY` is non-empty, and a GetUser RPC against
+   `remote.buildbuddy.io` authenticates successfully.
+2. `echo $GITHUB_TOKEN` is non-empty, and `GET https://api.github.com/user`
+   returns the expected login (`agentydragon-agent` on web, the developer's
+   own login on CLI).
+3. `bbr build <trivial target> --nobuild` succeeds without TLS or proxy
+   errors.
+4. `bazelisk` on PATH points at the daemon's shim, and invocations are
+   tagged with `session:<id>` in BuildBuddy.
+5. Editing a Python file via Write/Edit triggers `ruff-format`
+   (auto-applied) and, on a lint violation, the edit is blocked with a
+   clear reason.
+6. Pre-commit runs end to end: a throwaway commit on a scratch branch
+   passes all hooks.
+7. Hook daemon logs are present and contain no unhandled exceptions from
+   SessionStart.
+8. Tracing reaches the OTLP collector (the bearer token test returns a
+   non-auth-error status).
+9. Running `bbr build <target>` twice in a row with identical inputs
+   lands on a warm runner the second time: the second invocation's
+   analysis phase is substantially faster than the first (rule of thumb:
+   warm < cold / 3). This is best-effort; a single cold-hit failure can
+   be transient (runner rotation, cache eviction), but consistent
+   cold-every-time across repeated runs is a daemon bug.
+
+### CLI only
+
+1. `git commit --amend` fails with a `[git-shim] BLOCKED` error.
+2. `git add -A` / `git add .` fails with a `[git-shim] BLOCKED` error.
+3. `git stash` (without `list`/`show`) fails with a `[git-shim] BLOCKED`
+   error.
+4. A `cd` into a subproject with its own `.envrc` propagates the expected
+   env vars into the next Bash tool call.
+
+### Web only
+
+1. `kubectl get pods -n claude-sandbox` works, authenticated as
+   `claude-code-web`. The `claude-sandbox-kubectl` MCP server returns the
+   same pod list.
+2. `$GITHUB_TOKEN` resolves to the `agentydragon-agent` machine user (not a
+   personal account).
+3. `git remote -v` shows a `fork` remote with push access to the machine
+   user's fork.
+4. `bbr build <any target>` works out of the box. No extra flags, no
+   manual `git remote` setup, no prompt asking the user to pick a remote —
+   the default remote is selected automatically, and the build reaches a
+   BuildBuddy runner on the first try.
+5. Docker is available (`docker info` succeeds) for tests that need a local
+   container runtime.
+
+Anything that fails these criteria is a daemon bug, not a user problem. The
+`/web_selfcheck` skill is the canonical runnable acceptance test for this
+spec.
diff --git a/devinfra/claude/hook_daemon/bes_interceptor.py b/devinfra/claude/hook_daemon/bes_interceptor.py
index 436b9ae890..63d41ef0f2 100644
--- a/devinfra/claude/hook_daemon/bes_interceptor.py
+++ b/devinfra/claude/hook_daemon/bes_interceptor.py
@@ -7,6 +7,10 @@
 
 If a build/test invocation lacks --remote_executor, a mailbox message is posted
 to the session nudging the agent toward `bb remote`.
+
+TODO: this nudge behavior is experimental and deliberately NOT in SPEC.md yet.
+If it proves reliable and useful, promote it to a committed behavior under
+"Common Behaviors" in <SPEC.md>.
 """
 
 from __future__ import annotations
diff --git a/skills/web_selfcheck/SKILL.md b/skills/web_selfcheck/SKILL.md
index 41e2bb6103..a84e14b03e 100644
--- a/skills/web_selfcheck/SKILL.md
+++ b/skills/web_selfcheck/SKILL.md
@@ -1,656 +1,370 @@
 ---
 name: web_selfcheck
 description: >
-  Diagnose the health of a Claude Code web session — checks whether
-  web_setup.sh ran, whether the session start hook succeeded, whether
-  the installed claude-hooks package is stale relative to the repo,
-  whether each SOPS-encrypted credential is decryptable and live-tests
-  each one against its upstream API, and whether ducktape git hooks
-  (pre-commit, commit-msg) actually pass when committing. Reports what's
-  broken and how to fix it. Use when the user asks "did setup go ok",
-  "why isn't bbr working", "check credentials", "selfcheck", "why do my
-  commits fail", or any question about web session health.
+  Diagnose the health of a Claude Code session (CLI or web) by running the
+  observable acceptance criteria in the hook daemon SPEC against the live
+  session. Also runs out-of-SPEC diagnostics (web_setup.sh freshness,
+  claude-hooks pin staleness, bbr runner recycling, git hook origin-URL
+  issues). Use when the user asks "did setup go ok", "why isn't bbr
+  working", "check credentials", "selfcheck", "why do my commits fail", or
+  any question about session health.
 ---
 
-# Web Session Selfcheck
+# Session Selfcheck
 
-Comprehensive health check for a Claude Code web session. Run all checks,
-then produce a single structured report with clear pass/fail status and
-actionable remediation steps for anything that's broken.
+This skill is the **runnable acceptance test** for the hook daemon
+specification at <../../devinfra/claude/hook_daemon/SPEC.md>.
 
-## CRITICAL: Observe Only — Do NOT Fix Without Explicit User Approval
+## How to use this skill
+
+1. **Read SPEC.md first.** It enumerates every behavior a healthy session
+   must satisfy, split into `### Common`, `### CLI only`, and
+   `### Web only` under the `## Observable Acceptance Criteria` heading.
+   The SPEC is the source of truth. If the SPEC and this skill disagree,
+   the SPEC wins — update the skill.
+2. **Detect the profile.** `$DUCKTAPE_CLAUDE_HOOKS_PROFILE` (or the file
+   path that the daemon was launched with) tells you whether to run the
+   CLI or Web criteria. Always run the Common criteria.
+3. **For each SPEC criterion, run the matching check** from the
+   "SPEC acceptance checks" section below.
+4. **Then run the out-of-SPEC diagnostics** section, which catches
+   real-world failure modes the SPEC does not (yet) codify.
+5. **Produce the report** using the format at the end.
+
+Run all `Bash` commands with `dangerouslyDisableSandbox: true` (needs
+network and filesystem access outside the sandbox). Run independent checks
+in parallel where possible.
+
+## CRITICAL: observe only — do NOT fix without explicit user approval
 
 This is a **diagnostic skill**. Treat a broken session like a crime scene:
 observe, document, and report — do not touch.
 
 **Do NOT run any remediation commands** (e.g. `web_setup.sh`, re-triggering
-SessionStart, sourcing env files, installing packages) unless the user
-explicitly says to proceed. The "Fix" blocks throughout this skill are
-documentation of what _could_ be done — they are **not instructions for you
-to execute**. Report your findings and wait for a go-ahead.
-
-**Exception — debugging workarounds**: when the session hooks are broken and
-you are actively debugging or documenting (e.g. committing this very skill),
-the following lightweight workarounds are acceptable without explicit approval:
-
-- Committing with hooks bypassed: `git commit --no-verify` (or point hooks at
-  `/dev/null` temporarily) to record diagnostic work while hooks are broken
+SessionStart, sourcing env files, installing packages, re-running
+`git remote add`) unless the user explicitly says to proceed. If a check
+fails, the fix is "the daemon is broken, tell the user" — not "let me
+work around it."
+
+**Exception — debugging workarounds**: when the session hooks are
+demonstrably broken and you are actively debugging or documenting, the
+following lightweight workarounds are acceptable without explicit
+approval:
+
+- Committing with hooks bypassed: `git commit --no-verify` to record
+  diagnostic work while hooks are broken
 - Unsetting `BUILDBUDDY_API_KEY` to force local bazel when bbr is broken
-- Creating a `bazel` wrapper in the session bin that injects `--bazelrc` when
-  the session bazelrc exists but the shim is missing
-
-Run all `Bash` commands with `dangerouslyDisableSandbox: true` (needs network
-and filesystem access outside the sandbox).
+- Creating a `bazel` wrapper in the session bin that injects `--bazelrc`
+  when the session bazelrc exists but the shim is missing
 
-## What to Check
+## SPEC acceptance checks
 
-Run all checks in parallel where possible.
-
----
+Each check below corresponds one-to-one with a numbered criterion in
+SPEC.md. Cross-reference the SPEC for the authoritative statement of what
+the check is verifying.
 
-### 1. web_setup.sh
+### Common
 
-**Goal**: confirm Nix and the `devtools` profile were installed successfully,
-and that setup ran from the current repo commit.
-
-**VM reuse warning**: Anthropic reuses Firecracker microVMs between sessions.
-`/tmp` persists across reuses, so `/tmp/web-setup.log` may be from a prior
-session running an older version of `web_setup.sh`. Always verify the setup
-commit matches the current repo HEAD — a stale setup means Nix devtools and
-skills may not match what the current code expects.
+**C1 — BUILDBUDDY_API_KEY is present and valid.**
 
 ```bash
-# Was it run at all?
-ls -la /tmp/web-setup.log 2>/dev/null || echo "MISSING"
-# Did it succeed? (last line should be "Setup complete.")
-tail -5 /tmp/web-setup.log 2>/dev/null
-# Was it recent? (mtime)
-stat -c '%y' /tmp/web-setup.log 2>/dev/null
-# Did Nix install?
-nix --version 2>/dev/null || echo "nix not found"
-# Is the devtools profile active?
-nix profile list 2>/dev/null | grep -E 'devtools|claude-hooks' | head -5 || echo "no devtools profile"
-
-# What commit did web_setup.sh run from?
-grep 'web_setup.sh commit:' /tmp/web-setup.log 2>/dev/null | tail -1 || echo "commit not logged (old web_setup.sh)"
-# Current repo HEAD
-git -C /home/user/ducktape rev-parse HEAD
-
-# Do they match?
-SETUP_COMMIT=$(grep 'web_setup.sh commit:' /tmp/web-setup.log 2>/dev/null | tail -1 | grep -oE '[0-9a-f]{40}' || echo '')
-HEAD_COMMIT=$(git -C /home/user/ducktape rev-parse HEAD)
-if [ -z "$SETUP_COMMIT" ]; then
-  echo "UNKNOWN: web_setup.sh predates commit logging — assume STALE"
-elif [ "$SETUP_COMMIT" = "$HEAD_COMMIT" ]; then
-  echo "OK: setup commit matches HEAD ($HEAD_COMMIT)"
-else
-  echo "STALE: setup ran from ${SETUP_COMMIT:0:12}, HEAD is ${HEAD_COMMIT:0:12}"
-fi
-
-# What env var keys were present when web_setup.sh ran?
-grep -A200 'environment keys' /tmp/web-setup.log 2>/dev/null | grep -B200 '^---$' | grep -v '^---'
-```
-
-**Failure indicators**: log missing, last line not "Setup complete", nix not
-found, devtools not in profile list, setup commit doesn't match HEAD.
-
-**Fix**: re-run setup from the Claude Code web UI setup command:
-
-```
-bash ducktape/devinfra/claude/web_setup.sh
+[ -n "${BUILDBUDDY_API_KEY:-}" ] || echo "FAIL: BUILDBUDDY_API_KEY unset"
+curl -s -o /dev/null -w "%{http_code}\n" \
+  -H "x-buildbuddy-api-key: ${BUILDBUDDY_API_KEY}" \
+  -H "Content-Type: application/proto" \
+  --data-binary '' \
+  https://remote.buildbuddy.io/rpc/BuildBuddyService/GetUser
 ```
 
-If re-running, note that `SOPS_AGE_KEY` is typically not available when
-`web_setup.sh` runs — all SOPS decryptions will fail. Secrets are instead
-decrypted by the session start hook daemon (which inherits `SOPS_AGE_KEY`
-from the container after k8s injects it). This is expected and not a bug.
-
----
-
-### 2. claude-hooks Daemon Version
-
-**Goal**: check whether the installed `claude-hooks` daemon matches the current
-repo code, how far behind it is, and whether any breaking changes have landed
-since the pinned commit.
+Pass: `200` (or `400` = malformed proto but auth passed). `401`/`403` =
+invalid key.
 
-Do this early — a stale daemon is often the root cause of session hook failures.
+**C2 — GITHUB_TOKEN is present and valid.**
 
 ```bash
-# Pinned commit (what's actually installed)
-python3 -c "
-import json, re
-pins = json.load(open('/home/user/ducktape/npins/sources.json'))['pins']
-url = pins.get('claude-hooks', {}).get('url', '')
-m = re.search(r'claude-hooks-([0-9a-f]+)', url)
-print('pinned commit:', m.group(1) if m else 'unknown')
-print('pin url:', url[:100])
-"
-
-# Current HEAD of the repo
-git -C /home/user/ducktape rev-parse --short HEAD
-git -C /home/user/ducktape log --oneline -1
-
-# How many devinfra/claude/ commits have landed since the pin was last updated?
-git -C /home/user/ducktape log --oneline -10 -- devinfra/claude/ npins/sources.json
-
-# When was the pin last updated?
-git -C /home/user/ducktape log --oneline -3 -- npins/sources.json
+curl -s -H "Authorization: Bearer ${GITHUB_TOKEN}" https://api.github.com/user \
+  | python3 -c "import sys,json; d=json.load(sys.stdin); print('login:', d.get('login'), 'message:', d.get('message',''))"
 ```
 
-**If the pin is behind HEAD**, diff the installed package against the repo
-source to spot breaking changes: look at the git log for `devinfra/claude/`
-since the pinned commit, read the relevant changed files in both the installed
-Nix store package and the repo, and use your judgement to assess whether any
-of those changes are likely to cause incompatibility with the running session.
-Report any suspicious mismatches (e.g. renamed classes, changed config file
-paths, new required fields, removed hooks).
-
-Also check GitHub CI on `agentydragon/ducktape` to
-understand whether an update is expected soon or something is wedged. Look at:
-
-- Recent `release.yml` runs on `devel` — did it pass after the relevant commit?
-- Recent `sync-pins.yml` runs — did it succeed and push?
-- Recent `ci.yml` runs on `devel` — any blocking test failures?
-
-For each, report: last run status, when it ran, and if failed, what failed.
+Pass on web: `login: agentydragon-agent`. Pass on CLI: the user's own
+GitHub login. `Bad credentials` = expired/revoked.
 
-**Interpretation**:
-
-- `release.yml` failing → new daemon won't be released; find the failing test/step
-- `sync-pins.yml` not running or failing → pin won't auto-update
-- CI tests failing on `devel` → release is blocked until tests are fixed
-
-**Suggested fix** (do not run — report to user):
-
-1. If `release.yml` recently passed after the relevant commit: `sync-pins.yml`
-   will update the pin within 30 min; wait or trigger manually
-2. If `release.yml` hasn't run or failed: identify and fix the blocking issue on `devel`
-3. Once pin is updated and merged, re-run `web_setup.sh`
-
----
-
-### 3. Session Start Hook
-
-**Goal**: confirm the session start hook ran successfully and wrote the env file.
+**C3 — `bbr build <trivial>` succeeds without TLS or proxy errors.**
 
 ```bash
-# Find live session ID (from hook_daemon process)
-LIVE=$(ps aux | grep hook_daemon | grep -v grep | grep -oP '(?<=--sock /tmp/claude-hd/)[^/]+' | head -1)
-echo "live session: $LIVE"
-
-# Check env file (presence + CANARY marker = success)
-head -3 ~/.claude/session-env/$LIVE/sessionstart-hook-0.sh 2>/dev/null || echo "ENV FILE MISSING"
-
-# Check daemon log for errors
-grep -E 'ERROR|Exception|FileNotFoundError|sessionstart|SessionStart' \
-  ~/.claude/session-env/$LIVE/hook-daemon/daemon.log 2>/dev/null | tail -20
-
-# Is BUILDBUDDY_API_KEY set?
-echo "BUILDBUDDY_API_KEY in env: $([ -n "${BUILDBUDDY_API_KEY:-}" ] && echo YES || echo NO)"
-
-# Is the auth proxy running?
-ls ~/.claude/session-env/$LIVE/auth-proxy/combined_ca.pem 2>/dev/null && echo "CA present" || echo "CA MISSING"
-ls ~/.claude/session-env/$LIVE/bazelrc 2>/dev/null && echo "session bazelrc present" || echo "BAZELRC MISSING"
-
-# Is the git proxy shim running? (bbr connects via 127.0.0.1:35233)
-ss -tlnp 2>/dev/null | grep 35233 || echo "git proxy NOT listening on 35233"
+cd /home/user/ducktape
+bbr build //devinfra:gazelle --nobuild 2>&1 | tail -5; echo "exit: ${PIPESTATUS[0]}"
 ```
 
-**Suggested fix if env file is missing** (do not run — report to user):
+Pass: exit 0 with no `Unable to resolve host`, `certificate`,
+`127.0.0.1:*`, or proxy errors.
 
-Re-trigger SessionStart on the live daemon:
+**C4 — bazelisk shim is active and invocations are session-tagged.**
 
 ```bash
-LIVE=<live_session_id>
-SOCK=/tmp/claude-hd/$LIVE/d.sock
-python3.13 -c "
-import json, os
-env = dict(os.environ)
-env['CLAUDE_ENV_FILE'] = f'/root/.claude/session-env/$LIVE/sessionstart-hook-0.sh'
-env['CLAUDE_PROJECT_DIR'] = '/home/user/ducktape'
-env['CLAUDE_CODE_REMOTE'] = 'true'
-print(json.dumps({'hook': {'hook_event_name': 'SessionStart', 'session_id': '$LIVE',
-  'cwd': '/home/user/ducktape', 'transcript_path': '/tmp/transcript.json',
-  'source': 'startup'}, 'env': env}))
-" | curl -s --max-time 300 --unix-socket $SOCK http://localhost/hook -X POST \
-  -H 'Content-Type: application/json' -d @-
-source ~/.claude/session-env/$LIVE/sessionstart-hook-0.sh
+ls -l "$(command -v bazelisk)"  # must point into $DUCKTAPE_CLAUDE_HOOKS_SESSION_DIR/bin
+grep -E 'build_metadata|TAGS' "$DUCKTAPE_CLAUDE_HOOKS_SESSION_DIR/bbr.bazelrc" 2>/dev/null
 ```
 
-**Manual fallback** (if daemon is down or still broken after fix):
-
-```bash
-source /home/user/ducktape/devinfra/secrets/web_env.sh
-mkdir -p ~/.config/bazel
-cat > ~/.config/bazel/buildbuddy.bazelrc <<EOF
-common --remote_header=x-buildbuddy-api-key=${BUILDBUDDY_API_KEY}
-build --config=rbe
-EOF
-```
+Pass: bazelisk resolves inside the session dir, and bbr.bazelrc contains
+`session:<id>` metadata.
 
----
+**C5 — PostToolUse pre-commit auto-apply works.**
 
-### 4. Credentials — SOPS Decryption
+Hard to automate without actually exercising Edit/Write. Report this as
+"manually verified" if you have just edited a file in this session and
+observed `ruff-format` apply, or as "NOT TESTED" otherwise. Do not fake
+this check.
 
-**Goal**: confirm `SOPS_AGE_KEY` is present and can decrypt all claude-web secrets.
+**C6 — throwaway-commit pre-commit end-to-end.**
 
 ```bash
-echo "SOPS_AGE_KEY present: $([ -n "${SOPS_AGE_KEY:-}" ] && echo YES || echo NO)"
-echo "Age public key: $(echo "${SOPS_AGE_KEY:-}" | age-keygen -y 2>/dev/null || echo 'age-keygen not found')"
-
-# Expected public key from .sops.yaml (claude-web entry):
-grep 'claude-web' /home/user/ducktape/.sops.yaml
-
-for f in \
-  secrets/buildbuddy.yaml \
-  secrets/github-pat-agentydragon-agent.yaml \
-  secrets/github-ci-read-pat.yaml \
-  secrets/alloy-otlp-bearer-token.yaml \
-  secrets/claude-web-k8s-token.yaml \
-  secrets/docker-ci/client-key.sops.pem; do
-    result=$(sops -d /home/user/ducktape/$f 2>&1 | head -1)
-    if echo "$result" | grep -qE 'FAILED|failed|error|Error'; then
-        echo "FAIL: $f — $result"
-    else
-        echo "OK:   $f"
-    fi
-done
+# Deliberately no 'set -e': a failing commit must still reach the cleanup.
+cd /home/user/ducktape
+TEST_BRANCH="selfcheck/$(date +%s)"
+git checkout -q -b "$TEST_BRANCH"
+TEST_FILE=$(mktemp /home/user/ducktape/selfcheck-XXXXX.txt)
+echo "selfcheck $(date -Iseconds)" > "$TEST_FILE"
+git add "$TEST_FILE"
+git commit -m "test: selfcheck — delete me" 2>&1
+EXIT=$?
+git checkout -q -
+git branch -D "$TEST_BRANCH"
+rm -f "$TEST_FILE"
+echo "exit: $EXIT"
 ```
 
-**Failure indicator**: any `FAIL` line, or `SOPS_AGE_KEY` not present.
+Pass: exit 0.
 
-**Fix**: if `SOPS_AGE_KEY` is missing, the session didn't receive the age
-private key at startup. This is injected from the `claude-sandbox` k8s Secret
-by the container runtime. Check whether the k8s Secret exists:
+**C7 — hook daemon logs present, no unhandled exceptions.**
 
 ```bash
-kubectl -n claude-sandbox get secret claude-web-age-key 2>/dev/null
+LOG="$DUCKTAPE_CLAUDE_HOOKS_SESSION_DIR/hook-daemon/daemon.log"
+if [ -f "$LOG" ]; then grep -cE 'ERROR|Traceback|Exception' "$LOG" || true; else echo MISSING; fi
 ```
 
----
+Pass: log exists, zero matches (or only expected warnings — use judgement).
 
-### 5. Credentials — Live API Tests
-
-Run each live test and capture HTTP status / response content.
-
-#### BuildBuddy API Key
+**C8 — OTLP tracing reaches the collector.**
 
 ```bash
-BB_KEY=$(sops -d /home/user/ducktape/secrets/buildbuddy.yaml 2>/dev/null \
-  | awk '/buildbuddy_api_key:/ {print $2}')
-# Test via bbapi (needs BUILDBUDDY_API_KEY in env)
-export BUILDBUDDY_API_KEY="$BB_KEY"
-curl -s -o /dev/null -w "%{http_code}" \
-  -H "x-buildbuddy-api-key: $BB_KEY" \
-  -H "Content-Type: application/proto" \
-  "https://remote.buildbuddy.io/rpc/BuildBuddyService/GetUser" \
-  --data-binary ''
+curl -s -o /dev/null -w "%{http_code}\n" \
+  -H "Authorization: Bearer ${DUCKTAPE_OTEL_BEARER_TOKEN}" \
+  -H "Content-Type: application/json" -d '{}' \
+  https://alloy-otlp.allegedly.works/v1/traces
 ```
 
-Expected: `200` (or `400` for malformed proto — means auth passed).
-`401`/`403` means key is invalid or expired.
+Pass: `200` or `400` (payload rejected = auth passed). `401` = token
+rotated or missing.
+
+**C9 — `bbr` preserves the analysis cache on a second identical run.**
 
-**Fix if invalid**: regenerate key in BuildBuddy org settings, re-encrypt into
-`secrets/buildbuddy.yaml`, push to `devel`, wait for `sync-pins.yml`.
+Low-precision, high-recall sensor with a high false-positive rate (runner
+rotation, BB server restart, cache eviction can all cause transient cold
+hits). Report the finding but don't act on a single failure. Stop early
+rather than spending many minutes retrying.
 
-#### GitHub Agent PAT (`agentydragon-agent`)
+**Method — cache poisoning**: append a comment to `MODULE.bazel` so the
+first build is guaranteed cold, then time an immediately-following
+identical build. The SPEC permits occasional cold-hits; only flag if
+warm ≈ cold across **two** repeated runs.
 
 ```bash
-GH_TOKEN=$(sops -d /home/user/ducktape/secrets/github-pat-agentydragon-agent.yaml 2>/dev/null \
-  | awk '/github_token:/ {print $2}')
-curl -s -H "Authorization: Bearer $GH_TOKEN" https://api.github.com/user \
-  | python3 -c "import sys,json; d=json.load(sys.stdin); print('login:', d.get('login'), 'message:', d.get('message',''))"
+cd /home/user/ducktape
+echo "# selfcheck-poison-$(date +%s)" >> MODULE.bazel
+T1_START=$(date +%s%N)
+bbr build //... --nobuild 2>&1 | tail -3
+T1_SEC=$(( ($(date +%s%N) - T1_START) / 1000000000 ))
+T2_START=$(date +%s%N)
+bbr build //... --nobuild 2>&1 | tail -3
+T2_SEC=$(( ($(date +%s%N) - T2_START) / 1000000000 ))
+git checkout -- MODULE.bazel
+echo "cold=${T1_SEC}s warm=${T2_SEC}s"
 ```
 
-Expected: `login: agentydragon-agent`.
-`Bad credentials` or `Requires authentication` means token expired/revoked.
+Interpret: `cold < 5s` = build graph too small to measure; check this
+first. Otherwise `warm < cold/3` = recycling works; `warm ≈ cold` = likely
+not recycling (re-run before diagnosing: high FP rate). If warm ≈ cold
+across two runs, inspect `bbapi invocation <id>` for runner IDs.
 
-**Fix**: generate new PAT for `agentydragon-agent` machine user (Settings →
-Developer Settings → Personal Access Tokens), re-encrypt into
-`secrets/github-pat-agentydragon-agent.yaml`, push to `devel`.
+### CLI only
 
-#### GitHub CI Read PAT (`agentydragon` fine-grained)
+**CLI1 — `git commit --amend` is blocked by the git shim.**
 
 ```bash
-GH_CI=$(sops -d /home/user/ducktape/secrets/github-ci-read-pat.yaml 2>/dev/null \
-  | awk '/github_token:/ {print $2}')
-curl -s -H "Authorization: Bearer $GH_CI" https://api.github.com/user \
-  | python3 -c "import sys,json; d=json.load(sys.stdin); print('login:', d.get('login'), 'message:', d.get('message',''))"
+(cd /tmp && git init -q selfcheck && cd selfcheck && \
+  git commit --allow-empty -m init -q 2>/dev/null && \
+  git commit --amend --no-edit 2>&1 | grep -c '\[git-shim\] BLOCKED')
+rm -rf /tmp/selfcheck
 ```
 
-Expected: `login: agentydragon`.
+Pass: `1`.
 
-#### K8s Service Account Token
+**CLI2 — `git add -A` / `git add .` is blocked.**
 
 ```bash
-K8S_TOKEN=$(sops -d /home/user/ducktape/secrets/claude-web-k8s-token.yaml 2>/dev/null \
-  | awk '/k8s_token:/ {print $2}')
-curl -sk -o /dev/null -w "%{http_code}" \
-  -H "Authorization: Bearer $K8S_TOKEN" \
-  "https://api.allegedly.works:16443/api/v1/namespaces/claude-sandbox"
+(cd /tmp && mkdir -p selfcheck2 && cd selfcheck2 && \
+  git init -q && git add -A 2>&1 | grep -c '\[git-shim\] BLOCKED')
+rm -rf /tmp/selfcheck2
 ```
 
-Expected: `200`. `401` means the token was rotated and the SOPS file wasn't
-updated yet.
+Pass: `1`.
 
-**Note**: this token is **auto-rotated by an in-cluster CronJob**. The SOPS
-file should be updated automatically. If it returns 401, check:
+**CLI3 — `git stash` is blocked (but `list` / `show` allowed).**
 
 ```bash
-# Check CronJob last run and next run
-kubectl -n default get cronjob claude-web-token-rotator -o yaml 2>/dev/null | grep -E 'lastScheduleTime|schedule'
-kubectl -n default get jobs -l app=claude-web-token-rotator 2>/dev/null | tail -5
+git stash 2>&1 | grep -c '\[git-shim\] BLOCKED'       # must be 1
+git stash list 2>&1 | grep -c '\[git-shim\] BLOCKED'  # must be 0
 ```
 
-#### OTLP Bearer Token (Grafana Alloy)
+**CLI4 — direnv bridge propagates `.envrc` exports into Bash tool calls.**
 
 ```bash
-OTLP_TOKEN=$(sops -d /home/user/ducktape/secrets/alloy-otlp-bearer-token.yaml 2>/dev/null \
-  | awk '/token:/ {print $2}')
-curl -s -o /dev/null -w "%{http_code}" \
-  -H "Authorization: Bearer $OTLP_TOKEN" \
-  -H "Content-Type: application/json" \
-  -d '{}' \
-  "https://alloy-otlp.allegedly.works/v1/traces"
+# Expect a representative env var (e.g. one set only by .envrc) to appear
+# after cd into a directory whose .envrc exports it (here: the repo root).
+cd /home/user/ducktape && env | grep -c '^DUCKTAPE_PRECOMMIT_ENFORCE_TEST_TAG='
 ```
 
-Expected: `200` or `400` (bad proto = auth passed). `401` means token
-was rotated. **Fix**: bump `rotation_version` in
-`cluster/terraform/gitops/alloy-otlp-bearer-token/`, apply with `tofu`.
+Pass: `1` (or whatever var your `.envrc` exports).
 
----
+### Web only
 
-### 6. bbr / BuildBuddy RBE
+**W1 — `kubectl` works as `claude-code-web`; MCP returns the same pods.**
 
 ```bash
-# Set key from SOPS if not already in env
-[ -z "${BUILDBUDDY_API_KEY:-}" ] && \
-  export BUILDBUDDY_API_KEY=$(sops -d /home/user/ducktape/secrets/buildbuddy.yaml 2>/dev/null \
-    | awk '/buildbuddy_api_key:/ {print $2}')
-
-# Fix origin/HEAD if missing (needed by bbr)
-git -C /home/user/ducktape remote set-head origin --auto 2>/dev/null || true
-
-# Test bbr connectivity (dry run)
-bbr build //devinfra:gazelle --nobuild 2>&1 | tail -5
+kubectl -n claude-sandbox get pods 2>&1 | tail -5
 ```
 
-**Failure: `cannot connect to 127.0.0.1:35233`** → session start hook didn't
-run; the git proxy shim is not running. Follow session start hook fix above.
+Then invoke the `mcp__claude-sandbox-kubectl__pods_list_in_namespace` tool
+with `namespace=claude-sandbox` and compare. Pass: both succeed and agree.
 
-**Failure: `Unable to resolve host remote.buildbuddy.io`** → TLS proxy/CA
-issue; session start hook didn't set up auth proxy. Follow session start hook
-fix above.
+**W2 — `$GITHUB_TOKEN` identifies as `agentydragon-agent`.**
 
----
+Covered by C2 on web — no separate check.
 
-### 6b. BuildBuddy Runner Recycling & Analysis Cache
+**W3 — `fork` remote is configured with push access.**
 
-**Goal**: Confirm that `bbr` reuses the same BuildBuddy runner VM between
-invocations so the Bazel analysis cache is warm for subsequent builds. A
-recycled runner means the second build completes analysis significantly faster
-than the first cold build.
+```bash
+git -C /home/user/ducktape remote -v | grep '^fork'
+```
 
-**Calibration note**: This is a low-precision, high-recall sensor with a high
-false-positive rate. A single "not recycling" result may be transient (runner
-rotation, BB server restart, cache eviction). Report the finding but don't act
-on a single failure. Stop early rather than spending many minutes trying.
+Pass: `fork` appears with a URL the machine user can push to. If absent
+with no warning in the context banner, the fork-remote background task
+failed.
 
-**Method — cache poisoning**: Append a comment to `MODULE.bazel` so the first
-build is guaranteed cold even if the runner was already warm, then measure
-whether the immediately-following identical build is significantly faster.
+**W4 — `bbr build <any target>` works out of the box, no manual remote
+setup, no remote picker.**
 
 ```bash
 cd /home/user/ducktape
-
-# Step 1: Poison the analysis cache.
-# Any uncommmitted change to MODULE.bazel forces Bazel to re-analyse from
-# scratch on the runner, even if it was already warm.
-POISON_MARKER="# selfcheck-poison-$(date +%s)"
-echo "$POISON_MARKER" >> MODULE.bazel
-
-# Step 2: Cold build. MODULE.bazel changed → Bazel must re-analyse everything.
-# --nobuild skips compilation; we only care about analysis time.
-# Cold analysis of //... can take 5–20 minutes on this repo — be patient.
-# If it hangs with no output for >10 minutes, abort (Ctrl-C), restore
-# MODULE.bazel with `git checkout -- MODULE.bazel`, and skip this check.
-echo "--- Cold build (poisoned MODULE.bazel) ---"
-T1_START=$(date +%s%N)
-bbr build //... --nobuild 2>&1 | tail -5
-BBR_EXIT=$?
-T1_END=$(date +%s%N)
-T1_SEC=$(( (T1_END - T1_START) / 1000000000 ))
-echo "Cold build: ${T1_SEC}s (exit $BBR_EXIT)"
-
-if [ $BBR_EXIT -ne 0 ]; then
-  git checkout -- MODULE.bazel
-  echo "SKIP: Cold build failed — skipping cache warmth check."
-else
-  # Step 3: Warm build. Same poisoned MODULE.bazel, same runner expected.
-  echo "--- Warm build (same inputs, runner should be recycled) ---"
-  T2_START=$(date +%s%N)
-  bbr build //... --nobuild 2>&1 | tail -5
-  T2_SEC=$(( ($(date +%s%N) - T2_START) / 1000000000 ))
-  echo "Warm build: ${T2_SEC}s"
-
-  # Step 4: Restore MODULE.bazel.
-  git checkout -- MODULE.bazel
-
-  # Step 5: Assess.
-  echo "--- Result: cold=${T1_SEC}s  warm=${T2_SEC}s ---"
-  if [ "$T1_SEC" -lt 5 ]; then
-    echo "AMBIGUOUS: Cold build was <5s — build graph may be too small to"
-    echo "           measure, or poisoning had no effect on this runner."
-  elif [ "$T2_SEC" -lt $(( T1_SEC / 3 )) ]; then
-    echo "OK: Warm build (${T2_SEC}s) < 1/3 of cold (${T1_SEC}s)."
-    echo "    Runner recycling + analysis cache reuse is working."
-  else
-    echo "WARN: Warm build (${T2_SEC}s) is not much faster than cold (${T1_SEC}s)."
-    echo "      Runner may not be recycled, or analysis cache is evicted."
-    echo "      High false-positive rate — verify with a second run before diagnosing."
-  fi
-fi
+# Run interactively — the test fails if bb prints a remote picker prompt
+# or errors on missing git config.
+timeout 60 bbr build //devinfra:gazelle --nobuild 2>&1 | tail -10; echo "exit: ${PIPESTATUS[0]}"
 ```
 
-**Interpreting results:**
-
-| Warm / Cold ratio | Interpretation                                                |
-| ----------------- | ------------------------------------------------------------- |
-| < 33%             | ✅ Runner recycling and analysis cache are working            |
-| 33–70%            | ⚠️ Partial benefit; runner may be rotating                    |
-| > 70%             | ⚠️ Likely no recycling — but check again, high FP rate        |
-| Cold < 5s         | ❓ Ambiguous — build graph too small or poisoning ineffective |
-
-**If consistently warm ≈ cold across two runs**: check the BuildBuddy run UI
-(`bbapi invocation <id>`) to see if runner IDs differ between invocations. If
-they do, BB is not reusing runners — this may be a BB configuration or quota
-issue. If runner IDs are the same but analysis is still slow, the Bazel server
-on the runner may be restarting between invocations.
-
----
-
-### 7. Ducktape Git Hooks
-
-**Goal**: confirm pre-commit hooks actually pass when committing. Hooks break in
-web sessions due to `bbr` using the session's local git proxy URL
-(`127.0.0.1:*`) that the BuildBuddy cloud runner can't reach, or due to
-`DUCKTAPE_PRECOMMIT_ENFORCE_TEST_TAG` being active while the commit-msg hook
-receives no argv.
+Pass: exit 0, no "which remote" prompt, no `Unable to resolve host`, no
+`127.0.0.1:*` in the runner's origin URL.
 
-#### 7a. Framework & installation
+**W5 — Docker is available.**
 
 ```bash
-# Are the git hook shims installed?
-ls -la /home/user/ducktape/.git/hooks/pre-commit \
-       /home/user/ducktape/.git/hooks/commit-msg 2>/dev/null || echo "HOOKS NOT INSTALLED"
+docker info >/dev/null 2>&1 && echo OK || echo FAIL
+```
 
-# What version of pre-commit?
-pre-commit --version 2>/dev/null || echo "pre-commit not found"
+## Out-of-SPEC diagnostics
 
-# Which backend will detect_bazel_backend() pick?
-python3 -c "
-import os, shutil
-bb = shutil.which('bbr')
-key = os.environ.get('BUILDBUDDY_API_KEY')
-print('bbr on PATH:', bool(bb))
-print('BUILDBUDDY_API_KEY set:', bool(key))
-print('=> backend:', 'BUILDBUDDY (bbr)' if bb and key else 'LOCAL (bazel)')
-"
+These are not in SPEC.md but catch real-world failure modes. Include them
+in the report under a separate "Diagnostics" heading.
 
-# Active test-tag enforcement?
-echo "DUCKTAPE_PRECOMMIT_ENFORCE_TEST_TAG=${DUCKTAPE_PRECOMMIT_ENFORCE_TEST_TAG:-<unset>}"
-```
+### D1 — `web_setup.sh` freshness (web only)
 
-#### 7b. bbr git remote URL
-
-When the backend is `BUILDBUDDY`, `bb remote` reads `git remote -v` locally and
-sends the remote URL to the cloud runner. If `origin` is `127.0.0.1:*` (Claude
-Code web session proxy), the runner can't reach it.
+Anthropic reuses Firecracker microVMs; `/tmp/web-setup.log` may be from a
+prior session running an older `web_setup.sh`. A stale setup means Nix
+devtools and skills may not match the current code.
 
 ```bash
-git -C /home/user/ducktape remote -v | head -4
-
-# Is origin a local proxy?
-ORIGIN_URL=$(git -C /home/user/ducktape remote get-url origin 2>/dev/null)
-if echo "$ORIGIN_URL" | grep -qE '127\.0\.0\.1|localhost'; then
-  echo "WARN: origin is a local proxy ($ORIGIN_URL)"
-  echo "      BuildBuddy cloud runner cannot reach this — bbr queries will fail"
-  echo "FIX:  git remote add github https://github.com/agentydragon/ducktape"
-  echo "      git config buildbuddy.remote-bazel-default-remote github"
-else
-  echo "OK: origin is externally reachable ($ORIGIN_URL)"
-fi
-
-# Is the bb default remote override already set?
-git -C /home/user/ducktape config buildbuddy.remote-bazel-default-remote 2>/dev/null \
-  && echo "(buildbuddy.remote-bazel-default-remote override is set)" \
-  || echo "(no remote override — bb will use origin)"
+ls -la /tmp/web-setup.log 2>/dev/null || echo "MISSING"
+tail -3 /tmp/web-setup.log 2>/dev/null  # last line should be "Setup complete."
+SETUP_COMMIT=$(grep 'web_setup.sh commit:' /tmp/web-setup.log 2>/dev/null | tail -1 | grep -oE '[0-9a-f]{40}')
+HEAD_COMMIT=$(git -C /home/user/ducktape rev-parse HEAD)
+[ "$SETUP_COMMIT" = "$HEAD_COMMIT" ] && echo "OK" || echo "STALE: setup=$SETUP_COMMIT head=$HEAD_COMMIT"
 ```
 
-#### 7c. Direct hook invocation
+### D2 — `claude-hooks` daemon pin staleness
 
-Test each hook binary directly, bypassing the full `git commit` flow.
+A stale installed daemon is often the root cause of session hook failures.
+Check the pinned commit against HEAD, and check whether release CI is
+passing so that the pin would move if you waited.
 
 ```bash
-# --- pytest-main-check ---
-# Unset BUILDBUDDY_API_KEY to force local bazel (avoids bbr/proxy issues)
-BUILDBUDDY_API_KEY= \
-  ducktape-pytest-main-check \
-  /home/user/ducktape/devinfra/claude/hook_daemon/test_hook_daemon.py \
-  2>&1 | tail -3
-echo "pytest-main-check exit: $?"
-
-# --- commit-msg hook ---
-# Write a sample commit message and run the hook against it.
-TMP_MSG=$(mktemp)
-cat > "$TMP_MSG" <<'MSG'
-test: dummy message for hook selfcheck
-
-https://claude.ai/code/test
-MSG
-
-DUCKTAPE_PRECOMMIT_ENFORCE_TEST_TAG= \
-  ducktape-commit-msg "$TMP_MSG" 2>&1
-echo "commit-msg exit: $?"
-rm -f "$TMP_MSG"
+python3 -c "
+import json, re
+pins = json.load(open('/home/user/ducktape/npins/sources.json'))['pins']
+url = pins.get('claude-hooks', {}).get('url', '')
+m = re.search(r'claude-hooks-([0-9a-f]+)', url)
+print('pinned:', m.group(1) if m else 'unknown')
+"
+git -C /home/user/ducktape log --oneline -5 -- devinfra/claude/ npins/sources.json
 ```
 
-**Failure: `pytest-main-check` exits 1 with `Command 'bazel' returned non-zero exit status 255`**
-
-Two sub-causes:
+If the pin is behind HEAD, diff the installed Nix store package against
+the repo source for breaking changes (renamed classes, changed config
+paths, removed hooks). Check GitHub CI on `agentydragon/ducktape`:
+recent `release.yml` and `sync-pins.yml` runs on `devel`.
 
-- _bbr backend, local proxy_: `BUILDBUDDY_API_KEY` is set and origin is `127.0.0.1:*`. The cloud runner fetches from the local proxy URL it received in `RunRequest.repo.url` and fails. Fix: add the `github` remote and set `buildbuddy.remote-bazel-default-remote` (see 7b above). Or temporarily unset `BUILDBUDDY_API_KEY` before committing.
-- _Concurrent bbr calls_: `build_bazel_index` runs two concurrent `bbr query` calls; if both race to re-initialise the runner repo, one fails. The local-proxy issue is the root cause in web sessions.
+### D3 — `git remote origin` URL reachability (web only)
 
-**Failure: `commit-msg` exits 1 with `commit message file path required as argument`**
-
-`DUCKTAPE_PRECOMMIT_ENFORCE_TEST_TAG=1` is active but `pass_filenames: false` was set (now fixed in `.pre-commit-config.yaml`). Confirm the fix is present:
+Known failure mode: `bb remote` reads `git remote -v` locally and sends
+the URL to the cloud runner. If `origin` is `127.0.0.1:*` (Claude Code
+web session proxy), the runner can't reach it and `bbr` fails on hook
+invocations.
 
 ```bash
-grep -A6 'ducktape-commit-msg' /home/user/ducktape/.pre-commit-config.yaml | grep pass_filenames \
-  && echo "WARN: pass_filenames still set" \
-  || echo "OK: pass_filenames not set"
+ORIGIN=$(git -C /home/user/ducktape remote get-url origin)
+echo "$ORIGIN" | grep -qE '127\.0\.0\.1|localhost' && echo "WARN: local proxy origin" || echo "OK"
+git -C /home/user/ducktape config buildbuddy.remote-bazel-default-remote 2>/dev/null \
+  && echo "(buildbuddy remote override is set)" \
+  || echo "(no remote override)"
 ```
 
-#### 7d. Live end-to-end commit test
-
-Actually exercises the full hook pipeline (pre-commit + commit-msg) on a
-throwaway commit in a temp branch. Cleans up afterwards.
-
-```bash
-set -e
-cd /home/user/ducktape
-
-# Create a throwaway branch from HEAD
-TEST_BRANCH="selfcheck/hook-test-$(date +%s)"
-git checkout -q -b "$TEST_BRANCH"
-
-# Make a trivial tracked change
-TEST_FILE=$(mktemp /home/user/ducktape/selfcheck-hook-test-XXXXX.txt)
-echo "hook selfcheck $(date -Iseconds)" > "$TEST_FILE"
-git add "$TEST_FILE"
-
-# Commit with BUILDBUDDY_API_KEY unset (forces local bazel in pre-commit)
-# and DUCKTAPE_PRECOMMIT_ENFORCE_TEST_TAG unset (skips test-tag check)
-BUILDBUDDY_API_KEY= DUCKTAPE_PRECOMMIT_ENFORCE_TEST_TAG= \
-  git commit -m "test: selfcheck hook test — delete me" 2>&1
-COMMIT_EXIT=$?
-
-# Clean up: remove branch and file regardless of outcome
-git checkout -q -
-git branch -D "$TEST_BRANCH"
-rm -f "$TEST_FILE"
+## Report Format
 
-echo "Live commit test exit: $COMMIT_EXIT"
-[ "$COMMIT_EXIT" -eq 0 ] && echo "PASS: git hooks work" || echo "FAIL: git hooks broken"
 ```
+# Session Selfcheck — <timestamp>
 
-Add `| grep -E '^(PASS|FAIL|Passed|Failed|error|Error)'` to the git commit
-line to reduce noise, or run it unfiltered to see all hook output.
+Profile: <CLI/Web>    Summary: <healthy / degraded / broken>
 
-**If the live test fails**: capture the full pre-commit output, then run the
-failing hook directly (7c) to isolate which hook is broken.
+## SPEC acceptance criteria
 
----
+| ID            | Check                       | Status        | Detail             |
+| ------------- | --------------------------- | ------------- | ------------------ |
+| C1            | BUILDBUDDY_API_KEY valid    | OK/FAIL       | HTTP <code>        |
+| C2            | GITHUB_TOKEN valid          | OK/FAIL       | login=...          |
+| C3            | bbr build trivial           | OK/FAIL       | ...                |
+| C4            | bazelisk shim + session tag | OK/FAIL       | ...                |
+| C5            | PostToolUse auto-apply      | OK/SKIP       | manually verified? |
+| C6            | throwaway commit end-to-end | OK/FAIL       | ...                |
+| C7            | daemon log clean            | OK/FAIL       | N errors           |
+| C8            | OTLP tracing                | OK/FAIL       | HTTP <code>        |
+| C9            | bbr analysis cache warm     | OK/WARN/AMBIG | cold=Xs warm=Ys    |
+| CLI1–4 / W1–5 | ...                         | ...           | ...                |
 
-## Report Format
+## Out-of-SPEC diagnostics
 
-After running all checks, produce:
+| ID | Check                          | Status        | Detail                 |
+| -- | ------------------------------ | ------------- | ---------------------- |
+| D1 | web_setup.sh freshness         | OK/STALE/MISS | setup=<sha> head=<sha> |
+| D2 | claude-hooks pin staleness     | OK/BEHIND     | pin=<sha>, CI status   |
+| D3 | origin URL reachable for bbr   | OK/WARN       | origin=...             |
 
-```
-# Web Session Selfcheck — <timestamp>
-
-## Summary
-<one-line: healthy / degraded / broken>
-
-## Checks
-
-| Check                        | Status   | Detail                                          |
-| ---------------------------- | -------- | ----------------------------------------------- |
-| web_setup.sh ran             | OK/FAIL  | ...                                             |
-| web_setup.sh commit          | OK/STALE | setup=<sha12> head=<sha12> (VM reuse risk)      |
-| claude-hooks daemon version  | OK/STALE | pinned=<sha> head=<sha> N commits behind; CI status |
-| Session start hook           | OK/FAIL  | CANARY present / FileNotFoundError              |
-| SOPS_AGE_KEY                 | OK/FAIL  | age public key matches .sops.yaml               |
-| Secret: buildbuddy.yaml      | OK/FAIL| decrypts / API <http_code>          |
-| Secret: github-agent-pat     | OK/FAIL| decrypts / login=agentydragon-agent |
-| Secret: github-ci-read-pat   | OK/FAIL| decrypts / login=agentydragon       |
-| Secret: k8s-token            | OK/FAIL| decrypts / API <http_code>          |
-| Secret: otlp-token           | OK/FAIL| decrypts / API <http_code>          |
-| bbr / BuildBuddy RBE         | OK/FAIL| ...                                 |
-| bbr runner recycling         | OK/WARN/AMBIG | cold=Xs warm=Ys ratio=Z%     |
-| Git hooks (pre-commit)       | OK/FAIL| backend=LOCAL/bbr; live commit pass/fail |
-| Git hooks (commit-msg)       | OK/FAIL| pass_filenames ok; ENFORCE_TEST_TAG state |
-
-## Issues & Remediation
+## Issues & remediation
 
 ### <issue title>
-**Impact**: <what's broken>
+**Spec criterion violated**: <ID>
+**Impact**: <what's broken for the agent>
 **Root cause**: <why>
-**Fix**: <exact commands or steps>
-
-...
+**Fix** (for the user to run, not the skill): <exact commands>
 ```
 
-Prioritize issues by impact: hook failure > stale claude-hooks > credential
-failures > CI pipeline issues.
+Prioritize: SPEC violations first (the daemon is broken), then out-of-SPEC
+diagnostics.