diff --git a/AGENTS.md b/AGENTS.md index 586a1fd941..61bc38306f 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -175,6 +175,26 @@ If you touched `ansible/`, also follow . `plans/`: for future work or work in progress. Once a plan is fully completed, remove it from `plans/` (delete, or squash into short tombstone/summary elsewhere). +### SPEC.md — High-level component specifications + +`/SPEC.md`: high-level, user-facing specification of what a +component guarantees to its users. An outside observer should be able to read +SPEC.md to understand what behaviors they can rely on, without having to read +the implementation. Example: describes +what the Claude Code hook daemon provides to every session, and the +`/web_selfcheck` skill runs the acceptance tests derived from it. + +SPEC.md files **must** be updated when the high-level requirements of the +thing they cover change — a new class of credential gets injected, a new +shim behavior is added, a new profile lands, a new promise is made to the +agent, etc. + +SPEC.md files **must not** record low-level implementation details that an +outside observer would not notice. "Credentials are refreshed regularly by +the backend service" belongs in SPEC.md; "credentials live in +`/creds.json` and rotate every 300s via RPC to +`rotate.example.com`" does not — that belongs in README.md or in the code. + ### TODO Tracking Subprojects use `TODO.md` for persistent TODO tracking. TODOs local to a specific code location are fine as inline comments; cross-cutting or project-level TODOs belong in `TODO.md`. 
diff --git a/devinfra/claude/README.md b/devinfra/claude/README.md index 97baf3db84..491157ac69 100644 --- a/devinfra/claude/README.md +++ b/devinfra/claude/README.md @@ -45,6 +45,13 @@ By preserving the original proxy env vars: - JWT token refreshes are automatically picked up - The bazelisk shim sends fresh credentials to the daemon on each invocation +## Specification + +See for the high-level, user-facing specification of +what the hook daemon guarantees to every Claude Code session (on CLI and on +web). Read that first if you want to know **what** the daemon does for the +agent — this README covers **how** those behaviors are implemented. + ## Components - **Session Start Hook**: Sets up the development environment for Claude Code web sessions diff --git a/devinfra/claude/hook_daemon/SPEC.md b/devinfra/claude/hook_daemon/SPEC.md new file mode 100644 index 0000000000..95fccb7e93 --- /dev/null +++ b/devinfra/claude/hook_daemon/SPEC.md @@ -0,0 +1,275 @@ +# Hook Daemon Specification + +See @README.md for architectural and implementation details. + +## Overview + +Every Claude Code session — whether it is running in Claude Code CLI on a +developer workstation or in Claude Code on the web inside a sandboxed container +— is paired with a **session-scoped hook daemon**. The daemon is launched by +Claude Code's `SessionStart` hook and lives for the duration of the session. + +Its job is to make every session look the same to the agent: + +- Bazel (via `bbr` / `bazelisk` / `bb`) is wired up to BuildBuddy and works + out of the box: plain `bazelisk build ` or `bb build ` + automatically uses BuildBuddy remote execution and remote cache, with no + extra flags from the agent. +- Credentials the agent needs (BuildBuddy, GitHub, Kubernetes, tracing) are + available in the environment without the agent having to fetch or decrypt + them. +- Dangerous or footgun git operations are blocked by a PATH shim. 
+- Pre-commit lint/format hooks run automatically on Edit/Write and their + failures are reported back to the agent. +- The `claude-sandbox-kubectl` MCP server is configured to talk to the + cluster as the expected Claude identity. +- Hook activity is traced to the central OpenTelemetry collector. + +The daemon exposes two **profiles** — `cli` and `web` — that differ both in +what the surrounding environment is expected to provide and in which +behaviors are enabled (e.g., the git safety shim and direnv bridge are +CLI-only; egress proxy handling, mkcert, tmpfs, managed credentials, and +idle shutdown are web-only). + +## Common Behaviors (CLI and Web) + +These guarantees hold in every session, regardless of profile. + +### Credentials in the agent's environment + +Every Bash tool call sees: + +- A valid `BUILDBUDDY_API_KEY`. +- A valid `GITHUB_TOKEN` (on web this is the `agentydragon-agent` machine + user; on CLI this is whatever token the user's outer shell already + exposes). +- `DUCKTAPE_OTEL_BEARER_TOKEN` for tracing. + +The agent should never need to decrypt SOPS files or run `gh auth login` +manually — if a credential is missing, the daemon is broken. + +### Bazel / BuildBuddy + +- `bazelisk`, `bb`, and `bbr` on `PATH` are wired to BuildBuddy. +- A plain `bazelisk build ` or `bb build ` automatically + uses BuildBuddy remote execution **and** remote cache out of the box. + The agent does not need to pass `--config=rbe`, `--remote_executor=...`, + `--remote_cache=...`, or any authentication flags. +- BuildBuddy invocations are tagged with the session ID so they can be + filtered later via `bbapi invocation list --tag session:`. +- **`bbr` preserves the Bazel analysis cache across invocations, at least + mostly.** Running `bbr` a second time with the same inputs should + usually land on a warm BuildBuddy runner that has the analysis cache + already populated, so the second build is substantially faster than a + cold one. 
This is best-effort, not a hard guarantee — today runners are + shared across all concurrent sessions and may be evicted or rotated, + so an occasional cold hit is acceptable. A session where _every_ `bbr` + call is cold is broken. + +### Pre-commit lint & format on Edit/Write + +When the agent edits a file via the Edit or Write tool, the daemon runs the +project's `pre-commit` configuration against the touched files as a +`PostToolUse` hook: + +- Pure format/whitespace hooks (e.g. `ruff-format`) are **auto-applied** + and the fixed file is kept. See the profile YAMLs under for + the full auto-apply list. +- Any other hook that fails blocks the edit: changes made by that hook are + reverted and the failure is reported back to Claude as a `PostToolUse` + block, so Claude can fix and retry. + +### OpenTelemetry tracing + +- Every hook invocation (SessionStart, PreToolUse, PostToolUse, background + tasks) is traced to the central OTLP collector with a bearer token. +- Traces are keyed by session ID so they can be retrieved per session for + debugging. + +### MCP servers + +- The `claude-sandbox-kubectl` MCP server is configured and authenticated so + that `kubectl`-equivalent calls act as the cluster's designated Claude + identity (see <../../../cluster/k8s/agents/claude-rbac/>). The agent + should always prefer it over raw `Bash(kubectl ...)` for `claude-sandbox` + operations. + +### Observability + +- Hook daemon logs are available on disk under the session directory for the + duration of the session (exact path documented in ). +- A session context banner surfaces warnings from setup and background tasks + to the agent at SessionStart. + +## CLI Profile + +The CLI profile targets a developer workstation where the user is already +logged in and has a `nix`/`direnv`-managed devshell. The daemon therefore +relies on the outer environment for most credentials and focuses on safety +rails. 
+ +### What the surrounding environment provides + +- **Credentials come from `.envrc`** (via `direnv`), which sources the + repo's encrypted CLI env script. `BUILDBUDDY_API_KEY`, `GITHUB_TOKEN`, and + `DUCKTAPE_OTEL_BEARER_TOKEN` are expected to already be in the process + environment when Claude Code launches. They reflect the **user's own** + identity (the developer's GitHub PAT, the user's own BuildBuddy key). +- **Kubeconfig comes from `~/.kube/config`** — the user's personal cluster + access. The daemon does not write its own kubeconfig; MCP and `kubectl` + use whatever the user has. +- **The devshell provides `bazelisk`, `bb`, `sops`, `gh`, etc.** on PATH via + Nix home-manager. + +The daemon's job is to propagate those env vars into every Bash tool call +(since Claude Code's Bash tool does not automatically run through direnv) and +to layer the shims on top. + +### CLI-specific guarantees + +- **Git safety shim.** A `git` wrapper on PATH blocks footgun commands: + - `git commit --amend` (prevents rewriting shared history) + - `git add -A` / `git add .` (forces explicit file listing) + - `git stash` (prevents accidental stash-and-forget) + + Blocked commands exit non-zero with a clear error and are never run. + Read-only operations (`git stash list`, `git stash show`) are allowed. + +- **direnv bridge.** Every Bash tool call sees the env exported by the + nearest `.envrc`, so `cd`-ing between subprojects picks up the right + devshell environment. + +### What CLI does NOT do + +- Does not configure an egress proxy. +- Does not set up tmpfs, mkcert, docker, or supervisor. +- Does not write a kubeconfig — the user provides one. +- Does not idle-shutdown. + +## Web Profile + +The Web profile targets Claude Code on the web, running inside a sandboxed +container with TLS-inspecting network egress. 
The surrounding environment +provides almost nothing beyond a SOPS age key and the agent's container +identity; the daemon is responsible for standing up everything else. + +### What the surrounding environment provides + +- The **`web_setup.sh`** bootstrap script has already run and installed Nix, + devtools, skills, and a `settings.local.json` containing secrets needed by + MCP servers. +- A **SOPS age key** (`SOPS_AGE_KEY`) that can decrypt the repo's + `claude-web` secrets. +- A **TLS-inspecting egress proxy** via `HTTPS_PROXY` / `HTTP_PROXY` with a + periodically-refreshed JWT. The pre-installed TLS inspection CA is present + on the container filesystem. + +### Web-specific guarantees + +- **Managed credentials.** The daemon decrypts SOPS secrets at startup and + injects them into the agent's environment: + - `BUILDBUDDY_API_KEY` — shared BuildBuddy key for the claude-web identity. + - `GITHUB_TOKEN` — the **`agentydragon-agent` machine user PAT**, not a + personal token. The agent commits and pushes as that identity. + - `DUCKTAPE_OTEL_BEARER_TOKEN` — for tracing. + - Kubernetes service account token (see below). + + Credentials that the cluster rotates (e.g., the k8s service account token) + are refreshed regularly so that long-running sessions keep working. The + agent should never see a stale token as a session drags on. + +- **Kubernetes access as `claude-code-web` ServiceAccount.** The daemon + writes a kubeconfig pointing at the cluster API, authenticated as the + `claude-code-web` ServiceAccount. `KUBECONFIG` is exported into the + agent's environment, and the `claude-sandbox-kubectl` MCP server uses the + same identity. Both `kubectl` and MCP calls land with the RBAC documented + in <../../../AGENTS.md>. + +- **GitHub fork remote.** If the machine user has a fork of the repo, the + daemon configures it as a `fork` remote with push credentials, so that + `git push -u fork ` works without further setup. 
+ +- **Network to BuildBuddy works out of the box.** `bazelisk`, `bb`, and + `bbr` reach BuildBuddy and the Bazel Central Registry successfully on + the first invocation. The agent never has to configure CA bundles, + truststores, proxy env vars, or `--remote_proxy` flags to get builds + working over the container's constrained egress. + +- **Container runtime.** Docker, supervisor, and mkcert are set up so that + integration tests that need a local container runtime work. + +- **Tmpfs caching.** Performance-sensitive caches (Bazel output base, Docker + storage when the container root is slow) are backed by tmpfs. From the + agent's perspective this is invisible except that Bazel is not absurdly + slow. + +- **Idle shutdown.** The daemon auto-exits after a period of inactivity so + stale containers don't accumulate. + +### What Web does NOT do + +- Does **not** install the git safety shim. (Web sessions push to a fork, + not to `devel`, so `git commit --amend` / `git add -A` are less dangerous. If this ever + changes, update this file.) + +## Observable Acceptance Criteria + +These are the checks that the `/web_selfcheck` skill effectively runs as an +acceptance test against a live session. A healthy session satisfies all +applicable criteria for its profile. + +### Common + +1. `echo $BUILDBUDDY_API_KEY` is non-empty, and a GetUser RPC against + `remote.buildbuddy.io` authenticates successfully. +2. `echo $GITHUB_TOKEN` is non-empty, and `GET https://api.github.com/user` + returns the expected login (`agentydragon-agent` on web, the developer's + own login on CLI). +3. `bbr build --nobuild` succeeds without TLS or proxy + errors. +4. `bazelisk` on PATH points at the daemon's shim, and invocations are + tagged with `session:` in BuildBuddy. +5. Editing a Python file via Write/Edit triggers `ruff-format` + (auto-applied) and, on a lint violation, the edit is blocked with a + clear reason. +6. Pre-commit runs end to end: a throwaway commit on a scratch branch + passes all hooks. 
+7. Hook daemon logs are present and contain no unhandled exceptions from + SessionStart. +8. Tracing reaches the OTLP collector (the bearer token test returns a + non-auth-error status). +9. Running `bbr build ` twice in a row with identical inputs + lands on a warm runner the second time: the second invocation's + analysis phase is substantially faster than the first (rule of thumb: + warm < cold / 3). This is best-effort; a single cold-hit failure can + be transient (runner rotation, cache eviction), but consistent + cold-every-time across repeated runs is a daemon bug. + +### CLI only + +1. `git commit --amend` fails with a `[git-shim] BLOCKED` error. +2. `git add -A` / `git add .` fails with a `[git-shim] BLOCKED` error. +3. `git stash` (without `list`/`show`) fails with a `[git-shim] BLOCKED` + error. +4. A `cd` into a subproject with its own `.envrc` propagates the expected + env vars into the next Bash tool call. + +### Web only + +1. `kubectl get pods -n claude-sandbox` works, authenticated as + `claude-code-web`. The `claude-sandbox-kubectl` MCP server returns the + same pod list. +2. `$GITHUB_TOKEN` resolves to the `agentydragon-agent` machine user (not a + personal account). +3. `git remote -v` shows a `fork` remote with push access to the machine + user's fork. +4. `bbr build ` works out of the box. No extra flags, no + manual `git remote` setup, no prompt asking the user to pick a remote — + the default remote is selected automatically, and the build reaches a + BuildBuddy runner on the first try. +5. Docker is available (`docker info` succeeds) for tests that need a local + container runtime. + +Anything that fails these criteria is a daemon bug, not a user problem. The +`/web_selfcheck` skill is the canonical runnable acceptance test for this +spec. 
diff --git a/devinfra/claude/hook_daemon/bes_interceptor.py b/devinfra/claude/hook_daemon/bes_interceptor.py index 436b9ae890..63d41ef0f2 100644 --- a/devinfra/claude/hook_daemon/bes_interceptor.py +++ b/devinfra/claude/hook_daemon/bes_interceptor.py @@ -7,6 +7,10 @@ If a build/test invocation lacks --remote_executor, a mailbox message is posted to the session nudging the agent toward `bb remote`. + +TODO: this nudge behavior is experimental and deliberately NOT in SPEC.md yet. +If it proves reliable and useful, promote it to a committed behavior under +"Common Behaviors" in . """ from __future__ import annotations diff --git a/skills/web_selfcheck/SKILL.md b/skills/web_selfcheck/SKILL.md index 41e2bb6103..a84e14b03e 100644 --- a/skills/web_selfcheck/SKILL.md +++ b/skills/web_selfcheck/SKILL.md @@ -1,656 +1,370 @@ --- name: web_selfcheck description: > - Diagnose the health of a Claude Code web session — checks whether - web_setup.sh ran, whether the session start hook succeeded, whether - the installed claude-hooks package is stale relative to the repo, - whether each SOPS-encrypted credential is decryptable and live-tests - each one against its upstream API, and whether ducktape git hooks - (pre-commit, commit-msg) actually pass when committing. Reports what's - broken and how to fix it. Use when the user asks "did setup go ok", - "why isn't bbr working", "check credentials", "selfcheck", "why do my - commits fail", or any question about web session health. + Diagnose the health of a Claude Code session (CLI or web) by running the + observable acceptance criteria in the hook daemon SPEC against the live + session. Also runs out-of-SPEC diagnostics (web_setup.sh freshness, + claude-hooks pin staleness, bbr runner recycling, git hook origin-URL + issues). Use when the user asks "did setup go ok", "why isn't bbr + working", "check credentials", "selfcheck", "why do my commits fail", or + any question about session health. 
--- -# Web Session Selfcheck +# Session Selfcheck -Comprehensive health check for a Claude Code web session. Run all checks, -then produce a single structured report with clear pass/fail status and -actionable remediation steps for anything that's broken. +This skill is the **runnable acceptance test** for the hook daemon +specification at <../../devinfra/claude/hook_daemon/SPEC.md>. -## CRITICAL: Observe Only — Do NOT Fix Without Explicit User Approval +## How to use this skill + +1. **Read SPEC.md first.** It enumerates every behavior a healthy session + must satisfy, split into `### Common`, `### CLI only`, and + `### Web only` under the `## Observable Acceptance Criteria` heading. + The SPEC is the source of truth. If the SPEC and this skill disagree, + the SPEC wins — update the skill. +2. **Detect the profile.** `$DUCKTAPE_CLAUDE_HOOKS_PROFILE` (or the file + path that the daemon was launched with) tells you whether to run the + CLI or Web criteria. Always run the Common criteria. +3. **For each SPEC criterion, run the matching check** from the + "SPEC acceptance checks" section below. +4. **Then run the out-of-SPEC diagnostics** section, which catches + real-world failure modes the SPEC does not (yet) codify. +5. **Produce the report** using the format at the end. + +Run all `Bash` commands with `dangerouslyDisableSandbox: true` (needs +network and filesystem access outside the sandbox). Run independent checks +in parallel where possible. + +## CRITICAL: observe only — do NOT fix without explicit user approval This is a **diagnostic skill**. Treat a broken session like a crime scene: observe, document, and report — do not touch. **Do NOT run any remediation commands** (e.g. `web_setup.sh`, re-triggering -SessionStart, sourcing env files, installing packages) unless the user -explicitly says to proceed. The "Fix" blocks throughout this skill are -documentation of what _could_ be done — they are **not instructions for you -to execute**. 
Report your findings and wait for a go-ahead. - -**Exception — debugging workarounds**: when the session hooks are broken and -you are actively debugging or documenting (e.g. committing this very skill), -the following lightweight workarounds are acceptable without explicit approval: - -- Committing with hooks bypassed: `git commit --no-verify` (or point hooks at - `/dev/null` temporarily) to record diagnostic work while hooks are broken +SessionStart, sourcing env files, installing packages, re-running +`git remote add`) unless the user explicitly says to proceed. If a check +fails, the fix is "the daemon is broken, tell the user" — not "let me +work around it." + +**Exception — debugging workarounds**: when the session hooks are +demonstrably broken and you are actively debugging or documenting, the +following lightweight workarounds are acceptable without explicit +approval: + +- Committing with hooks bypassed: `git commit --no-verify` to record + diagnostic work while hooks are broken - Unsetting `BUILDBUDDY_API_KEY` to force local bazel when bbr is broken -- Creating a `bazel` wrapper in the session bin that injects `--bazelrc` when - the session bazelrc exists but the shim is missing - -Run all `Bash` commands with `dangerouslyDisableSandbox: true` (needs network -and filesystem access outside the sandbox). +- Creating a `bazel` wrapper in the session bin that injects `--bazelrc` + when the session bazelrc exists but the shim is missing -## What to Check +## SPEC acceptance checks -Run all checks in parallel where possible. - ---- +Each check below corresponds one-to-one with a numbered criterion in +SPEC.md. Cross-reference the SPEC for the authoritative statement of what +the check is verifying. -### 1. web_setup.sh +### Common -**Goal**: confirm Nix and the `devtools` profile were installed successfully, -and that setup ran from the current repo commit. - -**VM reuse warning**: Anthropic reuses Firecracker microVMs between sessions. 
-`/tmp` persists across reuses, so `/tmp/web-setup.log` may be from a prior -session running an older version of `web_setup.sh`. Always verify the setup -commit matches the current repo HEAD — a stale setup means Nix devtools and -skills may not match what the current code expects. +**C1 — BUILDBUDDY_API_KEY is present and valid.** ```bash -# Was it run at all? -ls -la /tmp/web-setup.log 2>/dev/null || echo "MISSING" -# Did it succeed? (last line should be "Setup complete.") -tail -5 /tmp/web-setup.log 2>/dev/null -# Was it recent? (mtime) -stat -c '%y' /tmp/web-setup.log 2>/dev/null -# Did Nix install? -nix --version 2>/dev/null || echo "nix not found" -# Is the devtools profile active? -nix profile list 2>/dev/null | grep -E 'devtools|claude-hooks' | head -5 || echo "no devtools profile" - -# What commit did web_setup.sh run from? -grep 'web_setup.sh commit:' /tmp/web-setup.log 2>/dev/null | tail -1 || echo "commit not logged (old web_setup.sh)" -# Current repo HEAD -git -C /home/user/ducktape rev-parse HEAD - -# Do they match? -SETUP_COMMIT=$(grep 'web_setup.sh commit:' /tmp/web-setup.log 2>/dev/null | tail -1 | grep -oE '[0-9a-f]{40}' || echo '') -HEAD_COMMIT=$(git -C /home/user/ducktape rev-parse HEAD) -if [ -z "$SETUP_COMMIT" ]; then - echo "UNKNOWN: web_setup.sh predates commit logging — assume STALE" -elif [ "$SETUP_COMMIT" = "$HEAD_COMMIT" ]; then - echo "OK: setup commit matches HEAD ($HEAD_COMMIT)" -else - echo "STALE: setup ran from ${SETUP_COMMIT:0:12}, HEAD is ${HEAD_COMMIT:0:12}" -fi - -# What env var keys were present when web_setup.sh ran? -grep -A200 'environment keys' /tmp/web-setup.log 2>/dev/null | grep -B200 '^---$' | grep -v '^---' -``` - -**Failure indicators**: log missing, last line not "Setup complete", nix not -found, devtools not in profile list, setup commit doesn't match HEAD. 
- -**Fix**: re-run setup from the Claude Code web UI setup command: - -``` -bash ducktape/devinfra/claude/web_setup.sh +[ -n "${BUILDBUDDY_API_KEY:-}" ] || echo "FAIL: BUILDBUDDY_API_KEY unset" +curl -s -o /dev/null -w "%{http_code}\n" \ + -H "x-buildbuddy-api-key: ${BUILDBUDDY_API_KEY}" \ + -H "Content-Type: application/proto" \ + --data-binary '' \ + https://remote.buildbuddy.io/rpc/BuildBuddyService/GetUser ``` -If re-running, note that `SOPS_AGE_KEY` is typically not available when -`web_setup.sh` runs — all SOPS decryptions will fail. Secrets are instead -decrypted by the session start hook daemon (which inherits `SOPS_AGE_KEY` -from the container after k8s injects it). This is expected and not a bug. - ---- - -### 2. claude-hooks Daemon Version - -**Goal**: check whether the installed `claude-hooks` daemon matches the current -repo code, how far behind it is, and whether any breaking changes have landed -since the pinned commit. +Pass: `200` (or `400` = malformed proto but auth passed). `401`/`403` = +invalid key. -Do this early — a stale daemon is often the root cause of session hook failures. +**C2 — GITHUB_TOKEN is present and valid.** ```bash -# Pinned commit (what's actually installed) -python3 -c " -import json, re -pins = json.load(open('/home/user/ducktape/npins/sources.json'))['pins'] -url = pins.get('claude-hooks', {}).get('url', '') -m = re.search(r'claude-hooks-([0-9a-f]+)', url) -print('pinned commit:', m.group(1) if m else 'unknown') -print('pin url:', url[:100]) -" - -# Current HEAD of the repo -git -C /home/user/ducktape rev-parse --short HEAD -git -C /home/user/ducktape log --oneline -1 - -# How many devinfra/claude/ commits have landed since the pin was last updated? -git -C /home/user/ducktape log --oneline -10 -- devinfra/claude/ npins/sources.json - -# When was the pin last updated? 
-git -C /home/user/ducktape log --oneline -3 -- npins/sources.json +curl -s -H "Authorization: Bearer ${GITHUB_TOKEN}" https://api.github.com/user \ + | python3 -c "import sys,json; d=json.load(sys.stdin); print('login:', d.get('login'), 'message:', d.get('message',''))" ``` -**If the pin is behind HEAD**, diff the installed package against the repo -source to spot breaking changes: look at the git log for `devinfra/claude/` -since the pinned commit, read the relevant changed files in both the installed -Nix store package and the repo, and use your judgement to assess whether any -of those changes are likely to cause incompatibility with the running session. -Report any suspicious mismatches (e.g. renamed classes, changed config file -paths, new required fields, removed hooks). - -Also check GitHub CI on `agentydragon/ducktape` to -understand whether an update is expected soon or something is wedged. Look at: - -- Recent `release.yml` runs on `devel` — did it pass after the relevant commit? -- Recent `sync-pins.yml` runs — did it succeed and push? -- Recent `ci.yml` runs on `devel` — any blocking test failures? - -For each, report: last run status, when it ran, and if failed, what failed. +Pass on web: `login: agentydragon-agent`. Pass on CLI: the user's own +GitHub login. `Bad credentials` = expired/revoked. -**Interpretation**: - -- `release.yml` failing → new daemon won't be released; find the failing test/step -- `sync-pins.yml` not running or failing → pin won't auto-update -- CI tests failing on `devel` → release is blocked until tests are fixed - -**Suggested fix** (do not run — report to user): - -1. If `release.yml` recently passed after the relevant commit: `sync-pins.yml` - will update the pin within 30 min; wait or trigger manually -2. If `release.yml` hasn't run or failed: identify and fix the blocking issue on `devel` -3. Once pin is updated and merged, re-run `web_setup.sh` - ---- - -### 3. 
Session Start Hook - -**Goal**: confirm the session start hook ran successfully and wrote the env file. +**C3 — `bbr build ` succeeds without TLS or proxy errors.** ```bash -# Find live session ID (from hook_daemon process) -LIVE=$(ps aux | grep hook_daemon | grep -v grep | grep -oP '(?<=--sock /tmp/claude-hd/)[^/]+' | head -1) -echo "live session: $LIVE" - -# Check env file (presence + CANARY marker = success) -head -3 ~/.claude/session-env/$LIVE/sessionstart-hook-0.sh 2>/dev/null || echo "ENV FILE MISSING" - -# Check daemon log for errors -grep -E 'ERROR|Exception|FileNotFoundError|sessionstart|SessionStart' \ - ~/.claude/session-env/$LIVE/hook-daemon/daemon.log 2>/dev/null | tail -20 - -# Is BUILDBUDDY_API_KEY set? -echo "BUILDBUDDY_API_KEY in env: $([ -n "${BUILDBUDDY_API_KEY:-}" ] && echo YES || echo NO)" - -# Is the auth proxy running? -ls ~/.claude/session-env/$LIVE/auth-proxy/combined_ca.pem 2>/dev/null && echo "CA present" || echo "CA MISSING" -ls ~/.claude/session-env/$LIVE/bazelrc 2>/dev/null && echo "session bazelrc present" || echo "BAZELRC MISSING" - -# Is the git proxy shim running? (bbr connects via 127.0.0.1:35233) -ss -tlnp 2>/dev/null | grep 35233 || echo "git proxy NOT listening on 35233" +cd /home/user/ducktape +bbr build //devinfra:gazelle --nobuild 2>&1 | tail -5 ``` -**Suggested fix if env file is missing** (do not run — report to user): +Pass: exit 0 with no `Unable to resolve host`, `certificate`, +`127.0.0.1:*`, or proxy errors. 
-Re-trigger SessionStart on the live daemon: +**C4 — bazelisk shim is active and invocations are session-tagged.** ```bash -LIVE= -SOCK=/tmp/claude-hd/$LIVE/d.sock -python3.13 -c " -import json, os -env = dict(os.environ) -env['CLAUDE_ENV_FILE'] = f'/root/.claude/session-env/$LIVE/sessionstart-hook-0.sh' -env['CLAUDE_PROJECT_DIR'] = '/home/user/ducktape' -env['CLAUDE_CODE_REMOTE'] = 'true' -print(json.dumps({'hook': {'hook_event_name': 'SessionStart', 'session_id': '$LIVE', - 'cwd': '/home/user/ducktape', 'transcript_path': '/tmp/transcript.json', - 'source': 'startup'}, 'env': env})) -" | curl -s --max-time 300 --unix-socket $SOCK http://localhost/hook -X POST \ - -H 'Content-Type: application/json' -d @- -source ~/.claude/session-env/$LIVE/sessionstart-hook-0.sh +ls -l "$(command -v bazelisk)" # must point into $DUCKTAPE_CLAUDE_HOOKS_SESSION_DIR/bin +grep -E 'build_metadata|TAGS' "$DUCKTAPE_CLAUDE_HOOKS_SESSION_DIR/bbr.bazelrc" 2>/dev/null ``` -**Manual fallback** (if daemon is down or still broken after fix): - -```bash -source /home/user/ducktape/devinfra/secrets/web_env.sh -mkdir -p ~/.config/bazel -cat > ~/.config/bazel/buildbuddy.bazelrc <` metadata. ---- +**C5 — PostToolUse pre-commit auto-apply works.** -### 4. Credentials — SOPS Decryption +Hard to automate without actually exercising Edit/Write. Report this as +"manually verified" if you have just edited a file in this session and +observed `ruff-format` apply, or as "NOT TESTED" otherwise. Do not fake +this check. -**Goal**: confirm `SOPS_AGE_KEY` is present and can decrypt all claude-web secrets. 
+ +**C6 — throwaway-commit pre-commit end-to-end.** ```bash -echo "SOPS_AGE_KEY present: $([ -n "${SOPS_AGE_KEY:-}" ] && echo YES || echo NO)" -echo "Age public key: $(echo "${SOPS_AGE_KEY:-}" | age-keygen -y 2>/dev/null || echo 'age-keygen not found')" - -# Expected public key from .sops.yaml (claude-web entry): -grep 'claude-web' /home/user/ducktape/.sops.yaml - -for f in \ - secrets/buildbuddy.yaml \ - secrets/github-pat-agentydragon-agent.yaml \ - secrets/github-ci-read-pat.yaml \ - secrets/alloy-otlp-bearer-token.yaml \ - secrets/claude-web-k8s-token.yaml \ - secrets/docker-ci/client-key.sops.pem; do - result=$(sops -d /home/user/ducktape/$f 2>&1 | head -1) - if echo "$result" | grep -qE 'FAILED|failed|error|Error'; then - echo "FAIL: $f — $result" - else - echo "OK: $f" - fi -done +set -e +cd /home/user/ducktape +TEST_BRANCH="selfcheck/$(date +%s)" +git checkout -q -b "$TEST_BRANCH" +TEST_FILE=$(mktemp /home/user/ducktape/selfcheck-XXXXX.txt) +echo "selfcheck $(date -Iseconds)" > "$TEST_FILE" +git add "$TEST_FILE" +EXIT=0 +git commit -m "test: selfcheck — delete me" 2>&1 || EXIT=$? +git checkout -q - +git branch -D "$TEST_BRANCH" +rm -f "$TEST_FILE" +echo "exit: $EXIT" ``` -**Failure indicator**: any `FAIL` line, or `SOPS_AGE_KEY` not present. +Pass: exit 0. -**Fix**: if `SOPS_AGE_KEY` is missing, the session didn't receive the age -private key at startup. This is injected from the `claude-sandbox` k8s Secret -by the container runtime. Check whether the k8s Secret exists: +**C7 — hook daemon logs present, no unhandled exceptions.** ```bash -kubectl -n claude-sandbox get secret claude-web-age-key 2>/dev/null +LOG="$DUCKTAPE_CLAUDE_HOOKS_SESSION_DIR/hook-daemon/daemon.log" +if [ -f "$LOG" ]; then grep -cE 'ERROR|Traceback|Exception' "$LOG" || true; else echo MISSING; fi ``` ---- +Pass: log exists, zero matches (or only expected warnings — use judgement). -### 5. Credentials — Live API Tests - -Run each live test and capture HTTP status / response content. 
- -#### BuildBuddy API Key +**C8 — OTLP tracing reaches the collector.** ```bash -BB_KEY=$(sops -d /home/user/ducktape/secrets/buildbuddy.yaml 2>/dev/null \ - | awk '/buildbuddy_api_key:/ {print $2}') -# Test via bbapi (needs BUILDBUDDY_API_KEY in env) -export BUILDBUDDY_API_KEY="$BB_KEY" -curl -s -o /dev/null -w "%{http_code}" \ - -H "x-buildbuddy-api-key: $BB_KEY" \ - -H "Content-Type: application/proto" \ - "https://remote.buildbuddy.io/rpc/BuildBuddyService/GetUser" \ - --data-binary '' +curl -s -o /dev/null -w "%{http_code}\n" \ + -H "Authorization: Bearer ${DUCKTAPE_OTEL_BEARER_TOKEN}" \ + -H "Content-Type: application/json" -d '{}' \ + https://alloy-otlp.allegedly.works/v1/traces ``` -Expected: `200` (or `400` for malformed proto — means auth passed). -`401`/`403` means key is invalid or expired. +Pass: `200` or `400` (bad proto = auth passed). `401` = token rotated or +missing. + +**C9 — `bbr` preserves the analysis cache on a second identical run.** -**Fix if invalid**: regenerate key in BuildBuddy org settings, re-encrypt into -`secrets/buildbuddy.yaml`, push to `devel`, wait for `sync-pins.yml`. +Low-precision, high-recall sensor with a high false-positive rate (runner +rotation, BB server restart, cache eviction can all cause transient cold +hits). Report the finding but don't act on a single failure. Stop early +rather than spending many minutes retrying. -#### GitHub Agent PAT (`agentydragon-agent`) +**Method — cache poisoning**: append a comment to `MODULE.bazel` so the +first build is guaranteed cold, then time an immediately-following +identical build. The SPEC permits occasional cold-hits; only flag if +warm ≈ cold across **two** repeated runs. 
```bash -GH_TOKEN=$(sops -d /home/user/ducktape/secrets/github-pat-agentydragon-agent.yaml 2>/dev/null \ - | awk '/github_token:/ {print $2}') -curl -s -H "Authorization: Bearer $GH_TOKEN" https://api.github.com/user \ - | python3 -c "import sys,json; d=json.load(sys.stdin); print('login:', d.get('login'), 'message:', d.get('message',''))" +cd /home/user/ducktape +echo "# selfcheck-poison-$(date +%s)" >> MODULE.bazel +T1_START=$(date +%s%N) +bbr build //... --nobuild 2>&1 | tail -3 +T1_SEC=$(( ($(date +%s%N) - T1_START) / 1000000000 )) +T2_START=$(date +%s%N) +bbr build //... --nobuild 2>&1 | tail -3 +T2_SEC=$(( ($(date +%s%N) - T2_START) / 1000000000 )) +git checkout -- MODULE.bazel +echo "cold=${T1_SEC}s warm=${T2_SEC}s" ``` -Expected: `login: agentydragon-agent`. -`Bad credentials` or `Requires authentication` means token expired/revoked. +Interpret: `warm < cold/3` = recycling works. `warm ≈ cold` = likely not +recycling (but re-run before diagnosing — high FP rate). `cold < 5s` = +build graph too small to measure. If consistently warm≈cold across two +runs, inspect `bbapi invocation ` for runner IDs. -**Fix**: generate new PAT for `agentydragon-agent` machine user (Settings → -Developer Settings → Personal Access Tokens), re-encrypt into -`secrets/github-pat-agentydragon-agent.yaml`, push to `devel`. 
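The C9 interpretation rules above can be sketched as a small shell helper. This is a minimal sketch, not part of the skill: the function name `classify_cache` is hypothetical, the thresholds (`cold < 5s`, `warm < cold/3`) are taken from the text, and the status labels mirror the `OK/WARN/AMBIG` values used in the report table.

```shell
# Hypothetical helper (an assumption, not in the original skill):
# classify one cold/warm timing pair, both given in whole seconds.
classify_cache() {
  cold=$1
  warm=$2
  if [ "$cold" -lt 5 ]; then
    # Build graph too small to measure.
    echo "AMBIG"
  elif [ "$warm" -lt $(( cold / 3 )) ]; then
    # Warm run took less than a third of the cold run.
    echo "OK"
  else
    # warm is close to cold: re-run before diagnosing (high FP rate).
    echo "WARN"
  fi
}
```

For example, `classify_cache "$T1_SEC" "$T2_SEC"` after the timing block would give the status for the C9 row. A single `WARN` is still only a hint, since the check is meant to be flagged only when two repeated runs agree.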
+### CLI only -#### GitHub CI Read PAT (`agentydragon` fine-grained) +**CLI1 — `git commit --amend` is blocked by the git shim.** ```bash -GH_CI=$(sops -d /home/user/ducktape/secrets/github-ci-read-pat.yaml 2>/dev/null \ - | awk '/github_token:/ {print $2}') -curl -s -H "Authorization: Bearer $GH_CI" https://api.github.com/user \ - | python3 -c "import sys,json; d=json.load(sys.stdin); print('login:', d.get('login'), 'message:', d.get('message',''))" +(cd /tmp && git init -q selfcheck && cd selfcheck && \ + git commit --allow-empty -m init -q 2>/dev/null && \ + git commit --amend --no-edit 2>&1 | grep -c '\[git-shim\] BLOCKED') +rm -rf /tmp/selfcheck ``` -Expected: `login: agentydragon`. +Pass: `1`. -#### K8s Service Account Token +**CLI2 — `git add -A` / `git add .` is blocked.** ```bash -K8S_TOKEN=$(sops -d /home/user/ducktape/secrets/claude-web-k8s-token.yaml 2>/dev/null \ - | awk '/k8s_token:/ {print $2}') -curl -sk -o /dev/null -w "%{http_code}" \ - -H "Authorization: Bearer $K8S_TOKEN" \ - "https://api.allegedly.works:16443/api/v1/namespaces/claude-sandbox" +(cd /tmp && mkdir -p selfcheck2 && cd selfcheck2 && \ + git init -q && git add -A 2>&1 | grep -c '\[git-shim\] BLOCKED') +rm -rf /tmp/selfcheck2 ``` -Expected: `200`. `401` means the token was rotated and the SOPS file wasn't -updated yet. +Pass: `1`. -**Note**: this token is **auto-rotated by an in-cluster CronJob**. The SOPS -file should be updated automatically. 
If it returns 401, check: +**CLI3 — `git stash` is blocked (but `list` / `show` allowed).** ```bash -# Check CronJob last run and next run -kubectl -n default get cronjob claude-web-token-rotator -o yaml 2>/dev/null | grep -E 'lastScheduleTime|schedule' -kubectl -n default get jobs -l app=claude-web-token-rotator 2>/dev/null | tail -5 +git stash 2>&1 | grep -c '\[git-shim\] BLOCKED' +git stash list 2>&1 | grep -c '\[git-shim\] BLOCKED' # must be 0 ``` -#### OTLP Bearer Token (Grafana Alloy) +**CLI4 — direnv bridge propagates `.envrc` exports into Bash tool calls.** ```bash -OTLP_TOKEN=$(sops -d /home/user/ducktape/secrets/alloy-otlp-bearer-token.yaml 2>/dev/null \ - | awk '/token:/ {print $2}') -curl -s -o /dev/null -w "%{http_code}" \ - -H "Authorization: Bearer $OTLP_TOKEN" \ - -H "Content-Type: application/json" \ - -d '{}' \ - "https://alloy-otlp.allegedly.works/v1/traces" +# Expect a representative env var (e.g. one set only by .envrc) to appear +# after cd into a subproject that has one. +cd /home/user/ducktape && env | grep -c '^DUCKTAPE_PRECOMMIT_ENFORCE_TEST_TAG=' ``` -Expected: `200` or `400` (bad proto = auth passed). `401` means token -was rotated. **Fix**: bump `rotation_version` in -`cluster/terraform/gitops/alloy-otlp-bearer-token/`, apply with `tofu`. +Pass: `1` (or whatever var your `.envrc` exports). ---- +### Web only -### 6. 
bbr / BuildBuddy RBE +**W1 — `kubectl` works as `claude-code-web`; MCP returns the same pods.** ```bash -# Set key from SOPS if not already in env -[ -z "${BUILDBUDDY_API_KEY:-}" ] && \ - export BUILDBUDDY_API_KEY=$(sops -d /home/user/ducktape/secrets/buildbuddy.yaml 2>/dev/null \ - | awk '/buildbuddy_api_key:/ {print $2}') - -# Fix origin/HEAD if missing (needed by bbr) -git -C /home/user/ducktape remote set-head origin --auto 2>/dev/null || true - -# Test bbr connectivity (dry run) -bbr build //devinfra:gazelle --nobuild 2>&1 | tail -5 +kubectl -n claude-sandbox get pods 2>&1 | tail -5 ``` -**Failure: `cannot connect to 127.0.0.1:35233`** → session start hook didn't -run; the git proxy shim is not running. Follow session start hook fix above. +Then invoke the `mcp__claude-sandbox-kubectl__pods_list_in_namespace` tool +with `namespace=claude-sandbox` and compare. Pass: both succeed and agree. -**Failure: `Unable to resolve host remote.buildbuddy.io`** → TLS proxy/CA -issue; session start hook didn't set up auth proxy. Follow session start hook -fix above. +**W2 — `$GITHUB_TOKEN` identifies as `agentydragon-agent`.** ---- +Covered by C2 on web — no separate check. -### 6b. BuildBuddy Runner Recycling & Analysis Cache +**W3 — `fork` remote is configured with push access.** -**Goal**: Confirm that `bbr` reuses the same BuildBuddy runner VM between -invocations so the Bazel analysis cache is warm for subsequent builds. A -recycled runner means the second build completes analysis significantly faster -than the first cold build. +```bash +git -C /home/user/ducktape remote -v | grep '^fork' +``` -**Calibration note**: This is a low-precision, high-recall sensor with a high -false-positive rate. A single "not recycling" result may be transient (runner -rotation, BB server restart, cache eviction). Report the finding but don't act -on a single failure. Stop early rather than spending many minutes trying. +Pass: `fork` appears with a URL the machine user can push to. 
If absent +with no warning in the context banner, the fork-remote background task +failed. -**Method — cache poisoning**: Append a comment to `MODULE.bazel` so the first -build is guaranteed cold even if the runner was already warm, then measure -whether the immediately-following identical build is significantly faster. +**W4 — `bbr build ` works out of the box, no manual remote +setup, no remote picker.** ```bash cd /home/user/ducktape - -# Step 1: Poison the analysis cache. -# Any uncommmitted change to MODULE.bazel forces Bazel to re-analyse from -# scratch on the runner, even if it was already warm. -POISON_MARKER="# selfcheck-poison-$(date +%s)" -echo "$POISON_MARKER" >> MODULE.bazel - -# Step 2: Cold build. MODULE.bazel changed → Bazel must re-analyse everything. -# --nobuild skips compilation; we only care about analysis time. -# Cold analysis of //... can take 5–20 minutes on this repo — be patient. -# If it hangs with no output for >10 minutes, abort (Ctrl-C), restore -# MODULE.bazel with `git checkout -- MODULE.bazel`, and skip this check. -echo "--- Cold build (poisoned MODULE.bazel) ---" -T1_START=$(date +%s%N) -bbr build //... --nobuild 2>&1 | tail -5 -BBR_EXIT=$? -T1_END=$(date +%s%N) -T1_SEC=$(( (T1_END - T1_START) / 1000000000 )) -echo "Cold build: ${T1_SEC}s (exit $BBR_EXIT)" - -if [ $BBR_EXIT -ne 0 ]; then - git checkout -- MODULE.bazel - echo "SKIP: Cold build failed — skipping cache warmth check." -else - # Step 3: Warm build. Same poisoned MODULE.bazel, same runner expected. - echo "--- Warm build (same inputs, runner should be recycled) ---" - T2_START=$(date +%s%N) - bbr build //... --nobuild 2>&1 | tail -5 - T2_SEC=$(( ($(date +%s%N) - T2_START) / 1000000000 )) - echo "Warm build: ${T2_SEC}s" - - # Step 4: Restore MODULE.bazel. - git checkout -- MODULE.bazel - - # Step 5: Assess. 
- echo "--- Result: cold=${T1_SEC}s warm=${T2_SEC}s ---" - if [ "$T1_SEC" -lt 5 ]; then - echo "AMBIGUOUS: Cold build was <5s — build graph may be too small to" - echo " measure, or poisoning had no effect on this runner." - elif [ "$T2_SEC" -lt $(( T1_SEC / 3 )) ]; then - echo "OK: Warm build (${T2_SEC}s) < 1/3 of cold (${T1_SEC}s)." - echo " Runner recycling + analysis cache reuse is working." - else - echo "WARN: Warm build (${T2_SEC}s) is not much faster than cold (${T1_SEC}s)." - echo " Runner may not be recycled, or analysis cache is evicted." - echo " High false-positive rate — verify with a second run before diagnosing." - fi -fi +# Run interactively — the test fails if bb prints a remote picker prompt +# or errors on missing git config. +timeout 60 bbr build //devinfra:gazelle --nobuild 2>&1 | tail -10 ``` -**Interpreting results:** - -| Warm / Cold ratio | Interpretation | -| ----------------- | ------------------------------------------------------------- | -| < 33% | ✅ Runner recycling and analysis cache are working | -| 33–70% | ⚠️ Partial benefit; runner may be rotating | -| > 70% | ⚠️ Likely no recycling — but check again, high FP rate | -| Cold < 5s | ❓ Ambiguous — build graph too small or poisoning ineffective | - -**If consistently warm ≈ cold across two runs**: check the BuildBuddy run UI -(`bbapi invocation `) to see if runner IDs differ between invocations. If -they do, BB is not reusing runners — this may be a BB configuration or quota -issue. If runner IDs are the same but analysis is still slow, the Bazel server -on the runner may be restarting between invocations. - ---- - -### 7. Ducktape Git Hooks - -**Goal**: confirm pre-commit hooks actually pass when committing. Hooks break in -web sessions due to `bbr` using the session's local git proxy URL -(`127.0.0.1:*`) that the BuildBuddy cloud runner can't reach, or due to -`DUCKTAPE_PRECOMMIT_ENFORCE_TEST_TAG` being active while the commit-msg hook -receives no argv. 
+Pass: exit 0, no "which remote" prompt, no `Unable to resolve host`, no +`127.0.0.1:*` in the runner's origin URL. -#### 7a. Framework & installation +**W5 — Docker is available.** ```bash -# Are the git hook shims installed? -ls -la /home/user/ducktape/.git/hooks/pre-commit \ - /home/user/ducktape/.git/hooks/commit-msg 2>/dev/null || echo "HOOKS NOT INSTALLED" +docker info >/dev/null 2>&1 && echo OK || echo FAIL +``` -# What version of pre-commit? -pre-commit --version 2>/dev/null || echo "pre-commit not found" +## Out-of-SPEC diagnostics -# Which backend will detect_bazel_backend() pick? -python3 -c " -import os, shutil -bb = shutil.which('bbr') -key = os.environ.get('BUILDBUDDY_API_KEY') -print('bbr on PATH:', bool(bb)) -print('BUILDBUDDY_API_KEY set:', bool(key)) -print('=> backend:', 'BUILDBUDDY (bbr)' if bb and key else 'LOCAL (bazel)') -" +These are not in SPEC.md but catch real-world failure modes. Include them +in the report under a separate "Diagnostics" heading. -# Active test-tag enforcement? -echo "DUCKTAPE_PRECOMMIT_ENFORCE_TEST_TAG=${DUCKTAPE_PRECOMMIT_ENFORCE_TEST_TAG:-}" -``` +### D1 — `web_setup.sh` freshness (web only) -#### 7b. bbr git remote URL - -When the backend is `BUILDBUDDY`, `bb remote` reads `git remote -v` locally and -sends the remote URL to the cloud runner. If `origin` is `127.0.0.1:*` (Claude -Code web session proxy), the runner can't reach it. +Anthropic reuses Firecracker microVMs; `/tmp/web-setup.log` may be from a +prior session running an older `web_setup.sh`. A stale setup means Nix +devtools and skills may not match the current code. ```bash -git -C /home/user/ducktape remote -v | head -4 - -# Is origin a local proxy? 
-ORIGIN_URL=$(git -C /home/user/ducktape remote get-url origin 2>/dev/null) -if echo "$ORIGIN_URL" | grep -qE '127\.0\.0\.1|localhost'; then - echo "WARN: origin is a local proxy ($ORIGIN_URL)" - echo " BuildBuddy cloud runner cannot reach this — bbr queries will fail" - echo "FIX: git remote add github https://github.com/agentydragon/ducktape" - echo " git config buildbuddy.remote-bazel-default-remote github" -else - echo "OK: origin is externally reachable ($ORIGIN_URL)" -fi - -# Is the bb default remote override already set? -git -C /home/user/ducktape config buildbuddy.remote-bazel-default-remote 2>/dev/null \ - && echo "(buildbuddy.remote-bazel-default-remote override is set)" \ - || echo "(no remote override — bb will use origin)" +ls -la /tmp/web-setup.log 2>/dev/null || echo "MISSING" +tail -3 /tmp/web-setup.log 2>/dev/null # last line should be "Setup complete." +SETUP_COMMIT=$(grep 'web_setup.sh commit:' /tmp/web-setup.log 2>/dev/null | tail -1 | grep -oE '[0-9a-f]{40}') +HEAD_COMMIT=$(git -C /home/user/ducktape rev-parse HEAD) +[ "$SETUP_COMMIT" = "$HEAD_COMMIT" ] && echo "OK" || echo "STALE: setup=$SETUP_COMMIT head=$HEAD_COMMIT" ``` -#### 7c. Direct hook invocation +### D2 — `claude-hooks` daemon pin staleness -Test each hook binary directly, bypassing the full `git commit` flow. +A stale installed daemon is often the root cause of session hook failures. +Check the pinned commit against HEAD, and check whether release CI is +passing so that the pin would move if you waited. ```bash -# --- pytest-main-check --- -# Unset BUILDBUDDY_API_KEY to force local bazel (avoids bbr/proxy issues) -BUILDBUDDY_API_KEY= \ - ducktape-pytest-main-check \ - /home/user/ducktape/devinfra/claude/hook_daemon/test_hook_daemon.py \ - 2>&1 | tail -3 -echo "pytest-main-check exit: $?" - -# --- commit-msg hook --- -# Write a sample commit message and run the hook against it. 
-TMP_MSG=$(mktemp) -cat > "$TMP_MSG" <<'MSG' -test: dummy message for hook selfcheck - -https://claude.ai/code/test -MSG - -DUCKTAPE_PRECOMMIT_ENFORCE_TEST_TAG= \ - ducktape-commit-msg "$TMP_MSG" 2>&1 -echo "commit-msg exit: $?" -rm -f "$TMP_MSG" +python3 -c " +import json, re +pins = json.load(open('/home/user/ducktape/npins/sources.json'))['pins'] +url = pins.get('claude-hooks', {}).get('url', '') +m = re.search(r'claude-hooks-([0-9a-f]+)', url) +print('pinned:', m.group(1) if m else 'unknown') +" +git -C /home/user/ducktape log --oneline -5 -- devinfra/claude/ npins/sources.json ``` -**Failure: `pytest-main-check` exits 1 with `Command 'bazel' returned non-zero exit status 255`** - -Two sub-causes: +If the pin is behind HEAD, diff the installed Nix store package against +the repo source for breaking changes (renamed classes, changed config +paths, removed hooks). Check GitHub CI on `agentydragon/ducktape`: +recent `release.yml` and `sync-pins.yml` runs on `devel`. -- _bbr backend, local proxy_: `BUILDBUDDY_API_KEY` is set and origin is `127.0.0.1:*`. The cloud runner fetches from the local proxy URL it received in `RunRequest.repo.url` and fails. Fix: add the `github` remote and set `buildbuddy.remote-bazel-default-remote` (see 7b above). Or temporarily unset `BUILDBUDDY_API_KEY` before committing. -- _Concurrent bbr calls_: `build_bazel_index` runs two concurrent `bbr query` calls; if both race to re-initialise the runner repo, one fails. The local-proxy issue is the root cause in web sessions. +### D3 — `git remote origin` URL reachability (web only) -**Failure: `commit-msg` exits 1 with `commit message file path required as argument`** - -`DUCKTAPE_PRECOMMIT_ENFORCE_TEST_TAG=1` is active but `pass_filenames: false` was set (now fixed in `.pre-commit-config.yaml`). Confirm the fix is present: +Known failure mode: `bb remote` reads `git remote -v` locally and sends +the URL to the cloud runner. 
If `origin` is `127.0.0.1:*` (Claude Code +web session proxy), the runner can't reach it and `bbr` fails on hook +invocations. ```bash -grep -A6 'ducktape-commit-msg' /home/user/ducktape/.pre-commit-config.yaml | grep pass_filenames \ - && echo "WARN: pass_filenames still set" \ - || echo "OK: pass_filenames not set" +ORIGIN=$(git -C /home/user/ducktape remote get-url origin) +echo "$ORIGIN" | grep -qE '127\.0\.0\.1|localhost' && echo "WARN: local proxy origin" || echo "OK" +git -C /home/user/ducktape config buildbuddy.remote-bazel-default-remote 2>/dev/null \ + && echo "(buildbuddy remote override is set)" \ + || echo "(no remote override)" ``` -#### 7d. Live end-to-end commit test - -Actually exercises the full hook pipeline (pre-commit + commit-msg) on a -throwaway commit in a temp branch. Cleans up afterwards. - -```bash -set -e -cd /home/user/ducktape - -# Create a throwaway branch from HEAD -TEST_BRANCH="selfcheck/hook-test-$(date +%s)" -git checkout -q -b "$TEST_BRANCH" - -# Make a trivial tracked change -TEST_FILE=$(mktemp /home/user/ducktape/selfcheck-hook-test-XXXXX.txt) -echo "hook selfcheck $(date -Iseconds)" > "$TEST_FILE" -git add "$TEST_FILE" - -# Commit with BUILDBUDDY_API_KEY unset (forces local bazel in pre-commit) -# and DUCKTAPE_PRECOMMIT_ENFORCE_TEST_TAG unset (skips test-tag check) -BUILDBUDDY_API_KEY= DUCKTAPE_PRECOMMIT_ENFORCE_TEST_TAG= \ - git commit -m "test: selfcheck hook test — delete me" 2>&1 -COMMIT_EXIT=$? - -# Clean up: remove branch and file regardless of outcome -git checkout -q - -git branch -D "$TEST_BRANCH" -rm -f "$TEST_FILE" +## Report Format -echo "Live commit test exit: $COMMIT_EXIT" -[ "$COMMIT_EXIT" -eq 0 ] && echo "PASS: git hooks work" || echo "FAIL: git hooks broken" ``` +# Session Selfcheck — -Add `| grep -E '^(PASS|FAIL|Passed|Failed|error|Error)'` to the git commit -line to reduce noise, or run it unfiltered to see all hook output. 
+Profile: Summary: -**If the live test fails**: capture the full pre-commit output, then run the -failing hook directly (7c) to isolate which hook is broken. +## SPEC acceptance criteria ---- +| ID | Check | Status | Detail | +| --- | ------------------------------ | -------- | --------------------- | +| C1 | BUILDBUDDY_API_KEY valid | OK/FAIL | HTTP | +| C2 | GITHUB_TOKEN valid | OK/FAIL | login=... | +| C3 | bbr build trivial | OK/FAIL | ... | +| C4 | bazelisk shim + session tag | OK/FAIL | ... | +| C5 | PostToolUse auto-apply | OK/SKIP | manually verified? | +| C6 | throwaway commit end-to-end | OK/FAIL | ... | +| C7 | daemon log clean | OK/FAIL | N errors | +| C8 | OTLP tracing | OK/FAIL | HTTP | +| C9 | bbr analysis cache warm | OK/WARN/AMBIG | cold=Xs warm=Ys | +| CLI1–4 / W1–5 | ... | ... | -## Report Format +## Out-of-SPEC diagnostics -After running all checks, produce: +| ID | Check | Status | Detail | +| -- | ------------------------------ | ------------- | ---------------------- | +| D1 | web_setup.sh freshness | OK/STALE/MISS | setup= head= | +| D2 | claude-hooks pin staleness | OK/BEHIND | pin=, CI status | +| D3 | origin URL reachable for bbr | OK/WARN | origin=... | -``` -# Web Session Selfcheck — - -## Summary - - -## Checks - -| Check | Status | Detail | -| ---------------------------- | -------- | ----------------------------------------------- | -| web_setup.sh ran | OK/FAIL | ... 
| -| web_setup.sh commit | OK/STALE | setup= head= (VM reuse risk) | -| claude-hooks daemon version | OK/STALE | pinned= head= N commits behind; CI status | -| Session start hook | OK/FAIL | CANARY present / FileNotFoundError | -| SOPS_AGE_KEY | OK/FAIL | age public key matches .sops.yaml | -| Secret: buildbuddy.yaml | OK/FAIL| decrypts / API | -| Secret: github-agent-pat | OK/FAIL| decrypts / login=agentydragon-agent | -| Secret: github-ci-read-pat | OK/FAIL| decrypts / login=agentydragon | -| Secret: k8s-token | OK/FAIL| decrypts / API | -| Secret: otlp-token | OK/FAIL| decrypts / API | -| bbr / BuildBuddy RBE | OK/FAIL| ... | -| bbr runner recycling | OK/WARN/AMBIG | cold=Xs warm=Ys ratio=Z% | -| Git hooks (pre-commit) | OK/FAIL| backend=LOCAL/bbr; live commit pass/fail | -| Git hooks (commit-msg) | OK/FAIL| pass_filenames ok; ENFORCE_TEST_TAG state | - -## Issues & Remediation +## Issues & remediation ### -**Impact**: +**Spec criterion violated**: +**Impact**: **Root cause**: -**Fix**: - -... +**Fix** (for the user to run, not the skill): ``` -Prioritize issues by impact: hook failure > stale claude-hooks > credential -failures > CI pipeline issues. +Prioritize: SPEC violations first (the daemon is broken), then out-of-SPEC +diagnostics.
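The prioritization rule above can be sketched as a stable sort over finding lines. This is a minimal sketch under assumptions: the function name `sort_findings` is hypothetical, and each input line is assumed to start with a check ID (`C6`, `CLI2`, `W4`, `D1`, ...) followed by whitespace.

```shell
# Hypothetical helper (an assumption, not in the original skill):
# put SPEC criteria (C*, CLI*, W*) ahead of out-of-SPEC diagnostics (D*),
# keeping the original order within each group.
sort_findings() {
  awk '{ p = ($1 ~ /^D[0-9]+$/) ? 1 : 0; print p "\t" $0 }' \
    | sort -s -k1,1n \
    | cut -f2-
}
```

Usage: pipe one finding per line into `sort_findings` before writing the "Issues & remediation" section.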