feat: add --sandbox=container mode to vault run#99

Open
dangtony98 wants to merge 7 commits into main from
feat/sandbox-container-mode

Conversation

@dangtony98
Contributor

Summary

  • agent-vault vault run --sandbox=container -- <agent> launches the child inside a Docker container whose egress is locked down by iptables. Only the Agent Vault proxy is reachable — everything else is dropped at the kernel, closing the cooperative-sandbox escape hatches (unsetting HTTPS_PROXY, raw sockets, DNS exfil, subprocesses that don't inherit env).
  • Opt-in for now via --sandbox=container or AGENT_VAULT_SANDBOX=container; --sandbox=process stays default.
  • No server-side changes needed — existing SNI-driven leaf minting already accepts host.docker.internal, and routing container traffic via a loopback-dial forwarder means isLoopbackPeer still exempts it from TierAuth rate limits.

How it works

  • Per-invocation docker network (agent-vault-<sessionID>, labeled so PruneStaleNetworks can reconcile on next run with a 60s grace window to avoid racing with freshly-created peers). Not the default bridge — sibling containers cannot reach the forwarder.
  • Two-port raw TCP forwarder bound on that network's gateway, relaying to loopback 14321 + 14322. Preserves the MITM's SNI-based leaf minting (client sees host.docker.internal, matching leaf is minted on demand).
  • Container image built on first use from embedded Dockerfile + scripts (claude-code preinstalled, node 22 + iptables + gosu + curl + git + python3). Cached by content hash of embedded assets; bumping the assets auto-invalidates. User can override with --image.
  • Egress policy (init-firewall.sh): OUTPUT DROP default; ACCEPT only loopback, ESTABLISHED/RELATED, and the two forwarder ports at host.docker.internal. No DNS rule — resolved via /etc/hosts from --add-host=host-gateway, closing the DNS-exfil channel.
  • Child runs as unprivileged claude user (gosu post-init). --cap-drop=ALL + --cap-add=NET_ADMIN,NET_RAW (only init-firewall uses them; non-root process post-gosu doesn't get them as ambient caps). --security-opt=no-new-privileges.
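The container-hardening bullets above boil down to a handful of docker-run arguments. A minimal sketch with a hypothetical helper name (the PR's real builder is BuildRunArgs in internal/sandbox/docker.go, whose signature this does not mirror):

```go
package main

import "fmt"

// hardeningArgs sketches the docker-run hardening described above: drop all
// capabilities, re-add only what init-firewall.sh needs pre-gosu, and forbid
// privilege re-escalation. Illustrative only.
func hardeningArgs(network string) []string {
	return []string{
		"run", "--rm", "--init",
		"--network", network, // per-invocation bridge, not the default one
		"--add-host", "host.docker.internal:host-gateway", // resolved via /etc/hosts, so no DNS rule needed
		"--cap-drop=ALL",
		"--cap-add=NET_ADMIN", "--cap-add=NET_RAW", // consumed only by init-firewall.sh, before gosu drops root
		"--security-opt", "no-new-privileges",
	}
}

func main() {
	fmt.Println(hardeningArgs("agent-vault-a3f9c1e2b4d50678"))
}
```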

Changes

CLI

  • New flags on vault run: --sandbox (enum, parse-time validated), --image, --mount (symlink-resolved, reserved-path rejection), --keep, --no-firewall, --home-volume-shared (cmd/sandbox_flag.go, cmd/run.go, cmd/run_container.go).
  • Container orchestrator: preflight → prune stale CA files + networks → fetch CA → mint session ID → write CA to bind-mount tempfile → create per-invocation network → detect gateway → start forwarder → ensure image → build docker args → syscall.Exec("docker", …) so TTY/signals propagate naturally (cmd/run_container.go).

internal/sandbox/ (new package)

  • env.go — shared BuildProxyEnv (now used by the process path in cmd/run.go too, eliminating the drift risk across sources that emit the 9 MITM env vars) + container-specific BuildContainerEnv.
  • docker.go — pure BuildRunArgs + mount validator with filepath.EvalSymlinks defense against symlink-laundering forbidden host paths.
  • forwarder.go — context-cancellable two-port TCP relay; listeners are FD_CLOEXEC so they close cleanly when the caller execs docker.
  • network.go — CreatePerInvocationNetwork + PruneStaleNetworks with 60s grace window and label=agent-vault-sandbox=1 AND name=agent-vault-* double filter.
  • gateway.go — HostBindIP (loopback on macOS/Windows, bridge gateway on Linux).
  • cacopy.go — CA bind-mount at ~/.agent-vault/sandbox/ca-<sid>.pem (0o644 via explicit Chmod, parent 0o700). SessionID hex-regex validated so it can't traverse paths. 24h prune of stale files.
  • image.go — EnsureImage with content-hash tag caching (agent-vault/sandbox:<hash>), build-on-first-use via go:embed'd assets.
  • assets/{Dockerfile,init-firewall.sh,entrypoint.sh} — embedded into the binary.

Regression tests (cross-package invariants the sandbox depends on)

  • internal/ca/sandbox_sni_test.go — pins validateSNI("host.docker.internal") == (false, nil) so tightening SNI validation without updating this test would silently break the container path.
  • internal/mitm/sandbox_loopback_test.go — pins the forwarder-laundering invariant so the rate-limit loopback exemption keeps working without any change to the limiter.

Docs

Test plan

  • go build ./... clean
  • go build -tags docker_integration ./... clean
  • go test -race ./... all green (unit + sandbox-package tests)
  • Enum-typed --sandbox rejects bogus values at flag-parse time: agent-vault vault run --sandbox=bogus -- claude → invalid argument "bogus" for "--sandbox" flag: must be one of: process, container
  • Manual: agent-vault vault run --sandbox=container -- claude --version on Linux + macOS; confirm first-run image build succeeds and subsequent runs hit the cache
  • Manual egress proof: agent-vault vault run --sandbox=container -- bash -lc 'curl --max-time 3 https://1.1.1.1; echo exit=$?' → non-zero exit (SYN dropped)
  • Manual proxy proof: inside the same invocation, curl -fsS https://api.github.com/zen succeeds via the broker
  • Manual DNS proof: getent hosts google.com fails (no DNS rule)
  • Manual identity proof: agent-vault vault run --sandbox=container -- whoami prints claude
  • Manual cleanup proof: docker kill a running container mid-session; next vault run prunes the leaked network
  • go test -tags docker_integration ./internal/sandbox/ on a machine with docker running

🤖 Generated with Claude Code

Launches the agent inside a Docker container with iptables-locked
egress, so the child physically cannot reach anything except the
Agent Vault proxy — regardless of what it tries. Opt-in for now;
--sandbox=process remains the default.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@mintlify

mintlify bot commented Apr 21, 2026

Preview deployment for your docs. Learn more about Mintlify Previews.

| Project | Status | Preview | Updated (UTC) |
| --- | --- | --- | --- |
| agent-vault | 🟢 Ready | View Preview | Apr 21, 2026, 8:08 AM |


@gitguardian

gitguardian bot commented Apr 21, 2026

⚠️ GitGuardian has uncovered 1 secret following the scan of your pull request.

Please consider investigating the findings and remediating the incidents. Failure to do so may lead to compromising the associated services or software components.

🔎 Detected hardcoded secret in your pull request
| GitGuardian id | GitGuardian status | Secret | Commit | Filename |
| --- | --- | --- | --- | --- |
| 30536913 | Triggered | Generic Password | eab09d8 | internal/sandbox/cacopy_test.go |
🛠 Guidelines to remediate hardcoded secrets
  1. Understand the implications of revoking this secret by investigating where it is used in your code.
  2. Replace and store your secret safely, following best practices.
  3. Revoke and rotate this secret.
  4. If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.

🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.

dangtony98 and others added 2 commits April 21, 2026 01:09
On macOS Docker Desktop, `getent hosts host.docker.internal` returns
the AAAA record first, which init-firewall.sh then rejected as "not a
plain IPv4 literal" and aborted the container. Our iptables rules are
IPv4, so we need `getent ahostsv4` which only walks A records.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- --cap-drop=ALL strips CAP_SETUID/CAP_SETGID, so gosu in entrypoint.sh
  failed with EPERM when dropping root → claude. Re-add both caps; the
  non-root claude process still has an empty effective cap set after
  gosu, so the sandbox contract is unchanged.
- Quiet gosec G302 on the CA-file 0o644 Chmod: the container's claude
  user has to read the bind mount and the parent dir is 0o700, so the
  host attack surface is unchanged.
- Tighten the test-fixture WriteFile to 0o600 (G306) and wrap the
  forwarder's deferred Close in an explicit _ = (errcheck).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Binding on 127.0.0.1 relied on Docker Desktop's vpnkit routing
host.docker.internal traffic to the host's lo0. On newer Docker
Desktop builds (VZ / virtiofsd on Apple Silicon), that traffic is
delivered to a different host interface, so the loopback listener
never received the container's connection and the forwarder was
unreachable — producing ECONNREFUSED on the HTTPS_PROXY path.

Bind 0.0.0.0 instead to accept on whichever interface Desktop routes
through. The broker still requires a vault-scoped session token on
every request, so LAN reachability on an ephemeral port is not a
meaningful attack surface.

Linux path (bridge-gateway bind) is unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Comment on lines +106 to +110

args = append(args, "-v", cfg.WorkDir+":/workspace")
args = append(args, "-v", cfg.HostCAPath+":"+ContainerCAPath+":ro")

homeVolume := "agent-vault-claude-home-" + cfg.SessionID


🔴 Each non-shared vault run --sandbox=container invocation creates a named Docker volume (agent-vault-claude-home-<sessionID>) that is never deleted. Docker's --rm flag removes the container but explicitly does NOT remove named volumes; only anonymous volumes are cleaned up. The PR adds PruneStaleNetworks and PruneHostCAFiles for other ephemeral resources but omits an equivalent PruneStaleVolumes for these named home volumes. A fix should call docker volume rm on the per-invocation volume after each run completes, or add a startup prune pass analogous to PruneStaleNetworks.

Extended reasoning...

What the bug is

In internal/sandbox/docker.go:106-110, each non-shared invocation creates a named volume:

homeVolume := "agent-vault-claude-home-" + cfg.SessionID
if cfg.HomeVolumeShared {
    homeVolume = "agent-vault-claude-home"
}
args = append(args, "-v", homeVolume+":"+ContainerClaudeHome)

This produces a docker run -v agent-vault-claude-home-<16-hex-chars>:/home/claude/.claude argument, creating a named Docker volume on first use.

Why --rm does not help

Docker's --rm flag (set when !cfg.Keep in BuildRunArgs) removes the container after exit and also removes any anonymous volumes attached to it, but the Docker documentation explicitly states: "Named volumes are not removed". Since agent-vault-claude-home- is a named volume (it has an explicit name, not an auto-generated ID), Docker leaves it on disk indefinitely after the container exits.

Why existing cleanup mechanisms do not cover it

cmd/run_container.go:58-59 calls two prune helpers at startup:

sandbox.PruneHostCAFiles()
_ = sandbox.PruneStaleNetworks(ctx, sandbox.DefaultPruneGrace)

There is no analogous PruneStaleVolumes call. A search of the entire internal/sandbox package confirms no volume removal code exists anywhere. The pattern is clearly established for networks and CA files but was not extended to volumes.

Why deferred cleanup cannot help on the success path

cmd/run_container.go ends with syscall.Exec(dockerBin, ...), which replaces the current process image entirely. Go deferred functions are never called after syscall.Exec succeeds, so even if docker volume rm were added as a deferred call it would be a dead letter on the normal exit path. The deferred network removal added in the same file only fires on error arms before the Exec.

Step-by-step proof

  1. User runs vault run --sandbox=container -- claude (default: HomeVolumeShared=false).
  2. NewSessionID() returns e.g. a3f9c1e2b4d50678.
  3. BuildRunArgs appends -v agent-vault-claude-home-a3f9c1e2b4d50678:/home/claude/.claude.
  4. Docker creates the named volume on first use of that name.
  5. Claude runs, writes auth tokens and session history to /home/claude/.claude inside the container; this data lands in the volume.
  6. The container exits; --rm removes the container but not the volume. docker volume ls now shows agent-vault-claude-home-a3f9c1e2b4d50678.
  7. User runs the command a second time. A new session ID yields a new volume, agent-vault-claude-home-<newSessionID>. The old volume is untouched.
  8. After N invocations, N named volumes exist, each potentially containing megabytes of Claude session state, auth tokens, and caches. They accumulate without bound.

Impact

On a developer machine or CI system running many --sandbox=container sessions (e.g. automated agentic pipelines), disk usage grows proportionally to the number of invocations. The volumes contain /home/claude/.claude -- session history, credential caches, MCP configs -- which can reach tens to hundreds of MB each depending on usage. Manual cleanup via docker volume prune or docker volume ls | grep agent-vault-claude-home- | xargs docker volume rm is the only current remedy. The design intent stated in the PR description -- "per-invocation volume, losing auth state" -- implies these are meant to be ephemeral, making the omission of cleanup a design-implementation gap.

How to fix

Add a PruneStaleVolumes function in internal/sandbox/ (analogous to PruneStaleNetworks) that lists volumes matching the agent-vault-claude-home-* pattern and removes those not currently attached to a container. Call it from runContainer alongside the existing prune calls. Alternatively, since the volume name encodes the session ID, docker volume rm agent-vault-claude-home-<sessionID> could be invoked in a post-Exec wrapper or signal handler on the container exit path.
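A sketch of the selection logic such a PruneStaleVolumes helper would need, separate from the docker CLI plumbing (hypothetical — no such function exists in the diff; names follow the PR's conventions):

```go
package main

import (
	"fmt"
	"strings"
)

// staleHomeVolumes picks which per-invocation home volumes to prune: names
// with the session suffix that are not currently attached to a container.
func staleHomeVolumes(all []string, inUse map[string]bool) []string {
	const prefix = "agent-vault-claude-home-"
	var stale []string
	for _, v := range all {
		if !strings.HasPrefix(v, prefix) {
			// Skips unrelated volumes and the shared "agent-vault-claude-home"
			// volume, which has no per-session suffix.
			continue
		}
		if inUse[v] {
			continue // still attached to a running container; leave it alone
		}
		stale = append(stale, v)
	}
	return stale
}

func main() {
	all := []string{
		"agent-vault-claude-home",                  // shared: kept
		"agent-vault-claude-home-a3f9c1e2b4d50678", // leaked: pruned
		"agent-vault-claude-home-deadbeefcafef00d", // attached: kept
		"some-other-volume",
	}
	inUse := map[string]bool{"agent-vault-claude-home-deadbeefcafef00d": true}
	fmt.Println(staleHomeVolumes(all, inUse)) // prints [agent-vault-claude-home-a3f9c1e2b4d50678]
}
```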

Comment thread cmd/run_container.go
Comment on lines +123 to +127
workDir, err := os.Getwd()
if err != nil {
return fmt.Errorf("getwd: %w", err)
}



🔴 The current working directory is bind-mounted read-write at /workspace without the same ~/.agent-vault protection applied to user-supplied --mount flags. If a user runs vault run --sandbox=container while CWD is inside ~/.agent-vault (or a symlink resolving there), the vault directory — containing the encrypted credential database, the MITM CA private key, and session tokens — is exposed read-write to the container. The fix is to call validateHostSrc(workDir, home) on the os.Getwd() result before passing it to BuildRunArgs, mirroring the protection already applied to --mount entries.

Extended reasoning...

The bug

In cmd/run_container.go (lines 123–138), workDir is obtained via os.Getwd() and passed directly as Config.WorkDir to sandbox.BuildRunArgs. Inside BuildRunArgs (docker.go:103), it is added unconditionally as -v cfg.WorkDir:/workspace with no path validation whatsoever.

Existing protection is asymmetric

User-supplied --mount values flow through parseAndValidateMount → validateHostSrc (docker.go:152–175), which calls filepath.EvalSymlinks to resolve symlinks and then checks whether the resolved path equals or is nested under ~/.agent-vault. The workDir path skips this check entirely. The protection intent is explicit and deliberate in the --mount path; its absence on the workspace mount is an oversight.

Step-by-step proof

  1. Developer has agent-vault installed locally with a vault at ~/.agent-vault/.
  2. Developer navigates: cd ~/.agent-vault
  3. Developer runs: agent-vault vault run --sandbox=container -- claude
  4. os.Getwd() returns /home/user/.agent-vault
  5. BuildRunArgs appends -v /home/user/.agent-vault:/workspace to the docker argv
  6. The container starts with the entire vault directory mounted read-write at /workspace
  7. The container agent can now: read ca/ca.pem (MITM CA private key used to sign TLS leaves for all intercepted HTTPS traffic), read vault.db (credential database), and overwrite any of these files (key replacement attack)

The same scenario triggers if CWD is any subdirectory of ~/.agent-vault, or if a symlink in a normal-looking path resolves to somewhere under ~/.agent-vault.

Impact

In passwordless mode (DEK stored in plaintext — the documented default for local/PaaS use), the CA private key stored under ~/.agent-vault/ca/ is directly readable and the database is decryptable. In password-protected mode, write access allows an attacker to replace the CA key so future MITM intercepts use an attacker-controlled key. Both scenarios violate the core sandbox guarantee. The iptables egress lock is not relevant here — the attacker reads/writes the host filesystem via the bind mount, not the network.

Fix

Before passing workDir to BuildRunArgs, call the already-existing validateHostSrc (or an exported wrapper) with the os.UserHomeDir() result. This brings the workspace mount in line with the protection already applied to user-supplied mounts. Alternatively, BuildRunArgs itself could apply the check when cfg.WorkDir is set, since it already calls os.UserHomeDir() for the --mount path.

Comment on lines +23 to +28
# reply traffic to our allowed outbound conns matters, and it's caught
# by the conntrack rule.
iptables -F OUTPUT
iptables -P OUTPUT DROP
iptables -A OUTPUT -o lo -j ACCEPT
iptables -A OUTPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT


🔴 init-firewall.sh sets IPv4 iptables rules only — there are no ip6tables counterparts, so on Docker daemons with IPv6 enabled the container OUTPUT traffic over IPv6 is completely unrestricted. On any host where the Docker daemon is configured with IPv6 (daemon.json ipv6:true / --fixed-cidr-v6, or certain distro/cloud defaults), the per-invocation bridge network receives an IPv6 prefix and the agent can reach arbitrary external hosts via IPv6 connections, directly bypassing the core sandbox guarantee. Fix by adding ip6tables -P OUTPUT DROP and mirroring the ACCEPT rules in init-firewall.sh, and/or passing --opt com.docker.network.enable_ipv6=false in CreatePerInvocationNetwork.

Extended reasoning...

The bug

init-firewall.sh (lines 23-28 in the diff) flushes and default-denies IPv4 OUTPUT with iptables, then adds ACCEPT rules for loopback, ESTABLISHED/RELATED, and the two forwarder ports at host.docker.internal. There are zero ip6tables rules. The ip6tables OUTPUT chain default policy stays ACCEPT, meaning all IPv6 OUTPUT traffic from the container is unrestricted.

Code path that triggers it

CreatePerInvocationNetwork in internal/sandbox/network.go calls docker network create with only --driver bridge and label flags. It does not pass --opt com.docker.network.enable_ipv6=false. On Docker daemons where IPv6 is enabled at the daemon level (daemon.json with "ipv6":true plus a "fixed-cidr-v6" block, or the system-level dockerd --ipv6 flag), Docker allocates an IPv6 subnet for every user-defined bridge network, including agent-vault-<sessionID>. The container receives both an IPv4 and IPv6 address on that network.

Why existing code does not prevent it

init-firewall.sh validates that VAULT_HTTP_PORT and VAULT_MITM_PORT are set and that host.docker.internal resolves to a plain IPv4 literal. The IPv4 check only validates GW_IP is IPv4-formatted — it does not probe for or block IPv6 connectivity at all. The script then applies iptables rules exclusively, leaving the ip6tables OUTPUT chain untouched with its ACCEPT default.

Impact

The threat model in docs/guides/container-sandbox.mdx states: "the only TCP destination the container can reach is the Agent Vault proxy. Everything else is dropped at the kernel" and "Reach destinations outside host.docker.internal: via any network method (HTTPS, raw sockets, ICMP, whatever)." These claims are false on any Docker host where IPv6 is enabled on the daemon. An agent can open a raw IPv6 socket or call curl with an IPv6 address, completely bypassing the egress lockdown.

Step-by-step proof

  1. Docker daemon configured with "ipv6":true and "fixed-cidr-v6":"fd00::/80" in daemon.json.
  2. Agent runs: agent-vault vault run --sandbox=container -- bash
  3. CreatePerInvocationNetwork creates the bridge without --opt com.docker.network.enable_ipv6=false; Docker assigns an IPv6 subnet.
  4. Container starts; init-firewall.sh runs as root. iptables OUTPUT is set to DROP + selective ACCEPT. ip6tables OUTPUT stays at default ACCEPT.
  5. Inside the container, running: curl --max-time 3 https://[2606:4700:4700::1111] succeeds because no ip6tables DROP rule exists.
  6. The Dockerfile installs iptables (which includes ip6tables on Debian bookworm), so the tool is available; it is simply never invoked.

Fix

Add "ip6tables -F OUTPUT && ip6tables -P OUTPUT DROP" followed by "ip6tables -A OUTPUT -o lo -j ACCEPT" and "ip6tables -A OUTPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT" in init-firewall.sh. Additionally or alternatively, pass --opt com.docker.network.enable_ipv6=false in CreatePerInvocationNetwork to prevent IPv6 address assignment entirely. Both together is the most robust defense.
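The network-create half of the fix can be sketched as the argv it produces (the --opt and --label values come from the PR description; the helper itself is illustrative):

```go
package main

import "fmt"

// buildNetworkCreateArgs sketches CreatePerInvocationNetwork's argv with
// IPv6 disabled on the bridge, so the v4-only firewall covers every path
// even on daemons configured with ipv6:true.
func buildNetworkCreateArgs(sessionID string) []string {
	return []string{
		"network", "create",
		"--driver", "bridge",
		"--label", "agent-vault-sandbox=1",
		"--opt", "com.docker.network.enable_ipv6=false", // no v6 address, nothing for ip6tables to miss
		"agent-vault-" + sessionID,
	}
}

func main() {
	fmt.Println(buildNetworkCreateArgs("a3f9c1e2b4d50678"))
}
```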

Comment thread cmd/run_container.go
Comment on lines +25 to +38
func validateSandboxFlagConflicts(cmd *cobra.Command, mode SandboxMode) error {
if mode == SandboxContainer {
return nil
}
for _, name := range containerOnlyFlags {
f := cmd.Flags().Lookup(name)
if f == nil {
continue
}
if f.Changed {
return fmt.Errorf("--%s requires --sandbox=container", name)
}
}
return nil


🟡 The --no-mitm flag is silently accepted and ignored when --sandbox=container is used, even though container mode always routes through MITM and cannot bypass it. This is inconsistent with the explicit design principle stated in the adjacent code: container-only flags are rejected rather than silently ignored in process mode "rather than silently ignoring them, which would be a foot-gun" — the same principle should apply symmetrically for process-only flags in container mode.

Extended reasoning...

What the bug is and how it manifests

validateSandboxFlagConflicts (cmd/run_container.go:25-38) returns nil immediately when mode == SandboxContainer, skipping any validation of process-only flags. As a result, agent-vault vault run --no-mitm --sandbox=container -- claude accepts the flag without error or warning, even though --no-mitm has absolutely zero effect in container mode — the container path always calls fetchMITMCA and always routes all traffic through the MITM proxy.

The specific code path that triggers it

When the user passes --no-mitm --sandbox=container, validateSandboxFlagConflicts is called at cmd/run.go:75, but returns immediately at the if mode == SandboxContainer { return nil } branch on run_container.go:27. runContainer never calls cmd.Flags().GetBool("no-mitm") at any point — the flag is simply never consulted. The code even documents this on run_container.go:62: "Container mode always routes through MITM — --no-mitm is a process-mode-only escape hatch."

Why existing code doesn't prevent it

The containerOnlyFlags list only enumerates flags that are container-only (image, mount, keep, no-firewall, home-volume-shared). There is no corresponding list of process-only flags (like --no-mitm) that should be rejected in container mode. The validation function is asymmetric by construction.

Why the design principle demands symmetry

The comment on lines 21-23 explicitly states the governing rule: "containerOnlyFlags are no-ops in process mode; we reject them explicitly rather than silently ignoring them, which would be a foot-gun." This exact principle applies in reverse: --no-mitm is a no-op in container mode. A user who passes it may reasonably believe the MITM proxy is bypassed — particularly because --no-mitm is a meaningful, effective escape hatch in process mode (it disables all HTTPS_PROXY injection entirely).

Impact

No security regression: container mode enforces MITM at the iptables level regardless of what flags are passed, so the MITM is never actually bypassed. The impact is purely UX/correctness — a user who passes --no-mitm --sandbox=container gets no feedback that their flag is a no-op, which contradicts the stated design principle and could mislead the operator about the sandbox's actual network behavior.

Step-by-step proof

  1. User runs: agent-vault vault run --no-mitm --sandbox=container -- claude
  2. RunE resolves mode = SandboxContainer and calls validateSandboxFlagConflicts(cmd, SandboxContainer)
  3. validateSandboxFlagConflicts hits line 27: if mode == SandboxContainer { return nil } — returns immediately with no error
  4. RunE proceeds to runContainer(cmd, args, ...) (cmd/run.go:101-103)
  5. runContainer calls fetchMITMCA unconditionally (line ~60) and routes all HTTPS through MITM — --no-mitm is never read
  6. User believes MITM is disabled; MITM is fully active

How to fix

Add a processOnlyFlags list (e.g. ["no-mitm"]) and check it symmetrically inside validateSandboxFlagConflicts when mode == SandboxContainer, returning an error such as "--no-mitm is not supported in container mode (MITM is always active)". Alternatively, emit a fmt.Fprintf(os.Stderr, "warning: --no-mitm has no effect in container mode") instead of a hard error, matching the pattern used by --no-firewall.
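The symmetric validation can be sketched as follows (the processOnly list and error wording are assumptions modeled on the existing containerOnlyFlags pattern; `changed` stands in for cobra's Flag.Changed):

```go
package main

import "fmt"

// validateModeFlags rejects container-only flags in process mode AND
// process-only flags in container mode, so neither direction silently
// ignores a user's flag.
func validateModeFlags(containerMode bool, changed map[string]bool) error {
	containerOnly := []string{"image", "mount", "keep", "no-firewall", "home-volume-shared"}
	processOnly := []string{"no-mitm"}

	deny, why := containerOnly, "requires --sandbox=container"
	if containerMode {
		deny, why = processOnly, "is not supported in container mode (MITM is always active)"
	}
	for _, name := range deny {
		if changed[name] {
			return fmt.Errorf("--%s %s", name, why)
		}
	}
	return nil
}

func main() {
	fmt.Println(validateModeFlags(true, map[string]bool{"no-mitm": true}))
	fmt.Println(validateModeFlags(false, map[string]bool{"keep": true}))
	fmt.Println(validateModeFlags(true, map[string]bool{"keep": true}))
}
```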

dangtony98 and others added 3 commits April 21, 2026 01:23
Listeners set SOCK_CLOEXEC by default in Go, so syscall.Exec("docker",
…) closed the forwarder sockets before the container even started.
Claude's HTTPS_PROXY calls then hit an empty port and the HTTP client
surfaced ECONNREFUSED against api.anthropic.com.

Replace the exec with fork+Wait: stdio is passed through, signals are
ignored in the parent so the kernel delivers them to docker (which
fans them out via --init/tini → claude), and the forwarder goroutines
stay live for the container's lifetime. Exit with the child's exit
code on non-zero. Defer-based network cleanup now actually runs on
the success path too.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
gocritic flagged os.Exit as bypassing the deferred signal.Stop and
network teardown. Wrap the ExitError in a clear error string so Cobra
prints it and exits 1 — losing the exact child exit code, but keeping
network cleanup + signal-handler cleanup intact.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1. Validate the workspace (CWD) against the same reserved-host-path
   rules as user --mount. Running vault run --sandbox=container from
   inside ~/.agent-vault previously bind-mounted the encrypted CA
   key + vault database into the container read-write.

2. Lock down IPv6 egress in init-firewall.sh. iptables rules alone
   left the ip6tables OUTPUT chain at default ACCEPT, so on Docker
   daemons with IPv6 enabled the agent had unrestricted v6 egress.
   ip6tables now default-denies; we resolve host.docker.internal via
   ahostsv4 so v4 is the only path we need.

3. Clean up per-invocation agent-vault-claude-home-<sid> volumes.
   Docker's --rm removes the container but not named volumes, so
   previously one claude-home volume leaked per invocation. Add
   deferred RemoveVolume + startup PruneStaleVolumes (analogous to
   PruneStaleNetworks). The shared volume is excluded by name.

4. Reject --no-mitm in container mode symmetrically to how
   container-only flags are rejected in process mode. Container mode
   always routes through MITM — silently ignoring --no-mitm misled
   operators about the sandbox's network behavior.

Asset hash updated (init-firewall.sh changed).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
