Skip to content

Move kubeconfig setup from Python to inline shell script#1281

Open
agentydragon wants to merge 1 commit intodevelfrom
claude/investigate-kubeconfig-staging-xCmX2
Open

Move kubeconfig setup from Python to inline shell script#1281
agentydragon wants to merge 1 commit intodevelfrom
claude/investigate-kubeconfig-staging-xCmX2

Conversation

@agentydragon
Copy link
Copy Markdown
Owner

Summary

Migrate kubeconfig generation from a Python module executed during session start to an inline shell script in the background commands. This simplifies the codebase by eliminating dedicated Python modules while maintaining the same functionality through a background task.

Key Changes

  • Removed Python kubeconfig module: Deleted devinfra/claude/hook_daemon/session_start/kubeconfig.py and its tests (test_kubeconfig.py), moving the logic into an inline Python script within the background commands
  • Added background command: Implemented "kubeconfig setup" as a background command in devinfra/claude/hook_daemon/profiles/web/profile.yaml that:
    • Reads K8S_TOKEN environment variable
    • Generates ~/.kube/config with full-content comparison (write if absent, no-op if identical, refuse if different)
    • Supports optional CA certificate from session directory
    • Respects HTTPS_PROXY environment variable
    • Includes canary header to identify auto-generated files
  • Removed kubeconfig from session handler: Eliminated kubeconfig writing logic from handler.py, including the kubeconfig_path field from PlatformSetup and related imports
  • Removed KUBECONFIG export: Deleted kubeconfig path export from env_file.py since the file is now written to standard ~/.kube/config location
  • Removed profile configuration: Deleted write_kubeconfig boolean flag from ProfileConfig in config.py (was used to skip kubeconfig setup in CLI profile)
  • Updated CLI profile: Removed write_kubeconfig: false from CLI profile configuration
  • Updated documentation: Modified context template and error messages to reflect that kubeconfig is now set up via background task rather than explicit session start step

Implementation Details

  • The inline Python script uses the same kubeconfig structure as the original module
  • Full-content comparison prevents accidental overwrites of user-modified kubeconfig files
  • The background command runs with after_env: true to ensure environment variables are available
  • Exit code 0 indicates success (write or no-op), exit code 1 indicates refusal due to unexpected content
  • Standard ~/.kube/config location is used instead of session-specific path, making kubectl available across sessions

https://claude.ai/code/session_01ADEasirb4eXB836Nm5A52S

Removes the Python kubeconfig module and writes ~/.kube/config via a
shell background command in the web profile instead. Uses full-content
comparison to refuse overwriting unexpected file content, and reports
status via stdout (auto-piped to agent by drain_bg_output). No KUBECONFIG
env export needed since the standard location is used.

https://claude.ai/code/session_01ADEasirb4eXB836Nm5A52S
Copilot AI review requested due to automatic review settings April 14, 2026 00:06
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR migrates kubeconfig generation for Claude Code sessions from a dedicated Python module executed during session start into a background task (inline Python embedded in a shell command), and removes the related session-start wiring and environment exports.

Changes:

  • Removed session_start/kubeconfig.py and its tests, plus the session-start handler plumbing that wrote/exported KUBECONFIG.
  • Added a new kubeconfig setup background command in the web profile to write ~/.kube/config based on K8S_TOKEN, optional combined CA, and proxy env vars.
  • Simplified profile/config/env-file structures by removing the write_kubeconfig flag and the kubeconfig_path field/export.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
devinfra/claude/hook_daemon/session_start/test_kubeconfig.py Deletes unit tests for the removed kubeconfig writer.
devinfra/claude/hook_daemon/session_start/kubeconfig.py Deletes the Python kubeconfig builder/writer module.
devinfra/claude/hook_daemon/session_start/handler.py Removes kubeconfig write step and related fields/imports from session start flow.
devinfra/claude/hook_daemon/profiles/web/profile.yaml Adds kubeconfig setup as an after_env background command writing ~/.kube/config.
devinfra/claude/hook_daemon/profiles/web/context.mako Updates session context messaging to reflect new kubeconfig setup behavior.
devinfra/claude/hook_daemon/profiles/cli/profile.yaml Removes now-defunct write_kubeconfig: false profile setting.
devinfra/claude/hook_daemon/config.py Removes ProfileConfig.write_kubeconfig flag.
devinfra/claude/env_file.py Removes kubeconfig_path field and stops exporting KUBECONFIG.

Comment on lines +169 to +174
else
case $? in
1) echo "kubectl: ~/.kube/config has unexpected content — refusing to overwrite. Delete it to auto-configure as ${K8S_SA} SA." ;;
*) echo "kubectl: setup failed (exit $?) — check hook daemon log." ;;
esac
fi
Copy link

Copilot AI Apr 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The shell wrapper doesn’t propagate the Python failure/refusal exit code. In the else branch you print an error, but the overall command will still exit 0 (because echo succeeds), so the hook daemon will report the task as successful even when Python returned 1. Capture the Python return code into a variable immediately after the heredoc, use that for the case, and exit with that same code after printing the message.

Copilot uses AI. Check for mistakes.
Comment on lines +170 to +172
case $? in
1) echo "kubectl: ~/.kube/config has unexpected content — refusing to overwrite. Delete it to auto-configure as ${K8S_SA} SA." ;;
*) echo "kubectl: setup failed (exit $?) — check hook daemon log." ;;
Copy link

Copilot AI Apr 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The case $? in and the fallback message exit $?) both rely on $?, but $? will change as soon as any command runs in the else branch (including the case itself / echo). Save the Python exit status to a variable (e.g., rc=$?) before the case, then reference $rc in messages and in exit so the reported status is accurate.

Suggested change
case $? in
1) echo "kubectl: ~/.kube/config has unexpected content — refusing to overwrite. Delete it to auto-configure as ${K8S_SA} SA." ;;
*) echo "kubectl: setup failed (exit $?) — check hook daemon log." ;;
rc=$?
case $rc in
1) echo "kubectl: ~/.kube/config has unexpected content — refusing to overwrite. Delete it to auto-configure as ${K8S_SA} SA." ;;
*) echo "kubectl: setup failed (exit $rc) — check hook daemon log." ;;

Copilot uses AI. Check for mistakes.
Comment on lines +146 to +148
proxy = os.environ.get("HTTPS_PROXY") or os.environ.get("https_proxy")
if proxy:
cluster["proxy-url"] = proxy
Copy link

Copilot AI Apr 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The proxy selection here only checks HTTPS_PROXY/https_proxy. The previous implementation used devinfra.claude.auth_proxy.vars.get_proxy_url() which walks a broader set of proxy env vars (including HTTP_PROXY, GLOBAL_AGENT_*, YARN_*). To preserve behavior, consider matching that priority list (or at least include HTTP_PROXY/http_proxy) so kubectl doesn’t silently miss the configured proxy in environments that don’t set HTTPS_PROXY.

Copilot uses AI. Check for mistakes.
Comment on lines +118 to +138
K8S_SERVER="https://api.allegedly.works:16443"
K8S_SA="claude-code-web"
K8S_NS="claude-sandbox"

if [ -z "${K8S_TOKEN:-}" ]; then
echo "kubectl: K8S_TOKEN unavailable — ~/.kube/config not written."
exit 0
fi

if python3 - <<'PYEOF'; then
import base64, os, pathlib, sys, yaml

CANARY_HEADER = (
"# CANARY:SESSION_START_HOOK — Generated by devinfra/claude (session_start hook).\n"
"# This file is managed automatically. Delete it to force regeneration.\n"
)
token = os.environ["K8S_TOKEN"]
server = "https://api.allegedly.works:16443"
sa = "claude-code-web"
ns = "claude-sandbox"
kc_path = pathlib.Path.home() / ".kube" / "config"
Copy link

Copilot AI Apr 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

K8S_SERVER, K8S_SA, and K8S_NS are set in the shell block, but the embedded Python duplicates them as hard-coded literals (and the profile also has a top-level k8s: section with the same values). This makes it easy for these to drift out of sync. Prefer passing the values into Python via environment (export the vars and read them in Python) or deriving them from a single source of truth.

Copilot uses AI. Check for mistakes.
Comment on lines +159 to +163
if kc_path.exists():
if kc_path.read_text() == intended:
sys.exit(0) # already correct — no-op
sys.exit(1) # unexpected content — refuse

Copy link

Copilot AI Apr 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The overwrite guard currently treats any difference from the intended content as “foreign content” and exits 1, even if the existing file was previously generated by this task (it contains the CANARY header) but needs an update (e.g., token rotation, CA bundle change, cluster URL change). Consider allowing overwrite when the existing file starts with the canary header, and only refusing when the file is non-canary/user-managed, so the system can self-heal without manual deletion.

Copilot uses AI. Check for mistakes.
platform_detect,
tmpfs,
)
from devinfra.claude.hook_daemon.session_start import buildbuddy, container_runtime, mkcert, platform_detect, tmpfs
Copy link

Copilot AI Apr 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bazel targets under devinfra/claude/hook_daemon/session_start/BUILD.bazel still reference the removed kubeconfig.py (both the :kubeconfig library and test_kubeconfig). Since this change removes the import and deletes the file, the Bazel build/test graph will break until those targets/deps are removed or updated accordingly.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants