Allow container socket paths to be configured via config file#4943

Open
kantord wants to merge 2 commits into main from issue-4612

Member

@kantord kantord commented Apr 20, 2026

Summary

Users with non-standard container socket paths had to set environment variables
(TOOLHIVE_DOCKER_SOCKET, TOOLHIVE_PODMAN_SOCKET, TOOLHIVE_COLIMA_SOCKET) in every
shell session or wrapper script. This is fragile because GUI and CLI processes can inherit
environment variables differently depending on how they are launched, making misconfiguration
hard to spot and debug.

  • Add container_runtime block to ~/.config/toolhive/config.yaml so socket paths can be
    set once and persist across all launch methods
  • Socket resolution now follows: env var → config file → auto-detection (existing precedence preserved)
  • Add SocketConfig to pkg/container/runtime as the shared type used by both the config
    package and the socket detection code (avoids an import cycle)

Fixes #4612

Type of change

  • New feature

Test plan

  • Unit tests (task test)
  • Linting (task lint-fix)

New unit tests in pkg/container/docker/sdk/client_unix_test.go:

  • TestFindContainerSocket_ConfigPath — table-driven: valid config path accepted, nonexistent path returns error
  • TestFindContainerSocket_EnvVarPrecedence — env var overrides a config-supplied path

Does this introduce a user-facing change?

Yes. Users can now configure socket paths in ~/.config/toolhive/config.yaml:

container_runtime:
  docker_socket: /run/user/1000/docker.sock
  podman_socket: /run/user/1000/podman/podman.sock
  colima_socket: /home/user/.colima/custom/docker.sock

This is lower-friction than setting environment variables, and works consistently regardless of how ToolHive is launched.
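
For orientation, the YAML above would map onto a struct shaped roughly like the one below. This is a sketch under assumptions: the field names follow the review snippets later in this thread, but the struct tags are guesses derived from the YAML keys, and `yamlKeys` is a hypothetical helper that exists only to print the mapping without pulling in a YAML library.

```go
package main

import (
	"fmt"
	"reflect"
)

// SocketConfig mirrors the three keys under container_runtime.
// Field names follow the review snippets; the tags are assumed.
type SocketConfig struct {
	DockerSocket string `yaml:"docker_socket,omitempty"`
	PodmanSocket string `yaml:"podman_socket,omitempty"`
	ColimaSocket string `yaml:"colima_socket,omitempty"`
}

// yamlKeys lists the yaml struct tag of every field in v's struct type.
func yamlKeys(v any) []string {
	t := reflect.TypeOf(v)
	keys := make([]string, 0, t.NumField())
	for i := 0; i < t.NumField(); i++ {
		keys = append(keys, t.Field(i).Tag.Get("yaml"))
	}
	return keys
}

func main() {
	for _, k := range yamlKeys(SocketConfig{}) {
		fmt.Println(k)
	}
}
```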

Special notes for reviewers

The SocketConfig struct lives in pkg/container/runtime rather than pkg/config to avoid
an import cycle: pkg/container/docker/sdk → pkg/config → pkg/secrets → ... → pkg/container/docker/sdk.
The pkg/container/runtime package is already imported by both sides and has precedent for
this pattern (see ScalingConfig).

Generated with Claude Code

@github-actions github-actions Bot added the size/S Small PR: 100-299 lines changed label Apr 20, 2026
Member Author

kantord commented Apr 20, 2026

@claude deep review of branch please

Contributor

claude Bot commented Apr 20, 2026

Claude finished @kantord's task in 2m 27s


Deep Code Review

  • Gather PR context
  • Read changed files
  • Read surrounding context
  • Analyze for correctness, security, style, and test coverage

Summary: This PR adds support for configuring container socket paths (Docker, Podman, Colima) via the ToolHive config file (~/.config/toolhive/config.yaml), with env vars taking precedence over config overrides, which take precedence over auto-detection. The intent is good and the approach is mostly sound, but there are a few bugs ranging from a build-breaking Windows issue to a behavioral runtime bug.


🔴 Critical — Build broken on Windows

client_windows.go:62 still has the old signature:

func findPlatformContainerSocket(rt runtime.Type) (string, runtime.Type, error) {

But factory.go:162 now calls it as:

return findPlatformContainerSocket(rt, overrides)

This is a compile error on Windows. The Windows implementation needs to be updated to accept overrides socketPathConfig even if it ignores the value (config file overrides don't apply to named pipes in the same way, but the signature must match what factory.go expects).
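
A minimal sketch of the signature change, assuming the Windows implementation can ignore the overrides for now. `Type` and `socketPathConfig` are local stand-ins for `runtime.Type` and the override struct named above, and the function body is a placeholder because the real named-pipe detection is not shown in this thread.

```go
package main

import (
	"errors"
	"fmt"
)

// Type stands in for runtime.Type.
type Type string

// socketPathConfig stands in for the override struct named in the review.
type socketPathConfig struct {
	dockerSocket, podmanSocket, colimaSocket string
}

// findPlatformContainerSocket matches the two-argument call in factory.go.
// The blank identifier accepts the overrides without using them.
func findPlatformContainerSocket(rt Type, _ socketPathConfig) (string, Type, error) {
	// Placeholder: the existing Windows detection would run here.
	return "", rt, errors.New("detection not implemented in this sketch")
}

func main() {
	_, rt, err := findPlatformContainerSocket("docker", socketPathConfig{})
	fmt.Println(rt, err)
}
```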


🔴 High — Config overrides ignore the rt parameter, blocking fallback

In client_unix.go around line 80–88, the config override checks fire unconditionally regardless of which runtime is being probed:

// This block is reached for EVERY iteration of the loop (TypePodman, TypeDocker, TypeColima)
if overrides.podmanSocket != "" {
    return resolveConfigSocket(overrides.podmanSocket, runtime.TypePodman, "Podman")
}
if overrides.dockerSocket != "" {
    return resolveConfigSocket(overrides.dockerSocket, runtime.TypeDocker, "Docker")
}

Scenario: user sets podman_socket to a path that doesn't exist yet (e.g., a path they plan to create). All three iterations of the outer loop in NewDockerClient hit the same podmanSocket != "" check, all return an error for that path, and Docker and Colima are never tried. The fallback logic is effectively dead.

The fix is to gate each override on the matching rt:

switch rt {
case runtime.TypePodman:
    if overrides.podmanSocket != "" {
        return resolveConfigSocket(overrides.podmanSocket, runtime.TypePodman, "Podman")
    }
case runtime.TypeDocker:
    if overrides.dockerSocket != "" {
        return resolveConfigSocket(overrides.dockerSocket, runtime.TypeDocker, "Docker")
    }
case runtime.TypeColima:
    if overrides.colimaSocket != "" {
        return resolveConfigSocket(overrides.colimaSocket, runtime.TypeColima, "Colima")
    }
}
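
The effect of the gated version can be checked with a small standalone simulation. Everything below is a sketch with stand-in names: `overrideFor`, `findSocket`, and `firstAvailable` are hypothetical, `exists` replaces the `os.Stat` check, and the `detected` map replaces real auto-detection. The runtime order follows the loop order quoted above.

```go
package main

import (
	"errors"
	"fmt"
)

type Type string

const (
	TypePodman Type = "podman"
	TypeDocker Type = "docker"
	TypeColima Type = "colima"
)

type overrides struct {
	podmanSocket, dockerSocket, colimaSocket string
}

// overrideFor returns only the override that belongs to rt.
func overrideFor(rt Type, o overrides) string {
	switch rt {
	case TypePodman:
		return o.podmanSocket
	case TypeDocker:
		return o.dockerSocket
	case TypeColima:
		return o.colimaSocket
	}
	return ""
}

// findSocket probes one runtime. A matching override must point at an
// existing path; without an override, auto-detection is consulted.
func findSocket(rt Type, o overrides, exists func(string) bool, detected map[Type]string) (string, error) {
	if p := overrideFor(rt, o); p != "" {
		if !exists(p) {
			return "", fmt.Errorf("invalid %s socket path from config: %s", rt, p)
		}
		return p, nil
	}
	if p, ok := detected[rt]; ok {
		return p, nil
	}
	return "", errors.New("not found")
}

// firstAvailable mimics the outer loop: each runtime is tried independently.
func firstAvailable(o overrides, exists func(string) bool, detected map[Type]string) (Type, string, error) {
	var lastErr error
	for _, rt := range []Type{TypePodman, TypeDocker, TypeColima} {
		p, err := findSocket(rt, o, exists, detected)
		if err == nil {
			return rt, p, nil
		}
		lastErr = err
	}
	return "", "", fmt.Errorf("no supported container runtime available: %w", lastErr)
}

func main() {
	// A stale Podman override plus an auto-detectable Docker socket.
	stale := overrides{podmanSocket: "/does/not/exist.sock"}
	exists := func(string) bool { return false }
	detected := map[Type]string{TypeDocker: "/var/run/docker.sock"}

	rt, p, err := firstAvailable(stale, exists, detected)
	fmt.Println(rt, p, err)
}
```

With the gate in place, the broken Podman override only fails the Podman iteration, and the Docker iteration is free to fall through to auto-detection.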


🟡 High — New config fields aren't in pkg/config/Config

factory.go defines containerRuntimeOverrides as a parallel struct for parsing the container_runtime subtree of the config YAML. But pkg/config/config.go:Config has no ContainerRuntime field. As a result:

  • There's no way to write these values via thv config set or UpdateConfig
  • If the main config machinery re-serializes the config, container_runtime will be silently dropped
  • Users have to hand-edit the YAML with no documentation or validation

The comment acknowledges that the import cycle (pkg/transport → pkg/container) prevents using pkg/config here, but that constraint only affects the reading side in this package. The main config struct in pkg/config should be extended:

// In pkg/config/config.go Config struct:
ContainerRuntime ContainerRuntimeConfig `yaml:"container_runtime,omitempty"`

This breaks no cycle (the pkg/config package doesn't import pkg/container), and it makes the feature a first-class, documented config field.


🟡 High — loadSocketOverrides always ignores custom --config path

loadSocketOverrides in factory.go:77 hardcodes the XDG default path:

configPath, err := xdg.ConfigFile("toolhive/config.yaml")

If a user runs thv --config /custom/path/config.yaml, the socket overrides will silently be read from ~/.config/toolhive/config.yaml, not from the specified file. This will cause confusing behavior.

There's no obvious fix here without significant refactoring (passing the config path down to NewDockerClient), but the limitation should at minimum be documented in the function's comment — and ideally tracked as a follow-up issue.


🟡 Medium — Parallel config type risks drift

containerRuntimeOverrides in factory.go:66 mirrors a fragment of the main config schema. Per project conventions (go-style.md):

Don't define a separate config/data type that mirrors an existing one. Embed or reuse the original — two parallel structs require a conversion step and will diverge over time.

If the field names in the main config ever change, containerRuntimeOverrides will silently diverge. The recommended fix (above) of adding the fields to pkg/config/Config removes this risk.


🟡 Medium — Colima override returns runtime.TypeDocker inconsistently

client_unix.go:87:

if overrides.colimaSocket != "" {
    return resolveConfigSocket(overrides.colimaSocket, runtime.TypeDocker, "Colima")
}

Auto-detection (client_unix.go:104) returns runtime.TypeColima. The env var handler also returns runtime.TypeDocker for Colima, which is a pre-existing inconsistency. The config override should either match auto-detection (runtime.TypeColima), or all three paths should be aligned on a single type.


🟡 Medium — Missing test coverage

client_unix_test.go has two tests, both for Docker. The following cases are untested:

  • podmanSocket override
  • colimaSocket override
  • Invalid socket path (file doesn't exist) — should return an error, not succeed
  • Config override when rt doesn't match the override type (currently broken, see above)

Also: TestFindContainerSocket_ConfigOverride is not parallel-safe if the rt-awareness bug is fixed — it currently passes only because the override fires unconditionally. After the fix, passing TypeDocker with dockerSocket set should still work, but there should be a complementary test that passes TypePodman with only dockerSocket set and verifies auto-detection is used for Podman.


🟢 Low — makeTempSocket creates a regular file, not a socket

func makeTempSocket(t *testing.T) string {
    t.Helper()
    p := filepath.Join(t.TempDir(), "test.sock")
    require.NoError(t, os.WriteFile(p, nil, 0600))
    return p
}

os.WriteFile creates a regular file. resolveConfigSocket only calls os.Stat, so the test passes regardless, but the name makeTempSocket implies a Unix domain socket. Either rename the helper to makeTempFile or have it create a real socket with net.Listen("unix", p) and t.Cleanup. This is cosmetic, but the misleading name could confuse future contributors.


🟢 Low — Empty PR description

The PR body is entirely unfilled template text. This makes it hard to understand the motivation, what issue it fixes, or how to test it. Per project conventions, the Summary, Type of change, and Test plan sections are required.


Summary table

Severity | Issue
🔴 Build-breaking | Windows compile error: client_windows.go not updated to new signature
🔴 Runtime bug | Config overrides don't gate on rt, blocking runtime fallback
🟡 Usability | container_runtime fields missing from main Config struct
🟡 Correctness | loadSocketOverrides ignores custom --config path
🟡 Maintainability | Parallel config type risks drifting from main schema
🟡 Correctness | Colima override returns TypeDocker vs TypeColima in auto-detection
🟡 Test coverage | No tests for Podman/Colima overrides, invalid paths, or rt-mismatch cases
🟢 Cosmetic | makeTempSocket creates a regular file, not a socket
🟢 Process | PR description is empty


codecov Bot commented Apr 20, 2026

Codecov Report

❌ Patch coverage is 62.85714% with 13 lines in your changes missing coverage. Please review.
✅ Project coverage is 69.54%. Comparing base (ee39caf) to head (dca6c82).

Files with missing lines | Patch % | Lines
pkg/container/docker/sdk/factory.go | 52.63% | 6 Missing and 3 partials ⚠️
pkg/container/docker/sdk/client_unix.go | 75.00% | 2 Missing and 2 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4943      +/-   ##
==========================================
+ Coverage   69.49%   69.54%   +0.04%     
==========================================
  Files         551      551              
  Lines       55907    55938      +31     
==========================================
+ Hits        38854    38902      +48     
+ Misses      14058    14039      -19     
- Partials     2995     2997       +2     

@github-actions github-actions Bot added size/S Small PR: 100-299 lines changed and removed size/S Small PR: 100-299 lines changed labels Apr 20, 2026
@kantord kantord marked this pull request as ready for review April 21, 2026 12:54
@kantord kantord requested review from JAORMX and jhrozek as code owners April 21, 2026 12:54
@kantord kantord self-assigned this Apr 21, 2026
@github-actions github-actions Bot added size/S Small PR: 100-299 lines changed and removed size/S Small PR: 100-299 lines changed labels Apr 21, 2026
Contributor

@jhrozek jhrozek left a comment

A few observations from reviewing the socket-override threading.

    return customSocketPath, runtime.TypeDocker, nil
}

// Check config file overrides (after env vars, before auto-detection)

I think there's a correctness issue here. checkSocketConfigOverrides doesn't consider the rt argument, so whichever override field is non-empty wins for every iteration of the outer loop in NewDockerClient.

Concrete scenario: a user has a stale podman_socket: in config and Docker is reachable via auto-detect. Iter 1 (Podman) fails because the Podman path doesn't exist. Iter 2 should auto-detect Docker — but checkSocketConfigOverrides returns the same Podman error again, Docker auto-detect never runs, and NewDockerClient returns "no supported container runtime available: invalid Podman socket path from config" on a machine where Docker works fine.

Windows has the same shape at client_windows.go:92-99.

Fix I'd suggest: make the override check match the current rt — only look at overrides.PodmanSocket when rt == TypePodman, etc. Then a broken podman override only disables podman, and the outer loop's independent-per-runtime semantics work as intended. Side benefit: the "priority ordering" question (which field wins when multiple are set) disappears, since each runtime consults only its own field.


// loadSocketOverrides reads socket path overrides from the ToolHive config file.
// Best-effort: returns empty overrides on any error so auto-detection takes over.
func loadSocketOverrides() runtime.SocketConfig {

Would it be worth routing this through the existing config Provider abstraction? pkg/config.NewProvider() dispatches to DefaultProvider / PathProvider / KubernetesProvider and honours RegisterProviderFactory for injected providers (see pkg/config/interface.go:225, :396, :566). Reading XDG directly means PathProvider consumers (custom config paths, tests) and any code using RegisterProviderFactory won't see this feature.

The reason to not just use NewProvider().GetConfig() is the "first-run creates the file + runs migrations" side effect — but in practice the CLI has almost always touched config by the time NewDockerClient runs, so the singleton is populated and no extra file creation happens. A read-only LoadIfExists() on the Provider interface would make the concern go away entirely.

    return resolveConfigSocket(overrides.DockerSocket, runtime.TypeDocker, "Docker")
}
if overrides.ColimaSocket != "" {
    return resolveConfigSocket(overrides.ColimaSocket, runtime.TypeDocker, "Colima")

Small inconsistency: the config override for Colima returns runtime.TypeDocker here, but the auto-detect branch at client_unix.go:98-103 returns runtime.TypeColima. The env-var branch at client_unix.go:74-76 also returns TypeDocker, so this PR is copying the existing pattern — but the disagreement with auto-detect means any caller that branches on runtime.TypeColima behaves differently depending on how the user configured Colima.

Either align all three branches (env + config to return TypeColima, or auto-detect to TypeDocker), or leave a comment explaining why the three disagree.

var cfg struct {
    ContainerRuntime runtime.SocketConfig `yaml:"container_runtime,omitempty"`
}
if err := yaml.Unmarshal(data, &cfg); err != nil {

loadSocketOverrides has no tests, and it's the integration point that turns YAML into SocketConfig. It also has a deliberate "best-effort, return zero value on any error" contract across three distinct failure modes (xdg lookup, file read, yaml parse) — that contract should be pinned.

Worth a table-driven test that points XDG_CONFIG_HOME at t.TempDir() and covers: missing file, empty file, malformed YAML, valid file with each subset of fields populated, valid file with no container_runtime key, valid file with only unrelated keys. adrg/xdg provides xdg.Reload() if you hit caching issues on macOS. Non-parallel because of t.Setenv.

Labels

size/S Small PR: 100-299 lines changed

Development

Successfully merging this pull request may close these issues.

Support configuring container engine socket path via config file

2 participants