Skip to content

Add Redis Cluster mode support to auth server storage#5153

Open
reyortiz3 wants to merge 6 commits intomainfrom
worktree-redis-cluster-support-to-auth-server-storage
Open

Add Redis Cluster mode support to auth server storage#5153
reyortiz3 wants to merge 6 commits intomainfrom
worktree-redis-cluster-support-to-auth-server-storage

Conversation

@reyortiz3
Copy link
Copy Markdown
Contributor

@reyortiz3 reyortiz3 commented May 1, 2026

Summary

Managed Redis services like GCP Memorystore Cluster and AWS ElastiCache cluster mode enabled expose a single discovery endpoint — not a list of shard nodes. The operator needs a way to tell go-redis to use the Cluster protocol for that endpoint, rather than treating it as a standalone server.

This PR adds clusterMode: true as a flag on the existing addr field instead of introducing a separate clusterConfig struct with an addrs slice. When clusterMode is set, the runner calls redis.NewClusterClient({Addrs: []string{addr}}), which auto-discovers the full cluster topology from the single discovery endpoint.

Changes:

  • Storage layer (pkg/authserver/storage): replace ClusterConfig *ClusterConfig with ClusterMode bool on RedisConfig / RedisRunConfig; NewRedisStorage branches on cfg.ClusterMode and wraps cfg.Addr in the slice that ClusterOptions expects
  • Runner layer (pkg/authserver/runner): convertRedisRunConfig maps ClusterMode straight through; validation updated to: addr XOR sentinel, plus clusterMode requires addr
  • Operator CRD (cmd/thv-operator/api/v1beta1): RedisStorageConfig gains clusterMode bool; CEL rules revert to a simple XOR (addr vs sentinelConfig) plus a second rule for clusterMode requiring addr; RedisClusterConfig struct removed
  • Generated artifacts: deepcopy, CRD YAML, and Helm chart templates regenerated

Closes #5010

Type of change

  • New feature

Test plan

  • Unit tests (task test)
  • Linting (task lint-fix)

New unit tests cover: cluster mode + sentinel conflict, cluster mode without addr, cluster mode happy-path at all three layers (storage, runner, operator).

API Compatibility

  • This PR does not break the v1beta1 API, OR the api-break-allowed label is applied and the migration guidance is described above.

clusterMode is a new optional bool field (defaults to false). Adding it is a backward-compatible change. The CEL XOR rule for addr vs sentinelConfig is equivalent to the previous rule for the existing combinations.

Changes

File Change
pkg/authserver/storage/redis.go Replace ClusterConfig struct with ClusterMode bool; update validateConfig and NewRedisStorage
pkg/authserver/storage/config.go Replace ClusterRunConfig / ClusterConfig field with ClusterMode bool on RedisRunConfig
pkg/authserver/runner/embeddedauthserver.go Map ClusterMode in convertRedisRunConfig; simplified validation
cmd/thv-operator/api/v1beta1/mcpexternalauthconfig_types.go Add ClusterMode bool; remove RedisClusterConfig struct; update CEL rules
cmd/thv-operator/pkg/controllerutil/authserver.go Map ClusterMode; remove validateRedisConnectionMode helper
zz_generated.deepcopy.go, CRD YAML × 4 Regenerated artifacts

Does this introduce a user-facing change?

Yes. Kubernetes operators can now enable Redis Cluster protocol by setting spec.authServer.storage.redis.clusterMode: true alongside the existing addr field. Set addr to the single discovery endpoint of the managed Redis Cluster service (e.g., GCP Memorystore Cluster discovery IP, AWS ElastiCache configuration endpoint).

Generated with Claude Code

Managed Redis services like GCP Memorystore Cluster and AWS ElastiCache
Serverless use the Redis Cluster protocol rather than standalone or Sentinel
connections, making it impossible to use them without this third mode.

- Add ClusterConfig/ClusterRunConfig types to storage and runner layers
- Wire cluster client creation via redis.NewClusterClient in NewRedisStorage
- Update validateConfig, convertRedisRunConfig, and buildStorageRunConfig to
  enforce three-way mutual exclusion (addr / sentinel / cluster)
- Add RedisClusterConfig CRD type with MinItems=1 validation; update CEL rule
  to require exactly one of the three modes
- Regenerate deepcopy, CRD YAML, and Helm chart templates
- Add unit tests for all new validation and happy-path code paths

Closes #5010

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@github-actions github-actions Bot added the size/L Large PR: 600-999 lines changed label May 1, 2026
@reyortiz3 reyortiz3 marked this pull request as draft May 1, 2026 15:08
@codecov
Copy link
Copy Markdown

codecov Bot commented May 1, 2026

Codecov Report

❌ Patch coverage is 94.11765% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 67.55%. Comparing base (8c90184) to head (4e20681).

Files with missing lines Patch % Lines
pkg/authserver/storage/redis.go 90.90% 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5153      +/-   ##
==========================================
+ Coverage   67.53%   67.55%   +0.01%     
==========================================
  Files         601      601              
  Lines       61093    61114      +21     
==========================================
+ Hits        41262    41283      +21     
  Misses      16714    16714              
  Partials     3117     3117              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

ClusterConfig{Addrs []string} was a misleading abstraction: GCP Memorystore
Cluster and AWS ElastiCache cluster mode both expose a single discovery endpoint,
so the slice always held exactly one address and was redundant with the existing
Addr field.

Replace ClusterConfig with a ClusterMode bool flag that is set alongside the
existing Addr field. go-redis NewClusterClient still receives the single endpoint
as []string{addr} and auto-discovers the full cluster topology from there.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@github-actions github-actions Bot added size/M Medium PR: 300-599 lines changed and removed size/L Large PR: 600-999 lines changed size/M Medium PR: 300-599 lines changed labels May 1, 2026
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@github-actions github-actions Bot added size/M Medium PR: 300-599 lines changed and removed size/M Medium PR: 300-599 lines changed labels May 1, 2026
@github-actions github-actions Bot added size/M Medium PR: 300-599 lines changed and removed size/M Medium PR: 300-599 lines changed labels May 1, 2026
@reyortiz3 reyortiz3 marked this pull request as ready for review May 1, 2026 19:19
@reyortiz3 reyortiz3 requested a review from amirejaz as a code owner May 1, 2026 19:19
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@github-actions github-actions Bot added size/M Medium PR: 300-599 lines changed and removed size/M Medium PR: 300-599 lines changed labels May 1, 2026
@github-actions github-actions Bot added size/M Medium PR: 300-599 lines changed and removed size/M Medium PR: 300-599 lines changed labels May 2, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size/M Medium PR: 300-599 lines changed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add Redis Cluster mode support to auth server storage

1 participant