Skip to content

Configure rate limits on VirtualMCPServer PR A#5079

Open
Sanskarzz wants to merge 4 commits intostacklok:mainfrom
Sanskarzz:ratelimitingVMCP
Open

Configure rate limits on VirtualMCPServer PR A#5079
Sanskarzz wants to merge 4 commits intostacklok:mainfrom
Sanskarzz:ratelimitingVMCP

Conversation

@Sanskarzz
Copy link
Copy Markdown
Contributor

@Sanskarzz Sanskarzz commented Apr 27, 2026

PR A — shared→global rename + CRD schema for vMCP

Summary

Add the VirtualMCPServer rate-limiting API surface in line with THV-0057.

This change introduces top-level spec.rateLimiting support on VirtualMCPServer and converts that CRD configuration into the generated vMCP runtime config. It also adds global as the preferred field name for the existing MCP server rate-limit configuration while preserving shared as a deprecated compatibility alias.

The CRD validation rejects invalid combinations early, including per-user limits without OIDC incoming auth and any rate limiting without Redis session storage. Generated CRDs and operator docs were updated accordingly.

Runtime vMCP enforcement is intentionally split out of this PR and will be handled in a follow-up PR.

Fixes #4552

Type of change

  • Bug fix
  • New feature
  • Refactoring (no behavior change)
  • Dependency update
  • Documentation
  • Other (describe):

Test plan

  • Unit tests (task test)
  • E2E tests (task test-e2e)
  • Linting (task lint-fix)
  • Manual testing (describe below) I will describe full manual testing in follow-up PR

API Compatibility

This PR adds optional VirtualMCPServer.spec.rateLimiting configuration to the v1beta1 API. This is an additive, backward-compatible API change.

Existing VirtualMCPServer resources do not need migration. Cluster admins can opt in by adding spec.rateLimiting and must configure Redis-backed spec.sessionStorage when rate
limiting is enabled. Per-user limits additionally require OIDC incoming auth.

  • This PR does not break the v1beta1 API, OR the api-break-allowed label is applied and the migration guidance is described above.

Changes

  • Add VirtualMCPServer.spec.rateLimiting instead of placing rate-limit config under spec.config
  • Add CEL validation for Redis session storage and OIDC-required per-user limits
  • Copy spec.rateLimiting into the generated vMCP ConfigMap runtime config
  • Support global as the primary field name while preserving shared as a compatibility alias
  • Keep pkg/ratelimit using the CRD RateLimitConfig type directly
  • Add unit coverage for global serialization and legacy shared Redis key compatibility

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Large PR Detected

This PR exceeds 1000 lines of changes and requires justification before it can be reviewed.

How to unblock this PR:

Add a section to your PR description with the following format:

## Large PR Justification

[Explain why this PR must be large, such as:]
- Generated code that cannot be split
- Large refactoring that must be atomic
- Multiple related changes that would break if separated
- Migration or data transformation

Alternative:

Consider splitting this PR into smaller, focused changes (< 1000 lines each) for easier review and reduced risk.

See our Contributing Guidelines for more details.


This review will be automatically dismissed once you add the justification section.

@github-actions github-actions Bot added the size/XL Extra large PR: 1000+ lines changed label Apr 27, 2026
@github-actions github-actions Bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Apr 27, 2026
Copy link
Copy Markdown
Contributor

@jerm-dro jerm-dro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey Sanskar, thanks for putting this together — the end-to-end shape is right: top-level spec.rateLimiting with CEL validation, converter wiring, and the vMCP middleware integration with optimizer-aware tool name resolution. The manual testing scenario in the description is thorough. Requesting changes on two axes before we go further.

1. Drop pkg/ratelimit/config — use the CRD type directly

The proxyrunner path already uses *v1beta1.RateLimitConfig directly throughout pkg/ratelimit on main today. vMCP can do the same — just pass the CRD type through. The new package duplicates the CRD types 1:1, still imports metav1.Duration, and adds conversion boilerplate (ToInternal(), hand-written DeepCopy, unused EffectiveGlobal() methods) without solving a real problem.

2. Split the PR

Even after removing pkg/ratelimit/config, this PR bundles several separable changes. At minimum I'd split into:

PR A — sharedglobal rename + CRD schema for vMCP (pure schema change, no runtime behavior):

  • Add Global field to RateLimitConfig / ToolRateLimitConfig with shared as deprecated alias
  • Add spec.rateLimiting to VirtualMCPServerSpec with CEL validation
  • Converter wiring
  • Generated CRDs, docs

PR B — vMCP rate-limit middleware wiring (depends on A):

  • buildRateLimitMiddleware + Redis client lifecycle
  • Middleware chain refactoring in server.go (this changes execution order — MCP parsing moves before audit — and deserves focused review)
  • Optimizer call_tool tool-name unwrapping
  • Unit tests + E2E test

@github-actions github-actions Bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Apr 28, 2026
@github-actions github-actions Bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Apr 28, 2026
@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 28, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 67.57%. Comparing base (8c90184) to head (dc566e1).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5079      +/-   ##
==========================================
+ Coverage   67.53%   67.57%   +0.03%     
==========================================
  Files         601      601              
  Lines       61093    61093              
==========================================
+ Hits        41262    41281      +19     
+ Misses      16714    16696      -18     
+ Partials     3117     3116       -1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@github-actions github-actions Bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Apr 28, 2026
@github-actions github-actions Bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Apr 28, 2026
@github-actions github-actions Bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Apr 28, 2026
@github-actions github-actions Bot added the size/XL Extra large PR: 1000+ lines changed label Apr 29, 2026
@github-actions github-actions Bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Apr 29, 2026
@github-actions github-actions Bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Apr 29, 2026
Comment on lines +509 to +517
type RateLimitConfig = vmcpconfig.RateLimitConfig

// Tools defines per-tool rate limit overrides.
// Each entry applies additional rate limits to calls targeting a specific tool name.
// A request must pass both the server-level limit and the per-tool limit.
// +listType=map
// +listMapKey=name
// +optional
Tools []ToolRateLimitConfig `json:"tools,omitempty"`
}

// RateLimitBucket defines a token bucket configuration with a maximum capacity
// and a refill period. Used by both shared (global) and per-user rate limits.
type RateLimitBucket struct {
// MaxTokens is the maximum number of tokens (bucket capacity).
// This is also the burst size: the maximum number of requests that can be served
// instantaneously before the bucket is depleted.
// +kubebuilder:validation:Required
// +kubebuilder:validation:Minimum=1
MaxTokens int32 `json:"maxTokens"`

// RefillPeriod is the duration to fully refill the bucket from zero to maxTokens.
// The effective refill rate is maxTokens / refillPeriod tokens per second.
// Format: Go duration string (e.g., "1m0s", "30s", "1h0m0s").
// +kubebuilder:validation:Required
RefillPeriod metav1.Duration `json:"refillPeriod"`
}
// RateLimitBucket defines a token bucket configuration with a maximum capacity and refill period.
// +gendoc
type RateLimitBucket = vmcpconfig.RateLimitBucket

// ToolRateLimitConfig defines rate limits for a specific tool.
// At least one of shared or perUser must be configured.
//
// +kubebuilder:validation:XValidation:rule="has(self.shared) || has(self.perUser)",message="at least one of shared or perUser must be configured"
//
//nolint:lll // kubebuilder marker exceeds line length
type ToolRateLimitConfig struct {
// Name is the MCP tool name this limit applies to.
// +kubebuilder:validation:Required
// +kubebuilder:validation:MinLength=1
Name string `json:"name"`

// Shared token bucket for this specific tool.
// +optional
Shared *RateLimitBucket `json:"shared,omitempty"`

// PerUser token bucket configuration for this tool.
// +optional
PerUser *RateLimitBucket `json:"perUser,omitempty"`
}
// +gendoc
type ToolRateLimitConfig = vmcpconfig.ToolRateLimitConfig
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you avoid moving these structs and creating the type aliases?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have moved the rate limit structs back into v1beta1 as real CRD types and removed the aliases instead of trying to patch the generated docs.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jerm-dro Getting error in CI but it looks unrelated to the scope of this PR. Could you please re-run the workflow?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

Copy link
Copy Markdown
Contributor Author

@Sanskarzz Sanskarzz Apr 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, this new CI error looks unrelated.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now it's fixed

@github-actions github-actions Bot added size/L Large PR: 600-999 lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Apr 29, 2026
@github-actions github-actions Bot dismissed their stale review April 29, 2026 22:08

PR size has been reduced below the XL threshold. Thank you for splitting this up!

@github-actions
Copy link
Copy Markdown
Contributor

✅ PR size has been reduced below the XL threshold. The size review has been dismissed and this PR can now proceed with normal review. Thank you for splitting this up!

@github-actions github-actions Bot added size/L Large PR: 600-999 lines changed and removed size/L Large PR: 600-999 lines changed labels Apr 29, 2026
@github-actions github-actions Bot added size/L Large PR: 600-999 lines changed and removed size/L Large PR: 600-999 lines changed labels Apr 30, 2026
@github-actions github-actions Bot added size/L Large PR: 600-999 lines changed and removed size/L Large PR: 600-999 lines changed labels Apr 30, 2026
vmcpconfig.Config `yaml:",inline"`

RateLimiting *mcpv1beta1.RateLimitConfig `yaml:"rateLimiting,omitempty"`
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why introduce this new type? Why not put ratelimiting on vmcpconfig.Config?

As implemented, I don't think the RateLimiting config ever makes it into vmcp's server.go. The CRD field doesn't survive the conversion step, since there's nowhere to write the RateLimiting config to.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You’re right. I had added an operator-side ConfigMap wrapper, but that was incomplete because vMCP loads the file into vmcpconfig.Config, so rateLimiting would be ignored before reaching server.go.
I removed the wrapper and reverted this path to marshal vmcpconfig.Config directly. Since this PR is now scoped to the CRD/schema validation side, I’ll leave the actual runtime propagation/enforcement for PR B, where we can wire the config into vMCP and the middleware properly.

Sanskarzz added 4 commits May 2, 2026 21:28
Signed-off-by: Sanskarzz <sanskar.gur@gmail.com>
Signed-off-by: Sanskarzz <sanskar.gur@gmail.com>
Signed-off-by: Sanskarzz <sanskar.gur@gmail.com>
Signed-off-by: Sanskarzz <sanskar.gur@gmail.com>
@Sanskarzz Sanskarzz force-pushed the ratelimitingVMCP branch from 41b9789 to dc566e1 Compare May 2, 2026 15:59
@github-actions github-actions Bot added size/L Large PR: 600-999 lines changed and removed size/L Large PR: 600-999 lines changed labels May 2, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size/L Large PR: 600-999 lines changed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Configure rate limits on VirtualMCPServer

2 participants