Configure rate limits on VirtualMCPServer PR A #5079
Sanskarzz wants to merge 4 commits into stacklok:main
Large PR Detected
This PR exceeds 1000 lines of changes and requires justification before it can be reviewed.
How to unblock this PR:
Add a section to your PR description with the following format:
## Large PR Justification
[Explain why this PR must be large, such as:]
- Generated code that cannot be split
- Large refactoring that must be atomic
- Multiple related changes that would break if separated
- Migration or data transformation
Alternative:
Consider splitting this PR into smaller, focused changes (< 1000 lines each) for easier review and reduced risk.
See our Contributing Guidelines for more details.
This review will be automatically dismissed once you add the justification section.
jerm-dro left a comment
Hey Sanskar, thanks for putting this together — the end-to-end shape is right: top-level spec.rateLimiting with CEL validation, converter wiring, and the vMCP middleware integration with optimizer-aware tool name resolution. The manual testing scenario in the description is thorough. Requesting changes on two axes before we go further.
1. Drop pkg/ratelimit/config — use the CRD type directly
The proxyrunner path already uses *v1beta1.RateLimitConfig directly throughout pkg/ratelimit on main today. vMCP can do the same — just pass the CRD type through. The new package duplicates the CRD types 1:1, still imports metav1.Duration, and adds conversion boilerplate (ToInternal(), hand-written DeepCopy, unused EffectiveGlobal() methods) without solving a real problem.
2. Split the PR
Even after removing pkg/ratelimit/config, this PR bundles several separable changes. At minimum I'd split into:
PR A: shared→global rename + CRD schema for vMCP (pure schema change, no runtime behavior):
- Add Global field to RateLimitConfig / ToolRateLimitConfig with shared as deprecated alias
- Add spec.rateLimiting to VirtualMCPServerSpec with CEL validation
- Converter wiring
- Generated CRDs, docs
PR B: vMCP rate-limit middleware wiring (depends on A):
- buildRateLimitMiddleware + Redis client lifecycle
- Middleware chain refactoring in server.go (this changes execution order, moving MCP parsing before audit, and deserves focused review)
- Optimizer call_tool tool-name unwrapping
- Unit tests + E2E test
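To illustrate why the middleware reordering in server.go deserves its own review, here is a minimal sketch of how the order of a handler chain determines which middleware sees a request first. The `Middleware` type, `chain`, `record`, and `runOrder` helpers and the middleware names are hypothetical stand-ins, not the actual vMCP code.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Middleware wraps an http.Handler with additional behavior.
type Middleware func(http.Handler) http.Handler

// chain wraps h so that the first middleware listed runs first on each request.
func chain(h http.Handler, mws ...Middleware) http.Handler {
	for i := len(mws) - 1; i >= 0; i-- {
		h = mws[i](h)
	}
	return h
}

// record returns a hypothetical middleware that only appends its name to trace.
func record(name string, trace *[]string) Middleware {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			*trace = append(*trace, name)
			next.ServeHTTP(w, r)
		})
	}
}

// runOrder sends one request through a chain built from names and returns
// the order in which the middlewares actually ran.
func runOrder(names ...string) []string {
	var trace []string
	mws := make([]Middleware, 0, len(names))
	for _, n := range names {
		mws = append(mws, record(n, &trace))
	}
	final := http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	req := httptest.NewRequest(http.MethodPost, "/mcp", nil)
	chain(final, mws...).ServeHTTP(httptest.NewRecorder(), req)
	return trace
}

func main() {
	// Assumed "before" and "after" orderings, for illustration only.
	fmt.Println(runOrder("audit", "mcp-parsing"))
	fmt.Println(runOrder("mcp-parsing", "audit"))
}
```

Any middleware that needs the parsed MCP request (for example, to read a tool name) has to sit after the parsing step, which is why a change in order can alter behavior for everything downstream.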
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@            Coverage Diff            @@
##             main    #5079     +/-   ##
=========================================
+ Coverage   67.53%   67.57%   +0.03%
=========================================
  Files         601      601
  Lines       61093    61093
=========================================
+ Hits        41262    41281     +19
+ Misses      16714    16696     -18
+ Partials     3117     3116      -1
type RateLimitConfig = vmcpconfig.RateLimitConfig

	// Tools defines per-tool rate limit overrides.
	// Each entry applies additional rate limits to calls targeting a specific tool name.
	// A request must pass both the server-level limit and the per-tool limit.
	// +listType=map
	// +listMapKey=name
	// +optional
	Tools []ToolRateLimitConfig `json:"tools,omitempty"`
}

// RateLimitBucket defines a token bucket configuration with a maximum capacity
// and a refill period. Used by both shared (global) and per-user rate limits.
type RateLimitBucket struct {
	// MaxTokens is the maximum number of tokens (bucket capacity).
	// This is also the burst size: the maximum number of requests that can be served
	// instantaneously before the bucket is depleted.
	// +kubebuilder:validation:Required
	// +kubebuilder:validation:Minimum=1
	MaxTokens int32 `json:"maxTokens"`

	// RefillPeriod is the duration to fully refill the bucket from zero to maxTokens.
	// The effective refill rate is maxTokens / refillPeriod tokens per second.
	// Format: Go duration string (e.g., "1m0s", "30s", "1h0m0s").
	// +kubebuilder:validation:Required
	RefillPeriod metav1.Duration `json:"refillPeriod"`
}
// RateLimitBucket defines a token bucket configuration with a maximum capacity and refill period.
// +gendoc
type RateLimitBucket = vmcpconfig.RateLimitBucket

// ToolRateLimitConfig defines rate limits for a specific tool.
// At least one of shared or perUser must be configured.
//
// +kubebuilder:validation:XValidation:rule="has(self.shared) || has(self.perUser)",message="at least one of shared or perUser must be configured"
//
//nolint:lll // kubebuilder marker exceeds line length
type ToolRateLimitConfig struct {
	// Name is the MCP tool name this limit applies to.
	// +kubebuilder:validation:Required
	// +kubebuilder:validation:MinLength=1
	Name string `json:"name"`

	// Shared token bucket for this specific tool.
	// +optional
	Shared *RateLimitBucket `json:"shared,omitempty"`

	// PerUser token bucket configuration for this tool.
	// +optional
	PerUser *RateLimitBucket `json:"perUser,omitempty"`
}
// +gendoc
type ToolRateLimitConfig = vmcpconfig.ToolRateLimitConfig
Can you avoid moving these structs and creating the type aliases?
I have moved the rate limit structs back into v1beta1 as real CRD types and removed the aliases instead of trying to patch the generated docs.
Again, this new CI error looks unrelated.
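The RateLimitBucket comments in the diff above describe MaxTokens as the burst size and maxTokens / refillPeriod as the refill rate. The following is a minimal in-memory sketch of that arithmetic. The `bucket` type and the `newBucket`, `allow`, and `simulate` helpers are hypothetical and are not the Redis-backed implementation in pkg/ratelimit.

```go
package main

import (
	"fmt"
	"math"
	"time"
)

// bucket is a hypothetical in-memory token bucket mirroring the two CRD fields.
type bucket struct {
	maxTokens    float64
	refillPeriod time.Duration
	tokens       float64
	last         time.Time
}

// newBucket starts full, so a burst of maxTokens requests is served immediately.
func newBucket(maxTokens int32, refillPeriod time.Duration, now time.Time) *bucket {
	return &bucket{
		maxTokens:    float64(maxTokens),
		refillPeriod: refillPeriod,
		tokens:       float64(maxTokens),
		last:         now,
	}
}

// refillRate is maxTokens / refillPeriod, in tokens per second.
func (b *bucket) refillRate() float64 {
	return b.maxTokens / b.refillPeriod.Seconds()
}

// allow credits tokens for the elapsed time (capped at maxTokens),
// then tries to spend one token.
func (b *bucket) allow(now time.Time) bool {
	if elapsed := now.Sub(b.last).Seconds(); elapsed > 0 {
		b.tokens = math.Min(b.maxTokens, b.tokens+elapsed*b.refillRate())
		b.last = now
	}
	if b.tokens >= 1 {
		b.tokens--
		return true
	}
	return false
}

// simulate replays requests at the given offsets (whole seconds from start,
// in non-decreasing order) and reports which ones are allowed.
func simulate(maxTokens int32, periodSec int, offsetsSec []int) []bool {
	start := time.Unix(0, 0)
	b := newBucket(maxTokens, time.Duration(periodSec)*time.Second, start)
	out := make([]bool, 0, len(offsetsSec))
	for _, off := range offsetsSec {
		out = append(out, b.allow(start.Add(time.Duration(off)*time.Second)))
	}
	return out
}

func main() {
	// maxTokens=2, refillPeriod=2s: a burst of 2, then 1 token per second.
	fmt.Println(simulate(2, 2, []int{0, 0, 0, 1}))
}
```

With maxTokens=2 and a 2s refill period, the first two requests at t=0 pass, the third is rejected, and one more request is admitted after one second has elapsed.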
✅ PR size has been reduced below the XL threshold. The size review has been dismissed and this PR can now proceed with normal review. Thank you for splitting this up!
	vmcpconfig.Config `yaml:",inline"`

	RateLimiting *mcpv1beta1.RateLimitConfig `yaml:"rateLimiting,omitempty"`
}
Why introduce this new type? Why not put ratelimiting on vmcpconfig.Config?
As implemented, I don't think the RateLimiting config ever makes it into vmcp's server.go. The CRD field doesn't survive the conversion step, since there's nowhere to write the RateLimiting config to.
You’re right. I had added an operator-side ConfigMap wrapper, but that was incomplete because vMCP loads the file into vmcpconfig.Config, so rateLimiting would be ignored before reaching server.go.
I removed the wrapper and reverted this path to marshal vmcpconfig.Config directly. Since this PR is now scoped to the CRD/schema validation side, I’ll leave the actual runtime propagation/enforcement for PR B, where we can wire the config into vMCP and the middleware properly.
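The failure mode discussed in this thread can be reproduced in a small sketch: a wrapper serializes an extra field, but a loader that decodes into the base type silently drops it. This uses encoding/json from the standard library rather than the YAML used in the PR, and `Config`, `RateLimitConfig`, `wrapper`, and `roundTrip` are simplified, hypothetical stand-ins for the real types.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Config is a hypothetical stand-in for vmcpconfig.Config.
type Config struct {
	Name string `json:"name"`
}

// RateLimitConfig is a hypothetical stand-in for the CRD rate-limit type.
type RateLimitConfig struct {
	MaxTokens int32 `json:"maxTokens"`
}

// wrapper mirrors the operator-side idea: the base config inlined
// (via struct embedding) plus an extra rateLimiting field.
type wrapper struct {
	Config
	RateLimiting *RateLimitConfig `json:"rateLimiting,omitempty"`
}

// roundTrip serializes the wrapper, loads the bytes into the base Config
// (as a loader that only knows Config would), and re-serializes the result.
func roundTrip(w wrapper) (written string, reloaded string, err error) {
	data, err := json.Marshal(w)
	if err != nil {
		return "", "", err
	}
	var loaded Config
	if err = json.Unmarshal(data, &loaded); err != nil {
		return "", "", err
	}
	again, err := json.Marshal(loaded)
	if err != nil {
		return "", "", err
	}
	return string(data), string(again), nil
}

func main() {
	w := wrapper{
		Config:       Config{Name: "demo"},
		RateLimiting: &RateLimitConfig{MaxTokens: 10},
	}
	written, reloaded, err := roundTrip(w)
	if err != nil {
		panic(err)
	}
	fmt.Println("written: ", written)
	fmt.Println("reloaded:", reloaded)
}
```

The decoder ignores unknown fields by default, so no error is raised and the rate-limit settings simply never reach the consumer. Placing the field on the type the loader actually decodes into avoids this.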
Signed-off-by: Sanskarzz <sanskar.gur@gmail.com>
PR A — shared→global rename + CRD schema for vMCP
Summary
Add the VirtualMCPServer rate-limiting API surface in line with THV-0057.
This change introduces top-level spec.rateLimiting support on VirtualMCPServer and converts that CRD configuration into the generated vMCP runtime config. It also adds global as the preferred field name for the existing MCP server rate-limit configuration while preserving shared as a deprecated compatibility alias.
The CRD validation rejects invalid combinations early, including per-user limits without OIDC incoming auth and any rate limiting without Redis session storage. Generated CRDs and operator docs were updated accordingly.
Runtime vMCP enforcement is intentionally split out of this PR and will be handled in a follow-up PR.
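The validation rules named in the summary are enforced through CEL on the CRD. As a rough illustration of the same logic in plain Go, here is a sketch under stated assumptions: the `Spec`, `RateLimiting`, `ToolLimit`, and `Bucket` types, the field names, and the "redis" / "oidc" string values are all hypothetical, and treating per-tool perUser limits the same as the top-level one is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// Bucket, ToolLimit, RateLimiting and Spec are hypothetical, simplified
// stand-ins for the CRD types; field names and values are assumptions.
type Bucket struct{ MaxTokens int32 }

type ToolLimit struct {
	Name    string
	Global  *Bucket
	PerUser *Bucket
}

type RateLimiting struct {
	Global  *Bucket
	PerUser *Bucket
	Tools   []ToolLimit
}

type Spec struct {
	RateLimiting     *RateLimiting
	SessionStorage   string // assumed values: "redis" or ""
	IncomingAuthType string // assumed values: "oidc", "anonymous", ...
}

// usesPerUser reports whether any per-user bucket is configured,
// either at the top level or on a tool (the latter is an assumption).
func usesPerUser(rl *RateLimiting) bool {
	if rl.PerUser != nil {
		return true
	}
	for _, t := range rl.Tools {
		if t.PerUser != nil {
			return true
		}
	}
	return false
}

// validate mirrors the two rules from the summary:
// any rate limiting needs Redis session storage, and
// per-user limits additionally need OIDC incoming auth.
func validate(s Spec) error {
	if s.RateLimiting == nil {
		return nil // rate limiting is optional
	}
	if s.SessionStorage != "redis" {
		return errors.New("rateLimiting requires Redis session storage")
	}
	if usesPerUser(s.RateLimiting) && s.IncomingAuthType != "oidc" {
		return errors.New("perUser rate limits require OIDC incoming auth")
	}
	return nil
}

func main() {
	spec := Spec{
		RateLimiting:     &RateLimiting{PerUser: &Bucket{MaxTokens: 10}},
		SessionStorage:   "redis",
		IncomingAuthType: "anonymous",
	}
	fmt.Println(validate(spec))
}
```

Rejecting these combinations at admission time means a misconfigured resource fails when it is applied rather than when the first request arrives.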
Fixes #4552
Type of change
Test plan
- task test
- task test-e2e
- task lint-fix
API Compatibility
This PR adds optional VirtualMCPServer.spec.rateLimiting configuration to the v1beta1 API. This is an additive, backward-compatible API change.
Existing VirtualMCPServer resources do not need migration. Cluster admins can opt in by adding spec.rateLimiting and must configure Redis-backed spec.sessionStorage when rate limiting is enabled. Per-user limits additionally require OIDC incoming auth.
v1beta1 API, OR the api-break-allowed label is applied and the migration guidance is described above.
Changes
- VirtualMCPServer.spec.rateLimiting instead of placing rate-limit config under spec.config
- spec.rateLimiting into the generated vMCP ConfigMap runtime config
- global as the primary field name while preserving shared as a compatibility alias
- pkg/ratelimit using the CRD RateLimitConfig type directly
- global serialization and legacy shared Redis key compatibility
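One way to read "global as the primary field name while preserving shared as a compatibility alias" is sketched below. The types, the `effectiveGlobal` helper, and in particular the precedence rule (global wins when both are set) are assumptions for illustration; the PR's actual resolution logic is not shown in this thread.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Bucket and RateLimitConfig are hypothetical, simplified stand-ins
// for the CRD types.
type Bucket struct {
	MaxTokens int32 `json:"maxTokens"`
}

type RateLimitConfig struct {
	// Global is the preferred field name.
	Global *Bucket `json:"global,omitempty"`
	// Shared is the deprecated alias kept for existing manifests.
	Shared *Bucket `json:"shared,omitempty"`
}

// effectiveGlobal returns Global when set and falls back to Shared.
// The precedence is an assumption made for this sketch.
func effectiveGlobal(c RateLimitConfig) *Bucket {
	if c.Global != nil {
		return c.Global
	}
	return c.Shared
}

// globalMaxTokens decodes a JSON config and returns the resolved
// bucket capacity, or 0 when neither field is present.
func globalMaxTokens(raw string) (int32, error) {
	var c RateLimitConfig
	if err := json.Unmarshal([]byte(raw), &c); err != nil {
		return 0, err
	}
	if b := effectiveGlobal(c); b != nil {
		return b.MaxTokens, nil
	}
	return 0, nil
}

func main() {
	legacy, _ := globalMaxTokens(`{"shared":{"maxTokens":5}}`)
	current, _ := globalMaxTokens(`{"global":{"maxTokens":7}}`)
	fmt.Println(legacy, current)
}
```

Keeping both fields in the schema lets existing manifests that use shared continue to apply unchanged while new ones adopt global.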