feat: add MCP response compression with metrics instrumentation #3
Open
henrikrexed wants to merge 10 commits into main
394676d to e10bde4
Add response compression for MCP tool call results, reducing token usage when LLMs consume structured JSON responses. Supports three output formats: markdown tables, TSV, and CSV.

The compression is configurable per-backend via the `responseCompression` field in both static config and the AgentgatewayBackend CRD. When enabled, JSON array/object responses from MCP tool calls are automatically converted to the specified tabular format.

Key changes:
- New `compress` module with format conversion logic and tests
- Handler integration to compress CallToolResult content
- Proto/xDS extension for responseCompression configuration
- Test helpers and snapshot updates for the new field

Co-Authored-By: Paperclip <noreply@paperclip.ing>
Signed-off-by: Henrik Rexed <henrik.rexed@gmail.com>
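To make the token saving concrete: a JSON array repeats every key in every element, whereas a table states the column names once. The sketch below is not the PR's `compress` module. It is a minimal std-only illustration that assumes the rows have already been extracted from the JSON payload and that cells need no escaping. `Format` and `render_table` are hypothetical names.

```rust
/// The three output formats named in the commit message.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Format {
    Markdown,
    Tsv,
    Csv,
}

/// Render a header row plus data rows in the chosen tabular format.
/// Hypothetical helper: cells are assumed to be plain text that needs no escaping.
pub fn render_table(headers: &[&str], rows: &[Vec<String>], format: Format) -> String {
    let mut out = String::new();
    match format {
        Format::Markdown => {
            // Header line, separator line, then one line per row.
            out.push_str(&format!("| {} |\n", headers.join(" | ")));
            let rule: Vec<&str> = headers.iter().map(|_| "---").collect();
            out.push_str(&format!("| {} |\n", rule.join(" | ")));
            for row in rows {
                out.push_str(&format!("| {} |\n", row.join(" | ")));
            }
        }
        Format::Tsv | Format::Csv => {
            // TSV and CSV differ only in the separator in this simplified sketch.
            let sep = if format == Format::Tsv { "\t" } else { "," };
            out.push_str(&headers.join(sep));
            out.push('\n');
            for row in rows {
                out.push_str(&row.join(sep));
                out.push('\n');
            }
        }
    }
    out
}
```

For two rows with columns `id` and `name`, the CSV form is three short lines, while the equivalent JSON array spells out `"id"` and `"name"` once per element.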
Wire the responseCompression configuration from the Kubernetes CRD through the xDS protocol to the proxy. Adds the ResponseCompression type to the backend spec with enabled/format fields, updates the deepcopy generator output, Helm CRD templates, and the syncer translation layer.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
Signed-off-by: Henrik Rexed <henrik.rexed@gmail.com>
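A rough sketch of how the proxy side might interpret the enabled/format pair once it arrives over xDS. The struct below is a hypothetical mirror of the CRD type, not the generated proto, and the literal format strings `"markdown"`, `"tsv"`, and `"csv"` are assumptions. The one behaviour taken from this PR is that an unknown format with `enabled=true` is treated as disabled (see the review-fix commit further down).

```rust
/// Formats the proxy can emit (names are illustrative).
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum CompressionFormat {
    Markdown,
    Tsv,
    Csv,
}

/// Hypothetical mirror of the `responseCompression` block with enabled/format fields.
pub struct ResponseCompression {
    pub enabled: bool,
    pub format: String,
}

/// Resolve the config to a concrete format, or `None` when compression is off.
/// An unrecognized format string disables compression instead of falling back
/// to a default format.
pub fn resolve_format(cfg: &ResponseCompression) -> Option<CompressionFormat> {
    if !cfg.enabled {
        return None;
    }
    match cfg.format.as_str() {
        // Assumed spellings; the real CRD enum values may differ.
        "markdown" => Some(CompressionFormat::Markdown),
        "tsv" => Some(CompressionFormat::Tsv),
        "csv" => Some(CompressionFormat::Csv),
        _ => None,
    }
}
```

Returning `Option` keeps the "disabled" and "unknown format" cases on the same code path, so the handler only needs a single check before compressing.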
Create proper child spans for correlated telemetry events instead of logging them as independent entries. This improves trace correlation for LLM calls, MCP operations, and guardrail checks.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
Signed-off-by: Henrik Rexed <henrik.rexed@gmail.com>
Add Prometheus metrics to track compression operations: request counts by format, compression ratios, original/compressed sizes, and processing duration. Enables monitoring of compression effectiveness across backends.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
Signed-off-by: Henrik Rexed <henrik.rexed@gmail.com>
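The quantities listed above can be sketched without any metrics library. The following std-only aggregate is an illustration, not the PR's Prometheus instrumentation: the type and method names are hypothetical, and the ratio is assumed to mean compressed bytes divided by original bytes (lower is better).

```rust
use std::collections::HashMap;
use std::time::Duration;

/// Running totals for one output format (hypothetical type).
#[derive(Default, Debug)]
pub struct FormatStats {
    pub requests: u64,
    pub original_bytes: u64,
    pub compressed_bytes: u64,
    pub total_duration: Duration,
}

impl FormatStats {
    /// Assumed definition: compressed size / original size.
    /// Returns `None` until at least one original byte has been recorded.
    pub fn ratio(&self) -> Option<f64> {
        if self.original_bytes == 0 {
            None
        } else {
            Some(self.compressed_bytes as f64 / self.original_bytes as f64)
        }
    }
}

/// Per-format aggregates keyed by the format label.
#[derive(Default)]
pub struct CompressionMetrics {
    by_format: HashMap<String, FormatStats>,
}

impl CompressionMetrics {
    /// Record one compression operation.
    pub fn record(&mut self, format: &str, original: usize, compressed: usize, took: Duration) {
        let stats = self.by_format.entry(format.to_string()).or_default();
        stats.requests += 1;
        stats.original_bytes += original as u64;
        stats.compressed_bytes += compressed as u64;
        stats.total_duration += took;
    }

    /// Look up the totals for a format, if any operation was recorded for it.
    pub fn get(&self, format: &str) -> Option<&FormatStats> {
        self.by_format.get(format)
    }
}
```

In a real exporter these fields would typically map to a counter (requests), histograms or counters (sizes, duration), and a derived ratio.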
Document the response compression feature including configuration options, supported formats, metrics, and architecture overview.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
Signed-off-by: Henrik Rexed <henrik.rexed@gmail.com>
- Delete orphan compress_tests.rs (tests already inline in compress.rs)
- Add cell escaping: pipe chars in markdown, tabs/newlines in TSV
- Skip compression when result is larger than original
- Simplify metrics labels to target+format (remove unused gateway/listener/route)
- Fix xDS format mapping: unknown format with enabled=true treated as disabled
- Wire responseCompression for selector-based targets in translate.go
- Remove unwired Protocol field from McpTargetSelector CRD
- Fix docs: note lossy summarization for nested objects/large arrays

Co-Authored-By: Paperclip <noreply@paperclip.ing>
Signed-off-by: Henrik Rexed <henrik.rexed@gmail.com>
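Two of the fixes above, markdown cell escaping and the size guard, are small enough to sketch. This is an illustration under assumptions, not the code in `compress.rs`: the function names are hypothetical, pipes are assumed to be escaped as `\|`, and a result of equal length is assumed to count as "not smaller" and is therefore skipped.

```rust
/// Escape a cell for a markdown table so a literal pipe does not end the cell early.
/// Assumption: the escape sequence is a backslash followed by the pipe.
pub fn escape_markdown_cell(cell: &str) -> String {
    cell.replace('|', "\\|")
}

/// Keep the compressed form only when it is strictly smaller than the original.
/// For tiny payloads the table header and separators can outweigh the saving.
pub fn choose_smaller<'a>(original: &'a str, compressed: &'a str) -> &'a str {
    if compressed.len() < original.len() {
        compressed
    } else {
        original
    }
}
```

For example, the single-element array `[1]` is shorter than any markdown table that could represent it, so `choose_smaller` returns the original JSON unchanged.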
PolicyClient and Relay::new() gained a span_writer field and a 4th metrics argument after the rebase; six test call sites were not updated.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
Signed-off-by: Henrik Rexed <henrik.rexed@gmail.com>
…initializers

OIDC and LLM tests were missing the span_writer field on PolicyClient, and MCP tests were missing response_compression on SseTargetSpec and OpenAPITarget, all added during the compression feature rebase.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
Signed-off-by: Henrik Rexed <henrik.rexed@gmail.com>
Co-Authored-By: Paperclip <noreply@paperclip.ing>
Signed-off-by: Henrik Rexed <henrik.rexed@gmail.com>
- Update new stateless_multiplex_delete_session_skips_uninitialized_targets test to include span_writer + metrics args after rebase onto main
- Collapse consecutive str::replace in escape_tsv (clippy collapsible_str_replace under -D warnings)

Co-Authored-By: Paperclip <noreply@paperclip.ing>
Signed-off-by: Henrik Rexed <henrik.rexed@gmail.com>
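For readers unfamiliar with the lint: `collapsible_str_replace` fires when chained `str::replace` calls substitute the same replacement text, because a single call with a char-array pattern does the same work in one pass. The sketch below shows the before/after shape. The function bodies are assumptions (in particular, that tabs and newlines are both replaced with a single space), not the actual `escape_tsv` from this PR.

```rust
/// Before: two chained calls with the same replacement, which the lint flags.
#[allow(clippy::collapsible_str_replace)]
pub fn escape_tsv_chained(cell: &str) -> String {
    cell.replace('\t', " ").replace('\n', " ")
}

/// After: one call using a char array as the pattern.
pub fn escape_tsv(cell: &str) -> String {
    cell.replace(['\t', '\n'], " ")
}
```

Both versions produce identical output. The collapsed form scans the string once and allocates a single result.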
e10bde4 to 7ce851c
Summary
`responseCompression` from CRD through xDS proto to proxy

Mirrors upstream PR agentgateway#1537 (agentgateway/agentgateway). This fork PR validates CI before upstream merge.