feat: add MCP response compression with metrics instrumentation#3

Open
henrikrexed wants to merge 10 commits into main from feat/mcp-response-compression-upstream

Conversation

@henrikrexed
Owner

Summary

  • Wire responseCompression from CRD through xDS proto to proxy
  • Add JSON-to-markdown/TSV/CSV response compression for MCP targets
  • Add metrics instrumentation for compression (bytes saved, ratio, format)
  • Document compression configuration and metrics
  • Fix test initializers for span_writer and metrics arguments

Mirrors upstream PR agentgateway#1537 (agentgateway/agentgateway). This fork PR validates CI before upstream merge.

@henrikrexed henrikrexed force-pushed the feat/mcp-response-compression-upstream branch 2 times, most recently from 394676d to e10bde4 Compare April 16, 2026 08:23
henrikrexed and others added 10 commits April 17, 2026 08:26
Add response compression for MCP tool call results, reducing token
usage when LLMs consume structured JSON responses. Supports three
output formats: markdown tables, TSV, and CSV.

The compression is configurable per-backend via the `responseCompression`
field in both static config and the AgentgatewayBackend CRD. When
enabled, JSON array/object responses from MCP tool calls are
automatically converted to the specified tabular format.

Key changes:
- New `compress` module with format conversion logic and tests
- Handler integration to compress CallToolResult content
- Proto/xDS extension for responseCompression configuration
- Test helpers and snapshot updates for the new field

Co-Authored-By: Paperclip <noreply@paperclip.ing>
Signed-off-by: Henrik Rexed <henrik.rexed@gmail.com>
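
To illustrate the conversion this commit describes, here is a minimal sketch of rendering already-parsed rows as a markdown table or as TSV. It is not the PR's `compress` module: the function names `to_markdown_table` and `to_tsv` are hypothetical, and the input is assumed to be a flat list of rows that share the same columns (JSON parsing and nested values are left out).

```rust
/// Render uniform rows as a markdown table.
/// `headers` are the column names; each row holds one cell per column.
/// Hypothetical helper for illustration, not the PR's actual API.
fn to_markdown_table(headers: &[&str], rows: &[Vec<String>]) -> String {
    let mut out = String::new();
    // Header line, then the separator line that markdown tables require.
    out.push_str(&format!("| {} |\n", headers.join(" | ")));
    let sep: Vec<&str> = headers.iter().map(|_| "---").collect();
    out.push_str(&format!("| {} |\n", sep.join(" | ")));
    for row in rows {
        out.push_str(&format!("| {} |\n", row.join(" | ")));
    }
    out
}

/// Render the same rows as tab-separated values, one row per line.
fn to_tsv(headers: &[&str], rows: &[Vec<String>]) -> String {
    let mut out = headers.join("\t");
    out.push('\n');
    for row in rows {
        out.push_str(&row.join("\t"));
        out.push('\n');
    }
    out
}
```

The saving comes from stating each key once in the header row instead of repeating it in every JSON object.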
Wire the responseCompression configuration from the Kubernetes CRD
through the xDS protocol to the proxy. Adds the ResponseCompression
type to the backend spec with enabled/format fields, updates the
deepcopy generator output, Helm CRD templates, and the syncer
translation layer.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
Signed-off-by: Henrik Rexed <henrik.rexed@gmail.com>
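
As a rough sketch of how the enabled/format pair could resolve to an effective setting on the proxy side, the snippet below maps the two fields to an optional format. The enum name `Format`, the function `effective_format`, and the string values `"markdown"`, `"tsv"`, and `"csv"` are assumptions for illustration. Treating an unknown format as disabled follows the xDS mapping fix listed in a later commit of this PR.

```rust
/// Output formats named in the PR description.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Format {
    Markdown,
    Tsv,
    Csv,
}

/// Resolve the enabled/format pair to an effective format.
/// Returns None when compression is off or the format is not recognized.
/// The accepted string spellings here are assumed, not taken from the CRD.
fn effective_format(enabled: bool, format: &str) -> Option<Format> {
    if !enabled {
        return None;
    }
    match format.to_ascii_lowercase().as_str() {
        "markdown" => Some(Format::Markdown),
        "tsv" => Some(Format::Tsv),
        "csv" => Some(Format::Csv),
        // Unknown format with enabled=true is treated as disabled.
        _ => None,
    }
}
```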
Create proper child spans for correlated telemetry events instead of
logging them as independent entries. This improves trace correlation
for LLM calls, MCP operations, and guardrail checks.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
Signed-off-by: Henrik Rexed <henrik.rexed@gmail.com>
Add Prometheus metrics to track compression operations: request
counts by format, compression ratios, original/compressed sizes,
and processing duration. Enables monitoring of compression
effectiveness across backends.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
Signed-off-by: Henrik Rexed <henrik.rexed@gmail.com>
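
The size-based figures mentioned here can be derived from two byte counts. The sketch below is a plain-Rust illustration that does not use a metrics library. The struct `CompressionStats` and its methods are hypothetical, and the ratio is assumed to mean compressed size divided by original size; the PR may define it differently.

```rust
/// Byte counts for one compression operation (hypothetical type).
struct CompressionStats {
    original_bytes: usize,
    compressed_bytes: usize,
}

impl CompressionStats {
    /// Bytes saved; zero when the output is not smaller than the input.
    fn bytes_saved(&self) -> usize {
        self.original_bytes.saturating_sub(self.compressed_bytes)
    }

    /// Assumed definition: compressed / original, so lower is better.
    /// An empty original is reported as 1.0 (no change).
    fn ratio(&self) -> f64 {
        if self.original_bytes == 0 {
            1.0
        } else {
            self.compressed_bytes as f64 / self.original_bytes as f64
        }
    }
}
```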
Document the response compression feature including configuration
options, supported formats, metrics, and architecture overview.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
Signed-off-by: Henrik Rexed <henrik.rexed@gmail.com>
- Delete orphan compress_tests.rs (tests already inline in compress.rs)
- Add cell escaping: pipe chars in markdown, tabs/newlines in TSV
- Skip compression when result is larger than original
- Simplify metrics labels to target+format (remove unused gateway/listener/route)
- Fix xDS format mapping: unknown format with enabled=true treated as disabled
- Wire responseCompression for selector-based targets in translate.go
- Remove unwired Protocol field from McpTargetSelector CRD
- Fix docs: note lossy summarization for nested objects/large arrays

Co-Authored-By: Paperclip <noreply@paperclip.ing>
Signed-off-by: Henrik Rexed <henrik.rexed@gmail.com>
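
The escaping and size-guard items above might look roughly like the sketch below. The function names are hypothetical, and two details are assumptions: pipes are escaped as `\|`, and tabs/newlines in TSV cells are replaced with a single space. The PR's actual replacement characters are not shown in this conversation.

```rust
/// Escape a markdown table cell: a bare `|` would end the cell early.
/// Assumes backslash escaping of the pipe.
fn escape_markdown_cell(cell: &str) -> String {
    cell.replace('|', "\\|")
}

/// Escape a TSV cell: tabs and newlines would break the column/row layout.
/// A single replace with a char-array pattern covers both characters.
/// Assumes a space as the replacement.
fn escape_tsv_cell(cell: &str) -> String {
    cell.replace(['\t', '\n'], " ")
}

/// Keep the compressed form only when it is strictly smaller than the
/// original; otherwise fall back to the original payload.
fn pick_smaller<'a>(original: &'a str, compressed: &'a str) -> &'a str {
    if compressed.len() < original.len() {
        compressed
    } else {
        original
    }
}
```

The size guard matters for small payloads, where a table's header and separator rows can outweigh the JSON they replace.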
PolicyClient and Relay::new() gained a span_writer field and a 4th
metrics argument after the rebase; six test call sites were not updated.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
Signed-off-by: Henrik Rexed <henrik.rexed@gmail.com>
…initializers

OIDC and LLM tests were missing the span_writer field on PolicyClient,
and MCP tests were missing response_compression on SseTargetSpec and
OpenAPITarget, all added during the compression feature rebase.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
Signed-off-by: Henrik Rexed <henrik.rexed@gmail.com>
Co-Authored-By: Paperclip <noreply@paperclip.ing>
Signed-off-by: Henrik Rexed <henrik.rexed@gmail.com>
- Update new stateless_multiplex_delete_session_skips_uninitialized_targets
  test to include span_writer + metrics args after rebase onto main
- Collapse consecutive str::replace in escape_tsv (clippy
  collapsible_str_replace under -D warnings)

Co-Authored-By: Paperclip <noreply@paperclip.ing>
Signed-off-by: Henrik Rexed <henrik.rexed@gmail.com>
@henrikrexed henrikrexed force-pushed the feat/mcp-response-compression-upstream branch from e10bde4 to 7ce851c Compare April 17, 2026 06:35
