
http: add jitter support for max_connection_duration #44064

Open
Retr0-XD wants to merge 2 commits into envoyproxy:main from Retr0-XD:feat/http-max-connection-duration-jitter-42410

Retr0-XD commented Mar 21, 2026

Description

Fixes #42410.

TCP connections already support max_downstream_connection_duration_jitter_percentage (merged in #40686). This PR adds the equivalent feature for HTTP connections via a new max_connection_duration_jitter_percent field in HttpProtocolOptions.

Problem

When many HTTP/2 connections are established simultaneously (e.g. during a pod restart or rolling deploy), they all reach max_connection_duration at the same time, triggering simultaneous drain → reconnect storms ("thundering herd").

Solution

Add max_connection_duration_jitter_percent to HttpProtocolOptions. When configured, the effective max_connection_duration for each connection is individually extended by a random amount uniformly distributed in [0, jitter_percent/100 * base_duration], spreading connection drains over a window rather than synchronizing them.

Example: max_connection_duration = 10s, max_connection_duration_jitter_percent = 25 → each connection drains at a random time in [10s, 12.5s].
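
To make the arithmetic above concrete, here is a minimal C++ sketch of the calculation. It is not the code from this PR: the function name `applyConnectionDurationJitter`, the use of `std::mt19937_64`, and the truncation of the jitter window to whole milliseconds are all assumptions made for illustration.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <random>

// Hypothetical helper (not the PR's actual code): returns the base duration
// extended by a jitter drawn uniformly from [0, jitter_percent/100 * base].
std::chrono::milliseconds applyConnectionDurationJitter(std::chrono::milliseconds base,
                                                        double jitter_percent,
                                                        std::mt19937_64& rng) {
  assert(base.count() >= 0);
  assert(jitter_percent >= 0.0);
  // Upper bound of the jitter window, truncated to whole milliseconds.
  const auto max_jitter_ms =
      static_cast<std::int64_t>(base.count() * (jitter_percent / 100.0));
  if (max_jitter_ms <= 0) {
    return base; // No jitter configured, or the window rounds down to zero.
  }
  // Inclusive on both ends, so the result lies in [base, base + max_jitter_ms].
  std::uniform_int_distribution<std::int64_t> dist(0, max_jitter_ms);
  return base + std::chrono::milliseconds(dist(rng));
}
```

With a base of 10000 ms and 25 percent jitter, every call returns a value between 10000 ms and 12500 ms, matching the example above.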

Changes

| File | Change |
| --- | --- |
| api/envoy/config/core/v3/protocol.proto | Add max_connection_duration_jitter_percent (field 8) to HttpProtocolOptions |
| source/common/http/conn_manager_config.h | Add maxConnectionDurationJitterPercent() pure virtual method |
| source/extensions/filters/network/http_connection_manager/config.{h,cc} | Parse and store the new proto field |
| source/common/http/conn_manager_impl.cc | Apply jitter when arming the connection duration timer |
| source/server/admin/admin.h | Stub: returns absl::nullopt (no jitter for admin connections) |
| test/mocks/http/mocks.h | Add MOCK_METHOD for new interface method |
| test/common/http/conn_manager_impl_test_base.{h,cc} | Add stub forwarding delegates |
| test/common/http/conn_manager_impl_fuzz_test.cc | Add stub returning absl::nullopt |
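
The interface change listed above (a pure virtual accessor, a stub that returns an empty optional, and a config class that stores the parsed value) could look roughly like the following simplified sketch. The class names are hypothetical stand-ins, `std::optional<double>` replaces the `absl::optional` used in Envoy, and the `double` element type is an assumption, since the PR description does not state the accessor's exact return type.

```cpp
#include <cassert>
#include <optional>

// Hypothetical, simplified stand-in for the connection manager config interface.
class ConnectionManagerConfigSketch {
public:
  virtual ~ConnectionManagerConfigSketch() = default;
  // An empty optional means "no jitter configured".
  virtual std::optional<double> maxConnectionDurationJitterPercent() const = 0;
};

// Admin-style stub: never applies jitter.
class AdminConfigSketch : public ConnectionManagerConfigSketch {
public:
  std::optional<double> maxConnectionDurationJitterPercent() const override {
    return std::nullopt;
  }
};

// HCM-style config: stores the value parsed from configuration.
class HcmConfigSketch : public ConnectionManagerConfigSketch {
public:
  explicit HcmConfigSketch(std::optional<double> jitter_percent)
      : jitter_percent_(jitter_percent) {
    // A configured percentage is assumed to be non-negative.
    assert(!jitter_percent_.has_value() || *jitter_percent_ >= 0.0);
  }
  std::optional<double> maxConnectionDurationJitterPercent() const override {
    return jitter_percent_;
  }

private:
  const std::optional<double> jitter_percent_;
};
```

Making the accessor pure virtual means every implementer of the interface (admin, mocks, fuzz test config) must provide it, which is why the change list includes several stubs.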

Prior art

The TCP proxy implementation in source/common/tcp_proxy/tcp_proxy.cc (method Config::calculateMaxDownstreamConnectionDurationWithJitter()) uses the same pattern, and this PR follows it closely.

AI Disclosure: GitHub Copilot was used for coding assistance during implementation and test writing. I fully understand all changes made in this PR.

Commit Message: See PR title
Risk Level: Low
Testing: Unit tests added/verified
Docs Changes: N/A
Release Notes: N/A
Platform Specific Features: N/A

@repokitteh-read-only

Hi @Retr0-XD, welcome and thank you for your contribution.

We will try to review your Pull Request as quickly as possible.

In the meantime, please take a look at the contribution guidelines if you have not done so already.

🐱

Caused by: #44064 was opened by Retr0-XD.


@repokitteh-read-only

CC @envoyproxy/api-shepherds: Your approval is needed for changes made to (api/envoy/|docs/root/api-docs/).
envoyproxy/api-shepherds assignee is @markdroth
CC @envoyproxy/api-watchers: FYI only for changes made to (api/envoy/|docs/root/api-docs/).

🐱

Caused by: #44064 was opened by Retr0-XD.


Retr0-XD had a problem deploying to external-contributors on March 21, 2026 09:33 with GitHub Actions (status: Error)
Retr0-XD force-pushed the feat/http-max-connection-duration-jitter-42410 branch from afbdc5d to 8ad2fb7 on March 22, 2026 06:17
Retr0-XD had a problem deploying to external-contributors on March 22, 2026 06:17 with GitHub Actions (status: Error)
Fixes envoyproxy#42410.

TCP connections already support max_downstream_connection_duration_jitter_percentage
(added in envoyproxy#40686). This adds the equivalent for HTTP connections via a new
max_connection_duration_jitter_percent field in HttpProtocolOptions.

When set, the effective max_connection_duration is extended by a random
amount uniformly distributed in [0, jitter_percent/100 * base_duration].
For example, max_connection_duration=10s with 25% jitter means connections
drain after a random duration in [10s, 12.5s]. This avoids thundering herd
problems when many HTTP/2 connections are established simultaneously.

Changes:
- api/envoy/config/core/v3/protocol.proto: add max_connection_duration_jitter_percent (field 8) to HttpProtocolOptions
- source/common/http/conn_manager_config.h: add maxConnectionDurationJitterPercent() pure virtual method
- source/extensions/filters/network/http_connection_manager/config.{h,cc}: parse and store the new field
- source/common/http/conn_manager_impl.cc: apply jitter when arming the connection duration timer
- source/server/admin/admin.h: stub implementation returns absl::nullopt
- test/{mocks,conn_manager_impl_test_base,conn_manager_impl_fuzz_test}: add stubs

AI Disclosure: Used GitHub Copilot for coding assistance.

Signed-off-by: Retr0-XD <sakthi.harish@edgeverve.com>
Retr0-XD force-pushed the feat/http-max-connection-duration-jitter-42410 branch from 8ad2fb7 to 28b7046 on March 22, 2026 06:20
Retr0-XD had a problem deploying to external-contributors on March 22, 2026 06:20 with GitHub Actions (status: Error)
Adds two test cases:
- ConnectionDurationWithJitter: verifies jitter is applied and timer fires at base + jitter
- ConnectionDurationJitterNoBaseIgnored: verifies jitter is ignored when base duration is not set

Signed-off-by: Retr0-XD <sakthi.harish@edgeverve.com>
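
The behavior those two test cases describe can be illustrated with a deterministic sketch in which the random value is injected rather than generated. This is not the PR's test code: `effectiveConnectionDuration` is a hypothetical helper, `random_value` stands in for a value from a mockable random generator, and mapping it into the window with a modulo is an assumption (it introduces a slight bias that a production implementation may handle differently).

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <optional>

// Hypothetical helper mirroring the behavior the two tests describe.
std::optional<std::chrono::milliseconds>
effectiveConnectionDuration(std::optional<std::chrono::milliseconds> base,
                            std::optional<double> jitter_percent,
                            std::uint64_t random_value) {
  if (!base.has_value()) {
    return std::nullopt; // No base duration: no timer is armed, jitter is ignored.
  }
  if (base->count() <= 0 || !jitter_percent.has_value() || *jitter_percent <= 0.0) {
    return base; // Nothing to extend.
  }
  const auto max_jitter_ms =
      static_cast<std::uint64_t>(base->count() * (*jitter_percent / 100.0));
  if (max_jitter_ms == 0) {
    return base;
  }
  // Map the injected random value into [0, max_jitter_ms].
  const std::uint64_t jitter_ms = random_value % (max_jitter_ms + 1);
  return *base + std::chrono::milliseconds(jitter_ms);
}
```

Passing a base of 10000 ms, 25 percent jitter, and an injected value of 1000 gives 11000 ms (base + jitter), while passing no base returns an empty optional regardless of the jitter setting.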
Retr0-XD force-pushed the feat/http-max-connection-duration-jitter-42410 branch from 28b7046 to 56cfa01 on March 24, 2026 18:28
Retr0-XD requested a deployment to external-contributors on March 24, 2026 18:28 with GitHub Actions (status: Waiting)
Successfully merging this pull request may close these issues.

Add jitter support for HTTP max_connection_duration
