http: add jitter support for max_connection_duration #44064
Open
Retr0-XD wants to merge 2 commits into envoyproxy:main from
Hi @Retr0-XD, welcome and thank you for your contribution. We will try to review your Pull Request as quickly as possible. In the meantime, please take a look at the contribution guidelines if you have not done so already.
CC @envoyproxy/api-shepherds: Your approval is needed for changes made to
afbdc5d to 8ad2fb7
Fixes envoyproxy#42410.

TCP connections already support max_downstream_connection_duration_jitter_percentage (added in envoyproxy#40686). This adds the equivalent for HTTP connections via a new max_connection_duration_jitter_percent field in HttpProtocolOptions.

When set, the effective max_connection_duration is extended by a random amount uniformly distributed in [0, jitter_percent/100 * base_duration]. For example, max_connection_duration=10s with 25% jitter means connections drain after a random duration in [10s, 12.5s]. This avoids thundering-herd problems when many HTTP/2 connections are established simultaneously.

Changes:
- api/envoy/config/core/v3/protocol.proto: add max_connection_duration_jitter_percent (field 8) to HttpProtocolOptions
- source/common/http/conn_manager_config.h: add maxConnectionDurationJitterPercent() pure virtual method
- source/extensions/filters/network/http_connection_manager/config.{h,cc}: parse and store the new field
- source/common/http/conn_manager_impl.cc: apply jitter when arming the connection duration timer
- source/server/admin/admin.h: stub implementation returns absl::nullopt
- test/{mocks,conn_manager_impl_test_base,conn_manager_impl_fuzz_test}: add stubs

AI Disclosure: Used GitHub Copilot for coding assistance.

Signed-off-by: Retr0-XD <sakthi.harish@edgeverve.com>
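The jitter rule in this commit message can be sketched in standalone C++. This is only an illustration of the arithmetic under stated assumptions, not the code in conn_manager_impl.cc: the function name `jitteredDuration` and the use of `std::mt19937_64` as the random source are hypothetical.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <random>

// Hypothetical helper (not Envoy's API): returns base plus a random extension
// drawn uniformly from [0, jitter_percent/100 * base], in whole milliseconds.
std::chrono::milliseconds jitteredDuration(std::chrono::milliseconds base,
                                           double jitter_percent,
                                           std::mt19937_64& rng) {
  // Assumption for this sketch: both inputs are non-negative.
  assert(base.count() >= 0 && jitter_percent >= 0.0);

  // Upper bound of the extension, truncated to whole milliseconds.
  const auto max_extra_ms = static_cast<std::int64_t>(
      static_cast<double>(base.count()) * jitter_percent / 100.0);

  // uniform_int_distribution is inclusive on both ends, matching [0, max].
  std::uniform_int_distribution<std::int64_t> dist(0, max_extra_ms);
  return base + std::chrono::milliseconds(dist(rng));
}
```

With a base of 10000 ms and 25% jitter, every returned value lies in [10000 ms, 12500 ms]; with 0% jitter the base is returned unchanged.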
8ad2fb7 to 28b7046
Adds two test cases:
- ConnectionDurationWithJitter: verifies jitter is applied and the timer fires at base + jitter
- ConnectionDurationJitterNoBaseIgnored: verifies jitter is ignored when the base duration is not set

Signed-off-by: Retr0-XD <sakthi.harish@edgeverve.com>
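The two behaviours these tests check can be illustrated with a deterministic sketch in which the random value is passed in, the way a mocked random generator would supply it. Everything here is a hypothetical stand-in: `effectiveMaxConnectionDuration`, its parameters, and the modulo mapping are assumptions for illustration, not the PR's test or implementation code.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <optional>

using Ms = std::chrono::milliseconds;

// Hypothetical stand-in for the timer-duration calculation. random_value plays
// the role of the output of a mocked random generator.
std::optional<Ms> effectiveMaxConnectionDuration(std::optional<Ms> base,
                                                 std::optional<double> jitter_percent,
                                                 std::uint64_t random_value) {
  // Assumption for this sketch: a configured base duration is non-negative.
  assert(!base.has_value() || base->count() >= 0);

  // No base duration: no timer is armed, so any jitter setting is ignored.
  if (!base.has_value()) {
    return std::nullopt;
  }
  // No (or zero) jitter: the base duration is used unchanged.
  if (!jitter_percent.has_value() || *jitter_percent <= 0.0) {
    return base;
  }
  const auto max_jitter_ms = static_cast<std::uint64_t>(
      static_cast<double>(base->count()) * (*jitter_percent / 100.0));
  if (max_jitter_ms == 0) {
    return base;
  }
  // Map the random value into [0, max_jitter_ms]. Modulo is slightly biased
  // for huge ranges, which is acceptable for an illustration.
  const auto extra = static_cast<std::int64_t>(random_value % (max_jitter_ms + 1));
  return *base + Ms(extra);
}
```

Fixing `random_value` makes the expected timer value exact: a base of 10000 ms, 25% jitter and a random value of 1234 gives 11234 ms, while an unset base yields no duration regardless of the jitter setting.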
28b7046 to 56cfa01
Description
Fixes #42410.
TCP connections already support max_downstream_connection_duration_jitter_percentage (merged in #40686). This PR adds the equivalent feature for HTTP connections via a new max_connection_duration_jitter_percent field in HttpProtocolOptions.

Problem
When many HTTP/2 connections are established simultaneously (e.g. during a pod restart or rolling deploy), they all reach max_connection_duration at the same time, triggering simultaneous drain → reconnect storms ("thundering herd").

Solution
Add max_connection_duration_jitter_percent to HttpProtocolOptions. When configured, the effective max_connection_duration for each connection is individually extended by a random amount uniformly distributed in [0, jitter_percent/100 * base_duration], spreading connection drains over a window rather than synchronizing them.
Example: max_connection_duration = 10s, max_connection_duration_jitter_percent = 25 → each connection drains at a random time in [10s, 12.5s].

Changes
- api/envoy/config/core/v3/protocol.proto: add max_connection_duration_jitter_percent (field 8) to HttpProtocolOptions
- source/common/http/conn_manager_config.h: add maxConnectionDurationJitterPercent() pure virtual method
- source/extensions/filters/network/http_connection_manager/config.{h,cc}: parse and store the new field
- source/common/http/conn_manager_impl.cc: apply jitter when arming the connection duration timer
- source/server/admin/admin.h: stub implementation returns absl::nullopt (no jitter for admin connections)
- test/mocks/http/mocks.h: MOCK_METHOD for new interface method
- test/common/http/conn_manager_impl_test_base.{h,cc}: add stubs
- test/common/http/conn_manager_impl_fuzz_test.cc: stub returns absl::nullopt

Prior art
The TCP proxy implementation in source/common/tcp_proxy/tcp_proxy.cc (method Config::calculateMaxDownstreamConnectionDurationWithJitter()) uses the same pattern, and this PR follows it.
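To make the thundering-herd argument concrete, the sketch below simulates connections that all start at the same instant and counts how many drain in the busiest time bucket. It is a toy model under stated assumptions, not Envoy code: `peakDrainsPerBucket`, the bucket width, and the `std::mt19937_64` random source are all hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Toy model (hypothetical, not Envoy code): all connections start at t = 0 and
// drain at base_ms plus a uniform jitter in [0, jitter_percent/100 * base_ms].
// Returns the largest number of drains that fall into one bucket_ms-wide bucket.
std::size_t peakDrainsPerBucket(std::size_t connections, std::int64_t base_ms,
                                std::int64_t jitter_percent, std::int64_t bucket_ms,
                                std::mt19937_64& rng) {
  assert(bucket_ms > 0 && base_ms >= 0 && jitter_percent >= 0);

  const std::int64_t max_jitter_ms = base_ms * jitter_percent / 100;
  std::uniform_int_distribution<std::int64_t> jitter(0, max_jitter_ms);

  // One bucket per bucket_ms of the jitter window, plus one for the upper edge.
  std::vector<std::size_t> buckets(
      static_cast<std::size_t>(max_jitter_ms / bucket_ms) + 1, 0);

  for (std::size_t i = 0; i < connections; ++i) {
    const std::int64_t drain_at_ms = base_ms + jitter(rng);
    ++buckets[static_cast<std::size_t>((drain_at_ms - base_ms) / bucket_ms)];
  }
  return *std::max_element(buckets.begin(), buckets.end());
}
```

With jitter set to 0, all connections land in a single bucket, so the peak equals the connection count. With 25% jitter on a 10s base and 100 ms buckets, the same connections are spread over a 2.5s window and the peak per bucket drops to a small fraction of the total.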
AI disclosure: GitHub Copilot was used during implementation and test writing. I fully understand all changes made in this PR.
Commit Message: See PR title
Risk Level: Low
Testing: Unit tests added/verified
Docs Changes: N/A
Release Notes: N/A
Platform Specific Features: N/A