A115: disable priority LB child policy retention cache #541
Open: apolcyn wants to merge 9 commits into grpc:master from apolcyn:remove_cache (+86 −0)
Commits (all by apolcyn):
- 5798457 A56 update: disable priority LB child policy retention cache
- f4ff35a specify env var
- 5b2a3fc rename grfc
- 0063acd add updated header
- 27826d1 updates
- ed5414a add updates
- a063941 review comments
- 1850df2 address comments
- 926c692 address comments
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,84 @@ | ||
| A115: disable Priority LB policy child policy retention cache | ||
| ---- | ||
| * Author(s): @apolcyn, @markdroth | ||
| * Approvers: @markdroth, @ejona86, @dfawley, @easwars | ||
| * Status: {Draft} | ||
| * Implemented in: C-core, Java, Go, Node | ||
| * Last updated: 2026-03-17 | ||
| * Discussion at: <google group thread> (filled after thread exists) | ||
|
|
||
| ## Abstract | ||
|
|
||
| [A56](A56-priority-lb-policy.md) describes | ||
| [mechanisms](A56-priority-lb-policy.md#child-lifetime-management) | ||
| whereby priority LB child policies are cached. There are two cases: | ||
|
|
||
| 1) When a higher-priority child becomes reachable, we deactivate | ||
| the lower-priority children, and remove them only after an expiry. | ||
|
|
||
| 2) When a child is removed from the LB policy config. | ||
|
|
||
| This proposal removes the usage of a cache for case 2. | ||
|
|
||
| ## Background | ||
|
|
||
| The priority LB child policy retention cache can consume excessive memory under | ||
| certain circumstances (depending on the rate and pattern of locality updates). | ||
|
|
||
| This is especially the case when a locality is flapping between failover and primary | ||
| priorities. For example, notice how priority LB child names increase in the following | ||
| sequence of locality updates. On each child name update, previous policies are added | ||
| to the retention cache. | ||
|
|
||
| ``` | ||
| [[AA, BB], [CC, DD]] => [priority-0-0 priority-0-1] | ||
| [[CC], [DD, EE]] => [priority-0-1 priority-0-2] | ||
| [[AA, BB], [CC, DD]] => [priority-0-3 priority-0-2] | ||
| ``` | ||
|
|
||
| Additionally, priority LB child names are generated with strictly increasing numbers | ||
| (once a priority LB child name is unconfigured, it will never be configured again). As such, | ||
| the cache is not providing us value. | ||
|
|
||
| ## Proposal | ||
|
|
||
| Priority LB should disable the child policy retention cache when a | ||
| child is removed from its config (i.e., case 2 only). | ||
|
|
||
| Note this should be done for Java and Go only. | ||
|
|
||
| For C-core, this behavior may actually be beneficial because of | ||
| subchannel pooling, so we're not planning to drop it there until | ||
| we have the longer-term solution ready. | ||
|
|
||
| ### Temporary environment variable protection | ||
|
||
|
|
||
| Implementations should provide an environment variable to revert | ||
| to the previous behavior (child policy cache enabled with 15-minute timer). | ||
|
||
|
|
||
| This should be kept around for a few releases, and then removed. | ||
|
|
||
| Env var name: `GRPC_EXPERIMENTAL_ENABLE_PRIORITY_LB_CHILD_POLICY_CACHE`. | ||
|
|
||
| ## Rationale | ||
|
||
|
|
||
| - Caching the child when it gets removed from the config does not actually | ||
| accomplish anything useful in the case of choosing a priority within an | ||
| xDS cluster (which is the primary case where this policy is used), because | ||
| the heuristic in the cds policy that assigns the child names will never | ||
| reuse a child name once it has been removed from the config. | ||
|
|
||
| - We have seen cases where retaining the children has used up a lot of memory | ||
| and file descriptors, which has caused problems for users. | ||
|
|
||
| - In the long run, we want a better solution involving a separate layer for | ||
| caching subchannels rather than LB policies, but that will be a separate | ||
| project to be undertaken later. | ||
|
|
||
| ## Implementation | ||
|
|
||
| N/A | ||
|
|
||
| ## Open issues (if applicable) | ||
|
|
||
| N/A | ||
Review comment (apparently on the example in the Background section): Let's make this example reflect what should ideally be happening, rather than the weirdness that was Go's original implementation: