A115: disable priority LB child policy retention cache#541
apolcyn wants to merge 9 commits into grpc:master
Conversation
markdroth
left a comment
Thanks for writing this up!
Please let me know if you have any questions.
[[AA, BB, CC], [DD]] => [priority-0-3 priority-0-4]
[[BB, CC], [AA, DD]] => [priority-0-0 priority-0-1]
[[AA, BB, CC], [DD, EE]] => [priority-0-1 priority-0-2]
[[BB, CC], [AA, DD]] => [priority-0-2 priority-0-3]
I think the algorithm should be sticking with child numbers 1 and 2 in this last step.
I've sent you grpc/grpc#41896 to add tests demonstrating the algorithm's behavior in C-core.
Hmm, the original text is what Go was doing (I tagged you on an internal doc to illustrate), but maybe that was specific to Go.
Changed to your original suggestion
In the case you have now in the gRFC, in the third update, the algorithm will use child number 3 for [AA, BB] and reuse child number 2 for [CC, DD], because it was already using that number for [CC] in the second update.
If what you're seeing in Go matches what you currently have in the doc, then I suspect Go's implementation of this algorithm isn't quite right. @easwars
Will take a look and get back asap.
I don't see the actual algorithm described in the gRFC and I think this is where the algorithm was originally described: https://docs.google.com/document/d/13L2YVTGI2OrLpeCAPbgI4qEs0Oz02P7X0DKhk6Y9LMk/edit?tab=t.0#heading=h.r3twn1e6r8ct
Based on my reading of this section and my reading of C++, Java and Go, none of the languages actually implement the algorithm correctly. Specifically, none of the languages seem to implement this sentence:
- If there is at least one locality in the priority that was present in the same priority in the old config, copy the name from the same priority in the old config.
All languages seem to use a greedy approach where the moment they find a locality (from the new config) being present in the old config, they end up using that name. They don't actually check for localities in the same priority.
And I do see that in C++, when a map contains locality names as keys, they are ordered using the following: https://github.com/grpc/grpc/blob/2c62d7b7a927db2572b50277ba56eb4a754c43b7/src/core/xds/xds_client/xds_locality.h#L38. And I think this probably helps because in most of our examples, we seem to use very ordered locality names like "L0", "L1" ... or "aa", "bb". In Go, map key ordering is not deterministic.
I do have the changes ready in Go to actually implement the algorithm specified at the top of this comment. But before sending them out, I wanted to check here whether there was a reason why none of the languages ended up implementing the logic for looking up the locality in the same priority.
I ended up making the change in the name generation logic to give preference to localities in the same priority in the old config with the assumption that if we really don't want to be doing that, I can always revert the change back.
Yeah, unfortunately, I didn't write a gRFC for that refactoring, just the internal doc. I've regretted that several times since then, since we've had to patch holes in several subsequent gRFCs as a result (and this PR is yet another example of that). And if I had done it as a gRFC, that would likely have prompted me to be more detailed about the intended algorithm here.
In any case, you're right, it looks like C-core isn't implementing that bullet correctly. But I'm not sure it actually matters that much, because regardless of whether or not we do that, I think there will be cases where the algorithm will do the wrong thing, since it doesn't know anything about which localities are actually reachable or which ones have actually been created by the priority policy. I'm not sure that the order of localities really matters at all -- although I would at least argue for deterministic behavior, because that will be much easier to debug.
(As an aside, it's worth noting that Java and Go also do not correctly implement the "all gRPC implementations must ensure that subchannels for locality 1 are retained through this transition instead of being destroyed and recreated" part, because that requires subchannel pooling. Only C-core and Node do this correctly.)
BTW, why are Go's child names of the form "priority-{X}-{Y}"? I see that Y is the child number here, but what is X? If X is the priority, then that should not actually be part of the name at all, because we want a specific child to be able to move between priorities. In C-core, the child name is just "child{N}", which is what the doc specifies.
> BTW, why are Go's child names of the form "priority-{X}-{Y}"? I see that Y is the child number here, but what is X?
Yes, X is the priority. That is also an unfortunate artifact of how it ended up being implemented. I'll clean it up soonish.
> because that requires subchannel pooling. Only C-core and Node do this correctly
Yes, yes, yes :)
ejona86
left a comment
Do we (all of us) have a plan for cluster_manager? Are we going to make a separate gRFC for that?
I don't think we need any changes there at this point. Arguably, the caching there is a gray area, because I don't think any gRFC ever specified that we would actually have caching there, but all of our implementations actually do. But on the other hand, I don't think the caching in that policy is actually causing any problems right now. So my inclination is to wait until we have a problem before changing that.
Update on https://github.com/grpc/proposal/blob/master/A56-priority-lb-policy.md#child-lifetime-management