fix: embedded etcd should only connect to self #366
Walkthrough
The PR adds a new rolling add/remove host test for cluster management and refactors internal state update ordering. Changes include reordering the Cluster.Remove method's state updates and updating EmbeddedEtcd's client configuration to use ClientEndpoints instead of cluster URLs.
Summary
Fixes a bug where the etcd clients in hosts with embedded etcd were configured to connect to all cluster members that existed when the client was initialized. The correct behavior is that hosts with embedded etcd should only connect to themselves. This was the original intent and functionality, but I changed it while implementing support for remote Etcd. I did a few different implementations of the remote Etcd feature before settling on the current one, and I think this was just an accidental inclusion from a different implementation of that feature.
PLAT-581
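To illustrate the difference between the two client configurations, here is a minimal, self-contained sketch. It is not the code from this PR: `Member`, `allMemberEndpoints`, `selfEndpoints`, and `staleCount` are hypothetical names, and the host URLs are made up. It only shows one plausible way a snapshot of every member's client URLs can go stale during a rolling add/remove, while a self-only endpoint list stays valid for as long as the local member exists.

```go
package main

import "fmt"

// Member is a hypothetical stand-in for an etcd cluster member record.
type Member struct {
	Name       string
	ClientURLs []string
}

// allMemberEndpoints mirrors the behavior described as the bug: it snapshots
// the client URLs of every member known when the client is initialized.
func allMemberEndpoints(members []Member) []string {
	var eps []string
	for _, m := range members {
		eps = append(eps, m.ClientURLs...)
	}
	return eps
}

// selfEndpoints mirrors the intended behavior: only the local member's
// client URLs are returned. It returns nil if the member is not found.
func selfEndpoints(members []Member, self string) []string {
	for _, m := range members {
		if m.Name == self {
			return append([]string(nil), m.ClientURLs...)
		}
	}
	return nil
}

// staleCount reports how many configured endpoints no longer belong to
// any live member.
func staleCount(endpoints []string, live []Member) int {
	liveSet := make(map[string]bool)
	for _, m := range live {
		for _, u := range m.ClientURLs {
			liveSet[u] = true
		}
	}
	n := 0
	for _, e := range endpoints {
		if !liveSet[e] {
			n++
		}
	}
	return n
}

func main() {
	members := []Member{
		{Name: "host-1", ClientURLs: []string{"https://host-1:2379"}},
		{Name: "host-2", ClientURLs: []string{"https://host-2:2379"}},
		{Name: "host-3", ClientURLs: []string{"https://host-3:2379"}},
	}

	// Endpoint lists as host-2 would compute them at client init time.
	all := allMemberEndpoints(members)
	self := selfEndpoints(members, "host-2")

	// Simulate one step of a rolling removal: host-1 leaves the cluster.
	remaining := members[1:]

	fmt.Println("stale endpoints (all members):", staleCount(all, remaining))
	fmt.Println("stale endpoints (self only):", staleCount(self, remaining))
}
```

In the real client, the equivalent choice would be which URLs end up in the etcd client's endpoint list; the sketch leaves the etcd client library out so that it runs with the standard library alone.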
Testing
This PR includes a cluster test for the issue:
```sh
# Using TEST_RERUN_FAILS in case the OS steals our ephemeral ports
make test-cluster TEST_RERUN_FAILS=2 CLUSTER_TEST_RUN=TestRollingAddRemove
```

To run the scenario by hand:
Note that our dev setup is slightly different than the scenario in the ticket and our cluster test because hosts 4-6 are in client mode. It still reproduces and validates the issue because the failure occurs in host-2.