fix: revoke meta kv writes outside metasrv leader#8060
fix: revoke meta kv writes outside metasrv leader#8060QuakeWang wants to merge 5 commits intoGreptimeTeam:mainfrom
Conversation
Signed-off-by: QuakeWang <wangfuzheng0814@foxmail.com>
There was a problem hiding this comment.
Code Review
This pull request introduces a ReadOnlyKvBackend wrapper to enforce read-only access to metadata stores for frontend, datanode, and flownode components, ensuring metadata writes are routed through metasrv procedures. It also updates the MetaClient to be read-only by default and adds leader-cache readiness checks in Metasrv. Feedback was provided regarding an inefficient clone of the Txn object during validation in ReadOnlyKvBackend, suggesting the use of references or direct field access instead.
There was a problem hiding this comment.
Pull request overview
This PR tightens meta KV consistency by revoking direct meta KV write capability from non-leader/non-metasrv paths, aligning write semantics with metasrv leader-only assumptions (Issue #7585).
Changes:
- Introduces a
ReadOnlyKvBackendwrapper that forwards reads but rejects all writes (including write txns), and uses it for frontend/datanode/flownode catalog KV backends. - Makes meta-client Store writes read-only by default, with an explicit “admin/test” builder option enabling direct (non-leader-aware) Store writes.
- Enforces leader-only Store writes in metasrv’s Store gRPC service by rejecting writes when not leader or when leader-cache is not ready, and routes accepted writes through
LeaderCachedKvBackend.
Reviewed changes
Copilot reviewed 19 out of 19 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| tests-integration/src/test_util.rs | Adds helper to prepare catalog/schema using a raw KV backend (avoids read-only catalog backend paths). |
| tests-integration/src/cluster.rs | Switches integration cluster meta backend usage to new_read_only_meta_kv_backend and updates test setup ordering. |
| src/meta-srv/src/state.rs | Fixes become_leader to update enable_leader_cache on leader→leader transitions; extends state transition tests. |
| src/meta-srv/src/service/store.rs | Adds leader + leader-cache-ready checks for write RPCs and routes writes via leader_cached_kv_backend; adds rejection tests. |
| src/meta-srv/src/metasrv.rs | Exposes leader_cached_kv_backend() and is_leader_cache_ready() for Store service gating/routing. |
| src/meta-client/src/mocks.rs | Enables direct Store writes for mocked clients (admin/test mode). |
| src/meta-client/src/error.rs | Maps ReadOnlyKvBackend to StatusCode::Unsupported. |
| src/meta-client/src/client/store.rs | Makes Store client read-only by default; adds explicit writable constructor; adds fast-fail write tests. |
| src/meta-client/src/client.rs | Adds builder flag and method enable_direct_store_writes_for_admin; ensures store is read-only by default. |
| src/meta-client/examples/meta_client.rs | Updates example to enable direct Store writes explicitly. |
| src/common/meta/src/kv_backend/read_only.rs | Implements ReadOnlyKvBackend and unit tests (read forwarding, write rejection, txn validation). |
| src/common/meta/src/kv_backend.rs | Exports the new read_only module. |
| src/common/meta/src/key.rs | Adds coverage to ensure read-only KV backend can still read full table info. |
| src/common/meta/src/error.rs | Introduces ReadOnlyKvBackend error variant and maps it to StatusCode::Unsupported. |
| src/cmd/src/frontend.rs | Wraps meta catalog KV backend with read-only wrapper before caching/registries. |
| src/cmd/src/flownode.rs | Wraps meta catalog KV backend with read-only wrapper; removes metadata init that requires writes. |
| src/cmd/src/datanode/builder.rs | Switches datanode catalog backend construction to read-only meta KV backend. |
| src/catalog/src/kvbackend/client.rs | Makes MetaKvBackend internal, adds new_read_only_meta_kv_backend, and returns Unsupported for txn; adds tests. |
| src/catalog/src/kvbackend.rs | Re-exports new_read_only_meta_kv_backend instead of MetaKvBackend. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Signed-off-by: QuakeWang <wangfuzheng0814@foxmail.com>
|
@WenyXu I have addressed the Gemini and Copilot review comments: removed the unnecessary Txn clone and redacted write request logging. PTAL. |
Nice work. I'm thinking we could separate the KvBackend trait like this: This might improve the development experience and make the trait boundaries clearer. However, we may need to find a balance between making large code changes and keeping the code well-organized. WDYT? |
I agree with the direction. Splitting read-only and writable KV capabilities would make the boundary much clearer and would avoid relying on runtime guards in read-only paths. However, I think it is better to handle this in a follow-up PR. The tricky part is For this PR, I’d prefer to keep the scope focused on enforcing the write guard and leader-only Store writes. I can open/follow up with a separate refactor for |
There was a problem hiding this comment.
This is required for the new leader-cache readiness check. on_leader_start() first marks leader cache as disabled, then enables it after cache initialization. The old Leader -> Leader transition kept the old flag, so become_leader(true) could fail to update enable_leader_cache, causing Store writes to remain rejected.
There was a problem hiding this comment.
I’d prefer not to change this file
Signed-off-by: QuakeWang <wangfuzheng0814@foxmail.com>
8c3d681 to
8157871
Compare
There was a problem hiding this comment.
I’d prefer not to change this file.
There was a problem hiding this comment.
@QuakeWang I think there’s some misunderstanding here. I’d prefer to keep the change minimal and avoid writing data through the leader-cached KvBackend
Signed-off-by: QuakeWang <wangfuzheng0814@foxmail.com>
|
@WenyXu Done. I removed the extra request-summary logging changes from this file and kept only the Store write guard path: leader check, leader-cache readiness check, and routing accepted writes through |
There was a problem hiding this comment.
@QuakeWang I think there’s some misunderstanding here. I’d prefer to keep the change minimal and avoid writing data through the leader-cached KvBackend
There was a problem hiding this comment.
I’d prefer not to change this file
There was a problem hiding this comment.
I’d prefer not to change this file
|
@WenyXu Thanks for your review. I’ll remove the leader-cached KvBackend write path and the related One thing to confirm: should this PR still keep a minimal follower-side Store write rejection in |
Yes, I prefer not to change the server side. Adding a readonly wrapper on the client side should be enough for now |
Signed-off-by: QuakeWang <wangfuzheng0814@foxmail.com>
@WenyXu Thanks for clarifying. I reverted the server-side Store changes and kept this PR scoped to the client-side readonly wrapper. Specifically, Store write RPCs now continue to use the original |
I hereby agree to the terms of the GreptimeDB CLA.
Refer to a related PR or issue link (optional)
close: #7585
What's changed and what's your intention?
This PR revokes direct meta KV write access from non-metasrv leader paths.
ReadOnlyKvBackend, which forwards read operations and rejects write operations, including write txns.LeaderCachedKvBackend.enable_leader_cacheis updated after leader cache initialization.PR Checklist
Please convert it to a draft if some of the following conditions are not met.