fix: revoke meta kv writes outside metasrv leader #8060

Open
QuakeWang wants to merge 5 commits into GreptimeTeam:main from QuakeWang:meta-kv-write-guard

Conversation

@QuakeWang
Contributor

I hereby agree to the terms of the GreptimeDB CLA.

Refer to a related PR or issue link (optional)

close: #7585

What's changed and what's your intention?

This PR revokes direct meta KV write access from every path other than the metasrv leader.

  • Adds ReadOnlyKvBackend, which forwards read operations and rejects write operations, including write txns.
  • Wraps frontend, datanode, and flownode meta catalog KV backends with the read-only wrapper.
  • Makes meta-client Store writes read-only by default, while keeping an explicit admin/test writable mode.
  • Enforces leader-only Store writes on metasrv and routes accepted writes through LeaderCachedKvBackend.
  • Fixes leader-cache state transition so enable_leader_cache is updated after leader cache initialization.
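
To make the first bullet concrete, here is a minimal sketch of a read-only wrapper. It is not the code from this PR: the `KvBackend` trait below is a simplified, synchronous stand-in (the real trait is larger and async), and `MemoryKvBackend`, `KvError`, and the method signatures are assumptions made for illustration.

```rust
use std::collections::BTreeMap;
use std::sync::{Arc, Mutex};

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum KvError {
    /// Returned when a write reaches a read-only backend.
    ReadOnly { operation: &'static str },
}

/// Simplified, synchronous stand-in for a KV backend trait (assumption).
pub trait KvBackend: Send + Sync {
    fn get(&self, key: &[u8]) -> Result<Option<Vec<u8>>, KvError>;
    fn put(&self, key: Vec<u8>, value: Vec<u8>) -> Result<(), KvError>;
    fn delete(&self, key: &[u8]) -> Result<(), KvError>;
}

/// In-memory backend used only for this illustration.
#[derive(Default)]
pub struct MemoryKvBackend {
    data: Mutex<BTreeMap<Vec<u8>, Vec<u8>>>,
}

impl KvBackend for MemoryKvBackend {
    fn get(&self, key: &[u8]) -> Result<Option<Vec<u8>>, KvError> {
        Ok(self.data.lock().unwrap().get(key).cloned())
    }
    fn put(&self, key: Vec<u8>, value: Vec<u8>) -> Result<(), KvError> {
        self.data.lock().unwrap().insert(key, value);
        Ok(())
    }
    fn delete(&self, key: &[u8]) -> Result<(), KvError> {
        self.data.lock().unwrap().remove(key);
        Ok(())
    }
}

/// Wrapper that forwards reads to `inner` and rejects every write.
pub struct ReadOnlyKvBackend {
    inner: Arc<dyn KvBackend>,
}

impl ReadOnlyKvBackend {
    pub fn new(inner: Arc<dyn KvBackend>) -> Self {
        Self { inner }
    }
}

impl KvBackend for ReadOnlyKvBackend {
    fn get(&self, key: &[u8]) -> Result<Option<Vec<u8>>, KvError> {
        // Reads pass straight through to the wrapped backend.
        self.inner.get(key)
    }
    fn put(&self, _key: Vec<u8>, _value: Vec<u8>) -> Result<(), KvError> {
        // Writes never reach the wrapped backend.
        Err(KvError::ReadOnly { operation: "put" })
    }
    fn delete(&self, _key: &[u8]) -> Result<(), KvError> {
        Err(KvError::ReadOnly { operation: "delete" })
    }
}
```

In this sketch a caller holding only the wrapper can still read what the raw backend contains, while `put` and `delete` return `KvError::ReadOnly` and leave the underlying data unchanged.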

PR Checklist

Please convert it to a draft if some of the following conditions are not met.

  • I have written the necessary rustdoc comments.
  • I have added the necessary unit tests and integration tests.
  • This PR requires documentation updates.
  • API changes are backward compatible.
  • Schema or data changes are backward compatible.

Signed-off-by: QuakeWang <wangfuzheng0814@foxmail.com>
@QuakeWang QuakeWang requested review from a team, MichaelScofield and WenyXu as code owners May 3, 2026 04:17
@github-actions github-actions Bot added the size/M and docs-not-required labels May 3, 2026
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces a ReadOnlyKvBackend wrapper to enforce read-only access to metadata stores for frontend, datanode, and flownode components, ensuring metadata writes are routed through metasrv procedures. It also updates the MetaClient to be read-only by default and adds leader-cache readiness checks in Metasrv. Feedback was provided regarding an inefficient clone of the Txn object during validation in ReadOnlyKvBackend, suggesting the use of references or direct field access instead.

Comment thread src/common/meta/src/kv_backend/read_only.rs
Contributor

Copilot AI left a comment

Pull request overview

This PR tightens meta KV consistency by revoking direct meta KV write capability from non-leader/non-metasrv paths, aligning write semantics with metasrv leader-only assumptions (Issue #7585).

Changes:

  • Introduces a ReadOnlyKvBackend wrapper that forwards reads but rejects all writes (including write txns), and uses it for frontend/datanode/flownode catalog KV backends.
  • Makes meta-client Store writes read-only by default, with an explicit “admin/test” builder option enabling direct (non-leader-aware) Store writes.
  • Enforces leader-only Store writes in metasrv’s Store gRPC service by rejecting writes when not leader or when leader-cache is not ready, and routes accepted writes through LeaderCachedKvBackend.
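
The "read-only by default, explicit opt-in" behavior in the second bullet could look roughly like the sketch below. Only the method name `enable_direct_store_writes_for_admin` comes from this review; its signature, the `MetaClientBuilder` fields, `StoreClient`, `StoreError`, and `build_store` are assumptions, and the counter stands in for the actual RPC.

```rust
/// Hypothetical error returned when a write hits a read-only Store client.
#[derive(Debug, PartialEq, Eq)]
pub enum StoreError {
    ReadOnly,
}

/// Stand-in for the Store client; `puts_sent` replaces the real RPC call.
#[derive(Debug)]
pub struct StoreClient {
    writable: bool,
    puts_sent: usize,
}

impl StoreClient {
    pub fn put(&mut self, _key: &[u8], _value: &[u8]) -> Result<(), StoreError> {
        if !self.writable {
            // Fail fast on the client, before any request would be sent.
            return Err(StoreError::ReadOnly);
        }
        self.puts_sent += 1;
        Ok(())
    }

    pub fn puts_sent(&self) -> usize {
        self.puts_sent
    }
}

/// Builder whose default leaves direct Store writes disabled.
#[derive(Debug, Default)]
pub struct MetaClientBuilder {
    direct_store_writes: bool,
}

impl MetaClientBuilder {
    pub fn new() -> Self {
        Self::default()
    }

    /// Explicit opt-in intended for admin tools and tests.
    pub fn enable_direct_store_writes_for_admin(mut self) -> Self {
        self.direct_store_writes = true;
        self
    }

    pub fn build_store(self) -> StoreClient {
        StoreClient {
            writable: self.direct_store_writes,
            puts_sent: 0,
        }
    }
}
```

The design point is that a writable client has to be requested by name at construction time, so a write from an ordinary component fails locally instead of reaching metasrv.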

Reviewed changes

Copilot reviewed 19 out of 19 changed files in this pull request and generated 1 comment.

Show a summary per file
| File | Description |
| --- | --- |
| tests-integration/src/test_util.rs | Adds helper to prepare catalog/schema using a raw KV backend (avoids read-only catalog backend paths). |
| tests-integration/src/cluster.rs | Switches integration cluster meta backend usage to new_read_only_meta_kv_backend and updates test setup ordering. |
| src/meta-srv/src/state.rs | Fixes become_leader to update enable_leader_cache on leader→leader transitions; extends state transition tests. |
| src/meta-srv/src/service/store.rs | Adds leader + leader-cache-ready checks for write RPCs and routes writes via leader_cached_kv_backend; adds rejection tests. |
| src/meta-srv/src/metasrv.rs | Exposes leader_cached_kv_backend() and is_leader_cache_ready() for Store service gating/routing. |
| src/meta-client/src/mocks.rs | Enables direct Store writes for mocked clients (admin/test mode). |
| src/meta-client/src/error.rs | Maps ReadOnlyKvBackend to StatusCode::Unsupported. |
| src/meta-client/src/client/store.rs | Makes Store client read-only by default; adds explicit writable constructor; adds fast-fail write tests. |
| src/meta-client/src/client.rs | Adds builder flag and method enable_direct_store_writes_for_admin; ensures store is read-only by default. |
| src/meta-client/examples/meta_client.rs | Updates example to enable direct Store writes explicitly. |
| src/common/meta/src/kv_backend/read_only.rs | Implements ReadOnlyKvBackend and unit tests (read forwarding, write rejection, txn validation). |
| src/common/meta/src/kv_backend.rs | Exports the new read_only module. |
| src/common/meta/src/key.rs | Adds coverage to ensure read-only KV backend can still read full table info. |
| src/common/meta/src/error.rs | Introduces ReadOnlyKvBackend error variant and maps it to StatusCode::Unsupported. |
| src/cmd/src/frontend.rs | Wraps meta catalog KV backend with read-only wrapper before caching/registries. |
| src/cmd/src/flownode.rs | Wraps meta catalog KV backend with read-only wrapper; removes metadata init that requires writes. |
| src/cmd/src/datanode/builder.rs | Switches datanode catalog backend construction to read-only meta KV backend. |
| src/catalog/src/kvbackend/client.rs | Makes MetaKvBackend internal, adds new_read_only_meta_kv_backend, and returns Unsupported for txn; adds tests. |
| src/catalog/src/kvbackend.rs | Re-exports new_read_only_meta_kv_backend instead of MetaKvBackend. |

Comment thread src/meta-srv/src/service/store.rs Outdated
Signed-off-by: QuakeWang <wangfuzheng0814@foxmail.com>
@github-actions github-actions Bot added size/L and removed size/M labels May 7, 2026
@QuakeWang
Contributor Author

@WenyXu I have addressed the Gemini and Copilot review comments: removed the unnecessary Txn clone and redacted write request logging. PTAL.

@WenyXu
Member

WenyXu commented May 7, 2026

Nice work. I'm thinking we could separate the KvBackend trait like this:

KvBackend: ReadonlyKvBackend + WritableKvBackend
WritableKvBackend: Txn
ReadonlyKvBackend: Txn

This might improve the development experience and make the trait boundaries clearer. However, we may need to find a balance between making large code changes and keeping the code well-organized. WDYT?
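
For illustration only, the capability split proposed above might be sketched like this. The trait names follow the comment; everything else (`MemoryKv`, `table_name`, `register_table`, string keys, synchronous methods) is hypothetical, and the `Txn` bounds are left out.

```rust
use std::collections::HashMap;

/// Read capability only.
pub trait ReadonlyKvBackend {
    fn get(&self, key: &str) -> Option<String>;
}

/// Write capability only.
pub trait WritableKvBackend {
    fn put(&mut self, key: &str, value: &str);
}

/// Full capability, mirroring `KvBackend: ReadonlyKvBackend + WritableKvBackend`.
pub trait KvBackend: ReadonlyKvBackend + WritableKvBackend {}

/// Any type with both capabilities is a full `KvBackend`.
impl<T: ReadonlyKvBackend + WritableKvBackend> KvBackend for T {}

/// Hypothetical in-memory store used to exercise the traits.
#[derive(Default)]
pub struct MemoryKv {
    data: HashMap<String, String>,
}

impl ReadonlyKvBackend for MemoryKv {
    fn get(&self, key: &str) -> Option<String> {
        self.data.get(key).cloned()
    }
}

impl WritableKvBackend for MemoryKv {
    fn put(&mut self, key: &str, value: &str) {
        self.data.insert(key.to_string(), value.to_string());
    }
}

/// A read path: only `get` is available through this parameter type.
pub fn table_name(kv: &dyn ReadonlyKvBackend, id: &str) -> Option<String> {
    kv.get(id)
}

/// A write path that has to ask for the full capability.
pub fn register_table(kv: &mut dyn KvBackend, id: &str, name: &str) {
    kv.put(id, name);
}
```

With a split like this, a read path that tries to call `put` fails to compile, so no runtime guard is needed on that path.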

@QuakeWang
Contributor Author

I agree with the direction. Splitting read-only and writable KV capabilities would make the boundary much clearer and would avoid relying on runtime guards in read-only paths.

However, I think it is better to handle this in a follow-up PR. The tricky part is TxnService: the current Txn can contain Get, Put, and Delete, while some read paths such as get_full_table_info() still use txn for multi-key reads. So a clean split likely needs either a read-only txn abstraction or moving those read paths to explicit read APIs.

For this PR, I’d prefer to keep the scope focused on enforcing the write guard and leader-only Store writes. I can follow up with a separate refactor for ReadonlyKvBackend / WritableKvBackend after this lands.
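
The txn concern can be illustrated with a small sketch of a runtime guard: a transaction is accepted by a read-only backend only if every operation in it is a `Get`. The `TxnOp` and `Txn` types below are simplified, hypothetical shapes (no compare conditions), not the project's actual definitions, and the check borrows the transaction rather than cloning it.

```rust
/// Hypothetical, simplified transaction operation.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum TxnOp {
    Get(Vec<u8>),
    Put(Vec<u8>, Vec<u8>),
    Delete(Vec<u8>),
}

/// Hypothetical transaction with a success branch and a failure branch.
#[derive(Debug, Clone, Default)]
pub struct Txn {
    pub success: Vec<TxnOp>,
    pub failure: Vec<TxnOp>,
}

/// Returns true when neither branch contains a `Put` or `Delete`.
/// Takes the transaction by reference, so validation needs no clone.
pub fn is_read_only(txn: &Txn) -> bool {
    txn.success
        .iter()
        .chain(txn.failure.iter())
        .all(|op| matches!(op, TxnOp::Get(_)))
}
```

A multi-key read expressed as a txn passes this check, while a txn with a write in either branch is rejected. A read-only txn abstraction, as discussed above, would move the same distinction into the type system.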

Comment thread src/common/meta/src/key.rs
Comment thread src/meta-client/src/client/store.rs Outdated
Comment thread src/meta-client/src/client.rs Outdated
Comment thread src/meta-srv/src/metasrv.rs
Comment thread src/meta-srv/src/state.rs
Member

ditto

Contributor Author

This is required for the new leader-cache readiness check. on_leader_start() first marks leader cache as disabled, then enables it after cache initialization. The old Leader -> Leader transition kept the old flag, so become_leader(true) could fail to update enable_leader_cache, causing Store writes to remain rejected.
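
A minimal sketch of the transition described in this comment, assuming a two-variant state. The `State` enum shape and both method bodies are illustrative assumptions, not the contents of state.rs; `become_leader_keeping_flag` models the old behavior as described here.

```rust
/// Hypothetical, simplified metasrv state.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum State {
    Follower,
    Leader { enable_leader_cache: bool },
}

impl State {
    /// Old behavior as described: a Leader -> Leader transition keeps
    /// whatever flag the state already had.
    pub fn become_leader_keeping_flag(self, enable_leader_cache: bool) -> State {
        match self {
            State::Leader { .. } => self,
            State::Follower => State::Leader { enable_leader_cache },
        }
    }

    /// Changed behavior: the new flag is always applied,
    /// including on Leader -> Leader.
    pub fn become_leader(self, enable_leader_cache: bool) -> State {
        State::Leader { enable_leader_cache }
    }

    /// Readiness check that a write guard could consult.
    pub fn is_leader_cache_ready(self) -> bool {
        matches!(self, State::Leader { enable_leader_cache: true })
    }
}
```

Under these assumptions, a node that becomes leader with the cache disabled and later calls the flag-keeping transition with `true` never reports the cache as ready, which is the stuck-rejection case the comment describes.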

Member

I’d prefer not to change this file

Signed-off-by: QuakeWang <wangfuzheng0814@foxmail.com>
@QuakeWang QuakeWang force-pushed the meta-kv-write-guard branch from 8c3d681 to 8157871 Compare May 7, 2026 08:07
Member

I’d prefer not to change this file.

Member

@QuakeWang I think there’s some misunderstanding here. I’d prefer to keep the change minimal and avoid writing data through the leader-cached KvBackend

Comment thread src/meta-client/src/client.rs Outdated
Signed-off-by: QuakeWang <wangfuzheng0814@foxmail.com>
@QuakeWang
Contributor Author

@WenyXu Done. I removed the extra request-summary logging changes from this file and kept only the Store write guard path: leader check, leader-cache readiness check, and routing accepted writes through leader_cached_kv_backend().

Member

@QuakeWang I think there’s some misunderstanding here. I’d prefer to keep the change minimal and avoid writing data through the leader-cached KvBackend

Member

I’d prefer not to change this file

Comment thread src/meta-srv/src/state.rs
Member

I’d prefer not to change this file

@QuakeWang
Contributor Author

@WenyXu Thanks for your review. I’ll remove the leader-cached KvBackend write path and the related metasrv.rs / state.rs changes.

One thing to confirm: should this PR still keep a minimal follower-side Store write rejection in src/meta-srv/src/service/store.rs, while continuing to write through the original kv_backend()? Or should we avoid changing store.rs entirely and leave server-side Store enforcement for a follow-up?

@WenyXu
Member

WenyXu commented May 8, 2026

Yes, I prefer not to change the server side. Adding a readonly wrapper on the client side should be enough for now

Signed-off-by: QuakeWang <wangfuzheng0814@foxmail.com>
@github-actions github-actions Bot added size/M and removed size/L labels May 8, 2026
@QuakeWang
Contributor Author

@WenyXu Thanks for clarifying. I reverted the server-side Store changes and kept this PR scoped to the client-side readonly wrapper.

Specifically, Store write RPCs now continue to use the original kv_backend() path, and I removed the leader_cached_kv_backend() routing, leader/cache readiness checks, and the related metasrv.rs / state.rs changes.

Labels

docs-not-required This change does not impact docs. size/M

Development

Successfully merging this pull request may close these issues.

Revoke KV write access outside metasrv leader

3 participants