feat: add soft-drop table recovery procedures#8061
feat: add soft-drop table recovery procedures#8061v0y4g3r wants to merge 9 commits intoGreptimeTeam:mainfrom
Conversation
Soft-drop now tombstones table metadata and closes datanode regions instead of issuing physical drop requests, while preserving hard-drop cleanup semantics and blocking conflicting drops of recreated table names. Files: - `src/common/meta/src/ddl.rs` - `src/common/meta/src/ddl/drop_table.rs` - `src/common/meta/src/ddl/drop_table/executor.rs` - `src/common/meta/src/error.rs` - `src/common/meta/src/ddl_manager.rs` - `src/meta-srv/src/metasrv/builder.rs` - `src/cmd/src/standalone.rs` - `src/common/meta/src/test_util.rs` - `src/meta-srv/src/procedure/utils.rs` - `tests-integration/src/standalone.rs` - `src/common/meta/src/ddl/tests/drop_table.rs` Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
Add soft-drop recovery and cleanup procedures, wire their DDL task handling, and update \`greptime-proto\` so the new tasks can round-trip through protobuf. Files: - \`Cargo.toml\` - \`Cargo.lock\` - \`src/common/meta/src/ddl.rs\` - \`src/common/meta/src/ddl/undrop_table.rs\` - \`src/common/meta/src/ddl/purge_dropped_table.rs\` - \`src/common/meta/src/ddl_manager.rs\` - \`src/common/meta/src/rpc/ddl.rs\` - \`src/common/meta/src/key.rs\` - \`src/common/meta/src/ddl/tests/drop_table.rs\` - \`src/mito2/src/engine/open_test.rs\` Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
Prevent soft-dropping, undropping, and purging of metric engine logical tables by explicitly returning unsupported errors. This introduces `is_metric_engine_logical_table` to identify metric logical tables and adds corresponding test cases. Files: - `src/common/meta/src/ddl/drop_table/metadata.rs` - `src/common/meta/src/ddl/purge_dropped_table.rs` - `src/common/meta/src/ddl/tests/drop_table.rs` - `src/common/meta/src/ddl/undrop_table.rs` - `src/common/meta/src/ddl/utils.rs` Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
Keep region failure detector state aligned as soft-dropped tables close, reopen, and purge regions so stale detectors do not trigger failover for unavailable or deleted regions. Files: - \`src/common/meta/src/ddl.rs\` - \`src/common/meta/src/ddl/drop_table.rs\` - \`src/common/meta/src/ddl/undrop_table.rs\` - \`src/common/meta/src/ddl/purge_dropped_table.rs\` - \`src/common/meta/src/ddl/tests/drop_table.rs\` Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
There was a problem hiding this comment.
Code Review
This pull request introduces a "soft drop" feature, enabling tables to be tombstoned instead of immediately deleted. It adds new procedures for undropping and purging tables, along with a soft_drop_enabled configuration in the DDL context. Review feedback highlights that the open_regions logic in the undrop procedure is incomplete as it only reopens leader regions, potentially leaving replicated tables without their follower replicas. Suggestions were made to include HashSet and follower-related routing imports to correctly restore the full replication state.
Consolidate redundant soft-drop lifecycle assertions into existing end-to-end tests and share dropped-table metadata setup to keep the branch coverage focused. Files: - `src/common/meta/src/ddl/tests/drop_table.rs` - `src/common/meta/src/key.rs` - `src/mito2/src/engine/open_test.rs` Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
Reopen all replicas when restoring dropped physical tables so recovered replicated tables do not leave follower regions closed. Files: - `src/common/meta/src/ddl/undrop_table.rs` - `src/common/meta/src/ddl/tests/drop_table.rs` Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
Update greptime-proto and adapt dropped table DDL task conversions to the shared expression wrappers required by the proto API. Files: - `Cargo.toml` - `Cargo.lock` - `src/api/src/helper.rs` - `src/common/meta/src/ddl/drop_table/executor.rs` - `src/common/meta/src/rpc/ddl.rs` Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
Point GreptimeDB at the proto revision that restores direct dropped table task fields and remove wrapper-expression conversion code. Files: - `Cargo.toml` - `Cargo.lock` - `src/api/src/helper.rs` - `src/common/meta/src/rpc/ddl.rs` Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
There was a problem hiding this comment.
Pull request overview
This PR adds metasrv-side procedure support for table soft-drop recovery, introducing explicit UNDROP and PURGE workflows and updating the existing DROP TABLE procedure to support a “tombstone + close regions” path when soft-drop is enabled.
Changes:
- Add
UndropTableProcedureto restore tombstoned metadata and reopen preserved regions. - Add
PurgeDroppedTableProcedureto permanently drop regions for tombstoned tables and delete tombstones. - Extend DDL task/procedure plumbing (task variants, protobuf conversions, loaders/dispatch) and add test coverage for soft-drop/undrop/purge behavior.
Reviewed changes
Copilot reviewed 19 out of 20 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| tests-integration/src/standalone.rs | Initializes new DdlContext.soft_drop_enabled field in integration standalone builder. |
| src/mito2/src/engine/open_test.rs | Strengthens reopen test by inserting rows pre-reopen and validating scan results post-reopen. |
| src/meta-srv/src/procedure/utils.rs | Updates metasrv procedure test context to include soft_drop_enabled. |
| src/meta-srv/src/metasrv/builder.rs | Wires soft_drop_enabled into metasrv DDL context (currently hardcoded false). |
| src/common/meta/src/test_util.rs | Updates common-meta test DDL contexts to include soft_drop_enabled. |
| src/common/meta/src/rpc/ddl.rs | Adds UndropTable and PurgeDroppedTable DDL tasks + protobuf roundtrip tests. |
| src/common/meta/src/key.rs | Improves dropped-table WAL options reconstruction to tolerate logical routes. |
| src/common/meta/src/error.rs | Adds TableNameTombstoneConflict error and maps it to TableAlreadyExists. |
| src/common/meta/src/ddl/utils.rs | Adds helper is_metric_engine_logical_table() for soft-drop gating logic. |
| src/common/meta/src/ddl/undrop_table.rs | New undrop procedure: restore tombstones, reopen regions, invalidate cache, re-register failure detectors. |
| src/common/meta/src/ddl/tests/drop_table.rs | Adds extensive tests for soft-drop, name conflict semantics, undrop, purge, and metric-engine logical table rejection. |
| src/common/meta/src/ddl/purge_dropped_table.rs | New purge procedure: resolve tombstone, reopen regions (if physical), drop, delete tombstones, deregister failure detectors. |
| src/common/meta/src/ddl/drop_table/metadata.rs | Adds soft-drop gating for metric-engine logical tables and refactors metadata fetch flow. |
| src/common/meta/src/ddl/drop_table/executor.rs | Adds tombstone-name conflict check for dropping recreated tables; adds close-regions path. |
| src/common/meta/src/ddl/drop_table.rs | Implements soft-drop branch: close regions + deregister detectors and stop before physical drop. |
| src/common/meta/src/ddl.rs | Adds soft_drop_enabled to DdlContext; exposes detector deregistration to crate. |
| src/common/meta/src/ddl_manager.rs | Registers/dispatches new procedures and adds submit helpers for undrop/purge tasks. |
| src/cmd/src/standalone.rs | Initializes new DdlContext.soft_drop_enabled field for standalone command. |
| Cargo.toml | Updates greptime-proto git revision. |
| Cargo.lock | Lockfile update following greptime-proto revision bump. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| .await? | ||
| .with_context(|| error::TableNotFoundSnafu { | ||
| table_name: table_ref.to_string(), | ||
| })?; |
| async fn on_prepare(&mut self) -> Result<Status> { | ||
| let dropped_table = if let Some(table_id) = self.data.task.table_id { | ||
| self.context | ||
| .table_metadata_manager | ||
| .get_dropped_table_by_id(table_id) | ||
| .await? | ||
| } else { | ||
| self.context | ||
| .table_metadata_manager | ||
| .get_dropped_table(&self.data.task.table_name()) | ||
| .await? | ||
| }; | ||
|
|
||
| let Some(dropped_table) = dropped_table else { | ||
| return Ok(Status::done()); | ||
| }; | ||
| ensure!( | ||
| !is_metric_engine_logical_table( | ||
| &dropped_table.table_info_value.table_info, | ||
| &dropped_table.table_route_value | ||
| ), | ||
| error::UnsupportedSnafu { | ||
| operation: "purging metric logical tables".to_string() | ||
| } | ||
| ); | ||
| self.data.table_id = Some(dropped_table.table_id); | ||
| self.data.table_name = Some(dropped_table.table_name); | ||
| self.data.table_info = Some(dropped_table.table_info_value.table_info); | ||
| self.data.table_route_value = Some(dropped_table.table_route_value); | ||
| self.data.region_wal_options = dropped_table.region_wal_options; |
| async fn on_open_regions(&mut self) -> Result<Status> { | ||
| if let Some(region_routes) = self.data.physical_region_routes() { | ||
| open_regions( | ||
| &self.context, | ||
| self.data.table_id(), | ||
| self.data.table_name(), | ||
| self.data.table_info(), | ||
| region_routes, | ||
| &self.data.region_wal_options, | ||
| ) | ||
| .await?; | ||
| } | ||
| self.data.state = PurgeDroppedTableState::DropRegions; | ||
| Ok(Status::executing(true)) | ||
| } |
| /// Resolves if soft-drop is enabled from metasrv options. | ||
| fn ddl_soft_drop_enabled(_options: &MetasrvOptions) -> bool { | ||
| // TODO(hl): add a dedicated soft-drop cluster config | ||
| // when wiring the user-facing option. | ||
| false | ||
| } |
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0fc2f8d8a2
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| .get_dropped_table_by_id(self.data.task.table_id) | ||
| .await? | ||
| .with_context(|| error::TableNotFoundSnafu { | ||
| table_name: table_ref.to_string(), |
There was a problem hiding this comment.
Validate undrop name against tombstone before restore
on_prepare checks TableAlreadyExists using the request name, but then fetches tombstoned metadata by table_id and never verifies that both identities match. If a caller sends a mismatched (catalog/schema/table, table_id), the procedure can pass the name-exists check and later restore_table_metadata for a different table name, overwriting an existing live TableNameKey mapping and orphaning the current table’s metadata. Reject requests where the provided name does not match the tombstoned table referenced by table_id before moving tombstoned keys back.
Useful? React with 👍 / 👎.
I hereby agree to the terms of the GreptimeDB CLA.
Refer to a related PR or issue link (optional)
Refs #8020
Related proto PR: GreptimeTeam/greptime-proto#318
What's changed and what's your intention?
This PR implements Tasks 3 and 4 of the soft-drop table plan.
DROP TABLEwhen soft-drop is enabled: metadata is tombstoned, caches are invalidated, regions are closed, and physical cleanup is deferred.UndropTableProcedureto restore tombstoned metadata, reopen preserved physical regions, refresh cache state, and re-register failure detectors.PurgeDroppedTableProcedureto resolve tombstoned tables, reopen closed physical regions before issuing real drop requests, delete tombstones, and deregister failure detectors.Current limitation:
Test plan run during development:
cargo nextest run -p common-meta drop_tablecargo nextest run -p common-meta undrop purge_droppedcargo nextest run -p mito2 test_engine_reopens_closed_soft_dropped_regioncargo check -p common-meta --all-targetsPR Checklist
Please convert it to a draft if some of the following conditions are not met.