MDEV-38412 System tablespace fails to shrink due to legacy tables#4884
Thirunarayanan wants to merge 1 commit into 11.4
Problem:
========
InnoDB system tablespace autoshrink can fail when legacy tables exist
in the system tablespace (space_id=0). These are typically old tables
from earlier InnoDB versions whose names do not contain the '/'
character.
Solution:
=========
During system tablespace autoshrinking, InnoDB now proactively drops
unknown/legacy tables that meet the following criteria:
- Table name does not contain '/'
- Table is not an InnoDB system table
For each table that meets both criteria, the corresponding records are
deleted from the system tables.
drop_tables_by_filter(): Scan and drop tables by predicate
- Used to drop garbage tables during restore
- Used to drop unknown tables from the system tablespace
- Added DBUG_EXECUTE_IF("create_dummy_sys_tables") for testing
this scenario.
This cleanup happens automatically before the autoshrink operation,
preventing failures and allowing the system tablespace to be properly
truncated.
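
A minimal, standalone sketch of the filter-by-predicate idea described above. It does not use InnoDB's record or dictionary types: `sys_table_entry`, `is_unknown_table()` and `collect_tables_by_filter()` are hypothetical names, and the `is_sys_table` flag merely stands in for whatever `dict_sys.is_sys_table()` would report.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for one SYS_TABLES record.
struct sys_table_entry
{
  std::string name;
  bool is_sys_table; // assumption: precomputed answer of the system-table check
};

// The two criteria from the description: the name has no '/',
// and the table is not an InnoDB system table.
static bool is_unknown_table(const sys_table_entry &e) noexcept
{
  return !e.is_sys_table && e.name.find('/') == std::string::npos;
}

// Scan all entries and return the names selected by the predicate.
// The real function would drop the tables instead of collecting names.
template<typename Pred>
static std::vector<std::string>
collect_tables_by_filter(const std::vector<sys_table_entry> &entries,
                         Pred pred)
{
  std::vector<std::string> selected;
  for (const sys_table_entry &e : entries)
    if (pred(e))
      selected.push_back(e.name);
  return selected;
}
```

Passing the predicate as a parameter is what would let the same scan serve both the restore-time cleanup and the autoshrink cleanup.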
| DBUG_RETURN(2); | ||
| } | ||
| static bool should_drop_garbage_table(const rec_t* rec, ulint len) |
This is missing noexcept, and it’d be better to use size_t instead of the alias ulint. There is no comment explaining the return value and the parameters.
| @param[in] rec SYS_TABLES record containing the table name | ||
| @param[in] len length of the table name | ||
| @return true if the table should be dropped, false otherwise */ | ||
| static bool should_drop_unknown_table(const rec_t* rec, ulint len) |
Missing noexcept. A space around the * is misplaced. Please, no [in] in new code. The @return comment should not mention individual return values; we have @retval for that.
| const byte *field= rec_get_nth_field_old(rec, DICT_FLD__SYS_TABLES__ID, &id_len); | ||
| if (dict_sys.is_sys_table(mach_read_from_8(field))) | ||
| return false; |
The if expression is missing id_len != 8 ||.
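
A standalone sketch of why the length check should come first: with short-circuit evaluation, a malformed ID field is never read. `read_be64()` mimics the big-endian read that `mach_read_from_8()` performs, while `is_sys_table_id()` and its bound of 16 are made-up placeholders for `dict_sys.is_sys_table()`, not the real rule.

```cpp
#include <cstddef>
#include <cstdint>

// Stand-in for mach_read_from_8(): read 8 bytes in big-endian order.
static uint64_t read_be64(const unsigned char *b) noexcept
{
  uint64_t v= 0;
  for (int i= 0; i < 8; i++)
    v= v << 8 | b[i];
  return v;
}

// Hypothetical placeholder; the bound 16 is an assumption for illustration.
static bool is_sys_table_id(uint64_t id) noexcept
{
  return id < 16;
}

// Returns true if the table must be kept. A field of unexpected length
// is treated as "keep" and is not dereferenced at all.
static bool must_keep_table(const unsigned char *field, size_t id_len) noexcept
{
  return id_len != 8 || is_sys_table_id(read_be64(field));
}
```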
| if (dict_table_t *table= dict_sys.load_table( | ||
| {reinterpret_cast<const char*>(pcur.old_rec), len}, | ||
| DICT_ERR_IGNORE_DROP)) |
What if we have some entries in SYS_INDEXES but the table cannot be loaded because the entries in SYS_TABLES, SYS_COLUMNS, SYS_FIELDS are incorrect? In that case, we would fail to drop the table.
Do we really have to load a table definition into the cache? Would it suffice to unconditionally execute some simpler SQL to delete the corresponding entries from SYS_TABLES, SYS_INDEXES, SYS_FIELDS, SYS_COLUMNS, during a slow shutdown when :autoshrink is enabled? What really matters here is a call to dict_drop_index_tree(). That could be executed by row_purge_remove_clust_if_poss_low() as part of the slow shutdown.
The basic algorithm would be like this:
- Collect the table ID from SYS_TABLES and report the table names. (Check for SYS_TABLES.SPACE=0 directly from each record.)
- For each table ID, execute the SQL to DELETE FROM SYS_… WHERE SPACE=0 AND TABLE_ID=:id.
- Let the purge run into completion. It will take care of invoking dict_drop_index_tree(), and it’s already checking for SYS_INDEXES.PAGE=FIL_NULL there.
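
The first two steps of this algorithm could be modelled roughly as follows. This is a standalone sketch over an in-memory list, not InnoDB code: `sys_tables_row`, `collect_unknown_ids()` and `delete_dictionary_rows()` are hypothetical names, the '/' name filter is carried over from the PR description, the system-table exclusion is omitted for brevity, and the actual SQL execution is left to a caller-supplied callback. The third step (purge) happens outside this sketch.

```cpp
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for the SYS_TABLES fields that matter here.
struct sys_tables_row
{
  std::string name;
  uint64_t id;
  uint32_t space;
};

// Step 1: collect the IDs of tables in space 0 whose name lacks '/',
// reporting each name through the callback.
static std::vector<uint64_t>
collect_unknown_ids(const std::vector<sys_tables_row> &rows,
                    const std::function<void(const std::string&)> &report)
{
  std::vector<uint64_t> ids;
  for (const sys_tables_row &row : rows)
  {
    if (row.space != 0 || row.name.find('/') != std::string::npos)
      continue;
    report(row.name);
    ids.push_back(row.id);
  }
  return ids;
}

// Step 2: run the delete callback once per ID; stop at the first failure.
static bool
delete_dictionary_rows(const std::vector<uint64_t> &ids,
                       const std::function<bool(uint64_t)> &delete_rows)
{
  for (uint64_t id : ids)
    if (!delete_rows(id))
      return false;
  return true;
}
```

Because nothing is loaded into the dictionary cache, a table with inconsistent SYS_COLUMNS or SYS_FIELDS entries would still be selected by its SYS_TABLES record alone.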
| "DELETE FROM SYS_INDEXES WHERE TABLE_ID=:id;\n" | ||
| "DELETE FROM SYS_FIELDS WHERE INDEX_ID IN\n" | ||
| " (SELECT ID FROM SYS_INDEXES WHERE TABLE_ID=:id);\n" |
It would be clearer to first delete from SYS_FIELDS, then from SYS_INDEXES.
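
For illustration, the two statements from the hunk in the suggested order. The statement text is taken from the diff; the constant name `drop_index_metadata_sql` is hypothetical, and this has not been run against InnoDB's internal SQL parser. With this order, the subquery on SYS_INDEXES is evaluated while the index records still exist.

```cpp
#include <string_view>

// Hypothetical constant: SYS_FIELDS is cleaned up first, while the
// SYS_INDEXES rows that the subquery depends on are still present.
static constexpr std::string_view drop_index_metadata_sql=
  "DELETE FROM SYS_FIELDS WHERE INDEX_ID IN\n"
  " (SELECT ID FROM SYS_INDEXES WHERE TABLE_ID=:id);\n"
  "DELETE FROM SYS_INDEXES WHERE TABLE_ID=:id;\n";
```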
| if (err == DB_SUCCESS) | ||
| { | ||
| trx->commit(); | ||
| ut_ad(deleted.empty()); |
deleted.empty() should hold irrespective of the err value.
| for (pfs_os_file_t d : deleted) | ||
| os_file_close(d); |
There should be no handles of deleted files to close, because these tables are located in the system tablespace (fil_system.sys_space), which will not be deleted.
| sql_print_information("InnoDB: Dropping the unknown table %.*s", | ||
| static_cast<int>(len), rec); |
int(len) is shorter and equivalent to static_cast<int>(len).