MDEV-7270: Fix replicas not honoring slave_skip_errors for 1677#4653
MDEV-7270: Fix replicas not honoring slave_skip_errors for 1677#4653OmarGamal10 wants to merge 1 commit intoMariaDB:12.3from
Conversation
gkodinov
left a comment
There was a problem hiding this comment.
Thank you for your contribution! This is a preliminary review.
Please squash all of your commits into a single one and add a commit message to it that complies with CODING_STANDARDS.md.
bnestere
left a comment
There was a problem hiding this comment.
Hi @OmarGamal10 !
Referencing your Zulip note
Though, I did this on main, then found out I had to do this on the oldest maintained version, which is 10.6 . Now there's some changes in the Rows_log_event::do_apply_event function, upon which I depended on with my fix, these changes do not exist in 10.6 along with some others, what should I do in this case?
...
I expect it would be to find a way to do the fix for 10.6, but the thing is the main has some helper functions and some modifications ~9 months ago-ish that makes handling the row errors much cleaner.
Let's actually keep this change on 12.3 where such changes exist (that should reduce the size of the patch). --skip-slave-errors is mostly used for testing purposes, not production, so missing this in some earlier versions won't be a big deal, and it reduces risk of accidental bugs in early GA versions. Also note, 10.6 is restricted to critical bug fixes only, so we couldn't put this in 10.6 anyway.
Sorry I wasn't able to answer earlier :( I was on vacation when you initially asked the question on Zulip, and have only just been able to take a look at your PR now.
No worries :). I noticed error handling changed from 10.6 to 12.3, and that |
bnestere
left a comment
There was a problem hiding this comment.
Thank you for rebasing and updating the patch, @OmarGamal10 ! I've left some notes.
ca21a82 to
c32a9d0
Compare
There was a problem hiding this comment.
Thanks for the continued work, @OmarGamal10 !
I've left another round of feedback. It is quite close!
Another note for your git commit, please mention your reviewer (me):
Reviewed-by: Brandon Nesterenko <brandon.nesterenko@mariadb.com>
| call mtr.add_suppression("Slave SQL.*Error executing row event: .Table .test.t. doesn.t exist., Error_code: 1146"); | ||
| call mtr.add_suppression("Slave SQL.*Column 0 of table .test.t. cannot be converted from type.* Error_code: 1677"); | ||
| call mtr.add_suppression("The slave coordinator and worker threads are stopped, possibly leaving data in inconsistent state"); | ||
| call mtr.add_suppression("Got error 1 during COMMIT"); |
There was a problem hiding this comment.
Nor this nor the one above it. You should only need
call mtr.add_suppression("Slave SQL.*Column 0 of table .test.t. cannot be converted from type.* Error_code: 1677");
There was a problem hiding this comment.
I think we don't need any. The pattern for suppressing warnings is added to global_suppressions.
| Field *field= table->field[slave_idx]; | ||
| set_field_to_null(field); | ||
| } | ||
| } |
There was a problem hiding this comment.
I'm not sure we want to always override to NULL. The slave can make it's own
ALTER TABLE <table> ALTER <broken_column> set default <whatever>;
which could be set to NULL if a user wants that behavior, and set differently otherwise. I think we can just scrap this change so the default behavior is taken.
It'd be good to test that as well so it is explicit that behavior is expected, e.g.
connection slave;
ALTER TABLE t ALTER name set default "slave_default";
and also to update the git commit message.
Though I think I recall mentioning at some point "the slave would use NULL", or something like that, but I suppose here I meant whatever its default is, sorry for the miscommunication
There was a problem hiding this comment.
No worries, yes I set it to NULL based on your suggestion, but it does make more sense to set it to the default field. I'll add this and add another row in the test.
gkodinov
left a comment
There was a problem hiding this comment.
LGTM. Please keep working with Brandon on the final review.
Setting slave_skip_errors to 1677 (ER_SLAVE_CONVERSION_FAILED) doesn't work as intended. The slave aborts replication on column mismatch. This fix makes it log a warning and set the problematic field to the default value gracefully. Reviewed-by: Brandon Nesterenko <brandon.nesterenko@mariadb.com>
Summary
slave_skip_errorsconfiguration.Fix: MDEV-7270