spa_sync_rewrite_vdev_config: update only special vdevs if present #18269
Pesc0 wants to merge 1 commit into openzfs:master
I haven't looked at the code, but another thought I've had is that we could update labels on vdevs that were already written to in that TXG. This way, every vdev that has something new will actually report it.
Regarding solving the actual problem (disk sleep): sure, this works as well. Regarding which approach is better: intuitively it seems better to distribute the writes between vdevs, but as you pointed out, if the special class dies the pool is toast anyway. Neither approach seems more robust than the other, but please correct me if I'm wrong.
behlendorf left a comment
What I think we want is to prefer already dirty top-level vdevs and special devices.
module/zfs/spa.c
Outdated
      vd->vdev_islog ||
    - !vdev_is_concrete(vd))
    + !vdev_is_concrete(vd) ||
    + (has_special_class && vd->vdev_alloc_bias != VDEV_BIAS_SPECIAL))
Let's also write the config to all top-level vdevs which are already dirty. This way they'll reflect the latest changes for minimal extra cost, as @amotin mentioned. You can check if the top-level vdev was dirtied with txg_list_member(&spa->spa_vdev_txg_list, vd, TXG_CLEAN(txg)).
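One reading of the suggestion above, sketched as a standalone model rather than as the actual patch. The struct `model_vdev_t` and the function `model_skip_vdev` are hypothetical names introduced only for illustration; each field stands in for the real check named in its comment, and nothing here calls into ZFS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the vdev properties the condition inspects. */
typedef struct model_vdev {
	bool is_log;		/* models vd->vdev_islog */
	bool is_concrete;	/* models vdev_is_concrete(vd) */
	bool is_special;	/* models vdev_alloc_bias == VDEV_BIAS_SPECIAL */
	bool is_dirty;		/* models txg_list_member(..., TXG_CLEAN(txg)) */
} model_vdev_t;

/*
 * Return true if this top-level vdev should be skipped when rewriting
 * the config. Assumption: a dirty vdev is always kept, and a clean
 * non-special vdev is skipped only when the pool has a special class.
 */
bool
model_skip_vdev(const model_vdev_t *vd, bool has_special_class)
{
	assert(vd != NULL);

	/* Log and non-concrete vdevs are never candidates. */
	if (vd->is_log || !vd->is_concrete)
		return (true);

	/* Already written to in this txg: updating it is nearly free. */
	if (vd->is_dirty)
		return (false);

	/* Clean vdev: leave it alone if special vdevs can take the write. */
	return (has_special_class && !vd->is_special);
}
```

With this shape, a sleeping rotational vdev is touched only when it was already dirtied in the txg or when the pool has no special class at all.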
module/zfs/spa.c
Outdated
      int svdcount = 0;
      int children = rvd->vdev_children;
      int c0 = random_in_range(children);
    + boolean_t has_special_class = spa->spa_special_class->mc_groups != 0;
You can use spa_has_special(spa) for this check.
9af3dd0 to 4a24692
Thanks for helping out :) I'll test again when I can and will report back if it still works.
module/zfs/spa.c
Outdated
      vd->vdev_islog ||
    - !vdev_is_concrete(vd))
    + !vdev_is_concrete(vd) ||
    + !txg_list_member(&spa->spa_vdev_txg_list, vd, TXG_CLEAN(txg)))
It's possible that none of the top-level vdevs will be dirty, so this gets a bit more complicated. From a policy perspective, here's what I think we want:
- Prefer updating the config on all dirty top-level vdevs. Minimum 1.
- If no top-level vdevs are dirty, but the pool contains special vdevs, select those.
- If no top-level vdevs are dirty, and there are no special vdevs, randomly select up to 3 top-level vdevs (current behavior).
Only updating the special vdevs when present is a smaller change (your original PR), but updating everything that's dirty would be nice and a bit more robust. Having more backup copies of the uberblocks is pretty much always a good thing.
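The three-step policy above could be sketched roughly as follows. This is a standalone model under stated assumptions, not the code in the PR: `policy_vdev_t`, `policy_select`, and `POLICY_MAX_FALLBACK` are hypothetical names, `eligible` stands for "not a log vdev and concrete", and the caller passes the starting offset that the real function would pick at random.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define	POLICY_MAX_FALLBACK	3	/* "up to 3" in the fallback case */

/* Hypothetical model of a top-level vdev; not the real vdev_t. */
typedef struct policy_vdev {
	bool eligible;		/* not a log vdev and concrete */
	bool is_special;	/* belongs to the special class */
	bool is_dirty;		/* was written to in this txg */
} policy_vdev_t;

/*
 * Fill out[] with the indices of the vdevs whose config should be
 * rewritten and return how many were chosen. out[] must have room
 * for n entries. start is the rotation offset for the fallback tier.
 */
size_t
policy_select(const policy_vdev_t *vds, size_t n, size_t start, size_t *out)
{
	size_t count = 0;

	assert(out != NULL);

	/* Tier 1: every dirty top-level vdev. */
	for (size_t i = 0; i < n; i++) {
		if (vds[i].eligible && vds[i].is_dirty)
			out[count++] = i;
	}
	if (count > 0)
		return (count);

	/* Tier 2: nothing dirty, so fall back to the special vdevs. */
	for (size_t i = 0; i < n; i++) {
		if (vds[i].eligible && vds[i].is_special)
			out[count++] = i;
	}
	if (count > 0)
		return (count);

	/* Tier 3: no special vdevs either, take up to 3 from the offset. */
	for (size_t i = 0; i < n && count < POLICY_MAX_FALLBACK; i++) {
		size_t idx = (start + i) % n;

		if (vds[idx].eligible)
			out[count++] = idx;
	}
	return (count);
}
```

Each tier returns as soon as it finds at least one candidate, so a clean rotational vdev is reached only in the last tier, when the pool has neither dirty vdevs nor special vdevs.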
This change allows writing the uberblock only to (in this order of preference):
- vdevs dirtied by the current txg
- special vdevs
- 3 random vdevs

This keeps rotational drives asleep if they have not been written to.

Signed-off-by: Pesc0 <Pesc0@users.noreply.github.com>
4a24692 to d4b80ae
Implemented the logic you described. I still need to test whether this works; my test machine is a separate system, so I first need to push the commit to be able to conveniently pull it there. I'm not sure if the check is needed in the first two loops of the function. Also, it's quite hard to keep lines under 80 chars with 8-char tabs and this many levels of nesting. A couple of lines are inevitably over 80; I hope that's not an issue.
Seems to be working, hdd stays asleep on my test system.
Motivation and Context
This change writes the uberblock preferentially to special vdevs only, if any are present, during transaction sync. This keeps rotational drives asleep when they are not in use.
This is an attempt to implement what @amotin suggested here.
How Has This Been Tested?
This has been tested on a Proxmox system and the issue is indeed fixed: the hdd stays asleep. Please keep in mind that I have no knowledge of the system and don't know what I'm doing; this may break other things that I don't know about. I've tried to keep the changes minimal to reduce potential problems.