Activity log for bug #2025297

Date Who What changed Old value New value Message
2023-06-28 22:07:33 Paul Goins bug added bug
2023-06-28 22:07:33 Paul Goins attachment added Juju logs including traceback https://bugs.launchpad.net/bugs/2025297/+attachment/5682589/+files/cinder_lvm_reinit_bug.txt
2023-06-28 22:07:50 Paul Goins bug task added charm-helpers
2023-06-29 05:14:06 Paul Goins bug added subscriber Canonical Field High
2023-06-29 05:16:37 Paul Goins description updated (corrects the lvm.conf path to /etc/lvm/lvm.conf; the old and new values are otherwise identical). New value:

During a recent maintenance window, an attempted workaround to https://bugs.launchpad.net/cinder/+bug/2023073 was put into place, involving an edit of /etc/lvm/lvm.conf to set "wipe_signatures_when_zeroing_new_lvs = 0". Unfortunately, at one point the file was rewritten incorrectly, leaving that line as "wipe_signatures_when_zeroing_new_lvs = ", i.e. with no value for the key. This caused certain charmhelpers functions to return incorrect answers about whether disks were already initialized for LVM, resulting in the charm re-initializing the disks.

I'm attaching an excerpt from the Juju unit logs for cinder-lvm which shows the problem. A summary of what happened (sketched in code after this list):

* reactive/layer_openstack.py:101:run_storage_backend is entered, and its code ends up calling CinderLVMCharm.cinder_configuration(), which in turn calls configure_block_devices().
* configure_block_devices() calls configure_lvm_storage().
* In configure_lvm_storage(), is_lvm_physical_volume(device) returns False, has_partition_table(device) also returns False, and thus prepare_volume(device) gets called.
* prepare_volume(device) calls clean_storage(device), which in turn calls zap_disk(device).
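(The following is a minimal sketch reconstructed from the call chain above, not the actual charm or charmhelpers source; the stub bodies are simplified for illustration only.)

    from subprocess import check_output, CalledProcessError

    def is_lvm_physical_volume(device):
        # The charmhelpers check quoted later in this report.
        try:
            check_output(['pvdisplay', device])
            return True
        except CalledProcessError:
            return False

    def has_partition_table(device):
        raise NotImplementedError  # real code looks for MBR/GPT signatures

    def prepare_volume(device):
        raise NotImplementedError  # real code: clean_storage() -> zap_disk()

    def configure_lvm_storage(device, overwrite=False):
        # If both guards return False, the device is treated as blank
        # and re-initialized, destroying any data it held.
        if not is_lvm_physical_volume(device):
            if overwrite or not has_partition_table(device):
                prepare_volume(device)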
See the attachment for a detailed traceback.

Re: the has_partition_table() check: I confirmed that the configuration value "overwrite" is set to False, and the configured block-devices all lacked MBR/GPT partition tables, so that check wouldn't have blocked this. That leaves the is_lvm_physical_volume() check as the only protection applicable in this case against accidental re-initialization. In this particular charm version, it is implemented in the charmhelpers code as follows:

    try:
        check_output(['pvdisplay', block_device])
        return True
    except CalledProcessError:
        return False

Basically, anything that would cause the above command to fail - like, perhaps, a misconfigured /etc/lvm/lvm.conf - may cause this check to falsely report that a device is *not* an LVM physical volume, resulting in it getting re-initialized.

In summary: I believe this is a critical bug in charmhelpers which also critically impacts the cinder-lvm charm, with the risk of blowing away data on configured LVM devices whenever /etc/lvm/lvm.conf is misconfigured.
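The underlying failure mode is that pvdisplay exits non-zero both when the device is not a PV and when LVM cannot even parse its own configuration. A minimal sketch of one possible hardening (my own illustration under that assumption, not a proposed charmhelpers patch) probes the tooling first and refuses to answer when it is unhealthy:

    from subprocess import check_output, CalledProcessError

    def is_lvm_physical_volume(block_device):
        # Health probe: a bare 'pvs' fails when lvm.conf cannot be
        # parsed, which would make *every* per-device check fail too.
        try:
            check_output(['pvs'])
        except CalledProcessError as e:
            # Refuse to guess: callers treat False as "safe to wipe".
            raise RuntimeError('LVM tooling is unhealthy; cannot tell '
                               'whether %s is a PV' % block_device) from e
        try:
            check_output(['pvdisplay', block_device])
            return True
        except CalledProcessError:
            return False

Something like 'lvmconfig --validate' could serve as the health probe instead; the point is simply that a tooling failure must not be conflated with "this device holds no LVM data".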
2023-06-29 18:41:29 Paul Goins attachment removed Juju logs including traceback https://bugs.launchpad.net/charm-cinder-lvm/+bug/2025297/+attachment/5682589/+files/cinder_lvm_reinit_bug.txt
2023-06-29 18:43:46 Paul Goins attachment added traceback.txt https://bugs.launchpad.net/charm-cinder-lvm/+bug/2025297/+attachment/5682920/+files/traceback.txt
2023-06-30 17:04:23 Corey Bryant charm-cinder-lvm: status New Triaged
2023-06-30 17:04:32 Corey Bryant charm-cinder-lvm: importance Undecided High
2023-06-30 23:30:43 Billy Olsen removed subscriber Canonical Field High