Activity log for bug #1821696

Date Who What changed Old value New value Message
2019-03-26 08:17:00 Magnus Lööf bug added bug
2019-03-26 08:44:02 Magnus Lööf description
Old value: identical to the new value below, except that "Instances without encrypted volumes started fine." closed the preceding step instead of being its own list item.
New value:

Description
===========
We hit this bug after doing a complete cluster shutdown due to server room maintenance. The bug is, however, more easily reproducible. When cold starting an instance with an encrypted volume attached, it fails to start with a VolumeEncryptionNotSupported error.

https://github.com/openstack/os-brick/blob/stable/rocky/os_brick/encryptors/cryptsetup.py#L52

Steps to reproduce
==================
* Deploy OpenStack with Barbican support using Kolla.
* Create an encrypted volume type.
* Create an encrypted volume.
* Create an instance and attach the encrypted volume.
* Enjoy your new instance and volume, install software and store data.
* In our case, we shut down the entire cluster and restarted it again. First, all instances were stopped in Horizon using the Shut down instance command. We use Ceph, so we then stopped that using the procedures at https://ceph.com/planet/how-to-do-a-ceph-cluster-maintenance-shutdown/, then shut down the compute / storage nodes, and then the controller nodes one by one. Then we started the cluster in the reverse order, verified all services were up and running, examined logs and then started the instances.
* Instances without encrypted volumes started fine.
* Instances with encrypted volumes fail to start with VolumeEncryptionNotSupported.

Note: It is possible to recreate the problem by using a Hard Reboot (possibly related: https://bugs.launchpad.net/nova/+bug/1597234) or by shutting down instances and then restarting all OpenStack services on that compute node.

Expected results
================
Instances with encrypted volumes should start fine, even after a Hard Reboot or a complete cluster shutdown.

Actual results
==============
Instances with encrypted volumes failed to start with VolumeEncryptionNotSupported.

https://pastebin.com/mvMbJQRb

Environment
===========
1. OpenStack version: environment is established by Kolla (Rocky release).
2. Hypervisor: KVM on RHEL
3. Storage type: Ceph using Kolla (Rocky release)

Analysis
========
There seems to be a problem related to this code not behaving as expected:

https://github.com/openstack/nova/blob/stable/rocky/nova/virt/libvirt/driver.py#L1049

The intent appears to be that the exception is logged and ignored, but for some reason `ctxt.reraise = False` does not work as expected: `self.force_reraise()` is called in https://github.com/openstack/oslo.utils/blob/stable/rocky/oslo_utils/excutils.py#L220, which should not have been reached since `reraise` is expected to be `False`.

We did some hacking and just swallowed the exception by commenting out the `excutils.save_and_reraise_exception()` section and replacing it with a simple `pass`. Then the instance booted, but it could not boot from the image. It was then possible, however, to remove the encrypted volume attachment, reboot the server and then reattach the encrypted volume.
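
For readers unfamiliar with the pattern discussed in the analysis, the sketch below illustrates the semantics the reporter expected from `ctxt.reraise = False`. It does not reproduce the bug and is not the nova or oslo.utils code: `SaveAndReraise` is a simplified, hypothetical stand-in for `excutils.save_and_reraise_exception()`, `VolumeEncryptionNotSupported` is a local placeholder for the os-brick exception, and `cleanup_volume` is an invented helper. The point it shows is that the saved exception is swallowed only when the handler actually clears the flag; if the condition guarding `ctxt.reraise = False` is not met, the exception is re-raised on exit.

```python
import logging
import sys

LOG = logging.getLogger(__name__)


class VolumeEncryptionNotSupported(Exception):
    """Local placeholder for the os-brick exception named in the report."""


class SaveAndReraise:
    """Simplified stand-in (assumption) for save_and_reraise_exception()."""

    def __init__(self, reraise=True):
        self.reraise = reraise
        self.exc = None

    def __enter__(self):
        # Save the exception that is being handled when the block is entered.
        self.exc = sys.exc_info()[1]
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        if exc_type is not None:
            # A new exception escaped the block; let that one propagate.
            return False
        if self.reraise and self.exc is not None:
            # The flag was never cleared, so the saved exception comes back.
            raise self.exc
        return False


def cleanup_volume(error, ignore_unsupported=True):
    """Hypothetical cleanup step that fails with `error`."""
    try:
        raise error
    except Exception as exc:
        with SaveAndReraise() as ctxt:
            if ignore_unsupported and isinstance(exc, VolumeEncryptionNotSupported):
                # Only this branch suppresses the saved exception.
                ctxt.reraise = False
                LOG.warning("Ignoring encryptor error during cleanup: %s", exc)
    return "ignored"
```

Calling `cleanup_volume(VolumeEncryptionNotSupported("luks"))` returns `"ignored"`, while `cleanup_volume(RuntimeError("boom"))` raises `RuntimeError`, because the guard that clears the flag is never entered. When a re-raise is observed in code of this shape, checking whether the guarding condition was actually satisfied is one reasonable first step.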
2019-03-26 14:13:32 Magnus Lööf tags volumes
2019-03-26 14:13:50 Magnus Lööf tags volumes encryption volumes
2019-03-29 07:19:06 pandatt nova: status New Confirmed
2019-03-29 07:19:12 pandatt nova: assignee pandatt (pandatt)
2019-04-04 09:17:24 OpenStack Infra nova: status Confirmed In Progress
2019-04-04 09:17:24 OpenStack Infra nova: assignee pandatt (pandatt) Lee Yarwood (lyarwood)
2019-04-04 11:21:37 Lee Yarwood affects nova kolla
2019-04-04 11:22:05 Lee Yarwood affects kolla nova
2019-04-04 11:22:26 Lee Yarwood bug task added kolla
2019-04-08 08:48:50 Mark Goddard affects kolla kolla-ansible
2019-04-10 13:33:25 Mark Goddard kolla-ansible: status New In Progress
2019-04-10 13:33:28 Mark Goddard kolla-ansible: importance Undecided High
2019-04-10 13:33:31 Mark Goddard kolla-ansible: assignee Mark Goddard (mgoddard)
2019-04-10 13:33:51 Mark Goddard nominated for series kolla-ansible/stein
2019-04-10 13:33:51 Mark Goddard bug task added kolla-ansible/stein
2019-04-10 13:33:51 Mark Goddard nominated for series kolla-ansible/rocky
2019-04-10 13:33:51 Mark Goddard bug task added kolla-ansible/rocky
2019-04-10 13:33:59 Mark Goddard kolla-ansible/stein: milestone 8.0.0
2019-04-29 22:03:53 OpenStack Infra nova: status In Progress Fix Released
2019-05-04 13:05:59 Matt Riedemann nominated for series nova/queens
2019-05-04 13:05:59 Matt Riedemann bug task added nova/queens
2019-05-04 13:05:59 Matt Riedemann nominated for series nova/rocky
2019-05-04 13:05:59 Matt Riedemann bug task added nova/rocky
2019-05-04 13:05:59 Matt Riedemann nominated for series nova/stein
2019-05-04 13:05:59 Matt Riedemann bug task added nova/stein
2019-05-04 13:06:09 Matt Riedemann nova/queens: status New In Progress
2019-05-04 13:06:12 Matt Riedemann nova/rocky: status New In Progress
2019-05-04 13:08:11 Matt Riedemann nova/stein: status New In Progress
2019-05-04 13:08:24 Matt Riedemann nova: importance Undecided Medium
2019-05-04 13:08:28 Matt Riedemann nova/rocky: importance Undecided Medium
2019-05-04 13:08:30 Matt Riedemann nova/queens: importance Undecided Medium
2019-05-04 13:08:32 Matt Riedemann nova/stein: importance Undecided Medium
2019-05-04 13:08:54 Matt Riedemann nova/queens: assignee Lee Yarwood (lyarwood)
2019-05-04 13:08:59 Matt Riedemann nova/rocky: assignee Lee Yarwood (lyarwood)
2019-05-04 13:09:04 Matt Riedemann nova/stein: assignee Lee Yarwood (lyarwood)
2019-05-04 13:09:17 Matt Riedemann tags encryption volumes encryption libvirt volumes
2019-05-04 15:35:34 OpenStack Infra kolla-ansible/stein: status In Progress Fix Committed
2019-06-11 20:32:38 Matt Riedemann nova/stein: status In Progress Fix Released
2019-06-11 22:50:31 OpenStack Infra kolla-ansible/rocky: status New Fix Committed
2019-07-01 15:08:40 Matt Riedemann nova/rocky: status In Progress Fix Released
2019-07-03 02:07:09 OpenStack Infra nova/queens: status In Progress Fix Committed
2019-08-07 08:36:33 Mark Goddard kolla-ansible/stein: status Fix Committed Fix Released
2019-08-07 08:37:18 Mark Goddard kolla-ansible/stein: status Fix Released In Progress
2019-08-07 08:37:23 Mark Goddard kolla-ansible/rocky: status Fix Committed In Progress
2019-08-07 08:37:25 Mark Goddard kolla-ansible: status Fix Committed In Progress
2020-03-11 13:54:16 Mark Goddard kolla-ansible/rocky: importance Undecided High
2020-03-11 13:54:24 Mark Goddard kolla-ansible/rocky: milestone 7.2.1
2020-03-11 13:54:54 Mark Goddard kolla-ansible/rocky: milestone 7.2.1
2020-03-11 13:55:02 Mark Goddard kolla-ansible/stein: status In Progress Triaged
2020-03-11 13:55:07 Mark Goddard kolla-ansible: status In Progress Triaged
2020-03-11 13:55:11 Mark Goddard kolla-ansible/rocky: status In Progress Triaged
2020-03-11 13:55:16 Mark Goddard kolla-ansible: milestone 8.0.0
2020-03-11 13:55:18 Mark Goddard kolla-ansible/stein: milestone 8.0.0
2020-03-11 13:55:22 Mark Goddard kolla-ansible/stein: assignee Mark Goddard (mgoddard)
2020-03-11 13:55:25 Mark Goddard kolla-ansible: assignee Mark Goddard (mgoddard)
2021-01-18 19:16:17 Piotr Parczewski bug added subscriber Piotr Parczewski
2021-06-20 07:56:49 Radosław Piliszek kolla-ansible: status Triaged Invalid
2021-06-20 07:56:56 Radosław Piliszek bug task deleted kolla-ansible/rocky
2021-06-20 07:57:02 Radosław Piliszek bug task deleted kolla-ansible/stein
2021-06-20 07:57:15 Radosław Piliszek kolla-ansible: importance High Undecided