Nova's rescue and unrescue assume os-brick connect_volume is idempotent

Bug #2020699 reported by Gorka Eguileor
This bug affects 2 people
Affects: OpenStack Compute (nova)
Status: Triaged
Importance: High
Assigned to: Unassigned
Milestone: (none)

Bug Description

The rescue and unrescue operations in Nova assume that calls to `connect_volume` in os-brick are idempotent. That is currently true, but it is not something we ever guaranteed in os-brick.

With the recent CVE [1][2] we realized that, in `connect_volume`, os-brick cannot assume that one or more devices already present for the provided connection information belong to the right volume. Even if it is the right volume, os-brick cannot assume that sysfs holds the right information for it (such as the volume size). So it needs to clean things up to the best of its ability before actually connecting and, just in case, confirm right before returning a path to the caller that the device it is about to return is correct and consistent (for example, that the multipath only has devices with the same size and SCSI ID).
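To make the "confirm just before returning" step concrete, here is a minimal sketch of such a consistency check. It is not os-brick code: `PathDevice` and `is_consistent` are hypothetical names, and the sizes and SCSI IDs are assumed to have already been read from the host.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PathDevice:
    """Hypothetical record for one device that backs a multipath."""
    name: str
    size_bytes: int
    scsi_id: str


def is_consistent(devices: List[PathDevice]) -> bool:
    """Return True only if every device reports the same size and SCSI ID.

    An empty list is treated as inconsistent, since there is nothing
    safe to hand back to the caller.
    """
    if not devices:
        return False
    sizes = {d.size_bytes for d in devices}
    scsi_ids = {d.scsi_id for d in devices}
    return len(sizes) == 1 and len(scsi_ids) == 1
```

In this sketch a leftover device from a previous attachment shows up as a second size or a second SCSI ID in the set, which is the situation where returning the path would risk exposing the wrong data.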

This means that, once this patch [3] merges, os-brick's `connect_volume` will by design no longer be idempotent, in order to prevent data leaks in any corner cases.

This will break the rescue and unrescue operations in Nova: on rescue, Nova stashes the original XML [4] and on unrescue it unstashes it [5], but in between Nova calls `connect_volume` for the rescue instance, effectively disconnecting the original device path.

This means that the reused original path points either to a non-existent device or to a volume of another instance.
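The two failure modes can be illustrated with a toy model. This is a sketch under stated assumptions, not Nova or os-brick code: `FakeHost` is a hypothetical stand-in for a compute host, its `connect_volume` cleans up any existing device for the volume before attaching it again, and the model assumes the volume comes back under a different device name than the one just removed.

```python
class FakeHost:
    """Toy compute host with a non-idempotent connect_volume (hypothetical)."""

    NAMES = [f"/dev/sd{c}" for c in "abcdefgh"]

    def __init__(self):
        self.devices = {}  # device path -> volume id

    def connect_volume(self, volume_id):
        # Clean up whatever is already present for this volume.
        old_paths = [p for p, v in self.devices.items() if v == volume_id]
        for path in old_paths:
            del self.devices[path]
        # Assumption: the volume reappears under a name other than the
        # one(s) just removed; otherwise the lowest free name is used.
        new_path = next(
            n for n in self.NAMES
            if n not in self.devices and n not in old_paths
        )
        self.devices[new_path] = volume_id
        return new_path


def demo():
    host = FakeHost()
    stashed_path = host.connect_volume("vol-1")   # path saved in the stashed XML
    rescue_path = host.connect_volume("vol-1")    # rescue reconnects the volume
    missing = stashed_path not in host.devices    # case 1: device no longer exists
    host.connect_volume("vol-2")                  # another instance attaches a volume
    owner = host.devices.get(stashed_path)        # case 2: path now maps to vol-2
    return stashed_path, rescue_path, missing, owner
```

In this model the stashed path first disappears, which corresponds to the non-existent device case, and is then taken over by a different volume, which corresponds to the other-instance case.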

We can see an example of the non-existent device case in the failed CI job [6] where test `tempest.api.compute.servers.test_server_rescue.ServerStableDeviceRescueTest.test_stable_device_rescue_disk_virtio_with_volume_attached` fails with a nova-compute error [7]:

  libvirt.libvirtError: Cannot access storage file '/dev/sdd': No such file or directory

[1]: https://nvd.nist.gov/vuln/detail/CVE-2023-2088

[2]: https://bugs.launchpad.net/nova/+bug/2004555

[3]: https://review.opendev.org/c/openstack/os-brick/+/882841

[4]: https://github.com/openstack/nova/blob/71b105a4cfea054827e09b5b8df6be845909275a/nova/virt/libvirt/driver.py#L4229-L4232

[5]: https://github.com/openstack/nova/blob/71b105a4cfea054827e09b5b8df6be845909275a/nova/virt/libvirt/driver.py#L4323-L4328

[6]: https://a30336fa6a8fca5c6dba-fe779e5654b21fdff79727b204dfb7d6.ssl.cf1.rackcdn.com/882841/3/check/os-brick-src-tempest-lvm-lio-barbican/8ef7adf/testr_results.html

[7]: https://zuul.opendev.org/t/openstack/build/8ef7adf6a82248d8b9f94eb5b5bba73c/log/controller/logs/screen-n-cpu.txt?severity=4#77239

sean mooney (sean-k-mooney) wrote :

setting this to high since this design blocks https://review.opendev.org/c/openstack/os-brick/+/882841 from merging, which in turn blocks fully resolving https://bugs.launchpad.net/nova/+bug/2004555

tags: added: cinder libvirt rescue volumes
Changed in nova:
status: New → Triaged
importance: Undecided → Medium
importance: Medium → High