Deleting 2 instances with a common multi-attached volume can leave the volume attached

Bug #1767363 reported by Matthew Booth
Affects: OpenStack Compute (nova)
Status: Confirmed
Importance: Medium
Assigned to: Lee Yarwood

Bug Description

CAVEAT: The following is only from code inspection. I have not reproduced the issue.

During instance delete, we call:

  driver.cleanup():
    foreach volume:
      _disconnect_volume():
        if _should_disconnect_target():
          disconnect_volume()

There is no volume-specific or global locking around _disconnect_volume that I can see in this call graph.

_should_disconnect_target() is intended to check for multi-attached volumes on a single host, to prevent a volume being disconnected while it is still in use by another instance. It does:

  volume = cinder->get_volume()
  connection_count = count of volume.attachments where instance is on this host

As there is no locking between the above operation and the subsequent disconnect_volume(), 2 simultaneous calls to _disconnect_volume() can both return False from _should_disconnect_target(). Not only this, but as this involves both a slow call out to cinder and a db lookup, this is likely to be easily hit in practice for example by an orchestration tool mass-deleting instances.

Also note that there are many call paths which call _disconnect_volume() apart from cleanup(), so there are likely numerous other potential interactions here.

The result would be that all attachments are deleted, but the volume remains attached to the host.

tags: added: cinder volumes
Matt Riedemann (mriedem)
tags: added: libvirt multiattach
Revision history for this message
lucky (luckysingh) wrote :

Yes, it is a bug; I have reproduced it on my system. The reproduction steps are below:

Precondition:
Create two instances and attach both to a multiattach volume.
Their instance IDs are: c78bdc4b-ebb0-41d2-a435-6f56791a9604 and 0efcd163-13ab-4b70-9449-ab89301be1cf

1. Detach the volume from the first instance:

[root@openstackq ~(keystone_admin)]# nova volume-detach c78bdc4b-ebb0-41d2-a435-6f56791a9604 4361ce05-e325-40c3-8b2c-5bcaeedf4260

2. Now detach the volume from the second instance:

[root@openstackq ~(keystone_admin)]# nova volume-detach 0efcd163-13ab-4b70-9449-ab89301be1cf 4361ce05-e325-40c3-8b2c-5bcaeedf4260

The second step executed without any error message.

However, if we check the output of cinder list, the volume's status is shown as "detaching", but after a few seconds it changes from "detaching" back to "in-use".

3. Check the cinder list output:

[root@openstackq ~(keystone_admin)]# cinder list
+--------------------------------------+-----------+-----------+------+-------------+----------+---------------------------------------------------------------------------+
| ID | Status | Name | Size | Volume Type | Bootable | Attached to |
+--------------------------------------+-----------+-----------+------+-------------+----------+---------------------------------------------------------------------------+
| 4361ce05-e325-40c3-8b2c-5bcaeedf4260 | detaching | - | 1 | multiattach | false | 0efcd163-13ab-4b70-9449-ab89301be1cf |
| 946009e9-a08f-46d4-88f8-c302d2ec9d75 | in-use | sharedvol | 1 | multiattach | false | 2bf44c23-87b4-4d84-81b6-bc2b1f792d12,c79ad598-be11-490c-8fd0-58c6d72bbe4f |
+--------------------------------------+-----------+-----------+------+-------------+----------+---------------------------------------------------------------------------+

Changed in nova:
status: New → Confirmed
Revision history for this message
lucky (luckysingh) wrote :

One more thing worth mentioning: the above case only occurs when detaching the last instance from the volume. It does not occur if, for example, four instances are attached to a single volume and we delete two of them.

Revision history for this message
Lee Yarwood (lyarwood) wrote :

_disconnect_volume doesn't delete host volume attachments so the worst thing that can happen with this race is that the underlying volume connection isn't cleaned up on the host when multiple requests race each other. I'll throw a volume_id specific lock around the method to avoid this.

FWIW c#1 and c#2 aren't related to this race; they show a failure to detach, resulting in n-cpu rolling the detach back in c-api.
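The shape of a per-volume lock fix can be sketched as follows (hypothetical helper; Nova itself would use its utils.synchronized decorator with a lock name derived from the volume ID):

```python
import threading
from collections import defaultdict

# One lock per volume ID, created on first use (assumed helper; the
# real fix would use Nova's synchronized decorator instead).
_volume_locks = defaultdict(threading.Lock)

def disconnect_volume_serialized(volume_id, should_disconnect, disconnect):
    # Hold the per-volume lock across *both* the attachment check and
    # the disconnect, so the check cannot go stale between the two
    # steps when concurrent callers race on the same volume.
    with _volume_locks[volume_id]:
        if should_disconnect(volume_id):
            disconnect(volume_id)
```

Locking per volume rather than globally keeps unrelated detaches (different volumes) concurrent while still serialising the racy pair of operations.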

Revision history for this message
Lee Yarwood (lyarwood) wrote :

This is actually super awkward as locking _disconnect_volume and/or _should_disconnect_target isn't enough here as we are reliant on the compute manager / block device actually deleting the attachments. Anyway here's the current list of places where we call _disconnect_volume in the libvirt virt driver:

# egrep '(def\ | self._disconnect_volume)' nova/virt/libvirt/driver.py | grep -B1 self._disconnect_volume
    def _cleanup(self, context, instance, network_info, block_device_info=None,
                self._disconnect_volume(context, connection_info, instance)
--
    def attach_volume(self, context, connection_info, instance, mountpoint,
                self._disconnect_volume(context, connection_info, instance,
--
    def swap_volume(self, context, old_connection_info,
            self._disconnect_volume(context, new_connection_info, instance)
                self._disconnect_volume(context, new_connection_info, instance)
        self._disconnect_volume(context, old_connection_info, instance)
--
    def detach_volume(self, context, connection_info, instance, mountpoint,
        self._disconnect_volume(context, connection_info, instance,
--
    def post_live_migration(self, context, instance, block_device_info,
                self._disconnect_volume(context, connection_info, instance)
--
    def migrate_disk_and_power_off(self, context, instance, dest,
            self._disconnect_volume(context, connection_info, instance)

Revision history for this message
Lee Yarwood (lyarwood) wrote :

I think the better option here is to replace calls to _disconnect_volume with calls to block_device.detach():

https://github.com/openstack/nova/blob/e0f088c95d05e9cf32d4af4c7cfc20566b17f8e1/nova/virt/block_device.py#L469-L477
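The idea can be illustrated with a heavily simplified sketch (names and signatures are illustrative, not the real Nova API; see the linked nova/virt/block_device.py for the actual code): a single block-device-layer method owns both the Cinder attachment bookkeeping and the host-side disconnect, so the two steps cannot be interleaved independently by different callers.

```python
class BlockDevice:
    """Hypothetical stand-in for the block-device layer's detach path."""

    def __init__(self, volume_id, volume_api, virt_driver):
        self.volume_id = volume_id
        self.volume_api = volume_api
        self.virt_driver = virt_driver

    def detach(self, context, instance):
        # Delete this instance's attachment record first, so a later
        # attachment count on this host reflects reality...
        self.volume_api.attachment_delete(context, self.volume_id, instance)
        # ...then decide, with up-to-date state, whether the host
        # connection should also be torn down.
        self.virt_driver.disconnect_volume(context, self.volume_id, instance)
```

Routing every caller through one such method (instead of each virt-driver path calling _disconnect_volume directly) fixes the ordering problem that a lock alone cannot: the attachment deletion happens before, not after, the disconnect decision.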

Changed in nova:
importance: Undecided → Medium
Lee Yarwood (lyarwood)
Changed in nova:
assignee: nobody → Lee Yarwood (lyarwood)