Comment 3 for bug 1722577

Matt Riedemann (mriedem) wrote:

I know what changed, and why it's not a 100% failure: it's a race.

Before Pike, when detaching a volume, the nova compute manager would delete the BlockDeviceMapping (BDM) for the volume from the nova database *before* telling cinder to detach the volume, which puts the volume into 'available' status:

a) destroy bdm: https://github.com/openstack/nova/blob/15.0.0/nova/compute/manager.py#L4936

b) mark volume as available: https://github.com/openstack/nova/blob/15.0.0/nova/compute/manager.py#L4941

c) the code in the os-volume_attachments API that turns that 'available' status into a 400 on a subsequent request to detach (sketched after the list):

https://github.com/openstack/nova/blob/15.0.0/nova/api/openstack/compute/volumes.py#L442
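
To show why the ordering of the API-side checks matters, here is a minimal sketch of the DELETE handler's flow; the real code is in the volumes.py link above, and the stand-in exception classes and the find_bdm/detach_volume callables here are hypothetical simplifications:

    class HTTPNotFound(Exception):    # stands in for webob.exc.HTTPNotFound
        pass

    class HTTPBadRequest(Exception):  # stands in for webob.exc.HTTPBadRequest
        pass

    class InvalidVolume(Exception):   # stands in for nova.exception.InvalidVolume
        pass

    def delete_attachment(find_bdm, detach_volume, volume_id):
        # 1. Look up the BDM for this volume. Pre-Pike, step (a) destroys
        #    the BDM first, so a racing second detach fails here with 404.
        bdm = find_bdm(volume_id)
        if bdm is None:
            raise HTTPNotFound('volume_id not found: %s' % volume_id)
        # 2. Only a request that still finds a BDM reaches the detach call,
        #    which validates the volume status with cinder; a volume that
        #    is already 'available' raises InvalidVolume, surfaced as 400.
        try:
            detach_volume(bdm)
        except InvalidVolume as e:
            raise HTTPBadRequest(str(e))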

--

In Pike, the code in the compute manager changed such that the BDM is deleted *after* the volume is marked as available:

a) mark volume as available: https://github.com/openstack/nova/blob/16.0.0/nova/compute/manager.py#L4906

b) destroy bdm: https://github.com/openstack/nova/blob/16.0.0/nova/compute/manager.py#L4921

This means (c) now races with the BDM deletion: the API request might find the BDM before it's deleted, pass the 404 check, and try to detach the already-detached volume, which results in a 400 rather than a 404.
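
To make the window concrete, a minimal sketch of the two orderings (hypothetical simplified names; the real code is at the links above):

    def detach_pre_pike(bdm, volume_api, context, volume_id):
        bdm.destroy()                          # (a) BDM deleted first
        # A concurrent detach request landing here finds no BDM -> 404.
        volume_api.detach(context, volume_id)  # (b) volume -> 'available'

    def detach_pike(bdm, volume_api, context, volume_id):
        volume_api.detach(context, volume_id)  # (a) volume -> 'available'
        # RACE WINDOW: a concurrent detach request landing here still
        # finds the BDM, passes the 404 check, then trips the status
        # check against the now-'available' volume -> 400.
        bdm.destroy()                          # (b) BDM deleted second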

--

Arguably Tempest, as the client, should be checking the volume status before blindly attempting to detach it.
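
Something along these lines on the Tempest side would avoid tripping the race; this is only a sketch, using Tempest's lib clients (volumes_client.show_volume and servers_client.detach_volume both exist in tempest.lib, though the surrounding helper here is hypothetical):

    def detach_if_attached(servers_client, volumes_client,
                           server_id, volume_id):
        volume = volumes_client.show_volume(volume_id)['volume']
        if volume['status'] != 'in-use':
            # Already detached (or never attached); a blind detach here
            # races with the compute manager and can 400 as described above.
            return
        servers_client.detach_volume(server_id, volume_id)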