Comment 6 for bug 2019190

melanie witt (melwitt) wrote : Re: [RBD] Retyping of in-use boot volumes renders instances unusable (possible data corruption)

I spent some time on this and I was able to reproduce the bug.

I am not sure exactly how RBD-assisted volume migration is supposed to work, but no call to Nova happens, so Nova doesn't know anything has changed. That point is somewhat moot, though, because AFAICT there is no existing API call that could be used to tell Nova, "point at the new volume location without copying any volume data to it". The only API we have at present is the swap volume API, and there's no way to tell it not to copy volume data.

The other issue I see is that the volume attachment connection_info on the Cinder side does not itself get updated with the new volume location. So even if Nova were able to pull new connection_info from Cinder [1], the instance would still fail to boot because the new volume location isn't there.
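
To make the stale connection_info point concrete, here is a rough Python sketch of the kind of check one could run against an attachment after a retype. It assumes that RBD connection_info carries the image location as "pool/image" in data["name"]; the helper names and the pool and volume names are made up for illustration and are not taken from the reproduction steps.

```python
def rbd_pool_from_connection_info(connection_info):
    """Extract the Ceph pool from RBD connection_info.

    Assumption: data["name"] has the form "<pool>/<image>".
    """
    name = connection_info.get("data", {}).get("name", "")
    pool, sep, _image = name.partition("/")
    if not sep or not pool:
        raise ValueError(f"unexpected RBD name: {name!r}")
    return pool


def attachment_is_stale(connection_info, expected_pool):
    """Return True if the attachment still points at a pool other than
    the one the volume is expected to live in after the retype."""
    return rbd_pool_from_connection_info(connection_info) != expected_pool


# Illustrative data only: an attachment that still references the old pool.
stale_info = {
    "driver_volume_type": "rbd",
    "data": {"name": "old-pool/volume-1234"},
}
print(attachment_is_stale(stale_info, expected_pool="new-pool"))
```

If a check like this reports a stale attachment, refreshing connection_info on the Nova side would not help, because Cinder would hand back the same old location.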

Since we don't have an API to tell Nova about the new volume location without copying data, I'm not sure what we can do to fix this immediately other than reverting the patch that changed the mechanism for RBD volume retype.

For a future fix, I "think" it would not be difficult to add a "do not copy" type of flag to the PUT /servers/{server_id}/os-volume_attachments/{volume_id} API in Nova [2]. Then, after the retype, Cinder could call Nova to say "this volume moved but don't copy any data there".
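
As a rough sketch of what that could look like, the Python below builds the request for the existing update-volume-attachment call, plus an optional extra field. The "volumeAttachment" / "volumeId" body shape is the one documented in [2]; the copy_data field and the helper function are purely hypothetical, do not exist in Nova today, and are only there to illustrate the idea.

```python
import json


def build_volume_attachment_update(server_id, old_volume_id, new_volume_id,
                                   copy_data=None):
    """Build (method, path, body) for an update-volume-attachment request.

    copy_data is HYPOTHETICAL: it stands in for the proposed "do not copy"
    flag and is not part of Nova's API. When left as None, the body matches
    today's swap volume request.
    """
    attachment = {"volumeId": new_volume_id}
    if copy_data is not None:
        # Hypothetical field name, for illustration only.
        attachment["copy_data"] = copy_data
    path = f"/servers/{server_id}/os-volume_attachments/{old_volume_id}"
    return "PUT", path, {"volumeAttachment": attachment}


# Placeholder IDs, not real resources.
method, path, body = build_volume_attachment_update(
    "SERVER_ID", "OLD_VOLUME_ID", "NEW_VOLUME_ID", copy_data=False)
print(method, path)
print(json.dumps(body))
```

A real change would presumably need a new compute API microversion and a decision on how the retyped volume is identified, so treat the field name and semantics here as placeholders.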

Here are the steps I used to reproduce the issue:

https://paste.openstack.org/show/bNpzkjbeXrmTCwNHfDGs

No volumes are encrypted and the [nova] section is configured in cinder.conf.

[1] https://docs.openstack.org/nova/latest/cli/nova-manage.html#volume-attachment-refresh
[2] https://docs.openstack.org/api-ref/compute/?expanded=update-a-volume-attachment-detail#update-a-volume-attachment