BFV VM may be unexpectedly moved to different AZ

Bug #2047182 reported by Damian Dąbrowski
This bug affects 2 people
Affects: OpenStack Compute (nova)
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

A VM may be unexpectedly moved to a different AZ when both of the following hold:
- each availability zone has a separate storage cluster (the [cinder]/cross_az_attach option helps to achieve that)
- default_schedule_zone is not set

When a VM is created from a pre-existing volume, nova places the specific availability zone in request_specs, which prevents the VM from being moved to a different AZ during resize/migrate[1]. In this case, everything works fine.

Unfortunately, problems start in the following cases:
a) the VM is created with the --boot-from-volume argument, which dynamically creates a volume for the VM
b) the VM has only an ephemeral volume

Let's focus on case a), because case b) may not be working "by design".

The _get_volume_from_bdms() method considers only pre-existing volumes[2]. A volume that will be created later on with `--boot-from-volume` does not exist yet, so nova cannot fetch its availability zone.
As a result, request_specs contains '"availability_zone": null' and the VM can be moved to a different AZ during resize/migrate. Because storage is not shared between AZs, this breaks the VM.
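
The difference between the two cases can be illustrated with a small, self-contained model. This is only a sketch of the behaviour described above, not nova's actual code: the BlockDeviceMapping class, the VOLUME_AZ lookup table, and the az_from_existing_volumes() and build_request_spec() functions are all hypothetical names, and the precedence between a requested AZ, default_schedule_zone and the volume's AZ is simplified.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BlockDeviceMapping:
    """Hypothetical, simplified stand-in for a nova block device mapping."""
    source_type: str                 # "volume" = pre-existing volume, "image" = volume created at boot
    destination_type: str            # "volume" or "local"
    volume_id: Optional[str] = None  # only set when the volume already exists


# Hypothetical lookup table standing in for a query to Cinder.
VOLUME_AZ = {"vol-1": "az1"}


def az_from_existing_volumes(bdms):
    """Return the AZ of the first pre-existing volume, or None if there is none."""
    for bdm in bdms:
        if bdm.volume_id is not None:
            return VOLUME_AZ[bdm.volume_id]
    return None


def build_request_spec(bdms, requested_az=None, default_schedule_zone=None):
    """Simplified model of how the AZ stored in the request spec is chosen."""
    az = requested_az or default_schedule_zone or az_from_existing_volumes(bdms)
    return {"availability_zone": az}


# Pre-existing volume: the AZ is pinned in the request spec.
existing = [BlockDeviceMapping("volume", "volume", volume_id="vol-1")]
print(build_request_spec(existing))

# Boot-from-volume: the volume does not exist yet, so no AZ can be recorded.
boot_from_volume = [BlockDeviceMapping("image", "volume")]
print(build_request_spec(boot_from_volume))
```

In this model the second request spec carries no availability zone, so nothing in it would stop a later resize/migrate from picking a host in another AZ.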

It's not easy to fix because:
- the nova API is not aware of the designated AZ at the time it places request_specs in the DB
- looking at the schedule_and_build_instances method[3], we do not create the cinder volumes before downcalling to the compute agent. In general we also do not allow upcalls from the compute agent to the API DB, so it's hard to update request_specs after the volume is created.

Unfortunately, at this point I don't see any easy way to fix this issue.

[1] https://github.com/openstack/nova/blob/d28a55959e50b472e181809b919e11a896f989e3/nova/compute/api.py#L1268C19
[2] https://github.com/openstack/nova/blob/d28a55959e50b472e181809b919e11a896f989e3/nova/compute/api.py#L1247
[3] https://github.com/openstack/nova/blob/master/nova/conductor/manager.py#L1646
