cross_az_attach=False doesn't honor BDMs with source=image and dest=volume
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Compute (nova) | Confirmed | Wishlist | Unassigned |
Bug Description
The config flag cross_az_attach allows an instance to be pinned to the related volume AZ if the value of that config option is set to False.
We fixed the case of a volume-backed instance by https:/
Since the volume is created based on the current instance.AZ, it does respect the current AZ, but since the instance isn't pinned to that AZ, it can move from an AZ to another while the volume will continue to exist in the original AZ.
As a consequence, the problem only becomes visible after a move operation, but it has existed since the instance was created.
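The sequence described above can be illustrated with a small toy model. This is only a sketch and not Nova code: the `Instance` and `Volume` classes and the helper functions are hypothetical names made up for illustration, and the only behaviour modelled is what the description states (the volume is created in the instance's current AZ, and an instance without a requested AZ is free to move).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Volume:
    az: str


@dataclass
class Instance:
    requested_az: Optional[str]  # AZ from the create request; None if unset
    current_az: str              # AZ of the host the instance runs on
    volume: Optional[Volume] = None


def boot_with_image_to_volume_bdm(instance: Instance) -> None:
    """Create the volume in the AZ the instance currently occupies."""
    instance.volume = Volume(az=instance.current_az)


def move(instance: Instance, target_az: str) -> None:
    """Move the instance; only a requested AZ restricts the target."""
    if instance.requested_az is not None and target_az != instance.requested_az:
        raise ValueError("target AZ conflicts with the requested AZ")
    instance.current_az = target_az


def violates_cross_az_attach(instance: Instance, cross_az_attach: bool) -> bool:
    """True when cross-AZ attachment is disallowed but has happened anyway."""
    if cross_az_attach or instance.volume is None:
        return False
    return instance.volume.az != instance.current_az


# Instance created without an AZ in the request, landing in az1.
server = Instance(requested_az=None, current_az="az1")
boot_with_image_to_volume_bdm(server)
print(violates_cross_az_attach(server, cross_az_attach=False))  # False

# Nothing pins the instance, so a move to az2 is accepted.
move(server, "az2")
print(violates_cross_az_attach(server, cross_az_attach=False))  # True
```

In this model the mismatch only shows up after `move`, even though the missing pin was already there at boot time, which is the point made above.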
=== ORIGINAL BUG REPORT BELOW ===
Before I start, let me describe the agents involved in the migration and/or resize flow of OpenStack (in this case, the Nova component). This is the mapping and interpretation I put together while troubleshooting the reported problem.
- Nova-API: the agent responsible for receiving the HTTP requests (create/
- Nova-conductor: the agent responsible to "conduct/guide" the workflow. Nova-conductor will read the commands from the RPC queue and then process the request from Nova-API. It does some extra validation, and for every command (create/
- Nova-scheduler: the agent responsible for scheduling VMs on hosts. It defines where a VM must reside. It receives the "select host request" and runs the algorithms that determine where the VM can be allocated. Before applying the scheduling algorithms, it queries the Placement system to get the possible hosts where the VM might be allocated, that is, hosts that fit the requested parameters, such as being in a given Cell or availability zone (AZ) and having enough free computing resources to support the VM. The call from Nova-scheduler to Placement is an HTTP request.
- Placement: behaves as an inventory system. It tracks where resources are allocated, their characteristics, and providers (hosts/
- Nova: the agent responsible for executing/processing the commands and implementing actions in the hypervisor.
Then, we have the following workflow from the different processes.
- migrate: Nova API ->(via RPC call -- nova.conductor.
- resize: Nova API ->(via RPC call -- nova.conductor.
As a side note, this mapping also explains why the "resize" was not executing the CPU compatibility check that the "migration" is executing (this is something else that I was checking, but it is worth mentioning here). The resize is basically a cold migration to a new host, where a new flavor (definition of the VM) is applied; thus, it does not need to evaluate CPU feature set compatibility.
The problem we are reporting happens with both "migrate" and "resize" operations. Therefore, I had to add some logs to see what was going on there (that whole process is/was "logless"). The issue happens because Placement always returns all hosts of the environment for a given VM being migrated (resize is a migration process); this only happens if the VM is deployed without defining its availability zone in the request spec.
To be more precise, Nova-conductor in `nova.conductor
That raised a question. How is it possible that the create (deploy VM) process works? It works because of the parameter "cross_az_attach" configured in Nova. As we can see in https:/
After discovering all that, we were under the impression that OpenStack was designed to have (require) different Cells to implement multiple AZs. Therefore, we assumed that the problem was caused due to this code/line (https:/
However, while discussing, and after checking the documentation (https:/
We also discussed if the Placement should be the one doing this check before returning the possible hosts to migrate the VM to. However, this does not seem to be in the Placement context/
Furthermore, the solution proposed (https:/
Following the same process that is used with Nova cells, we propose the solution for this situation at https:/
Any other comments and reviews are welcome!
Changed in nova:
status: New → In Progress
Changed in nova:
status: In Progress → Invalid
setting this to invalid
we should discuss this more; however, it is perfectly valid for a resize to move a VM to a different AZ as long as the VM was not created with an AZ in the original request.
AZs are not expected to align to cells, and you can have multiple cells in the same AZ and multiple AZs in the same cell concurrently.
if an instance does not request an AZ in the VM create request, there is no expectation that the VM should be tied to the AZ for its lifetime when migrated, resized, evacuated or otherwise moved.
adding scheduler support for cross_az_attach would be a feature, not a bug, which is why I marked this as invalid.
the current behaviour is expected.
the default_schedule_zone option (https://docs.openstack.org/nova/latest/configuration/config.html#DEFAULT.default_schedule_zone)
can be used to force the VMs to have an AZ when one is not requested.
if that is not set and no AZ is provided, VMs are expected to be able to float across AZs.
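A rough sketch of the rule described in this comment, again as a toy model rather than actual scheduler code. The function names, the host names and the host-to-AZ mapping are all hypothetical; the only assumption taken from the thread is that a requested AZ (or, failing that, a configured default_schedule_zone) restricts where the VM may go, and that without either the VM can land on any host.

```python
from typing import Dict, List, Optional


def effective_az(requested_az: Optional[str],
                 default_schedule_zone: Optional[str]) -> Optional[str]:
    """AZ the VM is tied to: the requested one, else the configured default."""
    return requested_az if requested_az is not None else default_schedule_zone


def candidate_hosts(host_az: Dict[str, str], az: Optional[str]) -> List[str]:
    """Hosts a move may target; with no AZ, every host is a candidate."""
    if az is None:
        return sorted(host_az)
    return sorted(host for host, zone in host_az.items() if zone == az)


# Hypothetical deployment: two hosts in az1, one in az2.
hosts = {"cmp1": "az1", "cmp2": "az1", "cmp3": "az2"}

# No AZ requested and no default set: the VM floats across AZs.
floating = candidate_hosts(hosts, effective_az(None, None))
print(floating)  # ['cmp1', 'cmp2', 'cmp3']

# No AZ requested but a default of az1: the VM stays within az1.
pinned = candidate_hosts(hosts, effective_az(None, "az1"))
print(pinned)  # ['cmp1', 'cmp2']
```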