Nova compute service exception that performs cold migration virtual machine stuck in resize state.
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
OpenStack Compute (nova) |
Fix Released
|
Low
|
Matt Riedemann | ||
Train |
Fix Released
|
Low
|
Lee Yarwood |
Bug Description
Description:
In the case of a nova-compute service exception, such as down, the instance gets stuck in the resize state during cold migration and cannot perform evacuation.The command request for nova API is also issued, server_status and Task State have been changed, but compute cannot receive the request, resulting in the server State remaining in the resize State. When nova-compute is restarted, the server State becomes ERROR.It is recommended to add validation to prevent instances from entering inoperable states.
This can also happen with commands such as stop/rebuild/
Environment:
1. openstack-Q;nova -version:9.1.1
2. hypervisor: Libvirt + KVM
3. One control node, two compute nodes.
Changed in nova: | |
assignee: | nobody → Matt Riedemann (mriedem) |
status: | New → In Progress |
importance: | Undecided → Low |
For resize, do you mean when the source compute is down? Because the scheduler should filter out any destination computes that are down.
What error traceback/log messages are you seeing when this happens and where - the destination compute's prep_resize method? Conductor? Other?
Also, what server side release specifically? 9.1.1 looks like the version of python-novaclient but we need to know the server side version (queens?).
https:/ /review. opendev. org/#/c/ 699291/ sounds similar but that's a case where the source compute is down while confirming a resize.