Compute process cast against instance it does not own
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Compute (nova) | Triaged | Medium | Unassigned |
Bug Description
Description
===========
When a request (e.g. an instance shutdown) is sent to a compute host that is unavailable, the request stays in that host's message queue. In the meantime, the instance can be evacuated from this host to another available compute host. However, once the original compute host becomes available again, it processes all messages waiting for it in the queue without any extra verification. As a result, a long-delayed request is still processed (e.g. the instance is shut down on the new compute host) even though the original host no longer owns the resource.
Steps to reproduce
==================
1. compute1 goes down
2. Send a shutdown request to a VM that is hosted on compute1
3. Evacuate the VM to compute2 (this currently requires a state reset, see related bug #1932126)
4. Boot compute1
5. compute1 shuts down the instance that is now running on compute2
Expected result
===============
A compute host (here compute1) that no longer owns a resource should not be able to affect it.
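To illustrate the expected behaviour, here is a minimal sketch of an ownership check done at the time a queued request is processed rather than when it was sent. It is a toy model, not nova code: the `Instance` and `ComputeHost` classes, the `stop_instance` method and the in-memory `queued` list are all hypothetical names made up for this example, and the real fix may look quite different.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    """Hypothetical, minimal stand-in for an instance record."""
    uuid: str
    host: str  # compute host that currently owns the instance
    power_state: str = "running"


class ComputeHost:
    """Toy compute service (assumption for illustration, not nova's manager)."""

    def __init__(self, name):
        self.name = name

    def stop_instance(self, instance):
        # Re-check the current owner when the message is processed,
        # because ownership may have changed while it sat in the queue.
        if instance.host != self.name:
            return False  # stale request: this host no longer owns the instance
        instance.power_state = "shutdown"
        return True


# Scenario from the steps to reproduce.
vm = Instance(uuid="vm-1", host="compute1")
compute1 = ComputeHost("compute1")

queued = [("stop", vm)]  # shutdown request queued while compute1 is down
vm.host = "compute2"     # evacuation moves ownership to compute2

# compute1 comes back and drains its queue.
results = [compute1.stop_instance(inst) for _, inst in queued]
```

With the check in place, the delayed request is dropped by compute1 and the instance keeps running on compute2, while a request handled by the current owner still goes through.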
Environment
===========
SHA: c7d9d6d9dd25e21
Hypervisor: KVM
Networking: ovs
Storage: Ceph
Setting this to Medium since there is a workload outage when the original host is restored, but there is no data loss.
You can restore the VM by just starting it again, but it's hard to debug and hard to explain to the customer why it happened.