VMs paused unbeknownst to nova compute are destroyed
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
OpenStack Compute (nova) | Fix Released | Medium | Yun Mao | |
Folsom | Fix Released | Medium | Yun Mao | |
Bug Description
Libvirt-managed qemu/KVM VMs can be paused outside of nova compute's workflow in several ways:
* By issuing virsh suspend
* By issuing virsh qemu-monitor-
* By causing qemu to emit a STOP event, for example when attaching a GDB debugger and single-stepping
* By connecting through an additional qemu monitor and issuing any command that causes qemu to emit a STOP event
Starting in Folsom (specifically https:/
I surmise the original rationale was to destroy VMs paused by IO errors or KVM emulation errors, which also cause qemu to emit STOP events.
The problem is that this also destroys VMs that were paused for any of the valid reasons outlined above.
The problem is exacerbated by a Libvirt bug (https:/
Even with libvirt's bug fixed, there are still points in time at which nova-compute will check a VM's state, find it paused for a valid reason, and erroneously decide to destroy it.
The fix is either to remove this behavior or to query libvirt further for the pause reason, which shows conclusively whether the VM has effectively crashed or is merely paused.
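The second option could look roughly like the sketch below. It is not the patch that was merged; it only illustrates the idea of checking the pause reason before acting. It assumes a domain object whose `state()` method returns a `[state, reason]` pair, as `virDomain.state()` does in libvirt-python. The numeric constants are copied from libvirt's `virDomainState` and `virDomainPausedReason` enums so the snippet runs without libvirt installed; real code would use the `libvirt.VIR_DOMAIN_*` names instead. The helper names and the choice of which reasons count as a fault are hypothetical.

```python
# Values mirror libvirt's virDomainState / virDomainPausedReason enums.
# Real code would import libvirt and use libvirt.VIR_DOMAIN_PAUSED, etc.
VIR_DOMAIN_PAUSED = 3
VIR_DOMAIN_PAUSED_USER = 1      # e.g. paused via "virsh suspend"
VIR_DOMAIN_PAUSED_IOERROR = 5   # paused because of a disk I/O error
VIR_DOMAIN_PAUSED_CRASHED = 10  # paused because the guest crashed

# Assumption: only these pause reasons mean the VM is effectively dead.
# Which reasons belong here is a policy decision, not something the
# bug report specifies.
FAULT_REASONS = frozenset({
    VIR_DOMAIN_PAUSED_IOERROR,
    VIR_DOMAIN_PAUSED_CRASHED,
})


def paused_by_fault(state, reason):
    """Return True only if the domain is paused for a fault-like reason."""
    return state == VIR_DOMAIN_PAUSED and reason in FAULT_REASONS


def should_destroy(domain):
    """Decide whether a paused domain may be cleaned up.

    `domain` is any object with a state() method returning
    [state, reason], as libvirt-python's virDomain.state() does.
    """
    state, reason = domain.state()
    return paused_by_fault(state, reason)
```

With a check like this, a VM paused by an operator (reason `VIR_DOMAIN_PAUSED_USER`) would be left alone, while one paused by an I/O error would still be eligible for cleanup.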
Changed in nova: | |
status: | New → Triaged |
importance: | Undecided → Medium |
tags: | added: folsom-backport-potential |
tags: | removed: folsom-backport-potential |
Changed in nova: | |
milestone: | none → grizzly-3 |
status: | Fix Committed → Fix Released |
Changed in nova: | |
milestone: | grizzly-3 → 2013.1 |
Fix proposed to branch: master
Review: https://review.openstack.org/19467