Using reset for hard_reboot is not reliable
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
OpenStack Compute (nova) | Fix Released | Undecided | Rafi Khardalian | | |
Bug Description
Using reset for hard_reboot is not reliable, even where it is supported by libvirt. Hard reboots are one of the few ways to recover a VM in a broken state. The reset command assumes the domain is still running in some capacity and fails if it is not. The following steps reproduce the problem:
1. Create a new libvirt VM (using qemu for my testing).
2. virsh list # validate it is running
virsh # list
Id Name State
-------
3 instance-00000001 running
3. Find the pid of the qemu/kvm process and kill -9 it. Run virsh list --all to confirm:
virsh # list --all
Id Name State
-------
- instance-00000001 shut off
4. Issue a virsh reset, as the code would do:
virsh # reset instance-00000001
error: Failed to reset domain instance-00000001
error: Requested operation is not valid: domain is not running
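The state check in steps 2 and 3 can be scripted. The following is a minimal sketch, assuming the three-column `virsh list --all` layout shown above. The helper names `parse_virsh_list` and `is_running` are hypothetical and are not part of libvirt or nova.

```python
def parse_virsh_list(output):
    """Map domain name -> state from `virsh list --all` style text."""
    states = {}
    for line in output.splitlines():
        # Columns are id, name, state; the state itself may contain
        # a space ("shut off"), so split at most twice.
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue  # blank lines and the dashed separator row
        dom_id, name, state = parts
        # Data rows start with a numeric id (active) or "-" (inactive);
        # anything else is the header row or a prompt line.
        if not (dom_id.isdigit() or dom_id == "-"):
            continue
        states[name] = state.strip()
    return states


def is_running(states, name):
    """True only if the named domain is listed as running."""
    return states.get(name) == "running"


# Sample text copied from step 3 above.
SAMPLE = """\
Id Name State
-------
- instance-00000001 shut off
"""

states = parse_virsh_list(SAMPLE)
print(states)  # {'instance-00000001': 'shut off'}
print(is_running(states, "instance-00000001"))  # False
```

A domain reported as anything other than `running` is one for which a reset would be rejected, as step 4 shows.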
There is no way to recover this VM without manual intervention. Reverting to the old behavior, by commenting out the conditional and forcing the code below, works much more reliably:
Hard reset is the current sledgehammer for fixing issues and it really needs to stay that way.
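As a rough illustration of why a destroy-and-start sequence recovers a domain that a reset cannot, here is a minimal sketch. It is not the nova code. `FakeDomain` and `DomainNotRunning` are hypothetical stand-ins so the example runs without libvirt; the method names (`isActive`, `reset`, `destroy`, `create`) mirror those of libvirt-python's `virDomain`, and `reset_based_reboot` and `hard_reboot` are illustrative names. It also assumes a persistent domain definition, as in the `shut off` listing above.

```python
class DomainNotRunning(Exception):
    """Stand-in for the libvirt error raised on a non-running domain."""


class FakeDomain:
    """Simplified model of a persistent libvirt domain (assumption)."""

    def __init__(self, active=True):
        self._active = active

    def isActive(self):
        return 1 if self._active else 0

    def reset(self, flags=0):
        # Modeled on the failure in the report: reset needs a live process.
        if not self._active:
            raise DomainNotRunning(
                "Requested operation is not valid: domain is not running")
        return 0

    def destroy(self):
        if not self._active:
            raise DomainNotRunning("domain is not running")
        self._active = False
        return 0

    def create(self):
        # Starts the defined domain again.
        self._active = True
        return 0


def reset_based_reboot(domain):
    """Reboot via reset only; fails when the domain is not running."""
    domain.reset(0)


def hard_reboot(domain):
    """Stop the domain if it is still active, then start it again."""
    if domain.isActive():
        domain.destroy()
    domain.create()


dead = FakeDomain(active=False)  # state after kill -9 of the qemu process
try:
    reset_based_reboot(dead)
except DomainNotRunning as exc:
    print("reset failed:", exc)

hard_reboot(dead)
print(bool(dead.isActive()))  # True
```

The destroy-and-start path does not depend on the domain's current state, which is what makes it usable as a last-resort recovery step.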
Changed in nova: | |
milestone: | none → folsom-rc1 |
status: | Fix Committed → Fix Released |
Changed in nova: | |
milestone: | folsom-rc1 → 2012.2 |
Fix proposed to branch: master
Review: https://review.openstack.org/11371