attempting to reboot a shutdown/suspended/crashed/paused instance appears to have failed, but then surprisingly succeeds two minutes later
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Compute (nova) | Fix Released | Medium | Guangya Liu (Jay Lau) |
Bug Description
I am running Havana from precise-proposed in the UCA (nova 1:2013.
To reproduce:
- start an instance
- reboot (sudo reboot) the compute node on which it is running
- after the compute node is done booting, the instance will be off:
root@xen10:~# nova list
+------
| ID | Name | Status | Task State | Power State | Networks |
+------
| 4824dce8-
+------
(note that although my hostname has "xen" in it, I'm using KVM. Haven't updated DNS yet...)
- attempt to reboot the instance (nova reboot 4824dce8-
# nova show 4824dce8-
+------
| Property | Value |
+------
| status | SHUTOFF |
| updated | 2013-10-
| OS-EXT-
The reboot fails. The compute node will log:
2013-10-08 11:28:55.579 1400 WARNING nova.compute.
- attempt to start the instance (nova start 4824dce8-
produces console output:
ERROR: Instance 4824dce8-
- wait about 120 seconds, and the compute node will log:
2013-10-08 11:30:56.082 1400 WARNING nova.virt.
Afterwards, the instance will be running.
It's confusing that the reboot logs a failure for a very obvious reason (an instance that is not running can't be *re*booted), yet the instance's state remains "rebooting". I had expected that the reboot had failed and that OpenStack was in some consistent state. I was then surprised again when in fact it *was* still rebooting; it just took two minutes to do so. It would be less confusing to catch the original error and report the reboot as failed. The log messages are confusing because the first sets the expectation that a non-running instance can't be rebooted, but it can (two minutes later).
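A minimal sketch of the behavior suggested above, in which a reboot of a non-running instance is rejected immediately rather than left in "rebooting". This is not nova code: `InstanceNotRunning`, `reboot_or_fail`, and the dictionary layout are hypothetical names used only for illustration, and the value `1` for "running" is taken from the "expected: 1" in the compute log.

```python
RUNNING = 1  # power-state code the compute log reports as "expected: 1"


class InstanceNotRunning(Exception):
    """Hypothetical error type for this sketch; not part of nova."""


def reboot_or_fail(instance, do_reboot):
    """Reboot a running instance, or fail at once without a lingering task state."""
    if instance["power_state"] != RUNNING:
        # Report the failure up front and leave the instance in a consistent state.
        instance["task_state"] = None
        raise InstanceNotRunning(
            "cannot reboot: power_state=%s, expected %s"
            % (instance["power_state"], RUNNING)
        )
    instance["task_state"] = "rebooting"
    do_reboot(instance)
    instance["task_state"] = None
```

With this shape, a caller sees the error right away and the instance never shows "rebooting" for a request that was refused.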
Changed in nova:
assignee: nobody → Jay Lau (jay-lau-513)
Changed in nova:
importance: Undecided → Medium
summary:
- attempting to reboot a shutdown/suspened/crashed instance appears to have failed, but then surprisingly succeeds two minutes later
+ attempting to reboot a shutdown/suspened/crashed/paused instance appears to have failed, but then surprisingly succeeds two minutes later
tags: added: havana-backport-potential
Changed in nova:
milestone: none → icehouse-1
Changed in nova:
status: Fix Committed → Fix Released
Changed in nova:
milestone: icehouse-1 → 2014.1
We may need some discussion on this. Actually, for now, nova CAN reboot a STOPPED instance.
The logic is as follows (as in the logs you attached):
1) nova warns that the instance is not running, then attempts a soft reboot anyway:
2013-10-10 15:20:16.980 WARNING nova.compute.manager [req-975a1bd2-5c69-4a59-b506-7318bf599874 admin admin] [instance: 6105f3bf-7f58-4d42-bbbb-ff7186c16c36] trying to reboot a non-running instance: (state: 4 expected: 1)
2) The soft reboot fails, and nova tries a hard reboot:
2013-10-10 15:22:17.524 WARNING nova.virt.libvirt.driver [req-975a1bd2-5c69-4a59-b506-7318bf599874 admin admin] [instance: 6105f3bf-7f58-4d42-bbbb-ff7186c16c36] Failed to soft reboot instance. Trying hard reboot.
3) The hard reboot first destroys the instance, then re-creates it and powers it on.
So it seems to be a valid case. Phil, what do you think? Thanks.
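The three steps above can be sketched roughly as follows. This is a simplified model and not the actual nova or libvirt driver code: `FakeGuest`, `soft_reboot`, and `reboot` are hypothetical names, the state codes are the ones shown in the log ("state: 4 expected: 1"), and the 120-second default is an assumption based on the two-minute gap between the two log lines.

```python
import time

RUNNING, SHUTDOWN = 1, 4  # codes as they appear in "(state: 4 expected: 1)"


class FakeGuest:
    """Hypothetical stand-in for a guest; not a nova or libvirt class."""

    def __init__(self, state):
        self.state = state
        self.restarted = False

    def request_reboot(self):
        # Only a running guest OS can act on a cooperative reboot request.
        if self.state == RUNNING:
            self.restarted = True

    def destroy_and_recreate(self):
        # Hard reboot: tear the guest down, then re-create it and power it on.
        self.state = RUNNING
        self.restarted = True


def soft_reboot(guest, timeout, poll):
    """Ask the guest to reboot; return True if it does so within `timeout` seconds."""
    guest.request_reboot()
    deadline = time.monotonic() + timeout
    while True:
        if guest.restarted:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)


def reboot(guest, timeout=120.0, poll=1.0):
    """Return which path brought the guest back: "soft" or "hard"."""
    if guest.state != RUNNING:
        print("WARNING trying to reboot a non-running instance: "
              f"(state: {guest.state} expected: {RUNNING})")
    if soft_reboot(guest, timeout, poll):
        return "soft"
    print("WARNING Failed to soft reboot instance. Trying hard reboot.")
    guest.destroy_and_recreate()
    return "hard"
```

In this model a stopped guest never answers the soft reboot request, so the caller waits out the full timeout before the hard reboot runs, which would match the delay seen between the first warning and the instance coming back.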