OpenStack Nova: Unpause after host reboot fails

Bug #1265494 reported by Tzach Shefi
This bug affects 2 people
Affects                   Status        Importance  Assigned to  Milestone
OpenStack Compute (nova)  Fix Released  High        Unassigned
Havana                    Fix Released  High        Unassigned

Bug Description

Description of problem:
Unpausing an instance fails if the host has been rebooted.

Version-Release number of selected component (if applicable):
RHEL: release 6.5 (Santiago)
openstack-nova-api-2013.2.1-1.el6ost.noarch
openstack-nova-compute-2013.2.1-1.el6ost.noarch
openstack-nova-scheduler-2013.2.1-1.el6ost.noarch
openstack-nova-common-2013.2.1-1.el6ost.noarch
openstack-nova-console-2013.2.1-1.el6ost.noarch
openstack-nova-conductor-2013.2.1-1.el6ost.noarch
openstack-nova-novncproxy-2013.2.1-1.el6ost.noarch
openstack-nova-cert-2013.2.1-1.el6ost.noarch

How reproducible:
Every time

Steps to Reproduce:
1. Boot an instance
2. Pause that instance
3. Reboot host
4. Unpause instance

Actual results:
The instance cannot be unpaused. It is stuck in status PAUSED with power state Shutdown.

Expected results:
Instance should unpause, return to running state

Additional info:

virsh list --all --managed-save
After the reboot, the paused instance (pausecirros) has no ID ("-") and its state is "shut off".

[root@orange-vdse ~(keystone_admin)]# virsh list --all --managed-save
 Id Name State
----------------------------------------------------
 1 instance-00000003 running
 2 instance-00000002 running
 - instance-00000001 shut off

[root@orange-vdse ~(keystone_admin)]# nova list (notice nova status paused)
+--------------------------------------+---------------+--------+------------+-------------+-----------------+
| ID | Name | Status | Task State | Power State | Networks |
+--------------------------------------+---------------+--------+------------+-------------+-----------------+
| ebe310c2-d715-45e5-83b6-32717af1ac90 | cirros | ACTIVE | None | Running | net=192.168.1.4 |
| 3ef89feb-414f-4524-b806-f14044efdb14 | pausecirros | PAUSED | None | Shutdown | net=192.168.1.5 |
| 8bcae041-2f92-4ae2-a2c2-ee59b067ac76 | suspendcirros | ACTIVE | None | Running | net=192.168.1.2 |
+--------------------------------------+---------------+--------+------------+-------------+-----------------+

Without rebooting the host, pausing an instance (cirros) works as expected: virsh shows ID "1" in state paused, and the instance unpauses correctly.

[root@orange-vdse ~(keystone_admin)]# virsh list --all --managed-save
 Id Name State
----------------------------------------------------
 1 instance-00000003 paused
 2 instance-00000002 running
 - instance-00000001 shut off

[root@orange-vdse ~(keystone_admin)]# nova list
+--------------------------------------+---------------+--------+------------+-------------+-----------------+
| ID | Name | Status | Task State | Power State | Networks |
+--------------------------------------+---------------+--------+------------+-------------+-----------------+
| ebe310c2-d715-45e5-83b6-32717af1ac90 | cirros | PAUSED | None | Paused | net=192.168.1.4 |
| 3ef89feb-414f-4524-b806-f14044efdb14 | pausecirros | PAUSED | None | Shutdown | net=192.168.1.5 |
| 8bcae041-2f92-4ae2-a2c2-ee59b067ac76 | suspendcirros | ACTIVE | None | Running | net=192.168.1.2 |
+--------------------------------------+---------------+--------+------------+-------------+-----------------+
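
The inconsistency visible in the table above (Status PAUSED while Power State is Shutdown) can be spotted mechanically. The following is only a minimal sketch, not part of nova: the helper names `parse_nova_list` and `find_stuck_paused` are made up for illustration, and the parser assumes the ASCII table layout shown in this report.

```python
def parse_nova_list(text):
    """Parse the ASCII table printed by `nova list` into a list of dicts."""
    rows = []
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip the +----+ border lines and blank lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # first "|" row holds the column names
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def find_stuck_paused(rows):
    """Return names of instances reported as PAUSED but powered off."""
    return [row["Name"] for row in rows
            if row["Status"] == "PAUSED" and row["Power State"] == "Shutdown"]


# Sample data copied from the `nova list` output in this report.
SAMPLE = """\
+--------------------------------------+---------------+--------+------------+-------------+-----------------+
| ID | Name | Status | Task State | Power State | Networks |
+--------------------------------------+---------------+--------+------------+-------------+-----------------+
| ebe310c2-d715-45e5-83b6-32717af1ac90 | cirros | PAUSED | None | Paused | net=192.168.1.4 |
| 3ef89feb-414f-4524-b806-f14044efdb14 | pausecirros | PAUSED | None | Shutdown | net=192.168.1.5 |
| 8bcae041-2f92-4ae2-a2c2-ee59b067ac76 | suspendcirros | ACTIVE | None | Running | net=192.168.1.2 |
+--------------------------------------+---------------+--------+------------+-------------+-----------------+
"""

if __name__ == "__main__":
    print(find_stuck_paused(parse_nova_list(SAMPLE)))
```

Only pausecirros is flagged. The cirros instance is also PAUSED, but its power state agrees with its status, so it is left alone.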

Changed in nova:
status: New → Confirmed
assignee: nobody → Xavier Queralt (xqueralt)
tags: added: libvirt
Xavier Queralt (xqueralt-deactivatedaccount) wrote :

The instance cannot be unpaused after reboot because it is in the shutdown power state and libvirt expects it to be in the managedsave state.

Nova has a resume_guests_state_on_host_boot flag that instructs it to restore an instance's state after the host boots. This is only done for instances that were running before the shutdown; instances in any other state are ignored.
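
The behaviour described above can be illustrated roughly as follows. This is a simplified sketch and not nova's actual code: the function name `should_resume_on_host_boot` and the string constants are assumptions made for the example.

```python
# Hypothetical power-state labels, used only for this illustration.
RUNNING = "running"
PAUSED = "paused"
SHUTDOWN = "shutdown"


def should_resume_on_host_boot(flag_enabled, recorded_state, actual_state):
    """Decide whether an instance is restored after the host boots.

    Only instances recorded as running, and found not running after the
    reboot, qualify. A paused instance never matches, which is why it is
    left behind in the shutdown power state.
    """
    expected_running = recorded_state == RUNNING and actual_state != RUNNING
    return flag_enabled and expected_running


if __name__ == "__main__":
    print(should_resume_on_host_boot(True, RUNNING, SHUTDOWN))
    print(should_resume_on_host_boot(True, PAUSED, SHUTDOWN))
```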

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/68690

Changed in nova:
status: Confirmed → In Progress
Changed in nova:
importance: Undecided → Medium
Changed in nova:
milestone: none → icehouse-rc1
tags: added: icehouse-rc-potential
Changed in nova:
importance: Medium → High
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/68690
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=6abda5b738c5d801ede3328cf0ec1dc9dddc022f
Submitter: Jenkins
Branch: master

commit 6abda5b738c5d801ede3328cf0ec1dc9dddc022f
Author: Xavier Queralt <email address hidden>
Date: Thu Jan 23 13:53:37 2014 +0100

    Correct the state for PAUSED instances on reboot

    If an instance is in the PAUSED state before the compute node is
    restarted, it will be left in an inconsistent state (impossible to
    delete and/or restart it) after the node boots again.

    This patch adds a check in the periodic task sync_power_state so when
    this situation is detected the instance is stopped and left in a state
    from where it can be started again or deleted.

    Closes-Bug: #1265494
    Change-Id: I7266ace596598522c28a3839091ef50faf4463c2
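
The check described in this commit message could look roughly like the sketch below. It is a reading of the commit message only and not the merged patch: the function name `sync_action` and the returned action strings are hypothetical.

```python
def sync_action(vm_state, hypervisor_power_state):
    """Pick the action a periodic power-state sync would take.

    vm_state is what the database records; hypervisor_power_state is
    what the hypervisor reports for the same instance.
    """
    if vm_state == "paused" and hypervisor_power_state == "shutdown":
        # The paused guest did not survive the host reboot. Stopping it
        # leaves it in a state from which it can be started or deleted.
        return "stop"
    return "none"


if __name__ == "__main__":
    print(sync_action("paused", "shutdown"))
    print(sync_action("paused", "paused"))
```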

Changed in nova:
status: In Progress → Fix Committed
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/havana)

Reviewed: https://review.openstack.org/70940
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=590519a036ec7247a273231750ab6d2030aed360
Submitter: Jenkins
Branch: stable/havana

commit 590519a036ec7247a273231750ab6d2030aed360
Author: Xavier Queralt <email address hidden>
Date: Thu Jan 23 13:53:37 2014 +0100

    Correct the state for PAUSED instances on reboot

    If an instance is in the PAUSED state before the compute node is
    restarted, it will be left in an inconsistent state (impossible to
    delete and/or restart it) after the node boots again.

    This patch adds a check in the periodic task sync_power_state so when
    this situation is detected the instance is stopped and left in a state
    from where it can be started again or deleted.

    Closes-Bug: #1265494
    Change-Id: I7266ace596598522c28a3839091ef50faf4463c2
    (cherry picked from commit 6abda5b738c5d801ede3328cf0ec1dc9dddc022f)

Thierry Carrez (ttx)
Changed in nova:
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in nova:
milestone: icehouse-rc1 → 2014.1
Steve Jacobs (stevej) wrote :

I have an instance in this state that I need to recover. Is there any way to get it started again manually?

Tzach Shefi (tshefi) wrote :

Hi Steve

I've just tested this on my Juno setup: I booted an instance, paused it, then rebooted the host (all-in-one node).
After the reboot the instance status is shutdown. A soft reboot didn't work, but a hard reboot did, and the instance is now running.

Which version of OpenStack are you running?
What's the instance's OS?
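
For reference, the hard reboot mentioned above corresponds to the `nova reboot --hard` command. The sketch below wraps it in Python. It assumes the `nova` client is installed and credentials are already loaded in the environment; the helper names are made up for the example, and it has only been confirmed to help in the Juno test described in this comment.

```python
import subprocess


def hard_reboot_command(server):
    """Build the argv for a hard reboot of the given server name or ID."""
    return ["nova", "reboot", "--hard", server]


def hard_reboot(server, run=subprocess.check_call):
    """Run the hard reboot; `run` can be swapped out for a dry run."""
    return run(hard_reboot_command(server))


if __name__ == "__main__":
    # Dry run: print the command instead of executing it.
    hard_reboot("pausecirros", run=print)
```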
