soft_reboot followed by hard_reboot can lead to double reboot

Bug #1046356 reported by Vish Ishaya
Affects                    Status         Importance   Assigned to   Milestone
OpenStack Compute (nova)   Fix Released   High         wangpan
Folsom                     Fix Released   High         Vish Ishaya

Bug Description

A recent change (https://review.openstack.org/#/c/12368/) allows a hard reboot to be performed on an instance that is in the soft_rebooting state. With libvirt this creates an unfortunate sequence: if the guest ignores a SOFT_REBOOT and a HARD_REBOOT is then issued, the instance reboots twice. First the manual hard reboot goes through, and then the soft reboot times out and triggers another hard reboot.

Vish Ishaya (vishvananda) wrote :

I can't think of a great way around this. The best short-term fix I can come up with is to check that the task state is still SOFT_REBOOTING when the soft reboot fails. It's not the best strategy for no_db_compute, but it's probably workable.
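For illustration, the task-state check proposed here could look roughly like the sketch below. This is not nova code: the function name and the two state strings are hypothetical placeholders, and reading the current task state (which would mean a database lookup) is left to the caller.

```python
# Hypothetical state labels for illustration only; these are not
# taken from nova's task_states module.
SOFT_REBOOTING = "soft_rebooting"
HARD_REBOOTING = "hard_rebooting"


def should_escalate_to_hard_reboot(soft_reboot_succeeded, current_task_state):
    """Decide whether a failed soft reboot should fall back to a hard reboot.

    The fallback is skipped when the task state is no longer
    SOFT_REBOOTING, because that suggests another operation (such as a
    manual hard reboot) has already taken over the instance.
    """
    if soft_reboot_succeeded:
        return False
    return current_task_state == SOFT_REBOOTING
```

With this check, a soft reboot that times out after a manual hard reboot has started would see a different task state and would not trigger a second reboot.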

Changed in nova:
status: New → Triaged
importance: Undecided → High
assignee: nobody → Yun Mao (yunmao)
Vish Ishaya (vishvananda) wrote :

I would really like to avoid reading the db from the driver layer, so I assigned this to Yun to see if he has any other ideas.

wangpan (hzwangpan) wrote :

I have an idea: we can use the libvirt domain ID to tell whether a hard reboot has already been performed, because the ID increases by 1 when the instance is destroyed and started again.
During the soft reboot, we read the domain ID on every iteration of the waiting loop and check whether it has changed. If it has, we assume a hard reboot has been done and skip the rest of this soft reboot.
Vish, do you have any other suggestions?
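A minimal sketch of the polling idea, assuming a domain object that exposes an ID() method as the libvirt Python bindings do. FakeDomain, wait_for_soft_reboot, and the retries/interval parameters are hypothetical names used only for this example, not the code from the eventual patch.

```python
import time


class FakeDomain:
    """Hypothetical stand-in for a libvirt domain; only ID() is modelled."""

    def __init__(self, domain_id):
        self._id = domain_id

    def ID(self):
        return self._id

    def restart(self):
        # Simulate destroy + start: the domain comes back with a new ID.
        self._id += 1


def wait_for_soft_reboot(lookup_domain, old_id, retries=5, interval=0.0):
    """Poll the domain ID while waiting for a soft reboot.

    Returns True as soon as the ID differs from old_id, meaning the
    domain was restarted (possibly by a concurrent hard reboot).
    Returns False if the ID never changes, in which case the caller
    may fall back to a hard reboot.
    """
    for _ in range(retries):
        dom = lookup_domain()
        if dom.ID() != old_id:
            return True
        time.sleep(interval)
    return False
```

The caller would record the domain ID before requesting the soft reboot and pass it in as old_id. Comparing IDs keeps the check inside the driver layer, so no database read is needed.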

Changed in nova:
assignee: Yun Mao (yunmao) → wangpan (hzwangpan)
status: Triaged → In Progress
tags: added: folsom-backport-potential
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/19157
Committed: http://github.com/openstack/nova/commit/6344bff494758e0d7d7d25f43a3b56d82447441e
Submitter: Jenkins
Branch: master

commit 6344bff494758e0d7d7d25f43a3b56d82447441e
Author: Wangpan <email address hidden>
Date: Tue Jan 8 13:54:19 2013 +0800

    Fix double reboot issue during soft reboot

    Using the ID of domain in Libvirt to recognize the hard reboot has been
    implemented or not, if the ID changed, we believe the domain has been rebooted,
    return True and break from soft reboot.

    Fixes: bug #1046356

    Change-Id: Iec2f9e8225cfe2779f84d2095667f3c0e621e935

Changed in nova:
status: In Progress → Fix Committed
Chuck Short (zulcss)
tags: removed: folsom-backport-potential
tags: added: folsom-backport-potential
Thierry Carrez (ttx)
Changed in nova:
milestone: none → grizzly-3
status: Fix Committed → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/folsom)

Fix proposed to branch: stable/folsom
Review: https://review.openstack.org/23065

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/folsom)

Reviewed: https://review.openstack.org/23065
Committed: http://github.com/openstack/nova/commit/46d2060c08980ea20cb8175a9e09ad6415287ee9
Submitter: Jenkins
Branch: stable/folsom

commit 46d2060c08980ea20cb8175a9e09ad6415287ee9
Author: Wangpan <email address hidden>
Date: Tue Jan 8 13:54:19 2013 +0800

    Fix double reboot issue during soft reboot

    Using the ID of domain in Libvirt to recognize the hard reboot has been
    implemented or not, if the ID changed, we believe the domain has been rebooted,
    return True and break from soft reboot.

    Fixes: bug #1046356

    Change-Id: Iec2f9e8225cfe2779f84d2095667f3c0e621e935
    (cherry picked from commit 6344bff494758e0d7d7d25f43a3b56d82447441e)

Mark McLoughlin (markmc)
tags: removed: folsom-backport-potential
Thierry Carrez (ttx)
Changed in nova:
milestone: grizzly-3 → 2013.1