resume guests on libvirt host reboot fails for instances with an ephemeral disk

Bug #1195877 reported by Phil Day
This bug affects 2 people
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: High
Assigned to: jan grant

Bug Description

If an instance is created from a flavor with an ephemeral disk and resume_guests_state_on_host_boot=True, the resulting _hard_reboot fails and puts the instance into an error state.

This is because the code in _hard_reboot that checks the image files, _create_images_and_backing(), strips the size and filesystem data from the name of the ephemeral disk backing file, resulting in a check for "ephemeral" rather than (for example) "ephemeral_10_default".

In a normal reboot this results in an unnecessary image download from Glance.

In the case of a reboot following a host boot, the download fails because the context at this point does not have the credentials needed to access Glance.
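
To illustrate the name mismatch described above, here is a minimal sketch in Python. It is not the nova code: the helper names, the example path and the set standing in for the image cache are all assumptions made up for this illustration. It only shows why truncating the backing file name at the first underscore makes an existing "ephemeral_10_default" file look missing.

```python
import os


def truncated_cache_name(backing_file):
    # Hypothetical stand-in for the faulty behaviour: keep only the text
    # before the first underscore of the backing file's base name.
    return os.path.basename(backing_file).split("_")[0]


def full_cache_name(backing_file):
    # Hypothetical stand-in for the corrected behaviour: keep the whole name.
    return os.path.basename(backing_file)


def needs_download(cache_name, cached_files):
    # A backing file is treated as missing when its name is not in the cache.
    return cache_name not in cached_files


# Assumed example path and cache contents, for illustration only.
backing_file = "/var/lib/nova/instances/_base/ephemeral_10_default"
cached_files = {"ephemeral_10_default"}

print(truncated_cache_name(backing_file))                                # ephemeral
print(needs_download(truncated_cache_name(backing_file), cached_files))  # True
print(needs_download(full_cache_name(backing_file), cached_files))       # False
```

With the truncated name the lookup misses even though the file is present, which is what would push the reboot path towards a Glance download; with the full name the existing file is found and no download is needed.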

Tags: libvirt
Matt Riedemann (mriedem)
tags: added: libvirt
Changed in nova:
assignee: nobody → Rafi Khardalian (rkhardalian)
jan grant (jan-grant) wrote :

As far as I can tell, the reason for the two lines in question is lost in the mists of history.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/35133

Changed in nova:
assignee: Rafi Khardalian (rkhardalian) → jan grant (jan-grant)
status: New → In Progress
jan grant (jan-grant) wrote :

To reproduce:

1. Set resume_guests_state_on_host_boot=true.
2. Create an instance from an image.
3. Stop nova-compute.
4. Run virsh shutdown {instance id}.
5. Wait until the instance shuts down.
6. Restart nova-compute.

This simulates the same path that nova-compute takes on a host reboot.

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/35133
Committed: http://github.com/openstack/nova/commit/65386c91910ee03d947c2b8bcc226a53c30e060a
Submitter: Jenkins
Branch: master

commit 65386c91910ee03d947c2b8bcc226a53c30e060a
Author: Jan Grant <email address hidden>
Date: Mon Jul 1 14:18:46 2013 +0100

    libvirt: Fix spurious backing file existence check.

    Bug 1195877

    The problem lies with the truncation of backing file names at the
    first underscore character. On a reboot, nova-compute checks the
    backing images exist. Typically, they will do; however, instead
    of looking for an "ephemeral_XXX" file, it hunts for "ephemeral".
    This won't exist, triggering code that attempts to pull the whole
    image again from glance(!)

    Lacking credentials, the attempt to fetch the image will fail,
    ultimately resulting in the instance going to an ERROR state.

    Change-Id: Iabcc7a1ffd248c22a747a1ddc7863d2b386d409a

Changed in nova:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in nova:
milestone: none → havana-2
status: Fix Committed → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/grizzly)

Fix proposed to branch: stable/grizzly
Review: https://review.openstack.org/51718

Sam Morrison (sorrison)
tags: added: grizzly-backport-potential
Blair Bethwaite (blair-bethwaite) wrote :

I reported https://bugs.launchpad.net/nova/+bug/1237683, which appears to be related but details a much more serious consequence of this bug: the secondary ephemeral disk gets corrupted following block/storage migration.

Thierry Carrez (ttx)
Changed in nova:
milestone: havana-2 → 2013.2
Michael Still (mikal)
Changed in nova:
importance: Undecided → High
Alan Pevec (apevec)
tags: removed: grizzly-backport-potential