libvirt reboot sometimes fails to reattach volumes

Bug #1073720 reported by Vish Ishaya on 2012-10-31
This bug affects 1 person
Affects                    Importance   Assigned to
OpenStack Compute (nova)   Medium       Vish Ishaya
  Folsom                   Undecided    Vish Ishaya
nova (Ubuntu)              Undecided    Unassigned
  Quantal                  Undecided    Unassigned

Bug Description

When doing a hard reboot with volumes attached, if libvirt no longer knows about the instance, the instance will come up without its volumes. This is important because rebuilding the host machine might lose the data in libvirt, and resume_state_on_host uses the same code path as hard_reboot.

Example repro with devstack:

nova boot --flavor 1 --image <image-uuid>
nova volume-create -s 1
nova volume-attach <instance-uuid> <vol-uuid>

sudo virsh destroy instance-00000001
sudo virsh undefine instance-00000001

nova reboot --hard <instance-uuid>

sudo virsh dumpxml instance-00000001

The XML will not show the attached volume.
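
To make the last check less manual, the dumped XML can be inspected for its disk targets. The following is only a minimal sketch: the helper name list_disk_targets, the sample XML, and the file path inside it are assumptions made for illustration, not output captured from the repro above.

```python
import xml.etree.ElementTree as ET


def list_disk_targets(domain_xml):
    """Return the target device names (e.g. 'vda') of every <disk> in a domain XML string."""
    root = ET.fromstring(domain_xml)
    targets = []
    for disk in root.findall("./devices/disk"):
        target = disk.find("target")
        if target is not None and target.get("dev"):
            targets.append(target.get("dev"))
    return targets


# Hypothetical, heavily trimmed domain XML with only the root disk present.
# The source path is made up for this example.
SAMPLE_XML = """
<domain type='qemu'>
  <name>instance-00000001</name>
  <devices>
    <disk type='file' device='disk'>
      <source file='/hypothetical/instances/instance-00000001/disk'/>
      <target dev='vda' bus='virtio'/>
    </disk>
  </devices>
</domain>
"""

if __name__ == "__main__":
    # A domain that lost its volume lists only the root disk target.
    print(list_disk_targets(SAMPLE_XML))
```

In practice the string passed in would be the output of `sudo virsh dumpxml instance-00000001`; if the attached volume's target device is missing from the returned list, the bug has been reproduced.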

Changed in nova:
status: New → Triaged
importance: Undecided → Medium
assignee: nobody → Vish Ishaya (vishvananda)

Fix proposed to branch: master
Review: https://review.openstack.org/15153

Changed in nova:
status: Triaged → In Progress
tags: added: folsom-backport-potential

Reviewed: https://review.openstack.org/15153
Committed: http://github.com/openstack/nova/commit/b22c3302ccea6b4b9e685640accbfdbb2856d460
Submitter: Jenkins
Branch: master

commit b22c3302ccea6b4b9e685640accbfdbb2856d460
Author: Vishvananda Ishaya <email address hidden>
Date: Wed Oct 31 13:42:12 2012 -0700

    libvirt: Regenerates xml instead of using on-disk

    The libvirt.xml file is written on initial boot of the instance and
    may not reflect the current state of the vm if volumes have been
    attached. If the libvirt definition of the vm has been lost, it is
    safer to regenerate it rather than attempt to load it from the
    outdated file. This should properly bring back vms with attached
    volumes.

    Fixes bug 1073720

    Change-Id: Iaa754700a149f09fc0c022fa664c06ad17f505f5
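
The idea in the commit message above can be illustrated with a toy model. This is a sketch under stated assumptions, not nova's actual code: build_domain_xml, xml_for_hard_reboot, the instance dictionary layout, and the device paths are all hypothetical names invented for this example.

```python
def build_domain_xml(instance_name, root_disk, attached_volumes):
    """Build a simplified domain XML from current state (toy stand-in, not nova's generator)."""
    disks = [(root_disk, "vda")] + list(attached_volumes)
    disk_xml = "".join(
        "<disk><source dev='%s'/><target dev='%s'/></disk>" % (src, dev)
        for src, dev in disks
    )
    return "<domain><name>%s</name><devices>%s</devices></domain>" % (
        instance_name,
        disk_xml,
    )


def xml_for_hard_reboot(instance, stale_on_disk_xml=None):
    """Return the XML used to redefine the domain on hard reboot.

    The on-disk copy is deliberately ignored: it was written at first boot
    and may predate any volume attachments.
    """
    return build_domain_xml(
        instance["name"], instance["root_disk"], instance["volumes"]
    )


# Hypothetical instance record with one volume attached after first boot.
instance = {
    "name": "instance-00000001",
    "root_disk": "/dev/hypothetical/root",
    "volumes": [("/dev/hypothetical/vol1", "vdb")],
}

# What a file written at initial boot would contain: no volumes yet.
stale_xml = build_domain_xml(instance["name"], instance["root_disk"], [])

fresh_xml = xml_for_hard_reboot(instance, stale_on_disk_xml=stale_xml)

if __name__ == "__main__":
    print("stale has vdb:", "target dev='vdb'" in stale_xml)
    print("fresh has vdb:", "target dev='vdb'" in fresh_xml)
```

The point of the sketch is only the choice of source: the definition is rebuilt from the currently recorded attachments, so a volume attached after first boot still appears even when libvirt's own definition has been lost.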

Changed in nova:
status: In Progress → Fix Committed
Thierry Carrez (ttx) on 2012-11-21
Changed in nova:
milestone: none → grizzly-1
status: Fix Committed → Fix Released
tags: removed: folsom-backport-potential

Reviewed: https://review.openstack.org/16719
Committed: http://github.com/openstack/nova/commit/f7e5dde6bbd70ee61924c8f556d0070c8ce0a4b2
Submitter: Jenkins
Branch: stable/folsom

commit f7e5dde6bbd70ee61924c8f556d0070c8ce0a4b2
Author: Vishvananda Ishaya <email address hidden>
Date: Wed Oct 31 13:42:12 2012 -0700

    libvirt: Regenerates xml instead of using on-disk

    The libvirt.xml file is written on initial boot of the instance and
    may not reflect the current state of the vm if volumes have been
    attached. If the libvirt definition of the vm has been lost, it is
    safer to regenerate it rather than attempt to load it from the
    outdated file. This should properly bring back vms with attached
    volumes.

    Fixes bug 1073720

    Change-Id: Iaa754700a149f09fc0c022fa664c06ad17f505f5
    (cherry picked from commit b22c3302ccea6b4b9e685640accbfdbb2856d460)

Changed in nova (Ubuntu):
status: New → Fix Released
Changed in nova (Ubuntu Quantal):
status: New → Confirmed

Hello Vish, or anyone else affected,

Accepted nova into quantal-proposed. The package will build now and be available at http://launchpad.net/ubuntu/+source/nova/2012.2.1+stable-20121212-a99a802e-0ubuntu1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in nova (Ubuntu Quantal):
status: Confirmed → Fix Committed
tags: added: verification-needed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package nova - 2012.2.1+stable-20121212-a99a802e-0ubuntu1

---------------
nova (2012.2.1+stable-20121212-a99a802e-0ubuntu1) quantal-proposed; urgency=low

  * Ubuntu updates:
    - debian/control: Ensure novaclient is upgraded with nova,
      require python-keystoneclient >= 1:2.9.0. (LP: #1073289)
    - d/p/avoid_setuptools_git_dependency.patch: Refresh.
  * Dropped patches, applied upstream:
    - debian/patches/CVE-2012-5625.patch: [a99a802]
  * Resynchronize with stable/folsom (b55014ca) (LP: #1085255):
    - [a99a802] create_lvm_image allocates dirty blocks (LP: #1070539)
    - [670b388] RPC exchange name defaults to 'openstack' (LP: #1083944)
    - [3ede373] disassociate_floating_ip with multi_host=True fails
      (LP: #1074437)
    - [22d7c3b] libvirt imagecache should handle shared image storage
      (LP: #1075018)
    - [e787786] Detached and deleted RBD volumes remain associated with instance
      (LP: #1083818)
    - [9265eb0] live_migration missing migrate_data parameter in Hyper-V driver
      (LP: #1066513)
    - [3d99848] use_single_default_gateway does not function correctly
      (LP: #1075859)
    - [65a2d0a] resize does not migrate DHCP host information (LP: #1065440)
    - [102c76b] Nova backup image fails (LP: #1065053)
    - [48a3521] Fix config-file overrides for nova-dhcpbridge
    - [69663ee] Cloudpipe in Folsom: no such option: cnt_vpn_clients
      (LP: #1069573)
    - [6e47cc8] DisassociateAddress can cause Internal Server Error
      (LP: #1080406)
    - [22c3d7b] API calls to dis-associate an auto-assigned floating IP should
      return proper warning (LP: #1061499)
    - [bd11d15] libvirt: if exception raised during volume_detach, volume state
      is inconsistent (LP: #1057756)
    - [dcb59c3] admin can't describe all images in ec2 api (LP: #1070138)
    - [78de622] Incorrect Exception raised during Create server when metadata
      over 255 characters (LP: #1004007)
    - [c313de4] Fixed IP isn't released before updating DHCP host file
      (LP: #1078718)
    - [f4ab42d] Enabling Return Reservation ID with XML create server request
      returns no body (LP: #1061124)
    - [3db2a38] 'BackupCreate' should accept rotation parameter greater than or
      equal to zero (LP: #1071168)
    - [f7e5dde] libvirt reboot sometimes fails to reattach volumes
      (LP: #1073720)
    - [ff776d4] libvirt: detaching volume may fail while terminating other
      instances on the same host concurrently (LP: #1060836)
    - [85a8bc2] Used instance uuid rather than id in remove-fixed-ip
    - [42a85c0] Fix error on invalid delete_on_termination value
    - [6a17579] xenapi migrations fail w/ swap (LP: #1064083)
    - [97649b8] attach-time field for volumes is not updated for detach volume
      (LP: #1056122)
    - [8f6a718] libvirt: rebuild is not using kernel and ramdisk associated with
      the new image (LP: #1060925)
    - [fbe835f] live-migration and volume host assignment (LP: #1066887)
    - [c2a9150] typo prevents volume_tmp_dir flag from working (LP: #1071536)
    - [93efa21] Instances deleted during spawn leak network allocations
      (LP: #1068716)
    - [ebabd02] After restartin...


Changed in nova (Ubuntu Quantal):
status: Fix Committed → Fix Released
Thierry Carrez (ttx) on 2013-04-04
Changed in nova:
milestone: grizzly-1 → 2013.1