Volumes are not detached when a build fails

Bug #1332198 reported by Andrew Laski
This bug affects 4 people
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: High
Assigned to: Andrew Laski

Bug Description

When a build fails in the driver's spawn method, attached volumes are not detached. If the instance goes to ERROR and is later deleted, everything gets cleaned up appropriately. If the instance is rescheduled instead, the next compute host fails with:

2014-06-18 20:09:01.954 11008 TRACE nova.compute.manager [instance: be78cd0e-c67f-439c-bf30-885fb135d9ad] Traceback (most recent call last):
2014-06-18 20:09:01.954 11008 TRACE nova.compute.manager [instance: be78cd0e-c67f-439c-bf30-885fb135d9ad] File "/opt/rackstack/806.0/nova/lib/python2.6/site-packages/nova/compute/manager.py", line 1786, in _prep_block_device
2014-06-18 20:09:01.954 11008 TRACE nova.compute.manager [instance: be78cd0e-c67f-439c-bf30-885fb135d9ad] self.driver, self._await_block_device_map_created) +
2014-06-18 20:09:01.954 11008 TRACE nova.compute.manager [instance: be78cd0e-c67f-439c-bf30-885fb135d9ad] File "/opt/rackstack/806.0/nova/lib/python2.6/site-packages/nova/virt/block_device.py", line 368, in attach_block_devices
2014-06-18 20:09:01.954 11008 TRACE nova.compute.manager [instance: be78cd0e-c67f-439c-bf30-885fb135d9ad] map(_log_and_attach, block_device_mapping)
2014-06-18 20:09:01.954 11008 TRACE nova.compute.manager [instance: be78cd0e-c67f-439c-bf30-885fb135d9ad] File "/opt/rackstack/806.0/nova/lib/python2.6/site-packages/nova/virt/block_device.py", line 366, in _log_and_attach
2014-06-18 20:09:01.954 11008 TRACE nova.compute.manager [instance: be78cd0e-c67f-439c-bf30-885fb135d9ad] bdm.attach(*attach_args, **attach_kwargs)
2014-06-18 20:09:01.954 11008 TRACE nova.compute.manager [instance: be78cd0e-c67f-439c-bf30-885fb135d9ad] File "/opt/rackstack/806.0/nova/lib/python2.6/site-packages/nova/virt/block_device.py", line 45, in wrapped
2014-06-18 20:09:01.954 11008 TRACE nova.compute.manager [instance: be78cd0e-c67f-439c-bf30-885fb135d9ad] ret_val = method(obj, context, *args, **kwargs)
2014-06-18 20:09:01.954 11008 TRACE nova.compute.manager [instance: be78cd0e-c67f-439c-bf30-885fb135d9ad] File "/opt/rackstack/806.0/nova/lib/python2.6/site-packages/nova/virt/block_device.py", line 218, in attach
2014-06-18 20:09:01.954 11008 TRACE nova.compute.manager [instance: be78cd0e-c67f-439c-bf30-885fb135d9ad] volume_api.check_attach(context, volume, instance=instance)
2014-06-18 20:09:01.954 11008 TRACE nova.compute.manager [instance: be78cd0e-c67f-439c-bf30-885fb135d9ad] File "/opt/rackstack/806.0/nova/lib/python2.6/site-packages/nova/volume/cinder.py", line 249, in check_attach
2014-06-18 20:09:01.954 11008 TRACE nova.compute.manager [instance: be78cd0e-c67f-439c-bf30-885fb135d9ad] raise exception.InvalidVolume(reason=msg)
2014-06-18 20:09:01.954 11008 TRACE nova.compute.manager [instance: be78cd0e-c67f-439c-bf30-885fb135d9ad] InvalidVolume: Invalid volume: status must be 'available'
2014-06-18 20:09:01.954 11008 TRACE nova.compute.manager [instance: be78cd0e-c67f-439c-bf30-885fb135d9ad]
2014-06-18 20:09:02.002 11008 ERROR nova.compute.manager [req-e76e85f6-0520-4372-b47d-a80744c912a7 None] [instance: be78cd0e-c67f-439c-bf30-885fb135d9ad] Failure prepping block device

which stops the build and, correctly, prevents any further reschedule.
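
The failure mode can be reproduced in miniature. What follows is a simplified sketch, not nova's code: `FakeVolume`, `InvalidVolume`, `check_attach`, `attach`, `build` and `failing_spawn` are hypothetical stand-ins that mimic only the status check visible in the traceback above, and the "in-use" status is assumed for illustration.

```python
class InvalidVolume(Exception):
    """Hypothetical stand-in for the InvalidVolume error in the traceback."""


class FakeVolume:
    """Hypothetical volume that tracks only its status."""

    def __init__(self):
        self.status = "available"


def check_attach(volume):
    # Mirrors the check seen in the traceback: only volumes whose
    # status is 'available' may be attached.
    if volume.status != "available":
        raise InvalidVolume("Invalid volume: status must be 'available'")


def attach(volume):
    check_attach(volume)
    volume.status = "in-use"  # assumed status name for an attached volume


def build(volume, spawn):
    """Attach the volume, then spawn. No cleanup on failure (the bug)."""
    attach(volume)
    spawn()


def failing_spawn():
    raise RuntimeError("driver spawn failed")


volume = FakeVolume()

# First compute host: spawn fails after the volume was attached.
try:
    build(volume, failing_spawn)
except RuntimeError:
    pass  # the instance would now be rescheduled

# Second compute host: the stale attachment makes the attach check fail.
reschedule_error = None
try:
    build(volume, lambda: None)
except InvalidVolume as exc:
    reschedule_error = str(exc)

print(volume.status)
print(reschedule_error)
```

In this sketch the second `build` call never reaches its spawn step, because the volume was left in a non-available state by the first, failed attempt.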

Cinder volumes need to be detached on a build failure.

Andrew Laski (alaski)
Changed in nova:
importance: Undecided → High
status: New → In Progress
assignee: nobody → Andrew Laski (alaski)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/101335

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/101335
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=5120c4f7c2670eaa71898fe6941029bbb0081949
Submitter: Jenkins
Branch: master

commit 5120c4f7c2670eaa71898fe6941029bbb0081949
Author: Andrew Laski <email address hidden>
Date: Thu Jun 19 17:15:18 2014 -0400

    Instance and volume cleanup when a build fails

    On failed builds the _shutdown_instance method used to get called which
    would clean up leftover instance artifacts, volume attachments, and
    networking. This no longer happens which is causing volumes to be left
    in an attached state when they're not attached to anything.

    Network deallocation is already handled in this code path so it should
    not happen in _shutdown_instance.

    Change-Id: I899b64ac5941acc282ccae7e1963d6f714c01e8b
    Closes-bug: #1332198
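
The behaviour described in the commit message can be sketched roughly as follows. This is an illustration under stated assumptions, not the merged change: `FakeVolume`, `attach`, `shutdown_instance`, its `deallocate_network` flag, `build` and `failing_spawn` are hypothetical names, and the real `_shutdown_instance` method may have a different signature.

```python
class FakeVolume:
    """Hypothetical volume that tracks only its status."""

    def __init__(self):
        self.status = "available"


def attach(volume):
    # Same precondition as the check that failed in the bug report.
    if volume.status != "available":
        raise ValueError("Invalid volume: status must be 'available'")
    volume.status = "in-use"


def shutdown_instance(volumes, log, deallocate_network=True):
    """Hypothetical cleanup helper: detach volumes, optionally free networking."""
    for volume in volumes:
        volume.status = "available"
    log.append("volumes detached")
    if deallocate_network:
        log.append("network deallocated")


def build(volumes, spawn, log):
    """Attach volumes and spawn; on any failure, clean up and re-raise."""
    attached = []
    try:
        for volume in volumes:
            attach(volume)
            attached.append(volume)
        spawn()
    except Exception:
        # Networking is assumed to be deallocated elsewhere on this
        # failure path, so the cleanup helper is told to skip it.
        shutdown_instance(attached, log, deallocate_network=False)
        raise


def failing_spawn():
    raise RuntimeError("driver spawn failed")


log = []
volume = FakeVolume()

# First host: spawn fails, and the attachment is undone before re-raising.
try:
    build([volume], failing_spawn, log)
except RuntimeError:
    pass
status_after_failure = volume.status

# Rescheduled host: the volume is available again, so the attach succeeds.
build([volume], lambda: None, log)
status_after_retry = volume.status
```

Re-raising after cleanup keeps the original failure visible to the caller, so a reschedule decision can still be made while the volume is back in a state that a later attach will accept.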

Changed in nova:
status: In Progress → Fix Committed
Changed in nova:
milestone: none → juno-2
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in nova:
milestone: juno-2 → 2014.2