Volumes stuck in "error deleted" state when using device mapper

Bug #979020 reported by Rafi Khardalian
This bug affects 3 people
Affects                   Status        Importance  Assigned to      Milestone
Cinder                    Fix Released  High        John Griffith
OpenStack Compute (nova)  Fix Released  High        Rafi Khardalian

Bug Description

Volumes can get stuck in an "error deleted" state (status error_deleting in the log below) when LVM fails to remove them because it considers them "open". The -f (force) flag is already passed to lvremove, but it has no effect in this case. The failure whose log I've pasted below occurred on a system that uses device mapper, with LVM layered on top. To delete the volume, it must first be removed from device mapper via dmsetup remove, so that LVM stops considering the device open and allows the removal.

-- Log --

2012-04-11 04:11:32 TRACE nova.rpc.amqp Traceback (most recent call last):
2012-04-11 04:11:32 TRACE nova.rpc.amqp File "/usr/lib/python2.7/dist-packages/nova/rpc/amqp.py", line 252, in _process_data
2012-04-11 04:11:32 TRACE nova.rpc.amqp rval = node_func(context=ctxt, **node_args)
2012-04-11 04:11:32 TRACE nova.rpc.amqp File "/usr/lib/python2.7/dist-packages/nova/volume/manager.py", line 173, in delete_volume
2012-04-11 04:11:32 TRACE nova.rpc.amqp {'status': 'error_deleting'})
2012-04-11 04:11:32 TRACE nova.rpc.amqp File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
2012-04-11 04:11:32 TRACE nova.rpc.amqp self.gen.next()
2012-04-11 04:11:32 TRACE nova.rpc.amqp File "/usr/lib/python2.7/dist-packages/nova/volume/manager.py", line 162, in delete_volume
2012-04-11 04:11:32 TRACE nova.rpc.amqp self.driver.delete_volume(volume_ref)
2012-04-11 04:11:32 TRACE nova.rpc.amqp File "/usr/lib/python2.7/dist-packages/nova/volume/driver.py", line 172, in delete_volume
2012-04-11 04:11:32 TRACE nova.rpc.amqp self._delete_volume(volume, volume['size'])
2012-04-11 04:11:32 TRACE nova.rpc.amqp File "/usr/lib/python2.7/dist-packages/nova/volume/driver.py", line 128, in _delete_volume
2012-04-11 04:11:32 TRACE nova.rpc.amqp run_as_root=True)
2012-04-11 04:11:32 TRACE nova.rpc.amqp File "/usr/lib/python2.7/dist-packages/nova/volume/driver.py", line 83, in _try_execute
2012-04-11 04:11:32 TRACE nova.rpc.amqp self._execute(*command, **kwargs)
2012-04-11 04:11:32 TRACE nova.rpc.amqp File "/usr/lib/python2.7/dist-packages/nova/utils.py", line 245, in execute
2012-04-11 04:11:32 TRACE nova.rpc.amqp cmd=' '.join(cmd))
2012-04-11 04:11:32 TRACE nova.rpc.amqp ProcessExecutionError: Unexpected error while running command.
2012-04-11 04:11:32 TRACE nova.rpc.amqp Command: sudo nova-rootwrap lvremove -f nova-volumes/volume-00000002
2012-04-11 04:11:32 TRACE nova.rpc.amqp Exit code: 5
2012-04-11 04:11:32 TRACE nova.rpc.amqp Stdout: ''
2012-04-11 04:11:32 TRACE nova.rpc.amqp Stderr: ' Can\'t remove open logical volume "volume-00000002"\n'

-- Log --

Essex final, Ubuntu 11.10 (64-bit).
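
For reference, here is a minimal sketch of how the device mapper node for the volume in the log above is named. It relies on LVM's convention of joining the volume group and logical volume names with a single hyphen and doubling any hyphen inside either name. The function name mapper_path is hypothetical and is not taken from the nova or cinder code.

```python
def mapper_path(volume_group, volume_name):
    """Return the /dev/mapper path LVM creates for a logical volume.

    LVM joins the VG and LV names with one hyphen and doubles every
    hyphen that appears inside either name.
    (Hypothetical helper, for illustration only.)
    """
    escaped_vg = volume_group.replace("-", "--")
    escaped_lv = volume_name.replace("-", "--")
    return "/dev/mapper/%s-%s" % (escaped_vg, escaped_lv)


# The volume from the failing lvremove command in the log:
print(mapper_path("nova-volumes", "volume-00000002"))
# prints /dev/mapper/nova--volumes-volume--00000002
```

This is the path that would be handed to dmsetup remove before retrying lvremove.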

Vish Ishaya (vishvananda) wrote :

Sounds like we need an optional call to dmsetup and corresponding nova-rootwrap additions.

Changed in nova:
importance: Undecided → Medium
status: New → Triaged
Rafi Khardalian (rkhardalian) wrote :
affects: nova → cinder
Changed in cinder:
status: Triaged → In Progress
assignee: nobody → Rafi Khardalian (rkhardalian)
status: In Progress → Fix Committed
Changed in cinder:
status: Fix Committed → In Progress
Changed in cinder:
milestone: none → folsom-rc1
importance: Medium → Critical
Mark McLoughlin (markmc) wrote :

This honestly sounds more like an LVM bug to me.

Changed in nova:
milestone: none → folsom-rc1
importance: Undecided → Critical
status: New → Confirmed
Rafi Khardalian (rkhardalian) wrote :

Possibly, Mark, but there's no harm in calling dmsetup remove to work around it as best we can.

Mark McLoughlin (markmc) wrote :

Deleting a device mapping from under LVM's feet isn't something I'd imagine LVM developers would recommend. It's the kind of thing that may work now but could suddenly cause weird brokenness with future LVM versions.

John Griffith (john-griffith) wrote :

I wonder if an lvchange would be a better option here? I seem to recall there were some issues with lvchange as well, but it might be good to investigate this again.

Changed in nova:
assignee: nobody → John Griffith (john-griffith)
Changed in cinder:
assignee: Rafi Khardalian (rkhardalian) → John Griffith (john-griffith)
clayg (clay-gerrard) wrote :

Is the only way to reproduce this issue to mount the volume locally? I would imagine we're removing the iSCSI target before removing the volume, so what exactly has the volume open?
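
One way to check whether device mapper still counts the device as open is to read its open count. The sketch below assumes the columns form of dmsetup info (-c --noheadings -o open) is available on the host. The function open_count and its injectable run parameter are hypothetical names used for illustration. It reports the count only and does not identify which process holds the device.

```python
import subprocess


def open_count(mapper_name, run=subprocess.check_output):
    """Return the device-mapper open count for ``mapper_name``.

    ``run`` defaults to subprocess.check_output and is injectable so the
    parsing can be exercised without root or a real device.
    (Hypothetical helper, for illustration only.)
    """
    output = run(
        ["dmsetup", "info", "-c", "--noheadings", "-o", "open", mapper_name]
    )
    if isinstance(output, bytes):
        output = output.decode()
    # The columns output for a single field is one number per device.
    return int(output.strip())
```

A non-zero result for nova--volumes-volume--00000002 would be consistent with lvremove refusing to remove the volume as "open".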

John Griffith (john-griffith) wrote :

Sigh, lvchange is a no-go. Rafi, let's proceed with the fix you have; just update the rootwrap for the Red Hat/Fedora case and, if you wouldn't mind, put a TODO or NOTE at the dmsetup remove pointing out that this is a bit of a hack and we may need to address it again in the near future. Thanks, and sorry for the side-track.

Thierry Carrez (ttx) wrote :

Downgrading priority, as this is not a regression and has been present in all previous releases.

Changed in cinder:
importance: Critical → High
Changed in nova:
importance: Critical → High
Thierry Carrez (ttx) wrote :
Changed in nova:
status: Confirmed → In Progress
Changed in cinder:
assignee: John Griffith (john-griffith) → Rafi Khardalian (rkhardalian)
Changed in nova:
assignee: John Griffith (john-griffith) → Rafi Khardalian (rkhardalian)
Changed in cinder:
assignee: Rafi Khardalian (rkhardalian) → John Griffith (john-griffith)
Akira Yoshiyama (yosshy) wrote :

A duplicate of #1038062?

OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (master)

Reviewed: https://review.openstack.org/12396
Committed: http://github.com/openstack/cinder/commit/f9cf780678078c77881fee41addc093abb15c136
Submitter: Jenkins
Branch: master

commit f9cf780678078c77881fee41addc093abb15c136
Author: Rafi Khardalian <email address hidden>
Date: Wed Sep 5 05:51:43 2012 +0000

    Fix volume deletion when device mapper is used

    Call dmsetup remove if there is a /dev/mapper/nova--volumes-
    element present.

    Resolves bug 979020

    Change-Id: Iddaaed411a77dda4bd32f9a97687ff17744119eb

Changed in cinder:
status: In Progress → Fix Committed
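
The control flow described in the commit message can be sketched roughly as follows. This is not the merged code: delete_volume, execute and path_exists are hypothetical names, and the real driver runs its commands as root through rootwrap. The sketch only illustrates the order of operations: call dmsetup remove when the /dev/mapper node is present, then lvremove.

```python
import os


def delete_volume(volume_group, volume_name, mapper_path, execute,
                  path_exists=os.path.exists):
    """Remove a logical volume, clearing its device mapper node first.

    ``execute`` is any callable that runs a command given as positional
    arguments; ``path_exists`` is injectable for testing.
    (Hypothetical sketch, not the code from the commit.)
    """
    # NOTE: removing the mapping from under LVM is a workaround, as
    # discussed in this bug, and may need revisiting.
    if path_exists(mapper_path):
        execute("dmsetup", "remove", mapper_path)
    execute("lvremove", "-f", "%s/%s" % (volume_group, volume_name))
```

Making the dmsetup call conditional keeps the behaviour unchanged on hosts where no mapper node exists for the volume.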
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/12434
Committed: http://github.com/openstack/nova/commit/d05637f99271e081f9579b69cf77de1969839561
Submitter: Jenkins
Branch: master

commit d05637f99271e081f9579b69cf77de1969839561
Author: Rafi Khardalian <email address hidden>
Date: Wed Sep 5 16:09:45 2012 +0000

    Fix volume deletion when device mapper is used

    Call dmsetup remove if there is a /dev/mapper/nova--volumes-
    element present.

    Resolves bug 979020

    Change-Id: Iddaaed411a77dda4bd32f9a97687ff17744119eb

Changed in nova:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in cinder:
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in nova:
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in cinder:
milestone: folsom-rc1 → 2012.2
Thierry Carrez (ttx)
Changed in nova:
milestone: folsom-rc1 → 2012.2