Error deleting volume on bay update

Bug #1422831 reported by hongbin
This bug affects 4 people
Affects   Status    Importance   Assigned to   Milestone
Cinder    Invalid   Undecided    Unassigned
Magnum    New       High         Unassigned

Bug Description

The steps to reproduce (an illustrative CLI sketch follows the list):
1. Deploy a 2-node bay.
2. Scale it down to 1 node.
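
A minimal sketch of these steps with the python-magnumclient CLI of that era (the bay and baymodel names are placeholders, other required baymodel flags are omitted, and exact flags may differ by release):

$ magnum bay-create --name testbay --baymodel testbaymodel --node-count 2
# wait for the bay to reach CREATE_COMPLETE, then scale it down to one node
$ magnum bay-update testbay replace node_count=1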

The symptom:
* 'heat stack-list' showed the stack stuck in UPDATE_IN_PROGRESS.
* The 'c-vol' screen showed the following error:

 Traceback (most recent call last):
   File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 142, in _dispatch_and_reply
     executor_callback))
   File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 186, in _dispatch
     executor_callback)
   File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 130, in _do_dispatch
     result = func(ctxt, **new_args)
   File "/usr/local/lib/python2.7/dist-packages/osprofiler/profiler.py", line 105, in wrapper
     return f(*args, **kwargs)
   File "/opt/stack/cinder/cinder/volume/manager.py", line 134, in lvo_inner1
     return lvo_inner2(inst, context, volume_id, **kwargs)
   File "/usr/local/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py", line 431, in inner
     return f(*args, **kwargs)
   File "/opt/stack/cinder/cinder/volume/manager.py", line 133, in lvo_inner2
     return f(*_args, **_kwargs)
   File "/opt/stack/cinder/cinder/volume/manager.py", line 483, in delete_volume
     {'status': 'error_deleting'})
   File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 82, in __exit__
     six.reraise(self.type_, self.value, self.tb)
   File "/opt/stack/cinder/cinder/volume/manager.py", line 472, in delete_volume
     self.driver.delete_volume(volume_ref)
   File "/usr/local/lib/python2.7/dist-packages/osprofiler/profiler.py", line 105, in wrapper
     return f(*args, **kwargs)
   File "/opt/stack/cinder/cinder/volume/drivers/lvm.py", line 335, in delete_volume
     self._delete_volume(volume)
   File "/opt/stack/cinder/cinder/volume/drivers/lvm.py", line 115, in _delete_volume
     self.vg.delete(name)
   File "/opt/stack/cinder/cinder/brick/local_dev/lvm.py", line 678, in delete
     root_helper=self._root_helper, run_as_root=True)
   File "/opt/stack/cinder/cinder/utils.py", line 143, in execute
     return processutils.execute(*cmd, **kwargs)
   File "/usr/local/lib/python2.7/dist-packages/oslo_concurrency/processutils.py", line 228, in execute
     cmd=sanitized_cmd)
ProcessExecutionError: Unexpected error while running command.
Command: sudo cinder-rootwrap /etc/cinder/rootwrap.conf lvremove --config activation { retry_deactivation = 1} devices { ignore_suspended_devices = 1} -f stack-volumes-lvmdriver-1/volume-faa14149-1cf9-4906-8137-2f533101fce1
Exit code: 5
Stdout: u''
Stderr: u' Logical volume stack-volumes-lvmdriver-1/volume-faa14149-1cf9-4906-8137-2f533101fce1 is used by another device.\n'

hongbin (hongbin034)
Changed in magnum:
assignee: nobody → hongbin (hongbin034)
Revision history for this message
hongbin (hongbin034) wrote :

This is a question for cinder devs:

When this error occurs, how does one recover from it? (I tried 'cinder delete ...' and 'cinder force-delete ...'; neither of them works.)

Revision history for this message
Jay Bryant (jsbryant) wrote :

The error message indicates that LVM thinks the volume is still in use. I would try to trace back why it is listed as being in use.
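
One way to trace that, sketched here with the volume name from the traceback above (dm-0 is a placeholder; the actual dm device for the volume has to be looked up first, e.g. with lsblk):

$ sudo lvs -o lv_name,lv_attr stack-volumes-lvmdriver-1
# an 'o' in the sixth position of lv_attr means the LV is open
$ sudo dmsetup ls --tree
# shows which device-mapper devices sit on top of the volume's device
$ ls /sys/class/block/dm-0/holders
# any entry here is a device still holding the LV open; it must be
# deactivated or removed before lvremove can succeed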

Steven Dake (sdake)
Changed in magnum:
status: New → Triaged
importance: Undecided → High
Mike Perez (thingee)
Changed in cinder:
status: New → Incomplete
Revision history for this message
hongbin (hongbin034) wrote :

@Mike, could you clarify what additional information you need? Thanks.

Revision history for this message
Mike Perez (thingee) wrote :

@hongbin having the logs of magnum and cinder would be a good start. As Jay noted, LVM is showing the volume as in use. We need to see what interactions happened to verify if a request at any time came to Cinder to do a detach. Otherwise, it's working as expected, and things need to be orchestrated to do a detach before you're expecting a resource to be available.
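
To illustrate the ordering described above (a sketch with placeholder IDs, not a verified fix for this report):

$ nova volume-detach <server-id> <volume-id>
# only after the volume reports 'available' should the delete be issued
$ cinder delete <volume-id>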

Revision history for this message
hongbin (hongbin034) wrote :

I cannot reproduce this bug any more. Will revisit it if it shows up later.

Changed in magnum:
assignee: hongbin (hongbin034) → nobody
status: Triaged → Invalid
Revision history for this message
Marco CONSONNI (marco-consonni) wrote :

Hello,

Not sure whether the information I'm reporting here is really related to this bug, but the problem I'm facing is very similar, at least in the observable results.

I created an Instance (Ubuntu 14.04 OS) and a Volume with cinder.
I attached the volume to the VM.
Then, in the instance, I created a logical volume using the attached cinder volume.
In order to do that, I submitted the following commands at the instance level:

$ sudo su
# apt-get -y update
# apt-get -y install lvm2
# apt-get -y install xfsprogs
# vgcreate VG /dev/vdb
# lvcreate -L 500M -n LV_DATA VG
# mkfs.xfs -d agcount=8 /dev/VG/LV_DATA
# mkdir -p -m 0700 /db/dbdata
# mount -t xfs -o noatime,nodiratime,attr2 /dev/VG/LV_DATA /db/dbdata

The situation at the instance level is the following:

# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
vda 253:0 0 20G 0 disk
  vda1 253:1 0 20G 0 part /
vdb 253:16 0 1G 0 disk
  VG-LV_DATA (dm-0) 252:0 0 500M 0 lvm /db/dbdata

The logical volume I've just created (VG-LV_DATA) is mounted under /db/dbdata and resides on vdb (the external volume I've created through cinder).

Now the trick begins.

I delete the instance using horizon (or the CLI, it doesn't matter).

As soon as I delete the instance, if I look at the physical node where the cinder volume is hosted, I see the following (pay attention to the last row of the output - my comments follow...):

root@storm02:/home/openstack# ls -la /dev/mapper
total 0
drwxr-xr-x 2 root root 260 Jul 15 16:19 .
drwxr-xr-x 19 root root 4800 Jul 15 16:19 ..
lrwxrwxrwx 1 root root 7 Jul 14 10:53 cinder--volumes-_snapshot--5a085d24--ca10--4444--a9b1--cb1397a54a8c -> ../dm-3
lrwxrwxrwx 1 root root 7 Jul 14 10:53 cinder--volumes-_snapshot--5a085d24--ca10--4444--a9b1--cb1397a54a8c-cow -> ../dm-2
lrwxrwxrwx 1 root root 7 Jul 14 10:53 cinder--volumes-volume--0fbfd371--ec0a--4b25--93c0--53caa153f973 -> ../dm-6
lrwxrwxrwx 1 root root 7 Jul 14 10:53 cinder--volumes-volume--65e02f67--8e2f--471d--a1fe--ff4d1a14962a -> ../dm-1
lrwxrwxrwx 1 root root 7 Jul 14 10:53 cinder--volumes-volume--65e02f67--8e2f--471d--a1fe--ff4d1a14962a-real -> ../dm-0
lrwxrwxrwx 1 root root 7 Jul 15 16:19 cinder--volumes-volume--c45f2dec--af94--4ba5--9f1b--7b491f215580 -> ../dm-4
lrwxrwxrwx 1 root root 7 Jul 14 10:53 cinder--volumes-volume--de3691a9--6e27--4ce8--9c6e--f86b9458398d -> ../dm-5
crw------- 1 root root 10, 236 Jul 14 10:53 control
lrwxrwxrwx 1 root root 7 Jul 14 10:53 storm02--vg-root -> ../dm-8
lrwxrwxrwx 1 root root 7 Jul 14 10:53 storm02--vg-swap_1 -> ../dm-9
lrwxrwxrwx 1 root root 7 Jul 15 16:19 VG-LV_DATA -> ../dm-7

A new device (VG-LV_DATA) appeared as soon as I deleted the instance! The cinder volume (which actually 'contained' VG-LV_DATA) is still there, but when I try to delete it, I get an error, and the cinder log says (after this log, find what I did to fix the problem):

stderr: ' Logical volume cinder-volumes/volume-c45f2dec-af94-4ba5-9f1b-7b491f215580 is used by another device.\n' to caller
2015-07-15 16:26:32.470 2990 ERROR oslo.messaging._driver...

Read more...
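
One common way to clear such a state (sketched here only as an illustration; it is not necessarily what was done in the truncated part of the comment) is to drop the stale device-mapper mapping on the storage host, so the cinder LV is no longer held open, and to keep the host's LVM from scanning guest-created volumes at all:

# on the storage host (storm02 in the listing above)
$ sudo dmsetup remove VG-LV_DATA
$ cinder delete c45f2dec-af94-4ba5-9f1b-7b491f215580
# optionally, restrict which devices the host's LVM scans via
# /etc/lvm/lvm.conf, e.g. (the device path is only an example):
#   global_filter = [ "a|^/dev/sdb$|", "r|.*|" ]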

Revision history for this message
Marco CONSONNI (marco-consonni) wrote :

I forgot to mention: I'm using Icehouse.

Revision history for this message
Sean McGinnis (sean-mcginnis) wrote :

A lot has changed here since Icehouse. Closing as invalid as this is probably fixed or else an issue outside of Cinder. Please reopen if it can be reproduced and logs are available.

Changed in cinder:
status: Incomplete → Invalid
Revision history for this message
yatin (yatinkarel) wrote :
Adrian Otto (aotto)
Changed in magnum:
status: Invalid → New
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to magnum (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/408768

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to magnum (master)

Reviewed: https://review.openstack.org/408768
Committed: https://git.openstack.org/cgit/openstack/magnum/commit/?id=874d81c1d91be301a20fdd403b60104b9adc1404
Submitter: Jenkins
Branch: master

commit 874d81c1d91be301a20fdd403b60104b9adc1404
Author: Spyros Trigazis <email address hidden>
Date: Thu Dec 8 19:36:02 2016 +0100

    Remove docker_volume_size from functional-test

    To unblock our gate, in all the functional tests, create
    clusters without cinder volumes.

    Change-Id: Ia3b14603c5fc516b00c862c8b9257e0fd23d4b9e
    Related-Bug: #1422831

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to magnum (stable/newton)

Reviewed: https://review.openstack.org/435805
Committed: https://git.openstack.org/cgit/openstack/magnum/commit/?id=73212dbe39fd3e8b3a57dfac82a441ef1ab862c4
Submitter: Jenkins
Branch: stable/newton

commit 73212dbe39fd3e8b3a57dfac82a441ef1ab862c4
Author: Spyros Trigazis <email address hidden>
Date: Fri Sep 30 15:10:52 2016 +0200

    Make cinder volume optional

    The purpose of this patch is primarily to unblock the
    stable/newton gate and secondarily fix the performance issues
    faced with large clusters, where Magnum creates a volume per
    node.

    In the swarm_atomic and k8s_atomic drivers container images are
    stored in a dedicated cinder volume per cluster node. It is
    proven that this architecture can be a scalability bottleneck.

    Make the use of cinder volumes for container images an opt-in
    option. If docker-volume-size is not specified, no cinder
    volumes will be created. Before, if docker-volume-size wasn't
    specified, the default value was 25.

    To use cinder volumes for container storage the user will
    interact with magnum as before, (meaning the valid values are
    integers starting from 1).

    Backport: I3394c62a43bbf950b7cf0b86a71b1d9b0481d68f
    Conflicts:
    * magnum/drivers/common/swarm_fedora_template_def.py
      Edit magnum/drivers/swarm_fedora_atomic_v1/template_def.py
      instead.
    * magnum/tests/unit/conductor/handlers/test_k8s_cluster_conductor.py
      Remove invalid unit test for docker_volume_size.
      Fix unit test which references the driver class

    Additionally, remove the use of cinder volumes in functional tests.
    2nd Backport: Ia3b14603c5fc516b00c862c8b9257e0fd23d4b9e

    Remove service from manager class for tempest.
    3rd Backport: I67f79efd2049c05d36ea56691b664417ed358fd8

    Closes-Bug: #1638006
    Related-Bug: #1422831

    Change-Id: I219f02dc1861bd4b6b9c59ecc6af448d09004f18
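
To illustrate the opt-in behaviour this change describes (a sketch only; exact client commands and flags depend on the release in use, and other required flags are elided with "..."):

# without --docker-volume-size no cinder volume is created per node
$ magnum cluster-template-create --name no-docker-volume --coe kubernetes ...
# with --docker-volume-size a cinder volume of that size is created per node
$ magnum cluster-template-create --name with-docker-volume --coe kubernetes --docker-volume-size 5 ...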

tags: added: in-stable-newton
Changed in magnum:
assignee: nobody → Inampudi Aditya (iaditya91)
Changed in magnum:
assignee: Inampudi Aditya (iaditya91) → nobody