lock may be released before lvremove command returns

Bug #1330776 reported by Clint Byrum
10
This bug affects 2 people
Affects Status Importance Assigned to Milestone
Cinder
Fix Released
High
Zhang Hao

Bug Description

I believe this failure:

http://logs.openstack.org/09/96009/7/check/check-tempest-dsvm-postgres-full/7fb20c8/console.html
http://logs.openstack.org/09/96009/7/check/check-tempest-dsvm-postgres-full/7fb20c8/logs/screen-c-vol.txt.gz

Is due to this specific sequence which shows a file lock being removed before lvremove has returned:

2014-06-16 17:49:36.443 12132 DEBUG cinder.openstack.common.lockutils [req-62ad2e26-43db-4d1f-b00d-dbbd9cb7e53c 94a588565b4e4888a6381b91a50db8b6 db9a346154a6403dada21dc4e32ff72d - - -] Released file lock "c895c621-772d-4064-af0b-319c0c1b5370-delete_volume" at /opt/stack/data/cinder/cinder-c895c621-772d-4064-af0b-319c0c1b5370-delete_volume for method "lvo_inner2"... inner /opt/stack/new/cinder/cinder/openstack/common/lockutils.py:239
2014-06-16 17:49:36.445 12132 ERROR oslo.messaging.rpc.dispatcher [req-62ad2e26-43db-4d1f-b00d-dbbd9cb7e53c 94a588565b4e4888a6381b91a50db8b6 db9a346154a6403dada21dc4e32ff72d - - -] Exception during message handling: Unexpected error while running command.
Command: sudo cinder-rootwrap /etc/cinder/rootwrap.conf lvremove -f stack-volumes/volume-c895c621-772d-4064-af0b-319c0c1b5370
Exit code: 5
Stdout: ''
Stderr: ' Can\'t remove open logical volume "volume-c895c621-772d-4064-af0b-319c0c1b5370"\n'
2014-06-16 17:49:36.445 12132 TRACE oslo.messaging.rpc.dispatcher Traceback (most recent call last):
2014-06-16 17:49:36.445 12132 TRACE oslo.messaging.rpc.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/rpc/dispatcher.py", line 134, in _dispatch_and_reply
2014-06-16 17:49:36.445 12132 TRACE oslo.messaging.rpc.dispatcher incoming.message))
2014-06-16 17:49:36.445 12132 TRACE oslo.messaging.rpc.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/rpc/dispatcher.py", line 177, in _dispatch
2014-06-16 17:49:36.445 12132 TRACE oslo.messaging.rpc.dispatcher return self._do_dispatch(endpoint, method, ctxt, args)
2014-06-16 17:49:36.445 12132 TRACE oslo.messaging.rpc.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/rpc/dispatcher.py", line 123, in _do_dispatch
2014-06-16 17:49:36.445 12132 TRACE oslo.messaging.rpc.dispatcher result = getattr(endpoint, method)(ctxt, **new_args)
2014-06-16 17:49:36.445 12132 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/new/cinder/cinder/volume/manager.py", line 124, in lvo_inner1
2014-06-16 17:49:36.445 12132 TRACE oslo.messaging.rpc.dispatcher return lvo_inner2(inst, context, volume_id, **kwargs)
2014-06-16 17:49:36.445 12132 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/new/cinder/cinder/openstack/common/lockutils.py", line 233, in inner
2014-06-16 17:49:36.445 12132 TRACE oslo.messaging.rpc.dispatcher retval = f(*args, **kwargs)
2014-06-16 17:49:36.445 12132 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/new/cinder/cinder/volume/manager.py", line 123, in lvo_inner2
2014-06-16 17:49:36.445 12132 TRACE oslo.messaging.rpc.dispatcher return f(*_args, **_kwargs)
2014-06-16 17:49:36.445 12132 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/new/cinder/cinder/volume/manager.py", line 396, in delete_volume
2014-06-16 17:49:36.445 12132 TRACE oslo.messaging.rpc.dispatcher {'status': 'error_deleting'})
2014-06-16 17:49:36.445 12132 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/new/cinder/cinder/openstack/common/excutils.py", line 68, in __exit__
2014-06-16 17:49:36.445 12132 TRACE oslo.messaging.rpc.dispatcher six.reraise(self.type_, self.value, self.tb)
2014-06-16 17:49:36.445 12132 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/new/cinder/cinder/volume/manager.py", line 385, in delete_volume
2014-06-16 17:49:36.445 12132 TRACE oslo.messaging.rpc.dispatcher self.driver.delete_volume(volume_ref)
2014-06-16 17:49:36.445 12132 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/new/cinder/cinder/volume/drivers/lvm.py", line 233, in delete_volume
2014-06-16 17:49:36.445 12132 TRACE oslo.messaging.rpc.dispatcher self._delete_volume(volume)
2014-06-16 17:49:36.445 12132 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/new/cinder/cinder/volume/drivers/lvm.py", line 133, in _delete_volume
2014-06-16 17:49:36.445 12132 TRACE oslo.messaging.rpc.dispatcher self.vg.delete(name)
2014-06-16 17:49:36.445 12132 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/new/cinder/cinder/brick/local_dev/lvm.py", line 610, in delete
2014-06-16 17:49:36.445 12132 TRACE oslo.messaging.rpc.dispatcher root_helper=self._root_helper, run_as_root=True)
2014-06-16 17:49:36.445 12132 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/new/cinder/cinder/utils.py", line 136, in execute
2014-06-16 17:49:36.445 12132 TRACE oslo.messaging.rpc.dispatcher return processutils.execute(*cmd, **kwargs)
2014-06-16 17:49:36.445 12132 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/new/cinder/cinder/openstack/common/processutils.py", line 173, in execute
2014-06-16 17:49:36.445 12132 TRACE oslo.messaging.rpc.dispatcher cmd=' '.join(cmd))
2014-06-16 17:49:36.445 12132 TRACE oslo.messaging.rpc.dispatcher ProcessExecutionError: Unexpected error while running command.
2014-06-16 17:49:36.445 12132 TRACE oslo.messaging.rpc.dispatcher Command: sudo cinder-rootwrap /etc/cinder/rootwrap.conf lvremove -f stack-volumes/volume-c895c621-772d-4064-af0b-319c0c1b5370
2014-06-16 17:49:36.445 12132 TRACE oslo.messaging.rpc.dispatcher Exit code: 5
2014-06-16 17:49:36.445 12132 TRACE oslo.messaging.rpc.dispatcher Stdout: ''
2014-06-16 17:49:36.445 12132 TRACE oslo.messaging.rpc.dispatcher Stderr: ' Can\'t remove open logical volume "volume-c895c621-772d-4064-af0b-319c0c1b5370"\n'
2014-06-16 17:49:36.445 12132 TRACE oslo.messaging.rpc.dispatcher

Tags: gate-failure
Revision history for this message
Mark McLoughlin (markmc) wrote :

logstash query:

message:"Can\\'t remove open logical volume" AND message:"Exit code: 5" AND message:"Exception during message handling" AND tags:"screen-c-vol.txt" AND build_status:"FAILURE"

seen 7 times in the last 48 hours

Revision history for this message
Mark McLoughlin (markmc) wrote :
Changed in cinder:
status: New → Confirmed
importance: Undecided → High
Revision history for this message
git-harry (git-harry) wrote :

I don't think the lock is being removed before the delete operation is performed. It looks like that because of where the logs are generated in the code. The exception raised by the delete isn't logged until after it has risen up outside the lock.

Zhang Hao (zhang-hao16)
Changed in cinder:
assignee: nobody → Zhang Hao (zhang-hao16)
assignee: Zhang Hao (zhang-hao16) → nobody
Zhang Hao (zhang-hao16)
Changed in cinder:
assignee: nobody → Zhang Hao (zhang-hao16)
Revision history for this message
Joe Gordon (jogo) wrote :

no hits in the gate looks like this was resolved somehow

Changed in cinder:
status: Confirmed → Fix Committed
Thierry Carrez (ttx)
Changed in cinder:
milestone: none → juno-3
status: Fix Committed → Fix Released
Revision history for this message
Clint Byrum (clint-fewbar) wrote : Re: [Bug 1330776] Re: lock may be released before lvremove command returns
Download full text (6.8 KiB)

From my perspective, no hits in the gate isn't "Fixed" it is "Lost".

I would tend to suggest that this bug be left open with a Low priority.
Meaning "don't spend time debugging on this, but it is still data".

With unexplained races one often sees them go away for a while and
them come back. This is usually caused by a performance improvement or
regression in one of the racers.

I'd hate to see this come back in a few months and have to start with 0
data.

Excerpts from Joe Gordon's message of 2014-08-29 23:11:17 UTC:
> no hits in the gate looks like this was resolved somehow
>
> ** Changed in: cinder
> Status: Confirmed => Fix Committed
>
> --
> You received this bug notification because you are subscribed to the bug
> report.
> https://bugs.launchpad.net/bugs/1330776
>
> Title:
> lock may be released before lvremove command returns
>
> Status in Cinder:
> Fix Committed
>
> Bug description:
> I believe this failure:
>
> http://logs.openstack.org/09/96009/7/check/check-tempest-dsvm-postgres-full/7fb20c8/console.html
> http://logs.openstack.org/09/96009/7/check/check-tempest-dsvm-postgres-full/7fb20c8/logs/screen-c-vol.txt.gz
>
> Is due to this specific sequence which shows a file lock being removed
> before lvremove has returned:
>
> 2014-06-16 17:49:36.443 12132 DEBUG cinder.openstack.common.lockutils [req-62ad2e26-43db-4d1f-b00d-dbbd9cb7e53c 94a588565b4e4888a6381b91a50db8b6 db9a346154a6403dada21dc4e32ff72d - - -] Released file lock "c895c621-772d-4064-af0b-319c0c1b5370-delete_volume" at /opt/stack/data/cinder/cinder-c895c621-772d-4064-af0b-319c0c1b5370-delete_volume for method "lvo_inner2"... inner /opt/stack/new/cinder/cinder/openstack/common/lockutils.py:239
> 2014-06-16 17:49:36.445 12132 ERROR oslo.messaging.rpc.dispatcher [req-62ad2e26-43db-4d1f-b00d-dbbd9cb7e53c 94a588565b4e4888a6381b91a50db8b6 db9a346154a6403dada21dc4e32ff72d - - -] Exception during message handling: Unexpected error while running command.
> Command: sudo cinder-rootwrap /etc/cinder/rootwrap.conf lvremove -f stack-volumes/volume-c895c621-772d-4064-af0b-319c0c1b5370
> Exit code: 5
> Stdout: ''
> Stderr: ' Can\'t remove open logical volume "volume-c895c621-772d-4064-af0b-319c0c1b5370"\n'
> 2014-06-16 17:49:36.445 12132 TRACE oslo.messaging.rpc.dispatcher Traceback (most recent call last):
> 2014-06-16 17:49:36.445 12132 TRACE oslo.messaging.rpc.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/rpc/dispatcher.py", line 134, in _dispatch_and_reply
> 2014-06-16 17:49:36.445 12132 TRACE oslo.messaging.rpc.dispatcher incoming.message))
> 2014-06-16 17:49:36.445 12132 TRACE oslo.messaging.rpc.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/rpc/dispatcher.py", line 177, in _dispatch
> 2014-06-16 17:49:36.445 12132 TRACE oslo.messaging.rpc.dispatcher return self._do_dispatch(endpoint, method, ctxt, args)
> 2014-06-16 17:49:36.445 12132 TRACE oslo.messaging.rpc.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/rpc/dispatcher.py", line 123, in _do_dispatch
> 2014-06-16 17:49:36.445 12132 TRACE oslo.messaging.rpc.dispatcher result = getattr(...

Read more...

Thierry Carrez (ttx)
Changed in cinder:
milestone: juno-3 → 2014.2
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.