CleanableInUse exceptions when doing large parallel operations (like snapshot creates)

Bug #1837403 reported by Mohammed Naser
This bug affects 1 person
Affects             Status        Importance  Assigned to     Milestone
Cinder              Invalid       Undecided   Unassigned
OpenStack-Ansible   Fix Released  High        Mohammed Naser
  Rocky             New           Undecided   Unassigned
  Stein             Fix Released  Undecided   Mohammed Naser
  Train             Fix Released  High        Mohammed Naser

Bug Description

This is being observed across a deployment with a large number of parallel Cinder operations.

2019-07-04 17:03:00.305 31260 ERROR oslo_messaging.rpc.server [req-88b9a084-f906-43c8-af5f-b2fceff94d81 a4a021ef6fce14d8c4e4ffe198a04d0426a11832b8821e9ee78f360bef7bfdc7 7d91af2e9c8c4dc392f61437ad932ba2 - d70a4991744248d9a6733356e668dfc6 d70a4991744248d9a6733356e668dfc6] Exception during message handling: CleanableInUse: Snapshot with id b72d4592-1b2b-4ca1-ae82-8390ce2a6516 is already being cleaned up or another host has taken over it.
2019-07-04 17:03:00.305 31260 ERROR oslo_messaging.rpc.server Traceback (most recent call last):
2019-07-04 17:03:00.305 31260 ERROR oslo_messaging.rpc.server File "/openstack/venvs/cinder-17.1.9/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 163, in _process_incoming
2019-07-04 17:03:00.305 31260 ERROR oslo_messaging.rpc.server res = self.dispatcher.dispatch(message)
2019-07-04 17:03:00.305 31260 ERROR oslo_messaging.rpc.server File "/openstack/venvs/cinder-17.1.9/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 220, in dispatch
2019-07-04 17:03:00.305 31260 ERROR oslo_messaging.rpc.server return self._do_dispatch(endpoint, method, ctxt, args)
2019-07-04 17:03:00.305 31260 ERROR oslo_messaging.rpc.server File "/openstack/venvs/cinder-17.1.9/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 190, in _do_dispatch
2019-07-04 17:03:00.305 31260 ERROR oslo_messaging.rpc.server result = func(ctxt, **new_args)
2019-07-04 17:03:00.305 31260 ERROR oslo_messaging.rpc.server File "<decorator-gen-243>", line 2, in create_snapshot
2019-07-04 17:03:00.305 31260 ERROR oslo_messaging.rpc.server File "/openstack/venvs/cinder-17.1.9/lib/python2.7/site-packages/cinder/objects/cleanable.py", line 204, in wrapper
2019-07-04 17:03:00.305 31260 ERROR oslo_messaging.rpc.server cleanable.set_worker()
2019-07-04 17:03:00.305 31260 ERROR oslo_messaging.rpc.server File "/openstack/venvs/cinder-17.1.9/lib/python2.7/site-packages/cinder/objects/cleanable.py", line 150, in set_worker
2019-07-04 17:03:00.305 31260 ERROR oslo_messaging.rpc.server id=self.id)
2019-07-04 17:03:00.305 31260 ERROR oslo_messaging.rpc.server CleanableInUse: Snapshot with id b72d4592-1b2b-4ca1-ae82-8390ce2a6516 is already being cleaned up or another host has taken over it.

Revision history for this message
Mohammed Naser (mnaser) wrote :

Clarification doc change posted

https://review.opendev.org/#/c/672054/

Changed in cinder:
status: New → Invalid
Revision history for this message
Mohammed Naser (mnaser) wrote :

Setting to invalid; this happened because there were multiple cinder services configured with the same 'host'.
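
For anyone checking their own deployment for the same condition, here is a minimal sketch of the idea, not anything Cinder itself runs. It assumes you have already collected each node's effective `backend_host` value into a dict; the node names, the `find_shared_hosts` helper, and the data layout are all hypothetical.

```python
from collections import defaultdict


def find_shared_hosts(node_settings):
    """Return {backend_host: [nodes]} for every host value set on 2+ nodes.

    node_settings is a hypothetical mapping of node name to a dict of
    that node's effective cinder-volume settings.
    """
    claims = defaultdict(list)
    for node, settings in node_settings.items():
        host = settings.get("backend_host")
        if host:  # nodes without an explicit backend_host are ignored
            claims[host].append(node)
    # Keep only host values that more than one node claims.
    return {host: sorted(nodes) for host, nodes in claims.items() if len(nodes) > 1}


# Hypothetical three-controller deployment sharing one backend_host.
example = {
    "controller1": {"backend_host": "rbd"},
    "controller2": {"backend_host": "rbd"},
    "controller3": {"backend_host": "rbd"},
}
print(find_shared_hosts(example))
# {'rbd': ['controller1', 'controller2', 'controller3']}
```

A non-empty result means several cinder-volume services are presenting themselves under one 'host', which is the situation described in this comment.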

Changed in openstack-ansible:
status: New → Confirmed
assignee: nobody → Mohammed Naser (mnaser)
importance: Undecided → High
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to openstack-ansible-os_cinder (master)

Fix proposed to branch: master
Review: https://review.opendev.org/672078

Changed in openstack-ansible:
status: Confirmed → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to openstack-ansible-os_cinder (master)

Reviewed: https://review.opendev.org/672078
Committed: https://git.openstack.org/cgit/openstack/openstack-ansible-os_cinder/commit/?id=c148d77e29af6faebc1c9b012ae08aed447cd179
Submitter: Zuul
Branch: master

commit c148d77e29af6faebc1c9b012ae08aed447cd179
Author: Mohammed Naser <email address hidden>
Date: Mon Jul 22 12:10:01 2019 -0400

    rbd: start using active-active

    This patch drops the hacky workaround of using backend_host which
    is not recommended by the Cinder team and instead uses active-active
    RBD which has been implemented since Rocky.

    Closes-Bug: #1837403
    Change-Id: I0c8aed4d0608c1f117e1baa1f428875956159ffd
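
As a rough illustration of what the commit message describes, the sketch below parses two sample cinder.conf fragments and reports whether a backend section still sets `backend_host` and whether `[DEFAULT]` sets `cluster`. The sample fragments, the cluster name `ceph`, and the `audit_cinder_conf` helper are assumptions made for this example; they are not copied from the patch, so check the rendered configuration on your own hosts.

```python
import configparser


def audit_cinder_conf(text):
    """Report the [DEFAULT] cluster value and backend sections using backend_host."""
    # Use a dummy default section so [DEFAULT] is parsed as an ordinary
    # section and its options are not inherited by the backend sections.
    cp = configparser.ConfigParser(default_section="__unused__", interpolation=None)
    cp.read_string(text)
    cluster = cp.get("DEFAULT", "cluster", fallback=None)
    pinned = sorted(
        s for s in cp.sections()
        if s != "DEFAULT" and cp.has_option(s, "backend_host")
    )
    return {"cluster": cluster, "backend_host_sections": pinned}


# Hypothetical "before" fragment: every node pins the same backend_host.
OLD = """
[DEFAULT]
enabled_backends = rbd

[rbd]
volume_driver = cinder.volume.drivers.rbd.RBDDriver
backend_host = rbd
"""

# Hypothetical "after" fragment: nodes join one active-active cluster.
NEW = """
[DEFAULT]
enabled_backends = rbd
cluster = ceph

[rbd]
volume_driver = cinder.volume.drivers.rbd.RBDDriver
"""

print(audit_cinder_conf(OLD))
print(audit_cinder_conf(NEW))
```

With the "after" layout each cinder-volume service keeps its own host name and the services are grouped by the shared cluster name instead of impersonating a single host.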

Changed in openstack-ansible:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to openstack-ansible-os_cinder (stable/stein)

Fix proposed to branch: stable/stein
Review: https://review.opendev.org/674610

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to openstack-ansible-os_cinder (stable/stein)

Reviewed: https://review.opendev.org/674610
Committed: https://git.openstack.org/cgit/openstack/openstack-ansible-os_cinder/commit/?id=918b9077c816be5fc056637301265e0be2f245ab
Submitter: Zuul
Branch: stable/stein

commit 918b9077c816be5fc056637301265e0be2f245ab
Author: Mohammed Naser <email address hidden>
Date: Mon Jul 22 12:10:01 2019 -0400

    rbd: start using active-active

    This patch drops the hacky workaround of using backend_host which
    is not recommended by the Cinder team and instead uses active-active
    RBD which has been implemented since Rocky.

    Closes-Bug: #1837403
    Change-Id: I0c8aed4d0608c1f117e1baa1f428875956159ffd
    (cherry picked from commit c148d77e29af6faebc1c9b012ae08aed447cd179)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on cinder (master)

Change abandoned by Mohammed Naser (<email address hidden>) on branch: master
Review: https://review.opendev.org/672054

no longer affects: openstack-ansible/trunk
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/openstack-ansible-os_cinder stein-eol

This issue was fixed in the openstack/openstack-ansible-os_cinder stein-eol release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/openstack-ansible-os_cinder train-eol

This issue was fixed in the openstack/openstack-ansible-os_cinder train-eol release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/openstack-ansible-os_cinder ussuri-eol

This issue was fixed in the openstack/openstack-ansible-os_cinder ussuri-eol release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/openstack-ansible-os_cinder yoga-eom

This issue was fixed in the openstack/openstack-ansible-os_cinder yoga-eom release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/openstack-ansible-os_cinder victoria-eom

This issue was fixed in the openstack/openstack-ansible-os_cinder victoria-eom release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/openstack-ansible-os_cinder wallaby-eom

This issue was fixed in the openstack/openstack-ansible-os_cinder wallaby-eom release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/openstack-ansible-os_cinder xena-eom

This issue was fixed in the openstack/openstack-ansible-os_cinder xena-eom release.
