[SRU] rbd calls block eventlet threads

Bug #1401335 reported by Sachi King on 2014-12-11
24
This bug affects 3 people
Affects Status Importance Assigned to Milestone
Cinder
High
Ivan Kolodyazhny
Ubuntu Cloud Archive
Medium
Matt Rae
Kilo
Medium
Unassigned

Bug Description

[Impact]

cinder-volume's rbd driver makes a call out to rbd it does not yield to eventlet, thus blocking all other processing. When this happens any pending requests are stuck unacknowledged in the rabbit queue until the current rbd task completes. This results in an unresponsive cloud presented to the user and actions such as instance creation failing due to nova timing out waiting on cinder.

[Test Case]

Steps to reproduce:
1: Create a volume that will take more than an instant to delete.
2: Delete the volume
3: Immediately attempt to create some volumes

Expected results:
Volumes create in a timely manner and become available
Volume delete processes and delete finishes in parallel

[Regression Potential]

This patch moves all rados calls to a separate python thread which
+doesn't block eventlet loop.

====

When cinder-volume's rbd driver makes a call out to rbd it does not yield to eventlet, thus blocking all other processing.
When this happens any pending requests are stuck unacknowledged in the rabbit queue until the current rbd task completes. This results in an unresponsive cloud presented to the user and actions such as instance creation failing due to nova timing out waiting on cinder.

Requirements to reproduce:
1: Ceph set up with a rbd backend
2: A single ceph-volume worker to prevent the distributed nature from masking the problem
3: A method of creating a large volume, writing to it

Steps to verify volume will trigger issue on delete:
1: Get the UUID of the volume you have created and dirtied
2: Use the rbd command on your ceph cluster to delete the volume and verify it takes a couple minutes to delete.
3: Delete the volume in cinder to cleanup cinder's database.

Steps to reproduce:
1: Create a volume that will take more than an instant to delete.
2: Delete the volume
3: Immediately attempt to create some volumes

Expected results:
Volumes create in a timely manner and become available
Volume delete processes and delete finishes in parallel

Actual results:
Volumes creations are processed after the delete has finished
Volume delete blocks threads and must process first

As RBD commands consume a fair amount of CPU time to process we should not just background the RBD commands as that would represent a DoS risk for the cinder-volume hosts.
One possible way to fix this would be to implement at least 2 queues that control the spawning of threads, reserving x of y threads for time sensitive and fast tasks.

Sachi King (nakato) on 2014-12-11
tags: added: ceph rbd
Huang Zhiteng (zhiteng-huang) wrote :

Thanks for reporting, this looks similar (not necessarily duplicate of) bug 1397264. It seems CephRBD python wrapper have a lot ot room to improve performance.

tags: added: driver
Sachi King (nakato) wrote :

Yes, I did notice the code in manager.py that causes that. Agreed I would not call this a duplicate of that bug, but it is defiantly very closely related. If we had a queue I think it would fix that issue as well.

Even if the CephRBD python wrapper needs performance improvements, I wouldn't say this issue highlights such a need.
In this case it's just the nature of Ceph, when you delete a volume you have to go find all the objects that are part of that rbd image and then tell ceph to delete those, which requires quite a bit of work.

Jay Bryant (jsbryant) on 2014-12-11
Changed in cinder:
importance: Undecided → High
Sachi King (nakato) on 2014-12-11
Changed in cinder:
assignee: nobody → Sachi King (nakato)
Sachi King (nakato) on 2014-12-16
Changed in cinder:
assignee: Sachi King (nakato) → nobody
Alan Jiang (ajiang) wrote :
Download full text (14.4 KiB)

I ran into same RBD scaling issue during cinder ceph scaling test.

This is the profiling I have done by using yappi.

profile function: delete_volume
=================================

Clock type: WALL
Ordered by: ttot, desc

name ncall tsub ttot tavg
/usr/lib/python2.6/site-packages/rbd.py:279 RBD.remove 1 9.150229 9.150295 9.150295
/usr/lib/python2.6/site-packages/rbd.py:329 Image.__init__ 2 0.221957 0.389255 0.194627
/usr/lib64/python2.6/ctypes/util.py:208 find_library 4 0.000103 0.362277 0.090569
/usr/lib64/python2.6/ctypes/util.py:184 _findSoname_ldconfig 4 0.000841 0.362174 0.090543
/usr/lib/python2.6/site-packages/rbd.py:124 load_librbd 3 0.000106 0.276124 0.092041
__builtin__.<method 'read' of 'file' objects> 4 0.238940 0.238940 0.059735
..2.6/site-packages/cinder/volume/drivers/rbd.py:297 RBDDriver._connect_to_rados 1 0.000058 0.222258 0.222258
/usr/lib/python2.6/site-packages/rados.py:196 Rados.__init__ 1 0.000413 0.221959 0.221959
/usr/lib/python2.6/site-packages/qpid/messaging/endpoints.py:580 Session._ewait 1 0.000000 0.164918 0.164918
../lib/python2.6/site-packages/qpid/messaging/endpoints.py:218 Connection._ewait 1 0.000000 0.164918 0.164918
..r/lib/python2.6/site-packages/qpid/messaging/endpoints.py:197 Connection._wait 1 0.000009 0.164918 0.164918
/usr/lib/python2.6/site-packages/qpid/messaging/endpoints.py:815 Sender._ewait 1 0.000000 0.164918 0.164918
/usr/lib/python2.6/site-packages/qpid/concurrency.py:46 Waiter.wait 1 0.000012 0.164909 0.164909
/usr/lib/python2.6/site-packages/qpid/concurrency.py:88 Condition.wait 2/1 0.000065 0.164817 0.082408
/usr/lib/python2.6/site-packages/qpid/compat.py:51 PipeWaiter.wait 2/1 0.024622 0.164742 0.082371
/usr/lib/python2.6/site-packages/rados.py:146 run_in_thread 6/3 0.000029 0.130822 0.021804
/usr/lib64/python2.6/threading.py:463 RadosThread.start 6/4 0.000017 0.123475 0.020579
/usr/lib64/python2.6/threading.py:391 _Event.wait 6/4 0.000000 0.121649 0.020275
../site-packages/cinder/volume/drivers/rbd.py:543 RBDDriver._delete_backup_snaps 1 0.000106 0.116343 0.116343
/usr/lib64/python2.6/re.py:139 search 4 0.000118 0.113237 0.028309
__builtin__.search 4 0.109828 0.109828 0.027457
/usr/lib/python2.6/site-packages/cinder/service.py:241 Service.report_state 1 0.000026 0.109518 0.109518
/usr/lib/python2.6/site-packages/rbd.py:143 RBD.__init__ 1 0.000017 0.108991 0.108991
/usr/lib/python2.6/site-packages/cinder/db/sqlalchemy/api.py:117 wrapper 4/2 0.000027 0.108968 0.027242
/usr/lib/python2.6/site-...

Changed in cinder:
assignee: nobody → Alan Jiang (ajiang)
Mike Perez (thingee) on 2015-03-10
tags: removed: rbd
Jon Bernard (jbernard) on 2015-03-25
Changed in cinder:
status: New → Confirmed

Fix proposed to branch: master
Review: https://review.openstack.org/175555

Changed in cinder:
assignee: Alan Jiang (ajiang) → Ivan Kolodyazhny (e0ne)
status: Confirmed → In Progress

Reviewed: https://review.openstack.org/175555
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=9f83602a82153e0be8c127c3d936b512327336f7
Submitter: Jenkins
Branch: master

commit 9f83602a82153e0be8c127c3d936b512327336f7
Author: Ivan Kolodyazhny <email address hidden>
Date: Mon Apr 20 22:53:14 2015 +0300

    Move RBD calls to a separate threads

    RBD is a python binding for librados which isn't patched by eventlet.

    Making long-running tasks like removing big (~100GB, ~1TB) volumes
    blocks eventlet loop and all cinder-volume service hangs
    until it finished when rados_connect_timeout is disabled. It makes
    cinder-volume services unavailable for a while.

    This patch moves all rados calls to a separate python thread which
    doesn't block eventlet loop.

    Change-Id: Ibaf43858d60e1320c339f2523b5c09c7f7c7f91e
    Closes-Bug: #1401335

Changed in cinder:
status: In Progress → Fix Committed

With this patch volumes can be deleted and created simultaneously.

Steps:

# cinder create 90 --display-name VOLUME90
# cinder delete VOLUME90
# cinder create 1 --display-name VOLUME1
# cinder create 1 --display-name VOLUME1.1
# cinder create 1 --display-name VOLUME1.2
# cinder list
+--------------------------------------+----------+--------------+------+-------------+----------+-------------+
| ID | Status | Display Name | Size | Volume Type | Bootable | Attached to |
+--------------------------------------+----------+--------------+------+-------------+----------+-------------+
| 1dd94a1f-d2bd-4b74-9416-a8758d52f811 | creating | VOLUME1.2 | 1 | None | false | |
| 25e9dab7-4c09-48ab-9c94-375b244bb15e | creating | VOLUME1 | 1 | None | false | |
| 4724f789-117e-4fb8-98c8-163fa1571e45 | deleting | VOLUME90 | 90 | None | false | |
| 542b76a6-1faf-4d9f-9a75-95ef37eba3e9 | creating | VOLUME1.1 | 1 | None | false | |
+--------------------------------------+----------+--------------+------+-------------+----------+-------------+

Deletion and creation passed successfully.

Eric Harney (eharney) wrote :

Re-opening since discussion at https://review.openstack.org/#/c/180212/ indicates that the fix in Cinder might be broken. Should investigate further.

Changed in cinder:
assignee: Ivan Kolodyazhny (e0ne) → Eric Harney (eharney)
status: Fix Committed → In Progress
Changed in cinder:
assignee: Eric Harney (eharney) → Ivan Kolodyazhny (e0ne)
Ivan Kolodyazhny (e0ne) wrote :
Changed in cinder:
status: In Progress → Fix Committed
Changed in cinder:
milestone: none → liberty-2
status: Fix Committed → Fix Released

Change abandoned by Mike Perez (<email address hidden>) on branch: master
Review: https://review.openstack.org/145678
Reason: Over a month with no update.

Thierry Carrez (ttx) on 2015-10-15
Changed in cinder:
milestone: liberty-2 → 7.0.0
Matt Rae (mattrae) on 2017-03-31
description: updated
Matt Rae (mattrae) on 2017-03-31
tags: added: sts-sponsor
Matt Rae (mattrae) wrote :

Adding debdiff for trusty-kilo

Matt Rae (mattrae) on 2017-03-31
no longer affects: cloud-archive
description: updated
tags: removed: sts-sponsor
Changed in cloud-archive:
status: New → Triaged
assignee: nobody → Matt Rae (mattrae)
importance: Undecided → Medium
description: updated
summary: - rbd calls block eventlet threads
+ [SRU] rbd calls block eventlet threads
tags: added: sts sts-sru-needed
James Page (james-page) wrote :

Accepted into kilo-proposed (automatic bug comment failed due to missing target).

Changed in cloud-archive:
status: Triaged → Invalid
tags: added: verification-kilo-needed

Hello Sachi, or anyone else affected,

Accepted cinder into kilo-proposed. The package will build now and be available in the Ubuntu Cloud Archive in a few hours, and then in the -proposed repository.

Please help us by testing this new package. To enable the -proposed repository:

  sudo add-apt-repository cloud-archive:kilo-proposed
  sudo apt-get update

Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-kilo-needed to verification-kilo-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-kilo-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Matt Rae (mattrae) wrote :

I deployed cinder-volume 1:2015.1.4-0ubuntu1.3 from proposed and followed the verification steps. During a ~2-3 minute volume deletion process cinder was still functioning. 'cinder service-list' showed all services as up and I was able to create a new volume during the long deletion.

Matt Rae (mattrae) wrote :

Noting I followed the verification steps from lp:1678275 to create cgroups to limit the I/O for ceph osds which reproduce the slow volume deletion.

tags: added: verification-kilo-done
removed: verification-kilo-needed

The verification of the Stable Release Update for cinder has completed successfully and the package has now been released to -updates. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

James Page (james-page) wrote :

This bug was fixed in the package cinder - 1:2015.1.4-0ubuntu1.3
---------------

 cinder (1:2015.1.4-0ubuntu1.3) trusty-kilo; urgency=medium
 .
   * Fix issue that upload-to-image doesn't copy os_type property
     (LP: #1692446):
     - d/p/0001-Fix-volume-upload-failure-with-glance_api_version-2.patch
 .
 cinder (1:2015.1.4-0ubuntu1.2) trusty-kilo; urgency=medium
 .
   * Move RBD calls to a separate thread to avoid blocking the main process. (LP: #1401335):
     - d/p/lp1401335-Move-RBD-calls-to-a-separate-threads.patch

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Duplicates of this bug

Other bug subscribers