rbd_exclusive_cinder_pool should be set to True by default

Bug #1789793 reported by Hua Zhang
This bug affects 1 person
Affects: OpenStack Cinder Charm
Status: Fix Released
Importance: Medium
Assigned to: Unassigned
Milestone: (none)

Bug Description

Cinder starts one cinder-volume process per backend, which in turn creates an RPC (green) thread pool (default size 64). Upon creation of a new green thread (i.e. when an RPC request comes in and there are no free threads), a native thread pool of size backend_native_threads_pool_size (default 20) is created. The native threads call _get_usage_info:

./cinder-2/cinder-volume.log:2018-08-29 07:06:47.287 1073770 DEBUG cinder.volume.drivers.rbd [req-31024786-ab97-48dd-80ce-733b89f1acea 10cbcda6a7854fa79cfc37dc1945cb6d 5d5d0f0ab738467f8ca813dd41432afa - a51502c6e125414fbba0cc95decd86c5 a51502c6e125414fbba0cc95decd86c5] Image volume-1f3aa3d5-5639-4a68-be07-14f3214320c6 is not found. _get_usage_info /usr/lib/python2.7/dist-packages/cinder/volume/drivers/rbd.py:409

Currently rbd_exclusive_cinder_pool defaults to False. When the above log appears, the native thread has raised an exception but does not explicitly return (i.e. there is no try/finally block to clean it up), so the native thread hangs forever and never yields back to its calling green thread; as a result it is not possible to create/delete/view volumes while this is happening.
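
To make the failure mode concrete, here is a minimal, self-contained Python sketch (not Cinder code; the names, pool size and blocking mechanism are illustrative only) of how a fixed-size native thread pool stalls once its workers never return:

# Illustrative sketch only (not Cinder code): a small native thread pool
# whose workers never return; later work queues forever behind them.
import threading
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

POOL_SIZE = 2  # stands in for backend_native_threads_pool_size (default 20)
pool = ThreadPoolExecutor(max_workers=POOL_SIZE)

hang_forever = threading.Event()  # never set until the cleanup at the end

def stat_one_image(name):
    # Stand-in for the per-image work done by _get_usage_info. If the
    # worker blocks (or raises without being cleaned up), its pool slot
    # is effectively lost for good.
    hang_forever.wait()
    return name

# Fill every slot with hung workers, as happens once enough images hit
# the failure path in the real driver.
for i in range(POOL_SIZE):
    pool.submit(stat_one_image, "volume-%d" % i)

time.sleep(0.1)  # let the hung workers occupy every slot

# Any further request now waits behind the hung workers and never runs --
# the analogue of volume create/delete/list operations stalling.
stuck = pool.submit(stat_one_image, "volume-new")
try:
    stuck.result(timeout=1)
except FutureTimeout:
    print("new request is stuck behind the hung workers")

hang_forever.set()  # only so this sketch can exit; the real hang has no escape
pool.shutdown()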

Besides, setting rbd_exclusive_cinder_pool=True also reduces the load on the Ceph cluster as well as on the volume service, so that operations such as volume deletion can be sped up as well.

So I suggest setting rbd_exclusive_cinder_pool=True by default to avoid these problems.
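
For reference, this is roughly how the option would look in the rendered cinder.conf for an RBD backend (the backend section name and pool name below are examples, not the charm's actual values):

# cinder.conf -- backend section and pool names are illustrative
[ceph]
volume_driver = cinder.volume.drivers.rbd.RBDDriver
rbd_pool = cinder-ceph
# Assume the pool is used exclusively by Cinder, so usage is calculated
# from Cinder's own data instead of querying every image in the pool.
rbd_exclusive_cinder_pool = True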

Xav Paice (xavpaice)
tags: added: canonical-bootstack
Revision history for this message
Edward Hope-Morley (hopem) wrote :

This is really just a stop-gap to reduce the likelihood of a race but isn't going to make any guarantees. The RBD driver clearly has fundamental problems in the way it uses a mix of green and native threads for RBD operations, and that is a more complex problem that needs to be resolved in order for these issues to fully go away.

Revision history for this message
Felipe Reyes (freyes) wrote :

> So I suggest set rbd_exclusive_cinder_pool=True as default to avoid those problems.

I don't think this is a good idea, because most of the clouds deployed with the charms also deploy ceph-radosgw, so the stats won't be accurate, and in "close to full" clusters we may end up with problems.

Felipe Reyes (freyes)
tags: added: sts
James Page (james-page)
Changed in charm-cinder:
status: New → Opinion
importance: Undecided → Medium
Revision history for this message
Trent Lloyd (lathiat) wrote :

This change was already made, but it was marked against LP #1789828 instead. Marking Fix Released.

Changed in charm-cinder:
status: Opinion → Fix Released