[rbd] When cinder-volume start, it take hours before state become 'up'

Bug #1707936 reported by Mehdi Abaakouk on 2017-08-01
This bug affects 2 people
Affects: Cinder · Importance: Undecided · Assigned to: Mehdi Abaakouk

Bug Description

Hi,

Since this revert: https://review.openstack.org/#/c/483298/1

Each time cinder-volume restarts, it takes hours before its state becomes 'up'.

During this warmup, the API throws a ton of MessagingTimeout errors.

Cinder is just unusable.

Regards,

Mehdi Abaakouk (sileht) wrote :

The rbd pool has only ~200 volumes.

rbd.Image().diff_iterate() looks at every object of every volume across all Ceph OSDs.

This shouldn't be used to get volume stats at startup and periodically.
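To illustrate why the scan is so expensive: diff_iterate() invokes a callback once per extent, so its cost grows with the number of RADOS objects in the image. A minimal sketch of the accumulation pattern follows; the helper name and the synthetic extent list are hypothetical, and with the real rbd bindings the callback would be driven by rbd.Image.diff_iterate() as shown in the comment.

```python
# Hypothetical sketch of the accumulation that diff_iterate() drives:
# librbd calls the callback once per extent, and the caller sums the
# lengths of the allocated ones.

def make_extent_accumulator():
    """Return (callback, total); the callback matches the signature
    rbd.Image.diff_iterate passes: (offset, length, exists)."""
    total = [0]

    def cb(offset, length, exists):
        if exists:  # extent is allocated in the image
            total[0] += length

    return cb, total

# With the real bindings this would be driven by:
#   with rbd.Image(ioctx, name, read_only=True) as image:
#       image.diff_iterate(0, image.size(), None, cb)
# Here we feed the callback a synthetic extent stream instead.
cb, total = make_extent_accumulator()
for offset, length, exists in [(0, 4 << 20, True),
                               (8 << 20, 4 << 20, False),
                               (16 << 20, 4 << 20, True)]:
    cb(offset, length, exists)
print(total[0])  # 8388608 bytes allocated across two 4 MiB extents
```

Without fast-diff, librbd has to issue a read for every object just to learn whether that extent exists, which is where the hours go.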

Mehdi Abaakouk (sileht) wrote :

The debugging experience for this issue is also terrible: nothing appears in the logs during the volume_stats update.

The state is 'up'. You make an API call, like attaching a volume. The state becomes 'down'. The API gets a MessagingTimeout and nothing else.

I only understood the issue after enabling debug logging and seeing that cinder-volume was logging nothing but "connecting to ceph (timeout=-1). _connect_to_rados".

The message appears every 1 to 10 seconds (I'm guessing the interval depends on the size of the volume).

I manually added some extra logging and found that cinder-volume was stuck inside the _get_usage_info() loop.

Eric Harney (eharney) on 2017-08-01
tags: added: drivers rbd
summary: - When cinder-volume start, it take hours before state become 'up'
+ [rbd] When cinder-volume start, it take hours before state become 'up'

Fix proposed to branch: master
Review: https://review.openstack.org/489718

Changed in cinder:
assignee: nobody → Mehdi Abaakouk (sileht)
status: New → In Progress
Mehdi Abaakouk (sileht) wrote :

v.diff_iterate() without the fast-diff and object-map features is really slow, because it has to dig into every RADOS object to get the information.

So on old deployments, using diff_iterate just makes cinder-volume unusable.
The process is stuck in this loop just to compute a size...

If you continue to use this, you should document somewhere that these features are "required". Maybe cinder could even enable them itself, but rebuilding the object map can be slow on big volumes.
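A sketch of the feature check being suggested: the bit values below match the librbd constants (rbd.RBD_FEATURE_OBJECT_MAP, rbd.RBD_FEATURE_FAST_DIFF), and with the real bindings the mask would be read with rbd.Image(ioctx, name).features(); the helper name and sample masks are hypothetical.

```python
# Sketch of checking whether diff_iterate() will be fast on an image.
# Bit values match the librbd feature constants; with the real bindings
# the mask comes from rbd.Image(ioctx, name).features().
RBD_FEATURE_OBJECT_MAP = 1 << 3  # 8
RBD_FEATURE_FAST_DIFF = 1 << 4   # 16

def has_fast_diff(feature_mask):
    """True if both object-map and fast-diff are enabled on the image."""
    needed = RBD_FEATURE_OBJECT_MAP | RBD_FEATURE_FAST_DIFF
    return feature_mask & needed == needed

# 61 is the usual default feature set on recent Ceph releases
# (layering + exclusive-lock + object-map + fast-diff + deep-flatten).
print(has_fast_diff(61))  # True
print(has_fast_diff(1))   # False: layering only, diff_iterate is slow

# Enabling the features afterwards is possible, but rebuilding the
# object map is itself slow on big volumes:
#   rbd feature enable <pool>/<image> exclusive-lock object-map fast-diff
#   rbd object-map rebuild <pool>/<image>
```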

Gorka Eguileor (gorka) on 2017-08-04
tags: added: ceph
Gorka Eguileor (gorka) wrote :

After looking into the matter of the stats, I've realized that the RBD stats are wrong: we shouldn't be returning physical used space in the first place, as I explain in the improve-provisioning spec [1].

I have proposed a patch on master [2] to fix our incorrect stats report, which will also take care of this issue.

[1] https://review.openstack.org/490116
[2] https://review.openstack.org/486734
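A minimal sketch of the cheaper direction described above, assuming the stats report provisioned (virtual) rather than physical capacity: summing image sizes is one metadata read per image, while physical usage via diff_iterate() walks every object. The helper name and sizes are hypothetical; with the real bindings each size would come from rbd.Image(ioctx, name).size().

```python
# Hypothetical sketch: provisioned capacity from per-image virtual
# sizes, one cheap metadata read per image instead of a full object
# scan per image.

def provisioned_gib(image_sizes_bytes):
    """Total provisioned capacity in GiB from per-image virtual sizes."""
    return sum(image_sizes_bytes) / float(1 << 30)

# Three thin-provisioned 10 GiB volumes: provisioned capacity is 30 GiB
# no matter how little data has actually been written to them.
print(provisioned_gib([10 << 30, 10 << 30, 10 << 30]))  # 30.0
```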

Change abandoned by Mehdi Abaakouk (sileht) (<email address hidden>) on branch: stable/ocata
Review: https://review.openstack.org/489630
Reason: better solution: https://review.openstack.org/#/c/486734/

Change abandoned by Mehdi Abaakouk (sileht) (<email address hidden>) on branch: master
Review: https://review.openstack.org/489718
Reason: better solution: https://review.openstack.org/#/c/486734/

Mohammed Naser (mnaser) wrote :

I'd like to follow up on this issue. With the new changes it gets better, but it is still problematic.

_get_usage_info() loops over all RBD volumes, which can be very slow for a large cloud. Because of this, the cinder-volume process pretty much stops responding, and it also consumes a lot of CPU.

I think it would be ideal if the usage calculation were done in a separate thread so it doesn't block everything. Alternatively, perhaps add an option to make it opt-in for operators who manage capacity externally.
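The background-thread suggestion can be sketched as follows, using plain threading rather than cinder's eventlet machinery (the class name is hypothetical): the slow scan runs on a worker thread, and the stats reader only ever returns the last completed result, so the reporting loop never blocks on the scan.

```python
# Sketch of the background-refresh idea with plain threading (cinder
# itself uses eventlet; UsageReporter is a hypothetical name). The slow
# per-volume scan runs on a worker thread; stats() returns the last
# completed result without waiting.
import threading
import time

class UsageReporter:
    def __init__(self, compute_usage):
        self._compute = compute_usage  # the slow per-volume scan
        self._lock = threading.Lock()
        self._last = None              # last completed result

    def refresh_async(self):
        threading.Thread(target=self._refresh, daemon=True).start()

    def _refresh(self):
        result = self._compute()       # may take minutes on a big pool
        with self._lock:
            self._last = result

    def stats(self):
        with self._lock:               # cheap: never waits for the scan
            return self._last

reporter = UsageReporter(lambda: {"volumes": 200})
reporter.refresh_async()
time.sleep(0.2)  # give the worker time to finish in this toy example
print(reporter.stats())  # {'volumes': 200}
```

Until the first refresh completes, stats() returns None (or stale data after later restarts), which is exactly the trade-off of decoupling reporting from scanning.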
