nova-compute hangs while executing a blocking call to librbd

Bug #1606825 reported by Roman Podoliaka
Affects              Status         Importance   Assigned to       Milestone
Mirantis OpenStack   (status tracked in 10.0.x)
  10.0.x             Fix Released   High         Roman Podoliaka
  9.x                Fix Released   High         Roman Podoliaka

Bug Description

Upstream bug: https://bugs.launchpad.net/nova/+bug/1607461

While executing a call to librbd, nova-compute may hang for a while and eventually be reported as down in the nova service-list output.

strace shows that the process is stuck acquiring a mutex:

root@node-153:~# strace -p 16675
Process 16675 attached
futex(0x7fff084ce36c, FUTEX_WAIT_PRIVATE, 1, NULL

gdb shows the traceback:

http://paste.openstack.org/show/542534/

This basically means that calls to librbd (a C library) are not monkey-patched, so they cannot switch the execution context to another green thread in an eventlet-based process.

To avoid blocking the whole nova-compute process on calls to librbd, we should wrap them with tpool.execute() (http://eventlet.net/doc/threading.html#eventlet.tpool.execute).
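
A minimal sketch of the proposed wrapping, not the actual nova patch. The names blocking_rbd_call and call_without_blocking_hub are hypothetical, and the librbd call is simulated with time.sleep(), which blocks the same way as long as it is not monkey-patched. Only eventlet's tpool.execute(meth, *args, **kwargs) is the real API. As an assumption made for convenience, the sketch falls back to a direct call when eventlet is not installed.

```python
import time

try:
    from eventlet import tpool
except ImportError:
    # Assumption for this sketch only: without eventlet we call directly.
    tpool = None


def blocking_rbd_call(image_name):
    """Hypothetical stand-in for a librbd call that blocks inside C code."""
    time.sleep(0.05)  # simulates time spent in the C library
    return "size-of-%s" % image_name


def call_without_blocking_hub(func, *args, **kwargs):
    """Run func in a native thread so other green threads can keep running."""
    if tpool is None:
        return func(*args, **kwargs)
    # tpool.execute() hands the call to a pool of OS threads and makes the
    # calling green thread wait cooperatively for the result.
    return tpool.execute(func, *args, **kwargs)


result = call_without_blocking_hub(blocking_rbd_call, "volume-1")
print(result)
```

With this kind of wrapper, only the calling green thread waits for the result. Other green threads, such as the one reporting the service state, should still get scheduled.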

Tags: area-nova
tags: added: mos-nova
tags: added: area-nova
removed: mos-nova
Changed in mos:
assignee: nobody → Roman Podoliaka (rpodolyaka)
milestone: none → 10.0
status: New → Confirmed
importance: Undecided → High
Roman Podoliaka (rpodolyaka) wrote :
description: updated
Timur Nurlygayanov (tnurlygayanov) wrote :

Verified on MOS 9.1 snapshot 9.0-2016-09-11-182323
