Running eventlet.monkey_patch in nova_api breaks the AMQP heartbeat thread

Bug #1827744 reported by Damien Ciabrini
This bug affects 1 person
Affects: tripleo
Status: In Progress
Importance: High
Assigned to: Damien Ciabrini

Bug Description

As currently discussed in https://bugs.launchpad.net/nova/+bug/1825584, it seems that some work was done in nova recently [0] to fix the usage of monkey patching in wsgi and non-wsgi nova applications, in order to avoid some infinite recursion errors.

As a side effect of this work, nova_api now calls eventlet.monkey_patch() when it runs under mod_wsgi. This breaks the AMQP heartbeat monitoring thread, because that thread waits on a data structure [1] that has been monkey patched [2], which makes it yield its execution instead of sleeping for 15s.
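To illustrate the pattern described above, here is a minimal sketch of a heartbeat loop that sleeps by waiting on an event with a timeout. It uses only the standard library; the names heartbeat_loop, send_heartbeat and exit_event are hypothetical and this is not the oslo.messaging code. The assumption is that the real thread does an Event-style wait similar to this one, so that once the event class is replaced by a green (monkey patched) variant, the wait becomes a cooperative yield rather than a real OS-level sleep.

```python
import threading


def heartbeat_loop(send_heartbeat, exit_event, interval=15.0):
    """Call send_heartbeat() every `interval` seconds until exit_event is set.

    Hypothetical sketch: in a native thread, exit_event.wait() blocks in the
    OS for up to `interval` seconds. If the event type were a green
    (eventlet-patched) one, the same call would instead yield to the eventlet
    hub, and the loop would only resume when the hub gets to run again.
    """
    beats = 0
    while not exit_event.is_set():
        send_heartbeat()
        beats += 1
        # Sleep until the next heartbeat is due, or return early on shutdown.
        exit_event.wait(timeout=interval)
    return beats
```

With a native threading.Event the loop wakes up on its own after each interval, which is the behaviour the heartbeat monitoring relies on.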

Because mod_wsgi stops the execution of its embedded interpreter, the AMQP heartbeat thread can't be resumed until there is a message to be processed in the mod_wsgi queue. This causes long idle periods, which in turn makes rabbitmq close the AMQP connection, and makes nova_api log warnings and reconnect.
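The effect of those idle periods can be modelled with a small sketch. It is only an illustration under assumptions: wakeups is a hypothetical list of times (in seconds) at which the interpreter actually runs and a heartbeat can be sent, and broker_timeout is a hypothetical threshold after which the broker gives up on a silent connection (60s is used here as an example value, not taken from this report).

```python
def longest_silence(wakeups, start=0.0):
    """Return the longest gap, in seconds, between consecutive wakeups."""
    longest = 0.0
    last = start
    for t in sorted(wakeups):
        longest = max(longest, t - last)
        last = t
    return longest


def broker_would_close(wakeups, broker_timeout=60.0):
    """True if some silence lasts longer than the (assumed) broker timeout."""
    return longest_silence(wakeups) > broker_timeout


# Heartbeat thread waking up every 15s on its own.
healthy = [15.0 * i for i in range(1, 9)]
# Heartbeat thread resumed only when a request reaches mod_wsgi.
stalled = [15.0, 400.0]
```

In this model the healthy schedule never stays silent for more than 15s, while the stalled one stays silent for 385s and would exceed the assumed threshold.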

Note: other services like heat-api do not use monkey patching and aren't affected, so this seems to confirm that monkey patching shouldn't happen in nova_api running under mod_wsgi in the first place.

[0] https://review.opendev.org/#/c/626952/
[1] https://github.com/openstack/oslo.messaging/blob/master/oslo_messaging/_drivers/impl_rabbit.py#L904
[2] https://github.com/openstack/oslo.utils/blob/master/oslo_utils/eventletutils.py#L182

summary:
- Running eventlet.money_patch in nova_api breaks the AMQP heartbeat thread
+ Running eventlet.monkey_patch in nova_api breaks the AMQP heartbeat thread
description: updated
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (master)

Fix proposed to branch: master
Review: https://review.opendev.org/657168

Changed in tripleo:
assignee: nobody → Damien Ciabrini (dciabrin)
status: New → In Progress
Damien Ciabrini (dciabrin) wrote :

FWIW, it looks to me like the introduction of monkey patching in lp#1787331 to run asynchronous tasks under mod_wsgi is not a good idea, because IMHO the concurrency model of mod_wsgi was not designed to do that.

The side effects of introducing async in mod_wsgi are discussed in http://lists.openstack.org/pipermail/openstack-discuss/2019-April/005310.html

The proposed fix breaks the async tasks under mod_wsgi but I feel that mod_wsgi wasn't meant to allow such workflows in the first place.

OpenStack Infra (hudson-openstack) wrote : Change abandoned on tripleo-heat-templates (master)

Change abandoned by Damien Ciabrini (<email address hidden>) on branch: master
Review: https://review.opendev.org/657168
Reason: Abandoning the review since removing monkey-patch is not working and is not the solution as agreed.
