metadata server running on mod_wsgi blocks on rpc.call to nova-conductor

Bug #1246623 reported by Alvaro Lopez
This bug affects 3 people
Affects: OpenStack Compute (nova) | Status: Won't Fix | Importance: Medium | Assigned to: Unassigned
Affects: oslo-incubator | Status: Won't Fix | Importance: Undecided | Assigned to: Unassigned

Bug Description

Hi there.

We are running all the nova APIs (osapi, ec2, metadata) under mod_wsgi on Apache. With the Havana upgrade, the metadata server (which started using nova-conductor) stopped working, although it works perfectly outside Apache. After further debugging, I found that it blocks in the get from the dataqueue of the RPC client: https://github.com/openstack/nova/blob/stable/havana/nova/openstack/common/rpc/amqp.py#L526

Our workaround was to call eventlet.monkey_patch() in our WSGI script. Shouldn't the monkey patch be applied in the amqp.py module directly?
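For readers who want to try the workaround described above, here is a minimal sketch of a WSGI entry script that applies the patch before anything else is imported. It is an illustration only: build_application() is a hypothetical placeholder, not nova's loader, and the try/except guard exists only so the sketch can be dry-run on a machine without eventlet. A deployment that relies on the workaround needs the import to succeed.

```python
# Hypothetical WSGI entry script, e.g. the file mod_wsgi or uwsgi is pointed at.
# Assumption: eventlet is installed in the same environment as the API code.
try:
    import eventlet
except ImportError:
    # Guard only so this sketch can be dry-run without eventlet.
    eventlet = None

if eventlet is not None:
    # Patch the standard library before any other module imports
    # socket, threading, time and so on.
    eventlet.monkey_patch()


def build_application():
    """Placeholder for loading the real API application (hypothetical helper)."""
    def app(environ, start_response):
        body = b"ok\n"
        start_response("200 OK", [("Content-Type", "text/plain"),
                                  ("Content-Length", str(len(body)))])
        return [body]
    return app


# mod_wsgi looks for a module-level callable named "application" by default.
application = build_application()
```

The important part is the ordering: the monkey_patch() call sits at the very top of the script, ahead of any import that could open files or network connections.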

Tags: api
Matt Riedemann (mriedem)
tags: added: conductor
Doug Hellmann (doug-hellmann) wrote :

I think the app needs to set up the monkey patch call before anything else tries to use any files or network connections. Doing it inside amqp.py is too low level.

Changed in oslo:
status: New → Won't Fix
Russell Bryant (russellb) wrote :

It seems like this would affect the regular API service, too. We are only running eventlet.monkey_patch() if you're running the API via one of the nova commands (nova-api, nova-api-metadata, nova-api-os-compute, nova-api-ec2).

Changed in nova:
status: New → Confirmed
tags: added: api
Changed in nova:
importance: Undecided → Medium
Nathanael Burton (mathrock) wrote :

Russell, I can confirm this affects both the nova osapi and metadata APIs when trying to run them in a web server. I can recreate the problem on both Apache httpd with mod_wsgi and Nginx with uwsgi. This is running stable/havana. Interestingly, it doesn't affect the entire nova API. I can perform most API commands; however, 'nova console-log' seems to hang consistently with this issue. The get_console_output rpc.call goes out, hits the nova-compute node, and returns to the API caller with the console log result, but the API side hangs until rpc_response_timeout expires and then returns an error to the client.
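The hang-until-timeout symptom described above can be modelled with a blocking get on a reply queue. This is a simplified sketch and not nova's RPC code: the standard-library queue stands in for the RPC client's data queue, and wait_for_reply and ReplyTimeout are hypothetical names chosen for illustration.

```python
import queue
import threading


class ReplyTimeout(Exception):
    """Raised when no reply arrives in time (hypothetical name)."""


def wait_for_reply(data_queue, timeout):
    """Block on the reply queue, giving up after `timeout` seconds."""
    try:
        return data_queue.get(timeout=timeout)
    except queue.Empty:
        raise ReplyTimeout("no reply within %.1f seconds" % timeout)


if __name__ == "__main__":
    replies = queue.Queue()

    # Case 1: another thread delivers the reply, so the caller wakes up.
    threading.Timer(0.1, replies.put, args=("console log text",)).start()
    print(wait_for_reply(replies, timeout=1.0))

    # Case 2: the reply never reaches the waiting caller, so the call
    # blocks for the full timeout and then surfaces an error.
    try:
        wait_for_reply(replies, timeout=0.2)
    except ReplyTimeout as exc:
        print("timed out:", exc)
```

In the second case the caller only learns of the failure once the timeout elapses, which matches the behaviour reported for 'nova console-log'.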

I've had some success with adding eventlet.monkey_patch(os=False) to the WSGI file that the web server uses. However, whereas without the monkey_patch get_console_output always hangs and times out in the same place, with the monkey_patch it only sometimes works. Usually the first few requests succeed, and then all subsequent ones hang in different places within the RPC code (this seems like a side effect of the monkey_patch). I haven't tried testing this against icehouse yet, but I would expect the same behavior.

Nathanael Burton (mathrock) wrote :

Just tested Apache + mod_wsgi against the current master (icehouse) in devstack, and both the nova osapi and metadata APIs seem to be working as expected.

Dan Smith (danms) wrote :

Removed the conductor tag since this is eventlet-related. It fails on any RPC, which happens to be a conductor call in the metadata case.

tags: removed: conductor
Changed in nova:
status: Confirmed → Won't Fix