Comment 2 for bug 797770

Chris Behrens (cbehrens) wrote :

I've done a lot of thinking about this one. The only plausible cause is that the greenthread making the socket calls to rabbit is not being scheduled in time to complete the AMQP protocol negotiation. Only two things could cause *that*: a bug in eventlet, or something blocking. It really must be the latter. Blocking, in turn, could only come from too much CPU spinning in Python or from some other call that blocks. We're seeing this in stress testing, and there appears to be plenty of CPU available, so some call is probably blocking.

Looking at what might be blocking, and with help from a discussion with eday, we've determined that SQLAlchemy is using a MySQL engine that makes its socket calls from a C library. eventlet cannot wrap those calls, so every MySQL query blocks the whole thread until it finishes, and no other greenthread can run in the meantime. If MySQL queries are slow for any reason, greenthread scheduling is delayed.
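
To illustrate the effect, here is a minimal sketch of one blocking call starving a cooperative scheduler. It uses the standard library's asyncio as a stand-in for eventlet, and time.sleep as a stand-in for a MySQL query made through the C library. All names (blocking_query, heartbeat, run_demo) are made up for the illustration and are not from nova.

```python
import asyncio
import time


def blocking_query(seconds):
    # Stand-in (assumption) for a MySQL query issued through a C client
    # library: it blocks the OS thread and the scheduler cannot step in.
    time.sleep(seconds)


async def heartbeat(gaps, ticks, interval):
    # Stand-in for the greenthread doing the AMQP negotiation: it asks to
    # be woken every `interval` seconds and records how long it really waited.
    for _ in range(ticks):
        start = time.monotonic()
        await asyncio.sleep(interval)
        gaps.append(time.monotonic() - start)


async def main(query_seconds):
    gaps = []
    task = asyncio.create_task(heartbeat(gaps, ticks=3, interval=0.05))
    await asyncio.sleep(0)         # let the heartbeat begin its first wait
    blocking_query(query_seconds)  # nothing else is scheduled during this call
    await task
    return gaps


def run_demo(query_seconds=0.5):
    return asyncio.run(main(query_seconds))


if __name__ == "__main__":
    print("longest heartbeat wait: %.2fs" % max(run_demo()))
```

The heartbeat asks for 0.05s waits, but its first wait lasts roughly as long as the blocking call. That is the same kind of delay that would keep the AMQP negotiation from completing in time.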

Potential solutions are:

1) Find and use a MySQL engine for SQLAlchemy that uses Python sockets. (Does one exist?)
2) Re-architect the DB layer to use a pool of threads for DB calls... assuming the C code releases the GIL during socket operations.
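
A rough sketch of what option 2 could look like, again using asyncio from the standard library as a stand-in for eventlet (eventlet ships its own thread pool helper, eventlet.tpool, which would be the natural fit there). The function names are hypothetical, and time.sleep stands in for a C-level DB call that releases the GIL while it waits on the socket, which is exactly the assumption option 2 depends on.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_query(seconds):
    # Stand-in (assumption) for a DB call inside a C library. time.sleep
    # releases the GIL, as the C code is assumed to do during socket I/O.
    time.sleep(seconds)
    return "rows"


async def heartbeat(gaps, ticks, interval):
    # Cooperative task that records how long each of its waits really took.
    for _ in range(ticks):
        start = time.monotonic()
        await asyncio.sleep(interval)
        gaps.append(time.monotonic() - start)


async def main(query_seconds):
    gaps = []
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=4) as pool:
        task = asyncio.create_task(heartbeat(gaps, ticks=3, interval=0.05))
        # The query runs in a worker thread; the scheduler keeps running.
        result = await loop.run_in_executor(pool, blocking_query, query_seconds)
        await task
    return result, gaps


def run_pooled_demo(query_seconds=0.5):
    return asyncio.run(main(query_seconds))


if __name__ == "__main__":
    result, gaps = run_pooled_demo()
    print("result=%s, longest heartbeat wait: %.2fs" % (result, max(gaps)))
```

With the query handed to a worker thread, the heartbeat's waits stay near the interval it asked for instead of stretching to the length of the query. If the C library held the GIL for the whole call, the worker thread would stall the scheduler anyway and this approach would not help.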