Comment 11 for bug 1230407

Salvatore Orlando (salvatore-orlando) wrote :

It appears that it is not the db queries which deadlock, but the rpc calls which trigger these database queries.
The lock wait timeout is probably just a manifestation of the eventlet deadlock.
This gets so bad at times that all the connections in the pool appear to be taken by deadlocked threads, leaving the server with no available connections and making it totally unresponsive.
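
To illustrate the pool exhaustion described above, here is a minimal sketch. It is not Neutron code: it uses plain threads and a queue instead of eventlet and the real database connection pool, and every name in it (POOL_SIZE, stuck_worker, try_checkout) is hypothetical. Each worker checks out a connection and then blocks on a reply that never arrives, so the next caller finds the pool empty.

```python
import queue
import threading

POOL_SIZE = 3  # hypothetical pool size, chosen only for the illustration

# A toy connection pool: a queue holding placeholder "connections".
pool = queue.Queue()
for i in range(POOL_SIZE):
    pool.put(f"conn-{i}")

# Stands in for an RPC reply that never arrives (the deadlock).
reply = threading.Event()
# Lets the main thread know when a worker is holding a connection.
holding = threading.Semaphore(0)


def stuck_worker():
    conn = pool.get()   # check out a connection
    holding.release()   # report that a connection is now held
    reply.wait()        # block; the connection is not returned meanwhile
    pool.put(conn)      # only reached if the "deadlock" is ever broken


def try_checkout(timeout=0.1):
    """Return a connection, or None if the pool stays empty for `timeout` seconds."""
    try:
        return pool.get(timeout=timeout)
    except queue.Empty:
        return None


# Start one stuck worker per connection and wait until each holds one.
for _ in range(POOL_SIZE):
    threading.Thread(target=stuck_worker, daemon=True).start()
for _ in range(POOL_SIZE):
    holding.acquire()

# Any further request now finds no connection available.
starved = try_checkout()
```

With all three workers blocked, `starved` is None, which is the "server totally unresponsive" symptom in miniature. Calling `reply.set()` breaks the simulated deadlock and the connections flow back into the pool.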

This last manifestation should be the cause of the observed failure.

I think this problem is not new in Neutron; these deadlocks have been sporadically observed in the past.
There was a similar bug (https://bugs.launchpad.net/tripleo/+bug/1184484), but with some improvements in Havana-1 the issue apparently went away.

No change went into quantum in the last 3 days that would explain this. However, recently (not sure when) VPN support was added to devstack-gate. As the VPN support adds more RPC calls, which might increase the chance of a deadlock, I would first check whether removing VPN support from devstack-gate makes the issue go away.

If that is successful, I will then work on a solution which prevents this issue altogether.