Comment 5 for bug 1529937

Roman Podoliaka (rpodolyaka) wrote :

This was reproduced again. As before, there was an exception in the Nova logs:

http://paste.openstack.org/show/484699/

Looking at the code, Nova fails to get a floating IP from Neutron *right after* that floating IP was successfully created (Neutron returns 201 Created after *committing* the data to the DB).

Here we can see the two consecutive requests to neutron-server:

http://paste.openstack.org/show/484688/

I reverted the snapshot of the environment. This cluster had 3 controllers and 2 compute nodes.

Interestingly, although we use Galera in active-backup mode, we had sessions to two different MySQL nodes, as we can see in the HAProxy stats:

http://paste.openstack.org/show/484698/

or in the mysql shell:

http://paste.openstack.org/show/484701/

This happens when the active MySQL backend goes down from HAProxy's point of view, so HAProxy switches to a backup. When the active MySQL node comes back, HAProxy switches back to it, but it *does not* close the sessions already established to the backup server. Because OpenStack services use connection pools, those sessions stay open and keep being reused, so we actually end up using more than one Galera node, even though we tried to avoid that in the first place.
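
To illustrate why a pool that spans two Galera nodes can produce the failure above, here is a minimal toy model in Python. It is only a sketch under stated assumptions: the `Node` and `Cluster` classes, the `run_scenario` function, and the `fip-1` key are hypothetical and do not correspond to any Nova, Neutron, oslo.db or Galera API. The one property of Galera it relies on is that a committed write set is replicated to the other nodes at commit time but applied there asynchronously, so a read on another node may briefly not see it (unless causal reads are enforced).

```python
class Node:
    """Toy stand-in for one Galera node (hypothetical, not a real API)."""

    def __init__(self, name):
        self.name = name
        self.rows = {}          # data visible to reads on this node
        self.apply_queue = []   # replicated write sets not yet applied

    def apply_pending(self):
        """Apply all write sets that were replicated but not yet applied."""
        for key, value in self.apply_queue:
            self.rows[key] = value
        self.apply_queue.clear()


class Cluster:
    """Toy cluster: a commit is visible locally at once, elsewhere later."""

    def __init__(self, nodes):
        self.nodes = nodes

    def commit(self, node, key, value):
        node.rows[key] = value
        for other in self.nodes:
            if other is not node:
                other.apply_queue.append((key, value))


def run_scenario():
    active = Node("node-1")
    backup = Node("node-2")
    cluster = Cluster([active, backup])

    # Assumed pool state after a failover and failback: one connection
    # still points at the backup node, one at the active node.
    pool = [active, backup]

    # Request 1 (create): committed through the connection to the active node.
    cluster.commit(pool[0], "fip-1", "10.0.0.5")

    # Request 2 (get): served by the pooled connection to the backup node
    # before the write set has been applied there.
    stale = pool[1].rows.get("fip-1")

    # Once the backup node catches up, the same read succeeds.
    backup.apply_pending()
    fresh = pool[1].rows.get("fip-1")
    return stale, fresh


if __name__ == "__main__":
    print(run_scenario())
```

In this model the first read returns nothing even though the create was already committed, which matches the "not found right after 201 Created" symptom; if all pooled connections pointed at the same node, the stale read could not happen.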