Comment 1 for bug 1253993

Salvatore Orlando (salvatore-orlando) wrote :

A loop takes a massive amount of time to complete because of the number of calls sent from the neutron server, all of which the agent needs to handle.

In some cases, a single tempest run (isolated and parallel) produced about 1,000 incoming requests, which in turn resulted in about 1,500 calls from the agent to neutron-server.

In particular, calls for security group updates and port updates trigger refresh_firewall, which is a rather expensive call.
In some cases as many as 20 threads were running refresh_firewall concurrently; all of these threads synchronize on a semaphore for iptables.

The number of calls is currently being brought down in https://review.openstack.org/#/c/57420, by:
- ensuring messages are sent from the server to the client only when really necessary
- reworking message handling in the agent so that it reacts to notifications in the main RPC loop rather than immediately when each message is received, thus avoiding concurrent execution of methods which will end up making exactly the same changes to iptables
- grouping calls from the agent to the server where possible (e.g.: send a single request for device details instead of a request for each device)
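
To illustrate the second and third points, here is a minimal sketch of the idea. It is not the code from the review: the class DeferredRefreshAgent and the callables fetch_details and refresh_firewall passed into it are hypothetical stand-ins. Notification handlers only record which device changed; the main loop later drains the set, asks the server for all device details in one request, and refreshes the firewall once.

```python
import threading


class DeferredRefreshAgent:
    """Hypothetical agent skeleton; names are illustrative, not Neutron's."""

    def __init__(self, fetch_details, refresh_firewall):
        self._lock = threading.Lock()
        self._pending = set()  # device ids touched since the last loop pass
        self._fetch_details = fetch_details        # assumed: takes a list of ids
        self._refresh_firewall = refresh_firewall  # assumed: takes the details

    def on_notification(self, device_id):
        # Runs in the message-handling thread: record the device and return.
        # No iptables work happens here, so handlers never pile up on a lock.
        with self._lock:
            self._pending.add(device_id)

    def loop_iteration(self):
        # Runs in the main loop: atomically take everything queued so far.
        with self._lock:
            devices, self._pending = self._pending, set()
        if not devices:
            return 0
        details = self._fetch_details(sorted(devices))  # one grouped request
        self._refresh_firewall(details)                 # one refresh per pass
        return len(devices)
```

With this shape, twenty notifications that arrive between two loop passes cost one server request and one firewall refresh instead of twenty of each, at the price of applying changes up to one loop interval later.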

Leveraging threads or external processes for tasks which do not have to be synchronous with port processing is also currently being evaluated.