neutron

Bug #1782421
Comment #3

ymadhavi@in.ibm.com (ymadhavi) wrote on 2018-07-22:

I tried applying both patches:

https://review.openstack.org/#/c/583527/
https://review.openstack.org/#/c/577739/

But I do not see much improvement. The scale test still fails after reaching a count of 120 VMs, and I am now also getting AMQP timeouts because threads have started waiting longer than before.

My environment is stable/queens with 5 compute nodes and 1 neutron server with 10 threads. The test scales to 500 virtual machines concurrently, using 10 threads.

I see that revision_plugin is called before session.flush(), and it tries to bump the revision of every object in the session that is listed in session.dirty. During port creation the network object has not actually been modified at all, but it is present in the session, so it also gets a new revision each time session.flush() is done. This is what causes other threads to end up holding a network object with an old revision, and they fail with a 500 error after exhausting their retries.
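To make the failure mode concrete, here is a minimal sketch in plain Python. It assumes the revision number acts as an optimistic concurrency check on updates; Record, bump and update_if_match are hypothetical names for illustration and are not neutron code.

```python
# Illustrative model only: these names are hypothetical, not neutron code.
class StaleRevision(Exception):
    """Raised when a writer's expected revision no longer matches."""


class Record:
    def __init__(self, name):
        self.revision = 0
        self.name = name


def bump(record):
    record.revision += 1


def update_if_match(record, expected_revision, name):
    # Compare the revision the writer saw with the current one.
    if record.revision != expected_revision:
        raise StaleRevision(
            "expected %d, found %d" % (expected_revision, record.revision))
    record.name = name
    bump(record)


network = Record("net1")
seen_by_worker_a = network.revision  # worker A reads the network

# Worker B creates a port. The network's own fields are unchanged, but
# its revision is bumped anyway because it was in B's session.
bump(network)

try:
    update_if_match(network, seen_by_worker_a, "net1-renamed")
    outcome = "ok"
except StaleRevision:
    outcome = "stale"
```

In this model worker A's update is rejected even though nobody changed the network's data, which is the kind of needless conflict that each concurrent port creation would trigger.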

As a workaround, I tried adding one more check at

https://github.com/openstack/neutron/blob/master/neutron/services/revisions/revision_plugin.py#L48

if session.is_modified(obj) and isinstance(obj, standard_attr.HasStandardAttributes):
    then bump revision.

Although this adds one more expensive operation, it tells more accurately whether the object has actually been modified or not.
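The effect of the extra check can be sketched with a small stand-alone model. FakeSession and the HasStandardAttributes class below are hypothetical stand-ins written for this illustration, not SQLAlchemy's Session or neutron's standard_attr module. The model assumes that any attribute write puts an object in the dirty set even when there is no net change, which is how SQLAlchemy documents session.dirty ("optimistic").

```python
# Illustrative model only: FakeSession and HasStandardAttributes are
# hypothetical stand-ins, not SQLAlchemy or neutron classes.
class HasStandardAttributes:
    def __init__(self, **fields):
        self.revision = 0
        self.fields = dict(fields)
        self.committed = dict(fields)  # last flushed state


class FakeSession:
    def __init__(self):
        self.dirty = set()

    def touch(self, obj, **changes):
        # Any write marks the object dirty, even with no net change.
        obj.fields.update(changes)
        self.dirty.add(obj)

    def is_modified(self, obj):
        # True only if a value differs from the last flushed state.
        return obj.fields != obj.committed


def bump_revisions(session, check_modified):
    for obj in session.dirty:
        if not isinstance(obj, HasStandardAttributes):
            continue
        if check_modified and not session.is_modified(obj):
            continue  # dirty, but nothing actually changed
        obj.revision += 1


session = FakeSession()
network = HasStandardAttributes(name="net1")
port = HasStandardAttributes(status="DOWN")
session.touch(network, name="net1")   # written, but same value
session.touch(port, status="ACTIVE")  # real change

bump_revisions(session, check_modified=True)
```

With check_modified=True only the port gets a new revision; with check_modified=False the untouched network would be bumped as well.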

With this workaround I am able to scale successfully up to 500 VMs without any errors in neutron.