I don't think this actually changed anything, and I think we should revert it.
The part that exponentially increases is the amount of time it will wait for a response from the server, not the amount of time it backs off.
Consider this (verified behavior on devstack):
Server goes offline
Agent report state (start timer for 60 seconds)
Agent report state timeout exception
Agent sleeps random(0, rpc.TRANSPORT.conf.rpc_response_timeout)[1]
Agent report state (start timer for 120 seconds)
Agent report state timeout exception
Agent sleeps random(0, rpc.TRANSPORT.conf.rpc_response_timeout)
Agent report state (start timer for 240 seconds)
Server resumes 150 seconds into that wait
Server processes messages in 'reports' queue.
Agent gets report state response.
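To make the timeline above easier to poke at, here is a minimal sketch of the same loop in Python. It is a simplified model, not the neutron code: simulate_report_state, server_up_at and sleep_fn are hypothetical names, the 60 second base stands in for rpc_response_timeout, and I am assuming the server comes back 150 seconds into the third wait (the third call cannot start earlier than 180 seconds after the outage begins).

```python
import random


def simulate_report_state(server_up_at, base_timeout=60.0, sleep_fn=None):
    """Replay the report_state retry loop against a server that is offline
    from t=0 until t=server_up_at (seconds).

    Returns a list of (sent_at, timeout, answered) tuples, one per attempt.
    """
    if sleep_fn is None:
        # Mirrors the timeline: sleep random(0, rpc_response_timeout).
        sleep_fn = lambda: random.uniform(0, base_timeout)

    now = 0.0
    timeout = base_timeout
    attempts = []
    while True:
        deadline = now + timeout
        # Model assumption: the request stays queued until its deadline, so
        # the server answers it if it comes back at any point before then.
        answered = server_up_at <= deadline
        attempts.append((now, timeout, answered))
        if answered:
            return attempts
        # Timeout exception: sleep, then retry with a doubled wait window.
        now = deadline + sleep_fn()
        timeout *= 2


# Fixed 30 s sleeps keep the run reproducible; the server returns
# 150 s into the third attempt (60 + 30 + 120 + 30 + 150 = 390 s).
for sent_at, timeout, answered in simulate_report_state(390.0, sleep_fn=lambda: 30.0):
    print(f"t={sent_at:.0f}s  wait up to {timeout:.0f}s  answered={answered}")
```

Only the wait window doubles from one attempt to the next; the sleep between attempts is drawn from the same range every time.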
There is no point in changing the exponential timeout increase, because the server will process the report state as long as the request is still inside that timeout window.
In the worst possible case, the longest an agent will go without a report_state in the queue is rpc.TRANSPORT.conf.rpc_response_timeout.
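The bound on the gap can be checked with a short sketch as well. Again this is an illustration under assumptions, not the real client: queue_gaps is a hypothetical helper, 60 seconds stands in for rpc_response_timeout, and the model ignores any upper cap the real client may put on the wait window.

```python
import random


def queue_gaps(num_attempts, base_timeout=60.0, seed=0):
    """Return the intervals during which no report_state is pending.

    Each attempt waits twice as long as the previous one, but the sleep
    between attempts is always drawn from [0, base_timeout].
    """
    rng = random.Random(seed)
    now, timeout, gaps = 0.0, base_timeout, []
    for _ in range(num_attempts):
        expires_at = now + timeout             # request leaves the queue
        sleep = rng.uniform(0, base_timeout)   # back-off does not grow
        gaps.append(sleep)
        now = expires_at + sleep               # next request is sent
        timeout *= 2
    return gaps


gaps = queue_gaps(num_attempts=8)
# The wait window has grown to 60 * 2**7 = 7680 s by the last attempt,
# yet no gap exceeds the 60 s base timeout.
assert max(gaps) <= 60.0
```

However large the wait window gets, the time with nothing queued stays within the base timeout, which is the point of the argument above.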
[1] https://github.com/openstack/neutron/blob/9f4f6c8db27f4838a11b4a271e96c372f01118dd/neutron/common/rpc.py#L141