First, this is what SHOULD HAPPEN in a successful run of test_from_dvr_to_dvr_ha (see http://paste.openstack.org/show/769794/):
1) Test sets router "admin_state_up": false
2) L3 agents in controller and compute1 receive the router update notification and delete the router locally
In a failed execution, this is the observed sequence of events (see http://paste.openstack.org/show/769795/):
1) Test sets router "admin_state_up": false
2) The L3 agent in compute1 (in this case) receives the router update notification and deletes the router locally
3) The L3 agent in the controller receives an update router notification with related routers. The agent doesn't delete the router locally and instead queues the related routers for processing. As a consequence, the router ports are not set to status DOWN and the test case times out
I have observed this pattern several times.
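The divergence between the two runs can be modeled with a small sketch. This is purely illustrative: `SimpleL3Agent` and its methods are hypothetical stand-ins, not neutron classes; the point is only that an update carrying related routers takes a "queue" branch instead of the "delete" branch, so the router (and its ports) survives:

```python
# Hypothetical, simplified model of the two observed agent behaviors.
# SimpleL3Agent is illustrative only; it is NOT a neutron class or API.

class SimpleL3Agent:
    def __init__(self):
        self.routers = {}       # router_id -> local router state
        self.resync_queue = []  # related routers queued for later processing

    def add_router(self, router_id):
        self.routers[router_id] = {'ports_status': 'ACTIVE'}

    def handle_router_update(self, router_id, admin_state_up,
                             related_routers=()):
        if related_routers:
            # Suspected failure path: the notification carries related
            # routers, so the agent keeps the router and only queues the
            # related ones; the router's ports never transition to DOWN.
            self.resync_queue.extend(related_routers)
            return 'queued'
        if not admin_state_up:
            # Expected path: the agent deletes the router locally,
            # which brings its ports to DOWN.
            del self.routers[router_id]
            return 'deleted'
        return 'noop'


# Successful run: no related routers attached to the notification.
good = SimpleL3Agent()
good.add_router('r1')
assert good.handle_router_update('r1', admin_state_up=False) == 'deleted'
assert 'r1' not in good.routers

# Failed run: the same notification arrives with related routers.
bad = SimpleL3Agent()
bad.add_router('r1')
assert bad.handle_router_update(
    'r1', admin_state_up=False, related_routers=['r2']) == 'queued'
assert 'r1' in bad.routers  # router survives -> ports stay up -> timeout
```

Under this model, the test's wait-for-ports-DOWN condition can only be satisfied on the "deleted" branch, which matches why the failed runs time out.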
My main suspect at this point in time is https://github.com/openstack/neutron/blob/78aae12a88e8b3cc0609c830527533b8a8a92d60/neutron/db/l3_dvrscheduler_db.py#L141-L153. This code was last modified in rapid succession by patches https://review.opendev.org/#/c/664525 and https://review.opendev.org/#/c/661522. It was originally created by https://review.opendev.org/#/c/597567