Comment 15 for bug 1632540

Quan Tian (tianquan23) wrote :

Hi Brian Haley and Swaminathan Vasudevan, I reproduced the bug on the master branch with the following steps:
1. Kill a dvr_snat L3 agent.
2. Create a DVR+HA router.
3. Start the dvr_snat L3 agent.
4. The error logs are then output continuously.

The reason is that, since [1], when the L3 agent does a full sync it calls ensure_snat_cleanup for every router, depending on whether the agent is dvr_snat or not. However, DVR+HA routers always keep SNAT namespaces on dvr_snat agents for keepalived. The cleanup call is therefore unexpected: it causes the _process_updated_router method to catch an Exception every time and put the router back into the RouterProcessingQueue, again and again.

[1] https://review.openstack.org/#/c/326729/
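
To illustrate the retry loop described above, here is a minimal toy sketch in Python. It is not the neutron code: cleanup_snat, full_sync and the dict-based routers are hypothetical stand-ins, and I am assuming the cleanup simply raises for any DVR+HA router. The max_steps cap exists only so the example terminates; the real agent has no such cap, which is why the logs never stop.

```python
from collections import deque


def cleanup_snat(router):
    """Hypothetical stand-in for the cleanup call done during full sync."""
    # Assumption: cleanup fails whenever the router is DVR+HA, because the
    # SNAT namespace is still needed for keepalived.
    if router["distributed"] and router["ha"]:
        raise RuntimeError("SNAT namespace of %s is still in use" % router["id"])


def full_sync(routers, max_steps=6):
    """Process queued routers, re-queueing any router whose update fails."""
    queue = deque(routers)
    attempts = {router["id"]: 0 for router in routers}
    steps = 0
    while queue and steps < max_steps:
        steps += 1
        router = queue.popleft()
        attempts[router["id"]] += 1
        try:
            cleanup_snat(router)
        except RuntimeError:
            # Mirrors the described behaviour: the failed router goes back
            # into the queue and is retried on a later iteration.
            queue.append(router)
    return attempts, [router["id"] for router in queue]


if __name__ == "__main__":
    routers = [
        {"id": "dvr-ha", "distributed": True, "ha": True},
        {"id": "dvr-only", "distributed": True, "ha": False},
    ]
    attempts, pending = full_sync(routers)
    print(attempts, pending)
```

In this sketch the plain DVR router is processed once and leaves the queue, while the DVR+HA router uses up every remaining step and is still pending at the end.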

I have submitted a patch for this: https://review.openstack.org/434863