Neutron server is losing contact with the L3 agent running in the controller node. One example is:
Jan 26 18:27:23.664199 ubuntu-bionic-rax-iad-0002168118 neutron-server[6878]: WARNING neutron.db.agents_db [None req-96b7c5e3-0c74-48ca-92a2-6b43a9ef6544 None None] Agent healthcheck: found 1 dead agents out of 8:
Jan 26 18:27:23.664199 ubuntu-bionic-rax-iad-0002168118 neutron-server[6878]: Type Last heartbeat host
Jan 26 18:27:23.664199 ubuntu-bionic-rax-iad-0002168118 neutron-server[6878]: L3 agent 2019-01-26 18:25:44 ubuntu-bionic-rax-iad-0002168118
Checking the L3 agent log around the time the first instance of the above message is seen, we find this traceback: http://paste.openstack.org/show/744001/. Please note that this traceback takes place at Jan 26 18:25:56.559883, whereas the Neutron server starts reporting the loss of contact with the L3 agent (see message above) at Jan 26 18:27:23.664199, having received the last heartbeat at 2019-01-26 18:25:44. In fact, this is the last time the L3 agent reports receiving a router update:
Jan 26 18:25:56.399748 ubuntu-bionic-rax-iad-0002168118 neutron-l3-agent[8618]: DEBUG neutron.agent.l3.agent [None req-296cf80d-5b44-4c99-914d-499ec949394b tempest-NetworkMigrationFromHA-1759813396 tempest-NetworkMigrationFromHA-1759813396] Got routers updated notification :['e6e7911c-a3e0-4331-abe4-580aaf5ba2fc'] {{(pid=8618) routers_updated /opt/stack/neutron/neutron/agent/l3/agent.py:444}}
The router with uuid e6e7911c-a3e0-4331-abe4-580aaf5ba2fc is being migrated from HA to DVR by test case NetworkMigrationFromHA:test_from_ha_to_dvr.
I have confirmed a similar pattern takes place in several occurrences of this bug. In all cases, a router is being migrated from HA to DVR or legacy.
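The timeline described above can be checked with a short script. This is only a sketch for comparing the timestamps quoted in this report: the variable and function names are made up for illustration, and the year 2019 for the syslog-style timestamps is an assumption taken from the heartbeat value.

```python
from datetime import datetime

# Timestamps copied from the log excerpts in this report.
# Assumption: the syslog-style entries are from 2019, like the heartbeat.
last_heartbeat = datetime(2019, 1, 26, 18, 25, 44)
last_router_update = datetime(2019, 1, 26, 18, 25, 56, 399748)
traceback_time = datetime(2019, 1, 26, 18, 25, 56, 559883)
dead_agent_report = datetime(2019, 1, 26, 18, 27, 23, 664199)


def seconds_between(start, end):
    """Return the elapsed time from start to end in seconds."""
    return (end - start).total_seconds()


# How long after the last heartbeat the traceback appears.
heartbeat_to_traceback = seconds_between(last_heartbeat, traceback_time)
# How long after the last router update the traceback appears.
update_to_traceback = seconds_between(last_router_update, traceback_time)
# How long the agent had been silent when the server reported it dead.
heartbeat_to_report = seconds_between(last_heartbeat, dead_agent_report)

print(f"last heartbeat -> traceback:     {heartbeat_to_traceback:.3f} s")
print(f"last router update -> traceback: {update_to_traceback:.3f} s")
print(f"last heartbeat -> dead report:   {heartbeat_to_report:.3f} s")
```

The traceback comes well under a second after the last router update and about 12.5 seconds after the last heartbeat, and no further heartbeat arrives in the roughly 100 seconds before the server reports the agent dead.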
Next step is to dig deeper into the traceback: http://paste.openstack.org/show/744001/