[SNAT][HA] SNAT traffic broken after restarting network nodes
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
neutron | Confirmed | Medium | venkata anil | | |
Bug Description
After restarting both network nodes (l3 agent_mode=
Then, once the node that is actually handling the SNAT traffic goes down, the other one does not take over.
[root@zk22-01 ~]# neutron router-list
+------
| id | name | external_
+------
| c497892b-
| | | -400e-8c76-
+------
[root@zk22-01 ~]# neutron l3-agent-
+------
| id | host | admin_state_up | alive | ha_state |
+------
| be5526ce-
| dcdfc230-
+------
[root@zk22-01 ~]# ip netns exec snat-c497892b-
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ha-004331fc-9f, link-type EN10MB (Ethernet), capture size 65535 bytes
18:59:03.574554 IP 169.254.192.2 > 224.0.0.18: VRRPv2, Advertisement, vrid 1, prio 50, authtype simple, intvl 2s, length 20
18:59:05.575500 IP 169.254.192.2 > 224.0.0.18: VRRPv2, Advertisement, vrid 1, prio 50, authtype simple, intvl 2s, length 20
18:59:07.576432 IP 169.254.192.2 > 224.0.0.18: VRRPv2, Advertisement, vrid 1, prio 50, authtype simple, intvl 2s, length 20
18:59:09.577361 IP 169.254.192.2 > 224.0.0.18: VRRPv2, Advertisement, vrid 1, prio 50, authtype simple, intvl 2s, length 20
18:59:11.578293 IP 169.254.192.2 > 224.0.0.18: VRRPv2, Advertisement, vrid 1, prio 50, authtype simple, intvl 2s, length 20
18:59:13.579243 IP 169.254.192.2 > 224.0.0.18: VRRPv2, Advertisement, vrid 1, prio 50, authtype simple, intvl 2s, length 20
[root@zk22-02 ~]# ip netns exec snat-c497892b-
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ha-dda33de1-3e, link-type EN10MB (Ethernet), capture size 65535 bytes
18:59:15.918725 IP 169.254.192.1 > 224.0.0.18: VRRPv2, Advertisement, vrid 1, prio 50, authtype simple, intvl 2s, length 20
18:59:17.919038 IP 169.254.192.1 > 224.0.0.18: VRRPv2, Advertisement, vrid 1, prio 50, authtype simple, intvl 2s, length 20
18:59:19.920036 IP 169.254.192.1 > 224.0.0.18: VRRPv2, Advertisement, vrid 1, prio 50, authtype simple, intvl 2s, length 20
18:59:21.921004 IP 169.254.192.1 > 224.0.0.18: VRRPv2, Advertisement, vrid 1, prio 50, authtype simple, intvl 2s, length 20
18:59:23.922007 IP 169.254.192.1 > 224.0.0.18: VRRPv2, Advertisement, vrid 1, prio 50, authtype simple, intvl 2s, length 20
18:59:25.923017 IP 169.254.192.1 > 224.0.0.18: VRRPv2, Advertisement, vrid 1, prio 50, authtype simple, intvl 2s, length 20
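One possible reading of the two captures above: in VRRP only a node that considers itself master sends advertisements, and each capture shows only the local node's address, so both nodes seem to be advertising for vrid 1 without hearing each other. The Python sketch below is not part of the original report, and its function names are hypothetical. It parses tcpdump text in the format shown above and flags any vrid that is advertised by more than one source address.

```python
import re
from collections import defaultdict

# Matches tcpdump lines in the format shown in the captures, e.g.
# "18:59:03.574554 IP 169.254.192.2 > 224.0.0.18: VRRPv2, Advertisement, vrid 1, prio 50, ..."
VRRP_LINE = re.compile(
    r"IP (?P<src>\d+\.\d+\.\d+\.\d+) > 224\.0\.0\.18: VRRPv\d, Advertisement, "
    r"vrid (?P<vrid>\d+), prio (?P<prio>\d+)"
)


def advertisers_by_vrid(captures):
    """Map each vrid to the set of source IPs seen advertising it.

    `captures` is an iterable of tcpdump text outputs, one per node.
    """
    seen = defaultdict(set)
    for text in captures:
        for line in text.splitlines():
            match = VRRP_LINE.search(line)
            if match:
                seen[int(match.group("vrid"))].add(match.group("src"))
    return dict(seen)


def split_brain_vrids(captures):
    """Return the vrids advertised by more than one source, with their sources."""
    return {
        vrid: sorted(sources)
        for vrid, sources in advertisers_by_vrid(captures).items()
        if len(sources) > 1
    }


# One line from each capture in the report (zk22-01 and zk22-02).
NODE1 = ("18:59:03.574554 IP 169.254.192.2 > 224.0.0.18: VRRPv2, Advertisement, "
         "vrid 1, prio 50, authtype simple, intvl 2s, length 20")
NODE2 = ("18:59:15.918725 IP 169.254.192.1 > 224.0.0.18: VRRPv2, Advertisement, "
         "vrid 1, prio 50, authtype simple, intvl 2s, length 20")

print(split_brain_vrids([NODE1, NODE2]))  # {1: ['169.254.192.1', '169.254.192.2']}
```

A non-empty result only says that two addresses advertised the same vrid across the captures. It does not by itself explain why the nodes cannot see each other.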
After comparing the flows in br-tun before and after rebooting the network node, I found that some arp_responder flows related to the HA network are missing.
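A before/after comparison like this can be scripted. The Python sketch below assumes the two flow dumps were saved as text (for example from `ovs-ofctl dump-flows br-tun`), strips the counters that change on every dump, and lists the ARP-related flows present before the reboot but absent afterwards. The function names, the `arp` keyword filter, and the sample flow lines are illustrative assumptions, not taken from the reporter's environment.

```python
import re

# Fields in flow dump output that differ between dumps of an identical flow.
VOLATILE = re.compile(
    r"\b(?:cookie|duration|n_packets|n_bytes|idle_age|hard_age)=[^,\s]+,?\s*"
)


def normalize(flow_line):
    """Drop volatile counters so identical flows compare equal."""
    return VOLATILE.sub("", flow_line).strip()


def missing_flows(before, after, keyword="arp"):
    """Return normalized flows containing `keyword` that are in `before` but not in `after`."""
    before_set = {normalize(l) for l in before.splitlines() if keyword in l}
    after_set = {normalize(l) for l in after.splitlines() if keyword in l}
    return sorted(before_set - after_set)


# Invented sample dumps for illustration only (not from a real deployment).
BEFORE = """\
 cookie=0x0, duration=120.5s, table=21, n_packets=4, n_bytes=168, idle_age=3, priority=1,arp,dl_vlan=1,arp_tpa=169.254.192.1 actions=drop
 cookie=0x0, duration=120.5s, table=21, n_packets=2, n_bytes=84, idle_age=9, priority=1,arp,dl_vlan=1,arp_tpa=169.254.192.2 actions=drop
"""
AFTER = """\
 cookie=0x0, duration=15.1s, table=21, n_packets=0, n_bytes=0, idle_age=15, priority=1,arp,dl_vlan=1,arp_tpa=169.254.192.2 actions=drop
"""

for flow in missing_flows(BEFORE, AFTER):
    print(flow)
```

Normalizing before comparing matters because two dumps of an unchanged flow still differ in `duration` and the packet counters, so a plain text diff would report every line as changed.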