Activity log for bug #1822256

Date Who What changed Old value New value Message
2019-03-29 07:25:52 Yang Li bug added bug
2019-03-29 07:36:32 Yang Li summary "Ip fragments lost when restart ovs-agent with openvswitch firewall" "Ip segments lost when restart ovs-agent with openvswitch firewall"
2019-03-29 07:37:04 Yang Li description

Old value:

environment:
linux version: Linux controller.novalocal 3.10.0-693.el7.x86_64 #1 SMP Tue Aug 22 21:09:27 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
OpenStack version: Rocky
network type: vxlan or vlan
firewall driver: openvswitch

1. Create 2 VMs (vm1, vm2) on different compute nodes (node-1, node-2) in one network, with a security group that allows all TCP traffic.
2. Log in to vm2 and create a large file, for example:
vm2# dd if=/dev/zero of=/mnt/test.img bs=1G count=5
3. Log in to vm1 and scp vm2's large file to vm1. When the scp process starts, go to step 4.
vm1# scp vm2-ip:/mnt/test.img /mnt
4. Log in to node-2 and restart neutron-openvswitch-agent; this refreshes all the OpenFlow flows in br-int.
node-2# systemctl restart neutron-openvswitch-agent
5. Log in to vm1; after several seconds you will find that the scp process is stalled.

After some investigation, I found that the OpenFlow refresh causes IP fragments to be lost. When this happened, I captured packets with "tcpdump -i tap-xxx -w tmp.pcap", and in Wireshark I saw these errors:
192.168.100.19 192.168.100.5 SSH 16478 Server: [TCP ACKed unseen segment] [TCP Previous segment not captured] , Encrypted packet (len=16412)
192.168.100.19 192.168.100.5 SSH 8302 Server: [TCP ACKed unseen segment] , Encrypted packet (len=8236)
192.168.100.5 192.168.100.19 TCP 66 [TCP ACKed unseen segment] [TCP Previous segment not captured] 54354 → 22 [ACK] Seq=2509 Ack=600733 Win=16522 Len=0 TSval=2847412 TSecr=2851031
192.168.100.19 192.168.100.5 SSH 1464 Server: [TCP Spurious Retransmission] , Encrypted packet (len=1398)
192.168.100.5 192.168.100.19 TCP 78 [TCP Dup ACK 25182#1] 54354 → 22 [ACK] Seq=326305 Ack=67089901 Win=18494 Len=0 TSval=2849742 TSecr=2853310 SLE=67073429 SRE=67074827
192.168.100.5 192.168.100.19 TCP 110 [TCP Retransmission] 54354 → 22 [PSH, ACK] Seq=326173 Ack=67089901 Win=18494 Len=44 TSval=2849742 TSecr=2853310
192.168.100.19 192.168.100.5 TCP 1464 [TCP Retransmission] 22 → 54354 [ACK] Seq=70971905 Ack=346105 Win=2016 Len=1398 TSval=2853361 TSecr=2849691
192.168.100.19 192.168.100.5 TCP 1464 [TCP Retransmission] 22 → 54354 [ACK] Seq=70971905 Ack=346105 Win=2016 Len=1398 TSval=2853463 TSecr=2849691
192.168.100.19 192.168.100.5 TCP 1464 [TCP Retransmission] 22 → 54354 [ACK] Seq=70971905 Ack=346105 Win=2016 Len=1398 TSval=2854076 TSecr=2849691

I also checked the state of this TCP connection on both compute nodes; it is still ESTABLISHED.
# conntrack -L | grep 192.168.100.5
tcp 6 299 ESTABLISHED src=192.168.100.5 dst=192.168.100.19 sport=54356 dport=22 src=192.168.100.19 dst=192.168.100.5 sport=22 dport=54354 [ASSURED] mark=0 zone=4 use=1
# conntrack -L | grep 192.168.100.5
tcp 6 287 ESTABLISHED src=192.168.100.5 dst=192.168.100.19 sport=54356 dport=22 src=192.168.100.19 dst=192.168.100.5 sport=22 dport=54354 [ASSURED] mark=0 zone=1 use=1

I have no idea why refreshing the OpenFlow flows causes IP fragments to be lost; I hope someone has a way to solve this problem.

New value:

environment:
linux version: Linux controller.novalocal 3.10.0-693.el7.x86_64 #1 SMP Tue Aug 22 21:09:27 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
OpenStack version: Rocky
network type: vxlan or vlan
firewall driver: openvswitch

1. Create 2 VMs (vm1, vm2) on different compute nodes (node-1, node-2) in one network, with a security group that allows all TCP traffic.
2. Log in to vm2 and create a large file, for example:
vm2# dd if=/dev/zero of=/mnt/test.img bs=1G count=5
3. Log in to vm1 and scp vm2's large file to vm1. When the scp process starts, go to step 4.
vm1# scp vm2-ip:/mnt/test.img /mnt
4. Log in to node-2 and restart neutron-openvswitch-agent; this refreshes all the OpenFlow flows in br-int.
node-2# systemctl restart neutron-openvswitch-agent
5. Log in to vm1; after several seconds you will find that the scp process is stalled.

After some investigation, I found that the OpenFlow refresh causes IP segments to be lost. When this happened, I captured packets with "tcpdump -i tap-xxx -w tmp.pcap", and in Wireshark I saw these errors:
192.168.100.19 192.168.100.5 SSH 16478 Server: [TCP ACKed unseen segment] [TCP Previous segment not captured] , Encrypted packet (len=16412)
192.168.100.19 192.168.100.5 SSH 8302 Server: [TCP ACKed unseen segment] , Encrypted packet (len=8236)
192.168.100.5 192.168.100.19 TCP 66 [TCP ACKed unseen segment] [TCP Previous segment not captured] 54354 → 22 [ACK] Seq=2509 Ack=600733 Win=16522 Len=0 TSval=2847412 TSecr=2851031
192.168.100.19 192.168.100.5 SSH 1464 Server: [TCP Spurious Retransmission] , Encrypted packet (len=1398)
192.168.100.5 192.168.100.19 TCP 78 [TCP Dup ACK 25182#1] 54354 → 22 [ACK] Seq=326305 Ack=67089901 Win=18494 Len=0 TSval=2849742 TSecr=2853310 SLE=67073429 SRE=67074827
192.168.100.5 192.168.100.19 TCP 110 [TCP Retransmission] 54354 → 22 [PSH, ACK] Seq=326173 Ack=67089901 Win=18494 Len=44 TSval=2849742 TSecr=2853310
192.168.100.19 192.168.100.5 TCP 1464 [TCP Retransmission] 22 → 54354 [ACK] Seq=70971905 Ack=346105 Win=2016 Len=1398 TSval=2853361 TSecr=2849691
192.168.100.19 192.168.100.5 TCP 1464 [TCP Retransmission] 22 → 54354 [ACK] Seq=70971905 Ack=346105 Win=2016 Len=1398 TSval=2853463 TSecr=2849691
192.168.100.19 192.168.100.5 TCP 1464 [TCP Retransmission] 22 → 54354 [ACK] Seq=70971905 Ack=346105 Win=2016 Len=1398 TSval=2854076 TSecr=2849691

I also checked the state of this TCP connection on both compute nodes; it is still ESTABLISHED.
# conntrack -L | grep 192.168.100.5
tcp 6 299 ESTABLISHED src=192.168.100.5 dst=192.168.100.19 sport=54356 dport=22 src=192.168.100.19 dst=192.168.100.5 sport=22 dport=54354 [ASSURED] mark=0 zone=4 use=1
# conntrack -L | grep 192.168.100.5
tcp 6 287 ESTABLISHED src=192.168.100.5 dst=192.168.100.19 sport=54356 dport=22 src=192.168.100.19 dst=192.168.100.5 sport=22 dport=54354 [ASSURED] mark=0 zone=1 use=1

I have no idea why refreshing the OpenFlow flows causes IP segments to be lost; I hope someone has a way to solve this problem.
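
The two conntrack entries quoted in the description can be compared field by field rather than by eye. The following is only a minimal sketch, not part of the original report: the helper name parse_conntrack_line is hypothetical, the two sample lines are copied from the description, and the parser assumes the TCP line layout shown there (protocol, protocol number, timeout, state, then key=value pairs for the original and reply directions).

```python
# Sample lines copied verbatim from the bug description (one per compute node).
NODE_A = ("tcp 6 299 ESTABLISHED src=192.168.100.5 dst=192.168.100.19 "
          "sport=54356 dport=22 src=192.168.100.19 dst=192.168.100.5 "
          "sport=22 dport=54354 [ASSURED] mark=0 zone=4 use=1")
NODE_B = ("tcp 6 287 ESTABLISHED src=192.168.100.5 dst=192.168.100.19 "
          "sport=54356 dport=22 src=192.168.100.19 dst=192.168.100.5 "
          "sport=22 dport=54354 [ASSURED] mark=0 zone=1 use=1")

TUPLE_KEYS = ("src", "dst", "sport", "dport")


def parse_conntrack_line(line):
    """Parse one TCP line of `conntrack -L` output (hypothetical helper).

    Assumes the layout seen in the description: proto, proto number,
    timeout, state, then key=value tokens and bracketed flags.
    """
    tokens = line.split()
    entry = {
        "proto": tokens[0],
        "timeout": int(tokens[2]),
        "state": tokens[3],
        "flags": [],
        "original": {},  # first src/dst/sport/dport group
        "reply": {},     # second src/dst/sport/dport group
    }
    for token in tokens[4:]:
        if token.startswith("[") and token.endswith("]"):
            entry["flags"].append(token[1:-1])
            continue
        key, _, value = token.partition("=")
        if key in TUPLE_KEYS:
            # A key seen for the second time belongs to the reply direction.
            direction = "reply" if key in entry["original"] else "original"
            entry[direction][key] = value
        else:
            entry[key] = value
    return entry


if __name__ == "__main__":
    for name, line in (("first entry", NODE_A), ("second entry", NODE_B)):
        e = parse_conntrack_line(line)
        print(name, e["state"], "zone", e["zone"],
              "original sport", e["original"]["sport"],
              "reply dport", e["reply"]["dport"])
```

Run against the quoted lines, this prints the state, conntrack zone, and port fields of each entry side by side, which makes it easier to see which fields match between the two compute nodes and which differ.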
2019-03-29 14:50:16 Bence Romsics tags ovs ovs-fw
2019-03-29 15:19:23 Bence Romsics neutron: importance Undecided High
2019-04-11 07:01:59 LIU Yulong neutron: status New Invalid