arp responder will be created for a vlan port when port ip changed

Bug #1824504 reported by Yang Li on 2019-04-12
This bug affects 1 person
Affects: neutron
Importance: Medium
Assigned to: Yang Li

Bug Description

When an environment contains both VLAN and VXLAN networks and the l2pop and arp_responder functions are enabled, updating the IP of a VLAN port (VM1) creates an ARP responder for that port in br-tun.
When a VM on another compute node (VM2) pings VM1, the first ARP request is sent to ff:ff:ff:ff:ff:ff, so every compute node receives it. Because of the ARP responder, each of them sends an ARP reply to VM2. These duplicate ARP replies confuse the physical switch and cause network connectivity problems.

Yang Li (yang-li) wrote :

When a port's IP is updated, we should check whether the network is a tunnel type; if it is not, we should skip the ARP-related handling.
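
The check described above could be sketched roughly as follows. This is a minimal illustration and not the code from the proposed review: the function names, the `TUNNEL_NETWORK_TYPES` set and its members are assumptions made for this example.

```python
# Hypothetical sketch; these names are illustrative, not Neutron's actual API.
# Assumed set of network types that are carried over br-tun.
TUNNEL_NETWORK_TYPES = frozenset({"vxlan", "gre", "geneve"})


def should_handle_arp_responder(network_type, arp_responder_enabled):
    """Return True only when ARP responder flows belong in br-tun."""
    return arp_responder_enabled and network_type in TUNNEL_NETWORK_TYPES


def on_port_ip_changed(network_type, arp_responder_enabled, new_ip, mac):
    """Return the (ip, mac) ARP responder entries to install for this change."""
    if not should_handle_arp_responder(network_type, arp_responder_enabled):
        # Non-tunnel network (for example vlan): leave br-tun untouched.
        return []
    return [(new_ip, mac)]
```

With this guard, an IP change on a vlan port yields no responder entries, while the same change on a vxlan port still does.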

Fix proposed to branch: master
Review: https://review.openstack.org/652043

Changed in neutron:
assignee: nobody → Yang Li (yang-li)
status: New → In Progress
Changed in neutron:
importance: Undecided → Medium
Yang Li (yang-li) wrote :

Steps to reproduce:
Environment:
The l2population mechanism driver is enabled, and l2_population and arp_responder are enabled on the agent side.

/etc/neutron/plugins/ml2/ml2_conf.ini
[ml2]
type_drivers = vxlan,vlan,flat,local
tenant_network_types = vlan
mechanism_drivers = openvswitch,l2population

/etc/neutron/plugins/ml2/openvswitch_agent.ini
[agent]
l2_population = True
arp_responder = True

1. Create a VLAN network.
2. Create 3 VMs (vm1, vm2, vm3) on different compute nodes (host1, host2, host3) in the VLAN network.
3. Change vm1's IP address; for example, the new IP address is 192.168.111.95.
4. Log in to host2 and host3; you will find ARP-related OpenFlow flows for vm1's IP address in br-tun.
# ovs-ofctl dump-flows br-tun | grep ARP | grep 192.168.111.95
 cookie=0x9b7c0a4c49fcb20b, duration=12.806s, table=21, n_packets=0, n_bytes=0, idle_age=12, priority=1,arp,dl_vlan=3,arp_tpa=192.168.111.95 actions=move:NXM_OF_ETH_SRC[]->NXM_OF_ETH_DST[],mod_dl_src:fa:16:3e:ad:40:47,load:0x2->NXM_OF_ARP_OP[],move:NXM_NX_ARP_SHA[]->NXM_NX_ARP_THA[],move:NXM_OF_ARP_SPA[]->NXM_OF_ARP_TPA[],load:0xfa163ead4047->NXM_NX_ARP_SHA[],load:0xc0a86f5f->NXM_OF_ARP_SPA[],IN_PORT
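
To read the flow above more easily, the hex values it loads into the ARP reply can be decoded. The following is a small helper sketch (the function name `decode_arp_responder_flow` is made up for this example, and the flow string is shortened to the relevant fields); it only parses text and does not talk to Open vSwitch.

```python
import ipaddress
import re


def decode_arp_responder_flow(flow):
    """Extract the matched ARP target IP and the IP/MAC the flow replies with."""
    tpa = re.search(r"arp_tpa=([0-9.]+)", flow)
    spa = re.search(r"load:0x([0-9a-fA-F]+)->NXM_OF_ARP_SPA\[\]", flow)
    sha = re.search(r"load:0x([0-9a-fA-F]+)->NXM_NX_ARP_SHA\[\]", flow)
    if not (tpa and spa and sha):
        return None  # not an ARP responder flow in the expected shape
    # Leading zeros are dropped in the dump, so pad the MAC back to 12 hex digits.
    mac_hex = sha.group(1).lower().zfill(12)
    return {
        "arp_tpa": tpa.group(1),
        "reply_ip": str(ipaddress.IPv4Address(int(spa.group(1), 16))),
        "reply_mac": ":".join(mac_hex[i:i + 2] for i in range(0, 12, 2)),
    }


# Relevant part of the flow from the dump above.
flow = (
    "table=21, priority=1,arp,dl_vlan=3,arp_tpa=192.168.111.95 "
    "actions=move:NXM_OF_ETH_SRC[]->NXM_OF_ETH_DST[],"
    "mod_dl_src:fa:16:3e:ad:40:47,load:0x2->NXM_OF_ARP_OP[],"
    "move:NXM_NX_ARP_SHA[]->NXM_NX_ARP_THA[],"
    "move:NXM_OF_ARP_SPA[]->NXM_OF_ARP_TPA[],"
    "load:0xfa163ead4047->NXM_NX_ARP_SHA[],"
    "load:0xc0a86f5f->NXM_OF_ARP_SPA[],IN_PORT"
)
print(decode_arp_responder_flow(flow))
```

The decoded values show that the flow answers requests for 192.168.111.95 with vm1's MAC, fa:16:3e:ad:40:47, even though the port is on a VLAN network.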

Then, when we ping 192.168.111.95 from a VM or namespace, the first ARP request is sent to ff:ff:ff:ff:ff:ff, so all the compute nodes receive it. Because of the ARP responder for 192.168.111.95, two or more compute nodes reply to this request. This makes the MAC appear to exist on several compute nodes, which is impossible in a normal situation. It confuses the physical switch between the compute nodes, which can no longer tell which port the MAC really belongs to, and packets destined for this MAC are then not forwarded to the right VM.
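
The effect on the physical switch can be illustrated with a toy MAC-learning model. This is only a sketch under simplifying assumptions: the `LearningSwitch` class is invented for the example, switch ports are named after the hosts behind them, and it is assumed that a reply carrying vm1's source MAC arrives once from each of host1, host2 and host3.

```python
class LearningSwitch:
    """Toy model of a switch that learns which port a source MAC lives on."""

    def __init__(self):
        self.mac_table = {}  # MAC address -> port it was last seen on
        self.moves = 0       # how often a known MAC changed ports

    def receive(self, port, src_mac):
        previous = self.mac_table.get(src_mac)
        if previous is not None and previous != port:
            self.moves += 1  # the MAC "moved": the switch relearns it
        self.mac_table[src_mac] = port


VM1_MAC = "fa:16:3e:ad:40:47"
switch = LearningSwitch()

# Assumed scenario: host1 carries the real vm1, while host2 and host3
# answer from their ARP responder flows using the same source MAC.
for port in ("host1", "host2", "host3"):
    switch.receive(port, VM1_MAC)

print(switch.mac_table[VM1_MAC], switch.moves)
```

In this model the switch ends up associating vm1's MAC with whichever host replied last, which is the kind of misdirection described above.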

Reviewed: https://review.opendev.org/652043
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=5301ecf41b18e83abdc3c828cd64ce0111f9fcd1
Submitter: Zuul
Branch: master

commit 5301ecf41b18e83abdc3c828cd64ce0111f9fcd1
Author: Yang Li <email address hidden>
Date: Fri Apr 12 19:14:29 2019 +0800

    Don't add arp responder for non tunnel network port

    When the vlan and vxlan both exist in env, and l2population
    and arp_responder are enabled, if we update a port's ip address
    from vlan network, there will be arp responder related flows
    added into br-tun, this will cause too many arp reply for
    one arp request, and vm connections will be unnormal.

    Closes-Bug: #1824504
    Change-Id: I1b6154b9433a9442d3e0118dedfa01c4a9b4740b

Changed in neutron:
status: In Progress → Fix Released