Dual Stack IPV4/6 ARP bleed
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
neutron | Fix Released | High | Brian Haley |
Bug Description
Version = newton (2:9.2.
DVR (HA mode) with ipv4 and ipv6
l2_population = True
arp_responder = True
When using dual stack IPv4 and IPv6, I noticed that Linux guests work fine but Windows guests have problems with their IPv4 connectivity. Upon investigation I found an ARP issue: both the IPv4 and the IPv6 interface of the virtual router respond to ARP requests. The capture below was taken on the guest's tap interface after flushing the ARP table in the guest.
12:04:58.273446 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.0.0.1 tell 10.0.0.23, length 28
12:04:58.273776 ARP, Ethernet (len 6), IPv4 (len 4), Reply 10.0.0.1 is-at fa:16:3e:43:9d:58, length 28
12:04:58.273790 ARP, Ethernet (len 6), IPv4 (len 4), Reply 10.0.0.1 is-at fa:16:3e:19:96:e6, length 28
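A quick way to spot this pattern in a longer capture is to group ARP replies by the IP address they claim and flag any address answered by more than one MAC. The following is a minimal sketch that assumes tcpdump's default text output as shown above; the function name `find_arp_flux` is hypothetical and not part of any tool.

```python
import re
from collections import defaultdict

# Matches tcpdump text lines of the form "... Reply <ip> is-at <mac>, ..."
REPLY_RE = re.compile(r"Reply (\d+\.\d+\.\d+\.\d+) is-at ([0-9a-fA-F:]{17})")


def find_arp_flux(lines):
    """Return {ip: [mac, ...]} for every IP answered by more than one MAC."""
    macs_by_ip = defaultdict(list)
    for line in lines:
        match = REPLY_RE.search(line)
        if match:
            ip, mac = match.group(1), match.group(2).lower()
            # Keep arrival order, but record each MAC only once per IP.
            if mac not in macs_by_ip[ip]:
                macs_by_ip[ip].append(mac)
    return {ip: macs for ip, macs in macs_by_ip.items() if len(macs) > 1}


capture = [
    "12:04:58.273446 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.0.0.1 tell 10.0.0.23, length 28",
    "12:04:58.273776 ARP, Ethernet (len 6), IPv4 (len 4), Reply 10.0.0.1 is-at fa:16:3e:43:9d:58, length 28",
    "12:04:58.273790 ARP, Ethernet (len 6), IPv4 (len 4), Reply 10.0.0.1 is-at fa:16:3e:19:96:e6, length 28",
]

print(find_arp_flux(capture))
# {'10.0.0.1': ['fa:16:3e:43:9d:58', 'fa:16:3e:19:96:e6']}
```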
Looking at the interfaces of the active router, the MAC ending 9d:58 belongs to the IPv4 interface and the MAC ending 96:e6 belongs to an IPv6 interface, as shown below:
# ip a
1: lo: <LOOPBACK,
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: qr-12e41bc2-
link/ether fa:16:3e:43:9d:58 brd ff:ff:ff:ff:ff:ff link-netnsid 0
inet 10.0.0.1/8 brd 10.255.255.255 scope global qr-12e41bc2-68
valid_lft forever preferred_lft forever
inet6 fe80::f816:
valid_lft forever preferred_lft forever
6: qr-bd91567c-
link/ether fa:16:3e:19:96:e6 brd ff:ff:ff:ff:ff:ff link-netnsid 0
inet6 <prefix_
valid_lft forever preferred_lft forever
inet6 fe80::f816:
valid_lft forever preferred_lft forever
7: qr-03c27e46-
link/ether fa:16:3e:8e:f7:bd brd ff:ff:ff:ff:ff:ff link-netnsid 0
inet6 <prefix_
valid_lft forever preferred_lft forever
inet6 fe80::f816:
valid_lft forever preferred_lft forever
We should not see two ARP responses here; we should only see one, from fa:16:3e:43:9d:58. It turns out that Linux guests populate the ARP table from the first response, while Windows guests use the latest response. This explains why Windows guests fail and Linux guests work: the first ARP response is valid and the second one is invalid. I think there is a bigger issue here, though, as I should not be getting an ARP response from the IPv6 interface at all.
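The difference between the two guest types can be illustrated with a deliberately simplified model. This is only a sketch of the behaviour described in this report, not of how either operating system's network stack is actually implemented; the function `cached_mac` and the policy names are made up for the illustration.

```python
def cached_mac(replies, policy):
    """Return the MAC a guest would end up caching.

    replies: MAC addresses in the order the ARP replies arrived.
    policy:  "first" keeps the earliest reply (Linux guests, per this report),
             "last" lets later replies overwrite it (Windows guests, per this report).
    """
    if not replies:
        return None
    if policy == "first":
        return replies[0]
    if policy == "last":
        return replies[-1]
    raise ValueError(f"unknown policy: {policy}")


# Replies in the order seen in the capture: IPv4 interface first, IPv6 interface second.
replies = ["fa:16:3e:43:9d:58", "fa:16:3e:19:96:e6"]

print(cached_mac(replies, "first"))  # fa:16:3e:43:9d:58 (the IPv4 interface)
print(cached_mac(replies, "last"))   # fa:16:3e:19:96:e6 (the IPv6 interface)
```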
Changed in neutron:
assignee: sean redmond (sean-redmond1) → Brian Haley (brian-haley)
Digging further, I found that inside the qrouter namespace on the compute host arp_filter is set to 0. If I set it to 1 on the IPv6 interface inside the namespace, I no longer get duplicate ARP responses from the IPv4 and IPv6 interfaces. The correct term for this issue seems to be ARP flux (http://linux-ip.net/html/ether-arp.html).
Below is the command needed to work around this issue in my case; it was run on the nova compute host.
ip netns exec qrouter-d88a7831-2cf1-45cc-8276-742176ca0921 echo 1 > /proc/sys/net/ipv4/conf/qr-bd91567c-81/arp_filter
I believe the neutron L3 agent should be setting this on all IPv6 router interfaces.
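As a rough illustration of what applying this setting per interface could look like, here is a minimal sketch that only builds the command line and does not run it. It is not the neutron fix and does not use any neutron API; the helper `arp_filter_command` is hypothetical. One assumption worth stating: a plain `>` redirection is performed by the calling shell before `ip netns exec` starts, so the sketch wraps the write in `sh -c` to make the file be opened inside the namespace.

```python
import shlex


def arp_filter_command(namespace, device, value=1):
    """Build the argv that sets arp_filter for one device inside a namespace.

    Hypothetical helper for illustration; it only returns the command.
    """
    path = f"/proc/sys/net/ipv4/conf/{device}/arp_filter"
    # sh -c makes the redirection happen inside the namespace,
    # not in the shell that invokes "ip netns exec".
    script = f"echo {int(value)} > {shlex.quote(path)}"
    return ["ip", "netns", "exec", namespace, "sh", "-c", script]


# Namespace and interface name as given in this report.
argv = arp_filter_command(
    "qrouter-d88a7831-2cf1-45cc-8276-742176ca0921", "qr-bd91567c-81"
)
print(" ".join(shlex.quote(arg) for arg in argv))
```

Passing the returned list to `subprocess.run(argv, check=True)` as root on the compute host would apply the setting; that step is left out here so the sketch stays side-effect free.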