Dual Stack IPV4/6 ARP bleed

Bug #1692007 reported by sean redmond
This bug affects 3 people
Affects  | Status       | Importance | Assigned to | Milestone
neutron  | Fix Released | High       | Brian Haley |

Bug Description

Version = newton (2:9.2.0-0ubuntu1~cloud0)
DVR (HA mode) with ipv4 and ipv6
l2_population = True
arp_responder = True

I noticed that with dual-stack IPv4 and IPv6, Linux guests work fine but Windows guests have problems with their IPv4 connectivity. On investigation I found an ARP issue: both the IPv4 and the IPv6 interface of the virtual router respond to ARP requests. The capture below was taken on the guest's tap interface after the ARP table in the guest was flushed.

12:04:58.273446 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.0.0.1 tell 10.0.0.23, length 28
12:04:58.273776 ARP, Ethernet (len 6), IPv4 (len 4), Reply 10.0.0.1 is-at fa:16:3e:43:9d:58, length 28
12:04:58.273790 ARP, Ethernet (len 6), IPv4 (len 4), Reply 10.0.0.1 is-at fa:16:3e:19:96:e6, length 28

If I look at the active router interfaces, I can see that the MAC ending 9d:58 belongs to the IPv4 interface and the MAC ending 96:e6 belongs to the IPv6 interface, as shown below:

# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: qr-12e41bc2-68@if1021: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 8950 qdisc noqueue state UP group default qlen 1000
    link/ether fa:16:3e:43:9d:58 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.0.0.1/8 brd 10.255.255.255 scope global qr-12e41bc2-68
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:fe43:9d58/64 scope link
       valid_lft forever preferred_lft forever
6: qr-bd91567c-81@if1143: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 8950 qdisc noqueue state UP group default qlen 1000
    link/ether fa:16:3e:19:96:e6 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet6 <prefix_hidden>:9:1::1/64 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:fe19:96e6/64 scope link
       valid_lft forever preferred_lft forever
7: qr-03c27e46-4b@if1159: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 8950 qdisc noqueue state UP group default qlen 1000
    link/ether fa:16:3e:8e:f7:bd brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet6 <prefix_hidden>:9:2::1/64 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:fe8e:f7bd/64 scope link
       valid_lft forever preferred_lft forever

We should not see two ARP responses here; only fa:16:3e:43:9d:58 should reply. It turns out that Linux guests populate the ARP table from the first response and Windows guests from the latest one. This explains why Windows guests fail and Linux guests work: the first ARP response is valid and the second one is invalid. But I think we have a bigger issue here, as I should not be getting an ARP response from the IPv6 interface at all.
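As an illustration only (not part of the original report), duplicate replies like the ones in the capture above could be spotted automatically. The sketch below assumes tcpdump's one-line ARP output format as shown in this report; the function name `find_arp_flux` and the regular expression are hypothetical helpers, not anything shipped with neutron.

```python
import re
from collections import defaultdict

# Matches the reply part of tcpdump lines such as:
#   "... Reply 10.0.0.1 is-at fa:16:3e:43:9d:58, length 28"
REPLY_RE = re.compile(r"Reply (\d+\.\d+\.\d+\.\d+) is-at ([0-9a-f:]{17})")


def find_arp_flux(lines):
    """Return {ip: [mac, ...]} for every IP answered by more than one MAC.

    MACs are kept in the order in which their first reply was seen.
    """
    macs_by_ip = defaultdict(list)
    for line in lines:
        match = REPLY_RE.search(line)
        if match:
            ip, mac = match.groups()
            if mac not in macs_by_ip[ip]:
                macs_by_ip[ip].append(mac)
    return {ip: macs for ip, macs in macs_by_ip.items() if len(macs) > 1}


# The three lines captured on the guest's tap interface in this report.
capture = [
    "12:04:58.273446 ARP, Ethernet (len 6), IPv4 (len 4), "
    "Request who-has 10.0.0.1 tell 10.0.0.23, length 28",
    "12:04:58.273776 ARP, Ethernet (len 6), IPv4 (len 4), "
    "Reply 10.0.0.1 is-at fa:16:3e:43:9d:58, length 28",
    "12:04:58.273790 ARP, Ethernet (len 6), IPv4 (len 4), "
    "Reply 10.0.0.1 is-at fa:16:3e:19:96:e6, length 28",
]

for ip, macs in find_arp_flux(capture).items():
    # A guest that keeps the first reply ends up with macs[0],
    # one that keeps the latest reply ends up with macs[-1].
    print(ip, "first reply:", macs[0], "last reply:", macs[-1])
```

For this capture the first reply carries the IPv4 interface's MAC and the last one carries the IPv6 interface's MAC, which matches the Linux/Windows difference described above.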

description: updated
sean redmond (sean-redmond1) wrote :

Digging further, I found that inside the qrouter namespace on the compute host arp_filter is set to 0. If I set it to 1 inside the namespace on the IPv6 interface, I no longer get duplicate ARP responses from the IPv4 and IPv6 interfaces. The correct term for this issue seems to be ARP flux (http://linux-ip.net/html/ether-arp.html).

The command below overcame the issue in my case; it was run on the nova compute host.

ip netns exec qrouter-d88a7831-2cf1-45cc-8276-742176ca0921 echo 1 > /proc/sys/net/ipv4/conf/qr-bd91567c-81/arp_filter

I believe the neutron l3agent should be setting this on all ipv6 router interfaces.
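One caveat about the shell form above: the `>` redirection is handled by the invoking shell rather than by the command started through `ip netns exec`, so it appears to depend on the shell already being inside the router namespace. A form without redirection is `sysctl -w` run through `ip netns exec`. The sketch below only builds and prints that command line; the helper name `netns_sysctl_cmd` is hypothetical, and actually running the command would need root on the compute host.

```python
import shlex


def netns_sysctl_cmd(namespace, device, option, value):
    """Build the argv that sets net.ipv4.conf.<device>.<option> in a netns."""
    key = "net.ipv4.conf.%s.%s" % (device, option)
    return ["ip", "netns", "exec", namespace,
            "sysctl", "-w", "%s=%s" % (key, value)]


# Namespace and interface names are the ones from this report.
cmd = netns_sysctl_cmd(
    "qrouter-d88a7831-2cf1-45cc-8276-742176ca0921",
    "qr-bd91567c-81", "arp_filter", 1)

# Print a copy-pasteable command line; nothing is executed here.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Passing the argv list to `subprocess.run(cmd, check=True)` would apply the setting, assuming sufficient privileges.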

Brian Haley (brian-haley) wrote :

I was going to say we might need to set arp_ignore=1, which makes the kernel reply to an ARP request only if the target IP is configured on the interface the request arrived on. Usually arp_announce=2 is set along with this as well.

These should probably be set in init_l3()
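To make the effect of arp_ignore concrete, here is a deliberately simplified model (not kernel code, and not the neutron patch) of the reply decision for the two values discussed in this report. It assumes every router interface on the network sees the broadcast request, as the captures suggest; the `Interface` class and `arp_replies` function are hypothetical names used only for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Interface:
    name: str
    mac: str
    ipv4: set = field(default_factory=set)  # IPv4 addresses on this interface


def arp_replies(interfaces, target_ip, arp_ignore):
    """Return the MACs that would answer a who-has for target_ip.

    arp_ignore=0: reply for any local address, whichever interface has it.
    arp_ignore=1: reply only if target_ip is on the receiving interface.
    """
    local_ips = set().union(*(iface.ipv4 for iface in interfaces))
    replies = []
    for iface in interfaces:
        if arp_ignore == 0:
            answers = target_ip in local_ips
        elif arp_ignore == 1:
            answers = target_ip in iface.ipv4
        else:
            raise ValueError("only arp_ignore 0 and 1 are modelled")
        if answers:
            replies.append(iface.mac)
    return replies


# The two router ports from this report: one IPv4, one IPv6-only.
router = [
    Interface("qr-12e41bc2-68", "fa:16:3e:43:9d:58", {"10.0.0.1"}),
    Interface("qr-bd91567c-81", "fa:16:3e:19:96:e6"),
]

print(arp_replies(router, "10.0.0.1", arp_ignore=0))
print(arp_replies(router, "10.0.0.1", arp_ignore=1))
```

With arp_ignore=0 the model yields both MACs, matching the duplicate replies in the capture; with arp_ignore=1 only the IPv4 interface answers.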

Changed in neutron:
assignee: nobody → Brian Haley (brian-haley)
importance: Undecided → High
Brian Haley (brian-haley) wrote :

BTW, I don't see this on my devstack install that's close to master when running the tcpdump in the qrouter namespace, and no sysctl settings differ from the default except for send_redirects. Is that where you ran your tcpdump?

sean redmond (sean-redmond1) wrote :

It may be easier to see by running tcpdump inside the guest or on the tap interface of the instance.

But I can also see this within the qrouter namespace of the dvr router on the nova compute host running the instance.

Within the instance I run:

arp -d 10.0.0.1
ping 10.0.0.1

And the below shows the tcpdump from the dvr qrouter namespace:

root@nova-compute:~# ip netns exec qrouter-d88a7831-2cf1-45cc-8276-742176ca0921 bash
root@nova-compute:~# tcpdump -i any -nn -vv arp
tcpdump: listening on any, link-type LINUX_SLL (Linux cooked), capture size 262144 bytes

09:21:08.694004 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.0.0.1 tell 10.0.0.23, length 28
09:21:08.694025 ARP, Ethernet (len 6), IPv4 (len 4), Reply 10.0.0.1 is-at fa:16:3e:43:9d:58, length 28
09:21:08.694031 ARP, Ethernet (len 6), IPv4 (len 4), Reply 10.0.0.1 is-at fa:16:3e:19:96:e6, length 28

3 packets captured
3 packets received by filter
0 packets dropped by kernel
root@os-nova-compute-04:~#

The below shows the tcpdump after setting arp_filter=1

root@nova-compute:~# ip netns exec qrouter-d88a7831-2cf1-45cc-8276-742176ca0921 echo 1 > /proc/sys/net/ipv4/conf/qr-bd91567c-81/arp_filter
root@nova-compute:~# ip netns exec qrouter-d88a7831-2cf1-45cc-8276-742176ca0921 bash
root@nova-compute:~# tcpdump -i any -nn -vv arp
tcpdump: listening on any, link-type LINUX_SLL (Linux cooked), capture size 262144 bytes

09:19:11.630277 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.0.0.1 tell 10.0.0.23, length 28
09:19:11.630301 ARP, Ethernet (len 6), IPv4 (len 4), Reply 10.0.0.1 is-at fa:16:3e:43:9d:58, length 28

2 packets captured
2 packets received by filter
0 packets dropped by kernel

sean redmond (sean-redmond1) wrote :

I also wanted to test the sysctl settings you proposed and they seem fine:

root@nova-compute:~# ip netns exec qrouter-d88a7831-2cf1-45cc-8276-742176ca0921 echo 1 > /proc/sys/net/ipv4/conf/qr-bd91567c-81/arp_ignore
root@nova-compute:~# ip netns exec qrouter-d88a7831-2cf1-45cc-8276-742176ca0921 echo 2 > /proc/sys/net/ipv4/conf/qr-bd91567c-81/arp_announce

root@nova-compute:~# tcpdump -i any -nn -vv arp
tcpdump: listening on any, link-type LINUX_SLL (Linux cooked), capture size 262144 bytes
09:26:47.558628 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.0.0.1 tell 10.0.0.23, length 28
09:26:47.558652 ARP, Ethernet (len 6), IPv4 (len 4), Reply 10.0.0.1 is-at fa:16:3e:43:9d:58, length 28

2 packets captured
2 packets received by filter
0 packets dropped by kernel
root@nova-compute:~#

I also enclosed a text file to share the output of the below for reference:

root@nova-compute:~# ip netns exec qrouter-d88a7831-2cf1-45cc-8276-742176ca0921 bash
root@nova-compute:~# sysctl -a > /root/qrouter-d88a7831-2cf1-45cc-8276-742176ca0921_sysctl.txt

The kernel in scope is below:

root@nova-compute:~# uname -a
Linux os-nova-compute-04 4.4.0-72-generic #93-Ubuntu SMP Fri Mar 31 14:07:41 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
root@nova-compute:~#

Chris S (bloatyfloat) wrote :

I would suggest the following in neutron/agent/linux/interface.py add_ipv6_addr def:

    def add_ipv6_addr(self, device_name, v6addr, namespace, scope='global'):
        device = ip_lib.IPDevice(device_name,
                                 namespace=namespace)
        net = netaddr.IPNetwork(v6addr)
        device.addr.add(str(net), scope)
        # Answer ARP only for addresses on this device (arp_ignore=1) and
        # use the best local source address in ARP requests (arp_announce=2).
        ip_wrapper = ip_lib.IPWrapper(namespace=namespace)
        ip_wrapper.netns.execute(
            ['sysctl', '-w', 'net.ipv4.conf.%s.arp_ignore=1' % device_name])
        ip_wrapper.netns.execute(
            ['sysctl', '-w', 'net.ipv4.conf.%s.arp_announce=2' % device_name])

With this, the IPv4 sysctl values are set only when an IPv6 address is added to a device, that is, only on devices providing IPv6 connectivity.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/468846

Changed in neutron:
assignee: Brian Haley (brian-haley) → sean redmond (sean-redmond1)
status: New → In Progress
Changed in neutron:
assignee: sean redmond (sean-redmond1) → Brian Haley (brian-haley)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/ocata)

Fix proposed to branch: stable/ocata
Review: https://review.openstack.org/486585

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/468846
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=0f076cfc2d59605bdb5e66241a98ed5fa1fdddeb
Submitter: Jenkins
Branch: master

commit 0f076cfc2d59605bdb5e66241a98ed5fa1fdddeb
Author: Sean Redmond <email address hidden>
Date: Mon May 29 10:50:27 2017 +0100

    Do not respond to ARP on IPv6-only interfaces

    When using dual stack, the IPv6 router interface responds
    to ARP requests that only the IPv4 interface should.
    This results in ARP flux and can cause a guest to address
    packets to the wrong layer-2 address when sending traffic
    to the IPv4 gateway.

    Change arp_ignore and arp_announce sysctl options on interfaces
    in the router namespace to be more strict in how we respond.

    Closes-bug: 1692007
    Change-Id: Ic3c2370995abb027a3412b473ce6bc63790c1105

Changed in neutron:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 11.0.0.0b3

This issue was fixed in the openstack/neutron 11.0.0.0b3 development milestone.

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/ocata)

Reviewed: https://review.openstack.org/486585
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=4f57f19280ae50af74e70d1b4d2b4659a5757745
Submitter: Jenkins
Branch: stable/ocata

commit 4f57f19280ae50af74e70d1b4d2b4659a5757745
Author: Sean Redmond <email address hidden>
Date: Mon May 29 10:50:27 2017 +0100

    Do not respond to ARP on IPv6-only interfaces

    When using dual stack, the IPv6 router interface responds
    to ARP requests that only the IPv4 interface should.
    This results in ARP flux and can cause a guest to address
    packets to the wrong layer-2 address when sending traffic
    to the IPv4 gateway.

    Change arp_ignore and arp_announce sysctl options on interfaces
    in the router namespace to be more strict in how we respond.

    Closes-bug: 1692007
    Change-Id: Ic3c2370995abb027a3412b473ce6bc63790c1105

tags: added: in-stable-ocata
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 10.0.3

This issue was fixed in the openstack/neutron 10.0.3 release.
