Install guide: Add arp_ignore (sysctl.conf) to the other IP options

Bug #1526818 reported by Michael Steffens
This bug affects 4 people
Affects                    Status        Importance  Assigned to
OpenStack Compute (nova)   Invalid       Undecided   Unassigned
neutron                    Invalid       Undecided   Unassigned
openstack-manuals          Fix Released  Low         Unassigned

Bug Description

We are facing very strange ARP behaviour in tenant networks, causing Windows guests to incorrectly decline DHCP addresses. These VMs apparently send an ARP request for the address they have been offered and discard it if a different MAC already claims to own that IP.

We are using the openvswitch agent with the ML2 plugin.

We investigated this issue using Linux guests. Consider the following example: a VM with the fixed IP 192.168.1.15 reports the following ARP cache:

   root@michael-test2:~# arp
   Address HWtype HWaddress Flags Mask Iface
   host-192-168-1-2.openst ether fa:16:3e:de:ab:ea C eth0
   192.168.1.13 ether a6:b2:dc:d8:39:c1 C eth0
   192.168.1.119 (incomplete) eth0
   host-192-168-1-20.opens ether fa:16:3e:76:43:ce C eth0
   host-192-168-1-19.opens ether fa:16:3e:0d:a6:0b C eth0
   host-192-168-1-1.openst ether fa:16:3e:2a:81:ff C eth0
   192.168.1.14 ether 0e:bf:04:b7:ed:52 C eth0

Neither 192.168.1.13 nor 192.168.1.14 exists in this subnet, and their MAC addresses a6:b2:dc:d8:39:c1 and 0e:bf:04:b7:ed:52 actually belong to the qbr* and qvb* devices of other instances, living on their respective hypervisor hosts!

Looking at 0e:bf:04:b7:ed:52, for example, yields

   # ip link list | grep -C1 -e 0e:bf:04:b7:ed:52
   59: qbr9ac24ac1-e1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default
       link/ether 0e:bf:04:b7:ed:52 brd ff:ff:ff:ff:ff:ff
   60: qvo9ac24ac1-e1: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master ovs-system state UP mode DEFAULT group default qlen 1000
   --
   61: qvb9ac24ac1-e1: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master qbr9ac24ac1-e1 state UP mode DEFAULT group default qlen 1000
       link/ether 0e:bf:04:b7:ed:52 brd ff:ff:ff:ff:ff:ff
   62: tap9ac24ac1-e1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master qbr9ac24ac1-e1 state UNKNOWN mode DEFAULT group default qlen 500

on the compute node. Using tcpdump on qbr9ac24ac1-e1 on the host and triggering a fresh ARP lookup from the guest results in

   # tcpdump -i qbr9ac24ac1-e1 -vv -l | grep ARP
   tcpdump: WARNING: qbr9ac24ac1-e1: no IPv4 address assigned
   tcpdump: listening on qbr9ac24ac1-e1, link-type EN10MB (Ethernet), capture size 65535 bytes
   14:00:32.089726 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.1.14 tell 192.168.1.15, length 28
   14:00:32.089740 ARP, Ethernet (len 6), IPv4 (len 4), Reply 192.168.1.14 is-at 0e:bf:04:b7:ed:52 (oui Unknown), length 28
   14:00:32.090141 ARP, Ethernet (len 6), IPv4 (len 4), Reply 192.168.1.14 is-at 7a:a5:71:63:47:94 (oui Unknown), length 28
   14:00:32.090160 ARP, Ethernet (len 6), IPv4 (len 4), Reply 192.168.1.14 is-at 02:f9:33:d5:04:0d (oui Unknown), length 28
   14:00:32.090168 ARP, Ethernet (len 6), IPv4 (len 4), Reply 192.168.1.14 is-at 9a:a0:46:e4:03:06 (oui Unknown), length 28

Four different devices are claiming to own the non-existing IP address! Looking them up in neutron shows they are all related to existing ports on the subnet, but different ones:

   # neutron port-list | grep -e 47fbb8b5-55 -e 46647cca-32 -e e9e2d7c3-7e -e 9ac24ac1-e1
   | 46647cca-3293-42ea-8ec2-0834e19422fa | | fa:16:3e:7d:9c:45 | {"subnet_id": "25dbbdc0-f438-4f89-8663-1772f9c7ef36", "ip_address": "192.168.1.8"} |
   | 47fbb8b5-5549-46e4-850e-bd382375e0f8 | | fa:16:3e:fa:df:32 | {"subnet_id": "25dbbdc0-f438-4f89-8663-1772f9c7ef36", "ip_address": "192.168.1.7"} |
   | 9ac24ac1-e157-484e-b6a2-a1dded4731ac | | fa:16:3e:2a:80:6b | {"subnet_id": "25dbbdc0-f438-4f89-8663-1772f9c7ef36", "ip_address": "192.168.1.15"} |
   | e9e2d7c3-7e58-4bc2-a25f-d48e658b2d56 | | fa:16:3e:0d:a6:0b | {"subnet_id": "25dbbdc0-f438-4f89-8663-1772f9c7ef36", "ip_address": "192.168.1.19"} |

Environment:

Host: Ubuntu Server 14.04
Kernel: linux-image-generic-lts-vivid, 3.19.0-39-generic #44~14.04.1-Ubuntu SMP Wed Dec 2 10:00:35 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
OpenStack Kilo:
# dpkg -l | grep -e nova -e neutron
ii neutron-common 1:2015.1.2-0ubuntu2~cloud0 all Neutron is a virtual network service for Openstack - common
ii neutron-plugin-ml2 1:2015.1.2-0ubuntu2~cloud0 all Neutron is a virtual network service for Openstack - ML2 plugin
ii neutron-plugin-openvswitch-agent 1:2015.1.2-0ubuntu2~cloud0 all Neutron is a virtual network service for Openstack - Open vSwitch plugin agent
ii nova-common 1:2015.1.2-0ubuntu2~cloud0 all OpenStack Compute - common files
ii nova-compute 1:2015.1.2-0ubuntu2~cloud0 all OpenStack Compute - compute node base
ii nova-compute-kvm 1:2015.1.2-0ubuntu2~cloud0 all OpenStack Compute - compute node (KVM)
ii nova-compute-libvirt 1:2015.1.2-0ubuntu2~cloud0 all OpenStack Compute - compute node libvirt support
ii python-neutron 1:2015.1.2-0ubuntu2~cloud0 all Neutron is a virtual network service for Openstack - Python library
ii python-neutron-fwaas 2015.1.2-0ubuntu2~cloud0 all Firewall-as-a-Service driver for OpenStack Neutron
ii python-neutronclient 1:2.3.11-0ubuntu1.2~cloud0 all client - Neutron is a virtual network service for Openstack
ii python-nova 1:2015.1.2-0ubuntu2~cloud0 all OpenStack Compute Python libraries
ii python-novaclient 1:2.22.0-0ubuntu2~cloud0 all client library for OpenStack Compute API

Revision history for this message
Sean Dague (sdague) wrote :

This feels like it needs neutron experts to weigh in, because in this kind of environment the network setup is basically done by neutron.

Changed in nova:
status: New → Incomplete
Revision history for this message
Assaf Muller (amuller) wrote :

Do the qvo/qvb/qbr devices have an IP address assigned? (They shouldn't). I saw an issue with NetworkManager on a particular setup that caused this. It messed everything up in really weird ways.

Revision history for this message
Michael Steffens (michael-steffens-b) wrote :

No, only compute host physical interfaces (mgmt and instance tunnel) have IP addresses assigned, according to "ifconfig -a". The qvo/qvb/qbr devices don't.
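For example, one way to double-check this on a compute node (a sketch, assuming the standard qbr/qvb/qvo/tap device naming shown above; no "inet" lines in the output means none of these devices carries an IPv4 address):

   for dev in $(ip -o link show | awk -F': ' '{print $2}' | grep -E '^(qbr|qvb|qvo|tap)'); do
       echo "== $dev"
       ip -4 addr show dev "$dev" | grep -w inet
   done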

Revision history for this message
Yushiro FURUKAWA (y-furukawa-2) wrote :

Hi,

Are 192.168.1.13 and 192.168.1.14 assigned to devices other than the qbr/qvoXXXX devices on the compute-node host?
I wonder whether, after the DHCPOFFER is sent, a GARP goes out and the compute-node host
replies instead of the Neutron port.

Revision history for this message
Michael Steffens (michael-steffens-b) wrote :

They are indeed assigned to instance tunnel interfaces on compute nodes. They are configured in /etc/neutron/plugins/ml2/ml2_conf.ini as

   [ovs]
   local_ip = 192.168.1.14

on the node hosting the VM above, and 192.168.1.13 on the other compute node.
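One way to confirm which physical interface holds that address on a compute node (a sketch; the grep pattern is the local_ip configured above):

   # the matching line names the instance tunnel interface carrying the local_ip
   ip -4 -o addr show | grep -w "192.168.1.14"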

How can I verify whether these interfaces are receiving and responding to GARP, as you describe it?

Revision history for this message
Michael Steffens (michael-steffens-b) wrote :

I tried to examine the ARP traffic on the tunnel directly using "tcpdump -i em2 -vv -l | grep ARP".

ARP packets are travelling both raw and GRE-wrapped, every couple of seconds. It's hard to correlate them with individual requests.

Observations:

 * raw ARP requests only originate from IPs actually used in the instance tunnel network. Responses only return MAC addresses of physical interfaces attached to that network.

 * GRE wrapped ARP requests originate from existing virtual interface IPs, and are answered with MAC addresses of Neutron ports only, as long as I trigger resolutions of the tunnel IP address of the VMs' own compute hosts.

 * The bogus MAC address 0e:bf:04:b7:ed:52 is never seen on tunnel interface em2, neither raw nor GRE-wrapped, even after wiping this address from the VM's ARP cache and forcing a fresh resolution with ping, which again yields the known bogus address.

 * When triggering a fresh resolution of the IP address used by tunnel interface em2 on the other compute node, however, the whole cascade of excess ARP responses does appear, GRE-wrapped:

    14:34:17.883714 IP (tos 0x0, ttl 64, id 9402, offset 0, flags [DF], proto GRE (47), length 70)
        192.168.1.14 > 192.168.1.13: GREv0, Flags [key present], key=0x1e, length 50
            ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.1.13 tell 192.168.1.15, length 28
    14:34:17.883722 IP (tos 0x0, ttl 64, id 62059, offset 0, flags [DF], proto GRE (47), length 70)
        192.168.1.14 > 192.168.1.2: GREv0, Flags [key present], key=0x1e, length 50
            ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.1.13 tell 192.168.1.15, length 28
    14:34:17.884301 IP (tos 0x0, ttl 64, id 50179, offset 0, flags [DF], proto GRE (47), length 70)
        192.168.1.13 > 192.168.1.14: GREv0, Flags [key present], key=0x1e, length 50
            ARP, Ethernet (len 6), IPv4 (len 4), Reply 192.168.1.13 is-at a6:b2:dc:d8:39:c1 (oui Unknown), length 28
    14:34:17.884331 IP (tos 0x0, ttl 64, id 50180, offset 0, flags [DF], proto GRE (47), length 70)
        192.168.1.13 > 192.168.1.14: GREv0, Flags [key present], key=0x1e, length 50
            ARP, Ethernet (len 6), IPv4 (len 4), Reply 192.168.1.13 is-at 7e:d6:9b:cd:9d:2a (oui Unknown), length 28
    14:34:17.884339 IP (tos 0x0, ttl 64, id 50181, offset 0, flags [DF], proto GRE (47), length 70)
        192.168.1.13 > 192.168.1.14: GREv0, Flags [key present], key=0x1e, length 50
            ARP, Ethernet (len 6), IPv4 (len 4), Reply 192.168.1.13 is-at 56:b2:1c:f7:d2:88 (oui Unknown), length 28
    14:34:17.884347 IP (tos 0x0, ttl 64, id 50182, offset 0, flags [DF], proto GRE (47), length 70)
        192.168.1.13 > 192.168.1.14: GREv0, Flags [key present], key=0x1e, length 50
            ARP, Ethernet (len 6), IPv4 (len 4), Reply 192.168.1.13 is-at 16:6c:37:19:3e:3d (oui Unknown), length 28
    14:34:17.884355 IP (tos 0x0, ttl 64, id 50183, offset 0, flags [DF], proto GRE (47), length 70)
        192.168.1.13 > 192.168.1.14: GREv0, Flags [key present], key=0x1e, length 50
            ARP, Ethernet (len 6), IPv4 (len 4), Reply 192.168.1.13 is-at 9e:b4:03:8e:74:cb (oui Unknown), le...


Revision history for this message
Yushiro FURUKAWA (y-furukawa-2) wrote :

Hi Michael, sorry for late reply.

You can confirm the GARP response by using the following command:

  arping -D <ip_address>
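For example, probing one of the bogus addresses from a guest on the tenant subnet (a sketch, assuming the iputils arping and the interface/address from this report):

  # -D runs duplicate address detection: it prints the MAC of whatever answers
  # for the probed address and exits non-zero if a reply was received
  arping -D -I eth0 -c 3 192.168.1.14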

BTW, I've reproduced similar behavior with the following conditions:

  - network_type: VLAN
  - core_plugin: ML2 with ovs driver

Then, I avoided this behavior by applying one of the following settings:

  sysctl -w net.ipv4.conf.all.arp_ignore=1

or

  sysctl -w net.ipv4.conf.default.arp_ignore=1
  sysctl -w net.ipv4.conf.<bridge_name1>.arp_ignore=1
  sysctl -w net.ipv4.conf.<bridge_name2>.arp_ignore=1
  ...

Please try it :)
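For reference, arp_ignore=1 makes the kernel answer an ARP request only when the target IP address is configured on the interface that received the request, which would explain why the qbr*/qvb* devices were answering for the compute host's tunnel address here. The current value can be checked first (a sketch):

  # 0 (the kernel default) replies for any local address on any interface;
  # 1 replies only when the target address is on the receiving interface
  sysctl net.ipv4.conf.all.arp_ignore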

Revision history for this message
Augustina Ragwitz (auggy) wrote :

Hi Michael, I noticed there hasn't been a response to the last item here in over 2 months. Were you able to resolve the issue with Yushiro's suggested workaround, or in another way?

Revision history for this message
Michael Steffens (michael-steffens-b) wrote :

Hi Augustina, hi Yushiro. I'm also sorry for the late reply! I had to wait for an opportunity to safely do the test, as our environment exposing the behavior has become sort of "semi-productive" ...

I executed

  sysctl -w net.ipv4.conf.all.arp_ignore=1

on all compute nodes and indeed that fixed the issue! Thanks a lot!

As far as Neutron is concerned, I believe the ML2 and OVS plugins couldn't do much to fix the problem on their own. If so, I would suggest adding a note to

  http://docs.openstack.org/kilo/install-guide/install/apt/content/neutron-compute-node.html

to configure

  net.ipv4.conf.all.arp_ignore=1

in /etc/sysctl.conf, along with the other IP options.
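Such a note could amount to a sysctl.conf fragment like the following (a sketch; the rp_filter lines stand in for the IP options already listed in that section of the guide):

  # IP options already recommended for the compute node
  net.ipv4.conf.all.rp_filter=0
  net.ipv4.conf.default.rp_filter=0
  # only answer ARP requests for addresses configured on the receiving interface
  net.ipv4.conf.all.arp_ignore=1

followed by reloading the settings with "sysctl -p".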

Revision history for this message
Augustina Ragwitz (auggy) wrote :

Thanks Michael! I'm going to close this out for Nova, and add this as an issue to the doc team to update that documentation.

Changed in nova:
status: Incomplete → Invalid
summary: - Incorrect and excess ARP responses in tenant subnets
+ Install guide should suggest net.ipv4.conf.all.arp_ignore=1
summary: - Install guide should suggest net.ipv4.conf.all.arp_ignore=1
+ Install guide should suggest enabling arp_ignore in sysctl.conf with the
+ other IP options
summary: - Install guide should suggest enabling arp_ignore in sysctl.conf with the
- other IP options
+ Install guide: Add arp_ignore (sysctl.conf) to the other IP options
Changed in openstack-manuals:
assignee: nobody → Ryan Selden (ryanx-seldon)
Revision history for this message
Ryan Selden (ryanx-seldon) wrote :

Michael - does the doc issue only affect the Kilo docs? I can't find anywhere else that uses sysctl.conf in the current master; should this even be applied to newer releases?

Revision history for this message
Michael Steffens (michael-steffens-b) wrote :

Not sure. Kilo was the last release that had the openvswitch plugin documented as default for Neutron networking in the install guide. Since Liberty it refers to linuxbridge instead.

One can still run Liberty or later with openvswitch, too (when you do an upgrade in place and can't change the wiring, for example). In that case these newer releases are affected as well. But I don't know and can't test whether a configuration using the linuxbridge plugin would be affected, I'm afraid.

Changed in openstack-manuals:
assignee: Ryan Selden (ryanx-seldon) → nobody
Revision history for this message
Alexandra Settle (alexandra-settle) wrote :

I'll take a further look into the docs to see where it is applicable. Thanks for doing all the hard yards here people! :)

Changed in openstack-manuals:
status: New → Triaged
importance: Undecided → Medium
assignee: nobody → Alexandra Settle (alexandra-settle)
Changed in openstack-manuals:
assignee: Alexandra Settle (alexandra-settle) → nobody
assignee: nobody → cat lookabaugh (cat-lookabaugh)
Changed in openstack-manuals:
assignee: cat lookabaugh (cat-lookabaugh) → nobody
Changed in openstack-manuals:
assignee: nobody → Alexandra Settle (alexandra-settle)
importance: Medium → Low
status: Triaged → Confirmed
tags: added: install-guide
removed: arp nova ovs subnet tenant
Changed in openstack-manuals:
assignee: Alexandra Settle (alexandra-settle) → nobody
Revision history for this message
Lana (loquacity) wrote :

It seems as though the content might have moved on? Is there still a doc impact here?

Lana (loquacity)
Changed in openstack-manuals:
status: Confirmed → Fix Released
Changed in neutron:
status: New → Invalid