Comment 0 for bug 1794232

Nivedita Singhvi (niveditasinghvi) wrote :

[Impact]

When attempting to create a Geneve tunnel on Ubuntu 16.04 Xenial, in
an environment with Open vSwitch where IPv6 has been disabled,
the creation fails with the error:

"ovs-vsctl: Error detected while setting up 'geneve0': could not
add network device geneve0 to ofproto (Address family not supported
by protocol)."

[Test Case]
(Best to do this on a kvm guest VM so as not to interfere with
 your system's networking)

1. On any Ubuntu Xenial kernel, disable IPv6. This example
   is shown with the 4.15.0-23-generic kernel (which differs
   slightly from 4.4.x in symptoms):

- Edit /etc/default/grub to add the line:
        GRUB_CMDLINE_LINUX="ipv6.disable=1"
- # update-grub
- Reboot

2. Install OVS
# apt install openvswitch-switch

3. Create a Geneve tunnel
# ovs-vsctl add-br br1
# ovs-vsctl add-port br1 geneve1 -- set interface geneve1 \
    type=geneve options:remote_ip=192.168.x.z

(where remote_ip is the IP of the other host)

You will see the following error message:

"ovs-vsctl: Error detected while setting up 'geneve1'.
See ovs-vswitchd log for details."

From /var/log/openvswitch/ovs-vswitchd.log you will see:

"2018-07-02T16:48:13.295Z|00026|dpif|WARN|system@ovs-system:
failed to add geneve1 as port: Address family not supported
by protocol"

You will notice from the "ifconfig" output that the device
genev_sys_6081 is not created.

If you do not disable IPv6 (remove ipv6.disable=1 from
/etc/default/grub + update-grub + reboot), the same
'ovs-vsctl add-port' command completes successfully.
You can see that it is working properly by adding an
IP to the br1 and pinging each host.
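
To confirm which of the two configurations a host actually booted
with, the kernel command line can be inspected. The following is a
minimal sketch, not part of the original test case; the helper name
cmdline_disables_ipv6 is hypothetical, and it only looks for the
exact ipv6.disable=1 token used in step 1.

```shell
#!/bin/sh
# Sketch: report whether a kernel command line contains ipv6.disable=1.
# cmdline_disables_ipv6 is a hypothetical helper, not an existing tool.
cmdline_disables_ipv6() {
    # $1: command line text; defaults to the running kernel's /proc/cmdline
    cmdline=${1-$(cat /proc/cmdline 2>/dev/null)}
    # Pad with spaces so only the whole token matches
    case " $cmdline " in
        *" ipv6.disable=1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

if cmdline_disables_ipv6; then
    echo "booted with ipv6.disable=1"
else
    echo "ipv6.disable=1 not on the kernel command line"
fi
```

If the second message appears after the reboot in step 1, the grub
change did not take effect and the failure will not reproduce.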

On kernel 4.4 (4.4.0-128-generic), the 'ovs-vsctl add-port' command
reports no error and no warning appears in ovs-vswitchd.log, but
the device genev_sys_6081 is still not created and the ping test
fails.
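
Because the 4.4 kernel fails silently, checking for the device
directly is the more reliable test on both kernels. A minimal
sketch, assuming sysfs is mounted at /sys; netdev_exists is a
hypothetical helper name, and its optional second argument exists
only so the check can be pointed at a different directory.

```shell
#!/bin/sh
# Sketch: test whether a network device exists by looking in sysfs.
# netdev_exists is a hypothetical helper, not an existing tool.
netdev_exists() {
    # $1: interface name; $2: directory to search (default /sys/class/net)
    [ -e "${2:-/sys/class/net}/$1" ]
}

if netdev_exists genev_sys_6081; then
    echo "genev_sys_6081 present"
else
    echo "genev_sys_6081 missing"
fi
```

Run it after the 'ovs-vsctl add-port' command in step 3; "missing"
corresponds to the failure described above.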

[Other Info]

* Analysis

Geneve tunnels should work with either IPv4 or IPv6 environments
as a design and support principle.

Currently, however, the implementation requires IPv6 support
for metadata-based tunnels, which Geneve tunnels are,
rather than supporting either of:

a) ipv4 + metadata // whether ipv6 compiled or dynamically disabled
b) ipv4 + metadata + ipv6

What enforces this in the current 4.4.0-x code when opening a Geneve
tunnel is the following in geneve_open() :

        bool ipv6 = geneve->remote.sa.sa_family == AF_INET6;
        bool metadata = geneve->collect_md;
        ...

#if IS_ENABLED(CONFIG_IPV6)
        geneve->sock6 = NULL;
        if (ipv6 || metadata)
                ret = geneve_sock_add(geneve, true);
#endif
        if (!ret && (!ipv6 || metadata))
                ret = geneve_sock_add(geneve, false);

In this scenario CONFIG_IPV6 is enabled but IPv6 is disabled at
boot. Even though ipv6 is false, metadata is always true
for a Geneve open, because OVS sets it unconditionally:

In lib/dpif-netlink-rtnl.c:

case OVS_VPORT_TYPE_GENEVE:
nl_msg_put_flag(&request, IFLA_GENEVE_COLLECT_METADATA);

The second argument of geneve_sock_add is a boolean
value indicating whether it's an ipv6 address family
socket or not, and we thus incorrectly pass a true
value rather than false.

The current "|| metadata" check is unnecessary and incorrectly
sends the tunnel creation code down the ipv6 path, which
fails subsequently when the code expects an ipv6 family socket.
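
The effect of the quoted conditions can be traced with a small
userspace model. This is only a sketch of the two 'if' statements
shown above, not kernel code. It assumes that the IPv6 socket setup
fails with EAFNOSUPPORT when the host is booted with ipv6.disable=1
(consistent with the error text in the log) and that the IPv4 socket
setup always succeeds. The function name geneve_open_model is
hypothetical.

```shell
#!/bin/sh
# Userspace model of the quoted geneve_open() logic; not kernel code.
# Assumptions: IPv6 socket setup fails with EAFNOSUPPORT when IPv6 is
# disabled at boot, and IPv4 socket setup always succeeds.
geneve_open_model() {
    ipv6=$1          # true if the remote address is AF_INET6
    metadata=$2      # true if collect_md is set (always true via OVS)
    ipv6_enabled=$3  # false when booted with ipv6.disable=1
    result=""

    # Mirrors: if (ipv6 || metadata) ret = geneve_sock_add(geneve, true);
    if [ "$ipv6" = true ] || [ "$metadata" = true ]; then
        if [ "$ipv6_enabled" != true ]; then
            # ret is non-zero here, so the IPv4 branch below is skipped
            echo "EAFNOSUPPORT"
            return 0
        fi
        result="sock6"
    fi

    # Mirrors: if (!ret && (!ipv6 || metadata))
    #                  ret = geneve_sock_add(geneve, false);
    if [ "$ipv6" != true ] || [ "$metadata" = true ]; then
        result="${result:+$result }sock4"
    fi
    echo "$result"
}

# OVS request on a host booted with ipv6.disable=1 (IPv4 remote).
geneve_open_model false true false
# Same request with IPv6 enabled.
geneve_open_model false true true
```

The first call prints EAFNOSUPPORT and the second prints
"sock6 sock4": with metadata set, the IPv4 socket is never
attempted once the IPv6 socket has failed, even though the remote
address is IPv4.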

* This issue exists in all versions of the kernel up to the present
   mainline and net-next trees.

* Testing with a trivial patch that removes this check and makes
  changes similar to those made for vxlan (which had the
  same issue) has been successful. Patches for various
  versions will be attached here soon.

* We are in the process of sending a patch for this upstream
  once it has completed adequate testing.

* Example Versions (bug exists in all versions of Ubuntu
  and mainline):

$ uname -r
4.4.0-135-generic

$ lsb_release -rd
Description: Ubuntu 16.04.5 LTS
Release: 16.04

$ dpkg -l | grep openvswitch-switch
ii openvswitch-switch 2.5.4-0ubuntu0.16.04.1