Geneve tunnels don't work when ipv6 is disabled

Bug #1794232 reported by Nivedita Singhvi on 2018-09-25
This bug affects 1 person
Affects          Importance  Assigned to
linux (Ubuntu)   High        Nivedita Singhvi
Xenial           High        Nivedita Singhvi
Bionic           High        Nivedita Singhvi
Cosmic           High        Nivedita Singhvi
Disco            High        Nivedita Singhvi

Bug Description

SRU Justification

Impact: Cannot create geneve tunnels if ipv6 is disabled dynamically.

Fix:
Fixed by upstream commit in v5.0:
Commit: cf1c9ccba7308e48a68fa77f476287d9d614e4c7
"geneve: correctly handle ipv6.disable module parameter"

Hence available in Disco and later; required in X,B,C.

Testcase:
1. Boot with "ipv6.disable=1"
2. Then try to create a geneve tunnel using:
   # ovs-vsctl add-br br1
   # ovs-vsctl add-port br1 geneve1 -- set interface geneve1 \
       type=geneve options:remote_ip=192.168.x.z
   (where remote_ip is the IP of the other host)

Regression Potential: Low. The change only affects geneve tunnels when
ipv6 is dynamically disabled, a case that currently does not work at all.

Other Info:
* Mainline commit msg includes reference to a fix for
  non-metadata tunnels (infrastructure is not yet in
  our tree prior to Disco), hence not being included
  at this time under this case.

  At this time, all geneve tunnels created as above
  are metadata-enabled.

---
[Impact]

When attempting to create a geneve tunnel with Open vSwitch on
Ubuntu 16.04 Xenial, on a system where ipv6 has been disabled,
the creation fails with the error:

"ovs-vsctl: Error detected while setting up 'geneve0': could not
add network device geneve0 to ofproto (Address family not supported
by protocol)."
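
"Address family not supported by protocol" is the message Linux associates with the EAFNOSUPPORT error code. As a minimal sketch (the helper name ipv6_family_available is hypothetical, not part of OVS or the kernel), a userspace check for whether the running kernel accepts AF_INET6 sockets at all could look like this:

```c
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the kernel accepts AF_INET6 sockets,
 * 0 if socket() fails with EAFNOSUPPORT, and -1 on any other error. */
int ipv6_family_available(void)
{
    int fd = socket(AF_INET6, SOCK_DGRAM, 0);

    if (fd >= 0) {
        close(fd);
        return 1;
    }
    return errno == EAFNOSUPPORT ? 0 : -1;
}
```

On a system booted with ipv6.disable=1 this is expected to return 0, which is the same condition the geneve driver runs into when it tries to open its IPv6 socket.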

[Fix]
There is an upstream commit for this in v5.0 mainline (and in Disco and later Ubuntu kernels).

"geneve: correctly handle ipv6.disable module parameter"
Commit: cf1c9ccba7308e48a68fa77f476287d9d614e4c7

This fix is needed on all our series prior to Disco
and the v5.0 kernel: X, B, C. It is identical to the
fix we implemented and tested internally but had not
yet pushed upstream.

[Test Case]
(Best to do this on a kvm guest VM so as not to interfere with
 your system's networking)

1. On any Ubuntu Xenial kernel, disable ipv6. This example
   is shown with the 4.15.0-23-generic kernel (which differs
   slightly from 4.4.x in symptoms):

- Edit /etc/default/grub to add the line:
        GRUB_CMDLINE_LINUX="ipv6.disable=1"
- # update-grub
- Reboot

2. Install OVS
# apt install openvswitch-switch

3. Create a Geneve tunnel
# ovs-vsctl add-br br1
# ovs-vsctl add-port br1 geneve1 -- set interface geneve1 \
    type=geneve options:remote_ip=192.168.x.z

(where remote_ip is the IP of the other host)

You will see the following error message:

"ovs-vsctl: Error detected while setting up 'geneve1'.
See ovs-vswitchd log for details."

From /var/log/openvswitch/ovs-vswitchd.log you will see:

"2018-07-02T16:48:13.295Z|00026|dpif|WARN|system@ovs-system:
failed to add geneve1 as port: Address family not supported
by protocol"

You will notice from the "ifconfig" output that the device
genev_sys_6081 is not created.

If you do not disable IPv6 (remove ipv6.disable=1 from
/etc/default/grub + update-grub + reboot), the same
'ovs-vsctl add-port' command completes successfully.
You can see that it is working properly by adding an
IP to the br1 and pinging each host.

On kernel 4.4 (4.4.0-128-generic), the 'ovs-vsctl add-port' command
reports no error and no warning is logged in ovs-vswitchd.log, but
the device genev_sys_6081 is still not created and the ping test
fails.

With the fixed test kernel, the interfaces and tunnel
are created successfully.
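
Since the 4.4 kernel fails silently, checking for the genev_sys_6081 device is the most direct way to verify the result. A minimal sketch of that check in C, using the standard if_nametoindex() call (the wrapper name netdev_exists is hypothetical):

```c
#include <net/if.h>

/* Hypothetical wrapper: returns 1 if a network interface with the given
 * name exists on this host, 0 otherwise. */
int netdev_exists(const char *name)
{
    /* if_nametoindex() returns 0 when no interface has this name. */
    return if_nametoindex(name) != 0;
}
```

Calling netdev_exists("genev_sys_6081") after the add-port step should return 1 on a fixed kernel and 0 on an affected one.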

[Regression Potential]
* Low. The change affects only the geneve driver, and only when
  ipv6 is disabled. Since tunnels do not work at all in that case
  today, this fix gets the tunnel up and running for the common case.

[Other Info]

* Analysis

Geneve tunnels should work with either IPv4 or IPv6 environments
as a design and support principle.

Currently, however, the implementation requires ipv6 support
for metadata-based tunnels (which geneve tunnels created by
OVS are), rather than supporting both of:

a) ipv4 + metadata // whether ipv6 compiled or dynamically disabled
b) ipv4 + metadata + ipv6

What enforces this in the current 4.4.0-x code when opening a Geneve
tunnel is the following in geneve_open() :

        bool ipv6 = geneve->remote.sa.sa_family == AF_INET6;
        bool metadata = geneve->collect_md;
        ...

#if IS_ENABLED(CONFIG_IPV6)
        geneve->sock6 = NULL;
        if (ipv6 || metadata)
                ret = geneve_sock_add(geneve, true);
#endif
        if (!ret && (!ipv6 || metadata))
                ret = geneve_sock_add(geneve, false);

Here CONFIG_IPV6 is enabled but IPv6 is disabled at boot.
Even though ipv6 is false, metadata is always true for a
geneve open, because OVS sets it unconditionally:

In lib/dpif-netlink-rtnl.c:

        case OVS_VPORT_TYPE_GENEVE:
                nl_msg_put_flag(&request, IFLA_GENEVE_COLLECT_METADATA);

The second argument of geneve_sock_add is a boolean
value indicating whether it's an ipv6 address family
socket or not, and we thus incorrectly pass a true
value rather than false.

The current "|| metadata" check is unnecessary and incorrectly
sends the tunnel creation code down the ipv6 path, which
fails subsequently when the code expects an ipv6 family socket.
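
The failure path described above can be reproduced in a small userspace model. This is only a sketch under stated assumptions: model_sock_add() is a hypothetical stand-in for geneve_sock_add() that is assumed to fail with -EAFNOSUPPORT when an IPv6 socket is requested and IPv6 is unavailable, and model_open_tolerant() is one illustrative way to handle that case, not a copy of the upstream commit.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for geneve_sock_add(): an IPv6 socket cannot be
 * created when IPv6 is unavailable (assumed behaviour for this model). */
static int model_sock_add(bool want_ipv6, bool ipv6_available)
{
    if (want_ipv6 && !ipv6_available)
        return -EAFNOSUPPORT;
    return 0;
}

/* Socket selection as quoted from the 4.4.0-x geneve_open(). */
int model_open_current(bool ipv6, bool metadata, bool ipv6_available)
{
    int ret = 0;

    if (ipv6 || metadata)
        ret = model_sock_add(true, ipv6_available);
    if (!ret && (!ipv6 || metadata))
        ret = model_sock_add(false, ipv6_available);
    return ret;
}

/* Illustrative variant (an assumption, not the upstream patch): when an
 * IPv4 socket is also wanted, tolerate -EAFNOSUPPORT from the IPv6 socket. */
int model_open_tolerant(bool ipv6, bool metadata, bool ipv6_available)
{
    bool want_v6 = ipv6 || metadata;
    bool want_v4 = !ipv6 || metadata;
    int ret = 0;

    if (want_v6) {
        ret = model_sock_add(true, ipv6_available);
        if (ret == -EAFNOSUPPORT && want_v4)
            ret = 0; /* fall back to the IPv4 socket only */
    }
    if (!ret && want_v4)
        ret = model_sock_add(false, ipv6_available);
    return ret;
}
```

With ipv6 = false, metadata = true and IPv6 unavailable, model_open_current() returns -EAFNOSUPPORT, matching the reported error, while model_open_tolerant() returns 0. A tunnel with an explicit IPv6 remote still fails in both models when IPv6 is unavailable, which is the desired outcome.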

* This issue exists in all versions of the kernel up to the present
   mainline and net-next trees.

* Testing with a trivial patch to remove that and make
  similar changes to those made for vxlan (which had the
  same issue) has been successful. Patches for various
  versions to be attached here soon.

* Example Versions (bug exists in all versions of Ubuntu
  and mainline):

$ uname -r
4.4.0-135-generic

$ lsb_release -rd
Description: Ubuntu 16.04.5 LTS
Release: 16.04

$ dpkg -l | grep openvswitch-switch
ii openvswitch-switch 2.5.4-0ubuntu0.16.04.1

tags: added: geneve kernel-bug

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1794232

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: bionic

Logs not necessary at this time, will attach patches and other
information as needed.

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Changed in linux (Ubuntu):
importance: Undecided → Medium
status: Confirmed → Triaged
tags: added: kernel-da-key

We tested the patch discussed above internally, with success,
although our testing was limited (opening a geneve tunnel
between 2 kvm guests).

Jiri has now pushed an identical patch upstream which is
available in the v5.0 kernel and later.

"geneve: correctly handle ipv6.disable module parameter"
Commit: cf1c9ccba7308e48a68fa77f476287d9d614e4c7

Although I do not have testing validation from the original
poster, the fix has been committed upstream, so I'm going
to go ahead and get the SRU request started.

Changed in linux (Ubuntu):
status: Triaged → In Progress
importance: Medium → High
Changed in linux (Ubuntu Cosmic):
status: New → In Progress
Changed in linux (Ubuntu Disco):
assignee: nobody → Nivedita Singhvi (niveditasinghvi)
Changed in linux (Ubuntu Cosmic):
assignee: nobody → Nivedita Singhvi (niveditasinghvi)
Changed in linux (Ubuntu Xenial):
assignee: nobody → Nivedita Singhvi (niveditasinghvi)
status: New → In Progress
Changed in linux (Ubuntu Cosmic):
importance: Undecided → High
Changed in linux (Ubuntu Xenial):
importance: Undecided → High
description: updated
Changed in linux (Ubuntu Bionic):
status: New → In Progress
importance: Undecided → High
assignee: nobody → Nivedita Singhvi (niveditasinghvi)
description: updated
Changed in linux (Ubuntu Disco):
status: In Progress → Fix Released
description: updated
Matthew Ruffell (mruffell) wrote :

I tested a fully up to date cosmic VM using the reproducer steps in the description, and found that I could not create a geneve tunnel when ipv6 is disabled.

I compiled a new cosmic kernel off the master-next branch with this commit included:
"geneve: correctly handle ipv6.disable module parameter"
Commit: cf1c9ccba7308e48a68fa77f476287d9d614e4c7

The commit was a clean cherry-pick, and when the patched kernel was installed, I was able to successfully create a geneve tunnel when ipv6 is disabled.

I also tested the latest disco daily build, and found that disco is not affected, as I can successfully create a geneve tunnel when ipv6 is disabled.

description: updated
tags: added: cosmic xenial

Submitted SRU request for Bionic, Cosmic.

Huge thanks for the testing, Matthew!

Resubmitted SRU for B,C for this kernel cycle.

Changed in linux (Ubuntu Bionic):
status: In Progress → Fix Committed
Changed in linux (Ubuntu Cosmic):
status: In Progress → Fix Committed

A 4.4 test kernel with the fix backported is available at:

https://people.canonical.com/~nivedita/geneve-xenial-test/

if anyone wishes to validate the 4.4 X solution.

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-bionic' to 'verification-done-bionic'. If the problem still exists, change the tag 'verification-needed-bionic' to 'verification-failed-bionic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-bionic

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-cosmic' to 'verification-done-cosmic'. If the problem still exists, change the tag 'verification-needed-cosmic' to 'verification-failed-cosmic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-cosmic

Bionic, Cosmic kernels successfully tested.
I've updated the tags.

tags: added: verification-done-bionic verification-done-cosmic
removed: verification-needed-bionic verification-needed-cosmic