regression: IPv6 PMTU discovery fails with source-specific routing

Bug #1788623 reported by Mikael Magnusson on 2018-08-23
This bug affects 2 people
Affects           Status     Importance  Assigned to  Milestone
linux (Ubuntu)    Confirmed  High        Unassigned
  Bionic          Confirmed  High        Unassigned

Bug Description

IPv6 PMTU discovery fails when using source-specific routing on Ubuntu 18.04.

I have attached a test case called pmtu-sads.sh which is based on tools/testing/selftests/net/pmtu.sh in the linux source.

I have verified that the test fails on:
Ubuntu 18.04 with 4.15.0-30.32
Ubuntu 18.04 with 4.17.0-041700.201806041953
Ubuntu 18.04 with 4.18.3-041803.201808180530

The test succeeds on Ubuntu 16.04 with 4.4.0.131.137, which makes this a regression.

System information:
Ubuntu 4.15.0-30.32-generic 4.15.18

Description: Ubuntu 18.04.1 LTS
Release: 18.04

See below for a patch which is working for me. I'm currently using linux kernel 4.15.0-32.35 built from git://kernel.ubuntu.com/ubuntu/ubuntu-bionic.git with this patch.
---
ProblemType: Bug
AlsaVersion: Advanced Linux Sound Architecture Driver Version k4.15.0-30-generic.
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay': 'aplay'
ApportVersion: 2.20.9-0ubuntu7.2
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord': 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/by-path', '/dev/snd/hwC0D0', '/dev/snd/pcmC0D0c', '/dev/snd/pcmC0D0p', '/dev/snd/controlC0', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
Card0.Amixer.info: Error: [Errno 2] No such file or directory: 'amixer': 'amixer'
Card0.Amixer.values: Error: [Errno 2] No such file or directory: 'amixer': 'amixer'
DistroRelease: Ubuntu 18.04
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig': 'iwconfig'
Lsusb:
 Bus 001 Device 002: ID 0627:0001 Adomax Technology Co., Ltd
 Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
 Bus 004 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
 Bus 003 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
 Bus 002 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
MachineType: QEMU Standard PC (i440FX + PIIX, 1996)
Package: linux (not installed)
ProcFB: 0 qxldrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.15.0-30-generic root=UUID=041f6e1e-4904-4760-8518-3a88dc288556 ro splash quiet vt.handoff=1
ProcVersionSignature: Ubuntu 4.15.0-30.32-generic 4.15.18
RelatedPackageVersions:
 linux-restricted-modules-4.15.0-30-generic N/A
 linux-backports-modules-4.15.0-30-generic N/A
 linux-firmware 1.173.1
RfKill: Error: [Errno 2] No such file or directory: 'rfkill': 'rfkill'
Tags: bionic
Uname: Linux 4.15.0-30-generic x86_64
UnreportableReason: This report is about a package that is not installed.
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: adm cdrom dip lpadmin plugdev sambashare sudo
_MarkForUpload: False
dmi.bios.date: 04/01/2014
dmi.bios.vendor: SeaBIOS
dmi.bios.version: 1.10.2-1ubuntu1
dmi.chassis.type: 1
dmi.chassis.vendor: QEMU
dmi.chassis.version: pc-i440fx-bionic
dmi.modalias: dmi:bvnSeaBIOS:bvr1.10.2-1ubuntu1:bd04/01/2014:svnQEMU:pnStandardPC(i440FX+PIIX,1996):pvrpc-i440fx-bionic:cvnQEMU:ct1:cvrpc-i440fx-bionic:
dmi.product.name: Standard PC (i440FX + PIIX, 1996)
dmi.product.version: pc-i440fx-bionic
dmi.sys.vendor: QEMU

Mikael Magnusson (mikma) wrote :

Patch which fixes the bug for me.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1788623

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: bionic
Mikael Magnusson (mikma) on 2018-08-23
description: updated
tags: added: patch

apport information

tags: added: apport-collected
description: updated

Mikael Magnusson (mikma) wrote :

I added logs from a Virtual Machine I did some testing on. But I doubt they are very interesting.

Mikael Magnusson (mikma) on 2018-08-23
Changed in linux (Ubuntu):
status: Incomplete → New

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
Joseph Salisbury (jsalisbury) wrote :

I'd like to perform a bisect to figure out what commit caused this regression. We need to identify the earliest kernel where the issue started happening as well as the last kernel that did not have this issue.

Can you test the following kernels and report back? We are looking for the first kernel version that exhibits this bug:

v4.6 Final: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.6/
v4.8 Final: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.8/
v4.13 Final: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.13/
v4.14-rc1: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.14-rc1/
v4.14 Final: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.14/
v4.15 Final: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.15

You don't have to test every kernel, just up until the kernel that first has this bug.

Thanks in advance!

Changed in linux (Ubuntu):
importance: Undecided → High
tags: added: performing-bisect
Changed in linux (Ubuntu Bionic):
status: New → Incomplete
Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Changed in linux (Ubuntu Bionic):
importance: Undecided → High
Changed in linux (Ubuntu):
assignee: nobody → Joseph Salisbury (jsalisbury)
Changed in linux (Ubuntu Bionic):
assignee: nobody → Joseph Salisbury (jsalisbury)
Mikael Magnusson (mikma) wrote :

I have tested all kernels in the list (except 4.6, whose link returns a 404), and the first kernel with the bug is v4.15 Final.

Kernel v4.15 also seems to be the first kernel with the exception table in net/ipv6/route.c.
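
The test results reported above narrow the regression to a window between two releases. A minimal sketch of that reasoning follows, assuming the results are recorded as an ordered list of (kernel, has_bug) pairs. The names RESULTS and regression_window are hypothetical and used only for illustration; v4.6 is left out because it could not be downloaded and was never tested.

```python
from typing import List, Optional, Tuple

# (kernel label, True if the PMTU test case failed on it), oldest first.
# Values follow the results reported in this thread; v4.6 is omitted
# because its download link returned a 404.
RESULTS: List[Tuple[str, bool]] = [
    ("v4.8", False),
    ("v4.13", False),
    ("v4.14-rc1", False),
    ("v4.14", False),
    ("v4.15", True),
]


def regression_window(
    results: List[Tuple[str, bool]],
) -> Tuple[Optional[str], Optional[str]]:
    """Return (last_good, first_bad) for results ordered oldest to newest."""
    last_good: Optional[str] = None
    for label, has_bug in results:
        if has_bug:
            # The first failing kernel closes the window.
            return last_good, label
        last_good = label
    # No kernel in the list showed the bug.
    return last_good, None


print(regression_window(RESULTS))
```

Under these assumptions the window is v4.14 (last good) to v4.15 (first bad), which is the range a commit-level bisect would then have to cover.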

Changed in linux (Ubuntu):
status: Incomplete → In Progress
Changed in linux (Ubuntu Bionic):
status: Incomplete → In Progress
Joseph Salisbury (jsalisbury) wrote :

The following commit introduced the exception table in v4.15-rc1:
35732d01fe31 ipv6: introduce a hash table to store dst cache

This commit does not revert easily. Before I contact upstream regarding this regression, can you test the v4.19-rc1 kernel to see whether any later commits have fixed this bug? It can be downloaded from:

 http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.19-rc1/

Mikael Magnusson (mikma) wrote :

I ran the test case on the 4.19.0-041900rc1-generic kernel and it also has the bug.

Joseph Salisbury (jsalisbury) wrote :

To confirm this regression is due to commit 35732d01fe31, I built a mainline test kernel with that commit as the tip.

This test kernel can be downloaded from:

http://kernel.ubuntu.com/~jsalisbury/lp1788623

Can you test this kernel and see if it resolves this bug?

Mikael Magnusson (mikma) wrote :

Yes, kernel 4.14.0-041400rc3-generic resolves the bug. The test case in #1 succeeds with that kernel.

Joseph Salisbury (jsalisbury) wrote :

Thanks for testing. That implies that commit 35732d01fe31 is not the specific commit that introduced the regression. I built a second test kernel up to the commit f5bbe7ee79c2.

This test kernel can be downloaded from:

http://kernel.ubuntu.com/~jsalisbury/lp1788623

Can you test this kernel and see if it resolves this bug?

Mikael Magnusson (mikma) wrote :

Yes, the following kernels also resolve this bug:

Linux ubuntu 4.14.0-041400rc3-generic #201808301751 SMP Thu Aug 30 17:53:30 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Linux ubuntu 4.14.0-041400rc3-generic #201808301846 SMP Thu Aug 30 18:47:56 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

Joseph Salisbury (jsalisbury) wrote :

Thanks for testing. There are quite a few ipv6 commits added in v4.15-rc1, so we may have to bisect through them. Can you first confirm that v4.15-rc1 exhibits the bug? It can be downloaded from:

http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.15-rc1/

Mikael Magnusson (mikma) wrote :

Yes, 4.15.0-041500rc1-generic exhibits the bug.

Joseph Salisbury (jsalisbury) wrote :

I started a kernel bisect between v4.14 final and v4.15-rc1. The kernel bisect will require testing about 13 test kernels.

I built the first test kernel, up to the following commit:
1be2172e96e33bfa22a5c7a651f768ef30ce3984

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1788623

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance
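
The figure of about 13 test kernels follows from the halving that a bisect performs at each step. The sketch below estimates the step count for a given number of candidate commits, assuming an idealized binary search over a linear history. Real kernel history contains merges, so the estimate printed by git may differ slightly. The function name bisect_steps is hypothetical.

```python
def bisect_steps(n_commits: int) -> int:
    """Worst-case number of test builds to isolate one commit out of n_commits.

    Assumes an idealized binary search: every test halves the candidate set.
    Equivalent to ceil(log2(n_commits)), computed with integer arithmetic.
    """
    if n_commits < 1:
        raise ValueError("n_commits must be at least 1")
    return (n_commits - 1).bit_length()


# Under this model, 13 steps cover any range of 4097 to 8192 candidates.
for n in (4097, 8192, 8193):
    print(n, bisect_steps(n))
```

The point of the model is that the cost grows logarithmically: doubling the number of candidate commits adds only one more test kernel.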

Mikael Magnusson (mikma) wrote :

Sorry for the long delay. I have tested the following version now, and the test (pmtu-sadr.sh) reports failure.

Linux ubuntu 4.14.0-041400-generic #201809131957 SMP Thu Sep 13 20:03:05 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

Joseph Salisbury (jsalisbury) wrote :

You mention in comment #2 that your patch resolves this bug. Is that still the case, and have you sent it upstream for inclusion in mainline?

Changed in linux (Ubuntu Bionic):
status: In Progress → Confirmed
Changed in linux (Ubuntu):
status: In Progress → Confirmed
Changed in linux (Ubuntu Bionic):
assignee: Joseph Salisbury (jsalisbury) → nobody
Changed in linux (Ubuntu):
assignee: Joseph Salisbury (jsalisbury) → nobody