RTNL assertion failure on ipvlan

Bug #1776927 reported by Neil Wilson
This bug affects 1 person
Affects                 Status         Importance   Assigned to        Milestone
linux (Ubuntu)          Fix Released   Medium       Joseph Salisbury
linux (Ubuntu Bionic)   Fix Released   Medium       Joseph Salisbury

Bug Description

== SRU Justification ==
Running up two containers using ipvlan with IPv6 autoconf active triggers
an assertion failure in the kernel (cloud image running on Brightbox).
This is a regression caused by commit e9997c2938b2.

The regression is fixed by commit 8230819494b3, which in turn requires
commit 94333fac44d1 as a prerequisite.

== Fixes ==
94333fac44d1 ("ipvlan: drop ipv6 dependency")
8230819494b3 ("ipvlan: use per device spinlock to protect addrs list updates")
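For readers unfamiliar with the pattern named in the second commit title, the following is a minimal user-space sketch in Python. It is not the kernel patch: the `Device` and `Port` classes, the `addrs_lock` attribute and the method names are hypothetical, and a `threading.Lock` only stands in for a kernel spinlock. The idea it illustrates is that each device guards its own address list with its own lock, so a lookup such as the `ipvlan_addr_busy` call seen in the trace in this report does not have to rely on one global lock being held by the caller.

```python
import threading


class Device:
    """Hypothetical stand-in for one device and its address list."""

    def __init__(self, name):
        self.name = name
        self.addrs = []
        # One lock per device, taken only around accesses to self.addrs.
        self.addrs_lock = threading.Lock()

    def has_addr(self, addr):
        with self.addrs_lock:
            return addr in self.addrs

    def add_addr(self, addr):
        # Check and append under the same lock acquisition so that two
        # concurrent callers cannot both insert the same address.
        with self.addrs_lock:
            if addr in self.addrs:
                return False
            self.addrs.append(addr)
            return True


class Port:
    """Hypothetical container for the devices sharing one parent."""

    def __init__(self):
        self.devices = []

    def addr_busy(self, addr):
        # Each device's list is read under that device's own lock.
        return any(dev.has_addr(addr) for dev in self.devices)


port = Port()
first, second = Device("ipvl0"), Device("ipvl1")
port.devices.extend([first, second])

# Eight threads race to add the same address to one device.
results = []
threads = [
    threading.Thread(target=lambda: results.append(first.add_addr("2001:db8::1")))
    for _ in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

After the threads finish, exactly one `add_addr` call has succeeded and `port.addr_busy("2001:db8::1")` reports the address as in use, while the second device's list is untouched.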

== Regression Potential ==
Low. Fixes a current regression. The fix was also sent to stable, so
it has had additional upstream review.

== Test Case ==
A test kernel was built with these patches and tested by the original bug reporter.
The bug reporter states the test kernel resolved the bug.

Running up two containers using ipvlan with IPv6 autoconf active triggers an assertion failure in the kernel (cloud image running on Brightbox):

Jun 14 15:16:37 srv-x3w2q kernel: RTNL: assertion failed at /build/linux-uT8zSN/linux-4.15.0/drivers/net/ipvlan/ipvlan_core.c (110)
Jun 14 15:16:37 srv-x3w2q kernel: CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.15.0-23-generic #25-Ubuntu
Jun 14 15:16:37 srv-x3w2q kernel: Hardware name: Red Hat KVM, BIOS 1.10.2-3.el7_4.1 04/01/2014
Jun 14 15:16:37 srv-x3w2q kernel: Call Trace:
Jun 14 15:16:37 srv-x3w2q kernel: <IRQ>
Jun 14 15:16:37 srv-x3w2q kernel: dump_stack+0x63/0x8b
Jun 14 15:16:37 srv-x3w2q kernel: ipvlan_addr_busy+0x96/0xa0 [ipvlan]
Jun 14 15:16:37 srv-x3w2q kernel: ipvlan_addr6_event+0x77/0xd0 [ipvlan]
Jun 14 15:16:37 srv-x3w2q kernel: notifier_call_chain+0x4c/0x70
Jun 14 15:16:37 srv-x3w2q kernel: atomic_notifier_call_chain+0x1a/0x20
Jun 14 15:16:37 srv-x3w2q kernel: inet6addr_notifier_call_chain+0x1b/0x20
Jun 14 15:16:37 srv-x3w2q kernel: ipv6_add_addr+0x43d/0x5c0
Jun 14 15:16:37 srv-x3w2q kernel: ? addrconf_prefix_route+0xd7/0x120
Jun 14 15:16:37 srv-x3w2q kernel: addrconf_prefix_rcv_add_addr+0xb9/0x250
Jun 14 15:16:37 srv-x3w2q kernel: ? addrconf_prefix_rcv_add_addr+0xb9/0x250
Jun 14 15:16:37 srv-x3w2q kernel: addrconf_prefix_rcv+0x26c/0x740
Jun 14 15:16:37 srv-x3w2q kernel: ndisc_router_discovery+0x683/0xbe0
Jun 14 15:16:37 srv-x3w2q kernel: ? ndisc_router_discovery+0x683/0xbe0
Jun 14 15:16:37 srv-x3w2q kernel: ndisc_rcv+0xe9/0x100
Jun 14 15:16:37 srv-x3w2q kernel: icmpv6_rcv+0x408/0x540
Jun 14 15:16:37 srv-x3w2q kernel: ip6_input_finish+0xcc/0x460
Jun 14 15:16:37 srv-x3w2q kernel: ip6_input+0x3f/0xb0
Jun 14 15:16:37 srv-x3w2q kernel: ip6_rcv_finish+0x92/0x100
Jun 14 15:16:37 srv-x3w2q kernel: ipv6_rcv+0x346/0x550
Jun 14 15:16:37 srv-x3w2q kernel: ? ipvlan_handle_frame+0xbd/0x1c0 [ipvlan]
Jun 14 15:16:37 srv-x3w2q kernel: __netif_receive_skb_core+0x432/0xb40
Jun 14 15:16:37 srv-x3w2q kernel: ? ipv6_gro_receive+0x22b/0x390
Jun 14 15:16:37 srv-x3w2q kernel: __netif_receive_skb+0x18/0x60
Jun 14 15:16:37 srv-x3w2q kernel: ? __netif_receive_skb+0x18/0x60
Jun 14 15:16:37 srv-x3w2q kernel: netif_receive_skb_internal+0x37/0xd0
Jun 14 15:16:37 srv-x3w2q kernel: napi_gro_receive+0xc5/0xf0
Jun 14 15:16:37 srv-x3w2q kernel: receive_buf+0x275/0x1180 [virtio_net]
Jun 14 15:16:37 srv-x3w2q kernel: ? vring_unmap_one+0x1b/0x80
Jun 14 15:16:37 srv-x3w2q kernel: virtnet_poll+0xc4/0x289 [virtio_net]
Jun 14 15:16:37 srv-x3w2q kernel: net_rx_action+0x140/0x3a0
Jun 14 15:16:37 srv-x3w2q kernel: __do_softirq+0xdf/0x2b2
Jun 14 15:16:37 srv-x3w2q kernel: irq_exit+0xb6/0xc0
Jun 14 15:16:37 srv-x3w2q kernel: do_IRQ+0x82/0xd0
Jun 14 15:16:37 srv-x3w2q kernel: common_interrupt+0x84/0x84
Jun 14 15:16:37 srv-x3w2q kernel: </IRQ>
Jun 14 15:16:37 srv-x3w2q kernel: RIP: 0010:native_safe_halt+0x6/0x10
Jun 14 15:16:37 srv-x3w2q kernel: RSP: 0018:ffffac5ec0377e80 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffd9
Jun 14 15:16:37 srv-x3w2q kernel: RAX: ffffffffa4196060 RBX: 0000000000000001 RCX: 0000000000000000
Jun 14 15:16:37 srv-x3w2q kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
Jun 14 15:16:37 srv-x3w2q kernel: RBP: ffffac5ec0377e80 R08: 0000000000000000 R09: ffffffffa4c08528
Jun 14 15:16:37 srv-x3w2q kernel: R10: ffff91d27ffb1ca8 R11: 0000000000000000 R12: 0000000000000001
Jun 14 15:16:37 srv-x3w2q kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
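The trace above can also be read mechanically. As a rough aid (a sketch, not an existing kernel tool), the Python below pulls the stack frames out of syslog-style lines like those above and records whether each frame sits between the `<IRQ>` and `</IRQ>` markers. The `parse_trace` name and the regular expression are assumptions about this particular log format; a leading `?` is treated as marking a frame the unwinder was not sure about.

```python
import re

# Matches e.g. "kernel: ? ipvlan_handle_frame+0xbd/0x1c0 [ipvlan]".
FRAME_RE = re.compile(
    r"kernel:\s+(?P<guess>\? )?(?P<func>[A-Za-z_]\w*)"
    r"\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(?P<module>\w+)\])?"
)


def parse_trace(lines):
    """Return (function, module, in_irq, reliable) for each stack frame."""
    frames = []
    in_irq = False
    for line in lines:
        if "</IRQ>" in line:
            in_irq = False
        elif "<IRQ>" in line:
            in_irq = True
        else:
            match = FRAME_RE.search(line)
            if match:
                frames.append((
                    match["func"],
                    match["module"],          # None when no [module] suffix
                    in_irq,
                    match["guess"] is None,   # False for "?" frames
                ))
    return frames


# A few lines copied from the trace in this report.
SAMPLE = [
    "Jun 14 15:16:37 srv-x3w2q kernel: <IRQ>",
    "Jun 14 15:16:37 srv-x3w2q kernel: dump_stack+0x63/0x8b",
    "Jun 14 15:16:37 srv-x3w2q kernel: ipvlan_addr_busy+0x96/0xa0 [ipvlan]",
    "Jun 14 15:16:37 srv-x3w2q kernel: ? addrconf_prefix_route+0xd7/0x120",
    "Jun 14 15:16:37 srv-x3w2q kernel: </IRQ>",
    "Jun 14 15:16:37 srv-x3w2q kernel: RIP: 0010:native_safe_halt+0x6/0x10",
]

for func, module, in_irq, reliable in parse_trace(SAMPLE):
    print(func, module, in_irq, reliable)
```

Run over the full trace, this makes it easy to see that `ipvlan_addr_busy` was reached inside the `<IRQ>` section, via the IPv6 address notifier chain rather than from a configuration path.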

Fix is apparently at https://www.spinics.net/lists/netdev/msg485566.html

ProblemType: Bug
DistroRelease: Ubuntu 18.04
Package: linux-image-4.15.0-23-generic 4.15.0-23.25
ProcVersionSignature: Ubuntu 4.15.0-23.25-generic 4.15.18
Uname: Linux 4.15.0-23-generic x86_64
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 Jun 14 10:00 seq
 crw-rw---- 1 root audio 116, 33 Jun 14 10:00 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay': 'aplay'
ApportVersion: 2.20.9-0ubuntu7.2
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord': 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CRDA: N/A
Date: Thu Jun 14 15:20:04 2018
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig': 'iwconfig'
Lsusb:
 Bus 001 Device 002: ID 0627:0001 Adomax Technology Co., Ltd
 Bus 001 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
MachineType: Red Hat KVM
PciMultimedia:

ProcFB:

ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.15.0-23-generic root=LABEL=cloudimg-rootfs ro console=tty1 console=ttyS0
RelatedPackageVersions:
 linux-restricted-modules-4.15.0-23-generic N/A
 linux-backports-modules-4.15.0-23-generic N/A
 linux-firmware N/A
RfKill: Error: [Errno 2] No such file or directory: 'rfkill': 'rfkill'
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 04/01/2014
dmi.bios.vendor: SeaBIOS
dmi.bios.version: 1.10.2-3.el7_4.1
dmi.chassis.type: 1
dmi.chassis.vendor: Red Hat
dmi.chassis.version: RHEL 7.4.0 PC (i440FX + PIIX, 1996)
dmi.modalias: dmi:bvnSeaBIOS:bvr1.10.2-3.el7_4.1:bd04/01/2014:svnRedHat:pnKVM:pvrRHEL7.4.0PC(i440FX+PIIX,1996):cvnRedHat:ct1:cvrRHEL7.4.0PC(i440FX+PIIX,1996):
dmi.product.family: Red Hat Enterprise Linux
dmi.product.name: KVM
dmi.product.version: RHEL 7.4.0 PC (i440FX + PIIX, 1996)
dmi.sys.vendor: Red Hat

Neil Wilson (neil-aldur) wrote :
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
Neil Wilson (neil-aldur) wrote :
Changed in linux (Ubuntu):
importance: Undecided → Medium
Changed in linux (Ubuntu):
assignee: nobody → Joseph Salisbury (jsalisbury)
status: Confirmed → In Progress
Changed in linux (Ubuntu Bionic):
importance: Undecided → Medium
status: New → In Progress
assignee: nobody → Joseph Salisbury (jsalisbury)
Joseph Salisbury (jsalisbury) wrote :

I built a test kernel with commit 8230819494. It also required commit 94333fac44 as a prereq. The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1776927

Can you test this kernel and see if it resolves this bug?

Note about installing test kernels:
• If the test kernel is prior to 4.15 (Bionic), you need to install the linux-image and linux-image-extra .deb packages.
• If the test kernel is 4.15 (Bionic) or newer, you need to install the linux-modules, linux-modules-extra and linux-image-unsigned .deb packages.
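The rule in the two bullets above can be written down as a small helper. This is only an illustrative sketch: the function name `required_packages` is made up here, and it assumes the version string begins with `major.minor` as in `uname -r` output.

```python
def required_packages(kernel_version):
    """Return the .deb package name prefixes to install for a test kernel."""
    # Only the first two components matter, e.g. "4.15" in "4.15.0-23-generic".
    major, minor = (int(part) for part in kernel_version.split(".")[:2])
    if (major, minor) >= (4, 15):
        return ["linux-modules", "linux-modules-extra", "linux-image-unsigned"]
    return ["linux-image", "linux-image-extra"]


print(required_packages("4.15.0-23-generic"))
```

For the test kernel in this bug, which is based on 4.15.0-23, the helper selects the linux-modules, linux-modules-extra and linux-image-unsigned packages.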

Thanks in advance!

Neil Wilson (neil-aldur) wrote :

Seems to do the trick

ubuntu@srv-uq8mr:~$ uname -a
Linux srv-uq8mr 4.15.0-23-generic #26~lp1776927 SMP Fri Jun 15 17:06:38 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

[ 307.632377] audit: type=1400 audit(1529163746.165:17): apparmor="STATUS" operation="profile_load" profile="unconfined" name="cri-containerd.apparmor.d" pid=2039 comm="apparmor_parser"
[ 335.794651] eth0: renamed from veth1d773317
[ 336.205408] IPVS: Registered protocols (TCP, UDP, SCTP, AH, ESP)
[ 336.205570] IPVS: Connection hash table configured (size=4096, memory=64Kbytes)
[ 336.593750] IPVS: ipvs loaded.
[ 336.600809] IPVS: [rr] scheduler registered.
[ 336.609316] IPVS: [wrr] scheduler registered.
[ 336.616999] IPVS: [sh] scheduler registered.
[ 336.766386] Netfilter messages via NETLINK v0.30.
[ 336.780403] ip_set: protocol 6

Joseph Salisbury (jsalisbury) wrote :
description: updated
Changed in linux (Ubuntu Bionic):
status: In Progress → Fix Committed
Brad Figg (brad-figg) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-bionic' to 'verification-done-bionic'. If the problem still exists, change the tag 'verification-needed-bionic' to 'verification-failed-bionic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-bionic
Neil Wilson (neil-aldur) wrote :

Does the trick nicely. Thank you.

tags: added: verification-done-bionic
removed: verification-needed-bionic
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.15.0-33.36

---------------
linux (4.15.0-33.36) bionic; urgency=medium

  * linux: 4.15.0-33.36 -proposed tracker (LP: #1787149)

  * RTNL assertion failure on ipvlan (LP: #1776927)
    - ipvlan: drop ipv6 dependency
    - ipvlan: use per device spinlock to protect addrs list updates
    - SAUCE: fix warning from "ipvlan: drop ipv6 dependency"

  * ubuntu_bpf_jit test failed on Bionic s390x systems (LP: #1753941)
    - test_bpf: flag tests that cannot be jited on s390

  * HDMI/DP audio can't work on the laptop of Dell Latitude 5495 (LP: #1782689)
    - drm/nouveau: fix nouveau_dsm_get_client_id()'s return type
    - drm/radeon: fix radeon_atpx_get_client_id()'s return type
    - drm/amdgpu: fix amdgpu_atpx_get_client_id()'s return type
    - platform/x86: apple-gmux: fix gmux_get_client_id()'s return type
    - ALSA: hda: use PCI_BASE_CLASS_DISPLAY to replace PCI_CLASS_DISPLAY_VGA
    - vga_switcheroo: set audio client id according to bound GPU id

  * locking sockets broken due to missing AppArmor socket mediation patches
    (LP: #1780227)
    - UBUNTU SAUCE: apparmor: fix apparmor mediating locking non-fs, unix sockets

  * Update2 for ocxl driver (LP: #1781436)
    - ocxl: Fix page fault handler in case of fault on dying process

  * netns: unable to follow an interface that moves to another netns
    (LP: #1774225)
    - net: core: Expose number of link up/down transitions
    - dev: always advertise the new nsid when the netns iface changes
    - dev: advertise the new ifindex when the netns iface changes

  * [Bionic] Disk IO hangs when using BFQ as io scheduler (LP: #1780066)
    - block, bfq: fix occurrences of request finish method's old name
    - block, bfq: remove batches of confusing ifdefs
    - block, bfq: add requeue-request hook

  * HP ProBook 455 G5 needs mute-led-gpio fixup (LP: #1781763)
    - ALSA: hda: add mute led support for HP ProBook 455 G5

  * [Bionic] bug fixes to improve stability of the ThunderX2 i2c driver
    (LP: #1781476)
    - i2c: xlp9xx: Fix issue seen when updating receive length
    - i2c: xlp9xx: Make sure the transfer size is not more than
      I2C_SMBUS_BLOCK_SIZE

  * x86/kvm: fix LAPIC timer drift when guest uses periodic mode (LP: #1778486)
    - x86/kvm: fix LAPIC timer drift when guest uses periodic mode

  * Please include ax88179_178a and r8152 modules in d-i udeb (LP: #1771823)
    - [Config:] d-i: Add ax88179_178a and r8152 to nic-modules

  * Nvidia fails after switching its mode (LP: #1778658)
    - PCI: Restore config space on runtime resume despite being unbound

  * Kernel error "task zfs:pid blocked for more than 120 seconds" (LP: #1781364)
    - SAUCE: (noup) zfs to 0.7.5-1ubuntu16.3

  * CVE-2018-12232
    - PATCH 1/1] socket: close race condition between sock_close() and
      sockfs_setattr()

  * CVE-2018-10323
    - xfs: set format back to extents if xfs_bmap_extents_to_btree

  * change front mic location for more lenovo m7/8/9xx machines (LP: #1781316)
    - ALSA: hda/realtek - Fix the problem of two front mics on more machines
    - ALSA: hda/realtek - two more lenovo models need fixup of MIC_LOCATION

  * Cephfs + fscache: unab...

Changed in linux (Ubuntu Bionic):
status: Fix Committed → Fix Released
Changed in linux (Ubuntu):
status: In Progress → Fix Released
Brad Figg (brad-figg)
tags: added: cscc