Kernel Oops with 17.10

Bug #1731031 reported by Henning Meyer
This bug report is a duplicate of: Bug #1734327: Kernel panic on a nfsroot system.
This bug affects 1 person
Affects           Status       Importance  Assigned to       Milestone
linux (Ubuntu)    In Progress  Medium      Joseph Salisbury
Artful            In Progress  Medium      Joseph Salisbury

Bug Description

I have captured the dmesg output from just before the oops.

ProblemType: Bug
DistroRelease: Ubuntu 17.10
Package: xorg 1:7.7+19ubuntu3
ProcVersionSignature: Ubuntu 4.13.0-16.19-generic 4.13.4
Uname: Linux 4.13.0-16-generic x86_64
NonfreeKernelModules: wl
ApportVersion: 2.20.7-0ubuntu3.1
Architecture: amd64
CompizPlugins: No value set for `/apps/compiz-1/general/screen0/options/active_plugins'
CompositorRunning: None
CurrentDesktop: KDE
Date: Wed Nov 8 20:41:14 2017
DistUpgraded: 2017-11-07 01:49:14,657 DEBUG icon theme changed, re-reading
DistroCodename: artful
DistroVariant: kubuntu
DkmsStatus:
 bcmwl, 6.30.223.271+bdcom, 4.10.0-38-generic, x86_64: installed
 bcmwl, 6.30.223.271+bdcom, 4.13.0-16-generic, x86_64: installed
GraphicsCard:
 Intel Corporation Haswell-ULT Integrated Graphics Controller [8086:0a2e] (rev 09) (prog-if 00 [VGA controller])
   Subsystem: Apple Inc. Haswell-ULT Integrated Graphics Controller [106b:011a]
LightdmGreeterLogOld: ** (lightdm-gtk-greeter:1762): WARNING **: Failed to load user image: Failed to open file '/home/hmeyer/.face': No such file or directory
MachineType: Apple Inc. MacBookPro11,1
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.13.0-16-generic root=UUID=fbc61d79-8b51-4801-8e2b-175ff6daa14f ro quiet splash apparmor=0 vt.handoff=7
SourcePackage: xorg
Symptom: display
UpgradeStatus: Upgraded to artful on 2017-11-07 (1 days ago)
dmi.bios.date: 02/22/2016
dmi.bios.vendor: Apple Inc.
dmi.bios.version: MBP111.88Z.0138.B17.1602221718
dmi.board.asset.tag: Base Board Asset Tag#
dmi.board.name: Mac-189A3D4F975D5FFC
dmi.board.vendor: Apple Inc.
dmi.board.version: MacBookPro11,1
dmi.chassis.type: 10
dmi.chassis.vendor: Apple Inc.
dmi.chassis.version: Mac-189A3D4F975D5FFC
dmi.modalias: dmi:bvnAppleInc.:bvrMBP111.88Z.0138.B17.1602221718:bd02/22/2016:svnAppleInc.:pnMacBookPro11,1:pvr1.0:rvnAppleInc.:rnMac-189A3D4F975D5FFC:rvrMacBookPro11,1:cvnAppleInc.:ct10:cvrMac-189A3D4F975D5FFC:
dmi.product.family: Mac
dmi.product.name: MacBookPro11,1
dmi.product.version: 1.0
dmi.sys.vendor: Apple Inc.
version.compiz: compiz N/A
version.libdrm2: libdrm2 2.4.83-1
version.libgl1-mesa-dri: libgl1-mesa-dri 17.2.2-0ubuntu1
version.libgl1-mesa-glx: libgl1-mesa-glx 17.2.2-0ubuntu1
version.xserver-xorg-core: xserver-xorg-core 2:1.19.5-0ubuntu2
version.xserver-xorg-input-evdev: xserver-xorg-input-evdev 1:2.10.5-1ubuntu1
version.xserver-xorg-video-ati: xserver-xorg-video-ati 1:7.10.0-1
version.xserver-xorg-video-intel: xserver-xorg-video-intel 2:2.99.917+git20170309-0ubuntu1
version.xserver-xorg-video-nouveau: xserver-xorg-video-nouveau 1:1.0.15-2
xserver.bootTime: Wed Nov 8 20:39:06 2017
xserver.configfile: default
xserver.errors:

xserver.logfile: /var/log/Xorg.0.log
xserver.version: 2:1.19.5-0ubuntu2
xserver.video_driver: modeset

Henning Meyer (henning.meyer) wrote :

xorg was auto selected when I used apport and said "freeze"

affects: xorg (Ubuntu) → linux (Ubuntu)
Henning Meyer (henning.meyer) wrote :

I boot, connect to wifi, and browse the internet; the freeze happens within minutes.

I had no similar issue with Ubuntu 17.04 before the upgrade.

Henning Meyer (henning.meyer) wrote :

[ 305.778878] BUG: unable to handle kernel paging request at fffff9cafc000020
[ 305.778915] IP: kfree+0x53/0x160
[ 305.778924] PGD 0
[ 305.778924] P4D 0

[ 305.778939] Oops: 0000 [#1] SMP
[ 305.778948] Modules linked in: rfcomm cmac bnep binfmt_misc intel_spi_platform intel_spi spi_nor mtd joydev applesmc input_polldev intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_codec_hdmi pcbc snd_hda_codec_cirrus snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep aesni_intel snd_pcm aes_x86_64 crypto_simd glue_helper cryptd intel_cstate intel_rapl_perf wl(POE) snd_seq_midi btusb snd_seq_midi_event btrtl btbcm snd_rawmidi btintel lpc_ich bluetooth thunderbolt cfg80211 snd_seq bdc_pci ecdh_generic input_leds bcm5974 snd_seq_device snd_timer mei_me snd mei sbs shpchp soundcore acpi_als sbshc kfifo_buf industrialio apple_bl mac_hid parport_pc ppdev lp parport ip_tables x_tables autofs4
[ 305.779157] btrfs xor raid6_pq hid_generic hid_apple usbhid hid uas usb_storage i915 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt ahci fb_sys_fops libahci drm video
[ 305.779200] CPU: 3 PID: 777 Comm: cupsd Tainted: P OE 4.13.0-16-generic #19-Ubuntu
[ 305.779220] Hardware name: Apple Inc. MacBookPro11,1/Mac-189A3D4F975D5FFC, BIOS MBP111.88Z.0138.B17.1602221718 02/22/2016
[ 305.779243] task: ffff89855c6c1740 task.stack: ffffb207c1f68000
[ 305.779259] RIP: 0010:kfree+0x53/0x160
[ 305.779268] RSP: 0018:ffffb207c1f6bd30 EFLAGS: 00010286
[ 305.779281] RAX: 0000000000000000 RBX: 00000000000008d0 RCX: 0000000000000006
[ 305.779297] RDX: 0000488250a03468 RSI: 0000000000010080 RDI: 0000767e80000000
[ 305.779313] RBP: ffffb207c1f6bd48 R08: 000000000001f4c0 R09: ffffffff8f3b8819
[ 305.779328] R10: fffff9cafc000000 R11: 0000000001000000 R12: ffff898551025780
[ 305.779344] R13: ffffffff8efa123e R14: 0000000000000000 R15: ffff89851c80da80
[ 305.779361] FS: 00007fd593498040(0000) GS:ffff89856f380000(0000) knlGS:0000000000000000
[ 305.779379] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 305.779392] CR2: fffff9cafc000020 CR3: 000000045a604000 CR4: 00000000001406e0
[ 305.779408] Call Trace:
[ 305.779417] security_sk_free+0x3e/0x50
[ 305.779428] __sk_destruct+0x108/0x190
[ 305.779438] sk_destruct+0x20/0x30
[ 305.779448] __sk_free+0x82/0xa0
[ 305.779456] sk_free+0x19/0x20
[ 305.779466] tcp_close+0x232/0x3f0
[ 305.779476] inet_release+0x3c/0x60
[ 305.779486] inet6_release+0x30/0x40
[ 305.779497] sock_release+0x1f/0x80
[ 305.779506] sock_close+0x12/0x20
[ 305.779516] __fput+0xe7/0x220
[ 305.779524] ____fput+0xe/0x10
[ 305.779534] task_work_run+0x76/0x90
[ 305.779545] exit_to_usermode_loop+0xc4/0xd0
[ 305.779556] syscall_return_slowpath+0x59/0x60
[ 305.779568] entry_SYSCALL_64_fastpath+0xa7/0xa9
[ 305.779579] RIP: 0033:0x7fd591ce6db4
[ 305.779588] RSP: 002b:00007ffe68aaf0e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
[ 305.779605] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00007fd591ce6db4
[ 305.779621] RDX: 0000000000000000 RSI: 000000000000000d RDI: 000000000000000d
[ 305.779637] ...


Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
Henning Meyer (henning.meyer) wrote :
tags: added: kernel-fixed-upstream
Joseph Salisbury (jsalisbury) wrote :

Can you see which release candidate in 4.14 fixes this bug? If we know that, we can "Reverse" bisect to identify the specific commit that is the fix.

All of the release candidates are here:
http://kernel.ubuntu.com/~kernel-ppa/mainline/

You already tested 4.14-rc8, so maybe try 4.14-rc4 and 4.14-rc1.
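
The "reverse" bisect described above can be sketched as a binary search over an ordered list of kernel builds, looking for the first build where the bug no longer reproduces. This is only an illustration: the `first_fixed` helper, the `reproduces` callback, and the choice of rc3 as the fixing build are hypothetical stand-ins for booting each mainline build and testing it by hand, and the sketch assumes the oldest build shows the bug and the newest does not.

```python
from typing import Callable, Sequence


def first_fixed(builds: Sequence[str], reproduces: Callable[[str], bool]) -> str:
    """Return the earliest build in `builds` where the bug no longer reproduces.

    Assumes builds are ordered oldest to newest, the first build shows the
    bug, the last build does not, and the fix is never reverted in between.
    """
    lo, hi = 0, len(builds) - 1  # builds[lo] is bad, builds[hi] is good
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if reproduces(builds[mid]):
            lo = mid  # still broken: the fix landed later
        else:
            hi = mid  # already fixed: the fix landed here or earlier
    return builds[hi]


# Hypothetical data, not a finding from this report: pretend the fix landed in rc3.
builds = ["v4.13"] + [f"v4.14-rc{n}" for n in range(1, 9)]
broken = {"v4.13", "v4.14-rc1", "v4.14-rc2"}
print(first_fixed(builds, lambda b: b in broken))  # v4.14-rc3
```

With nine builds this needs at most three test boots, which is why testing rc4 and rc1 first narrows the range quickly.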

Changed in linux (Ubuntu):
importance: Undecided → Medium
status: Confirmed → Incomplete
status: Incomplete → Triaged
Changed in linux (Ubuntu Artful):
status: New → Triaged
importance: Undecided → Medium
tags: added: performing-bisect
Henning Meyer (henning.meyer) wrote :

Unfortunately, I am no longer sure the issue is actually fixed, as opposed to merely no longer being triggered: I switched back to the stock kernel affected by the bug, but changed the boot parameters to capture a kernel dump with a crash kernel, and now I get days of uptime without the kernel oops.

Henning Meyer (henning.meyer) wrote :

specifically, adding or removing the option "crashkernel=384M-:128M" seems to affect my ability to trigger the bug
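
For readers unfamiliar with the option: in the kernel's range syntax, `crashkernel=384M-:128M` reserves 128M for the crash kernel on any machine with at least 384M of RAM. The sketch below illustrates how such a spec is evaluated. It is a simplified stand-in for the kernel's own parser (only K/M/G suffixes, only the range form, `@offset` ignored), the function names are hypothetical, and the 8 GiB figure is an assumed example rather than this machine's actual RAM size.

```python
import re
from typing import Optional

_UNITS = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def _size(text: str) -> int:
    """Convert a size such as '128M' to bytes."""
    m = re.fullmatch(r"(\d+)([KMG]?)", text.upper())
    if not m:
        raise ValueError(f"bad size: {text!r}")
    return int(m.group(1)) * _UNITS[m.group(2)]


def crashkernel_reservation(spec: str, ram_bytes: int) -> Optional[int]:
    """Return the bytes reserved for `ram_bytes` of RAM, or None if no range matches.

    Handles only the range form 'start-[end]:size[,...]'; an optional
    '@offset' suffix is ignored.
    """
    spec = spec.split("@", 1)[0]
    for entry in spec.split(","):
        rng, size = entry.split(":")
        start, _, end = rng.partition("-")
        upper = _size(end) if end else None  # an empty end means "no upper bound"
        if ram_bytes >= _size(start) and (upper is None or ram_bytes < upper):
            return _size(size)
    return None


GIB = 1 << 30
MIB = 1 << 20
print(crashkernel_reservation("384M-:128M", 8 * GIB) // MIB)  # 128
print(crashkernel_reservation("384M-:128M", 256 * MIB))       # None
```

The point relevant to this report is that the option changes the physical memory layout by carving out a reserved region, which could plausibly alter whether a memory-corruption bug is triggered.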

Henning Meyer (henning.meyer) wrote :

currently testing 4.14-rc4 (no crash kernel), no crashes so far (I can still trigger the problem with 4.13)

Henning Meyer (henning.meyer) wrote :

no crashes with 4.14-rc1 either

Joseph Salisbury (jsalisbury) wrote :

Can you try 4.13 final? If it has the bug, we can "Reverse" bisect between 4.13 final and 4.14-rc1.

4.13 final is available from:
http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.13/

Henning Meyer (henning.meyer) wrote :

I have been able to reproduce the problem with 4.13 mainline so far

I managed to capture a kernel dump with stock 4.13.0-16-generic

This is a MacBook; I managed to get a similar error with a Thunderbolt Ethernet adapter instead of wifi. I have captured a second backtrace with netconsole.

[ 305.709842] general protection fault: 0000 [#1] SMP
[ 305.709906] Modules linked in: netconsole cpuid rfcomm cmac bnep binfmt_misc joydev intel_spi_platform intel_spi spi_nor mtd intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp snd_hda_codec_hdmi kvm_intel snd_hda_codec_cirrus snd_hda_codec_generic kvm applesmc input_polldev snd_hda_intel irqbypass crct10dif_pclmul snd_hda_codec crc32_pclmul ghash_clmulni_intel snd_hda_core pcbc snd_hwdep snd_pcm aesni_intel aes_x86_64 snd_seq_midi crypto_simd snd_seq_midi_event glue_helper cryptd snd_rawmidi intel_cstate intel_rapl_perf btusb snd_seq btrtl btbcm btintel snd_seq_device bluetooth snd_timer ecdh_generic bcm5974 input_leds snd wl(POE) lpc_ich thunderbolt bdc_pci cfg80211 soundcore mei_me shpchp mei sbs sbshc acpi_als kfifo_buf industrialio apple_bl mac_hid parport_pc ppdev lp parport ip_tables
[ 305.710462] x_tables autofs4 btrfs xor hid_generic hid_apple usbhid hid raid6_pq uas usb_storage i915 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect tg3 sysimgblt ahci fb_sys_fops libahci ptp drm pps_core video
[ 305.710621] CPU: 2 PID: 0 Comm: swapper/2 Tainted: P OE 4.13.0-16-generic #19-Ubuntu
[ 305.710685] Hardware name: Apple Inc. MacBookPro11,1/Mac-189A3D4F975D5FFC, BIOS MBP111.88Z.0138.B17.1602221718 02/22/2016
[ 305.710763] task: ffff9d3b5cee45c0 task.stack: ffffb5e941914000
[ 305.710813] RIP: 0010:kfree+0x53/0x160
[ 305.710844] RSP: 0018:ffff9d3b6f303b80 EFLAGS: 00010207
[ 305.710885] RAX: 0000000000000000 RBX: f8b3894800000300 RCX: 0000000000000008
[ 305.710938] RDX: 000038add0a03030 RSI: 0000000000010080 RDI: 000062c880000000
[ 305.710990] RBP: ffff9d3b6f303b98 R08: 0000000000000001 R09: ffffffffb3fb8819
[ 305.711042] R10: 03e2c7e6c4000000 R11: 0000000001000000 R12: ffff9d3b55690000
[ 305.711095] R13: ffffffffb3ba123e R14: ffff9d3b55690000 R15: ffffffffb47ec280
[ 305.711148] FS: 0000000000000000(0000) GS:ffff9d3b6f300000(0000) knlGS:0000000000000000
[ 305.711207] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 305.711251] CR2: 00007f1e2c0d3528 CR3: 00000003b2a09000 CR4: 00000000001406e0
[ 305.711304] Call Trace:
[ 305.711326] <IRQ>
[ 305.711351] security_sk_free+0x3e/0x50
[ 305.711386] __sk_destruct+0x108/0x190
[ 305.711419] sk_destruct+0x20/0x30
[ 305.711449] __sk_free+0x82/0xa0
[ 305.711477] sk_free+0x19/0x20
[ 305.711507] sock_put+0x14/0x20
[ 305.711535] tcp_v6_rcv+0x910/0x990
[ 305.711569] ip6_input_finish+0xc7/0x450
[ 305.711602] ip6_input+0x3f/0xb0
[ 305.711631] ip6_rcv_finish+0x89/0xf0
[ 305.711662] ipv6_rcv+0x343/0x550
[ 305.711692] ? load_balance+0x13b/0x9a0
[ 305.711727] __netif_receive_skb_core+0x39a/0xaa0
[ 305.711766] __netif_receive_skb+0x18/0x60
[ 305.711799] ? __netif_receive_skb+0x18/0x60
[ 305.711835] process_backlog+0x89/0x140
[ 305.711868] net_rx_action+0x13b/0x380
[ 305.711905] __do_softirq+0xde/0x2a...


Henning Meyer (henning.meyer) wrote :

sorry, I have been _unable_ to reproduce with 4.13 mainline

Henning Meyer (henning.meyer) wrote :

I do have a crashkernel dump of 4.13.0-16-generic

Joseph Salisbury (jsalisbury) wrote :

Can you see if the bug happens with 4.13-rc1:
http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.13-rc1/

Henning Meyer (henning.meyer) wrote :

I've been unable to reproduce the bug with 4.13-rc1

Henning Meyer (henning.meyer) wrote :

Is the Ubuntu kernel built from 4.13 mainline plus a series of patches, so that I could rebuild the kernel while bisecting that set of patches? Or is the situation more complicated than that?

Changed in linux (Ubuntu Artful):
assignee: nobody → Joseph Salisbury (jsalisbury)
status: Triaged → In Progress
Changed in linux (Ubuntu):
status: Triaged → In Progress
assignee: nobody → Joseph Salisbury (jsalisbury)
Joseph Salisbury (jsalisbury) wrote :

That is correct. Ubuntu starts with the mainline kernel and adds patches on top. From your testing, it appears the bug may have been introduced by an Ubuntu SAUCE patch, since you cannot reproduce it with any of the upstream kernels.

We should figure out which Ubuntu kernel version introduced the bug. We can then perform a kernel bisect to identify the exact patch that caused this.

The call trace looks similar to what is happening in bug 1734327. Can you see if the bug you are seeing happens with the test kernel I built for that bug:
http://kernel.ubuntu.com/~jsalisbury/lp1731031/

Henning Meyer (henning.meyer) wrote :

I am travelling right now, and I won't be able to try the test kernel until after Christmas.

Henning Meyer (henning.meyer) wrote :

I was unable to trigger the bug with the 4.13.0-19-TwoReverts kernel (one hour of observation).

The stack traces in bug 1734327 look very similar to the ones I see. As in that bug, I have one machine that triggers the bug (kernel panic within 5 minutes of internet usage) and one that doesn't. I can reproduce the problem with the latest 4.13.0-21-generic Ubuntu kernel (I first encountered it with a 4.13.0-16-generic kernel).

I will try to reproduce the problem once more with the default 4.13.0-19 kernel, and then keep running the 4.13.0-19-TwoReverts kernel.
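
A "looks similar" comparison of call traces can be made mechanical by extracting the symbol names and checking how many innermost frames two traces share. The sketch below does this for excerpts of the two traces posted earlier in this report (the wifi and Thunderbolt Ethernet cases); the traces from bug 1734327 are not reproduced here, and the helper names and regular expression are illustrative assumptions, not an existing tool.

```python
import re
from typing import List

# Matches a frame such as "[ 305.779417] security_sk_free+0x3e/0x50",
# optionally prefixed with "? " (a frame the unwinder marks as unreliable).
FRAME = re.compile(
    r"^\[\s*\d+\.\d+\]\s+(\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def frames(dmesg_text: str) -> List[str]:
    """Return the reliable call-trace symbols in order, innermost first."""
    out = []
    for line in dmesg_text.splitlines():
        m = FRAME.match(line.strip())
        if m and not m.group(1):
            out.append(m.group(2))
    return out


def common_prefix(a: List[str], b: List[str]) -> List[str]:
    """Return the leading frames shared by both traces."""
    shared = []
    for x, y in zip(a, b):
        if x != y:
            break
        shared.append(x)
    return shared


trace_wifi = """
[ 305.779417] security_sk_free+0x3e/0x50
[ 305.779428] __sk_destruct+0x108/0x190
[ 305.779438] sk_destruct+0x20/0x30
[ 305.779448] __sk_free+0x82/0xa0
[ 305.779456] sk_free+0x19/0x20
[ 305.779466] tcp_close+0x232/0x3f0
"""
trace_ethernet = """
[ 305.711351] security_sk_free+0x3e/0x50
[ 305.711386] __sk_destruct+0x108/0x190
[ 305.711419] sk_destruct+0x20/0x30
[ 305.711449] __sk_free+0x82/0xa0
[ 305.711477] sk_free+0x19/0x20
[ 305.711507] sock_put+0x14/0x20
"""
print(common_prefix(frames(trace_wifi), frames(trace_ethernet)))
```

Both excerpts share the five frames from `security_sk_free` up to `sk_free` and diverge only in the caller (`tcp_close` versus `sock_put`), which matches the observation that the crash happens while a socket is being freed regardless of the network interface.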

Henning Meyer (henning.meyer) wrote :

I just triggered a kernel oops with 4.13.0-17, but the system remains responsive afterwards:

[ 18.645733] random: crng init done
[ 76.860979] BUG: unable to handle kernel paging request at ffffea15932d2020
[ 76.861025] IP: kfree+0x53/0x160
[ 76.861039] PGD 0
[ 76.861040] P4D 0

[ 76.861064] Oops: 0000 [#1] SMP
[ 76.861079] Modules linked in: rfcomm cmac bnep intel_spi_platform intel_spi binfmt_misc spi_nor mtd joydev intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp applesmc input_polldev snd_hda_codec_hdmi kvm_intel kvm irqbypass crct10dif_pclmul snd_hda_codec_cirrus snd_hda_codec_generic crc32_pclmul ghash_clmulni_intel pcbc aesni_intel snd_hda_intel snd_hda_codec aes_x86_64 snd_hda_core snd_hwdep crypto_simd glue_helper cryptd snd_pcm wl(POE) intel_cstate intel_rapl_perf snd_seq_midi btusb snd_seq_midi_event btrtl btbcm snd_rawmidi btintel lpc_ich bluetooth ecdh_generic input_leds bcm5974 snd_seq thunderbolt mei_me cfg80211 snd_seq_device snd_timer snd bdc_pci mei soundcore shpchp sbs sbshc acpi_als kfifo_buf industrialio apple_bl mac_hid parport_pc ppdev lp parport ip_tables x_tables autofs4
[ 76.861393] btrfs xor raid6_pq hid_generic hid_apple usbhid hid i915 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ahci libahci drm uas usb_storage video
[ 76.861439] CPU: 1 PID: 774 Comm: cupsd Tainted: P OE 4.13.0-17-generic #20-Ubuntu
[ 76.861458] Hardware name: Apple Inc. MacBookPro11,1/Mac-189A3D4F975D5FFC, BIOS MBP111.88Z.0138.B17.1602221718 02/22/2016
[ 76.861482] task: ffff913f11d81740 task.stack: ffffaaf8421b0000
[ 76.861497] RIP: 0010:kfree+0x53/0x160
[ 76.861506] RSP: 0018:ffffaaf8421b3d30 EFLAGS: 00010286
[ 76.861518] RAX: 0000000000000000 RBX: 000320bf8b480000 RCX: 0000000000000006
[ 76.861535] RDX: 000039b910a03550 RSI: 0000000000010080 RDI: 00006ec4c0000000
[ 76.861551] RBP: ffffaaf8421b3d48 R08: 000000000001f4c0 R09: ffffffffad9bb839
[ 76.861567] R10: ffffea15932d2000 R11: 0000000001000000 R12: ffff913f1aad0000
[ 76.861583] R13: ffffffffad5a155e R14: 0000000000000000 R15: ffff913f1c3719c0
[ 76.861599] FS: 00007f8b186b0040(0000) GS:ffff913f2f280000(0000) knlGS:0000000000000000
[ 76.861618] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 76.861631] CR2: ffffea15932d2020 CR3: 000000045b33e000 CR4: 00000000001406e0
[ 76.861647] Call Trace:
[ 76.861658] security_sk_free+0x3e/0x50
[ 76.861669] __sk_destruct+0x108/0x190
[ 76.861679] sk_destruct+0x20/0x30
[ 76.861689] __sk_free+0x82/0xa0
[ 76.861697] sk_free+0x19/0x20
[ 76.861707] tcp_close+0x232/0x3f0
[ 76.861717] inet_release+0x3c/0x60
[ 76.861728] inet6_release+0x30/0x40
[ 76.861737] sock_release+0x1f/0x80
[ 76.861746] sock_close+0x12/0x20
[ 76.861756] __fput+0xe7/0x220
[ 76.861764] ____fput+0xe/0x10
[ 76.861774] task_work_run+0x76/0x90
[ 76.861785] exit_to_usermode_loop+0xc4/0xd0
[ 76.861796] syscall_return_slowpath+0x59/0x60
[ 76.861809] entry_SYSCALL_64_fastpath+0xa7/0xa9
[ 76.861820] RIP: 0033:0x7f8b16efedb4
[ 76.861829] RSP: 002b:00007ffc07e597a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
[ 76.861846] RAX: 0000000000000000 RBX: 000...
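
One detail can be read directly off the two "unable to handle kernel paging request" oopses in this report: in each, the faulting address in CR2 is exactly 0x20 above the value in R10. The check below uses only register values printed in those dumps; the helper name is hypothetical, and what kfree was dereferencing at that offset is not established by this report.

```python
def fault_offset(cr2: str, reg: str) -> int:
    """Return CR2 minus a register value, both given as hex strings from an oops."""
    return int(cr2, 16) - int(reg, 16)


# (CR2, R10) pairs copied from the 4.13.0-16 and 4.13.0-17 oopses in this report.
oopses = {
    "4.13.0-16": ("fffff9cafc000020", "fffff9cafc000000"),
    "4.13.0-17": ("ffffea15932d2020", "ffffea15932d2000"),
}
for kernel, (cr2, r10) in oopses.items():
    print(kernel, hex(fault_offset(cr2, r10)))
```

The same offset in both dumps suggests the two oopses fault at the same instruction with a similarly bad pointer, which fits the identical `kfree+0x53/0x160` location reported in each.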


Henning Meyer (henning.meyer) wrote :

I can confirm a crash with the default 4.13.0-19-generic kernel

Henning Meyer (henning.meyer) wrote :

No kernel oops with 4.13.0-19-TwoReverts after 24h
