System locks up unrecoverably after CPU soft lockup
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Compiz | New | Undecided | Unassigned |
Nvidia | New | Undecided | Unassigned |
linux | New | Undecided | Unassigned |
systemd | New | Undecided | Unassigned |
Ubuntu | Confirmed | Undecided | Unassigned |
Bug Description
I recently updated from 16.04 LTS to 17.04.
Ever since then, I have had repeated situations where the system locks up (often while asleep) in such a way that it no longer responds to any kind of input and must be hard rebooted.
When checking the syslog, I see repeated instances of messages like this:
May 10 07:36:40 Hickory kernel: [224616.143587] NMI watchdog: BUG: soft lockup - CPU#6 stuck for 23s! [compiz:4946]
May 10 07:36:40 Hickory kernel: [224616.143590] Modules linked in: ccm pci_stub vboxpci(OE) vboxnetadp(OE) vboxnetflt(OE) xt_CHECKSUM iptable_mangle vboxdrv(OE) bridge stp llc rfcomm ipt_MASQUERADE nf_nat_
May 10 07:36:40 Hickory kernel: [224616.143640] irqbypass videobuf2_core bluetooth crct10dif_pclmul snd_usb_audio videodev crc32_pclmul joydev input_leds ghash_clmulni_intel media snd_usbmidi_lib pcbc snd_hda_
May 10 07:36:40 Hickory kernel: [224616.143681] sysfillrect sysimgblt fb_sys_fops drm ahci firewire_core libahci r8169 crc_itu_t mii fjes video
May 10 07:36:40 Hickory kernel: [224616.143688] CPU: 6 PID: 4946 Comm: compiz Tainted: P D W OEL 4.10.0-20-generic #22-Ubuntu
May 10 07:36:40 Hickory kernel: [224616.143689] Hardware name: ASUS All Series/Z87-A, BIOS 1602 10/29/2013
May 10 07:36:40 Hickory kernel: [224616.143691] task: ffff9e0517ef9680 task.stack: ffffc26c43cfc000
May 10 07:36:40 Hickory kernel: [224616.143694] RIP: 0010:native_
May 10 07:36:40 Hickory kernel: [224616.143696] RSP: 0018:ffffc26c43
May 10 07:36:40 Hickory kernel: [224616.143698] RAX: 0000000000000000 RBX: ffffe3ec8986faf0 RCX: ffff9e059ed99e40
May 10 07:36:40 Hickory kernel: [224616.143699] RDX: 00000000001c0101 RSI: 0000000000000101 RDI: ffffe3ec8986faf0
May 10 07:36:40 Hickory kernel: [224616.143700] RBP: ffffc26c43cff628 R08: 00000000001c0000 R09: 0000000000000000
May 10 07:36:40 Hickory kernel: [224616.143701] R10: 0000000000000000 R11: 0000000000000000 R12: ffffe3ec85b31580
May 10 07:36:40 Hickory kernel: [224616.143702] R13: ffffc26c43cff6a8 R14: ffff9e03e1beb1f0 R15: 0000000000000000
May 10 07:36:40 Hickory kernel: [224616.143704] FS: 00007f1e47daf78
May 10 07:36:40 Hickory kernel: [224616.143705] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 10 07:36:40 Hickory kernel: [224616.143706] CR2: ffffffffff600000 CR3: 0000000397e3f000 CR4: 00000000001406e0
May 10 07:36:40 Hickory kernel: [224616.143708] Call Trace:
May 10 07:36:40 Hickory kernel: [224616.143711] _raw_spin_
May 10 07:36:40 Hickory kernel: [224616.143714] __page_
May 10 07:36:40 Hickory kernel: [224616.143716] try_to_
May 10 07:36:40 Hickory kernel: [224616.143719] ? page_counter_
May 10 07:36:40 Hickory kernel: [224616.143721] rmap_walk_
May 10 07:36:40 Hickory kernel: [224616.143723] rmap_walk+0x48/0x60
May 10 07:36:40 Hickory kernel: [224616.143724] try_to_
May 10 07:36:40 Hickory kernel: [224616.143726] ? page_remove_
May 10 07:36:40 Hickory kernel: [224616.143727] ? __page_
May 10 07:36:40 Hickory kernel: [224616.143729] ? page_get_
May 10 07:36:40 Hickory kernel: [224616.143731] ? invalid_
May 10 07:36:40 Hickory kernel: [224616.143732] migrate_
May 10 07:36:40 Hickory kernel: [224616.143734] ? __ClearPageMova
May 10 07:36:40 Hickory kernel: [224616.143736] ? isolate_
May 10 07:36:40 Hickory kernel: [224616.143737] compact_
May 10 07:36:40 Hickory kernel: [224616.143740] ? ktime_get+0x41/0xb0
May 10 07:36:40 Hickory kernel: [224616.143741] compact_
May 10 07:36:40 Hickory kernel: [224616.143743] try_to_
May 10 07:36:40 Hickory kernel: [224616.143745] __alloc_
May 10 07:36:40 Hickory kernel: [224616.143747] __alloc_
May 10 07:36:40 Hickory kernel: [224616.143749] __alloc_
May 10 07:36:40 Hickory kernel: [224616.143752] alloc_pages_
May 10 07:36:40 Hickory kernel: [224616.143753] kmalloc_
May 10 07:36:40 Hickory kernel: [224616.143755] kmalloc_
May 10 07:36:40 Hickory kernel: [224616.143757] __kmalloc+
May 10 07:36:40 Hickory kernel: [224616.143766] nvkms_alloc+
May 10 07:36:40 Hickory kernel: [224616.143777] _nv001951kms+
May 10 07:36:40 Hickory kernel: [224616.143786] ? _nv001898kms+
May 10 07:36:40 Hickory kernel: [224616.143788] ? kmalloc_
May 10 07:36:40 Hickory kernel: [224616.143789] ? kmalloc_
May 10 07:36:40 Hickory kernel: [224616.143791] ? __kmalloc+
May 10 07:36:40 Hickory kernel: [224616.143793] ? __check_
May 10 07:36:40 Hickory kernel: [224616.143796] ? _copy_from_
May 10 07:36:40 Hickory kernel: [224616.143814] ? _nv000319kms+
May 10 07:36:40 Hickory kernel: [224616.143821] ? _nv000171kms+
May 10 07:36:40 Hickory kernel: [224616.143829] ? nvKmsIoctl+
May 10 07:36:40 Hickory kernel: [224616.143837] ? nvkms_ioctl_
May 10 07:36:40 Hickory kernel: [224616.143845] ? nvkms_ioctl+
May 10 07:36:40 Hickory kernel: [224616.143932] ? nvidia_
May 10 07:36:40 Hickory kernel: [224616.143998] ? nvidia_
May 10 07:36:40 Hickory kernel: [224616.144001] ? do_vfs_
May 10 07:36:40 Hickory kernel: [224616.144002] ? __schedule+
May 10 07:36:40 Hickory kernel: [224616.144004] ? SyS_ioctl+0x79/0x90
May 10 07:36:40 Hickory kernel: [224616.144006] ? entry_SYSCALL_
May 10 07:36:40 Hickory kernel: [224616.144007] Code: 48 03 04 d5 e0 53 d4 a1 48 89 08 8b 41 08 85 c0 75 09 f3 90 8b 41 08 85 c0 74 f7 4c 8b 09 4d 85 c9 74 08 41 0f 18 09 eb 02 f3 90 <8b> 17 66 85 d2 75 f7 be 01 00 00 00 eb 10 89 d0 f0 0f b1 37 39
May 10 07:36:48 Hickory kernel: [224624.139731] NMI watchdog: BUG: soft lockup - CPU#4 stuck for 22s! [Chrome_
May 10 07:36:48 Hickory kernel: [224624.139744] Modules linked in: ccm pci_stub vboxpci(OE) vboxnetadp(OE) vboxnetflt(OE) xt_CHECKSUM iptable_mangle vboxdrv(OE) bridge stp llc rfcomm ipt_MASQUERADE nf_nat_
May 10 07:36:48 Hickory kernel: [224624.139779] irqbypass videobuf2_core bluetooth crct10dif_pclmul snd_usb_audio videodev crc32_pclmul joydev input_leds ghash_clmulni_intel media snd_usbmidi_lib pcbc snd_hda_
May 10 07:36:48 Hickory kernel: [224624.139812] sysfillrect sysimgblt fb_sys_fops drm ahci firewire_core libahci r8169 crc_itu_t mii fjes video
May 10 07:36:48 Hickory kernel: [224624.139817] CPU: 4 PID: 26728 Comm: Chrome_ChildThr Tainted: P D W OEL 4.10.0-20-generic #22-Ubuntu
May 10 07:36:48 Hickory kernel: [224624.139818] Hardware name: ASUS All Series/Z87-A, BIOS 1602 10/29/2013
May 10 07:36:48 Hickory kernel: [224624.139819] task: ffff9e0509ef2d00 task.stack: ffffc26c4c554000
May 10 07:36:48 Hickory kernel: [224624.139822] RIP: 0010:native_
May 10 07:36:48 Hickory kernel: [224624.139822] RSP: 0000:ffffc26c4c
May 10 07:36:48 Hickory kernel: [224624.139823] RAX: 00000000001c0101 RBX: ffffe3ec8986faf0 RCX: 0000000000000001
May 10 07:36:48 Hickory kernel: [224624.139824] RDX: 0000000000000101 RSI: 0000000000000001 RDI: ffffe3ec8986faf0
May 10 07:36:48 Hickory kernel: [224624.139825] RBP: ffffc26c4c557d48 R08: 0000000000000101 R09: ffff9e053982ec80
May 10 07:36:48 Hickory kernel: [224624.139825] R10: 00007fad07f00618 R11: 00007fad07f00618 R12: ffff9e03e1beb008
May 10 07:36:48 Hickory kernel: [224624.139826] R13: 3e00000000355001 R14: ffffc26c4c557e30 R15: ffff9e0299d4c320
May 10 07:36:48 Hickory kernel: [224624.139827] FS: 00007facf69df70
May 10 07:36:48 Hickory kernel: [224624.139827] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 10 07:36:48 Hickory kernel: [224624.139828] CR2: 00007facab001280 CR3: 00000002961cd000 CR4: 00000000001406e0
May 10 07:36:48 Hickory kernel: [224624.139829] Call Trace:
May 10 07:36:48 Hickory kernel: [224624.139833] _raw_spin_
May 10 07:36:48 Hickory kernel: [224624.139836] __migration_
May 10 07:36:48 Hickory kernel: [224624.139836] migration_
May 10 07:36:48 Hickory kernel: [224624.139839] do_swap_
May 10 07:36:48 Hickory kernel: [224624.139841] ? ep_ptable_
May 10 07:36:48 Hickory kernel: [224624.139842] handle_
May 10 07:36:48 Hickory kernel: [224624.139843] ? __seccomp_
May 10 07:36:48 Hickory kernel: [224624.139845] __do_page_
May 10 07:36:48 Hickory kernel: [224624.139847] do_page_
May 10 07:36:48 Hickory kernel: [224624.139848] page_fault+
May 10 07:36:48 Hickory kernel: [224624.139849] RIP: 0033:0x417bc0
May 10 07:36:48 Hickory kernel: [224624.139850] RSP: 002b:00007facf6
May 10 07:36:48 Hickory kernel: [224624.139851] RAX: 00007facaae01580 RBX: 00007facab001280 RCX: 00007facaac00bc1
May 10 07:36:48 Hickory kernel: [224624.139851] RDX: 00007facab001281 RSI: 00007faca5a00590 RDI: 00007fad07f00610
May 10 07:36:48 Hickory kernel: [224624.139852] RBP: 00007facaa700188 R08: 00000000ffffffff R09: 00007facab5016e8
May 10 07:36:48 Hickory kernel: [224624.139852] R10: 00007fad07f00618 R11: 00007fad07f00618 R12: 00007fad07f00608
May 10 07:36:48 Hickory kernel: [224624.139853] R13: 00000000000000b0 R14: 00007fad07f00048 R15: 00007fad07f00600
May 10 07:36:48 Hickory kernel: [224624.139854] Code: c0 74 e6 4d 85 c9 c6 07 01 74 30 41 c7 41 08 01 00 00 00 e9 52 ff ff ff 83 fa 01 0f 84 b0 fe ff ff 8b 07 84 c0 74 08 f3 90 8b 07 <84> c0 75 f8 b8 01 00 00 00 66 89 07 5d c3 f3 90 4c 8b 09 4d 85
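For anyone trying to compare logs, here is a minimal sketch of how the soft lockup entries above could be tallied per process. It assumes the syslog line format shown in this report; the function names `parse_soft_lockups` and `count_by_command` are made up for illustration and are not part of any existing tool.

```python
import re
from collections import Counter

# Matches the watchdog line format seen in this report, e.g.
# "... BUG: soft lockup - CPU#6 stuck for 23s! [compiz:4946]".
# The command name is read up to ':' or ']' so truncated lines still match.
LOCKUP_RE = re.compile(
    r"soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>[^:\]\n]+)"
)


def parse_soft_lockups(lines):
    """Return a list of (cpu, seconds_stuck, command) tuples."""
    events = []
    for line in lines:
        match = LOCKUP_RE.search(line)
        if match:
            events.append(
                (int(match.group("cpu")),
                 int(match.group("secs")),
                 match.group("comm"))
            )
    return events


def count_by_command(events):
    """Count soft lockup reports per command name."""
    return Counter(comm for _cpu, _secs, comm in events)
```

It could be run against a log with something like `count_by_command(parse_soft_lockups(open("/var/log/syslog", errors="replace")))`; the path is an assumption and may differ on other setups.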
I run BOINC when I'm not at my computer, which will often use a lot of CPU/GPU/RAM, but it's limited in what it can use, and my computer is very powerful.
I've also seen instances where individual applications lock up suddenly, and I am unable to kill them because they are waiting on an uninterruptible communication with the CPU. The only recourse I have found in those cases has also been to reboot.
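In case it helps with triage, below is a rough sketch for listing such stuck applications. It assumes they are in the kernel's uninterruptible sleep state (the `D` state reported by `ps`), which I have not confirmed for every case. The helper names `parse_stat` and `uninterruptible_tasks` are hypothetical; the code only reads the standard `/proc/<pid>/stat` files.

```python
import os


def parse_stat(text):
    """Split one /proc/<pid>/stat record into (pid, comm, state)."""
    # comm is wrapped in parentheses and may itself contain ')' or spaces,
    # so locate the last ')' instead of splitting on whitespace.
    open_paren = text.index("(")
    close_paren = text.rindex(")")
    pid = int(text[:open_paren])
    comm = text[open_paren + 1:close_paren]
    state = text[close_paren + 1:].split()[0]
    return pid, comm, state


def uninterruptible_tasks(proc_root="/proc"):
    """Return (pid, comm) for every process currently in state 'D'."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as /proc/meminfo
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or record could not be read
        if state == "D":
            found.append((pid, comm))
    return found
```

Running `uninterruptible_tasks()` while an application is hung should show whether it is sitting in state `D`, which would explain why `kill` has no effect.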
no longer affects: duplicity
Status changed to 'Confirmed' because the bug affects multiple users.