Bug #1668356 “Hard lockup after 4 hours uptime” : Xenial (16.04) : Bugs : linux package : Ubuntu

Olivier Louvignes (olouvignes) wrote on 2017-02-27:

#1

CurrentDmesg.txt (67.7 KiB, text/plain; charset="utf-8")
Dependencies.txt (2.3 KiB, text/plain; charset="utf-8")
HookError_source_linux.txt (722 bytes, text/plain; charset="utf-8")
JournalErrors.txt (55.8 KiB, text/plain; charset="utf-8")
Lspci.txt (17.2 KiB, text/plain; charset="utf-8")
ProcCpuinfo.txt (4.4 KiB, text/plain; charset="utf-8")
ProcEnviron.txt (288 bytes, text/plain; charset="utf-8")
ProcInterrupts.txt (2.7 KiB, text/plain; charset="utf-8")
ProcModules.txt (9.7 KiB, text/plain; charset="utf-8")
UdevDb.txt (169.7 KiB, text/plain; charset="utf-8")

Olivier Louvignes (olouvignes) wrote on 2017-02-27:

#2

DHCPREQUEST of 10.34.242.77 on eth0 to 10.32.65.65 port 67 (xid=0x30fdf923)
BUG: unable to handle kernel NULL pointer dereference at           (null)
IP: [<ffffffff810fa7e3>] timecounter_read+0x13/0x60
PGD 0 
Oops: 0000 [#1] SMP 
Modules linked in: rfcomm ipt_MASQUERADE nf_nat_masquerade_ipv4 xt_multiport xfrm_user xfrm_algo iptable_nat nf_nat_ipv4 br_netfilter bridge stp llc aufs pl2303 usbserial bnep arc4 snd_hda_codec_hdmi snd_soc_skl snd_soc_skl_ipc snd_hda_ext_core snd_soc_sst_ipc snd_hda_codec_realtek snd_soc_sst_dsp snd_hda_codec_generic nls_iso8859_1 snd_soc_core snd_compress ac97_bus snd_pcm_dmaengine dw_dmac_core snd_hda_intel iwlmvm snd_hda_codec snd_hda_core intel_rapl 8250_dw snd_hwdep mac80211 x86_pkg_temp_thermal intel_powerclamp coretemp snd_pcm kvm_intel kvm snd_seq_midi snd_seq_midi_event irqbypass crct10dif_pclmul crc32_pclmul iwlwifi ghash_clmulni_intel snd_rawmidi aesni_intel snd_seq aes_x86_64 lrw gf128mul glue_helper cfg80211 snd_seq_device ablk_helper cryptd snd_timer snd soundcore idma64
 virt_dma shpchp ir_lirc_codec ir_xmp_decoder lirc_dev ir_mce_kbd_decoder ir_sharp_decoder intel_lpss_pci ir_sanyo_decoder btusb ir_sony_decoder hci_uart btrtl ir_jvc_decoder ir_rc6_decoder btbcm btqca ir_rc5_decoder btintel ir_nec_decoder bluetooth mei_me rc_rc6_mce ite_cir rc_core intel_lpss_acpi intel_lpss mei acpi_pad mac_hid acpi_als kfifo_buf industrialio ip6t_REJECT nf_reject_ipv6 nf_log_ipv6 xt_hl ip6t_rt nf_conntrack_ipv6 nf_defrag_ipv6 ipt_REJECT nf_reject_ipv4 nf_log_ipv4 nf_log_common xt_LOG xt_limit xt_tcpudp xt_addrtype nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack ip6table_filter ip6_tables nf_conntrack_netbios_ns nf_conntrack_broadcast nf_nat_ftp nf_nat nf_conntrack_ftp nf_conntrack iptable_filter ip_tables x_tables parport_pc ppdev sunrpc lp parport autofs4 i915_bpo intel_ips
 i2c_algo_bit drm_kms_helper syscopyarea e1000e sysfillrect sysimgblt fb_sys_fops ptp sdhci_pci ahci drm pps_core sdhci libahci video pinctrl_sunrisepoint i2c_hid pinctrl_intel hid fjes
CPU: 3 PID: 15471 Comm: kworker/3:0 Not tainted 4.4.0-64-generic #85-Ubuntu
Hardware name:                  /NUC6i5SYB, BIOS SYSKLi35.86A.0051.2016.0804.1114 08/04/2016
Workqueue: events e1000e_systim_overflow_work [e1000e]
task: ffff880031f32d00 ti: ffff8800350e8000 task.ti: ffff8800350e8000
RIP: 0010:[<ffffffff810fa7e3>]  [<ffffffff810fa7e3>] timecounter_read+0x13/0x60
RSP: 0018:ffff8800350ebdb0  EFLAGS: 00010046
RAX: 0000000000000000 RBX: ffff8800353ab7a0 RCX: 0000000000000001
RDX: 0000000000000001 RSI: ffff8800350ebdf8 RDI: 0000000000000000
RBP: ffff8800350ebdb8 R08: ffff88016ed965c0 R09: 0000000000000000
R10: 000000010035ffff R11: 0000000000000001 R12: ffff8800353ab780
R13: ffff8800350ebdf8 R14: 0000000000000246 R15: ffff8800353ab6d0
FS:  0000000000000000(0000) GS:ffff88016ed80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 0000000002e0a000 CR4: 00000000003406e0
Stack:
 ffff8800353ab7d0 ffff8800350ebde8 ffffffffc014d36e ffff8800353ab6d0
 ffff88016ed965c0 ffff88016ed9af00 00000000000000c0 ffff8800350ebe18
 ffffffffc014d521 ffffffff81837e26 ffff88016ed9af00 00000000a91221c0
Call Trace:
 [<ffffffffc014d36e>] e1000e_phc_gettime+0x2e/0x60 [e1000e]
 [<ffffffffc014d521>] e1000e_systim_overflow_work+0x31/0xa0 [e1000e]
 [<ffffffff81837e26>] ? __schedule+0x3b6/0xa30
 [<ffffffff8109a515>] process_one_work+0x165/0x480
 [<ffffffff8109a87b>] worker_thread+0x4b/0x4c0
 [<ffffffff8109a830>] ? process_one_work+0x480/0x480
 [<ffffffff8109a830>] ? process_one_work+0x480/0x480
 [<ffffffff810a0ba8>] kthread+0xd8/0xf0
 [<ffffffff810a0ad0>] ? kthread_create_on_node+0x1e0/0x1e0
 [<ffffffff8183c98f>] ret_from_fork+0x3f/0x70
 [<ffffffff810a0ad0>] ? kthread_create_on_node+0x1e0/0x1e0
Code: 00 48 d3 e0 48 83 e8 01 48 89 43 18 5b 41 5c 41 5d 5d c3 0f 1f 44 00 00 0f 1f 44 00 00 55 48 89 e5 53 48 8b 07 48 89 fb 48 89 c7 <ff> 10 48 8b 33 48 89 c2 48 2b 53 08 8b 4e 10 48 23 56 08 48 0f 
RIP  [<ffffffff810fa7e3>] timecounter_read+0x13/0x60
 RSP <ffff8800350ebdb0>
CR2: 0000000000000000
---[ end trace 7d024538180dff79 ]---
BUG: unable to handle kernel paging request at ffffffffffffffd8
IP: [<ffffffff810a1250>] kthread_data+0x10/0x20
PGD 2e0d067 PUD 2e0f067 PMD 0 
Oops: 0000 [#2] SMP 
Modules linked in: rfcomm ipt_MASQUERADE nf_nat_masquerade_ipv4 xt_multiport xfrm_user xfrm_algo iptable_nat nf_nat_ipv4 br_netfilter bridge stp llc aufs pl2303 usbserial bnep arc4 snd_hda_codec_hdmi snd_soc_skl snd_soc_skl_ipc snd_hda_ext_core snd_soc_sst_ipc snd_hda_codec_realtek snd_soc_sst_dsp snd_hda_codec_generic nls_iso8859_1 snd_soc_core snd_compress ac97_bus snd_pcm_dmaengine dw_dmac_core snd_hda_intel iwlmvm snd_hda_codec snd_hda_core intel_rapl 8250_dw snd_hwdep mac80211 x86_pkg_temp_thermal intel_powerclamp coretemp snd_pcm kvm_intel kvm snd_seq_midi snd_seq_midi_event irqbypass crct10dif_pclmul crc32_pclmul iwlwifi ghash_clmulni_intel snd_rawmidi aesni_intel snd_seq aes_x86_64 lrw gf128mul glue_helper cfg80211 snd_seq_device ablk_helper cryptd snd_timer snd soundcore idma64
 virt_dma shpchp ir_lirc_codec ir_xmp_decoder lirc_dev ir_mce_kbd_decoder ir_sharp_decoder intel_lpss_pci ir_sanyo_decoder btusb ir_sony_decoder hci_uart btrtl ir_jvc_decoder ir_rc6_decoder btbcm btqca ir_rc5_decoder btintel ir_nec_decoder bluetooth mei_me rc_rc6_mce ite_cir rc_core intel_lpss_acpi intel_lpss mei acpi_pad mac_hid acpi_als kfifo_buf industrialio ip6t_REJECT nf_reject_ipv6 nf_log_ipv6 xt_hl ip6t_rt nf_conntrack_ipv6 nf_defrag_ipv6 ipt_REJECT nf_reject_ipv4 nf_log_ipv4 nf_log_common xt_LOG xt_limit xt_tcpudp xt_addrtype nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack ip6table_filter ip6_tables nf_conntrack_netbios_ns nf_conntrack_broadcast nf_nat_ftp nf_nat nf_conntrack_ftp nf_conntrack iptable_filter ip_tables x_tables parport_pc ppdev sunrpc lp parport autofs4 i915_bpo intel_ips
 i2c_algo_bit drm_kms_helper syscopyarea e1000e sysfillrect sysimgblt fb_sys_fops ptp sdhci_pci ahci drm pps_core sdhci libahci video pinctrl_sunrisepoint i2c_hid pinctrl_intel hid fjes
CPU: 3 PID: 15471 Comm: kworker/3:0 Tainted: G      D         4.4.0-64-generic #85-Ubuntu
Hardware name:                  /NUC6i5SYB, BIOS SYSKLi35.86A.0051.2016.0804.1114 08/04/2016
task: ffff880031f32d00 ti: ffff8800350e8000 task.ti: ffff8800350e8000
RIP: 0010:[<ffffffff810a1250>]  [<ffffffff810a1250>] kthread_data+0x10/0x20
RSP: 0018:ffff8800350ebaa8  EFLAGS: 00010002
RAX: 0000000000000000 RBX: 0000000000000003 RCX: ffffffff82109e80
RDX: 0000000000000003 RSI: 0000000000000003 RDI: ffff880031f32d00
RBP: ffff8800350ebaa8 R08: 00000000ffffffff R09: 0000000000000000
R10: ffff880031f32d60 R11: 0000000000005c00 R12: 0000000000000000
R13: 0000000000016dc0 R14: ffff880031f32d00 R15: ffff88016ed96dc0
FS:  0000000000000000(0000) GS:ffff88016ed80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000028 CR3: 00000001653c0000 CR4: 00000000003406e0
Stack:
 ffff8800350ebac0 ffffffff8109b8b1 ffff88016ed96dc0 ffff8800350ebb10
 ffffffff818380c0 ffff8800350ebb28 ffffffff00000003 ffff880031f32d00
 ffff8800350ec000 ffff880031f333d0 ffff8800350eb6c0 0000000000000046
Call Trace:
 [<ffffffff8109b8b1>] wq_worker_sleeping+0x11/0x90
 [<ffffffff818380c0>] __schedule+0x650/0xa30
 [<ffffffff818384d5>] schedule+0x35/0x80
 [<ffffffff81084425>] do_exit+0x775/0xb00
 [<ffffffff81031c41>] oops_end+0xa1/0xd0
 [<ffffffff8106ad05>] no_context+0x135/0x380
 [<ffffffff8106afd0>] __bad_area_nosemaphore+0x80/0x1f0
 [<ffffffff8106b153>] bad_area_nosemaphore+0x13/0x20
 [<ffffffff8106b417>] __do_page_fault+0xb7/0x400
 [<ffffffffc01ecb7b>] ? __i915_wait_request+0xcb/0x660 [i915_bpo]
 [<ffffffff8106b782>] do_page_fault+0x22/0x30
 [<ffffffff8183e778>] page_fault+0x28/0x30
 [<ffffffff810fa7e3>] ? timecounter_read+0x13/0x60
 [<ffffffffc014d36e>] e1000e_phc_gettime+0x2e/0x60 [e1000e]
 [<ffffffffc014d521>] e1000e_systim_overflow_work+0x31/0xa0 [e1000e]
 [<ffffffff81837e26>] ? __schedule+0x3b6/0xa30
 [<ffffffff8109a515>] process_one_work+0x165/0x480
 [<ffffffff8109a87b>] worker_thread+0x4b/0x4c0
 [<ffffffff8109a830>] ? process_one_work+0x480/0x480
 [<ffffffff8109a830>] ? process_one_work+0x480/0x480
 [<ffffffff810a0ba8>] kthread+0xd8/0xf0
 [<ffffffff810a0ad0>] ? kthread_create_on_node+0x1e0/0x1e0
 [<ffffffff8183c98f>] ret_from_fork+0x3f/0x70
 [<ffffffff810a0ad0>] ? kthread_create_on_node+0x1e0/0x1e0
Code: ff ff ff be 46 02 00 00 48 c7 c7 a0 ac cb 81 e8 c7 01 fe ff e9 a6 fe ff ff 66 90 0f 1f 44 00 00 48 8b 87 18 05 00 00 55 48 89 e5 <48> 8b 40 d8 5d c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 
RIP  [<ffffffff810a1250>] kthread_data+0x10/0x20
 RSP <ffff8800350ebaa8>
CR2: ffffffffffffffd8
---[ end trace 7d024538180dff7a ]---
Fixing recursive fault but reboot is needed!
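
For readers following the trace, here is a minimal Python sketch (not part of the original report) that pulls the reliable frames out of an oops like the one above. The helper name `reliable_frames` and the regular expression are assumptions made for illustration, and the sample text is an abbreviated excerpt of the first oops. Frames prefixed with `?` are skipped because the kernel marks them as addresses found on the stack that may not belong to the actual call chain.

```python
import re

# Matches a frame such as:
#  [<ffffffffc014d36e>] e1000e_phc_gettime+0x2e/0x60 [e1000e]
#  [<ffffffff81837e26>] ? __schedule+0x3b6/0xa30
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<unreliable>\?\s+)?"
    r"(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def reliable_frames(oops_text):
    """Return (symbol, module) pairs for frames not marked with '?'."""
    frames = []
    in_trace = False
    for line in oops_text.splitlines():
        if line.startswith("Call Trace:"):
            in_trace = True
            continue
        if not in_trace:
            continue
        match = FRAME_RE.search(line)
        if match is None:
            in_trace = False  # the trace ends at the first non-frame line
            continue
        if match.group("unreliable"):
            continue
        # Frames without a trailing [module] belong to the core kernel image.
        frames.append((match.group("symbol"), match.group("module") or "kernel"))
    return frames


# Abbreviated excerpt of the first oops in this report (Code line shortened).
SAMPLE = """\
RIP: 0010:[<ffffffff810fa7e3>]  [<ffffffff810fa7e3>] timecounter_read+0x13/0x60
Call Trace:
 [<ffffffffc014d36e>] e1000e_phc_gettime+0x2e/0x60 [e1000e]
 [<ffffffffc014d521>] e1000e_systim_overflow_work+0x31/0xa0 [e1000e]
 [<ffffffff81837e26>] ? __schedule+0x3b6/0xa30
 [<ffffffff8109a515>] process_one_work+0x165/0x480
 [<ffffffff8109a87b>] worker_thread+0x4b/0x4c0
 [<ffffffff8109a830>] ? process_one_work+0x480/0x480
 [<ffffffff810a0ba8>] kthread+0xd8/0xf0
 [<ffffffff8183c98f>] ret_from_fork+0x3f/0x70
Code: 00 48 d3 e0
"""

if __name__ == "__main__":
    for symbol, module in reliable_frames(SAMPLE):
        print(f"{symbol} ({module})")
```

Read from the bottom up, the reliable frames show a workqueue worker running e1000e_systim_overflow_work, which calls e1000e_phc_gettime; the RIP line then places the faulting instruction in timecounter_read.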

Brad Figg (brad-figg) wrote on 2017-02-27: Status changed to Confirmed

#3

This change was made by a bot.

Changed in linux (Ubuntu):
status:	New → Confirmed

Joseph Salisbury (jsalisbury) on 2017-02-27

Changed in linux (Ubuntu):
importance:	Undecided → Medium
importance:	Medium → High
Changed in linux (Ubuntu Xenial):
importance:	Undecided → High
status:	New → Triaged
Changed in linux (Ubuntu):
status:	Confirmed → Triaged
tags:	added: kernel-da-key

Joseph Salisbury (jsalisbury) wrote on 2017-02-27:

#4

The LKML thread you referenced in the bug description was for commit 37b12910dd11d9ab969f2c310dc9160b7f3e3405. That commit landed upstream in v4.3-rc1, so it is already in the 4.4-based Xenial kernel.
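
The version reasoning above can be sketched in a few lines of Python. This is only an ordering heuristic over mainline tag names, and the function names `parse_kernel_version` and `includes_commit` are hypothetical; it says nothing about backports or distribution patches. With a kernel git tree at hand, `git tag --contains <commit>` would be the authoritative check.

```python
import re


def parse_kernel_version(tag):
    """Turn 'v4.3-rc1' or '4.4' into a sortable tuple.

    A release candidate sorts before the final release of the same version.
    """
    match = re.fullmatch(r"v?(\d+)\.(\d+)(?:\.(\d+))?(?:-rc(\d+))?", tag)
    if match is None:
        raise ValueError(f"unrecognized kernel version: {tag!r}")
    major, minor, patch, rc = match.groups()
    # Final releases get an infinite rc rank so that v4.3-rc1 < v4.3.
    rc_rank = int(rc) if rc is not None else float("inf")
    return (int(major), int(minor), int(patch or 0), rc_rank)


def includes_commit(base_version, landed_in):
    """True if a commit that landed in `landed_in` predates `base_version`."""
    return parse_kernel_version(landed_in) <= parse_kernel_version(base_version)


if __name__ == "__main__":
    # The commit landed in v4.3-rc1; Xenial's kernel is based on 4.4.
    print(includes_commit("4.4", "v4.3-rc1"))
```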

Did this issue start happening after a recent upgrade, or after applying updates?

Joseph Salisbury (jsalisbury) wrote on 2017-02-27:

#5

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v4.10 kernel[0].

If this bug is fixed in the mainline kernel, please add the tag: 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.10

Olivier Louvignes (olouvignes) wrote on 2017-02-28:

#6

Hard to tell regarding updates. I'd say it started in early January, but we did not really pay attention at first. Our players have unattended security upgrades, so a kernel upgrade that landed in January might have introduced a regression. As far as I know, we never encountered this issue in 2016.
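
One way to narrow down which kernel upgrade arrived in January would be to scan the dpkg log on an affected machine. The sketch below is an assumption-laden illustration, not something run for this report: it assumes the usual `/var/log/dpkg.log` line layout (date, time, action, package:arch, old version, new version), the function name `kernel_installs` is hypothetical, and the sample lines are invented and do not come from the reporter's machines.

```python
from datetime import datetime


def kernel_installs(log_lines, since):
    """Yield (timestamp, package, version) for kernel image installs/upgrades."""
    for line in log_lines:
        fields = line.split()
        # Expected: date time action package:arch old-version new-version
        if len(fields) < 6 or fields[2] not in ("install", "upgrade"):
            continue
        package = fields[3].split(":")[0]  # drop the ":amd64" suffix
        if not package.startswith("linux-image-"):
            continue
        stamp = datetime.strptime(f"{fields[0]} {fields[1]}", "%Y-%m-%d %H:%M:%S")
        if stamp >= since:
            yield stamp, package, fields[5]


# Invented sample lines with placeholder versions, for illustration only.
SAMPLE_LOG = [
    "2016-12-05 06:30:00 upgrade libc6:amd64 2.23-0ubuntu4 2.23-0ubuntu5",
    "2016-12-06 06:30:00 install linux-image-4.4.0-1-generic:amd64 <none> 4.4.0-1.1",
    "2017-01-12 06:31:07 install linux-image-4.4.0-2-generic:amd64 <none> 4.4.0-2.2",
    "2017-01-12 06:31:09 status installed linux-image-4.4.0-2-generic:amd64 4.4.0-2.2",
]

if __name__ == "__main__":
    for stamp, package, version in kernel_installs(SAMPLE_LOG, datetime(2017, 1, 1)):
        print(stamp.isoformat(), package, version)
```

On a real system the lines would come from `open("/var/log/dpkg.log")`; older entries may sit in rotated copies of that file.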

Do you think using 4.10 would be considered safe in production? I'm a bit afraid of (further) breaking production machines. Thanks!

Jay (jayanth-k) wrote on 2017-12-13:

#7

Hi,

We can confirm that this issue still persists as of this update: 4.4.0-103-generic #126-Ubuntu SMP Mon Dec 4 16:23:28 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux.

It only happens on a soft reboot, i.e. 'sudo reboot'.

Not sure if there is a workaround for it.

Affects          Status   Importance  Assigned to  Milestone
linux (Ubuntu)   Triaged  High        Unassigned
Xenial           Triaged  High        Unassigned