NVMe triggering kernel panic followed by "bad: scheduling from the idle thread!"

Bug #1626679 reported by Paul Graydon
This bug affects 2 people
Affects: linux (Ubuntu)
Status: Triaged
Importance: High
Assigned to: Unassigned
Milestone:

Bug Description

On an NVMe system I'm using, Ubuntu 16.04.1 regularly seems to trigger a kernel panic in some part of the NVMe driver, after which the logs fill up with repeated entries of:

"bad: scheduling from the idle thread!"

Here's the initial stack trace that seems to trigger the bug:

Sep 22 15:51:46 ubuntu kernel: [ 97.478175] ------------[ cut here ]------------
Sep 22 15:51:46 ubuntu kernel: [ 97.478185] WARNING: CPU: 13 PID: 0 at /build/linux-dcxD3m/linux-4.4.0/kernel/irq/manage.c:1438 __free_irq+0x1d2/0x280()
Sep 22 15:51:46 ubuntu kernel: [ 97.478188] Trying to free IRQ 38 from IRQ context!
Sep 22 15:51:46 ubuntu kernel: [ 97.478191] Modules linked in: nls_iso8859_1 ipmi_ssif intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass ioatdma mei_me sb_edac shpchp edac_core lpc_ich mei 8250_fintek ipmi_msghandler mac_hid ib_iser rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core ib_addr autofs4 btrfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul ixgbe crc32_pclmul dca vxlan aesni_intel ip6_udp_tunnel udp_tunnel aes_x86_64 lrw gf128mul ptp glue_helper ahci ablk_helper pps_core cryptd nvme libahci mdio wmi fjes
Sep 22 15:51:46 ubuntu kernel: [ 97.478257] CPU: 13 PID: 0 Comm: swapper/13 Not tainted 4.4.0-31-generic #50-Ubuntu
Sep 22 15:51:46 ubuntu kernel: [ 97.478260] Hardware name: Oracle Corporation ORACLE SERVER X5-2/ASM,MOTHERBOARD,1U, BIOS 30080100 04/13/2016
Sep 22 15:51:46 ubuntu kernel: [ 97.478263] 0000000000000286 4fea3140a01056a3 ffff883f7f743b10 ffffffff813f1143
Sep 22 15:51:46 ubuntu kernel: [ 97.478267] ffff883f7f743b58 ffffffff81cb61f8 ffff883f7f743b48 ffffffff81081102
Sep 22 15:51:46 ubuntu kernel: [ 97.478271] 0000000000000026 ffff883f5b2ea700 0000000000000026 00000000ffffffff
Sep 22 15:51:46 ubuntu kernel: [ 97.478275] Call Trace:
Sep 22 15:51:46 ubuntu kernel: [ 97.478277] <IRQ> [<ffffffff813f1143>] dump_stack+0x63/0x90
Sep 22 15:51:46 ubuntu kernel: [ 97.478290] [<ffffffff81081102>] warn_slowpath_common+0x82/0xc0
Sep 22 15:51:46 ubuntu kernel: [ 97.478294] [<ffffffff8108119c>] warn_slowpath_fmt+0x5c/0x80
Sep 22 15:51:46 ubuntu kernel: [ 97.478299] [<ffffffff81098b03>] ? try_to_grab_pending+0xb3/0x160
Sep 22 15:51:46 ubuntu kernel: [ 97.478302] [<ffffffff810dbaf2>] __free_irq+0x1d2/0x280
Sep 22 15:51:46 ubuntu kernel: [ 97.478306] [<ffffffff810dbc2c>] free_irq+0x3c/0x90
Sep 22 15:51:46 ubuntu kernel: [ 97.478314] [<ffffffffc0072199>] nvme_suspend_queue+0x89/0xb0 [nvme]
Sep 22 15:51:46 ubuntu kernel: [ 97.478320] [<ffffffffc00721e7>] nvme_disable_admin_queue+0x27/0x90 [nvme]
Sep 22 15:51:46 ubuntu kernel: [ 97.478325] [<ffffffffc00724ee>] nvme_dev_disable+0x29e/0x2c0 [nvme]
Sep 22 15:51:46 ubuntu kernel: [ 97.478330] [<ffffffffc0071420>] ? __nvme_process_cq+0x210/0x210 [nvme]
Sep 22 15:51:46 ubuntu kernel: [ 97.478334] [<ffffffff8154fc7c>] ? dev_warn+0x6c/0x90
Sep 22 15:51:46 ubuntu kernel: [ 97.478340] [<ffffffffc0072700>] nvme_timeout+0x110/0x1d0 [nvme]
Sep 22 15:51:46 ubuntu kernel: [ 97.478344] [<ffffffff813f0f4f>] ? cpumask_next_and+0x2f/0x40
Sep 22 15:51:46 ubuntu kernel: [ 97.478348] [<ffffffff810bd5dc>] ? load_balance+0x18c/0x980
Sep 22 15:51:46 ubuntu kernel: [ 97.478354] [<ffffffff813cc2ff>] blk_mq_rq_timed_out+0x2f/0x70
Sep 22 15:51:46 ubuntu kernel: [ 97.478358] [<ffffffff813cc38e>] blk_mq_check_expired+0x4e/0x80
Sep 22 15:51:46 ubuntu kernel: [ 97.478363] [<ffffffff813cece8>] bt_for_each+0xd8/0xe0
Sep 22 15:51:46 ubuntu kernel: [ 97.478367] [<ffffffff813cc340>] ? blk_mq_rq_timed_out+0x70/0x70
Sep 22 15:51:46 ubuntu kernel: [ 97.478370] [<ffffffff813cc340>] ? blk_mq_rq_timed_out+0x70/0x70
Sep 22 15:51:46 ubuntu kernel: [ 97.478375] [<ffffffff813cf4f7>] blk_mq_queue_tag_busy_iter+0x47/0xc0
Sep 22 15:51:46 ubuntu kernel: [ 97.478379] [<ffffffff813cb0a0>] ? blk_mq_attempt_merge+0xb0/0xb0
Sep 22 15:51:46 ubuntu kernel: [ 97.478383] [<ffffffff813cb0e1>] blk_mq_rq_timer+0x41/0xf0
Sep 22 15:51:46 ubuntu kernel: [ 97.478389] [<ffffffff810ec5e5>] call_timer_fn+0x35/0x120
Sep 22 15:51:46 ubuntu kernel: [ 97.478393] [<ffffffff813cb0a0>] ? blk_mq_attempt_merge+0xb0/0xb0
Sep 22 15:51:46 ubuntu kernel: [ 97.478397] [<ffffffff810ecf9a>] run_timer_softirq+0x23a/0x2f0
Sep 22 15:51:46 ubuntu kernel: [ 97.478403] [<ffffffff81085b51>] __do_softirq+0x101/0x290
Sep 22 15:51:46 ubuntu kernel: [ 97.478407] [<ffffffff81085e53>] irq_exit+0xa3/0xb0
Sep 22 15:51:46 ubuntu kernel: [ 97.478413] [<ffffffff818305e2>] smp_apic_timer_interrupt+0x42/0x50
Sep 22 15:51:46 ubuntu kernel: [ 97.478417] [<ffffffff8182e8a2>] apic_timer_interrupt+0x82/0x90
Sep 22 15:51:46 ubuntu kernel: [ 97.478419] <EOI> [<ffffffff816c3c11>] ? cpuidle_enter_state+0x111/0x2b0
Sep 22 15:51:46 ubuntu kernel: [ 97.478428] [<ffffffff816c3de7>] cpuidle_enter+0x17/0x20
Sep 22 15:51:46 ubuntu kernel: [ 97.478432] [<ffffffff810c3fe2>] call_cpuidle+0x32/0x60
Sep 22 15:51:46 ubuntu kernel: [ 97.478436] [<ffffffff816c3dc3>] ? cpuidle_select+0x13/0x20
Sep 22 15:51:46 ubuntu kernel: [ 97.478440] [<ffffffff810c42a0>] cpu_startup_entry+0x290/0x350
Sep 22 15:51:46 ubuntu kernel: [ 97.478444] [<ffffffff81051714>] start_secondary+0x154/0x190
Sep 22 15:51:46 ubuntu kernel: [ 97.478448] ---[ end trace 4f4c67e52b4d19ac ]---

Then:

Sep 22 15:51:46 ubuntu kernel: [ 97.478463] BUG: scheduling while atomic: swapper/13/0/0x00000100
Sep 22 15:51:46 ubuntu kernel: [ 97.551653] Modules linked in: nls_iso8859_1 ipmi_ssif intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass ioatdma mei_me sb_edac shpchp edac_core lpc_ich mei 8250_fintek ipmi_msghandler mac_hid ib_iser rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core ib_addr autofs4 btrfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul ixgbe crc32_pclmul dca vxlan aesni_intel ip6_udp_tunnel udp_tunnel aes_x86_64 lrw gf128mul ptp glue_helper ahci ablk_helper pps_core cryptd nvme libahci mdio wmi fjes
Sep 22 15:51:46 ubuntu kernel: [ 97.551671] CPU: 13 PID: 0 Comm: swapper/13 Tainted: G W 4.4.0-31-generic #50-Ubuntu
Sep 22 15:51:46 ubuntu kernel: [ 97.551672] Hardware name: Oracle Corporation ORACLE SERVER X5-2/ASM,MOTHERBOARD,1U, BIOS 30080100 04/13/2016
Sep 22 15:51:46 ubuntu kernel: [ 97.551673] 0000000000000286 4fea3140a01056a3 ffff883f7f743a98 ffffffff813f1143
Sep 22 15:51:46 ubuntu kernel: [ 97.551674] ffff883f7f756d00 0000000000000000 ffff883f7f743aa8 ffffffff810a5e4b
Sep 22 15:51:46 ubuntu kernel: [ 97.551677] ffff883f7f743af8 ffffffff818296e6 ffff883f5b340400 ffff883f0000000d
Sep 22 15:51:46 ubuntu kernel: [ 97.551679] Call Trace:
Sep 22 15:51:46 ubuntu kernel: [ 97.551679] <IRQ> [<ffffffff813f1143>] dump_stack+0x63/0x90
Sep 22 15:51:46 ubuntu kernel: [ 97.551688] [<ffffffff810a5e4b>] __schedule_bug+0x4b/0x60
Sep 22 15:51:46 ubuntu kernel: [ 97.551691] [<ffffffff818296e6>] __schedule+0x726/0xa30
Sep 22 15:51:46 ubuntu kernel: [ 97.551693] [<ffffffff81829a25>] schedule+0x35/0x80
Sep 22 15:51:46 ubuntu kernel: [ 97.551694] [<ffffffff8182cab9>] schedule_timeout+0x129/0x270
Sep 22 15:51:46 ubuntu kernel: [ 97.551695] [<ffffffff810ec5a0>] ? trace_event_raw_event_tick_stop+0x120/0x120
Sep 22 15:51:46 ubuntu kernel: [ 97.551697] [<ffffffff810ec9bd>] msleep+0x2d/0x40
Sep 22 15:51:46 ubuntu kernel: [ 97.551699] [<ffffffffc006e510>] nvme_wait_ready+0x90/0x100 [nvme]
Sep 22 15:51:46 ubuntu kernel: [ 97.551701] [<ffffffffc006fdf0>] nvme_disable_ctrl+0x40/0x50 [nvme]
Sep 22 15:51:46 ubuntu kernel: [ 97.551702] [<ffffffffc007224d>] nvme_disable_admin_queue+0x8d/0x90 [nvme]
Sep 22 15:51:46 ubuntu kernel: [ 97.551704] [<ffffffffc00724ee>] nvme_dev_disable+0x29e/0x2c0 [nvme]
Sep 22 15:51:46 ubuntu kernel: [ 97.551706] [<ffffffffc0071420>] ? __nvme_process_cq+0x210/0x210 [nvme]
Sep 22 15:51:46 ubuntu kernel: [ 97.551707] [<ffffffff8154fc7c>] ? dev_warn+0x6c/0x90
Sep 22 15:51:46 ubuntu kernel: [ 97.551708] [<ffffffffc0072700>] nvme_timeout+0x110/0x1d0 [nvme]
Sep 22 15:51:46 ubuntu kernel: [ 97.551710] [<ffffffff813f0f4f>] ? cpumask_next_and+0x2f/0x40
Sep 22 15:51:46 ubuntu kernel: [ 97.551711] [<ffffffff810bd5dc>] ? load_balance+0x18c/0x980
Sep 22 15:51:46 ubuntu kernel: [ 97.551712] [<ffffffff813cc2ff>] blk_mq_rq_timed_out+0x2f/0x70
Sep 22 15:51:46 ubuntu kernel: [ 97.551714] [<ffffffff813cc38e>] blk_mq_check_expired+0x4e/0x80
Sep 22 15:51:46 ubuntu kernel: [ 97.551715] [<ffffffff813cece8>] bt_for_each+0xd8/0xe0
Sep 22 15:51:46 ubuntu kernel: [ 97.551717] [<ffffffff813cc340>] ? blk_mq_rq_timed_out+0x70/0x70
Sep 22 15:51:46 ubuntu kernel: [ 97.551718] [<ffffffff813cc340>] ? blk_mq_rq_timed_out+0x70/0x70
Sep 22 15:51:46 ubuntu kernel: [ 97.551719] [<ffffffff813cf4f7>] blk_mq_queue_tag_busy_iter+0x47/0xc0
Sep 22 15:51:46 ubuntu kernel: [ 97.551720] [<ffffffff813cb0a0>] ? blk_mq_attempt_merge+0xb0/0xb0
Sep 22 15:51:46 ubuntu kernel: [ 97.551722] [<ffffffff813cb0e1>] blk_mq_rq_timer+0x41/0xf0
Sep 22 15:51:46 ubuntu kernel: [ 97.551723] [<ffffffff810ec5e5>] call_timer_fn+0x35/0x120
Sep 22 15:51:46 ubuntu kernel: [ 97.551724] [<ffffffff813cb0a0>] ? blk_mq_attempt_merge+0xb0/0xb0
Sep 22 15:51:46 ubuntu kernel: [ 97.551726] [<ffffffff810ecf9a>] run_timer_softirq+0x23a/0x2f0
Sep 22 15:51:46 ubuntu kernel: [ 97.551727] [<ffffffff81085b51>] __do_softirq+0x101/0x290
Sep 22 15:51:46 ubuntu kernel: [ 97.551729] [<ffffffff81085e53>] irq_exit+0xa3/0xb0
Sep 22 15:51:46 ubuntu kernel: [ 97.551730] [<ffffffff818305e2>] smp_apic_timer_interrupt+0x42/0x50
Sep 22 15:51:46 ubuntu kernel: [ 97.551731] [<ffffffff8182e8a2>] apic_timer_interrupt+0x82/0x90
Sep 22 15:51:46 ubuntu kernel: [ 97.551732] <EOI> [<ffffffff816c3c11>] ? cpuidle_enter_state+0x111/0x2b0
Sep 22 15:51:46 ubuntu kernel: [ 97.551735] [<ffffffff816c3de7>] cpuidle_enter+0x17/0x20
Sep 22 15:51:46 ubuntu kernel: [ 97.551736] [<ffffffff810c3fe2>] call_cpuidle+0x32/0x60
Sep 22 15:51:46 ubuntu kernel: [ 97.551737] [<ffffffff816c3dc3>] ? cpuidle_select+0x13/0x20
Sep 22 15:51:46 ubuntu kernel: [ 97.551738] [<ffffffff810c42a0>] cpu_startup_entry+0x290/0x350
Sep 22 15:51:46 ubuntu kernel: [ 97.551740] [<ffffffff81051714>] start_secondary+0x154/0x190
Sep 22 15:51:46 ubuntu kernel: [ 97.551741] bad: scheduling from the idle thread!
Sep 22 15:51:46 ubuntu kernel: [ 97.608224] CPU: 13 PID: 0 Comm: swapper/13 Tainted: G W 4.4.0-31-generic #50-Ubuntu
Sep 22 15:51:46 ubuntu kernel: [ 97.608225] Hardware name: Oracle Corporation ORACLE SERVER X5-2/ASM,MOTHERBOARD,1U, BIOS 30080100 04/13/2016
Sep 22 15:51:46 ubuntu kernel: [ 97.608226] 0000000000000286 4fea3140a01056a3 ffff883f7f743a68 ffffffff813f1143
Sep 22 15:51:46 ubuntu kernel: [ 97.608227] ffff883f7f756d00 0000000000000000 ffff883f7f743a80 ffffffff810b1fbc
Sep 22 15:51:46 ubuntu kernel: [ 97.608228] ffff883f7f756d00 ffff883f7f743aa8 ffffffff810aaf91 ffff883f7f756d00
Sep 22 15:51:46 ubuntu kernel: [ 97.608229] Call Trace:
Sep 22 15:51:46 ubuntu kernel: [ 97.608230] <IRQ> [<ffffffff813f1143>] dump_stack+0x63/0x90
Sep 22 15:51:46 ubuntu kernel: [ 97.608233] [<ffffffff810b1fbc>] dequeue_task_idle+0x2c/0x40
Sep 22 15:51:46 ubuntu kernel: [ 97.608235] [<ffffffff810aaf91>] deactivate_task+0x81/0xa0
Sep 22 15:51:46 ubuntu kernel: [ 97.608236] [<ffffffff818290af>] __schedule+0xef/0xa30
Sep 22 15:51:46 ubuntu kernel: [ 97.608238] [<ffffffff81829a25>] schedule+0x35/0x80
Sep 22 15:51:46 ubuntu kernel: [ 97.608239] [<ffffffff8182cab9>] schedule_timeout+0x129/0x270
Sep 22 15:51:46 ubuntu kernel: [ 97.608240] [<ffffffff810ec5a0>] ? trace_event_raw_event_tick_stop+0x120/0x120
Sep 22 15:51:46 ubuntu kernel: [ 97.608242] [<ffffffff810ec9bd>] msleep+0x2d/0x40
Sep 22 15:51:46 ubuntu kernel: [ 97.608244] [<ffffffffc006e510>] nvme_wait_ready+0x90/0x100 [nvme]
Sep 22 15:51:46 ubuntu kernel: [ 97.608245] [<ffffffffc006fdf0>] nvme_disable_ctrl+0x40/0x50 [nvme]
Sep 22 15:51:46 ubuntu kernel: [ 97.608247] [<ffffffffc007224d>] nvme_disable_admin_queue+0x8d/0x90 [nvme]
Sep 22 15:51:46 ubuntu kernel: [ 97.608250] [<ffffffffc00724ee>] nvme_dev_disable+0x29e/0x2c0 [nvme]
Sep 22 15:51:46 ubuntu kernel: [ 97.608251] [<ffffffffc0071420>] ? __nvme_process_cq+0x210/0x210 [nvme]
Sep 22 15:51:46 ubuntu kernel: [ 97.608253] [<ffffffff8154fc7c>] ? dev_warn+0x6c/0x90
Sep 22 15:51:46 ubuntu kernel: [ 97.608254] [<ffffffffc0072700>] nvme_timeout+0x110/0x1d0 [nvme]
Sep 22 15:51:46 ubuntu kernel: [ 97.608256] [<ffffffff813f0f4f>] ? cpumask_next_and+0x2f/0x40
Sep 22 15:51:46 ubuntu kernel: [ 97.608257] [<ffffffff810bd5dc>] ? load_balance+0x18c/0x980
Sep 22 15:51:46 ubuntu kernel: [ 97.608258] [<ffffffff813cc2ff>] blk_mq_rq_timed_out+0x2f/0x70
Sep 22 15:51:46 ubuntu kernel: [ 97.608259] [<ffffffff813cc38e>] blk_mq_check_expired+0x4e/0x80
Sep 22 15:51:46 ubuntu kernel: [ 97.608261] [<ffffffff813cece8>] bt_for_each+0xd8/0xe0
Sep 22 15:51:46 ubuntu kernel: [ 97.608263] [<ffffffff813cc340>] ? blk_mq_rq_timed_out+0x70/0x70
Sep 22 15:51:46 ubuntu kernel: [ 97.608264] [<ffffffff813cc340>] ? blk_mq_rq_timed_out+0x70/0x70
Sep 22 15:51:46 ubuntu kernel: [ 97.608266] [<ffffffff813cf4f7>] blk_mq_queue_tag_busy_iter+0x47/0xc0
Sep 22 15:51:46 ubuntu kernel: [ 97.608267] [<ffffffff813cb0a0>] ? blk_mq_attempt_merge+0xb0/0xb0
Sep 22 15:51:46 ubuntu kernel: [ 97.608268] [<ffffffff813cb0e1>] blk_mq_rq_timer+0x41/0xf0
Sep 22 15:51:46 ubuntu kernel: [ 97.608270] [<ffffffff810ec5e5>] call_timer_fn+0x35/0x120
Sep 22 15:51:46 ubuntu kernel: [ 97.608272] [<ffffffff813cb0a0>] ? blk_mq_attempt_merge+0xb0/0xb0
Sep 22 15:51:46 ubuntu kernel: [ 97.608273] [<ffffffff810ecf9a>] run_timer_softirq+0x23a/0x2f0
Sep 22 15:51:46 ubuntu kernel: [ 97.608275] [<ffffffff81085b51>] __do_softirq+0x101/0x290
Sep 22 15:51:46 ubuntu kernel: [ 97.608276] [<ffffffff81085e53>] irq_exit+0xa3/0xb0
Sep 22 15:51:46 ubuntu kernel: [ 97.608278] [<ffffffff818305e2>] smp_apic_timer_interrupt+0x42/0x50
Sep 22 15:51:46 ubuntu kernel: [ 97.608279] [<ffffffff8182e8a2>] apic_timer_interrupt+0x82/0x90
Sep 22 15:51:46 ubuntu kernel: [ 97.608280] <EOI> [<ffffffff816c3c11>] ? cpuidle_enter_state+0x111/0x2b0
Sep 22 15:51:46 ubuntu kernel: [ 97.608282] [<ffffffff816c3de7>] cpuidle_enter+0x17/0x20
Sep 22 15:51:46 ubuntu kernel: [ 97.608284] [<ffffffff810c3fe2>] call_cpuidle+0x32/0x60
Sep 22 15:51:46 ubuntu kernel: [ 97.608286] [<ffffffff816c3dc3>] ? cpuidle_select+0x13/0x20
Sep 22 15:51:46 ubuntu kernel: [ 97.608287] [<ffffffff810c42a0>] cpu_startup_entry+0x290/0x350
Sep 22 15:51:46 ubuntu kernel: [ 97.608288] [<ffffffff81051714>] start_secondary+0x154/0x190
Sep 22 15:51:46 ubuntu kernel: [ 97.608591] nvme 0000:2b:00.0: Cancelling I/O 0 QID 0

It largely continues on from there. I'll attach a copy of a kern.log with the full details in it.
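
For anyone who wants to check how often these messages recur in their own kern.log, here is a minimal Python sketch. It is not part of the original report: the marker strings are copied from the excerpts above, while the function names and the log path in the usage note are assumptions made for illustration.

```python
from collections import Counter

# Marker strings copied verbatim from the log excerpts in this report.
MARKERS = (
    "Trying to free IRQ",
    "BUG: scheduling while atomic",
    "bad: scheduling from the idle thread!",
)


def count_markers(lines):
    """Count how many lines contain each marker string."""
    counts = Counter({marker: 0 for marker in MARKERS})
    for line in lines:
        for marker in MARKERS:
            if marker in line:
                counts[marker] += 1
    return counts


def count_markers_in_file(path):
    """Hypothetical helper: run count_markers over a log file on disk."""
    # errors="replace" keeps the scan going if the log contains stray bytes.
    with open(path, encoding="utf-8", errors="replace") as handle:
        return count_markers(handle)
```

For example, `count_markers_in_file("/var/log/kern.log")` returns one count per marker. The path is an assumption, so adjust it to wherever the log actually lives.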

ProblemType: Bug
DistroRelease: Ubuntu 16.04
Package: linux-image-4.4.0-38-generic 4.4.0-38.57
ProcVersionSignature: Ubuntu 4.4.0-31.50-generic 4.4.13
Uname: Linux 4.4.0-31-generic x86_64
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 Sep 22 17:40 seq
 crw-rw---- 1 root audio 116, 33 Sep 22 17:40 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
ApportVersion: 2.20.1-0ubuntu2.1
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
Date: Thu Sep 22 17:53:45 2016
HibernationDevice: RESUME=UUID=4d659f94-d8b2-49f5-befe-a02b9a5f8677
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
Lsusb:
 Bus 002 Device 002: ID 8087:8002 Intel Corp.
 Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
 Bus 001 Device 002: ID 8087:800a Intel Corp.
 Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
MachineType: Oracle Corporation ORACLE SERVER X5-2
PciMultimedia:

ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB: 0 EFI VGA
ProcKernelCmdLine: BOOT_IMAGE=(http)/kernel initrd=initrd root=/dev/sda3 ro netroot=iscsi:@169.254.0.2::3260::iqn.2015-02.oracle.boot:uefi crashkernel=auto ip=dhcp iscsi_initiator=iqn.2015-10.oracle:2.1g1538-gb000393 LANG=en_US.UTF-8 console=ttyS0,9600 console=tty0
RelatedPackageVersions:
 linux-restricted-modules-4.4.0-31-generic N/A
 linux-backports-modules-4.4.0-31-generic N/A
 linux-firmware 1.157.3
RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 04/13/2016
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 30080100
dmi.board.asset.tag: 7317947
dmi.board.name: ASM,MOTHERBOARD,1U
dmi.board.vendor: Oracle Corporation
dmi.board.version: Rev 04
dmi.chassis.asset.tag: 7331406
dmi.chassis.type: 17
dmi.chassis.vendor: Oracle Corporation
dmi.chassis.version: ORACLE SERVER X5-2
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr30080100:bd04/13/2016:svnOracleCorporation:pnORACLESERVERX5-2:pvr:rvnOracleCorporation:rnASM,MOTHERBOARD,1U:rvrRev04:cvnOracleCorporation:ct17:cvrORACLESERVERX5-2:
dmi.product.name: ORACLE SERVER X5-2
dmi.sys.vendor: Oracle Corporation

Paul Graydon (twirrim) wrote :

gzip'd copy of the kern.log showing the error.

Brad Figg (brad-figg) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
Changed in linux (Ubuntu):
importance: Undecided → High
Joseph Salisbury (jsalisbury) wrote :

Can you test the Yakkety proposed kernel and post back if it resolves this bug?

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed.

Changed in linux (Ubuntu):
status: Confirmed → Triaged
tags: added: kernel-da-key
Paul Graydon (twirrim) wrote :

There isn't a kernel in proposed at the moment, but I've tested using the latest in yakkety and it seems to be working fine.

I don't have a simple replication case for the bug, unfortunately. It just seems to happen for (hand-wavey guess) 50% of boots.

So far I've got this 4.8.0-19-generic kernel to boot several times over without problem. I'll keep rebooting and rebooting the server in the background today, just in case, while I focus on other stuff.

Patricia Gaughen (gaughen) wrote :

Paul - have you confirmed that you are no longer seeing the issue? If yes, please update this bug with the info.

Keith Busch (keith-busch) wrote :

A couple of issues here. The nvme driver was ported from 4.5, but the block layer was based on 4.4, so there was a mismatch in how timeouts are handled. That was this Launchpad bug:

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1597908

But also, there is a bug handling legacy IRQ that only affected the 4.5 version of this driver, and that was fixed in commit: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit?id=a5229050b69cfffb690b546c357ca5a60434c0c8

Tim Gardner (timg-tpi) wrote :

That patch was released in Ubuntu-4.4.0-35.54
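
One way to tell whether a given 4.4.0 kernel is at or past that release is to compare the Ubuntu version numbers. The sketch below is an illustration only and does not come from this thread: the function names are hypothetical, and it assumes version strings shaped like the ProcVersionSignature line in the report.

```python
import re

# Release named in the comment above as containing the patch.
FIXED_IN = "4.4.0-35.54"


def parse_version(text):
    """Extract e.g. (4, 4, 0, 31, 50) from 'Ubuntu 4.4.0-31.50-generic 4.4.13'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)-(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no Ubuntu kernel version found in {text!r}")
    return tuple(int(part) for part in match.groups())


def has_fix(signature):
    """Return True if the version in `signature` is at or after FIXED_IN."""
    # Tuples compare element by element, so 31.50 sorts before 35.54.
    return parse_version(signature) >= parse_version(FIXED_IN)
```

With the signature from the report, `has_fix("Ubuntu 4.4.0-31.50-generic 4.4.13")` evaluates to false, while the installed package version 4.4.0-38.57 compares as newer. The comparison is only meaningful within the 4.4.0 series.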

Sam Stoelinga (sammiestoel) wrote :

I think I'm seeing the same issue, but not sure. Let me know if I should file a new bug:

```
[31816.378948] bad: scheduling from the idle thread!
[31816.378950] CPU: 19 PID: 0 Comm: swapper/19 Tainted: G W OEL 4.4.0-21-generic #37-Ubuntu
[31816.378950] Hardware name: Supermicro SYS-F618R2-RC0PT+/X10DRFR-NT, BIOS 2.0 01/27/2016
[31816.378952] 0000000000000286 43008b208dbadff6 ffff882fa6283e18 ffffffff813e93c3
[31816.378953] ffff885fbec56d00 0000000000000000 ffff882fa6283e30 ffffffff810b1d2c
[31816.378954] ffff885fbec56d00 ffff882fa6283e58 ffffffff810aacc1 00000001ffffff10
[31816.378954] Call Trace:
[31816.378956] [<ffffffff813e93c3>] dump_stack+0x63/0x90
[31816.378958] [<ffffffff810b1d2c>] dequeue_task_idle+0x2c/0x40
[31816.378959] [<ffffffff810aacc1>] deactivate_task+0x81/0xa0
[31816.378961] [<ffffffff8181ff7f>] __schedule+0x5cf/0xa10
[31816.378961] [<ffffffff818203f5>] schedule+0x35/0x80
[31816.378962] [<ffffffff8182069e>] schedule_preempt_disabled+0xe/0x10
[31816.378964] [<ffffffff810c3f11>] cpu_startup_entry+0x191/0x350
[31816.378965] [<ffffffff810516f4>] start_secondary+0x154/0x190
```
