Comment 19 for bug 1966870

Christian Ehrhardt  (paelzer) wrote :

Since the last frame in the crash is
  ? blk_mq_exit_hctx+0x160/0x160
I checked whether anything else was going on with block devices.
I found another trace right at boot/init time: a kernel WARNING in blk_mq_release, raised while modprobe was loading the nbd module (this one is also in the attached currentDmesg.txt).

[ 537.566942] ------------[ cut here ]------------
[ 537.566946] WARNING: CPU: 7 PID: 2421 at block/blk-mq.c:3087 blk_mq_release+0x45/0xe0
[ 537.566958] Modules linked in: nbd(+) xt_comment zfs(PO) zunicode(PO) zzstd(O) zlua(O) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) vhost_vsock vmw_vsock_virtio_transport_common vhost vhost_iotlb vsock xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp ip6table_mangle ip6table_nat iptable_mangle iptable_nat nf_tables ip6table_filter ip6_tables iptable_filter bpfilter bridge stp llc nfnetlink_cttimeout nfnetlink openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 uio_pci_generic uio nls_iso8859_1 rpcrdma sunrpc rdma_ucm ib_iser libiscsi scsi_transport_iscsi ib_umad rdma_cm ib_ipoib iw_cm ib_cm dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua ipmi_ssif intel_rapl_msr intel_rapl_common sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm rapl intel_cstate efi_pstore ioatdma hpilo acpi_ipmi ipmi_si acpi_tad mac_hid acpi_power_meter sch_fq_codel ipmi_devintf ipmi_msghandler msr ip_tables x_tables autofs4 btrfs
[ 537.567073] blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear mlx5_ib ib_uverbs ib_core ses enclosure mgag200 i2c_algo_bit crct10dif_pclmul crc32_pclmul ghash_clmulni_intel drm_kms_helper syscopyarea sysfillrect aesni_intel sysimgblt mlx5_core fb_sys_fops pci_hyperv_intf crypto_simd ixgbe cec psample cryptd xfrm_algo nvme i2c_i801 rc_core mlxfw hpsa xhci_pci dca drm lpc_ich i2c_smbus tg3 xhci_pci_renesas nvme_core tls mdio scsi_transport_sas wmi
[ 537.567146] CPU: 7 PID: 2421 Comm: modprobe Tainted: P O 5.13.0-27-generic #29~20.04.1-Ubuntu
[ 537.567151] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS P89 01/22/2018
[ 537.567154] RIP: 0010:blk_mq_release+0x45/0xe0
[ 537.567159] Code: 48 31 d2 eb 07 83 c2 01 39 f2 74 27 48 63 c2 48 8b 04 c7 48 85 c0 74 ed 48 8b 88 30 02 00 00 48 05 30 02 00 00 48 39 c1 75 db <0f> 0b 83 c2 01 39 f2 75 d9 49 8b 84 24 90 05 00 00 49 8d 9c 24 90
[ 537.567163] RSP: 0018:ffffc287c17f3a58 EFLAGS: 00010246
[ 537.567167] RAX: ffff9f2611f50230 RBX: ffff9f2609bfefa0 RCX: ffff9f2611f50230
[ 537.567170] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff9f261e60ef38
[ 537.567173] RBP: ffffc287c17f3a70 R08: 0000000000000004 R09: 000000000000002c
[ 537.567175] R10: ffff9f2601c07800 R11: 00000000000001b6 R12: ffff9f2609bfef20
[ 537.567178] R13: ffff9f2609bfef20 R14: ffff9f2609bfefa0 R15: 0000000000000000
[ 537.567181] FS: 00007f1728b36680(0000) GS:ffff9f2d5fbc0000(0000) knlGS:0000000000000000
[ 537.567185] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 537.567188] CR2: 00007f814132242d CR3: 0000000119bf4003 CR4: 00000000001706e0
[ 537.567191] Call Trace:
[ 537.567197] blk_release_queue+0xbc/0x140
[ 537.567203] kobject_release+0x4b/0x160
[ 537.567211] kobject_put+0x49/0x60
[ 537.567215] blk_put_queue+0x12/0x20
[ 537.567224] disk_release+0x68/0x90
[ 537.567231] device_release+0x3b/0xa0
[ 537.567239] kobject_release+0x4b/0x160
[ 537.567243] kobject_put+0x49/0x60
[ 537.567247] put_device+0x13/0x20
[ 537.567252] put_disk+0x1b/0x20
[ 537.567258] nbd_dev_add+0x259/0x2b0 [nbd]
[ 537.567275] nbd_init+0x11a/0x1000 [nbd]
[ 537.567284] ? 0xffffffffc0ffd000
[ 537.567287] do_one_initcall+0x48/0x1d0
[ 537.567298] ? __cond_resched+0x19/0x30
[ 537.567308] ? kmem_cache_alloc_trace+0x37c/0x440
[ 537.567319] do_init_module+0x62/0x260
[ 537.567325] load_module+0x125d/0x1440
[ 537.567332] __do_sys_finit_module+0xc2/0x120
[ 537.567337] ? __do_sys_finit_module+0xc2/0x120
[ 537.567343] __x64_sys_finit_module+0x1a/0x20
[ 537.567347] do_syscall_64+0x61/0xb0
[ 537.567354] ? exit_to_user_mode_prepare+0x3d/0x1c0
[ 537.567362] ? syscall_exit_to_user_mode+0x27/0x50
[ 537.567367] ? __x64_sys_newfstat+0x16/0x20
[ 537.567373] ? do_syscall_64+0x6e/0xb0
[ 537.567379] ? asm_exc_page_fault+0x8/0x30
[ 537.567385] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 537.567390] RIP: 0033:0x7f1728c7876d
[ 537.567394] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d f3 36 0d 00 f7 d8 64 89 01 48
[ 537.567398] RSP: 002b:00007ffcfaa186a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[ 537.567402] RAX: ffffffffffffffda RBX: 0000559308b38000 RCX: 00007f1728c7876d
[ 537.567405] RDX: 0000000000000000 RSI: 0000559307d34358 RDI: 0000000000000003
[ 537.567407] RBP: 0000000000060000 R08: 0000000000000000 R09: 0000000000000000
[ 537.567409] R10: 0000000000000003 R11: 0000000000000246 R12: 0000559307d34358
[ 537.567411] R13: 0000000000000000 R14: 0000559308b38110 R15: 0000559308b38000
[ 537.567416] ---[ end trace a8e7ceefa2a2e201 ]---

So maybe the later crash (the one ending in blk_mq_exit_hctx) is a follow-on consequence of this earlier one?
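
For anyone who wants to check a dmesg capture such as currentDmesg.txt for further traces of this kind, here is a minimal Python sketch. It assumes the timestamped line format shown above; the helper names (split_traces, block_frames) and the embedded sample are only for illustration and are not part of any existing tool.

```python
import re

# Matches a stack frame such as "[ 537.567197] blk_release_queue+0xbc/0x140".
# A leading "? " marks a frame the unwinder considers unreliable.
FRAME_RE = re.compile(
    r"^\[\s*\d+\.\d+\]\s+(\? )?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def split_traces(lines):
    """Yield the lines between each 'cut here' and 'end trace' marker."""
    block = None
    for line in lines:
        if "[ cut here ]" in line:
            block = []
        elif block is not None:
            if "end trace" in line:
                yield block
                block = None
            else:
                block.append(line)


def block_frames(trace):
    """Return (symbol, reliable) pairs for the stack frames in one trace."""
    frames = []
    for line in trace:
        match = FRAME_RE.match(line.strip())
        if match:
            frames.append((match.group(2), match.group(1) is None))
    return frames


# A few lines copied from the trace above, used as sample input.
SAMPLE = """\
[ 537.566942] ------------[ cut here ]------------
[ 537.566946] WARNING: CPU: 7 PID: 2421 at block/blk-mq.c:3087 blk_mq_release+0x45/0xe0
[ 537.567197] blk_release_queue+0xbc/0x140
[ 537.567258] nbd_dev_add+0x259/0x2b0 [nbd]
[ 537.567284] ? 0xffffffffc0ffd000
[ 537.567298] ? __cond_resched+0x19/0x30
[ 537.567416] ---[ end trace a8e7ceefa2a2e201 ]---
""".splitlines()

for trace in split_traces(SAMPLE):
    for symbol, reliable in block_frames(trace):
        print(symbol, "" if reliable else "(unreliable)")
```

To run it against a real capture, replace SAMPLE with the lines of the file, for example open("currentDmesg.txt").read().splitlines(). Register dumps, the WARNING header and bare addresses are skipped, since they do not have the symbol+offset/size form.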