Kernel panic - kernel NULL pointer dereference; RIP is at blk_mq_put_rq_ref+0xa/0x60

Bug #1948471 reported by Maarten van den Berg
This bug affects 3 people
Affects: linux-meta-gcp-5.11 (Ubuntu)
Status: Confirmed
Importance: Undecided
Assigned to: Unassigned

Bug Description

We have recently seen seemingly random kernel panics on one of our instances in Google Compute Engine, running the `linux-gcp-5.11` kernel on Ubuntu 20.04.

The crashes happened weeks apart and appear to occur at random; we have not found a way to trigger them consistently.

We've captured the crash logs of the two crashes we've seen so far:

## Crash 1

[731972.720559] BUG: kernel NULL pointer dereference, address: 0000000000000000
[731972.727963] #PF: supervisor instruction fetch in kernel mode
[731972.733844] #PF: error_code(0x0010) - not-present page
[731972.739243] PGD 147111067 P4D 147111067 PUD 10fd90067 PMD 0
[731972.745117] Oops: 0010 [#1] SMP NOPTI
[731972.748989] CPU: 6 PID: 146349 Comm: node_exporter Not tainted 5.11.0-1018-gcp #20~20.04.2-Ubuntu
[731972.758078] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
[731972.767508] RIP: 0010:0x0
[731972.770345] Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6.
[731972.777862] RSP: 0018:ffffb971c105fb00 EFLAGS: 00010246
[731972.783298] RAX: 0000000000000000 RBX: ffffb971c105fb90 RCX: 0000000000000002
[731972.790643] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff9e4a8305dc00
[731972.797984] RBP: ffffb971c105fb08 R08: 0000000000000000 R09: 000000000000003b
[731972.805328] R10: ffff9e4aa8ac2000 R11: ffff9e4aa8ac13b2 R12: ffff9e4a8305dc00
[731972.812672] R13: ffff9e4a8305d000 R14: 0000000000000000 R15: 0000000000000001
[731972.820016] FS: 00007fb07dffb700(0000) GS:ffff9e693fb80000(0000) knlGS:0000000000000000
[731972.829050] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[731972.835006] CR2: ffffffffffffffd6 CR3: 00000001115a8000 CR4: 0000000000350ee0
[731972.843919] Call Trace:
[731972.846571] blk_mq_put_rq_ref+0x47/0x60
[731972.850710] bt_iter+0x54/0x90
[731972.854002] blk_mq_queue_tag_busy_iter+0x18b/0x2d0
[731972.859104] ? blk_mq_hctx_mark_pending+0x70/0x70
[731972.864026] ? blk_mq_hctx_mark_pending+0x70/0x70
[731972.868952] blk_mq_in_flight+0x38/0x60
[731972.873355] diskstats_show+0x75/0x2b0
[731972.877322] seq_read_iter+0x2a3/0x450
[731972.881288] proc_reg_read_iter+0x5e/0x80
[731972.885499] new_sync_read+0x110/0x1a0
[731972.890119] vfs_read+0x154/0x1b0
[731972.893647] ksys_read+0x67/0xe0
[731972.897092] __x64_sys_read+0x1a/0x20
[731972.900969] do_syscall_64+0x38/0x90
[731972.904870] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[731972.910124] RIP: 0033:0x4bd01b
[731972.913381] Code: fb ff eb bd e8 e6 22 fb ff e9 61 ff ff ff cc e8 5b f1 fa ff 48 8b 7c 24 10 48 8b 74 24 18 48 8b 54 24 20 48 8b 44 24 08 0f 05 <48> 3d 01 f0 ff ff 76 20 48 c7 44 24 28 ff ff ff ff 48 c7 44 24 30
[731972.933139] RSP: 002b:000000c0005797f0 EFLAGS: 00000206 ORIG_RAX: 0000000000000000
[731972.940915] RAX: ffffffffffffffda RBX: 000000c000045800 RCX: 00000000004bd01b
[731972.948249] RDX: 0000000000001000 RSI: 000000c00087a000 RDI: 0000000000000008
[731972.955586] RBP: 000000c000579840 R08: 0000000000000001 R09: 0000000000000002
[731972.963040] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
[731972.970378] R13: 0000000000000002 R14: 0000000000000002 R15: 0000000000000002
[731972.978074] Modules linked in: nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd virtio_net glue_helper net_failover input_leds failover psmouse serio_raw efi_pstore sch_fq_codel msr drm virtio_rng ip_tables x_tables autofs4
[731973.006538] CR2: 0000000000000000
[731973.010064] ---[ end trace a489911d719de581 ]---
[731973.135811] RIP: 0010:0x0
[731973.138658] Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6.
[731973.145731] RSP: 0018:ffffb971c105fb00 EFLAGS: 00010246
[731973.151331] RAX: 0000000000000000 RBX: ffffb971c105fb90 RCX: 0000000000000002
[731973.158961] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff9e4a8305dc00
[731973.166296] RBP: ffffb971c105fb08 R08: 0000000000000000 R09: 000000000000003b
[731973.173718] R10: ffff9e4aa8ac2000 R11: ffff9e4aa8ac13b2 R12: ffff9e4a8305dc00
[731973.181228] R13: ffff9e4a8305d000 R14: 0000000000000000 R15: 0000000000000001
[731973.189008] FS: 00007fb07dffb700(0000) GS:ffff9e693fb80000(0000) knlGS:0000000000000000
[731973.197297] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[731973.203242] CR2: ffffffffffffffd6 CR3: 00000001115a8000 CR4: 0000000000350ee0
[731973.210841] Kernel panic - not syncing: Fatal exception
[731973.222020] Kernel Offset: 0x1a200000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[731973.361494] Rebooting in 10 seconds..
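
The call trace above enters the kernel through a `read()` syscall issued by `node_exporter` and goes via `seq_read_iter` into `diskstats_show`, which suggests the process was reading `/proc/diskstats` when the fault hit. The sketch below only illustrates that read pattern. It is not a confirmed reproducer, and the function names, iteration count and interval are illustrative assumptions rather than anything taken from `node_exporter`.

```python
import time


def count_diskstats_lines(path="/proc/diskstats"):
    """Open `path`, read it fully once, and return the number of lines."""
    with open(path, "r") as f:
        return len(f.read().splitlines())


def poll_diskstats(iterations=10, interval=1.0, path="/proc/diskstats"):
    """Re-open and read `path` repeatedly, roughly as a metrics scraper would.

    Returns the line count seen on each pass. Whether this ever hits the
    panic is unknown; the crashes in this report were weeks apart.
    """
    counts = []
    for _ in range(iterations):
        counts.append(count_diskstats_lines(path))
        if interval > 0:
            time.sleep(interval)
    return counts


if __name__ == "__main__":
    print(poll_diskstats(iterations=5, interval=0.5))
```

According to the trace, each such read walks the in-flight request tags (`blk_mq_in_flight` and `blk_mq_queue_tag_busy_iter`), so any trigger presumably also needs concurrent block I/O on the same machine.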

## Crash 2

[1785249.917228] BUG: kernel NULL pointer dereference, address: 00000000000000c0
[1785249.924620] #PF: supervisor read access in kernel mode
[1785249.930041] #PF: error_code(0x0000) - not-present page
[1785249.935458] PGD 109bb2067 P4D 109bb2067 PUD 106a5d067 PMD 0
[1785249.941399] Oops: 0000 [#1] SMP NOPTI
[1785249.945339] CPU: 14 PID: 137099 Comm: node_exporter Not tainted 5.11.0-1020-gcp #22~20.04.1-Ubuntu
[1785249.954576] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
[1785249.964162] RIP: 0010:blk_mq_put_rq_ref+0xa/0x60
[1785249.970037] Code: 15 0f b6 d3 4c 89 e7 be 01 00 00 00 e8 cf fe ff ff 5b 41 5c 5d c3 0f 0b 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 8b 47 10 <48> 8b 80 c0 00 00 00 48 89 e5 48 3b 78 40 74 1f 4c 8d 87 e8 00 00
[1785249.989188] RSP: 0018:ffffb65081bcbb08 EFLAGS: 00010207
[1785249.994695] RAX: 0000000000000000 RBX: ffffb65081bcbb90 RCX: 0000000000000002
[1785250.002125] RDX: 0000000000000001 RSI: 0000000000000202 RDI: ffff8b74c2e42000
[1785250.009567] RBP: ffffb65081bcbb40 R08: 0000000000000000 R09: 0000000000000015
[1785250.017084] R10: 0000000000000364 R11: 0000000000000008 R12: ffff8b74c2e42000
[1785250.024946] R13: ffff8b74c2e41000 R14: 0000000000000000 R15: 0000000000000001
[1785250.032362] FS: 00007f7ce5ffb700(0000) GS:ffff8b937fd80000(0000) knlGS:0000000000000000
[1785250.040996] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[1785250.047453] CR2: 00000000000000c0 CR3: 00000001081d8000 CR4: 0000000000350ee0
[1785250.054973] Call Trace:
[1785250.057703] ? bt_iter+0x54/0x90
[1785250.061215] blk_mq_queue_tag_busy_iter+0x18b/0x2d0
[1785250.066461] ? blk_mq_hctx_mark_pending+0x70/0x70
[1785250.071967] ? blk_mq_hctx_mark_pending+0x70/0x70
[1785250.076958] blk_mq_in_flight+0x38/0x60
[1785250.081096] diskstats_show+0x75/0x2b0
[1785250.085145] seq_read_iter+0x2a3/0x450
[1785250.089271] proc_reg_read_iter+0x5e/0x80
[1785250.093648] new_sync_read+0x110/0x1a0
[1785250.097766] vfs_read+0x154/0x1b0
[1785250.101363] ksys_read+0x67/0xe0
[1785250.104887] __x64_sys_read+0x1a/0x20
[1785250.108829] do_syscall_64+0x38/0x90
[1785250.112772] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[1785250.118107] RIP: 0033:0x4bd01b
[1785250.121916] Code: fb ff eb bd e8 e6 22 fb ff e9 61 ff ff ff cc e8 5b f1 fa ff 48 8b 7c 24 10 48 8b 74 24 18 48 8b 54 24 20 48 8b 44 24 08 0f 05 <48> 3d 01 f0 ff ff 76 20 48 c7 44 24 28 ff ff ff ff 48 c7 44 24 30
[1785250.141048] RSP: 002b:000000c00007a7f0 EFLAGS: 00000206 ORIG_RAX: 0000000000000000
[1785250.148896] RAX: ffffffffffffffda RBX: 000000c000045800 RCX: 00000000004bd01b
[1785250.156309] RDX: 0000000000001000 RSI: 000000c00094d000 RDI: 0000000000000009
[1785250.163726] RBP: 000000c00007a840 R08: 0000000000000001 R09: 0000000000000002
[1785250.171401] R10: 0000000000000000 R11: 0000000000000206 R12: ffffffffffffffff
[1785250.178834] R13: 0000000000000002 R14: 0000000000000001 R15: 0000000000000002
[1785250.186425] Modules linked in: tcp_diag inet_diag btrfs blake2b_generic xor raid6_pq ufs msdos xfs nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua crct10dif_pclmul crc32_pclmul ghash_clmulni_intel input_leds aesni_intel virtio_net crypto_simd cryptd net_failover psmouse glue_helper failover serio_raw efi_pstore sch_fq_codel msr drm virtio_rng ip_tables x_tables autofs4
[1785250.221208] CR2: 00000000000000c0
[1785250.225329] ---[ end trace 2202978dadca9c1d ]---
[1785250.385966] RIP: 0010:blk_mq_put_rq_ref+0xa/0x60
[1785250.390984] Code: 15 0f b6 d3 4c 89 e7 be 01 00 00 00 e8 cf fe ff ff 5b 41 5c 5d c3 0f 0b 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 8b 47 10 <48> 8b 80 c0 00 00 00 48 89 e5 48 3b 78 40 74 1f 4c 8d 87 e8 00 00
[1785250.410204] RSP: 0018:ffffb65081bcbb08 EFLAGS: 00010207
[1785250.415707] RAX: 0000000000000000 RBX: ffffb65081bcbb90 RCX: 0000000000000002
[1785250.423124] RDX: 0000000000000001 RSI: 0000000000000202 RDI: ffff8b74c2e42000
[1785250.430745] RBP: ffffb65081bcbb40 R08: 0000000000000000 R09: 0000000000000015
[1785250.438422] R10: 0000000000000364 R11: 0000000000000008 R12: ffff8b74c2e42000
[1785250.445835] R13: ffff8b74c2e41000 R14: 0000000000000000 R15: 0000000000000001
[1785250.453424] FS: 00007f7ce5ffb700(0000) GS:ffff8b937fd80000(0000) knlGS:0000000000000000
[1785250.461888] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[1785250.467914] CR2: 00000000000000c0 CR3: 00000001081d8000 CR4: 0000000000350ee0
[1785250.475347] Kernel panic - not syncing: Fatal exception
[1785250.481954] Kernel Offset: 0x12200000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[1785250.589751] Rebooting in 10 seconds..
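
As a cross-check of the Crash 2 log, the faulting instruction can be decoded from the `Code:` line. The byte marked `<48>` starts the sequence `48 8b 80 c0 00 00 00`, which on x86-64 is `mov rax, [rax+0xc0]`. With `RAX` equal to 0, that read targets address `0xc0`, matching `CR2`. The sketch below does this arithmetic. It handles only this one encoding, and the helper names are illustrative.

```python
def marked_instruction_bytes(code_line):
    """Return the bytes from the <xx> marker to the end of an oops 'Code:' line."""
    tokens = code_line.split()
    start = next(i for i, t in enumerate(tokens) if t.startswith("<"))
    return bytes(int(t.strip("<>"), 16) for t in tokens[start:])


def fault_address(insn, rax):
    """Decode `48 8b 80 disp32` (mov rax, [rax+disp32]) and return rax + disp32.

    Any other encoding is rejected; this is not a general disassembler.
    """
    if insn[:3] != b"\x48\x8b\x80":
        raise ValueError("unsupported instruction encoding")
    disp = int.from_bytes(insn[3:7], "little", signed=True)
    return (rax + disp) & 0xFFFFFFFFFFFFFFFF


# Tail of the Code: line from Crash 2, copied from the log above.
CODE = "0f 1f 44 00 00 55 48 8b 47 10 <48> 8b 80 c0 00 00 00 48 89 e5 48 3b 78 40"

insn = marked_instruction_bytes(CODE)
addr = fault_address(insn, rax=0x0)  # RAX is 0 in the register dump
print(hex(addr))  # 0xc0, the same value as CR2
```

The four bytes just before the marker, `48 8b 47 10`, decode to `mov rax, [rdi+0x10]`, so the NULL value appears to have been loaded from offset 0x10 of the object that `RDI` points to. Which structure field that corresponds to depends on the kernel build and is not established by the log itself.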

ProblemType: Bug
DistroRelease: Ubuntu 20.04
Package: linux-gcp 5.11.0.1021.23~20.04.20
ProcVersionSignature: Ubuntu 5.11.0-1021.23~20.04.1-gcp 5.11.22
Uname: Linux 5.11.0-1021-gcp x86_64
ApportVersion: 2.20.11-0ubuntu27.20
Architecture: amd64
CasperMD5CheckResult: skip
Date: Fri Oct 22 12:13:34 2021
SourcePackage: linux-meta-gcp-5.11
UpgradeStatus: No upgrade log present (probably fresh install)

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in linux-meta-gcp-5.11 (Ubuntu):
status: New → Confirmed