Kernel traces leading to crash - refcount_t: underflow; use-after-free and refcount_t: saturated; leaking memory -- lib/refcount.c

Bug #2051123 reported by Christian Rohmann
This bug affects 3 people
Affects: linux-hwe-6.5 (Ubuntu)
Status: Confirmed
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

A few hours after upgrading a machine serving as a VM hypervisor (OpenStack Nova + libvirt) from Linux kernel 6.2.0-37-generic to 6.5.0-14-generic, we observed kernel traces followed by rapid degradation of the system and its processes.

While the TCP connection itself was accepted, we were unable to log in via SSH anymore or use the console.
A hard reset was required to get the machine back up. We went back to the former HWE kernel version, 6.2.0-37-generic, and have not observed any issues since.

The full kernel log from boot to the crash is attached. The excerpt below shows where the issues started:

```
[...]
Jan 23 11:36:13 fra-az1-comp-21 kernel: vxlan-304: fa:16:3e:7f:e2:6f migrated from 10.101.11.98 to 10.101.11.101
Jan 23 11:41:05 fra-az1-comp-21 kernel: hrtimer: interrupt took 32482 ns
Jan 23 12:02:41 fra-az1-comp-21 kernel: clocksource: timekeeping watchdog on CPU50: hpet wd-wd read-back delay of 245561ns
Jan 23 12:02:41 fra-az1-comp-21 kernel: clocksource: wd-tsc-wd read-back delay of 245561ns, clock-skew test skipped!
Jan 23 12:18:18 fra-az1-comp-21 kernel: perf: interrupt took too long (2509 > 2500), lowering kernel.perf_event_max_sample_rate to 79500
Jan 23 12:44:35 fra-az1-comp-21 kernel: clocksource: timekeeping watchdog on CPU127: hpet wd-wd read-back delay of 244863ns
Jan 23 12:44:35 fra-az1-comp-21 kernel: clocksource: wd-tsc-wd read-back delay of 245352ns, clock-skew test skipped!
Jan 23 13:08:56 fra-az1-comp-21 kernel: clocksource: timekeeping watchdog on CPU16: hpet wd-wd read-back delay of 243257ns
Jan 23 13:08:56 fra-az1-comp-21 kernel: clocksource: wd-tsc-wd read-back delay of 247517ns, clock-skew test skipped!
Jan 23 14:13:58 fra-az1-comp-21 kernel: clocksource: timekeeping watchdog on CPU52: hpet wd-wd read-back delay of 248076ns
Jan 23 14:13:58 fra-az1-comp-21 kernel: clocksource: wd-tsc-wd read-back delay of 245142ns, clock-skew test skipped!
Jan 23 14:31:18 fra-az1-comp-21 kernel: clocksource: timekeeping watchdog on CPU124: hpet wd-wd read-back delay of 245073ns
Jan 23 14:31:18 fra-az1-comp-21 kernel: clocksource: wd-tsc-wd read-back delay of 231034ns, clock-skew test skipped!
Jan 23 15:13:52 fra-az1-comp-21 kernel: clocksource: timekeeping watchdog on CPU24: hpet wd-wd read-back delay of 244863ns
Jan 23 15:13:52 fra-az1-comp-21 kernel: clocksource: wd-tsc-wd read-back delay of 245701ns, clock-skew test skipped!
Jan 23 15:35:18 fra-az1-comp-21 kernel: clocksource: timekeeping watchdog on CPU76: hpet wd-wd read-back delay of 245282ns
Jan 23 15:35:18 fra-az1-comp-21 kernel: clocksource: wd-tsc-wd read-back delay of 245841ns, clock-skew test skipped!
Jan 23 16:10:49 fra-az1-comp-21 kernel: clocksource: timekeeping watchdog on CPU27: hpet wd-wd read-back delay of 244653ns
Jan 23 16:10:49 fra-az1-comp-21 kernel: clocksource: wd-tsc-wd read-back delay of 245980ns, clock-skew test skipped!
Jan 23 16:12:49 fra-az1-comp-21 kernel: workqueue: drain_vmap_area_work hogged CPU for >10000us 4 times, consider switching to WQ_UNBOUND
Jan 23 16:20:56 fra-az1-comp-21 kernel: clocksource: timekeeping watchdog on CPU0: hpet wd-wd read-back delay of 242907ns
Jan 23 16:20:56 fra-az1-comp-21 kernel: clocksource: wd-tsc-wd read-back delay of 247796ns, clock-skew test skipped!
Jan 23 16:25:04 fra-az1-comp-21 kernel: ------------[ cut here ]------------
Jan 23 16:25:04 fra-az1-comp-21 kernel: refcount_t: underflow; use-after-free.
Jan 23 16:25:04 fra-az1-comp-21 kernel: WARNING: CPU: 84 PID: 7072 at lib/refcount.c:28 refcount_warn_saturate+0xa3/0x150
Jan 23 16:25:04 fra-az1-comp-21 kernel: Modules linked in: xt_multiport ebt_arp nft_meta_bridge xt_CT xt_mac xt_set xt_state ip_set_hash_net ip_set vhost_net vhost vhost_iotlb tap xt_policy xt_REDIRECT xt_nat xt_connmark xt_mark vxlan ip6_udp_tunnel udp_tunnel xt_comment xt_physdev veth xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink xfrm_user xfrm_algo nvme_fabrics 8021q garp mrp br_netfilter bridge stp llc bonding binfmt_misc tls nls_ascii ipmi_ssif intel_rapl_msr intel_rapl_common amd64_edac edac_mce_amd kvm_amd kvm irqbypass rapl wmi_bmof irdma ib_uverbs ib_core joydev input_leds ccp k10temp ptdma switchtec acpi_ipmi ipmi_si ipmi_devintf ipmi_msghandler mac_hid sch_fq_codel efi_pstore ip_tables x_tables autofs4 dm_crypt raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid0 multipath linear hid_generic usbhid hid cdc_ether usbnet mii ast i2c_algo_bit drm_shmem_helper
Jan 23 16:25:04 fra-az1-comp-21 kernel: crct10dif_pclmul raid1 crc32_pclmul drm_kms_helper ice polyval_clmulni polyval_generic ghash_clmulni_intel aesni_intel crypto_simd cryptd ahci nvme gnss drm i40e libahci xhci_pci i2c_piix4 xhci_pci_renesas nvme_core nvme_common wmi
Jan 23 16:25:04 fra-az1-comp-21 kernel: CPU: 84 PID: 7072 Comm: nova-compute Not tainted 6.5.0-14-generic #14~22.04.1-Ubuntu
Jan 23 16:25:04 fra-az1-comp-21 kernel: Hardware name: ASUSTeK COMPUTER INC. RS720A-E11-RS24U/KMPP-D32 Series, BIOS 1501 08/23/2023
Jan 23 16:25:04 fra-az1-comp-21 kernel: RIP: 0010:refcount_warn_saturate+0xa3/0x150
Jan 23 16:25:04 fra-az1-comp-21 kernel: Code: 93 00 0f b6 1d 69 60 dd 01 80 fb 01 0f 87 33 d6 8d 00 83 e3 01 75 dd 48 c7 c7 68 97 7c ac c6 05 4d 60 dd 01 01 e8 cd e7 8f ff <0f> 0b eb c6 0f b6 1d 40 60 dd 01 80 fb 01 0f 87 f3 d5 8d 00 83 e3
Jan 23 16:25:04 fra-az1-comp-21 kernel: RSP: 0018:ffffab7ed6c03b90 EFLAGS: 00010246
Jan 23 16:25:04 fra-az1-comp-21 kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
Jan 23 16:25:04 fra-az1-comp-21 kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
Jan 23 16:25:04 fra-az1-comp-21 kernel: RBP: ffffab7ed6c03b98 R08: 0000000000000000 R09: 0000000000000000
Jan 23 16:25:04 fra-az1-comp-21 kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
Jan 23 16:25:04 fra-az1-comp-21 kernel: R13: ffff9b3fae203ec0 R14: 0000000000000002 R15: ffff9a4ac6aefa00
Jan 23 16:25:04 fra-az1-comp-21 kernel: FS: 00007f6916504000(0000) GS:ffff9b3e5ed00000(0000) knlGS:0000000000000000
Jan 23 16:25:04 fra-az1-comp-21 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 23 16:25:04 fra-az1-comp-21 kernel: CR2: 00007f6915aea250 CR3: 000001008f468006 CR4: 0000000000770ee0
Jan 23 16:25:04 fra-az1-comp-21 kernel: PKRU: 55555554
Jan 23 16:25:04 fra-az1-comp-21 kernel: Call Trace:
Jan 23 16:25:04 fra-az1-comp-21 kernel: <TASK>
Jan 23 16:25:04 fra-az1-comp-21 kernel: ? show_regs+0x6d/0x80
Jan 23 16:25:04 fra-az1-comp-21 kernel: ? __warn+0x89/0x160
Jan 23 16:25:04 fra-az1-comp-21 kernel: ? refcount_warn_saturate+0xa3/0x150
Jan 23 16:25:04 fra-az1-comp-21 kernel: ? report_bug+0x17e/0x1b0
Jan 23 16:25:04 fra-az1-comp-21 kernel: ? srso_alias_return_thunk+0x5/0x7f
Jan 23 16:25:04 fra-az1-comp-21 kernel: ? handle_bug+0x46/0x90
Jan 23 16:25:04 fra-az1-comp-21 kernel: ? exc_invalid_op+0x18/0x80
Jan 23 16:25:04 fra-az1-comp-21 kernel: ? asm_exc_invalid_op+0x1b/0x20
Jan 23 16:25:04 fra-az1-comp-21 kernel: ? refcount_warn_saturate+0xa3/0x150
Jan 23 16:25:04 fra-az1-comp-21 kernel: __sock_wfree+0x6d/0x80
Jan 23 16:25:04 fra-az1-comp-21 kernel: skb_release_head_state+0x27/0xb0
Jan 23 16:25:04 fra-az1-comp-21 kernel: kfree_skb_list_reason+0x5e/0x260
Jan 23 16:25:04 fra-az1-comp-21 kernel: skb_release_data+0x14e/0x200
Jan 23 16:25:04 fra-az1-comp-21 kernel: __kfree_skb+0x2b/0x50
Jan 23 16:25:04 fra-az1-comp-21 kernel: __tcp_close+0x93/0x400
Jan 23 16:25:04 fra-az1-comp-21 kernel: tcp_close+0x24/0x90
Jan 23 16:25:04 fra-az1-comp-21 kernel: tls_sk_proto_close+0xfb/0x2d0 [tls]
Jan 23 16:25:04 fra-az1-comp-21 kernel: ? srso_alias_return_thunk+0x5/0x7f
Jan 23 16:25:04 fra-az1-comp-21 kernel: ? wp_page_reuse+0x67/0x90
Jan 23 16:25:04 fra-az1-comp-21 kernel: inet_release+0x44/0x90
Jan 23 16:25:04 fra-az1-comp-21 kernel: __sock_release+0x40/0xc0
Jan 23 16:25:04 fra-az1-comp-21 kernel: sock_close+0x15/0x30
Jan 23 16:25:04 fra-az1-comp-21 kernel: __fput+0xfc/0x2c0
Jan 23 16:25:04 fra-az1-comp-21 kernel: ____fput+0xe/0x20
Jan 23 16:25:04 fra-az1-comp-21 kernel: task_work_run+0x61/0xa0
Jan 23 16:25:04 fra-az1-comp-21 kernel: exit_to_user_mode_loop+0x100/0x130
Jan 23 16:25:04 fra-az1-comp-21 kernel: exit_to_user_mode_prepare+0xa5/0xb0
Jan 23 16:25:04 fra-az1-comp-21 kernel: syscall_exit_to_user_mode+0x29/0x60
Jan 23 16:25:04 fra-az1-comp-21 kernel: do_syscall_64+0x67/0x90
Jan 23 16:25:04 fra-az1-comp-21 kernel: ? srso_alias_return_thunk+0x5/0x7f
Jan 23 16:25:04 fra-az1-comp-21 kernel: ? do_user_addr_fault+0x17a/0x6b0
Jan 23 16:25:04 fra-az1-comp-21 kernel: ? srso_alias_return_thunk+0x5/0x7f
Jan 23 16:25:04 fra-az1-comp-21 kernel: ? exit_to_user_mode_prepare+0x30/0xb0
Jan 23 16:25:04 fra-az1-comp-21 kernel: ? srso_alias_return_thunk+0x5/0x7f
Jan 23 16:25:04 fra-az1-comp-21 kernel: ? irqentry_exit_to_user_mode+0x17/0x20
Jan 23 16:25:04 fra-az1-comp-21 kernel: ? srso_alias_return_thunk+0x5/0x7f
Jan 23 16:25:04 fra-az1-comp-21 kernel: ? irqentry_exit+0x43/0x50
Jan 23 16:25:04 fra-az1-comp-21 kernel: ? srso_alias_return_thunk+0x5/0x7f
Jan 23 16:25:04 fra-az1-comp-21 kernel: ? exc_page_fault+0x94/0x1b0
Jan 23 16:25:04 fra-az1-comp-21 kernel: entry_SYSCALL_64_after_hwframe+0x6e/0xd8
Jan 23 16:25:04 fra-az1-comp-21 kernel: RIP: 0033:0x7f6916314f8b
Jan 23 16:25:04 fra-az1-comp-21 kernel: Code: 03 00 00 00 0f 05 48 3d 00 f0 ff ff 77 41 c3 48 83 ec 18 89 7c 24 0c e8 73 ba f7 ff 8b 7c 24 0c 41 89 c0 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 35 44 89 c7 89 44 24 0c e8 c1 ba f7 ff 8b 44
Jan 23 16:25:04 fra-az1-comp-21 kernel: RSP: 002b:00007ffe93a5edf0 EFLAGS: 00000293 ORIG_RAX: 0000000000000003
Jan 23 16:25:04 fra-az1-comp-21 kernel: RAX: 0000000000000000 RBX: 00007f691125f8e0 RCX: 00007f6916314f8b
Jan 23 16:25:04 fra-az1-comp-21 kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000007
Jan 23 16:25:04 fra-az1-comp-21 kernel: RBP: 0000000000000007 R08: 0000000000000000 R09: 0000000000000000
Jan 23 16:25:04 fra-az1-comp-21 kernel: R10: 000055ab7a67e4d0 R11: 0000000000000293 R12: 000055ab7a69a520
Jan 23 16:25:04 fra-az1-comp-21 kernel: R13: 000055ab7c859610 R14: 0000000000000000 R15: 000055ab79249300
Jan 23 16:25:04 fra-az1-comp-21 kernel: </TASK>
Jan 23 16:25:04 fra-az1-comp-21 kernel: ---[ end trace 0000000000000000 ]---
Jan 23 16:25:04 fra-az1-comp-21 kernel: BUG: kernel NULL pointer dereference, address: 0000000000000008
Jan 23 16:25:04 fra-az1-comp-21 kernel: #PF: supervisor read access in kernel mode
Jan 23 16:25:04 fra-az1-comp-21 kernel: #PF: error_code(0x0000) - not-present page
Jan 23 16:25:04 fra-az1-comp-21 kernel: PGD 0 P4D 0
Jan 23 16:25:04 fra-az1-comp-21 kernel: Oops: 0000 [#1] PREEMPT SMP NOPTI
Jan 23 16:25:04 fra-az1-comp-21 kernel: CPU: 84 PID: 7072 Comm: nova-compute Tainted: G W 6.5.0-14-generic #14~22.04.1-Ubuntu
Jan 23 16:25:04 fra-az1-comp-21 kernel: Hardware name: ASUSTeK COMPUTER INC. RS720A-E11-RS24U/KMPP-D32 Series, BIOS 1501 08/23/2023
Jan 23 16:25:04 fra-az1-comp-21 kernel: RIP: 0010:skb_release_data+0x10e/0x200
Jan 23 16:25:04 fra-az1-comp-21 kernel: Code: 7e 48 83 fe 11 0f 87 ef 00 00 00 4c 89 e2 48 c1 e2 04 49 8b 5c 15 30 84 c0 79 0f 44 89 f6 48 89 df e8 36 d2 06 00 84 c0 75 c1 <48> 8b 43 08 a8 01 0f 85 b2 00 00 00 0f 1f 44 00 00 66 90 f0 ff 4b
Jan 23 16:25:04 fra-az1-comp-21 kernel: RSP: 0018:ffffab7ed6c03b90 EFLAGS: 00010246
Jan 23 16:25:04 fra-az1-comp-21 kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
Jan 23 16:25:04 fra-az1-comp-21 kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9b57e02aa000
Jan 23 16:25:04 fra-az1-comp-21 kernel: RBP: ffffab7ed6c03bc0 R08: 0000000000000000 R09: 0000000000000000
Jan 23 16:25:04 fra-az1-comp-21 kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
Jan 23 16:25:04 fra-az1-comp-21 kernel: R13: ffff9b74e4214240 R14: 0000000000000000 R15: ffff9b57e02aa000
Jan 23 16:25:04 fra-az1-comp-21 kernel: FS: 00007f6916504000(0000) GS:ffff9b3e5ed00000(0000) knlGS:0000000000000000
Jan 23 16:25:04 fra-az1-comp-21 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 23 16:25:04 fra-az1-comp-21 kernel: CR2: 0000000000000008 CR3: 000001008f468006 CR4: 0000000000770ee0
Jan 23 16:25:04 fra-az1-comp-21 kernel: PKRU: 55555554
Jan 23 16:25:04 fra-az1-comp-21 kernel: Call Trace:
Jan 23 16:25:04 fra-az1-comp-21 kernel: <TASK>
Jan 23 16:25:04 fra-az1-comp-21 kernel: ? show_regs+0x6d/0x80
Jan 23 16:25:04 fra-az1-comp-21 kernel: ? __die+0x24/0x80
Jan 23 16:25:04 fra-az1-comp-21 kernel: ? page_fault_oops+0x99/0x1b0
Jan 23 16:25:04 fra-az1-comp-21 kernel: ? do_user_addr_fault+0x31d/0x6b0
Jan 23 16:25:04 fra-az1-comp-21 kernel: ? exc_page_fault+0x83/0x1b0
Jan 23 16:25:04 fra-az1-comp-21 kernel: ? asm_exc_page_fault+0x27/0x30
Jan 23 16:25:04 fra-az1-comp-21 kernel: ? skb_release_data+0x10e/0x200
Jan 23 16:25:04 fra-az1-comp-21 kernel: kfree_skb_list_reason+0x75/0x260
Jan 23 16:25:04 fra-az1-comp-21 kernel: skb_release_data+0x14e/0x200
Jan 23 16:25:04 fra-az1-comp-21 kernel: __kfree_skb+0x2b/0x50
Jan 23 16:25:04 fra-az1-comp-21 kernel: __tcp_close+0x93/0x400
Jan 23 16:25:04 fra-az1-comp-21 kernel: tcp_close+0x24/0x90
Jan 23 16:25:04 fra-az1-comp-21 kernel: tls_sk_proto_close+0xfb/0x2d0 [tls]
Jan 23 16:25:04 fra-az1-comp-21 kernel: ? srso_alias_return_thunk+0x5/0x7f
Jan 23 16:25:04 fra-az1-comp-21 kernel: ? wp_page_reuse+0x67/0x90
Jan 23 16:25:04 fra-az1-comp-21 kernel: inet_release+0x44/0x90
Jan 23 16:25:04 fra-az1-comp-21 kernel: __sock_release+0x40/0xc0
Jan 23 16:25:04 fra-az1-comp-21 kernel: sock_close+0x15/0x30
Jan 23 16:25:04 fra-az1-comp-21 kernel: __fput+0xfc/0x2c0
Jan 23 16:25:04 fra-az1-comp-21 kernel: ____fput+0xe/0x20
Jan 23 16:25:04 fra-az1-comp-21 kernel: task_work_run+0x61/0xa0
Jan 23 16:25:04 fra-az1-comp-21 kernel: exit_to_user_mode_loop+0x100/0x130
Jan 23 16:25:04 fra-az1-comp-21 kernel: exit_to_user_mode_prepare+0xa5/0xb0
Jan 23 16:25:04 fra-az1-comp-21 kernel: syscall_exit_to_user_mode+0x29/0x60
Jan 23 16:25:04 fra-az1-comp-21 kernel: do_syscall_64+0x67/0x90
Jan 23 16:25:04 fra-az1-comp-21 kernel: ? srso_alias_return_thunk+0x5/0x7f
Jan 23 16:25:04 fra-az1-comp-21 kernel: ? do_user_addr_fault+0x17a/0x6b0
Jan 23 16:25:04 fra-az1-comp-21 kernel: ? srso_alias_return_thunk+0x5/0x7f
Jan 23 16:25:04 fra-az1-comp-21 kernel: ? exit_to_user_mode_prepare+0x30/0xb0
Jan 23 16:25:04 fra-az1-comp-21 kernel: ? srso_alias_return_thunk+0x5/0x7f
Jan 23 16:25:04 fra-az1-comp-21 kernel: ? irqentry_exit_to_user_mode+0x17/0x20
Jan 23 16:25:04 fra-az1-comp-21 kernel: ? srso_alias_return_thunk+0x5/0x7f
Jan 23 16:25:04 fra-az1-comp-21 kernel: ? irqentry_exit+0x43/0x50
Jan 23 16:25:04 fra-az1-comp-21 kernel: ? srso_alias_return_thunk+0x5/0x7f
Jan 23 16:25:04 fra-az1-comp-21 kernel: ? exc_page_fault+0x94/0x1b0
Jan 23 16:25:04 fra-az1-comp-21 kernel: entry_SYSCALL_64_after_hwframe+0x6e/0xd8
Jan 23 16:25:04 fra-az1-comp-21 kernel: RIP: 0033:0x7f6916314f8b
Jan 23 16:25:04 fra-az1-comp-21 kernel: Code: 03 00 00 00 0f 05 48 3d 00 f0 ff ff 77 41 c3 48 83 ec 18 89 7c 24 0c e8 73 ba f7 ff 8b 7c 24 0c 41 89 c0 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 35 44 89 c7 89 44 24 0c e8 c1 ba f7 ff 8b 44
Jan 23 16:25:04 fra-az1-comp-21 kernel: RSP: 002b:00007ffe93a5edf0 EFLAGS: 00000293 ORIG_RAX: 0000000000000003
Jan 23 16:25:04 fra-az1-comp-21 kernel: RAX: 0000000000000000 RBX: 00007f691125f8e0 RCX: 00007f6916314f8b
Jan 23 16:25:04 fra-az1-comp-21 kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000007
Jan 23 16:25:04 fra-az1-comp-21 kernel: RBP: 0000000000000007 R08: 0000000000000000 R09: 0000000000000000
Jan 23 16:25:04 fra-az1-comp-21 kernel: R10: 000055ab7a67e4d0 R11: 0000000000000293 R12: 000055ab7a69a520
Jan 23 16:25:04 fra-az1-comp-21 kernel: R13: 000055ab7c859610 R14: 0000000000000000 R15: 000055ab79249300
Jan 23 16:25:04 fra-az1-comp-21 kernel: </TASK>
Jan 23 16:25:04 fra-az1-comp-21 kernel: Modules linked in: xt_multiport ebt_arp nft_meta_bridge xt_CT xt_mac xt_set xt_state ip_set_hash_net ip_set vhost_net vhost vhost_iotlb tap xt_policy xt_REDIRECT xt_nat xt_connmark xt_mark vxlan ip6_udp_tunnel udp_tunnel xt_comment xt_physdev veth xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink xfrm_user xfrm_algo nvme_fabrics 8021q garp mrp br_netfilter bridge stp llc bonding binfmt_misc tls nls_ascii ipmi_ssif intel_rapl_msr intel_rapl_common amd64_edac edac_mce_amd kvm_amd kvm irqbypass rapl wmi_bmof irdma ib_uverbs ib_core joydev input_leds ccp k10temp ptdma switchtec acpi_ipmi ipmi_si ipmi_devintf ipmi_msghandler mac_hid sch_fq_codel efi_pstore ip_tables x_tables autofs4 dm_crypt raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid0 multipath linear hid_generic usbhid hid cdc_ether usbnet mii ast i2c_algo_bit drm_shmem_helper
Jan 23 16:25:04 fra-az1-comp-21 kernel: crct10dif_pclmul raid1 crc32_pclmul drm_kms_helper ice polyval_clmulni polyval_generic ghash_clmulni_intel aesni_intel crypto_simd cryptd ahci nvme gnss drm i40e libahci xhci_pci i2c_piix4 xhci_pci_renesas nvme_core nvme_common wmi
Jan 23 16:25:04 fra-az1-comp-21 kernel: CR2: 0000000000000008
Jan 23 16:25:04 fra-az1-comp-21 kernel: ---[ end trace 0000000000000000 ]---
Jan 23 16:25:04 fra-az1-comp-21 kernel: RIP: 0010:skb_release_data+0x10e/0x200
Jan 23 16:25:04 fra-az1-comp-21 kernel: Code: 7e 48 83 fe 11 0f 87 ef 00 00 00 4c 89 e2 48 c1 e2 04 49 8b 5c 15 30 84 c0 79 0f 44 89 f6 48 89 df e8 36 d2 06 00 84 c0 75 c1 <48> 8b 43 08 a8 01 0f 85 b2 00 00 00 0f 1f 44 00 00 66 90 f0 ff 4b
Jan 23 16:25:04 fra-az1-comp-21 kernel: RSP: 0018:ffffab7ed6c03b90 EFLAGS: 00010246
Jan 23 16:25:04 fra-az1-comp-21 kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
Jan 23 16:25:04 fra-az1-comp-21 kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9b57e02aa000
Jan 23 16:25:04 fra-az1-comp-21 kernel: RBP: ffffab7ed6c03bc0 R08: 0000000000000000 R09: 0000000000000000
Jan 23 16:25:04 fra-az1-comp-21 kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
Jan 23 16:25:04 fra-az1-comp-21 kernel: R13: ffff9b74e4214240 R14: 0000000000000000 R15: ffff9b57e02aa000
Jan 23 16:25:04 fra-az1-comp-21 kernel: FS: 00007f6916504000(0000) GS:ffff9b3e5ed00000(0000) knlGS:0000000000000000
Jan 23 16:25:04 fra-az1-comp-21 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 23 16:25:04 fra-az1-comp-21 kernel: CR2: 0000000000000008 CR3: 000001008f468006 CR4: 0000000000770ee0
Jan 23 16:25:04 fra-az1-comp-21 kernel: PKRU: 55555554
Jan 23 16:25:04 fra-az1-comp-21 kernel: note: nova-compute[7072] exited with irqs disabled
Jan 23 16:25:10 fra-az1-comp-21 kernel: ------------[ cut here ]------------
Jan 23 16:25:10 fra-az1-comp-21 kernel: refcount_t: saturated; leaking memory.
Jan 23 16:25:10 fra-az1-comp-21 kernel: WARNING: CPU: 64 PID: 8919 at lib/refcount.c:22 refcount_warn_saturate+0x148/0x150
Jan 23 16:25:10 fra-az1-comp-21 kernel: Modules linked in: xt_multiport ebt_arp nft_meta_bridge xt_CT xt_mac xt_set xt_state ip_set_hash_net ip_set vhost_net vhost vhost_iotlb tap xt_policy xt_REDIRECT xt_nat xt_connmark xt_mark vxlan ip6_udp_tunnel udp_tunnel xt_comment xt_physdev veth xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink xfrm_user xfrm_algo nvme_fabrics 8021q garp mrp br_netfilter bridge stp llc bonding binfmt_misc tls nls_ascii ipmi_ssif intel_rapl_msr intel_rapl_common amd64_edac edac_mce_amd kvm_amd kvm irqbypass rapl wmi_bmof irdma ib_uverbs ib_core joydev input_leds ccp k10temp ptdma switchtec acpi_ipmi ipmi_si ipmi_devintf ipmi_msghandler mac_hid sch_fq_codel efi_pstore ip_tables x_tables autofs4 dm_crypt raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid0 multipath linear hid_generic usbhid hid cdc_ether usbnet mii ast i2c_algo_bit drm_shmem_helper
Jan 23 16:25:10 fra-az1-comp-21 kernel: crct10dif_pclmul raid1 crc32_pclmul drm_kms_helper ice polyval_clmulni polyval_generic ghash_clmulni_intel aesni_intel crypto_simd cryptd ahci nvme gnss drm i40e libahci xhci_pci i2c_piix4 xhci_pci_renesas nvme_core nvme_common wmi
Jan 23 16:25:10 fra-az1-comp-21 kernel: CPU: 64 PID: 8919 Comm: msgr-worker-0 Tainted: G D W 6.5.0-14-generic #14~22.04.1-Ubuntu
Jan 23 16:25:10 fra-az1-comp-21 kernel: Hardware name: ASUSTeK COMPUTER INC. RS720A-E11-RS24U/KMPP-D32 Series, BIOS 1501 08/23/2023
Jan 23 16:25:10 fra-az1-comp-21 kernel: RIP: 0010:refcount_warn_saturate+0x148/0x150
Jan 23 16:25:10 fra-az1-comp-21 kernel: Code: 38 97 7c ac c6 05 c3 5f dd 01 01 e8 42 e7 8f ff 0f 0b e9 38 ff ff ff 48 c7 c7 10 97 7c ac c6 05 aa 5f dd 01 01 e8 28 e7 8f ff <0f> 0b e9 1e ff ff ff 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90
Jan 23 16:25:10 fra-az1-comp-21 kernel: RSP: 0018:ffffab7ed41ff900 EFLAGS: 00010246
Jan 23 16:25:10 fra-az1-comp-21 kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
Jan 23 16:25:10 fra-az1-comp-21 kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
Jan 23 16:25:10 fra-az1-comp-21 kernel: RBP: ffffab7ed41ff908 R08: 0000000000000000 R09: 0000000000000000
Jan 23 16:25:10 fra-az1-comp-21 kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffffab7ed41ff948
Jan 23 16:25:10 fra-az1-comp-21 kernel: R13: ffff9a83b49bbae8 R14: ffff9a83b49bbb10 R15: ffff9b3ff1a56c00
Jan 23 16:25:10 fra-az1-comp-21 kernel: FS: 00007fca43dfd640(0000) GS:ffff9b3e5e800000(0000) knlGS:0000000000000000
Jan 23 16:25:10 fra-az1-comp-21 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 23 16:25:10 fra-az1-comp-21 kernel: CR2: 0000560dce5bcfb0 CR3: 000001016af3e004 CR4: 0000000000770ee0
Jan 23 16:25:10 fra-az1-comp-21 kernel: PKRU: 55555554
Jan 23 16:25:10 fra-az1-comp-21 kernel: Call Trace:
Jan 23 16:25:10 fra-az1-comp-21 kernel: <TASK>
Jan 23 16:25:10 fra-az1-comp-21 kernel: ? show_regs+0x6d/0x80
Jan 23 16:25:10 fra-az1-comp-21 kernel: ? __warn+0x89/0x160
Jan 23 16:25:10 fra-az1-comp-21 kernel: ? refcount_warn_saturate+0x148/0x150
Jan 23 16:25:10 fra-az1-comp-21 kernel: ? report_bug+0x17e/0x1b0
Jan 23 16:25:10 fra-az1-comp-21 kernel: ? srso_alias_return_thunk+0x5/0x7f
Jan 23 16:25:10 fra-az1-comp-21 kernel: ? handle_bug+0x46/0x90
Jan 23 16:25:10 fra-az1-comp-21 kernel: ? exc_invalid_op+0x18/0x80
Jan 23 16:25:10 fra-az1-comp-21 kernel: ? asm_exc_invalid_op+0x1b/0x20
Jan 23 16:25:10 fra-az1-comp-21 kernel: ? refcount_warn_saturate+0x148/0x150
Jan 23 16:25:10 fra-az1-comp-21 kernel: __tcp_transmit_skb+0x89b/0xa10
Jan 23 16:25:10 fra-az1-comp-21 kernel: tcp_write_xmit+0x4aa/0xac0
Jan 23 16:25:10 fra-az1-comp-21 kernel: ? srso_alias_return_thunk+0x5/0x7f
Jan 23 16:25:10 fra-az1-comp-21 kernel: ? __check_object_size.part.0+0x72/0x150
Jan 23 16:25:10 fra-az1-comp-21 kernel: __tcp_push_pending_frames+0x37/0x110
Jan 23 16:25:10 fra-az1-comp-21 kernel: tcp_push+0x123/0x190
Jan 23 16:25:10 fra-az1-comp-21 kernel: tcp_sendmsg_locked+0x9ad/0xd60
Jan 23 16:25:10 fra-az1-comp-21 kernel: ? __x86_indirect_jump_thunk_r13+0x20/0x20
Jan 23 16:25:10 fra-az1-comp-21 kernel: tcp_sendmsg+0x2c/0x50
Jan 23 16:25:10 fra-az1-comp-21 kernel: inet_sendmsg+0x42/0x80
Jan 23 16:25:10 fra-az1-comp-21 kernel: sock_sendmsg+0xb4/0xd0
Jan 23 16:25:10 fra-az1-comp-21 kernel: ____sys_sendmsg+0x2aa/0x370
Jan 23 16:25:10 fra-az1-comp-21 kernel: ? srso_alias_return_thunk+0x5/0x7f
Jan 23 16:25:10 fra-az1-comp-21 kernel: ___sys_sendmsg+0x9a/0xf0
Jan 23 16:25:10 fra-az1-comp-21 kernel: ? srso_alias_return_thunk+0x5/0x7f
Jan 23 16:25:10 fra-az1-comp-21 kernel: __sys_sendmsg+0x89/0xf0
Jan 23 16:25:10 fra-az1-comp-21 kernel: __x64_sys_sendmsg+0x1d/0x30
Jan 23 16:25:10 fra-az1-comp-21 kernel: do_syscall_64+0x5b/0x90
Jan 23 16:25:10 fra-az1-comp-21 kernel: ? srso_alias_return_thunk+0x5/0x7f
Jan 23 16:25:10 fra-az1-comp-21 kernel: ? ksys_read+0xe6/0x100
Jan 23 16:25:10 fra-az1-comp-21 kernel: ? srso_alias_return_thunk+0x5/0x7f
Jan 23 16:25:10 fra-az1-comp-21 kernel: ? exit_to_user_mode_prepare+0x30/0xb0
Jan 23 16:25:10 fra-az1-comp-21 kernel: ? srso_alias_return_thunk+0x5/0x7f
Jan 23 16:25:10 fra-az1-comp-21 kernel: ? syscall_exit_to_user_mode+0x37/0x60
Jan 23 16:25:10 fra-az1-comp-21 kernel: ? srso_alias_return_thunk+0x5/0x7f
Jan 23 16:25:10 fra-az1-comp-21 kernel: ? do_syscall_64+0x67/0x90
Jan 23 16:25:10 fra-az1-comp-21 kernel: ? srso_alias_return_thunk+0x5/0x7f
Jan 23 16:25:10 fra-az1-comp-21 kernel: ? exit_to_user_mode_prepare+0x9b/0xb0
Jan 23 16:25:10 fra-az1-comp-21 kernel: ? srso_alias_return_thunk+0x5/0x7f
Jan 23 16:25:10 fra-az1-comp-21 kernel: ? syscall_exit_to_user_mode+0x37/0x60
Jan 23 16:25:10 fra-az1-comp-21 kernel: ? srso_alias_return_thunk+0x5/0x7f
Jan 23 16:25:10 fra-az1-comp-21 kernel: ? do_syscall_64+0x67/0x90
Jan 23 16:25:10 fra-az1-comp-21 kernel: ? do_syscall_64+0x67/0x90
Jan 23 16:25:10 fra-az1-comp-21 kernel: ? do_syscall_64+0x67/0x90
Jan 23 16:25:10 fra-az1-comp-21 kernel: entry_SYSCALL_64_after_hwframe+0x6e/0xd8
Jan 23 16:25:10 fra-az1-comp-21 kernel: RIP: 0033:0x7fca44f2799d
Jan 23 16:25:10 fra-az1-comp-21 kernel: Code: 28 89 54 24 1c 48 89 74 24 10 89 7c 24 08 e8 6a 90 f6 ff 8b 54 24 1c 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 33 44 89 c7 48 89 44 24 08 e8 ae 90 f6 ff 48
Jan 23 16:25:10 fra-az1-comp-21 kernel: RSP: 002b:00007fca43df5740 EFLAGS: 00000293 ORIG_RAX: 000000000000002e
Jan 23 16:25:10 fra-az1-comp-21 kernel: RAX: ffffffffffffffda RBX: 0000000000000094 RCX: 00007fca44f2799d
Jan 23 16:25:10 fra-az1-comp-21 kernel: RDX: 0000000000004000 RSI: 00007fca43df57b0 RDI: 0000000000000065
Jan 23 16:25:10 fra-az1-comp-21 kernel: RBP: 0000000000004000 R08: 0000000000000000 R09: 0000000000000020
Jan 23 16:25:10 fra-az1-comp-21 kernel: R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000094
Jan 23 16:25:10 fra-az1-comp-21 kernel: R13: 00007fca43df57b0 R14: 0000000000000065 R15: 000055f41b68b1e0
Jan 23 16:25:10 fra-az1-comp-21 kernel: </TASK>
Jan 23 16:25:10 fra-az1-comp-21 kernel: ---[ end trace 0000000000000000 ]---

[...]

```
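For anyone triaging a log like the excerpt above, it can help to reduce each report to its header line plus the first reliable stack frame (frames prefixed with `?` are guesses by the unwinder). The following is only a minimal sketch under stated assumptions, not something used in the original report: the function name `summarize_traces` and the regular expressions are hypothetical, and they assume the syslog-style `kernel: ` prefix seen in this log.

```python
import re

# Message part after the syslog "kernel: " prefix (assumed log format).
MSG_RE = re.compile(r"kernel: (.*)$")
# Lines that open a report in this log: refcount warnings and BUG lines.
HEADER_RE = re.compile(r"^(refcount_t: .*|BUG: .*)$")
# A stack frame such as "__sock_wfree+0x6d/0x80" (a "[module]" suffix may follow).
FRAME_RE = re.compile(r"^([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")


def summarize_traces(lines):
    """Return (header, first_reliable_frame) pairs, one per report found."""
    results = []
    header = None      # header of the report currently being scanned
    in_trace = False   # True once "Call Trace:" was seen for that header
    for line in lines:
        match = MSG_RE.search(line)
        if not match:
            continue
        msg = match.group(1).strip()
        if HEADER_RE.match(msg):
            header = msg
            in_trace = False
        elif msg == "Call Trace:":
            in_trace = header is not None
        elif in_trace:
            # Skip unreliable "? func" frames and <TASK>/</TASK> markers.
            if msg.startswith("?") or msg.startswith("<"):
                continue
            frame = FRAME_RE.match(msg)
            if frame:
                results.append((header, frame.group(1)))
                header = None
                in_trace = False
    return results
```

Applied to the excerpt above, this pairs the underflow warning with `__sock_wfree`, the NULL pointer dereference with `kfree_skb_list_reason`, and the saturation warning with `__tcp_transmit_skb`. It only summarizes the log and does not identify a root cause.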

The machine runs OpenStack nova-compute + libvirt for the VM workload, along with OpenStack Neutron using the linuxbridge ML2 driver (VXLAN overlay).

Christian Rohmann (christian-rohmann) wrote :
Christian Rohmann (christian-rohmann) wrote :

We just observed this issue on another machine of the same make and model.
The kernel log from boot up to the crash is attached.

This machine had NO virtual machines running though. We saw side effects such as hanging processes but were able to log in and reboot the machine.

summary: - Kernel traces and crash on on KVM hypervisor - refcount_t: underflow;
- use-after-free and refcount_t: saturated; leaking memory --
- lib/refcount.c
+ Kernel traces leading to crash - refcount_t: underflow; use-after-free
+ and refcount_t: saturated; leaking memory -- lib/refcount.c
Christian Rohmann (christian-rohmann) wrote :
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in linux-hwe-6.5 (Ubuntu):
status: New → Confirmed
GuoqingJiang (guoqingjiang) wrote :

Could you try the latest upstream kernel, which converts dm-crypt's tasklet to a BH workqueue? I suppose commit fb6ad4aec1d0 ("dm-crypt: Convert from tasklet to BH workqueue") might resolve the issue.

GuoqingJiang (guoqingjiang) wrote :

Just noticed master-next has disabled tasklets for dm-crypt.

https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/mantic/commit/?h=master-next&id=13104eddc76990dc3e4183cff050c9b6dc5e859e

I suppose hwe-6.5 will sync from mantic later, so please try with the newer kernel.

GuoqingJiang (guoqingjiang) wrote :

Err, the comments (#5 and #6) are for lp#2051232, sorry for the confusion!
