kernel: BUG: Bad page state in process kworker

Bug #2051232 reported by Christian Rohmann
This bug affects 2 people
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| linux-hwe-6.5 (Ubuntu) | Confirmed | Undecided | Unassigned | |

Bug Description

Similar to bug https://bugs.launchpad.net/ubuntu/+source/linux-hwe-6.5/+bug/2051123, where traces were shown, we observed a "BUG" being reported on yet another machine of the same make and model (Asus RS720A-E11-RS24U with dual-socket AMD EPYC Milan CPUs):

```
[...]
Jan 24 08:57:00 fra-az1-comp-24 kernel: BUG: Bad page state in process kworker/u257:18 pfn:5812dc
Jan 24 08:57:00 fra-az1-comp-24 kernel: page:00000000b0c63dd1 refcount:-1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x5812dc
Jan 24 08:57:00 fra-az1-comp-24 kernel: flags: 0x17ffffc0000000(node=0|zone=2|lastcpupid=0x1fffff)
Jan 24 08:57:00 fra-az1-comp-24 kernel: page_type: 0xffffffff()
Jan 24 08:57:00 fra-az1-comp-24 kernel: raw: 0017ffffc0000000 dead000000000100 dead000000000122 0000000000000000
Jan 24 08:57:00 fra-az1-comp-24 kernel: raw: 0000000000000000 0000000000000000 ffffffffffffffff 0000000000000000
Jan 24 08:57:00 fra-az1-comp-24 kernel: page dumped because: nonzero _refcount
Jan 24 08:57:00 fra-az1-comp-24 kernel: Modules linked in: vxlan ip6_udp_tunnel udp_tunnel ebt_arp nft_meta_bridge xt_CT xt_mac xt_state xt_comment xt_physdev vhost_net vhost vhost_iotlb tap xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink xfrm_user xfrm_algo nvme_fabrics 8021q garp mrp br_netfilter bridge stp llc bonding binfmt_misc tls nls_ascii ipmi_ssif intel_rapl_msr intel_rapl_common amd64_edac edac_mce_amd kvm_amd kvm irqbypass rapl wmi_bmof irdma ib_uverbs ib_core joydev input_leds ccp k10temp ptdma switchtec acpi_ipmi ipmi_si ipmi_devintf ipmi_msghandler mac_hid sch_fq_codel efi_pstore ip_tables x_tables autofs4 dm_crypt raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid0 multipath linear hid_generic usbhid hid cdc_ether usbnet mii ast i2c_algo_bit drm_shmem_helper raid1 drm_kms_helper ice crct10dif_pclmul crc32_pclmul polyval_clmulni polyval_generic
Jan 24 08:57:00 fra-az1-comp-24 kernel: ghash_clmulni_intel aesni_intel crypto_simd cryptd ahci nvme gnss drm i40e libahci xhci_pci i2c_piix4 xhci_pci_renesas nvme_core nvme_common wmi
Jan 24 08:57:00 fra-az1-comp-24 kernel: CPU: 14 PID: 1094271 Comm: kworker/u257:18 Not tainted 6.5.0-14-generic #14~22.04.1-Ubuntu
Jan 24 08:57:00 fra-az1-comp-24 kernel: Hardware name: To be filled by O.E.M. To be filled by O.E.M./KMPP-D32 Series, BIOS 1501 08/23/2023
Jan 24 08:57:00 fra-az1-comp-24 kernel: Workqueue: kcryptd/252:12 kcryptd_crypt [dm_crypt]
Jan 24 08:57:00 fra-az1-comp-24 kernel: Call Trace:
Jan 24 08:57:00 fra-az1-comp-24 kernel: <TASK>
Jan 24 08:57:00 fra-az1-comp-24 kernel: dump_stack_lvl+0x48/0x70
Jan 24 08:57:00 fra-az1-comp-24 kernel: dump_stack+0x10/0x20
Jan 24 08:57:00 fra-az1-comp-24 kernel: bad_page+0x76/0x120
Jan 24 08:57:00 fra-az1-comp-24 kernel: __rmqueue_pcplist+0x149/0x1d0
Jan 24 08:57:00 fra-az1-comp-24 kernel: ? srso_alias_return_thunk+0x5/0x7f
Jan 24 08:57:00 fra-az1-comp-24 kernel: rmqueue+0x37c/0xf10
Jan 24 08:57:00 fra-az1-comp-24 kernel: get_page_from_freelist+0x10b/0x4c0
Jan 24 08:57:00 fra-az1-comp-24 kernel: ? srso_alias_return_thunk+0x5/0x7f
Jan 24 08:57:00 fra-az1-comp-24 kernel: __alloc_pages+0x1e7/0x350
Jan 24 08:57:00 fra-az1-comp-24 kernel: alloc_pages+0x90/0x1a0
Jan 24 08:57:00 fra-az1-comp-24 kernel: crypt_page_alloc+0x2f/0x70 [dm_crypt]
Jan 24 08:57:00 fra-az1-comp-24 kernel: mempool_alloc+0x83/0x1c0
Jan 24 08:57:00 fra-az1-comp-24 kernel: ? srso_alias_return_thunk+0x5/0x7f
Jan 24 08:57:00 fra-az1-comp-24 kernel: crypt_alloc_buffer+0x11a/0x1f0 [dm_crypt]
Jan 24 08:57:00 fra-az1-comp-24 kernel: kcryptd_crypt_write_convert+0xa3/0x1d0 [dm_crypt]
Jan 24 08:57:00 fra-az1-comp-24 kernel: kcryptd_crypt+0x114/0x170 [dm_crypt]
Jan 24 08:57:00 fra-az1-comp-24 kernel: process_one_work+0x240/0x450
Jan 24 08:57:00 fra-az1-comp-24 kernel: worker_thread+0x50/0x3f0
Jan 24 08:57:00 fra-az1-comp-24 kernel: ? __pfx_worker_thread+0x10/0x10
Jan 24 08:57:00 fra-az1-comp-24 kernel: kthread+0xf2/0x120
Jan 24 08:57:00 fra-az1-comp-24 kernel: ? __pfx_kthread+0x10/0x10
Jan 24 08:57:00 fra-az1-comp-24 kernel: ret_from_fork+0x47/0x70
Jan 24 08:57:00 fra-az1-comp-24 kernel: ? __pfx_kthread+0x10/0x10
Jan 24 08:57:00 fra-az1-comp-24 kernel: ret_from_fork_asm+0x1b/0x30
Jan 24 08:57:00 fra-az1-comp-24 kernel: </TASK>
Jan 24 08:57:00 fra-az1-comp-24 kernel: Disabling lock debugging due to kernel taint
Jan 24 08:57:01 fra-az1-comp-24 kernel: BUG: Bad page state in process kworker/u257:5 pfn:279d3a4
Jan 24 08:57:01 fra-az1-comp-24 kernel: page:00000000e3b89192 refcount:-1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x279d3a4
Jan 24 08:57:01 fra-az1-comp-24 kernel: flags: 0x17ffffc0000000(node=0|zone=2|lastcpupid=0x1fffff)
Jan 24 08:57:01 fra-az1-comp-24 kernel: page_type: 0xffffffff()
Jan 24 08:57:01 fra-az1-comp-24 kernel: raw: 0017ffffc0000000 dead000000000100 dead000000000122 0000000000000000
Jan 24 08:57:01 fra-az1-comp-24 kernel: raw: 0000000000000000 0000000000000000 ffffffffffffffff 0000000000000000
Jan 24 08:57:01 fra-az1-comp-24 kernel: page dumped because: nonzero _refcount
Jan 24 08:57:01 fra-az1-comp-24 kernel: Modules linked in: vxlan ip6_udp_tunnel udp_tunnel ebt_arp nft_meta_bridge xt_CT xt_mac xt_state xt_comment xt_physdev vhost_net vhost vhost_iotlb tap xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink xfrm_user xfrm_algo nvme_fabrics 8021q garp mrp br_netfilter bridge stp llc bonding binfmt_misc tls nls_ascii ipmi_ssif intel_rapl_msr intel_rapl_common amd64_edac edac_mce_amd kvm_amd kvm irqbypass rapl wmi_bmof irdma ib_uverbs ib_core joydev input_leds ccp k10temp ptdma switchtec acpi_ipmi ipmi_si ipmi_devintf ipmi_msghandler mac_hid sch_fq_codel efi_pstore ip_tables x_tables autofs4 dm_crypt raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid0 multipath linear hid_generic usbhid hid cdc_ether usbnet mii ast i2c_algo_bit drm_shmem_helper raid1 drm_kms_helper ice crct10dif_pclmul crc32_pclmul polyval_clmulni polyval_generic
Jan 24 08:57:01 fra-az1-comp-24 kernel: ghash_clmulni_intel aesni_intel crypto_simd cryptd ahci nvme gnss drm i40e libahci xhci_pci i2c_piix4 xhci_pci_renesas nvme_core nvme_common wmi
Jan 24 08:57:01 fra-az1-comp-24 kernel: CPU: 14 PID: 1104204 Comm: kworker/u257:5 Tainted: G B 6.5.0-14-generic #14~22.04.1-Ubuntu
Jan 24 08:57:01 fra-az1-comp-24 kernel: Hardware name: To be filled by O.E.M. To be filled by O.E.M./KMPP-D32 Series, BIOS 1501 08/23/2023
Jan 24 08:57:01 fra-az1-comp-24 kernel: Workqueue: kcryptd/252:12 kcryptd_crypt [dm_crypt]
Jan 24 08:57:01 fra-az1-comp-24 kernel: Call Trace:
Jan 24 08:57:01 fra-az1-comp-24 kernel: <TASK>
Jan 24 08:57:01 fra-az1-comp-24 kernel: dump_stack_lvl+0x48/0x70
Jan 24 08:57:01 fra-az1-comp-24 kernel: dump_stack+0x10/0x20
Jan 24 08:57:01 fra-az1-comp-24 kernel: bad_page+0x76/0x120
Jan 24 08:57:01 fra-az1-comp-24 kernel: __rmqueue_pcplist+0x149/0x1d0
Jan 24 08:57:01 fra-az1-comp-24 kernel: ? srso_alias_return_thunk+0x5/0x7f
Jan 24 08:57:01 fra-az1-comp-24 kernel: rmqueue+0x37c/0xf10
Jan 24 08:57:01 fra-az1-comp-24 kernel: get_page_from_freelist+0x10b/0x4c0
Jan 24 08:57:01 fra-az1-comp-24 kernel: ? srso_alias_return_thunk+0x5/0x7f
Jan 24 08:57:01 fra-az1-comp-24 kernel: __alloc_pages+0x1e7/0x350
Jan 24 08:57:01 fra-az1-comp-24 kernel: alloc_pages+0x90/0x1a0
Jan 24 08:57:01 fra-az1-comp-24 kernel: crypt_page_alloc+0x2f/0x70 [dm_crypt]
Jan 24 08:57:01 fra-az1-comp-24 kernel: mempool_alloc+0x83/0x1c0
Jan 24 08:57:01 fra-az1-comp-24 kernel: ? srso_alias_return_thunk+0x5/0x7f
Jan 24 08:57:01 fra-az1-comp-24 kernel: crypt_alloc_buffer+0x11a/0x1f0 [dm_crypt]
Jan 24 08:57:01 fra-az1-comp-24 kernel: kcryptd_crypt_write_convert+0xa3/0x1d0 [dm_crypt]
Jan 24 08:57:01 fra-az1-comp-24 kernel: kcryptd_crypt+0x114/0x170 [dm_crypt]
Jan 24 08:57:01 fra-az1-comp-24 kernel: process_one_work+0x240/0x450
Jan 24 08:57:01 fra-az1-comp-24 kernel: worker_thread+0x50/0x3f0
Jan 24 08:57:01 fra-az1-comp-24 kernel: ? __pfx_worker_thread+0x10/0x10
Jan 24 08:57:01 fra-az1-comp-24 kernel: kthread+0xf2/0x120
Jan 24 08:57:01 fra-az1-comp-24 kernel: ? __pfx_kthread+0x10/0x10
Jan 24 08:57:01 fra-az1-comp-24 kernel: ret_from_fork+0x47/0x70
Jan 24 08:57:01 fra-az1-comp-24 kernel: ? __pfx_kthread+0x10/0x10
Jan 24 08:57:01 fra-az1-comp-24 kernel: ret_from_fork_asm+0x1b/0x30
Jan 24 08:57:01 fra-az1-comp-24 kernel: </TASK>

[...]

```
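For anyone comparing reports across several machines, the key fields of each "Bad page state" entry (process name, pfn, refcount and the dump reason) can be pulled out of the journal with a short script. The following is only a minimal sketch: the function name `parse_bad_page_reports` and the regular expressions are illustrative assumptions based on the log lines quoted above, not part of any kernel or Ubuntu tooling.

```python
import re

# Patterns derived from the log excerpt above (assumed, not an official format spec).
BUG_RE = re.compile(
    r"BUG: Bad page state in process (?P<comm>\S+)\s+pfn:(?P<pfn>[0-9a-f]+)"
)
PAGE_RE = re.compile(
    r"page:[0-9a-f]+ refcount:(?P<refcount>-?\d+) mapcount:(?P<mapcount>-?\d+)"
)
REASON_RE = re.compile(r"page dumped because: (?P<reason>.+)$")


def parse_bad_page_reports(lines):
    """Return one dict per 'Bad page state' report found in the given log lines."""
    reports = []
    current = None
    for line in lines:
        m = BUG_RE.search(line)
        if m:
            # A new report starts; pfn is printed in hex without a 0x prefix.
            current = {"comm": m.group("comm"), "pfn": int(m.group("pfn"), 16)}
            reports.append(current)
            continue
        if current is None:
            continue  # ignore lines that precede the first report
        m = PAGE_RE.search(line)
        if m:
            current["refcount"] = int(m.group("refcount"))
            current["mapcount"] = int(m.group("mapcount"))
            continue
        m = REASON_RE.search(line)
        if m:
            current["reason"] = m.group("reason").strip()
    return reports


# Three lines copied from the first report in the log above.
sample = [
    "Jan 24 08:57:00 fra-az1-comp-24 kernel: BUG: Bad page state in process kworker/u257:18 pfn:5812dc",
    "Jan 24 08:57:00 fra-az1-comp-24 kernel: page:00000000b0c63dd1 refcount:-1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x5812dc",
    "Jan 24 08:57:00 fra-az1-comp-24 kernel: page dumped because: nonzero _refcount",
]

reports = parse_bad_page_reports(sample)
for r in reports:
    print(r["comm"], hex(r["pfn"]), r["refcount"], r["reason"])
```

Feeding the saved output of `journalctl -k` into the function line by line should work the same way, provided the messages have the shape shown in this report.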

Christian Rohmann (christian-rohmann) wrote :
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in linux-hwe-6.5 (Ubuntu):
status: New → Confirmed
GuoqingJiang (guoqingjiang) wrote :

Could you try the latest upstream kernel, which converts dm-crypt's tasklet to a BH workqueue? I suppose commit fb6ad4aec1d0 ("dm-crypt: Convert from tasklet to BH workqueue") might resolve the issue.

Also, mantic master-next has disabled tasklets for dm-crypt:

https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/mantic/commit/?h=master-next&id=13104eddc76990dc3e4183cff050c9b6dc5e859e

I suppose hwe-6.5 will sync from mantic later, so please try with the newer kernel.

BTW, could you share how to reproduce the issue? I can try on my side in case the above commits don't help.
