Random kernel panics related to page fault in Ubuntu 23.04/23.10 server with QEMU

Bug #2046329 reported by Sibidharan
This bug affects 1 person
Affects: linux (Ubuntu)
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: (not set)

Bug Description

Here is some general information about the server:

01:06:49 labs@selfmadeninja ~ → lsb_release -a; uname -a; dpkg -l | grep ' linux-i'
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 23.10
Release: 23.10
Codename: mantic
Linux selfmadeninja 6.5.0-14-generic #14-Ubuntu SMP PREEMPT_DYNAMIC Tue Nov 14 14:59:49 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
rc linux-image-6.2.0-20-generic 6.2.0-20.20 amd64 Signed kernel image generic
rc linux-image-6.2.0-23-generic 6.2.0-23.23 amd64 Signed kernel image generic
rc linux-image-6.2.0-24-generic 6.2.0-24.24 amd64 Signed kernel image generic
rc linux-image-6.2.0-25-generic 6.2.0-25.25 amd64 Signed kernel image generic
rc linux-image-6.2.0-26-generic 6.2.0-26.26 amd64 Signed kernel image generic
rc linux-image-6.2.0-27-generic 6.2.0-27.28 amd64 Signed kernel image generic
rc linux-image-6.2.0-31-generic 6.2.0-31.31 amd64 Signed kernel image generic
rc linux-image-6.2.0-32-generic 6.2.0-32.32 amd64 Signed kernel image generic
rc linux-image-6.2.0-33-generic 6.2.0-33.33+1 amd64 Signed kernel image generic
rc linux-image-6.2.0-34-generic 6.2.0-34.34 amd64 Signed kernel image generic
rc linux-image-6.2.0-35-generic 6.2.0-35.35 amd64 Signed kernel image generic
rc linux-image-6.2.0-36-generic 6.2.0-36.37 amd64 Signed kernel image generic
rc linux-image-6.2.0-37-generic 6.2.0-37.38 amd64 Signed kernel image generic
ii linux-image-6.2.0-39-generic 6.2.0-39.40 amd64 Signed kernel image generic
ii linux-image-6.5.0-14-generic 6.5.0-14.14 amd64 Signed kernel image generic
ii linux-image-generic 6.5.0.14.16 amd64 Generic Linux kernel image

I recently switched to an Intel i9-14900K, and it worked well for some time. My system has 128 GB of RAM, and I use QEMU/KVM to run many servers. I also have a Windows VM for gaming with GPU passthrough. I have two GPUs installed, but I currently use only one of them, and only while the Windows VM is running. The Ubuntu Server host uses the integrated graphics.

I am getting random panics. I had no idea what was causing them, so I installed kdump to capture them. All of them point to something like this:

[34006.012227] BUG: unable to handle page fault for address: ffff936a8abdd3c8
[34006.012234] #PF: supervisor read access in kernel mode
[34006.012235] #PF: error_code(0x0000) - not-present page

but they all originate from different areas of the kernel. I have archived every host crash I have captured so far, including the kernel dumps, and the archive is available here (crash.tar.gz, 4.5 GB): https://drive.google.com/drive/folders/1faoiKI5Vr2KkwOxHpKQ35TnNcNgjn_Iz?usp=drive_link

I then turned off panic_on_oops on the host, and my VM started to panic instead (I also have kdump configured in one of my VMs). Its crash logs are here: https://drive.google.com/file/d/1yGeNBw9VushW50G1nuePURs0P4PeEsTs/view?usp=sharing

This is my current /proc/cmdline:

BOOT_IMAGE=/vmlinuz-6.5.0-14-generic root=/dev/mapper/ubuntu--vg-ubuntu--lv ro slub_debug=F intel_iommu=on kvm.ignore_msrs=1 slub_debug=F split_lock_detect=off softlockup_panic=0 nmi_watchdog=0 ubsan=0 panic_on_oops=0 crashkernel=1G-:1G
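
For anyone triaging, the command line above can be split into individual parameters to see exactly what is set and whether anything is repeated. This is only a minimal sketch: the helper names `parse_cmdline` and `repeated_names` are made up for illustration, and the whitespace split assumes no quoted values, which holds for this command line but not in general.

```python
from collections import Counter

# Kernel command line copied verbatim from the report above.
CMDLINE = (
    "BOOT_IMAGE=/vmlinuz-6.5.0-14-generic root=/dev/mapper/ubuntu--vg-ubuntu--lv ro "
    "slub_debug=F intel_iommu=on kvm.ignore_msrs=1 slub_debug=F split_lock_detect=off "
    "softlockup_panic=0 nmi_watchdog=0 ubsan=0 panic_on_oops=0 crashkernel=1G-:1G"
)


def parse_cmdline(cmdline):
    """Split a command line into (name, value) pairs; bare flags get value None."""
    params = []
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params.append((name, value if sep else None))
    return params


def repeated_names(params):
    """Return parameter names that occur more than once, in first-seen order."""
    counts = Counter(name for name, _ in params)
    return [name for name in counts if counts[name] > 1]


params = parse_cmdline(CMDLINE)
duplicates = repeated_names(params)

for name, value in params:
    print(f"{name:20} {value}")
print("repeated:", duplicates)
```

On this command line the only repeated parameter is `slub_debug=F`, which appears twice with the same value.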

-----------------------------------------------------------------
Panic Examples:

Panic 1 on host:

[43084.904967] BUG: kernel NULL pointer dereference, address: 0000000000000009
[43084.904972] #PF: supervisor instruction fetch in kernel mode
[43084.904974] #PF: error_code(0x0010) - not-present page
[43084.904975] PGD 0 P4D 0
[43084.904977] Oops: 0010 [#1] PREEMPT SMP NOPTI
[43084.904980] CPU: 9 PID: 0 Comm: swapper/9 Kdump: loaded Tainted: P O 6.2.0-37-generic #38-Ubuntu
[43084.904982] Hardware name: ASUS System Product Name/PRIME Z790-P WIFI, BIOS 1402 09/08/2023
[43084.904983] RIP: 0010:0x9
[43084.904987] Code: Unable to access opcode bytes at 0xffffffffffffffdf.
[43084.904988] RSP: 0018:ffffb9f0001cbdb8 EFLAGS: 00010046
[43084.904990] RAX: 0000000000000000 RBX: ffff9fd9419ccd40 RCX: 0000000000000000
[43084.904991] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[43084.904992] RBP: ffffffff8957bd15 R08: 0000000000000000 R09: 0000000000000000
[43084.904993] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[43084.904993] R13: 0000000000000004 R14: 0000000000000009 R15: ffffb9f0001cbdd8
[43084.904995] FS: 0000000000000000(0000) GS:ffff9ff7fec40000(0000) knlGS:0000000000000000
[43084.904996] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[43084.904997] CR2: ffffffffffffffdf CR3: 0000000150364000 CR4: 0000000000752ee0
[43084.904998] PKRU: 55555554
[43084.904999] Call Trace:
[43084.905000] <TASK>
[43084.905002] ? show_regs+0x6d/0x80
[43084.905006] ? __die+0x24/0x80
[43084.905008] ? page_fault_oops+0x99/0x1b0
[43084.905010] ? do_user_addr_fault+0x2f3/0x620
[43084.905012] ? exc_page_fault+0x80/0x1b0
[43084.905015] ? asm_exc_page_fault+0x27/0x30
[43084.905018] ? psi_task_change+0x55/0xd0
[43084.905022] ? enqueue_task+0xd6/0x1a0
[43084.905025] ? ttwu_do_activate+0x64/0x100
[43084.905027] ? sched_ttwu_pending+0xf1/0x1a0
[43084.905029] ? __flush_smp_call_function_queue+0xf7/0x1f0
[43084.905032] ? flush_smp_call_function_queue+0x3a/0xb0
[43084.905034] ? do_idle+0xb2/0x100
[43084.905036] ? cpu_startup_entry+0x1d/0x20
[43084.905038] ? start_secondary+0x138/0x170
[43084.905040] ? secondary_startup_64_no_verify+0xe5/0xeb
[43084.905044] </TASK>
[43084.905044] Modules linked in: vfio_pci vfio_pci_core vfio_iommu_type1 vfio iommufd snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi snd_hda_intel tls snd_seq_dummy snd_hrtimer rfcomm vhost_net tap xt_CHECKSUM xt_conntrack zfs(PO) xt_MASQUERADE zunicode(PO) nf_conntrack_netlink zzstd(O) xfrm_user zlua(O) xfrm_algo zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) xt_addrtype br_netfilter nft_masq vmw_vsock_vmci_transport vmw_vmci vhost_vsock vmw_vsock_virtio_transport_common vhost vhost_iotlb vsock nft_chain_nat ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_multiport xt_cgroup xt_mark xt_owner xt_tcpudp nft_compat cmac algif_hash algif_skcipher af_alg nf_tables nfnetlink overlay bnep openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 bridge stp llc binfmt_misc nls_iso8859_1 snd_sof_pci_intel_tgl snd_sof_intel_hda_common soundwire_intel soundwire_generic_allocation soundwire_cadence snd_sof_intel_hda snd_sof_pci snd_sof_xtensa_dsp snd_sof
[43084.905086] snd_sof_utils snd_soc_hdac_hda snd_hda_ext_core snd_soc_acpi_intel_match snd_soc_acpi snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec intel_rapl_msr snd_hda_core intel_rapl_common snd_hwdep intel_tcc_cooling soundwire_bus x86_pkg_temp_thermal snd_soc_core snd_compress ac97_bus intel_powerclamp snd_pcm_dmaengine coretemp snd_pcm btusb iwlmvm btrtl snd_seq_midi snd_seq_midi_event btbcm btintel kvm_intel btmtk snd_rawmidi kvm mac80211 bluetooth irqbypass libarc4 snd_seq snd_seq_device rapl pmt_telemetry ecdh_generic cmdlinepart mei_hdcp mei_pxp pmt_class joydev input_leds intel_cstate ecc asus_nb_wmi eeepc_wmi snd_timer spi_nor iwlwifi wmi_bmof snd mtd soundcore intel_vsec cfg80211 acpi_pad acpi_tad mei_me mac_hid mei dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua msr parport_pc ppdev lp parport efi_pstore dmi_sysfs ip_tables x_tables autofs4 btrfs blake2b_generic raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0
[43084.905133] multipath linear hid_generic usbhid hid i915 drm_buddy i2c_algo_bit ttm drm_display_helper cec rc_core drm_kms_helper syscopyarea mfd_aaeon sysfillrect asus_wmi sysimgblt ledtrig_audio crct10dif_pclmul crc32_pclmul polyval_clmulni polyval_generic ghash_clmulni_intel sha512_ssse3 aesni_intel sparse_keymap crypto_simd r8169 nvme cryptd platform_profile i2c_i801 spi_intel_pci ahci i2c_smbus drm nvme_core spi_intel intel_lpss_pci realtek xhci_pci libahci intel_lpss nvme_common vmd xhci_pci_renesas idma64 video wmi pinctrl_alderlake
[43084.905160] CR2: 0000000000000009
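
The `error_code` values in these reports (0x0000 in the excerpt near the top, 0x0010 in Panic 1) are x86 page-fault error codes, and the two `#PF:` lines are the kernel's own decoding of them. The sketch below reproduces that decoding from the architectural bit layout so the codes in the different panics can be compared quickly. The constant names and the function `decode_pf_error_code` are illustrative, not taken from the kernel source.

```python
# Architectural x86 page-fault error-code bits.
PF_PROT = 1 << 0   # 0: page not present, 1: protection violation
PF_WRITE = 1 << 1  # access was a write
PF_USER = 1 << 2   # access came from user mode
PF_RSVD = 1 << 3   # reserved bit set in a paging-structure entry
PF_INSTR = 1 << 4  # access was an instruction fetch
PF_PK = 1 << 5     # protection-key violation


def decode_pf_error_code(code):
    """Return (privilege, access, cause) for an x86 page-fault error code."""
    privilege = "user" if code & PF_USER else "supervisor"

    if code & PF_INSTR:
        access = "instruction fetch"
    elif code & PF_WRITE:
        access = "write access"
    else:
        access = "read access"

    if not code & PF_PROT:
        cause = "not-present page"
    elif code & PF_RSVD:
        cause = "reserved bit violation"
    elif code & PF_PK:
        cause = "protection keys violation"
    else:
        cause = "permissions violation"

    return privilege, access, cause


for code in (0x0000, 0x0010):
    print(f"{code:#06x}", decode_pf_error_code(code))
```

Both codes decode to a supervisor-mode access to a not-present page. They differ only in the access type: a data read for 0x0000 and an instruction fetch for 0x0010, which matches `RIP: 0010:0x9` in Panic 1.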

Panic 2 on host:

[102172.140109] BUG: unable to handle page fault for address: fffffffffffbcca3
[102172.140114] #PF: supervisor read access in kernel mode
[102172.140116] #PF: error_code(0x0000) - not-present page
[102172.140117] PGD 1370e15067 P4D 1370e15067 PUD 1370e17067 PMD 0
[102172.140120] Oops: 0000 [#1] PREEMPT SMP NOPTI
[102172.140123] CPU: 9 PID: 117178 Comm: worker Kdump: loaded Tainted: P O 6.2.0-39-generic #40-Ubuntu
[102172.140125] Hardware name: ASUS System Product Name/PRIME Z790-P WIFI, BIOS 1402 09/08/2023
[102172.140126] RIP: 0010:mempool_alloc+0x5e/0x1c0
[102172.140131] Code: 00 48 c7 45 b0 00 00 00 00 48 c7 45 b8 00 00 00 00 48 c7 45 c0 00 00 00 00 48 c7 45 c8 00 00 00 00 41 81 e5 00 04 00 00 75 5e <41> 89 dc 81 e3 bf fb ff ff 41 81 cc 00 20 09 00 81 cb 00 20 09 00
[102172.140133] RSP: 0018:ffffa8244f7ab960 EFLAGS: 00010246
[102172.140134] RAX: 0000000000000000 RBX: 0000000000000cc0 RCX: 0000000000000cc0
[102172.140136] RDX: 0000000000000000 RSI: 0000000000000cc0 RDI: ffffffffb9d1ce58
[102172.140137] RBP: ffffa8244f7ab9c0 R08: ffffffffb9d1ce40 R09: ffff89ab82fb7080
[102172.140138] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000cc0
[102172.140139] R13: 0000000000000400 R14: ffffffffb9d1ce58 R15: ffffa8244f7abb08
[102172.140140] FS: 00007fa6ba7fc6c0(0000) GS:ffff89ca3ec40000(0000) knlGS:0000000000000000
[102172.140141] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[102172.140142] CR2: fffffffffffbcca3 CR3: 000000012fbc4000 CR4: 0000000000752ee0
[102172.140143] PKRU: 55555554
[102172.140144] Call Trace:
[102172.140146] <TASK>
[102172.140148] ? show_regs+0x6d/0x80
[102172.140151] ? __die+0x24/0x80
[102172.140153] ? page_fault_oops+0x99/0x1b0
[102172.140156] ? kernelmode_fixup_or_oops+0xb2/0x140
[102172.140158] ? __bad_area_nosemaphore+0x1a5/0x2c0
[102172.140159] ? update_load_avg+0x82/0x810
[102172.140162] ? bad_area_nosemaphore+0x16/0x30
[102172.140163] ? do_kern_addr_fault+0x7b/0xa0
[102172.140165] ? exc_page_fault+0x10a/0x1b0
[102172.140172] ? asm_exc_page_fault+0x27/0x30
[102172.140178] ? mempool_alloc+0x5e/0x1c0
[102172.140182] bio_alloc_bioset+0x20d/0x530
[102172.140188] iomap_dio_alloc_bio.isra.0+0x3b/0x50
[102172.140192] iomap_dio_bio_iter+0x2b4/0x500
[102172.140195] __iomap_dio_rw+0x430/0x740
[102172.140200] iomap_dio_rw+0x11/0x70
[102172.140202] ext4_dio_write_iter+0x2a3/0x4b0
[102172.140205] ext4_file_write_iter+0x38/0x80
[102172.140207] do_iter_readv_writev+0xec/0x160
[102172.140210] do_iter_write+0x9d/0x170
[102172.140212] vfs_writev+0xf5/0x1b0
[102172.140215] __x64_sys_pwritev+0xc9/0x110
[102172.140217] do_syscall_64+0x58/0x90
[102172.140220] ? exit_to_user_mode_prepare+0x30/0xb0
[102172.140223] ? syscall_exit_to_user_mode+0x37/0x60
[102172.140225] ? do_syscall_64+0x67/0x90
[102172.140227] ? do_syscall_64+0x67/0x90
[102172.140228] ? do_syscall_64+0x67/0x90
[102172.140230] ? do_syscall_64+0x67/0x90
[102172.140232] ? do_syscall_64+0x67/0x90
[102172.140233] entry_SYSCALL_64_after_hwframe+0x73/0xdd
[102172.140235] RIP: 0033:0x7fb73c1121d0
[102172.140237] Code: 3c 24 48 89 4c 24 18 e8 ee 92 f7 ff 4c 8b 54 24 18 8b 3c 24 45 31 c0 41 89 c1 8b 54 24 14 48 8b 74 24 08 b8 28 01 00 00 0f 05 <48> 3d 00 f0 ff ff 77 30 44 89 cf 48 89 04 24 e8 3c 93 f7 ff 48 8b
[102172.140238] RSP: 002b:00007fa6ba7fb600 EFLAGS: 00000246 ORIG_RAX: 0000000000000128
[102172.140240] RAX: ffffffffffffffda RBX: 00007fa5a14d4b10 RCX: 00007fb73c1121d0
[102172.140241] RDX: 0000000000000018 RSI: 0000558de9da7360 RDI: 000000000000000f
[102172.140242] RBP: 0000558de9ad9440 R08: 0000000000000000 R09: 0000000000000000
[102172.140243] R10: 0000014be83c8000 R11: 0000000000000246 R12: 0000558de9ad9450
[102172.140244] R13: 0000558de92ed95b R14: 00007fa6baffc300 R15: 00007fa6b9ffc000
[102172.140246] </TASK>
[102172.140247] Modules linked in: tls snd_seq_dummy snd_hrtimer rfcomm zfs(PO) zunicode(PO) zzstd(O) zlua(O) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) vhost_vsock vmw_vsock_virtio_transport_common vhost_net vhost vhost_iotlb tap ip6t_REJECT nf_reject_ipv6 xt_multiport xt_cgroup xt_mark xt_owner nf_conntrack_netlink xfrm_user xfrm_algo xt_addrtype br_netfilter xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp nft_compat nft_masq nft_chain_nat vmw_vsock_vmci_transport vsock nf_tables vmw_vmci nfnetlink overlay cmac algif_hash algif_skcipher af_alg bnep cpuid openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 bridge stp llc binfmt_misc nls_iso8859_1 snd_sof_pci_intel_tgl snd_sof_intel_hda_common soundwire_intel soundwire_generic_allocation soundwire_cadence snd_sof_intel_hda snd_sof_pci snd_sof_xtensa_dsp snd_sof snd_sof_utils snd_soc_hdac_hda snd_hda_ext_core snd_soc_acpi_intel_match snd_soc_acpi snd_intel_dspcfg snd_intel_sdw_acpi
[102172.140287] snd_hda_codec snd_hda_core snd_hwdep soundwire_bus intel_rapl_msr snd_soc_core intel_rapl_common snd_compress ac97_bus snd_pcm_dmaengine intel_tcc_cooling snd_pcm x86_pkg_temp_thermal intel_powerclamp coretemp snd_seq_midi iwlmvm snd_seq_midi_event snd_rawmidi kvm_intel btusb mac80211 snd_seq btrtl btbcm btintel btmtk snd_seq_device libarc4 snd_timer kvm cmdlinepart bluetooth iwlwifi spi_nor irqbypass snd rapl pmt_telemetry ecdh_generic mei_hdcp mei_pxp pmt_class joydev input_leds asus_nb_wmi eeepc_wmi wmi_bmof intel_cstate mtd soundcore ecc cfg80211 intel_vsec acpi_pad acpi_tad mac_hid mei_me mei dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua msr parport_pc ppdev lp parport efi_pstore dmi_sysfs ip_tables x_tables autofs4 btrfs blake2b_generic raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid i915 drm_buddy i2c_algo_bit ttm drm_display_helper cec rc_core drm_kms_helper
[102172.140333] syscopyarea mfd_aaeon sysfillrect asus_wmi sysimgblt crct10dif_pclmul crc32_pclmul polyval_clmulni polyval_generic ghash_clmulni_intel sha512_ssse3 aesni_intel ledtrig_audio crypto_simd sparse_keymap nvme i2c_i801 spi_intel_pci r8169 cryptd platform_profile drm intel_lpss_pci ahci i2c_smbus spi_intel nvme_core realtek intel_lpss libahci xhci_pci nvme_common idma64 vmd xhci_pci_renesas video wmi pinctrl_alderlake
[102172.140352] CR2: fffffffffffbcca3
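
The `Tainted: P O` field in these oopses comes from modules that carry per-module taint letters in the `Modules linked in:` list. As a rough aid for triage, the sketch below pulls those modules out of an excerpt of the Panic 2 list. The function name `flagged_modules` is hypothetical, and only the two letters that actually appear in this report are described.

```python
import re

# Excerpt of the "Modules linked in:" line from Panic 2 above.
MODULES = (
    "tls snd_seq_dummy snd_hrtimer rfcomm zfs(PO) zunicode(PO) zzstd(O) zlua(O) "
    "zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) vhost_vsock"
)

# Meaning of the per-module taint letters seen in this report.
FLAG_MEANING = {
    "P": "module license is not GPL-compatible",
    "O": "module was built out of tree",
}


def flagged_modules(modules_line):
    """Map each module that carries taint letters to the list of those letters."""
    matches = re.findall(r"(\w+)\(([A-Z]+)\)", modules_line)
    return {name: list(flags) for name, flags in matches}


tainting = flagged_modules(MODULES)
for name, flags in tainting.items():
    reasons = "; ".join(FLAG_MEANING.get(flag, "other") for flag in flags)
    print(f"{name:10} {''.join(flags):3} {reasons}")
```

In this excerpt every flagged module belongs to the ZFS stack (`zfs`, `spl` and their helper modules). That does not by itself implicate them, but it is worth mentioning when the report is triaged.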

Panic 3 on host:

[34006.012227] BUG: unable to handle page fault for address: ffff936a8abdd3c8
[34006.012234] #PF: supervisor read access in kernel mode
[34006.012235] #PF: error_code(0x0000) - not-present page
[34006.012236] PGD 0 P4D 0
[34006.012239] Oops: 0000 [#1] PREEMPT SMP NOPTI
[34006.012241] CPU: 7 PID: 11561 Comm: worker Kdump: loaded Tainted: P O 6.5.0-14-generic #14-Ubuntu
[34006.012243] Hardware name: ASUS System Product Name/PRIME Z790-P WIFI, BIOS 1601 11/23/2023
[34006.012244] RIP: 0010:__kmem_cache_alloc_node+0xd0/0x360
[34006.012249] Code: c0 0f 85 5e 02 00 00 4d 85 e4 0f 84 40 02 00 00 0f 1f 44 00 00 48 c7 44 24 10 00 00 00 00 49 8b 04 24 65 48 03 05 80 70 7b 58 <48> 8b 50 08 4c 8b 10 48 8b 40 10 4d 85 d2 74 1f 48 85 c0 74 1a 41
[34006.012250] RSP: 0018:ffffaab07ef7f8b0 EFLAGS: 00010287
[34006.012252] RAX: ffff936a8abdd3c0 RBX: 0000000000092820 RCX: 0000000000001000
[34006.012253] RDX: 00000000ffffffff RSI: 0000000000092820 RDI: ffff9ce5c0042d00
[34006.012254] RBP: ffffaab07ef7f900 R08: ffffffffa77931f5 R09: 0000000000000000
[34006.012255] R10: 0000000000000000 R11: 0000000000000000 R12: ffff9ce5c0042d00
[34006.012256] R13: 0000000000092820 R14: 00000000ffffffff R15: 0000000000001000
[34006.012257] FS: 00007fe5bbfff6c0(0000) GS:ffff9d04bf1c0000(0000) knlGS:0000000000000000
[34006.012258] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[34006.012259] CR2: ffff936a8abdd3c8 CR3: 00000002065be000 CR4: 0000000000752ee0
[34006.012260] PKRU: 55555554
[34006.012261] Call Trace:
[34006.012262] <TASK>
[34006.012264] ? show_regs+0x6d/0x80
[34006.012267] ? __die+0x24/0x80
[34006.012268] ? page_fault_oops+0x99/0x1b0
[34006.012271] ? kernelmode_fixup_or_oops+0xb2/0x140
[34006.012273] ? __bad_area_nosemaphore+0x1a5/0x2c0
[34006.012274] ? bad_area_nosemaphore+0x16/0x30
[34006.012275] ? do_kern_addr_fault+0x7b/0xa0
[34006.012276] ? exc_page_fault+0x1a4/0x1b0
[34006.012278] ? asm_exc_page_fault+0x27/0x30
[34006.012280] ? mempool_kmalloc+0x15/0x20
[34006.012283] ? __kmem_cache_alloc_node+0xd0/0x360
[34006.012284] ? mempool_kmalloc+0x15/0x20
[34006.012285] ? mempool_kmalloc+0x15/0x20
[34006.012286] __kmalloc+0x51/0x170
[34006.012289] mempool_kmalloc+0x15/0x20
[34006.012290] mempool_alloc+0x80/0x1c0
[34006.012292] nvme_map_data+0x5e/0x480 [nvme]
[34006.012296] ? __submit_bio_noacct+0x90/0x230
[34006.012298] nvme_prep_rq.part.0+0x35/0x130 [nvme]
[34006.012301] nvme_queue_rqs+0xc4/0x2a0 [nvme]
[34006.012304] blk_mq_flush_plug_list.part.0+0x18b/0x1b0
[34006.012307] blk_mq_flush_plug_list+0x19/0x30
[34006.012308] __blk_flush_plug+0xdf/0x130
[34006.012309] blk_finish_plug+0x31/0x50
[34006.012310] __iomap_dio_rw+0x423/0x730
[34006.012313] iomap_dio_rw+0x11/0x60
[34006.012314] ext4_dio_write_iter+0x17b/0x3e0
[34006.012316] ext4_file_write_iter+0x3b/0x80
[34006.012318] do_iter_readv_writev+0xef/0x160
[34006.012320] do_iter_write+0xa4/0x1a0
[34006.012321] vfs_writev+0xf8/0x1c0
[34006.012323] __x64_sys_pwritev+0xc9/0x110
[34006.012325] do_syscall_64+0x59/0x90
[34006.012326] ? sysvec_reschedule_ipi+0x7a/0x120
[34006.012327] entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[34006.012328] RIP: 0033:0x7ff69eb24c30
[34006.012359] Code: 3c 24 48 89 4c 24 18 e8 5e ec f6 ff 4c 8b 54 24 18 8b 3c 24 45 31 c0 41 89 c1 8b 54 24 14 48 8b 74 24 08 b8 28 01 00 00 0f 05 <48> 3d 00 f0 ff ff 77 30 44 89 cf 48 89 04 24 e8 ac ec f6 ff 48 8b
[34006.012360] RSP: 002b:00007fe5bbffe5c0 EFLAGS: 00000246 ORIG_RAX: 0000000000000128
[34006.012361] RAX: ffffffffffffffda RBX: 000055b93ff9a100 RCX: 00007ff69eb24c30
[34006.012361] RDX: 0000000000000002 RSI: 000055b93feb0400 RDI: 000000000000000f
[34006.012362] RBP: 00007fe4aee6fb00 R08: 0000000000000000 R09: 0000000000000000
[34006.012363] R10: 00000155ccfa0000 R11: 0000000000000246 R12: 000055b93fbdca20
[34006.012363] R13: 000055b93e6a41f7 R14: 00007fe5d0ff82c0 R15: 00007fe5bb7ff000
[34006.012364] </TASK>
[34006.012365] Modules linked in: vfio_pci vfio_pci_core vfio_iommu_type1 vfio iommufd snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi snd_hda_intel tls snd_seq_dummy snd_hrtimer rfcomm vhost_net tap xt_CHECKSUM xt_conntrack xt_MASQUERADE nf_conntrack_netlink xfrm_user xfrm_algo xt_addrtype br_netfilter nft_masq zfs(PO) spl(O) vhost_vsock vmw_vsock_virtio_transport_common vhost vhost_iotlb vsock nft_chain_nat ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_multiport xt_cgroup xt_mark xt_owner xt_tcpudp nft_compat nf_tables nfnetlink cmac algif_hash algif_skcipher af_alg overlay bridge stp llc bnep openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 binfmt_misc nls_iso8859_1 snd_sof_pci_intel_tgl snd_sof_intel_hda_common soundwire_intel snd_sof_intel_hda_mlink soundwire_cadence intel_rapl_msr snd_sof_intel_hda intel_rapl_common snd_sof_pci intel_uncore_frequency snd_sof_xtensa_dsp intel_uncore_frequency_common snd_sof snd_sof_utils snd_soc_hdac_hda snd_hda_ext_core
[34006.012395] snd_soc_acpi_intel_match intel_tcc_cooling x86_pkg_temp_thermal snd_soc_acpi snd_intel_dspcfg intel_powerclamp snd_intel_sdw_acpi snd_hda_codec iwlmvm snd_hda_core snd_hwdep coretemp soundwire_generic_allocation soundwire_bus mac80211 snd_soc_core kvm_intel snd_compress ac97_bus snd_pcm_dmaengine snd_pcm libarc4 kvm snd_seq_midi snd_seq_midi_event snd_rawmidi cmdlinepart snd_seq irqbypass spi_nor btusb snd_seq_device mfd_aaeon asus_nb_wmi eeepc_wmi btrtl snd_timer mtd asus_wmi btbcm iwlwifi rapl snd btintel ledtrig_audio pmt_telemetry i2c_i801 btmtk sparse_keymap spi_intel_pci intel_cstate mei_hdcp mei_pxp pmt_class platform_profile joydev input_leds wmi_bmof bluetooth spi_intel soundcore i2c_smbus ecdh_generic cfg80211 ecc intel_vsec acpi_tad acpi_pad mei_me mac_hid mei dm_multipath msr parport_pc ppdev lp parport efi_pstore dmi_sysfs ip_tables x_tables autofs4 btrfs blake2b_generic raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath
[34006.012430] linear hid_generic usbhid hid i915 drm_buddy i2c_algo_bit ttm drm_display_helper cec rc_core crct10dif_pclmul drm_kms_helper crc32_pclmul polyval_clmulni polyval_generic nvme ghash_clmulni_intel aesni_intel r8169 nvme_core ahci crypto_simd cryptd realtek nvme_common libahci drm intel_lpss_pci video intel_lpss xhci_pci idma64 xhci_pci_renesas vmd wmi pinctrl_alderlake
[34006.012445] CR2: ffff936a8abdd3c8
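
To make the "different areas" point concrete, the header lines of the three host panics above can be reduced to a small summary of fault type, CPU, task, kernel version and faulting instruction pointer. This is a sketch only: the regular expressions are written against the lines quoted in this report and may need adjusting for other kernels, and the names `summarize` and `PANICS` are illustrative.

```python
import re

# Header lines copied from the three host panics above (call traces omitted).
PANICS = {
    "Panic 1": (
        "[43084.904967] BUG: kernel NULL pointer dereference, address: 0000000000000009\n"
        "[43084.904980] CPU: 9 PID: 0 Comm: swapper/9 Kdump: loaded Tainted: P O 6.2.0-37-generic #38-Ubuntu\n"
        "[43084.904983] RIP: 0010:0x9\n"
    ),
    "Panic 2": (
        "[102172.140109] BUG: unable to handle page fault for address: fffffffffffbcca3\n"
        "[102172.140123] CPU: 9 PID: 117178 Comm: worker Kdump: loaded Tainted: P O 6.2.0-39-generic #40-Ubuntu\n"
        "[102172.140126] RIP: 0010:mempool_alloc+0x5e/0x1c0\n"
    ),
    "Panic 3": (
        "[34006.012227] BUG: unable to handle page fault for address: ffff936a8abdd3c8\n"
        "[34006.012241] CPU: 7 PID: 11561 Comm: worker Kdump: loaded Tainted: P O 6.5.0-14-generic #14-Ubuntu\n"
        "[34006.012244] RIP: 0010:__kmem_cache_alloc_node+0xd0/0x360\n"
    ),
}

BUG_RE = re.compile(r"BUG: (.+)")
CPU_RE = re.compile(r"CPU: (\d+) PID: \d+ Comm: (\S+) .* (\S+-generic) #")
RIP_RE = re.compile(r"RIP: 0010:(\S+)")  # kernel-mode RIP only (CS selector 0010)


def summarize(oops_text):
    """Extract fault description, CPU, task, kernel and faulting RIP from one oops."""
    bug = BUG_RE.search(oops_text)
    cpu = CPU_RE.search(oops_text)
    rip = RIP_RE.search(oops_text)
    return {
        "bug": bug.group(1) if bug else None,
        "cpu": int(cpu.group(1)) if cpu else None,
        "comm": cpu.group(2) if cpu else None,
        "kernel": cpu.group(3) if cpu else None,
        "rip": rip.group(1) if rip else None,
    }


summaries = {name: summarize(text) for name, text in PANICS.items()}
for name, info in summaries.items():
    print(name, info["kernel"], f"cpu={info['cpu']}", info["comm"], info["rip"])
```

The three summaries show three different faulting locations across three kernel builds (6.2.0-37, 6.2.0-39 and 6.5.0-14), which is the pattern described at the top of the report.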

Sibidharan (sibi1995)
summary: - Random kernel panics related to page fault in Ubuntu 23.4.23.10 server
+ Random kernel panics related to page fault in Ubuntu 23.4/23.10 server
with QEMU
description: updated
Ubuntu Foundations Team Bug Bot (crichton) wrote : Re: Random kernel panics related to page fault in Ubuntu 23.4/23.10 server with QEMU

Thank you for taking the time to report this bug and helping to make Ubuntu better. It seems that your bug report is not filed about a specific source package though, rather it is just filed against Ubuntu in general. It is important that bug reports be filed about source packages so that people interested in the package can find the bugs about it. You can find some hints about determining what package your bug might be about at https://wiki.ubuntu.com/Bugs/FindRightPackage. You might also ask for help in the #ubuntu-bugs irc channel on Libera.chat.

To change the source package that this bug is filed about visit https://bugs.launchpad.net/ubuntu/+bug/2046329/+editstatus and add the package name in the text box next to the word Package.

[This is an automated message. I apologize if it reached you inappropriately; please just reply to this message indicating so.]

tags: added: bot-comment
Paul White (paulw2u)
affects: ubuntu → linux (Ubuntu)
summary: - Random kernel panics related to page fault in Ubuntu 23.4/23.10 server
+ Random kernel panics related to page fault in Ubuntu 23.04/23.10 server
with QEMU
tags: added: lunar mantic
Sibidharan (sibi1995) wrote :

I disabled the IOMMU and the crashes still happen, at random. They are less frequent while I am gaming inside the Windows VM, but they still occur. Could someone please help me fix this? My server is very unstable because of it.

[97751.636298] BUG: kernel NULL pointer dereference, address: 0000000000000000
[97751.636304] #PF: supervisor instruction fetch in kernel mode
[97751.636305] #PF: error_code(0x0010) - not-present page
[97751.636306] PGD 15645d067 P4D 15645d067 PUD 193973067 PMD 0
[97751.636309] Oops: 0010 [#1] PREEMPT SMP NOPTI
[97751.636311] CPU: 5 PID: 0 Comm: swapper/5 Kdump: loaded Tainted: P O 6.5.0-14-generic #14-Ubuntu
[97751.636313] Hardware name: ASUS System Product Name/PRIME Z790-P WIFI, BIOS 1601 11/23/2023
[97751.636315] RIP: 0010:0x0
[97751.636349] Code: Unable to access opcode bytes at 0xffffffffffffffd6.
[97751.636350] RSP: 0018:ffffab35c01abdf8 EFLAGS: 00010046
[97751.636351] RAX: 000058e7938ec63a RBX: ffffab35c01abdf8 RCX: 0000000000000000
[97751.636352] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[97751.636353] RBP: ffff939541798000 R08: 0000000000000000 R09: 0000000000000000
[97751.636353] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffb5f80392
[97751.636354] R13: ffff93b3feb730c0 R14: 0000000000000005 R15: ffff939541798000
[97751.636355] FS: 0000000000000000(0000) GS:ffff93b3feb40000(0000) knlGS:0000000000000000
[97751.636356] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[97751.636357] CR2: ffffffffffffffd6 CR3: 0000000105f14000 CR4: 0000000000752ee0
[97751.636358] PKRU: 55555554
[97751.636359] Call Trace:
[97751.636360] <TASK>
[97751.636362] ? show_regs+0x6d/0x80
[97751.636367] ? __die+0x24/0x80
[97751.636369] ? page_fault_oops+0x99/0x1b0
[97751.636373] ? do_user_addr_fault+0x316/0x6b0
[97751.636374] ? exc_page_fault+0x83/0x1b0
[97751.636377] ? asm_exc_page_fault+0x27/0x30
[97751.636380] ? sched_clock_cpu+0x12/0x1e0
[97751.636384] ? psi_task_switch+0x33/0x270
[97751.636387] ? __schedule+0x391/0x770
[97751.636390] ? schedule_idle+0x2a/0x50
[97751.636391] ? do_idle+0xb7/0xf0
[97751.636393] ? cpu_startup_entry+0x1d/0x20
[97751.636394] ? start_secondary+0x129/0x160
[97751.636397] ? secondary_startup_64_no_verify+0x17e/0x18b
[97751.636401] </TASK>
[97751.636402] Modules linked in: tls snd_seq_dummy snd_hrtimer rfcomm vhost_net tap xfrm_user xfrm_algo xt_addrtype br_netfilter xt_CHECKSUM xt_MASQUERADE xt_conntrack nft_masq zfs(PO) spl(O) wireguard curve25519_x86_64 libchacha20poly1305 chacha_x86_64 poly1305_x86_64 libcurve25519_generic libchacha ip6_udp_tunnel udp_tunnel vhost_vsock vmw_vsock_virtio_transport_common vhost vhost_iotlb vsock nf_conntrack_netlink nfnetlink_acct nft_chain_nat ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_multiport xt_cgroup xt_mark xt_owner xt_tcpudp nft_compat nf_tables nfnetlink overlay cmac algif_hash algif_skcipher bridge af_alg stp llc bnep openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 binfmt_misc nls_iso8859_1 snd_sof_pci_intel_tgl snd_sof_intel_hda_common soundwire_intel snd_sof_intel_hda_mlink soundwire_cadence snd_sof_intel_hda snd_sof_pci snd_sof_xtensa...
