watchdog: BUG: soft lockup - CPU#10 stuck for 134s! [DyingLightGame_:4256]

Bug #2028274 reported by James Fox
This bug affects 1 person
Affects: nvidia-graphics-drivers-535 (Ubuntu)
Status: New
Importance: Undecided
Assigned to: Unassigned

Bug Description

Playing Dying Light 2 through Steam/Proton on Ubuntu 22.04 with the NVIDIA 535 driver.

The game runs perfectly until I press Quit in Dying Light 2's menu. At that point the computer locks up completely and I have to hold the power button down to turn it off. It happens every time.

Downgrading to 525 with 'sudo apt install nvidia-driver-525 nvidia-dkms-525' fixed the issue.

**** journalctl ****
Jul 20 10:56:28 desktop kernel: watchdog: BUG: soft lockup - CPU#10 stuck for 134s! [DyingLightGame_:4256]
Jul 20 10:56:28 desktop kernel: Modules linked in: vboxnetadp(OE) vboxnetflt(OE) vboxdrv(OE) binfmt_misc nls_iso8859_1 snd_sof_pci_intel_tgl snd_sof_intel_hda_common soundwire_intel soundwire_generic_allocation soundwire_cadence snd_sof_intel_hda snd_sof_pci snd_sof_xtensa_dsp snd_sof snd_soc_hdac_hda snd_hda_ext_core snd_soc_acpi_intel_match snd_soc_acpi soundwire_bus snd_soc_core snd_usb_audio snd_hda_codec_realtek intel_rapl_msr snd_compress intel_rapl_common snd_hda_codec_generic ac97_bus ledtrig_audio snd_pcm_dmaengine snd_hda_codec_hdmi snd_usbmidi_lib x86_pkg_temp_thermal intel_powerclamp snd_rawmidi snd_seq_device coretemp mc snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi kvm_intel snd_hda_codec snd_hda_core xpad mei_hdcp snd_hwdep snd_pcm ff_memless joydev snd_timer kvm snd mei_me soundcore mei ee1004 intel_cstate intel_wmi_thunderbolt gigabyte_wmi wmi_bmof nvidia_uvm(POE) mac_hid acpi_pad acpi_tad sch_fq_codel msr pstore_blk ramoops pstore_zone reed_solomon efi_pstore ip_tables x_tables
Jul 20 10:56:28 desktop kernel: autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear nvidia_drm(POE) nvidia_modeset(POE) hid_generic usbhid hid nvidia(POE) crct10dif_pclmul crc32_pclmul ghash_clmulni_intel drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec aesni_intel rc_core crypto_simd cryptd drm r8169 ahci i2c_i801 xhci_pci i2c_smbus realtek libahci xhci_pci_renesas wmi video pinctrl_tigerlake
Jul 20 10:56:28 desktop kernel: CPU: 10 PID: 4256 Comm: DyingLightGame_ Tainted: P OEL 5.15.0-76-generic #83-Ubuntu
Jul 20 10:56:28 desktop kernel: Hardware name: Gigabyte Technology Co., Ltd. H510M S2H/H510M S2H, BIOS F9 08/23/2021
Jul 20 10:56:28 desktop kernel: RIP: 0010:_nv039537rm+0x3b/0x80 [nvidia]
Jul 20 10:56:28 desktop kernel: Code: d3 89 de 48 8d 55 0f c6 45 0f 00 e8 3f 4c 60 ff 80 7d 0f 00 41 89 c4 75 11 41 39 5d 10 76 20 49 8b 45 00 c1 eb 02 44 8b 24 98 <5b> 44 89 e0 41 5c 41 5d 48 83 c5 10 c3 0f 1f 84 00 00 00 00 00 be
Jul 20 10:56:28 desktop kernel: RSP: 0018:ffffbd0d839bf8f0 EFLAGS: 00000202
Jul 20 10:56:28 desktop kernel: RAX: ffffbd0d82000000 RBX: 0000000000002440 RCX: 0000000000009100
Jul 20 10:56:28 desktop kernel: RDX: ffff952e995e596f RSI: 0000000000009100 RDI: ffff952de38c0008
Jul 20 10:56:28 desktop kernel: RBP: ffff952e995e5960 R08: 0000000000000020 R09: 0000000000000000
Jul 20 10:56:28 desktop kernel: R10: 0000000000009100 R11: ffff952e995e59c8 R12: 0000000000000001
Jul 20 10:56:28 desktop kernel: R13: ffff952de38c0bc8 R14: 0000000000000000 R15: 0000000000000000
Jul 20 10:56:28 desktop kernel: FS: 00007fe4617b4040(0000) GS:ffff95355fc80000(0000) knlGS:000000007fee0000
Jul 20 10:56:28 desktop kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jul 20 10:56:28 desktop kernel: CR2: 00007f130400c000 CR3: 00000001fca94005 CR4: 0000000000770ee0
Jul 20 10:56:28 desktop kernel: PKRU: 55555554
Jul 20 10:56:28 desktop kernel: Call Trace:
Jul 20 10:56:28 desktop kernel: <TASK>
Jul 20 10:56:28 desktop kernel: ? _nv013076rm+0x10f/0x170 [nvidia]
Jul 20 10:56:28 desktop kernel: ? _nv047477rm+0x20/0x30 [nvidia]
Jul 20 10:56:28 desktop kernel: ? _nv030454rm+0x5a/0x110 [nvidia]
Jul 20 10:56:28 desktop kernel: ? _nv030546rm+0x13f/0x340 [nvidia]
Jul 20 10:56:28 desktop kernel: ? _nv030547rm+0x50/0x60 [nvidia]
Jul 20 10:56:28 desktop kernel: ? _nv013174rm+0x86/0xc0 [nvidia]
Jul 20 10:56:28 desktop kernel: ? _nv013170rm+0x3a4/0x400 [nvidia]
Jul 20 10:56:28 desktop kernel: ? _nv044237rm+0xd1/0x1b0 [nvidia]
Jul 20 10:56:28 desktop kernel: ? _nv041109rm+0x1e7/0x370 [nvidia]
Jul 20 10:56:28 desktop kernel: ? _nv048377rm+0x40/0x95 [nvidia]
Jul 20 10:56:28 desktop kernel: ? _nv035020rm+0x14d/0x2e0 [nvidia]
Jul 20 10:56:28 desktop kernel: ? _nv048374rm+0xc5/0x460 [nvidia]
Jul 20 10:56:28 desktop kernel: ? _nv002711rm+0xd/0x20 [nvidia]
Jul 20 10:56:28 desktop kernel: ? _nv004074rm+0x19/0xb0 [nvidia]
Jul 20 10:56:28 desktop kernel: ? _nv016053rm+0x51c/0x620 [nvidia]
Jul 20 10:56:28 desktop kernel: ? _nv043216rm+0xab/0xe0 [nvidia]
Jul 20 10:56:28 desktop kernel: ? _nv044933rm+0xac/0x130 [nvidia]
Jul 20 10:56:28 desktop kernel: ? _nv044932rm+0x3e5/0x690 [nvidia]
Jul 20 10:56:28 desktop kernel: ? _nv043119rm+0xd5/0x160 [nvidia]
Jul 20 10:56:28 desktop kernel: ? _nv043120rm+0x41/0x70 [nvidia]
Jul 20 10:56:28 desktop kernel: ? _nv000566rm+0x4d/0x60 [nvidia]
Jul 20 10:56:28 desktop kernel: ? _nv000714rm+0x1b7/0xe70 [nvidia]
Jul 20 10:56:28 desktop kernel: ? rm_ioctl+0x58/0xb0 [nvidia]
Jul 20 10:56:28 desktop kernel: ? nvidia_ioctl+0x61d/0x840 [nvidia]
Jul 20 10:56:28 desktop kernel: ? nvidia_frontend_unlocked_ioctl+0x55/0x90 [nvidia]
Jul 20 10:56:28 desktop kernel: ? __x64_sys_ioctl+0x92/0xd0
Jul 20 10:56:28 desktop kernel: ? do_syscall_64+0x59/0xc0
Jul 20 10:56:28 desktop kernel: ? do_syscall_64+0x69/0xc0
Jul 20 10:56:28 desktop kernel: ? exit_to_user_mode_prepare+0x37/0xb0
Jul 20 10:56:28 desktop kernel: ? syscall_exit_to_user_mode+0x27/0x50
Jul 20 10:56:28 desktop kernel: ? do_syscall_64+0x69/0xc0
Jul 20 10:56:28 desktop kernel: ? fput+0x13/0x20
Jul 20 10:56:28 desktop kernel: ? exit_to_user_mode_prepare+0x37/0xb0
Jul 20 10:56:28 desktop kernel: ? syscall_exit_to_user_mode+0x27/0x50
Jul 20 10:56:28 desktop kernel: ? do_syscall_64+0x69/0xc0
Jul 20 10:56:28 desktop kernel: ? do_syscall_64+0x69/0xc0
Jul 20 10:56:28 desktop kernel: ? irqentry_exit+0x1d/0x30
Jul 20 10:56:28 desktop kernel: ? sysvec_apic_timer_interrupt+0x4e/0x90
Jul 20 10:56:28 desktop kernel: ? entry_SYSCALL_64_after_hwframe+0x61/0xcb
Jul 20 10:56:28 desktop kernel: </TASK>
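
For anyone triaging similar reports, the watchdog line at the top of the log above can be split into its fields with a short script. This is only a minimal sketch and not part of the original report: the function name parse_soft_lockup and the regular expression are assumptions derived solely from the message format shown in this log.

```python
import re

# Assumed pattern, based only on the message seen in this report, e.g.
# "watchdog: BUG: soft lockup - CPU#10 stuck for 134s! [DyingLightGame_:4256]"
SOFT_LOCKUP = re.compile(
    r"soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>.+):(?P<pid>\d+)\]"
)


def parse_soft_lockup(line):
    """Return the CPU, stall duration, task name and PID, or None if no match."""
    match = SOFT_LOCKUP.search(line)
    if match is None:
        return None
    return {
        "cpu": int(match["cpu"]),
        "seconds": int(match["secs"]),
        "comm": match["comm"],  # kernel task name (truncated by the kernel)
        "pid": int(match["pid"]),
    }


if __name__ == "__main__":
    sample = (
        "Jul 20 10:56:28 desktop kernel: watchdog: BUG: soft lockup - "
        "CPU#10 stuck for 134s! [DyingLightGame_:4256]"
    )
    print(parse_soft_lockup(sample))
```

Lines that do not contain a soft lockup message, such as the register dump or the call trace entries, return None and can simply be skipped.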

**** /proc/cpuinfo ****
processor : 10
vendor_id : GenuineIntel
cpu family : 6
model : 167
model name : 11th Gen Intel(R) Core(TM) i5-11400F @ 2.60GHz
stepping : 1
microcode : 0x57
cpu MHz : 2600.000
cache size : 12288 KB
physical id : 0
siblings : 12
core id : 4
cpu cores : 6
apicid : 9
initial apicid : 9
fpu : yes
fpu_exception : yes
cpuid level : 27
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault invpcid_single ssbd ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx avx512f avx512dq rdseed adx smap avx512ifma clflushopt intel_pt avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp hwp_pkg_req avx512vbmi umip pku ospke avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq rdpid fsrm md_clear flush_l1d arch_capabilities
vmx flags : vnmi preemption_timer posted_intr invvpid ept_x_only ept_ad ept_1gb flexpriority apicv tsc_offset vtpr mtf vapic ept vpid unrestricted_guest vapic_reg vid ple pml ept_mode_based_exec tsc_scaling
bugs : spectre_v1 spectre_v2 spec_store_bypass swapgs mmio_stale_data retbleed eibrs_pbrsb
bogomips : 5184.00
clflush size : 64
cache_alignment : 64
address sizes : 39 bits physical, 48 bits virtual
power management:

James Fox (jfox950) wrote :

This bug is fixed by updating to:
    Package: nvidia-driver-535
    Version: 535.86.05-0ubuntu0.22.04.1

My guess is that the fix was this item from NVIDIA's release notes:
    - Fixed a regression that could cause a system hang when running
      windowed Vulkan applications with sync-to-vblank enabled.

Please can someone close this bug report? Thanks.
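
To check whether an installed 535 package is at least the version reported as fixed above, a rough comparison could look like the sketch below. It is an illustration only: the helper names are hypothetical, it compares just the dotted upstream part before the first "-", and it assumes there is no epoch prefix, so it is not a full Debian version comparison. It is also only meaningful within the 535 series, since 525 was never affected in this report.

```python
def upstream_tuple(version):
    """Turn '535.86.05-0ubuntu0.22.04.1' into (535, 86, 5).

    Assumption: no epoch prefix and a purely numeric, dot-separated
    upstream part before the first '-'.
    """
    upstream = version.split("-", 1)[0]
    return tuple(int(part) for part in upstream.split("."))


# Version reported as fixed in this bug report.
FIXED_VERSION = upstream_tuple("535.86.05-0ubuntu0.22.04.1")


def at_least_fixed_535(installed):
    """True if the upstream part of `installed` is >= 535.86.05."""
    return upstream_tuple(installed) >= FIXED_VERSION


if __name__ == "__main__":
    # The version string would normally be copied from the package manager output.
    print(at_least_fixed_535("535.86.05-0ubuntu0.22.04.1"))
```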
