Hard lockup with "watchdog: BUG: soft lockup - CPU#10 stuck for 22s! [Xorg:13615]" in the journal

Bug #1905984 reported by Dan Watkins
This bug affects 1 person
Affects: nvidia-graphics-drivers-450 (Ubuntu)
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

The system was restored from hibernation this morning, but the issue did not appear until ~30 minutes after "boot". I have also seen hard locks without hibernation (but those have never produced any journal output, so they may be a different issue). Examining `journalctl -k`, I see something like the below repeated every few seconds. I've attached the output of `journalctl -k` (truncated to start from this morning's unhibernate).

Nov 27 09:42:09 surprise kernel: watchdog: BUG: soft lockup - CPU#10 stuck for 22s! [Xorg:13615]
Nov 27 09:42:09 surprise kernel: Modules linked in: hid_logitech unix_diag vhost_net tap vhost_vsock vmw_vsock_virtio_transport_common vhost vsock vhost_iotlb binfmt_misc veth nft_masq zfs(PO) zunicode(PO) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) zlua(PO) xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp nft_compat nft_counter nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink bridge stp llc aufs rdma_ucm ib_uverbs rdma_cm iw_cm ib_cm ib_core overlay nls_iso8859_1 snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg snd_usb_audio snd_hda_codec uvcvideo snd_hda_core snd_usbmidi_lib videobuf2_vmalloc videobuf2_memops snd_hwdep videobuf2_v4l2 snd_seq_midi videobuf2_common snd_seq_midi_event snd_rawmidi edac_mce_amd videodev snd_pcm kvm_amd snd_seq mc input_leds kvm snd_seq_device joydev snd_timer ucsi_ccg typec_ucsi snd typec soundcore ccp rapl wmi_bmof k10temp efi_pstore mac_hid nvidia_uvm(OE)
Nov 27 09:42:09 surprise kernel: sch_fq_codel parport_pc ppdev lp parport ip_tables x_tables autofs4 btrfs blake2b_generic xor raid6_pq libcrc32c dm_crypt hid_logitech_hidpp hid_microsoft hid_logitech_dj ff_memless hid_generic usbhid hid nvidia_drm(POE) nvidia_modeset(POE) nvidia(POE) crct10dif_pclmul crc32_pclmul ghash_clmulni_intel drm_kms_helper aesni_intel syscopyarea sysfillrect sysimgblt fb_sys_fops crypto_simd cec cryptd glue_helper rc_core drm i2c_piix4 i2c_nvidia_gpu nvme r8169 ahci xhci_pci nvme_core realtek xhci_pci_renesas libahci wmi gpio_amdpt gpio_generic
Nov 27 09:42:09 surprise kernel: CPU: 10 PID: 13615 Comm: Xorg Tainted: P OE 5.8.0-29-generic #31-Ubuntu
Nov 27 09:42:09 surprise kernel: Hardware name: Gigabyte Technology Co., Ltd. B450M DS3H/B450M DS3H-CF, BIOS F4 01/25/2019
Nov 27 09:42:09 surprise kernel: RIP: 0010:_nv001550kms+0x16/0x70 [nvidia_modeset]
Nov 27 09:42:09 surprise kernel: Code: 53 28 e9 0e fe ff ff 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 41 54 55 49 89 fc 53 48 8d 5f 38 48 89 f5 48 89 df e8 9a 48 00 00 <84> c0 75 46 49 8b 54 24 40 48 39 d3 48 8b 42 18 74 17 48 39 c5 75
Nov 27 09:42:09 surprise kernel: RSP: 0018:ffffb479cf577910 EFLAGS: 00000287
Nov 27 09:42:09 surprise kernel: RAX: ffffffffc1a15800 RBX: ffff9c30b2233640 RCX: 00000000001f1623
Nov 27 09:42:09 surprise kernel: RDX: ffff9c2fd6186ac8 RSI: ffff9c2d33ef4008 RDI: ffff9c30b2233640
Nov 27 09:42:09 surprise kernel: RBP: ffff9c2d33ef4008 R08: ffffb479cf577830 R09: 0000000000000001
Nov 27 09:42:09 surprise kernel: R10: ffff9c2cd20fbbc0 R11: 000000000000001a R12: ffff9c30b2233608
Nov 27 09:42:09 surprise kernel: R13: 0000000000000000 R14: ffff9c2d33ef4008 R15: 0000000000000002
Nov 27 09:42:09 surprise kernel: FS: 00007f82d95e4a40(0000) GS:ffff9c30cf080000(0000) knlGS:0000000000000000
Nov 27 09:42:09 surprise kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 27 09:42:09 surprise kernel: CR2: 000055683e860ff8 CR3: 000000037893c000 CR4: 00000000003406e0
Nov 27 09:42:09 surprise kernel: Call Trace:
Nov 27 09:42:09 surprise kernel: ? _nv001123kms+0xb2/0x400 [nvidia_modeset]
Nov 27 09:42:09 surprise kernel: ? _nv000732kms+0x1e/0x80 [nvidia_modeset]
Nov 27 09:42:09 surprise kernel: ? _nv002395kms+0x112/0x130 [nvidia_modeset]
Nov 27 09:42:09 surprise kernel: ? _nv000515kms+0xd1/0xe1 [nvidia_modeset]
Nov 27 09:42:09 surprise kernel: ? _nv000019kms+0x230/0x6fc [nvidia_modeset]
Nov 27 09:42:09 surprise kernel: ? kfree+0xb8/0x220
Nov 27 09:42:09 surprise kernel: ? os_free_mem+0x22/0x30 [nvidia]
Nov 27 09:42:09 surprise kernel: ? _nv008503rm+0xbe/0x100 [nvidia]
Nov 27 09:42:09 surprise kernel: ? _nv035038rm+0x2a/0x60 [nvidia]
Nov 27 09:42:09 surprise kernel: ? _nv030385rm+0x23/0x40 [nvidia]
Nov 27 09:42:09 surprise kernel: ? _nv033621rm+0x58/0xf0 [nvidia]
Nov 27 09:42:09 surprise kernel: ? _nv008135rm+0x33c/0x3f0 [nvidia]
Nov 27 09:42:09 surprise kernel: ? os_acquire_spinlock+0x12/0x30 [nvidia]
Nov 27 09:42:09 surprise kernel: ? os_release_spinlock+0x1a/0x20 [nvidia]
Nov 27 09:42:09 surprise kernel: ? _nv037019rm+0xa1/0x190 [nvidia]
Nov 27 09:42:09 surprise kernel: ? nvidia_modeset_rm_ops_free_stack+0x1d/0x20 [nvidia]
Nov 27 09:42:09 surprise kernel: ? _nv002759kms+0x12a0/0x1470 [nvidia_modeset]
Nov 27 09:42:09 surprise kernel: ? mpol_rebind_preferred+0x1c0/0x1c0
Nov 27 09:42:09 surprise kernel: ? _nv000531kms+0x50/0x50 [nvidia_modeset]
Nov 27 09:42:09 surprise kernel: ? nvKmsIoctl+0x96/0x1d0 [nvidia_modeset]
Nov 27 09:42:09 surprise kernel: ? nvkms_ioctl+0xfd/0x170 [nvidia_modeset]
Nov 27 09:42:09 surprise kernel: ? nvidia_frontend_unlocked_ioctl+0x3b/0x50 [nvidia]
Nov 27 09:42:09 surprise kernel: ? ksys_ioctl+0x8e/0xc0
Nov 27 09:42:09 surprise kernel: ? __x64_sys_ioctl+0x1a/0x20
Nov 27 09:42:09 surprise kernel: ? do_syscall_64+0x49/0xc0
Nov 27 09:42:09 surprise kernel: ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
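
Since the message above repeats every few seconds, it can help to summarize which CPU and task the watchdog keeps reporting. The following is a minimal sketch, not part of the original report: the function name `summarize_soft_lockups` is hypothetical, and the regular expression assumes the exact watchdog line format shown in this journal.

```python
import re
from collections import Counter

# Assumed format, copied from the journal line in this report:
#   watchdog: BUG: soft lockup - CPU#10 stuck for 22s! [Xorg:13615]
SOFT_LOCKUP = re.compile(
    r"watchdog: BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>[^:\]]+):(?P<pid>\d+)\]"
)


def summarize_soft_lockups(lines):
    """Count soft-lockup reports per (cpu, comm, pid) tuple.

    `lines` is any iterable of journal lines; non-matching lines are skipped.
    """
    counts = Counter()
    for line in lines:
        m = SOFT_LOCKUP.search(line)
        if m:
            counts[(int(m["cpu"]), m["comm"], int(m["pid"]))] += 1
    return counts


if __name__ == "__main__":
    # Sample input: the watchdog line from this report plus an unrelated line.
    sample = [
        "Nov 27 09:42:09 surprise kernel: watchdog: BUG: soft lockup - "
        "CPU#10 stuck for 22s! [Xorg:13615]",
        "Nov 27 09:42:09 surprise kernel: Call Trace:",
    ]
    for (cpu, comm, pid), n in summarize_soft_lockups(sample).most_common():
        print(f"CPU#{cpu} {comm}[{pid}]: {n} report(s)")
```

To run it against a real journal, the lines of `journalctl -k` output can be passed in place of `sample`, for example by iterating over `sys.stdin`.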

ProblemType: Bug
DistroRelease: Ubuntu 20.10
Package: nvidia-driver-450 450.80.02-0ubuntu1
ProcVersionSignature: Ubuntu 5.8.0-29.31-generic 5.8.14
Uname: Linux 5.8.0-29-generic x86_64
NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair nvidia_modeset nvidia
ApportVersion: 2.20.11-0ubuntu50.2
Architecture: amd64
CasperMD5CheckResult: skip
CurrentDesktop: i3
Date: Fri Nov 27 10:11:35 2020
InstallationDate: Installed on 2019-05-07 (569 days ago)
InstallationMedia: Ubuntu 18.04.2 LTS "Bionic Beaver" - Release amd64 (20190210)
SourcePackage: nvidia-graphics-drivers-450
UpgradeStatus: Upgraded to groovy on 2020-06-22 (157 days ago)

Dan Watkins (oddbloke) wrote :

Looking further through the journal, I also see call traces that do not involve the NVIDIA modules, such as:

Nov 27 09:43:52 surprise kernel: INFO: task qemu-system-x86:16736 blocked for more than 120 seconds.
Nov 27 09:43:52 surprise kernel: Tainted: P OEL 5.8.0-29-generic #31-Ubuntu
Nov 27 09:43:52 surprise kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 27 09:43:52 surprise kernel: qemu-system-x86 D 0 16736 1 0x00000320
Nov 27 09:43:52 surprise kernel: Call Trace:
Nov 27 09:43:52 surprise kernel: __schedule+0x212/0x5d0
Nov 27 09:43:52 surprise kernel: ? usleep_range+0x90/0x90
Nov 27 09:43:52 surprise kernel: schedule+0x55/0xc0
Nov 27 09:43:52 surprise kernel: schedule_timeout+0x10f/0x160
Nov 27 09:43:52 surprise kernel: ? do_sync_core+0x1d/0x20
Nov 27 09:43:52 surprise kernel: __wait_for_common+0xa8/0x150
Nov 27 09:43:52 surprise kernel: wait_for_completion+0x24/0x30
Nov 27 09:43:52 surprise kernel: __wait_rcu_gp+0x11b/0x120
Nov 27 09:43:52 surprise kernel: synchronize_rcu+0x67/0x70
Nov 27 09:43:52 surprise kernel: ? __call_rcu+0x250/0x250
Nov 27 09:43:52 surprise kernel: ? __bpf_trace_rcu_utilization+0x10/0x10
Nov 27 09:43:52 surprise kernel: account_event+0x1e8/0x1f0
Nov 27 09:43:52 surprise kernel: perf_event_alloc+0x77e/0x920
Nov 27 09:43:52 surprise kernel: ? kvm_perf_overflow+0x40/0x40 [kvm]
Nov 27 09:43:52 surprise kernel: perf_event_create_kernel_counter.part.0+0x21/0x160
Nov 27 09:43:52 surprise kernel: perf_event_create_kernel_counter+0xf/0x20
Nov 27 09:43:52 surprise kernel: pmc_reprogram_counter+0x105/0x190 [kvm]
Nov 27 09:43:52 surprise kernel: reprogram_gp_counter+0x194/0x210 [kvm]
Nov 27 09:43:52 surprise kernel: amd_pmu_set_msr+0x17d/0x190 [kvm_amd]
Nov 27 09:43:52 surprise kernel: kvm_pmu_set_msr+0x4e/0x60 [kvm]
Nov 27 09:43:52 surprise kernel: kvm_set_msr_common+0x4cc/0xf00 [kvm]
Nov 27 09:43:52 surprise kernel: svm_set_msr+0x39d/0x6e0 [kvm_amd]
Nov 27 09:43:52 surprise kernel: __kvm_set_msr+0x8a/0x150 [kvm]
Nov 27 09:43:52 surprise kernel: kvm_emulate_wrmsr+0x3c/0x120 [kvm]
Nov 27 09:43:52 surprise kernel: handle_exit+0x39a/0x420 [kvm_amd]
Nov 27 09:43:52 surprise kernel: ? kvm_set_cr8+0x22/0x40 [kvm]
Nov 27 09:43:52 surprise kernel: vcpu_enter_guest+0x862/0xd90 [kvm]
Nov 27 09:43:52 surprise kernel: ? kvm_apic_has_interrupt+0x41/0x80 [kvm]
Nov 27 09:43:52 surprise kernel: ? kvm_cpu_has_interrupt+0x7a/0x90 [kvm]
Nov 27 09:43:52 surprise kernel: ? kvm_vcpu_has_events+0x134/0x190 [kvm]
Nov 27 09:43:52 surprise kernel: vcpu_run+0x76/0x240 [kvm]
Nov 27 09:43:52 surprise kernel: kvm_arch_vcpu_ioctl_run+0x9f/0x2b0 [kvm]
Nov 27 09:43:52 surprise kernel: kvm_vcpu_ioctl+0x247/0x600 [kvm]
Nov 27 09:43:52 surprise kernel: ksys_ioctl+0x8e/0xc0
Nov 27 09:43:52 surprise kernel: __x64_sys_ioctl+0x1a/0x20
Nov 27 09:43:52 surprise kernel: do_syscall_64+0x49/0xc0
Nov 27 09:43:52 surprise kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9
Nov 27 09:43:52 surprise kernel: RIP: 0033:0x7fc2853b16d7
Nov 27 09:43:52 surprise kernel: Code: Bad RIP value.
Nov 27 09:43:52 surprise kernel: RSP: 002b:00007fc276220068 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Nov 27 09:43:52 surp...
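
To check which kernel modules a trace like the one above passes through, the stack frames can be tallied by their `[module]` suffix. This is a rough sketch under stated assumptions: `modules_in_trace` is a hypothetical helper, and the pattern assumes the `symbol+0xoff/0xsize [module]` frame format seen in this journal, with frames lacking a suffix counted as core kernel.

```python
import re
from collections import Counter

# A frame looks like "kernel: ? kvm_set_cr8+0x22/0x40 [kvm]". The leading "? "
# (unreliable frame) and the trailing "[module]" are both optional.
FRAME = re.compile(
    r"kernel:\s+(?:\? )?(?P<sym>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?: \[(?P<mod>\w+)\])?\s*$"
)


def modules_in_trace(lines):
    """Count stack frames per module; frames with no suffix are core kernel."""
    counts = Counter()
    for line in lines:
        m = FRAME.search(line)
        if m:
            counts[m["mod"] or "(core kernel)"] += 1
    return counts


if __name__ == "__main__":
    # A few frames copied from the hung-task trace in this report.
    sample = [
        "Nov 27 09:43:52 surprise kernel: Call Trace:",
        "Nov 27 09:43:52 surprise kernel: __schedule+0x212/0x5d0",
        "Nov 27 09:43:52 surprise kernel: ? kvm_perf_overflow+0x40/0x40 [kvm]",
        "Nov 27 09:43:52 surprise kernel: amd_pmu_set_msr+0x17d/0x190 [kvm_amd]",
    ]
    for mod, n in modules_in_trace(sample).most_common():
        print(f"{mod}: {n} frame(s)")
```

Lines that are not frames, such as `Call Trace:` or the register dumps, do not match the pattern and are ignored.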
