Hard lockup with "watchdog: BUG: soft lockup - CPU#10 stuck for 22s! [Xorg:13615]" in the journal

Bug #1905984 reported by Dan Watkins on 2020-11-27
This bug affects 1 person
Affects: nvidia-graphics-drivers-450 (Ubuntu)
Importance: Undecided
Assigned to: Unassigned

Bug Description

The system was restored from hibernation this morning, but the issue did not appear until ~30 minutes after "boot". I have also seen hard lockups without hibernation (but they have never produced any journal output, so they may be a different issue). Examining `journalctl -k`, I see something like the below repeated every few seconds. I've attached the output of `journalctl -k` (truncated to start at this morning's resume from hibernation).

Nov 27 09:42:09 surprise kernel: watchdog: BUG: soft lockup - CPU#10 stuck for 22s! [Xorg:13615]
Nov 27 09:42:09 surprise kernel: Modules linked in: hid_logitech unix_diag vhost_net tap vhost_vsock vmw_vsock_virtio_transport_common vhost vsock vhost_iotlb binfmt_misc veth nft_masq zfs(PO) zunicode(PO) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) zlua(PO) xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp nft_compat nft_counter nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink bridge stp llc aufs rdma_ucm ib_uverbs rdma_cm iw_cm ib_cm ib_core overlay nls_iso8859_1 snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg snd_usb_audio snd_hda_codec uvcvideo snd_hda_core snd_usbmidi_lib videobuf2_vmalloc videobuf2_memops snd_hwdep videobuf2_v4l2 snd_seq_midi videobuf2_common snd_seq_midi_event snd_rawmidi edac_mce_amd videodev snd_pcm kvm_amd snd_seq mc input_leds kvm snd_seq_device joydev snd_timer ucsi_ccg typec_ucsi snd typec soundcore ccp rapl wmi_bmof k10temp efi_pstore mac_hid nvidia_uvm(OE)
Nov 27 09:42:09 surprise kernel: sch_fq_codel parport_pc ppdev lp parport ip_tables x_tables autofs4 btrfs blake2b_generic xor raid6_pq libcrc32c dm_crypt hid_logitech_hidpp hid_microsoft hid_logitech_dj ff_memless hid_generic usbhid hid nvidia_drm(POE) nvidia_modeset(POE) nvidia(POE) crct10dif_pclmul crc32_pclmul ghash_clmulni_intel drm_kms_helper aesni_intel syscopyarea sysfillrect sysimgblt fb_sys_fops crypto_simd cec cryptd glue_helper rc_core drm i2c_piix4 i2c_nvidia_gpu nvme r8169 ahci xhci_pci nvme_core realtek xhci_pci_renesas libahci wmi gpio_amdpt gpio_generic
Nov 27 09:42:09 surprise kernel: CPU: 10 PID: 13615 Comm: Xorg Tainted: P OE 5.8.0-29-generic #31-Ubuntu
Nov 27 09:42:09 surprise kernel: Hardware name: Gigabyte Technology Co., Ltd. B450M DS3H/B450M DS3H-CF, BIOS F4 01/25/2019
Nov 27 09:42:09 surprise kernel: RIP: 0010:_nv001550kms+0x16/0x70 [nvidia_modeset]
Nov 27 09:42:09 surprise kernel: Code: 53 28 e9 0e fe ff ff 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 41 54 55 49 89 fc 53 48 8d 5f 38 48 89 f5 48 89 df e8 9a 48 00 00 <84> c0 75 46 49 8b 54 24 40 48 39 d3 48 8b 42 18 74 17 48 39 c5 75
Nov 27 09:42:09 surprise kernel: RSP: 0018:ffffb479cf577910 EFLAGS: 00000287
Nov 27 09:42:09 surprise kernel: RAX: ffffffffc1a15800 RBX: ffff9c30b2233640 RCX: 00000000001f1623
Nov 27 09:42:09 surprise kernel: RDX: ffff9c2fd6186ac8 RSI: ffff9c2d33ef4008 RDI: ffff9c30b2233640
Nov 27 09:42:09 surprise kernel: RBP: ffff9c2d33ef4008 R08: ffffb479cf577830 R09: 0000000000000001
Nov 27 09:42:09 surprise kernel: R10: ffff9c2cd20fbbc0 R11: 000000000000001a R12: ffff9c30b2233608
Nov 27 09:42:09 surprise kernel: R13: 0000000000000000 R14: ffff9c2d33ef4008 R15: 0000000000000002
Nov 27 09:42:09 surprise kernel: FS: 00007f82d95e4a40(0000) GS:ffff9c30cf080000(0000) knlGS:0000000000000000
Nov 27 09:42:09 surprise kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 27 09:42:09 surprise kernel: CR2: 000055683e860ff8 CR3: 000000037893c000 CR4: 00000000003406e0
Nov 27 09:42:09 surprise kernel: Call Trace:
Nov 27 09:42:09 surprise kernel: ? _nv001123kms+0xb2/0x400 [nvidia_modeset]
Nov 27 09:42:09 surprise kernel: ? _nv000732kms+0x1e/0x80 [nvidia_modeset]
Nov 27 09:42:09 surprise kernel: ? _nv002395kms+0x112/0x130 [nvidia_modeset]
Nov 27 09:42:09 surprise kernel: ? _nv000515kms+0xd1/0xe1 [nvidia_modeset]
Nov 27 09:42:09 surprise kernel: ? _nv000019kms+0x230/0x6fc [nvidia_modeset]
Nov 27 09:42:09 surprise kernel: ? kfree+0xb8/0x220
Nov 27 09:42:09 surprise kernel: ? os_free_mem+0x22/0x30 [nvidia]
Nov 27 09:42:09 surprise kernel: ? _nv008503rm+0xbe/0x100 [nvidia]
Nov 27 09:42:09 surprise kernel: ? _nv035038rm+0x2a/0x60 [nvidia]
Nov 27 09:42:09 surprise kernel: ? _nv030385rm+0x23/0x40 [nvidia]
Nov 27 09:42:09 surprise kernel: ? _nv033621rm+0x58/0xf0 [nvidia]
Nov 27 09:42:09 surprise kernel: ? _nv008135rm+0x33c/0x3f0 [nvidia]
Nov 27 09:42:09 surprise kernel: ? os_acquire_spinlock+0x12/0x30 [nvidia]
Nov 27 09:42:09 surprise kernel: ? os_release_spinlock+0x1a/0x20 [nvidia]
Nov 27 09:42:09 surprise kernel: ? _nv037019rm+0xa1/0x190 [nvidia]
Nov 27 09:42:09 surprise kernel: ? nvidia_modeset_rm_ops_free_stack+0x1d/0x20 [nvidia]
Nov 27 09:42:09 surprise kernel: ? _nv002759kms+0x12a0/0x1470 [nvidia_modeset]
Nov 27 09:42:09 surprise kernel: ? mpol_rebind_preferred+0x1c0/0x1c0
Nov 27 09:42:09 surprise kernel: ? _nv000531kms+0x50/0x50 [nvidia_modeset]
Nov 27 09:42:09 surprise kernel: ? nvKmsIoctl+0x96/0x1d0 [nvidia_modeset]
Nov 27 09:42:09 surprise kernel: ? nvkms_ioctl+0xfd/0x170 [nvidia_modeset]
Nov 27 09:42:09 surprise kernel: ? nvidia_frontend_unlocked_ioctl+0x3b/0x50 [nvidia]
Nov 27 09:42:09 surprise kernel: ? ksys_ioctl+0x8e/0xc0
Nov 27 09:42:09 surprise kernel: ? __x64_sys_ioctl+0x1a/0x20
Nov 27 09:42:09 surprise kernel: ? do_syscall_64+0x49/0xc0
Nov 27 09:42:09 surprise kernel: ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
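
Since the report above repeats every few seconds, it can help to count how often each CPU/task pair is flagged. The following is a minimal sketch, assuming the journal line format shown above; the function name `summarize_soft_lockups` and the sample list are illustrative, not part of any existing tool.

```python
import re
from collections import Counter

# Matches the watchdog line format seen in the journal excerpt above.
SOFT_LOCKUP = re.compile(
    r"watchdog: BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>[^:\]]+):(?P<pid>\d+)\]"
)


def summarize_soft_lockups(lines):
    """Count soft-lockup reports per (cpu, comm, pid) tuple."""
    counts = Counter()
    for line in lines:
        match = SOFT_LOCKUP.search(line)
        if match:
            key = (int(match["cpu"]), match["comm"], int(match["pid"]))
            counts[key] += 1
    return counts


# Illustrative input: one line copied from the journal plus an unrelated line.
sample = [
    "Nov 27 09:42:09 surprise kernel: watchdog: BUG: soft lockup - "
    "CPU#10 stuck for 22s! [Xorg:13615]",
    "Nov 27 09:42:09 surprise kernel: Call Trace:",
]

for (cpu, comm, pid), n in summarize_soft_lockups(sample).most_common():
    print(f"CPU#{cpu} {comm}[{pid}]: {n} report(s)")
```

To run this against a real journal, the lines of `journalctl -k` output could be passed in place of `sample`, for example by iterating over a saved log file.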

ProblemType: Bug
DistroRelease: Ubuntu 20.10
Package: nvidia-driver-450 450.80.02-0ubuntu1
ProcVersionSignature: Ubuntu 5.8.0-29.31-generic 5.8.14
Uname: Linux 5.8.0-29-generic x86_64
NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair nvidia_modeset nvidia
ApportVersion: 2.20.11-0ubuntu50.2
Architecture: amd64
CasperMD5CheckResult: skip
CurrentDesktop: i3
Date: Fri Nov 27 10:11:35 2020
InstallationDate: Installed on 2019-05-07 (569 days ago)
InstallationMedia: Ubuntu 18.04.2 LTS "Bionic Beaver" - Release amd64 (20190210)
SourcePackage: nvidia-graphics-drivers-450
UpgradeStatus: Upgraded to groovy on 2020-06-22 (157 days ago)

Dan Watkins (oddbloke) wrote :

Looking further through the journal, I also see non-NVIDIA call traces such as:

Nov 27 09:43:52 surprise kernel: INFO: task qemu-system-x86:16736 blocked for more than 120 seconds.
Nov 27 09:43:52 surprise kernel: Tainted: P OEL 5.8.0-29-generic #31-Ubuntu
Nov 27 09:43:52 surprise kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 27 09:43:52 surprise kernel: qemu-system-x86 D 0 16736 1 0x00000320
Nov 27 09:43:52 surprise kernel: Call Trace:
Nov 27 09:43:52 surprise kernel: __schedule+0x212/0x5d0
Nov 27 09:43:52 surprise kernel: ? usleep_range+0x90/0x90
Nov 27 09:43:52 surprise kernel: schedule+0x55/0xc0
Nov 27 09:43:52 surprise kernel: schedule_timeout+0x10f/0x160
Nov 27 09:43:52 surprise kernel: ? do_sync_core+0x1d/0x20
Nov 27 09:43:52 surprise kernel: __wait_for_common+0xa8/0x150
Nov 27 09:43:52 surprise kernel: wait_for_completion+0x24/0x30
Nov 27 09:43:52 surprise kernel: __wait_rcu_gp+0x11b/0x120
Nov 27 09:43:52 surprise kernel: synchronize_rcu+0x67/0x70
Nov 27 09:43:52 surprise kernel: ? __call_rcu+0x250/0x250
Nov 27 09:43:52 surprise kernel: ? __bpf_trace_rcu_utilization+0x10/0x10
Nov 27 09:43:52 surprise kernel: account_event+0x1e8/0x1f0
Nov 27 09:43:52 surprise kernel: perf_event_alloc+0x77e/0x920
Nov 27 09:43:52 surprise kernel: ? kvm_perf_overflow+0x40/0x40 [kvm]
Nov 27 09:43:52 surprise kernel: perf_event_create_kernel_counter.part.0+0x21/0x160
Nov 27 09:43:52 surprise kernel: perf_event_create_kernel_counter+0xf/0x20
Nov 27 09:43:52 surprise kernel: pmc_reprogram_counter+0x105/0x190 [kvm]
Nov 27 09:43:52 surprise kernel: reprogram_gp_counter+0x194/0x210 [kvm]
Nov 27 09:43:52 surprise kernel: amd_pmu_set_msr+0x17d/0x190 [kvm_amd]
Nov 27 09:43:52 surprise kernel: kvm_pmu_set_msr+0x4e/0x60 [kvm]
Nov 27 09:43:52 surprise kernel: kvm_set_msr_common+0x4cc/0xf00 [kvm]
Nov 27 09:43:52 surprise kernel: svm_set_msr+0x39d/0x6e0 [kvm_amd]
Nov 27 09:43:52 surprise kernel: __kvm_set_msr+0x8a/0x150 [kvm]
Nov 27 09:43:52 surprise kernel: kvm_emulate_wrmsr+0x3c/0x120 [kvm]
Nov 27 09:43:52 surprise kernel: handle_exit+0x39a/0x420 [kvm_amd]
Nov 27 09:43:52 surprise kernel: ? kvm_set_cr8+0x22/0x40 [kvm]
Nov 27 09:43:52 surprise kernel: vcpu_enter_guest+0x862/0xd90 [kvm]
Nov 27 09:43:52 surprise kernel: ? kvm_apic_has_interrupt+0x41/0x80 [kvm]
Nov 27 09:43:52 surprise kernel: ? kvm_cpu_has_interrupt+0x7a/0x90 [kvm]
Nov 27 09:43:52 surprise kernel: ? kvm_vcpu_has_events+0x134/0x190 [kvm]
Nov 27 09:43:52 surprise kernel: vcpu_run+0x76/0x240 [kvm]
Nov 27 09:43:52 surprise kernel: kvm_arch_vcpu_ioctl_run+0x9f/0x2b0 [kvm]
Nov 27 09:43:52 surprise kernel: kvm_vcpu_ioctl+0x247/0x600 [kvm]
Nov 27 09:43:52 surprise kernel: ksys_ioctl+0x8e/0xc0
Nov 27 09:43:52 surprise kernel: __x64_sys_ioctl+0x1a/0x20
Nov 27 09:43:52 surprise kernel: do_syscall_64+0x49/0xc0
Nov 27 09:43:52 surprise kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9
Nov 27 09:43:52 surprise kernel: RIP: 0033:0x7fc2853b16d7
Nov 27 09:43:52 surprise kernel: Code: Bad RIP value.
Nov 27 09:43:52 surprise kernel: RSP: 002b:00007fc276220068 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Nov 27 09:43:52 surp...
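
To judge quickly whether a given trace involves the NVIDIA modules at all, the stack frames can be tallied by the module name in square brackets. This is a rough sketch, assuming the frame format seen in the traces above (`symbol+0xOFF/0xLEN [module]`); the name `modules_in_trace` and the `(core kernel)` label for frames without a module tag are illustrative choices.

```python
import re
from collections import Counter

# One stack frame as printed in the journal: an optional "? " marker,
# symbol+offset/length, and an optional "[module]" suffix.
FRAME = re.compile(
    r"kernel:\s+(?:\? )?(?P<sym>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?: \[(?P<mod>\w+)\])?"
)


def modules_in_trace(lines):
    """Count stack frames per kernel module; untagged frames are core kernel."""
    counts = Counter()
    for line in lines:
        match = FRAME.search(line)
        if match:
            counts[match["mod"] or "(core kernel)"] += 1
    return counts


# Illustrative input: a few frames copied from the traces in this report.
sample_trace = [
    "Nov 27 09:43:52 surprise kernel: Call Trace:",
    "Nov 27 09:43:52 surprise kernel: synchronize_rcu+0x67/0x70",
    "Nov 27 09:43:52 surprise kernel: pmc_reprogram_counter+0x105/0x190 [kvm]",
    "Nov 27 09:43:52 surprise kernel: amd_pmu_set_msr+0x17d/0x190 [kvm_amd]",
    "Nov 27 09:43:52 surprise kernel: ? kvm_set_cr8+0x22/0x40 [kvm]",
]

for module, n in modules_in_trace(sample_trace).most_common():
    print(f"{module}: {n} frame(s)")
```

Frames marked with `?` are counted the same as the others here, although the kernel prints that marker for addresses it found on the stack but could not confirm as part of the call chain.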
