watchdog: BUG: soft lockup on Threadripper 2950X
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
linux (Ubuntu) | Confirmed | Undecided | Unassigned | | |
Bug Description
Description: Ubuntu 20.04.2 LTS
Release: 20.04
I've suddenly been seeing a number of crashes today on my Threadripper 2950X box, after the system had been off over the weekend.
I suspect it may be tied to the Ubuntu 5.4.0-80.90-generic (5.4.124) kernel, as I wasn't seeing this last week or earlier.
Aug 2 16:52:14 threadripper kernel: [ 600.168436] watchdog: BUG: soft lockup - CPU#19 stuck for 22s! [kworker/
Aug 2 16:52:14 threadripper kernel: [ 600.168490] Modules linked in: veth xt_MASQUERADE nf_conntrack_
_nat br_netfilter bridge stp llc aufs overlay nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua snd_hda_
ledtrig_audio snd_hda_codec_hdmi eeepc_wmi snd_hda_intel edac_mce_amd snd_intel_dspcfg asus_wmi ftdi_sio snd_hda_codec kvm_amd usbserial sparse_keymap snd_
hda_core kvm video wmi_bmof snd_hwdep snd_pcm snd_timer snd ccp soundcore k10temp mac_hid nf_log_ipv6 ip6t_REJECT nf_reject_ipv6 xt_hl ip6t_rt nf_log_ipv4
nf_log_common ipt_REJECT nf_reject_ipv4 xt_LOG xt_limit xt_addrtype sch_fq_codel xt_tcpudp xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip6table
_filter ip6_tables iptable_filter bpfilter ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor
async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid uas usb_storage amdgpu
Aug 2 16:52:14 threadripper kernel: [ 600.168542] amd_iommu_v2 gpu_sched crct10dif_pclmul ttm crc32_pclmul ghash_clmulni_intel drm_kms_helper syscopyare
a aesni_intel crypto_simd mxm_wmi sysfillrect cryptd sysimgblt glue_helper fb_sys_fops igb drm dca i2c_piix4 ahci i2c_algo_bit libahci gpio_amdpt wmi gpio_
generic
Aug 2 16:52:14 threadripper kernel: [ 600.168558] CPU: 19 PID: 11301 Comm: kworker/19:0 Tainted: G L 5.4.0-80-generic #90-Ubuntu
Aug 2 16:52:14 threadripper kernel: [ 600.168559] Hardware name: System manufacturer System Product Name/ROG STRIX X399-E GAMING, BIOS 1203 10/09/2019
Aug 2 16:52:14 threadripper kernel: [ 600.168569] Workqueue: events free_work
Aug 2 16:52:14 threadripper kernel: [ 600.168574] RIP: 0010:smp_
Aug 2 16:52:14 threadripper kernel: [ 600.168576] Code: e8 50 10 92 00 3b 05 ae cf 70 01 89 c7 0f 83 9b fe ff ff 48 63 c7 48 8b 0b 48 03 0c c5 80 99 64 a
1 8b 41 18 a8 01 74 0a f3 90 <8b> 51 18 83 e2 01 75 f6 eb c8 89 cf 48 c7 c2 a0 b8 a4 a1 4c 89 fe
Aug 2 16:52:14 threadripper kernel: [ 600.168577] RSP: 0018:ffffb66b0a
Aug 2 16:52:14 threadripper kernel: [ 600.168579] RAX: 0000000000000003 RBX: ffff8de1fd4ebd40 RCX: ffff8de1fd0b2540
Aug 2 16:52:14 threadripper kernel: [ 600.168580] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000000000002
Aug 2 16:52:14 threadripper kernel: [ 600.168580] RBP: ffffb66b0aa17d40 R08: ffff8de1f6da7190 R09: 0000000000000003
Aug 2 16:52:14 threadripper kernel: [ 600.168581] R10: ffff8de1f6da7190 R11: 0000000000000002 R12: ffffffffa0281930
Aug 2 16:52:14 threadripper kernel: [ 600.168581] R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000080
Aug 2 16:52:14 threadripper kernel: [ 600.168583] FS: 000000000000000
Aug 2 16:52:14 threadripper kernel: [ 600.168583] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 2 16:52:14 threadripper kernel: [ 600.168584] CR2: 000055ea29edefd0 CR3: 00000009c500a000 CR4: 00000000003406e0
Aug 2 16:52:14 threadripper kernel: [ 600.168585] Call Trace:
Aug 2 16:52:14 threadripper kernel: [ 600.168592] ? load_new_
Aug 2 16:52:14 threadripper kernel: [ 600.168594] on_each_
Aug 2 16:52:14 threadripper kernel: [ 600.168596] flush_tlb_
Aug 2 16:52:14 threadripper kernel: [ 600.168597] __purge_
Aug 2 16:52:14 threadripper kernel: [ 600.168598] free_vmap_
Aug 2 16:52:14 threadripper kernel: [ 600.168600] remove_
Aug 2 16:52:14 threadripper kernel: [ 600.168602] __vunmap+0x5f/0x210
Aug 2 16:52:14 threadripper kernel: [ 600.168603] free_work+0x25/0x30
Aug 2 16:52:14 threadripper kernel: [ 600.168607] process_
Aug 2 16:52:14 threadripper kernel: [ 600.168609] worker_
Aug 2 16:52:14 threadripper kernel: [ 600.168611] kthread+0x104/0x140
Aug 2 16:52:14 threadripper kernel: [ 600.168612] ? process_
Aug 2 16:52:14 threadripper kernel: [ 600.168613] ? kthread_
Aug 2 16:52:14 threadripper kernel: [ 600.168617] ret_from_
Aug 2 16:52:40 threadripper kernel: [ 606.280524] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
Aug 2 16:52:40 threadripper kernel: [ 606.280567] rcu: 2-...0: (1 GPs behind) idle=ae6/
Aug 2 16:52:40 threadripper kernel: [ 606.280609] rcu: 18-...0: (1 GPs behind) idle=c8e/
Aug 2 16:52:40 threadripper kernel: [ 606.280659] (detected by 24, t=15002 jiffies, g=39017, q=5149545)
Aug 2 16:52:40 threadripper kernel: [ 606.280661] Sending NMI from CPU 24 to CPUs 2:
Aug 2 16:52:40 threadripper kernel: [ 616.204803] Sending NMI from CPU 24 to CPUs 18:
Aug 2 16:52:40 threadripper kernel: [ 626.131497] rcu: rcu_sched kthread starved for 4960 jiffies! g39017 f0x2 RCU_GP_DOING_FQS(6) ->state=0x0 ->cpu=7
Aug 2 16:52:40 threadripper kernel: [ 626.131554] rcu: RCU grace-period kthread stack dump:
Aug 2 16:52:40 threadripper kernel: [ 626.131577] rcu_sched R running task 0 11 2 0x80004000
Aug 2 16:52:40 threadripper kernel: [ 626.131580] Call Trace:
Aug 2 16:52:40 threadripper kernel: [ 626.131589] __schedule+
Aug 2 16:52:40 threadripper kernel: [ 626.131592] preempt_
Aug 2 16:52:40 threadripper kernel: [ 626.131594] _cond_resched+
Aug 2 16:52:40 threadripper kernel: [ 626.131597] force_qs_
Aug 2 16:52:40 threadripper kernel: [ 626.131598] ? synchronize_
Aug 2 16:52:40 threadripper kernel: [ 626.131600] rcu_gp_
Aug 2 16:52:40 threadripper kernel: [ 626.131604] kthread+0x104/0x140
Aug 2 16:52:40 threadripper kernel: [ 626.131605] ? kfree_call_
Aug 2 16:52:40 threadripper kernel: [ 626.131607] ? kthread_
Aug 2 16:52:40 threadripper kernel: [ 626.131608] ret_from_
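For anyone trying to check whether the lockups cluster on particular cores, here is a rough sketch that counts soft-lockup reports per CPU in a saved kernel log. It only matches the `watchdog: BUG: soft lockup - CPU#N stuck for Ns!` message format seen above. The function names are made up for illustration, and it is a triage aid, not a diagnosis.

```python
import re
from collections import Counter

# Matches watchdog lines such as:
#   watchdog: BUG: soft lockup - CPU#19 stuck for 22s! [kworker/
LOCKUP_RE = re.compile(
    r"watchdog: BUG: soft lockup - CPU#(\d+) stuck for (\d+)s!"
)


def find_soft_lockups(lines):
    """Return a list of (cpu, seconds) tuples, one per soft-lockup line."""
    hits = []
    for line in lines:
        match = LOCKUP_RE.search(line)
        if match:
            hits.append((int(match.group(1)), int(match.group(2))))
    return hits


def lockups_per_cpu(lines):
    """Count soft-lockup reports per CPU number."""
    return Counter(cpu for cpu, _seconds in find_soft_lockups(lines))
```

Both functions accept any iterable of log lines, so an open file handle works directly. Which file to feed it depends on the system's logging setup (on a stock Ubuntu install, `/var/log/kern.log` is a likely candidate, but that is an assumption here).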
ProblemType: Bug
DistroRelease: Ubuntu 20.04
Package: linux-image-
ProcVersionSign
Uname: Linux 5.4.0-80-generic x86_64
AlsaVersion: Advanced Linux Sound Architecture Driver Version k5.4.0-80-generic.
ApportVersion: 2.20.11-
Architecture: amd64
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/by-path', '/dev/snd/
Card0.Amixer.info:
Card hw:0 'Generic'/'HD-Audio Generic at 0xba600000 irq 96'
Mixer name : 'Realtek ALC1220'
Components : 'HDA:10ec1168,
Controls : 46
Simple ctrls : 20
Card1.Amixer.info:
Card hw:1 'HDMI'/'HDA ATI HDMI at 0x9f860000 irq 98'
Mixer name : 'ATI R6xx HDMI'
Components : 'HDA:1002aa01,
Controls : 14
Simple ctrls : 2
CasperMD5CheckR
Date: Mon Aug 2 19:09:24 2021
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
MachineType: System manufacturer System Product Name
ProcEnviron:
TERM=screen.
PATH=(custom, no user)
LANG=en_US.UTF-8
SHELL=/bin/bash
ProcFB: 0 amdgpudrmfb
ProcKernelCmdLine: BOOT_IMAGE=
RelatedPackageV
linux-
linux-
linux-firmware 1.187.15
RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
SourcePackage: linux
UpgradeStatus: Upgraded to focal on 2021-01-23 (191 days ago)
dmi.bios.date: 10/09/2019
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 1203
dmi.board.
dmi.board.name: ROG STRIX X399-E GAMING
dmi.board.vendor: ASUSTeK COMPUTER INC.
dmi.board.version: Rev 1.xx
dmi.chassis.
dmi.chassis.type: 3
dmi.chassis.vendor: Default string
dmi.chassis.
dmi.modalias: dmi:bvnAmerican
dmi.product.family: To be filled by O.E.M.
dmi.product.name: System Product Name
dmi.product.sku: SKU
dmi.product.
dmi.sys.vendor: System manufacturer