watchdog: BUG: soft lockup - CPU#10 stuck for 22s! [gnome-shell:1112]

Bug #1796385 reported by Bounty
Affects         Status       Importance   Assigned to   Milestone
Linux           Confirmed    Medium
linux (Ubuntu)  Incomplete   Medium       Unassigned

Bug Description

I can use the system for a while, then at random, the screen blinks and freezes. Must reboot.
Seems to happen both with Wayland and Xorg.

ProblemType: KernelOops
DistroRelease: Ubuntu 18.10
Package: linux-image-4.18.0-8-generic 4.18.0-8.9
ProcVersionSignature: Ubuntu 4.18.0-8.9-generic 4.18.7
Uname: Linux 4.18.0-8-generic x86_64
Annotation: Your system might become unstable now and might need to be restarted.
ApportVersion: 2.20.10-0ubuntu11
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: greg 2039 F.... pulseaudio
 /dev/snd/controlC1: greg 2039 F.... pulseaudio
Date: Tue Oct 2 15:56:23 2018
Failure: oops
InstallationDate: Installed on 2018-09-28 (7 days ago)
InstallationMedia: Ubuntu 18.10 "Cosmic Cuttlefish" - Beta amd64 (20180927)
IwConfig:
 lo no wireless extensions.

 eno1 no wireless extensions.
MachineType: Gigabyte Technology Co., Ltd. Default string
ProcFB: 0 radeondrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.18.0-8-generic root=UUID=ce06b10d-2a7f-49db-a15b-85554d9a7e4d ro quiet splash vt.handoff=1
PulseList: Error: command ['pacmd', 'list'] failed with exit code 1: No PulseAudio daemon running, or not running as session daemon.
RelatedPackageVersions: kerneloops-daemon N/A
RfKill:

SourcePackage: linux
Title: watchdog: BUG: soft lockup - CPU#10 stuck for 22s! [gnome-shell:1112]
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 06/07/2017
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: F23
dmi.board.asset.tag: Default string
dmi.board.name: X99-UD4-CF
dmi.board.vendor: Gigabyte Technology Co., Ltd.
dmi.board.version: x.x
dmi.chassis.asset.tag: Default string
dmi.chassis.type: 3
dmi.chassis.vendor: Default string
dmi.chassis.version: Default string
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvrF23:bd06/07/2017:svnGigabyteTechnologyCo.,Ltd.:pnDefaultstring:pvrDefaultstring:rvnGigabyteTechnologyCo.,Ltd.:rnX99-UD4-CF:rvrx.x:cvnDefaultstring:ct3:cvrDefaultstring:
dmi.product.family: Default string
dmi.product.name: Default string
dmi.product.sku: Default string
dmi.product.version: Default string
dmi.sys.vendor: Gigabyte Technology Co., Ltd.

Revision history for this message
Bounty (gregr-arsfabula) wrote :
Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
Revision history for this message
Cristian Aravena Romero (caravena) wrote :

Did this issue start happening after an update/upgrade? Was there a prior kernel version where you were not having this particular problem?

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds. Please test the latest v4.19-rc7 kernel [0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.19-rc7/
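
For reference, a minimal sketch of installing a mainline test kernel on an amd64 system; the .deb file names vary per build, so take them from the index page above rather than from this example:

# Download the kernel .deb packages for amd64 listed on the v4.19-rc7 page
# above into an empty directory, then install them together and reboot:
cd "$(mktemp -d)"
# ...download the .deb files here (e.g. with wget)...
sudo dpkg -i ./linux-*.deb
sudo reboot
# After the reboot, confirm the running kernel before re-testing:
uname -r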

Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Changed in linux (Ubuntu):
importance: Undecided → Medium
Revision history for this message
Bounty (gregr-arsfabula) wrote :

This issue started with the upgrade to the 18.10 beta; I was not having it before.
I installed v4.19-rc7 and still have the issue, though it seems to happen less frequently.

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Bounty (gregr-arsfabula)
tags: added: kernel-bug-exists-upstream
Revision history for this message
In , caravena (caravena-linux-kernel-bugs) wrote :

Hello,

Bug opened in launchpad.net:
https://bugs.launchpad.net/bugs/1796385

"I can use the system for a while, then at random, the screen blinks and freezes. Must reboot.
Seems to happen both with Wayland and Xorg."

Best regards,
--
Cristian Aravena Romero (caravena)

Revision history for this message
Cristian Aravena Romero (caravena) wrote :

https://bugzilla.kernel.org/show_bug.cgi?id=201379
--
Cristian Aravena Romero (caravena)

tags: added: rls-cc-incoming
Revision history for this message
In , linux (linux-linux-kernel-bugs) wrote :

Talk about shooting the messenger. The kernel's soft-lockup detector (kernel/watchdog.c) is reporting a stalled CPU; it only reports the problem, it does not cause it. It is also not a problem with the watchdog driver subsystem, which is not even involved.
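
As a side note, the detector's configuration can be inspected on the affected machine with the standard kernel sysctls; a small sketch (a soft lockup is reported when a CPU hogs the kernel for about 2 * watchdog_thresh seconds):

sysctl kernel.watchdog kernel.soft_watchdog kernel.nmi_watchdog
sysctl kernel.watchdog_thresh
# Optionally make the next soft lockup fatal so a crash dump can be captured:
# sudo sysctl kernel.softlockup_panic=1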

Revision history for this message
In , caravena (caravena-linux-kernel-bugs) wrote :

@Guenter,

Could you move this report to the appropriate product/component in bugzilla, if it does not belong under 'watchdog'?

Best regards,
--
Cristian Aravena Romero (caravena)

Revision history for this message
In , linux (linux-linux-kernel-bugs) wrote :

The bug in launchpad, unless I am missing something, does not provide a single actionable traceback. I don't think it is even possible to identify where exactly the CPU hangs unless additional information is provided. There is no traceback in dmesg, and OopsText doesn't include it either.

Given that, it is not possible to identify the responsible subsystem, much less to fix the underlying problem. The only thing we can say for sure is that it is _not_ a watchdog driver problem.
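
One way to capture that missing traceback on the next occurrence, as a rough sketch assuming systemd's persistent journal can be enabled on the reporter's machine:

# Enable a persistent journal so kernel messages survive the forced reboot:
sudo mkdir -p /var/log/journal
sudo systemctl restart systemd-journald
# After the next freeze and reboot, pull the kernel log of the previous boot
# and look for the soft lockup splat and its call trace:
journalctl -k -b -1 | grep -B 5 -A 40 'soft lockup'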

Revision history for this message
In , linux (linux-linux-kernel-bugs) wrote :

Also, I don't think I have permission to change any of the bug status fields.

Revision history for this message
In , caravena (caravena-linux-kernel-bugs) wrote :

@Guenter,

I can change it, but I do not know which 'Product' and 'Component' to use.

Best regards,
--
Cristian Aravena Romero (caravena)

Revision history for this message
Cristian Aravena Romero (caravena) wrote :

https://bugzilla.kernel.org/show_bug.cgi?id=201379#c3

We are missing the 'Call Trace'.
--
Cristian Aravena Romero (caravena)

Revision history for this message
In , linux (linux-linux-kernel-bugs) wrote :

Unfortunately we do not have information to determine 'Product' and 'Component'.

The only information we have is that the hanging process is gnome-shell (or at least that this was the case in at least one instance), that the screen blinks and freezes when the problem is observed, and that the hanging CPU served most of the graphics card interrupts. If it is persistent, it _might_ suggest that graphics (presumably the Radeon graphics driver and/or the graphics hardware) is involved. This would be even more likely if the observed PCIe errors point to the graphics card (not sure if the information provided shows the PCIe bus tree; if so I have not found it).
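
A few rough checks along those lines, as a sketch (the grep patterns assume the radeon driver reported in ProcFB above):

grep -i radeon /proc/interrupts   # which CPU column is absorbing the GPU interrupts
lspci -tv                         # PCIe bus tree, to match any reported PCIe/AER errors
journalctl -k | grep -iE 'AER|pcie bus error|radeon'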

Revision history for this message
Cristian Aravena Romero (caravena) wrote :

https://bugzilla.kernel.org/show_bug.cgi?id=201379#c6

"Unfortunately we do not have information to determine 'Product' and 'Component'.

The only information we have is that the hanging process is gnome-shell (or at least that this was the case in at least one instance), that the screen blinks and freezes when the problem is observed, and that the hanging CPU served most of the graphics card interrupts. If it is persistent, it _might_ suggest that graphics (presumably the Radeon graphics driver and/or the graphics hardware) is involved. This would be even more likely if the observed PCIe errors point to the graphics card (not sure if the information provided shows the PCIe bus tree; if so I have not found it)."
--
Cristian Aravena Romero (caravena)

Revision history for this message
Cristian Aravena Romero (caravena) wrote :

@Bounty

Could you temporarily change the video card to rule out problems with it?

Your current video card is:
[AMD/ATI] Curacao PRO [Radeon R7 370 / R9 270/370 OEM]

Best regards,
--
Cristian Aravena Romero (caravena)
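
Before swapping hardware, it may also help to confirm which card and kernel driver are actually bound; a small sketch:

lspci -nnk | grep -A 3 -i 'vga\|3d'
# The "Kernel driver in use:" line should name radeon (or amdgpu) for this card.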

Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Revision history for this message
Cristian Aravena Romero (caravena) wrote :

Hello,

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1796385/comments/6

This suggests another error -> Bug 1797625
--
Cristian Aravena Romero (caravena)

Revision history for this message
Bounty (gregr-arsfabula) wrote :

Hello,

I won't be able to test another video card quickly, sorry about that.

Greg

Changed in linux:
importance: Unknown → Medium
status: Unknown → Confirmed
Revision history for this message
Amer Hwitat (amer.hwitat) wrote :
Revision history for this message
Amer Hwitat (amer.hwitat) wrote :

Message from syslogd@amer at Jan 27 19:26:19 ...
 kernel:NMI watchdog: BUG: soft lockup - CPU#5 stuck for 26s! [swapper/5:0]

Message from syslogd@amer at Jan 27 19:26:19 ...
 kernel:NMI watchdog: BUG: soft lockup - CPU#1 stuck for 27s! [dmeventd:71548]

Message from syslogd@amer at Jan 27 19:27:30 ...
 kernel:NMI watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [6_scheduler:64928]

Message from syslogd@amer at Jan 27 19:31:25 ...
 kernel:NMI watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [ksoftirqd/5:34]

Message from syslogd@amer at Jan 27 19:32:42 ...
 kernel:NMI watchdog: BUG: soft lockup - CPU#3 stuck for 33s! [swift-object-up:11358]

Message from syslogd@amer at Jan 27 19:33:55 ...
 kernel:NMI watchdog: BUG: soft lockup - CPU#3 stuck for 24s! [dmeventd:71548]

Message from syslogd@amer at Jan 27 19:34:25 ...
 kernel:NMI watchdog: BUG: soft lockup - CPU#2 stuck for 65s! [kworker/2:0:59993]

Message from syslogd@amer at Jan 27 19:37:50 ...
 kernel:NMI watchdog: BUG: soft lockup - CPU#2 stuck for 24s! [kworker/u256:3:8447]

Message from syslogd@amer at Jan 27 19:37:50 ...
 kernel:NMI watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [ksoftirqd/5:34]

Message from syslogd@amer at Jan 27 19:37:51 ...
 kernel:NMI watchdog: BUG: soft lockup - CPU#0 stuck for 21s! [systemd:11968]

The CPU has been disabled by the guest operating system. Power off or reset the virtual machine.

Revision history for this message
In , amer.hwaitat (amer.hwaitat-linux-kernel-bugs) wrote :

Message from syslogd@amer at Jan 27 19:26:19 ...
 kernel:NMI watchdog: BUG: soft lockup - CPU#5 stuck for 26s! [swapper/5:0]

Message from syslogd@amer at Jan 27 19:26:19 ...
 kernel:NMI watchdog: BUG: soft lockup - CPU#1 stuck for 27s! [dmeventd:71548]

Message from syslogd@amer at Jan 27 19:27:30 ...
 kernel:NMI watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [6_scheduler:64928]

Message from syslogd@amer at Jan 27 19:31:25 ...
 kernel:NMI watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [ksoftirqd/5:34]

Message from syslogd@amer at Jan 27 19:32:42 ...
 kernel:NMI watchdog: BUG: soft lockup - CPU#3 stuck for 33s! [swift-object-up:11358]

Message from syslogd@amer at Jan 27 19:33:55 ...
 kernel:NMI watchdog: BUG: soft lockup - CPU#3 stuck for 24s! [dmeventd:71548]

Message from syslogd@amer at Jan 27 19:34:25 ...
 kernel:NMI watchdog: BUG: soft lockup - CPU#2 stuck for 65s! [kworker/2:0:59993]

Message from syslogd@amer at Jan 27 19:37:50 ...
 kernel:NMI watchdog: BUG: soft lockup - CPU#2 stuck for 24s! [kworker/u256:3:8447]

Message from syslogd@amer at Jan 27 19:37:50 ...
 kernel:NMI watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [ksoftirqd/5:34]

Message from syslogd@amer at Jan 27 19:37:51 ...
 kernel:NMI watchdog: BUG: soft lockup - CPU#0 stuck for 21s! [systemd:11968]

The CPU has been disabled by the guest operating system. Power off or reset the virtual machine.

On my side I upgraded to VMware 14 instead of using VMware 12, on RHEL 7.6 (Maipo).

Revision history for this message
Amer Hwitat (amer.hwitat) wrote :

[root@localhost network-scripts]# systemctl status network -l
● network.service - LSB: Bring up/down networking
   Loaded: loaded (/etc/rc.d/init.d/network; bad; vendor preset: disabled)
   Active: failed (Result: exit-code) since Sat 2019-01-19 03:47:01 EST; 21s ago
     Docs: man:systemd-sysv-generator(8)
  Process: 86319 ExecStop=/etc/rc.d/init.d/network stop (code=exited, status=0/SUCCESS)
  Process: 86591 ExecStart=/etc/rc.d/init.d/network start (code=exited, status=1/FAILURE)
    Tasks: 0

Jan 19 03:47:01 localhost.localdomain dhclient[86963]: Please report for this software via the Red Hat Bugzilla site:
Jan 19 03:47:01 localhost.localdomain dhclient[86963]: http://bugzilla.redhat.com
Jan 19 03:47:01 localhost.localdomain dhclient[86963]: ution.
Jan 19 03:47:01 localhost.localdomain dhclient[86963]: exiting.
Jan 19 03:47:01 localhost.localdomain network[86591]: failed.
Jan 19 03:47:01 localhost.localdomain network[86591]: [FAILED]
Jan 19 03:47:01 localhost.localdomain systemd[1]: network.service: control process exited, code=exited status=1
Jan 19 03:47:01 localhost.localdomain systemd[1]: Failed to start LSB: Bring up/down networking.
Jan 19 03:47:01 localhost.localdomain systemd[1]: Unit network.service entered failed state.
Jan 19 03:47:01 localhost.localdomain systemd[1]: network.service failed.
[root@localhost network-scripts]#

[root@localhost log]#
Message from syslogd@localhost at Jan 23 02:23:31 ...
 kernel:NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [ovsdb-server:10088]

[root@amer network-scripts]#
Message from syslogd@amer at Jan 27 12:46:38 ...
 kernel:NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [nova-api:102738]

Message from syslogd@amer at Jan 27 19:26:19 ...
 kernel:NMI watchdog: BUG: soft lockup - CPU#5 stuck for 26s! [swapper/5:0]

Message from syslogd@amer at Jan 27 19:26:19 ...
 kernel:NMI watchdog: BUG: soft lockup - CPU#1 stuck for 27s! [dmeventd:71548]

Message from syslogd@amer at Jan 27 19:27:30 ...
 kernel:NMI watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [6_scheduler:64928]

Message from syslogd@amer at Jan 27 19:31:25 ...
 kernel:NMI watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [ksoftirqd/5:34]

Message from syslogd@amer at Jan 27 19:32:42 ...
 kernel:NMI watchdog: BUG: soft lockup - CPU#3 stuck for 33s! [swift-object-up:11358]

Message from syslogd@amer at Jan 27 19:33:55 ...
 kernel:NMI watchdog: BUG: soft lockup - CPU#3 stuck for 24s! [dmeventd:71548]

Message from syslogd@amer at Jan 27 19:34:25 ...
 kernel:NMI watchdog: BUG: soft lockup - CPU#2 stuck for 65s! [kworker/2:0:59993]

Message from syslogd@amer at Jan 27 19:37:50 ...
 kernel:NMI watchdog: BUG: soft lockup - CPU#2 stuck for 24s! [kworker/u256:3:8447]

Message from syslogd@amer at Jan 27 19:37:50 ...
 kernel:NMI watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [ksoftirqd/5:34]

Message from syslogd@amer at Jan 27 19:37:51 ...
 kernel:NMI watchdog: BUG: soft lockup - CPU#0 stuck for 21s! [systemd:11968]

The CPU has been disabled by the guest operating system. Power off or reset the virtual machine.

Revision history for this message
In , tony.freeman (tony.freeman-linux-kernel-bugs) wrote :

I'm seeing the same here.

Using Red Hat 7.7

kernel version: 3.10.0-1062.8.1.el7.x86_64

All 10 of my machines are on the same hardware and RHEL 7 release; just this one machine is reporting the problem. Eventually the machine becomes unusable and a hard reboot is needed.

Revision history for this message
In , amer.hwaitat (amer.hwaitat-linux-kernel-bugs) wrote :

I had the same problem with OSP on RHEL 7.6.

In my case it turned out to be a network problem. If that seems relevant, please check network connectivity and/or I/O latency on your VM, and check the logs for both.

good luck
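
A few quick checks in that direction, as a sketch (iostat comes from the sysstat package, and the address below is only a hypothetical example):

ping -c 5 192.168.122.1            # substitute your actual gateway or peer address
iostat -x 5 3                      # per-device I/O latency (await) and utilisation
dmesg -T | grep -i 'soft lockup'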

Revision history for this message
In , amer.hwaitat (amer.hwaitat-linux-kernel-bugs) wrote :

You can also check the RAID card, since a defective controller is a common cause of problems on servers, and check disk I/O latency: a failing disk will hurt performance. I have also seen this on HDDs for services that really needed an SSD, as a vendor had recommended to me.

I checked messages and audit.log for errors; if you are using OSP on your machines, check the related nova.log ... etc.

BR
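
Possible follow-up checks for the disk/RAID theory, sketched with smartmontools (/dev/sda below is only an example device):

sudo smartctl -H -a /dev/sda       # overall health plus full SMART attributes
grep -iE 'i/o error|ata[0-9]+.*error' /var/log/messages | tail -n 50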

Revision history for this message
In , tony.freeman (tony.freeman-linux-kernel-bugs) wrote :

I opened up the server, re-seated the cards, and vacuumed it out a couple of days ago. I'll know whether things are okay next week.

Revision history for this message
In , amer.hwaitat (amer.hwaitat-linux-kernel-bugs) wrote :

Hi,

grep -i error /var/log/messages >> messages-errors.txt
grep -i error /var/log/nova/nova.log >> nova-errors.txt
grep -i error /var/log/audit/audit.log >> audit-errors.txt

You will find the .txt files in the directory you run the commands from.

Take time to trace the errors; maybe you will find answers.

If you have Dell PowerEdge servers, RAID controller (RC) problems are common.

Otherwise you may have to check the network connections between the server and the switch, replace the UTP cables, and ping the other servers. RabbitMQ in OSP depends on a heartbeat to stay in sync between servers; if the heartbeat fails, it causes this.

Best regards
Amer Hwitat
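
If the RabbitMQ heartbeat theory applies, the cluster state can be checked from one of the OSP controllers; a sketch (the peer hostname is a placeholder):

sudo rabbitmqctl cluster_status          # a non-empty "partitions" section indicates a split
ping -c 5 other-controller.example.com   # placeholder; substitute a real peer controller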

Revision history for this message
In , amer.hwaitat (amer.hwaitat-linux-kernel-bugs) wrote :

Hi,

You can also check the journal with journalctl:

journalctl -l | grep -i error > journal-errors.txt

cheers

Revision history for this message
In , tony.freeman (tony.freeman-linux-kernel-bugs) wrote :

Thanks ... I reviewed the log files this morning and had a look at the output from journalctl. It appears the system is good to go. I guess going through, blowing out the machine, and re-seating everything in its slots helped. Thank you for your assistance!
