[radeon] machine crash after GPU reset
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
linux (Ubuntu) | Expired | Medium | Unassigned |
Bug Description
I showed up to work today (after being away for a couple of weeks) and my machine had crashed.
I'm not sure why it happened this morning at 7am. My only guess is that a janitor came in and touched the mouse, which woke the screen up after a couple of weeks of sleep.
(I have been using the machine remotely for computations, which finished yesterday, but the building was closed, so I'm guessing this is the first time the screen has come on in a while.)
The first unusual entries in my kern.log are:
Aug 20 07:02:43 winters kernel: [2218007.668454] radeon 0000:03:00.0: ring 0 stalled for more than 10248msec
Aug 20 07:02:43 winters kernel: [2218007.668465] radeon 0000:03:00.0: GPU lockup (current fence id 0x00000000000799d4 last fence id 0x00000000000799d6 on ring 0)
They are plentiful, and are mixed with other messages (see attached log file) such as:
Aug 20 07:03:00 winters kernel: [2218025.438173] radeon 0000:03:00.0: failed VCE resume (-110).
Aug 20 07:03:01 winters kernel: [2218025.721068] [drm:r600_ring_test [radeon]] *ERROR* radeon: ring 0 test failed (scratch(
Aug 20 07:03:01 winters kernel: [2218025.721085] [drm:si_resume [radeon]] *ERROR* si startup failed on resume
Aug 20 07:03:01 winters kernel: [2218025.722887] WARNING: CPU: 18 PID: 3715 at /build/
.
.
.
Aug 20 07:03:38 winters kernel: [2218062.876285] [drm:atom_op_jump [radeon]] *ERROR* atombios stuck in loop for more than 5secs aborting
Aug 20 07:03:38 winters kernel: [2218062.876298] [drm:atom_
Aug 20 07:03:38 winters kernel: [2218062.876309] [drm:atom_
The last message before death appears to be:
Aug 20 07:10:02 winters kernel: [2218447.006362] radeon 0000:03:00.0: GPU reset succeeded, trying to resume
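For anyone triaging a similar log, the messages quoted above can be tallied with a short script. The following is only a rough sketch: the label names and the `summarize` helper are hypothetical, the substrings are copied from the kern.log excerpts in this report, and the line format is assumed to be the syslog style shown here.

```python
import re
from collections import Counter

# Matches syslog-style kernel lines such as:
# "Aug 20 07:02:43 winters kernel: [2218007.668454] radeon 0000:03:00.0: ..."
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+[\d:]{8})\s+\S+\s+kernel:\s+"
    r"\[\s*(?P<uptime>\d+\.\d+)\]\s+(?P<msg>.*)$"
)

# Hypothetical labels mapped to substrings taken from the messages in this report.
PATTERNS = {
    "ring_stalled": "stalled for more than",
    "gpu_lockup": "GPU lockup",
    "vce_resume_failed": "failed VCE resume",
    "ring_test_failed": "test failed",
    "atombios_stuck": "atombios stuck in loop",
    "reset_succeeded": "GPU reset succeeded",
}


def summarize(lines):
    """Count radeon-related messages and record when each kind first appeared."""
    counts = Counter()
    first_seen = {}
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if not match:
            continue  # not a kernel line in the assumed format
        msg = match.group("msg")
        if "radeon" not in msg:
            continue  # ignore messages from other drivers
        for label, needle in PATTERNS.items():
            if needle in msg:
                counts[label] += 1
                # Keep only the earliest timestamp for each label.
                first_seen.setdefault(label, match.group("stamp"))
    return counts, first_seen
```

Passing an open handle on the log, for example `summarize(open("/var/log/kern.log", errors="replace"))`, returns the per-message counts and the first timestamp of each kind, which makes it easier to see whether the stall messages precede the failed resume attempts.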
ProblemType: Bug
DistroRelease: Ubuntu 18.04
Package: xserver-
ProcVersionSign
Uname: Linux 4.15.0-32-generic x86_64
NonfreeKernelMo
.tmp.unity_
ApportVersion: 2.20.9-0ubuntu7.2
Architecture: amd64
CompizPlugins: No value set for `/apps/
CompositorRunning: None
CurrentDesktop: ubuntu:GNOME
Date: Mon Aug 20 10:18:42 2018
DistUpgraded: 2018-05-02 12:54:58,714 DEBUG icon theme changed, re-reading
DistroCodename: bionic
DistroVariant: ubuntu
DkmsStatus:
bcmwl, 6.30.223.271+bdcom, 4.15.0-29-generic, x86_64: installed
bcmwl, 6.30.223.271+bdcom, 4.15.0-30-generic, x86_64: installed
bcmwl, 6.30.223.271+bdcom, 4.15.0-32-generic, x86_64: installed
ExtraDebuggingI
GraphicsCard:
Advanced Micro Devices, Inc. [AMD/ATI] Oland GL [FirePro W2100] [1002:6608] (prog-if 00 [VGA controller])
Subsystem: Dell Oland GL [FirePro W2100] [1028:2120]
InstallationDate: Installed on 2018-01-05 (226 days ago)
InstallationMedia: Ubuntu 17.04 "Zesty Zapus" - Release amd64 (20170412)
MachineType: Dell Inc. Precision Tower 7810
ProcKernelCmdLine: BOOT_IMAGE=
SourcePackage: xserver-
UpgradeStatus: Upgraded to bionic on 2018-05-02 (109 days ago)
dmi.bios.date: 06/25/2018
dmi.bios.vendor: Dell Inc.
dmi.bios.version: A27
dmi.board.name: 0KJCC5
dmi.board.vendor: Dell Inc.
dmi.board.version: A00
dmi.chassis.type: 7
dmi.chassis.vendor: Dell Inc.
dmi.modalias: dmi:bvnDellInc.
dmi.product.name: Precision Tower 7810
dmi.sys.vendor: Dell Inc.
version.compiz: compiz 1:0.9.13.
version.libdrm2: libdrm2 2.4.91-2
version.
xserver.bootTime: Fri Jan 5 15:01:43 2018
xserver.configfile: default
xserver.errors:
xserver.logfile: /var/log/Xorg.0.log
xserver.version: 2:1.19.3-1ubuntu1.3
xserver.
Indeed, the problem appears to be low-level, so I am reassigning it to the kernel.