[amdgpu] Xorg freeze
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
linux (Ubuntu) | Confirmed | Undecided | Unassigned |
xserver-xorg-video-amdgpu (Ubuntu) | New | Undecided | Unassigned |
Bug Description
I have a Radeon Pro WX 7100. It hangs twice a day.
I narrowed this issue down to the fan speed being locked. After a reboot, the fan always runs at 18%, regardless of load. If I install Radeon Profile, force the fan to 100%, and then switch back to automatic fan control, the fan reacts to the load.
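The fan state described above can also be checked without Radeon Profile by reading the amdgpu hwmon files in sysfs. The following is a minimal sketch, assuming the card is exposed as `card0` and that the driver provides `pwm1`, `pwm1_enable` and (optionally) `pwm1_max` as described in the kernel's amdgpu documentation. The function names are hypothetical, and the path may differ on other systems.

```python
from pathlib import Path

# Meaning of pwm1_enable as documented for the amdgpu hwmon interface.
PWM_MODES = {0: "no fan control (full speed)", 1: "manual", 2: "automatic"}


def find_hwmon_dirs(card="card0"):
    """List hwmon directories for a DRM card (the card name is an assumption)."""
    return sorted(Path(f"/sys/class/drm/{card}/device/hwmon").glob("hwmon*"))


def read_fan_state(hwmon_dir):
    """Return (control mode, fan duty in percent) for one hwmon directory."""
    d = Path(hwmon_dir)
    mode = int((d / "pwm1_enable").read_text().strip())
    pwm = int((d / "pwm1").read_text().strip())
    # Fall back to 255 when pwm1_max is not exposed (assumed default range).
    max_file = d / "pwm1_max"
    pwm_max = int(max_file.read_text().strip()) if max_file.exists() else 255
    return PWM_MODES.get(mode, f"unknown ({mode})"), round(100 * pwm / pwm_max)
```

Calling `read_fan_state(d)` for each directory returned by `find_hwmon_dirs()` right after boot and again under load would show whether the duty cycle stays fixed and which control mode the driver reports.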
ProblemType: Bug
DistroRelease: Ubuntu 18.04
Package: xorg 1:7.7+19ubuntu7.1
ProcVersionSign
Uname: Linux 5.3.0-62-generic x86_64
ApportVersion: 2.20.9-0ubuntu7.15
Architecture: amd64
BootLog: Error: [Errno 13] Permission denied: '/var/log/boot.log'
CompositorRunning: None
CurrentDesktop: ubuntu:GNOME
Date: Mon Jul 20 09:15:03 2020
DistUpgraded: Fresh install
DistroCodename: bionic
DistroVariant: ubuntu
ExtraDebuggingI
GpuHangFrequency: Very infrequently
GraphicsCard:
Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon Pro WX 7100] [1002:67c4] (prog-if 00 [VGA controller])
Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon Pro WX 7100] [1002:0b0d]
InstallationDate: Installed on 2020-06-29 (20 days ago)
InstallationMedia: Ubuntu 18.04.3 LTS "Bionic Beaver" - Release amd64 (20190805)
MachineType: System manufacturer System Product Name
ProcKernelCmdLine: BOOT_IMAGE=
SourcePackage: xorg
Symptom: display
Title: Xorg freeze
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 06/17/2020
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 7803
dmi.board.
dmi.board.name: ROG CROSSHAIR VI HERO (WI-FI AC)
dmi.board.vendor: ASUSTeK COMPUTER INC.
dmi.board.version: Rev 1.xx
dmi.chassis.
dmi.chassis.type: 3
dmi.chassis.vendor: Default string
dmi.chassis.
dmi.modalias: dmi:bvnAmerican
dmi.product.family: To be filled by O.E.M.
dmi.product.name: System Product Name
dmi.product.sku: SKU
dmi.product.
dmi.sys.vendor: System manufacturer
version.compiz: compiz N/A
version.libdrm2: libdrm2 2.4.101-2~18.04.1
version.
version.
version.
version.
version.
version.
version.
This looks like a hardware or driver problem with the Radeon Pro WX 7100. Your logs show two issues:
1. Xorg logs show the monitor being redetected repeatedly, which suggests the monitor is not plugged in properly. Maybe try a new monitor cable.
2. The kernel is logging more serious issues:
[ 107.569355] amdgpu 0000:0a:00.0: GPU fault detected: 146 0x0c38480c for process mksReplay pid 2764 thread main-mks pid 2772
[ 107.569358] amdgpu 0000:0a:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x0030C387
[ 107.569359] amdgpu 0000:0a:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0804800C
[ 107.569360] amdgpu 0000:0a:00.0: VM fault (0x0c, vmid 4, pasid 32785) at page 3195783, read from 'TC4' (0x54433400) (72)
[ 107.569365] amdgpu 0000:0a:00.0: GPU fault detected: 146 0x0c38880c for process mksReplay pid 2764 thread main-mks pid 2772
[ 107.569366] amdgpu 0000:0a:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x0030C38E
[ 107.569367] amdgpu 0000:0a:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0808400C
[ 107.569368] amdgpu 0000:0a:00.0: VM fault (0x0c, vmid 4, pasid 32785) at page 3195790, read from 'TC7' (0x54433700) (132)
[ 107.597584] amdgpu 0000:0a:00.0: GPU fault detected: 146 0x0c38480c for process mksReplay pid 2764 thread main-mks pid 2772
[ 107.597587] amdgpu 0000:0a:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x0030C387
[ 107.597588] amdgpu 0000:0a:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0E04800C
[ 107.597590] amdgpu 0000:0a:00.0: VM fault (0x0c, vmid 7, pasid 32785) at page 3195783, read from 'TC4' (0x54433400) (72)
[ 107.597595] amdgpu 0000:0a:00.0: GPU fault detected: 146 0x0c38880c for process mksReplay pid 2764 thread main-mks pid 2772
[ 107.597596] amdgpu 0000:0a:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x0030C38D
[ 107.597597] amdgpu 0000:0a:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0E04400C
[ 107.597598] amdgpu 0000:0a:00.0: VM fault (0x0c, vmid 7, pasid 32785) at page 3195789, read from 'TC5' (0x54433500) (68)
[
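The hex values in the kernel log above relate to the decoded "VM fault" lines in a simple way, which can help when comparing faults across reboots. The sketch below is only illustrative: the bit positions are inferred from the values in this log (for example, status 0x0804800C decoding to protections 0x0c, client id 72, vmid 4) and should be treated as assumptions, not as the driver's definition. The function names are hypothetical.

```python
def decode_fault_status(status):
    """Split a PROTECTION_FAULT_STATUS word into the fields the log prints.

    Bit layout is an assumption that matches the values seen in this log.
    """
    return {
        "protections": status & 0xFF,                # printed as 0x0c
        "memory_client_id": (status >> 12) & 0x1FF,  # printed as (72), (132), (68)
        "vmid": (status >> 25) & 0xF,                # printed as vmid 4 / vmid 7
    }


def client_name(tag):
    """Unpack the 32-bit client tag into ASCII, e.g. 0x54433400 -> 'TC4'."""
    return tag.to_bytes(4, "big").rstrip(b"\0").decode("ascii")


def fault_page(addr):
    """The fault address register holds the page number shown in decimal."""
    return int(addr)
```

For the first fault in the log, `decode_fault_status(0x0804800C)` gives vmid 4 and client id 72, `client_name(0x54433400)` gives `'TC4'`, and `fault_page(0x0030C387)` gives 3195783, matching the "VM fault" line that follows it.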