Kernel bug in amdgpu system freezing

Bug #1852865 reported by Janghou
This bug affects 1 person
Affects         Status     Importance  Assigned to  Milestone
linux (Ubuntu)  Confirmed  Undecided   Unassigned

Bug Description

Running any Flatpak package, such as Foliate or Visual Studio Code, freezes the system in Ubuntu 19.10.

It worked fine in 19.04.

Description: Ubuntu 19.10
Release: 19.10

I only see it when using Flatpak. Otherwise the system is quite stable.

Lots of errors in syslog:

amdgpu 0000:09:00.0: [gfxhub] retry page fault (src_id:0 ring:0 vmid:3 pasid:32769, for process Xorg pid 2137 thread Xorg:cs0 pid 2138)
amdgpu 0000:09:00.0: in page starting at address 0x000000011a500000 from 27
amdgpu 0000:09:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00301031
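To get a quick overview of which process triggers the faults, the page-fault lines can be counted per process. This is only a minimal sketch: the regular expression is modelled on the log lines quoted in this report (both the "retry" and "no-retry" variants), and the names `FAULT_RE` and `summarize_faults` are made up for illustration.

```python
import re
from collections import Counter

# Pattern modelled on the amdgpu page-fault lines quoted in this report.
# Assumption: other kernel versions may word the message differently.
FAULT_RE = re.compile(
    r"amdgpu (?P<dev>[0-9a-f:.]+): \[(?P<hub>\w+)\] "
    r"(?P<kind>(?:no-)?retry) page fault "
    r"\(src_id:(?P<src_id>\d+) ring:(?P<ring>\d+) "
    r"vmid:(?P<vmid>\d+) pasid:(?P<pasid>\d+), "
    r"for process (?P<process>\S+) pid (?P<pid>\d+) "
    r"thread (?P<thread>\S+) pid (?P<tid>\d+)\)"
)


def summarize_faults(log_text):
    """Count page faults per (process, pid, vmid, kind) in a block of log text."""
    counts = Counter()
    for m in FAULT_RE.finditer(log_text):
        key = (m["process"], int(m["pid"]), int(m["vmid"]), m["kind"])
        counts[key] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: feed syslog or journal output on stdin.
    for key, n in summarize_faults(sys.stdin.read()).most_common():
        print(n, key)
```

It can be fed saved syslog text on standard input; the output lists the most frequently faulting process first.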

Also errors from gnome-shell:

gnome-shell[2440]: [GFX1]: Shader compilation failure, cfg: features: 0 multiplier: 1 op: CompositionOp::OP_OVER
gnome-shell[2440]: message repeated 5 times: [ [GFX1]: Shader compilation failure, cfg: features: 0 multiplier: 1 op: CompositionOp::OP_OVER]
gnome-shell[2440]: Object .Gjs_WindowSwitcherPopup (0x557f7238f500), has been already deallocated — impossible to access it. This might be caused by the object having been destroyed from C code using something such as destroy(), dispose(), or remove() vfuncs

ProblemType: Bug
DistroRelease: Ubuntu 19.10
Package: linux-image-5.3.0-23-generic 5.3.0-23.25
ProcVersionSignature: Ubuntu 5.3.0-23.25-generic 5.3.7
Uname: Linux 5.3.0-23-generic x86_64
ApportVersion: 2.20.11-0ubuntu8.2
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC1: ben 2111 F.... pulseaudio
 /dev/snd/controlC0: ben 2111 F.... pulseaudio
CurrentDesktop: ubuntu:GNOME
Date: Sat Nov 16 20:08:36 2019
InstallationDate: Installed on 2019-07-19 (120 days ago)
InstallationMedia: Ubuntu 19.04 "Disco Dingo" - Release amd64 (20190416)
MachineType: System manufacturer System Product Name
ProcFB: 0 amdgpudrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.3.0-23-generic root=UUID=7fb304cc-a08f-4ccf-945c-f9f1556d4342 ro quiet splash vt.handoff=7
RelatedPackageVersions:
 linux-restricted-modules-5.3.0-23-generic N/A
 linux-backports-modules-5.3.0-23-generic N/A
 linux-firmware 1.183.2
RfKill:

SourcePackage: linux
UpgradeStatus: Upgraded to eoan on 2019-10-28 (19 days ago)
dmi.bios.date: 02/28/2019
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 0809
dmi.board.asset.tag: Default string
dmi.board.name: PRIME B450M-A
dmi.board.vendor: ASUSTeK COMPUTER INC.
dmi.board.version: Rev X.0x
dmi.chassis.asset.tag: Default string
dmi.chassis.type: 3
dmi.chassis.vendor: Default string
dmi.chassis.version: Default string
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr0809:bd02/28/2019:svnSystemmanufacturer:pnSystemProductName:pvrSystemVersion:rvnASUSTeKCOMPUTERINC.:rnPRIMEB450M-A:rvrRevX.0x:cvnDefaultstring:ct3:cvrDefaultstring:
dmi.product.family: To be filled by O.E.M.
dmi.product.name: System Product Name
dmi.product.sku: SKU
dmi.product.version: System Version
dmi.sys.vendor: System manufacturer

Janghou (janghou) wrote :
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
Kai-Heng Feng (kaihengfeng) wrote :
Janghou (janghou) wrote :

Unfortunately no improvement.

I would say it is even worse: after opening something in `foliate` I could not get a VT, so a hard shutdown was needed. The only thing still working was the mouse.

With the old kernel I could at least (after a while) open a VT to close things and reboot.

uname -a
Linux desk 5.4.0-050400rc8-generic #201911171930 SMP Mon Nov 18 00:33:15 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

kernel: [ 89.533687] [drm:amdgpu_dm_commit_planes.constprop.0 [amdgpu]] *ERROR* Waiting for fences timed out!
kernel: [ 94.397626] gmc_v9_0_process_interrupt: 111 callbacks suppressed
kernel: [ 94.397633] amdgpu 0000:09:00.0: [gfxhub0] no-retry page fault (src_id:0 ring:24 vmid:3 pasid:32769, for process Xorg pid 2272 thread Xorg:cs0 pid 2275)
kernel: [ 94.397638] amdgpu 0000:09:00.0: in page starting at address 0x0000000117d00000 from client 27
kernel: [ 94.397641] amdgpu 0000:09:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00301031
kernel: [ 94.397643] amdgpu 0000:09:00.0: MORE_FAULTS: 0x1
kernel: [ 94.397646] amdgpu 0000:09:00.0: WALKER_ERROR: 0x0
kernel: [ 94.397648] amdgpu 0000:09:00.0: PERMISSION_FAULTS: 0x3
kernel: [ 94.397650] amdgpu 0000:09:00.0: MAPPING_ERROR: 0x0
kernel: [ 94.397651] amdgpu 0000:09:00.0: RW: 0x0
kernel: [ 94.397662] amdgpu 0000:09:00.0: [gfxhub0] no-retry page fault (src_id:0 ring:24 vmid:3 pasid:32769, for process Xorg pid 2272 thread Xorg:cs0 pid 2275)
kernel: [ 94.397664] amdgpu 0000:09:00.0: in page starting at address 0x0000000117d00000 from client 27
kernel: [ 94.397666] amdgpu 0000:09:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00301031
kernel: [ 94.397668] amdgpu 0000:09:00.0: MORE_FAULTS: 0x1
kernel: [ 94.397670] amdgpu 0000:09:00.0: WALKER_ERROR: 0x0
kernel: [ 94.397672] amdgpu 0000:09:00.0: PERMISSION_FAULTS: 0x3
kernel: [ 94.397674] amdgpu 0000:09:00.0: MAPPING_ERROR: 0x0
kernel: [ 94.397676] amdgpu 0000:09:00.0: RW: 0x0
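The decoded fields printed by the 5.4 kernel can be reproduced from the raw VM_L2_PROTECTION_FAULT_STATUS value, which helps when reading the shorter 5.3 log lines that only show the register. The sketch below is an assumption-laden illustration: the bit positions come from my reading of the GFX9 register layout and are not stated anywhere in this report, so only fields that can be cross-checked against the log above are included. `FIELDS` and `decode_fault_status` are illustrative names.

```python
# Assumed bit layout (shift, width) for VM_L2_PROTECTION_FAULT_STATUS on GFX9.
# Not taken from this report; cross-checked only against the decoded values
# the 5.4 kernel prints for status 0x00301031.
FIELDS = {
    "MORE_FAULTS": (0, 1),
    "WALKER_ERROR": (1, 3),
    "PERMISSION_FAULTS": (4, 4),
    "MAPPING_ERROR": (8, 1),
    "RW": (18, 1),
    "VMID": (20, 4),
}


def decode_fault_status(status):
    """Split a raw status value into the named bit fields listed above."""
    return {
        name: (status >> shift) & ((1 << width) - 1)
        for name, (shift, width) in FIELDS.items()
    }


if __name__ == "__main__":
    for name, value in decode_fault_status(0x00301031).items():
        print(f"{name}: {value:#x}")
```

For 0x00301031 this yields MORE_FAULTS 0x1, WALKER_ERROR 0x0, PERMISSION_FAULTS 0x3, MAPPING_ERROR 0x0 and RW 0x0, matching the kernel output, and VMID 3, matching the vmid:3 in the page fault line.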

Janghou (janghou) wrote :

Related issues:
https://gitlab.freedesktop.org/mesa/mesa/issues/1899
https://discussion.fedoraproject.org/t/some-flatpaks-kill-the-machine-on-f31sb-beta-testing/7673/4

So is it a Mesa bug, and not a kernel bug?

AFAICS it can be solved or worked around by Flatpak packages moving to a newer Mesa version.

It is still a bug if only machines with AMD cards are hit.

tags: added: amdgpu
Nowaker (nowaker) wrote :

I've encountered VM_L2_PROTECTION_FAULT_STATUS twice today. No Flatpak involved. Using the latest Mesa on Arch Linux 20200924.
