Comment 9 for bug 2037641

Martin Vysny (vyzivus) wrote :

Unfortunately, the problem is still reproducible with the newest mesa, although it now seems to be much less frequent. Yesterday evening I got another crash with mesa 23.2.1-1ubuntu2:

```
2023-10-04T22:33:15.415854+03:00 mavi-ThinkPad-T14s kernel: [ 1076.119146] amdgpu 0000:06:00.0: amdgpu: [gfxhub0] no-retry page fault (src_id:0 ring:24 vmid:4 pasid:32770, for process Xwayland pid 4940 thread Xwayland:cs0 pid 5015)
2023-10-04T22:33:15.415871+03:00 mavi-ThinkPad-T14s kernel: [ 1076.119168] amdgpu 0000:06:00.0: amdgpu: in page starting at address 0x0000e5ea2326e000 from IH client 0x1b (UTCL2)
2023-10-04T22:33:15.415873+03:00 mavi-ThinkPad-T14s kernel: [ 1076.119180] amdgpu 0000:06:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00400430
2023-10-04T22:33:15.415874+03:00 mavi-ThinkPad-T14s kernel: [ 1076.119187] amdgpu 0000:06:00.0: amdgpu: Faulty UTCL2 client ID: IA (0x2)
2023-10-04T22:33:15.415875+03:00 mavi-ThinkPad-T14s kernel: [ 1076.119194] amdgpu 0000:06:00.0: amdgpu: MORE_FAULTS: 0x0
2023-10-04T22:33:15.415876+03:00 mavi-ThinkPad-T14s kernel: [ 1076.119200] amdgpu 0000:06:00.0: amdgpu: WALKER_ERROR: 0x0
2023-10-04T22:33:15.415878+03:00 mavi-ThinkPad-T14s kernel: [ 1076.119206] amdgpu 0000:06:00.0: amdgpu: PERMISSION_FAULTS: 0x3
2023-10-04T22:33:15.415878+03:00 mavi-ThinkPad-T14s kernel: [ 1076.119212] amdgpu 0000:06:00.0: amdgpu: MAPPING_ERROR: 0x0
2023-10-04T22:33:15.415879+03:00 mavi-ThinkPad-T14s kernel: [ 1076.119218] amdgpu 0000:06:00.0: amdgpu: RW: 0x0
2023-10-04T22:33:15.415881+03:00 mavi-ThinkPad-T14s kernel: [ 1076.119750] amdgpu 0000:06:00.0: amdgpu: [gfxhub0] no-retry page fault (src_id:0 ring:24 vmid:4 pasid:32770, for process Xwayland pid 4940 thread Xwayland:cs0 pid 5015)
2023-10-04T22:33:15.415881+03:00 mavi-ThinkPad-T14s kernel: [ 1076.119768] amdgpu 0000:06:00.0: amdgpu: in page starting at address 0x0000e5ea2326f000 from IH client 0x1b (UTCL2)
2023-10-04T22:33:15.415883+03:00 mavi-ThinkPad-T14s kernel: [ 1076.119780] amdgpu 0000:06:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00400430
2023-10-04T22:33:15.415884+03:00 mavi-ThinkPad-T14s kernel: [ 1076.119787] amdgpu 0000:06:00.0: amdgpu: Faulty UTCL2 client ID: IA (0x2)
2023-10-04T22:33:15.415885+03:00 mavi-ThinkPad-T14s kernel: [ 1076.119793] amdgpu 0000:06:00.0: amdgpu: MORE_FAULTS: 0x0
2023-10-04T22:33:15.415885+03:00 mavi-ThinkPad-T14s kernel: [ 1076.119800] amdgpu 0000:06:00.0: amdgpu: WALKER_ERROR: 0x0
2023-10-04T22:33:15.415886+03:00 mavi-ThinkPad-T14s kernel: [ 1076.119806] amdgpu 0000:06:00.0: amdgpu: PERMISSION_FAULTS: 0x3
2023-10-04T22:33:15.415887+03:00 mavi-ThinkPad-T14s kernel: [ 1076.119812] amdgpu 0000:06:00.0: amdgpu: MAPPING_ERROR: 0x0
2023-10-04T22:33:15.415888+03:00 mavi-ThinkPad-T14s kernel: [ 1076.119818] amdgpu 0000:06:00.0: amdgpu: RW: 0x0
2023-10-04T22:33:25.115577+03:00 mavi-ThinkPad-T14s kernel: [ 1085.820288] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_low timeout, signaled seq=75561, emitted seq=75563
2023-10-04T22:33:25.115600+03:00 mavi-ThinkPad-T14s kernel: [ 1085.821169] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process Xwayland pid 4940 thread Xwayland:cs0 pid 5015
2023-10-04T22:33:25.115602+03:00 mavi-ThinkPad-T14s kernel: [ 1085.822031] amdgpu 0000:06:00.0: amdgpu: GPU reset begin!
2023-10-04T22:33:25.347547+03:00 mavi-ThinkPad-T14s kernel: [ 1086.052065] [drm] psp gfx command UNLOAD_TA(0x2) failed and response status is (0x117)
2023-10-04T22:33:25.375820+03:00 mavi-ThinkPad-T14s kernel: [ 1086.078048] amdgpu 0000:06:00.0: amdgpu: MODE2 reset
2023-10-04T22:33:25.375835+03:00 mavi-ThinkPad-T14s kernel: [ 1086.078296] amdgpu 0000:06:00.0: amdgpu: GPU reset succeeded, trying to resume
2023-10-04T22:33:25.375837+03:00 mavi-ThinkPad-T14s kernel: [ 1086.078498] [drm] PCIE GART of 1024M enabled.
2023-10-04T22:33:25.375838+03:00 mavi-ThinkPad-T14s kernel: [ 1086.078503] [drm] PTB located at 0x000000F43FC00000
2023-10-04T22:33:25.375839+03:00 mavi-ThinkPad-T14s kernel: [ 1086.078625] [drm] PSP is resuming...
2023-10-04T22:33:26.075542+03:00 mavi-ThinkPad-T14s kernel: [ 1086.780162] [drm] reserve 0x400000 from 0xf43f800000 for PSP TMR
2023-10-04T22:33:26.363527+03:00 mavi-ThinkPad-T14s kernel: [ 1087.066834] amdgpu 0000:06:00.0: amdgpu: RAS: optional ras ta ucode is not available
2023-10-04T22:33:26.375796+03:00 mavi-ThinkPad-T14s kernel: [ 1087.078189] amdgpu 0000:06:00.0: amdgpu: RAP: optional rap ta ucode is not available
2023-10-04T22:33:26.375806+03:00 mavi-ThinkPad-T14s kernel: [ 1087.078196] amdgpu 0000:06:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
2023-10-04T22:33:26.375808+03:00 mavi-ThinkPad-T14s kernel: [ 1087.078204] amdgpu 0000:06:00.0: amdgpu: SMU is resuming...
2023-10-04T22:33:26.375809+03:00 mavi-ThinkPad-T14s kernel: [ 1087.079063] amdgpu 0000:06:00.0: amdgpu: SMU is resumed successfully!
2023-10-04T22:33:26.375810+03:00 mavi-ThinkPad-T14s kernel: [ 1087.079630] [drm] DMUB hardware initialized: version=0x01010027
2023-10-04T22:33:26.771535+03:00 mavi-ThinkPad-T14s kernel: [ 1087.477274] [drm] kiq ring mec 2 pipe 1 q 0
2023-10-04T22:33:26.775785+03:00 mavi-ThinkPad-T14s kernel: [ 1087.480965] [drm] VCN decode and encode initialized successfully(under DPG Mode).
2023-10-04T22:33:26.775793+03:00 mavi-ThinkPad-T14s kernel: [ 1087.481011] [drm] JPEG decode initialized successfully.
2023-10-04T22:33:26.775794+03:00 mavi-ThinkPad-T14s kernel: [ 1087.481014] amdgpu 0000:06:00.0: amdgpu: ring gfx uses VM inv eng 0 on hub 0
2023-10-04T22:33:26.775795+03:00 mavi-ThinkPad-T14s kernel: [ 1087.481017] amdgpu 0000:06:00.0: amdgpu: ring gfx_low uses VM inv eng 1 on hub 0
2023-10-04T22:33:26.775796+03:00 mavi-ThinkPad-T14s kernel: [ 1087.481019] amdgpu 0000:06:00.0: amdgpu: ring gfx_high uses VM inv eng 4 on hub 0
2023-10-04T22:33:26.775797+03:00 mavi-ThinkPad-T14s kernel: [ 1087.481021] amdgpu 0000:06:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 5 on hub 0
2023-10-04T22:33:26.775798+03:00 mavi-ThinkPad-T14s kernel: [ 1087.481023] amdgpu 0000:06:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 6 on hub 0
2023-10-04T22:33:26.775799+03:00 mavi-ThinkPad-T14s kernel: [ 1087.481025] amdgpu 0000:06:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 7 on hub 0
2023-10-04T22:33:26.775800+03:00 mavi-ThinkPad-T14s kernel: [ 1087.481027] amdgpu 0000:06:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 8 on hub 0
2023-10-04T22:33:26.775800+03:00 mavi-ThinkPad-T14s kernel: [ 1087.481029] amdgpu 0000:06:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 9 on hub 0
2023-10-04T22:33:26.775801+03:00 mavi-ThinkPad-T14s kernel: [ 1087.481031] amdgpu 0000:06:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 10 on hub 0
2023-10-04T22:33:26.775802+03:00 mavi-ThinkPad-T14s kernel: [ 1087.481033] amdgpu 0000:06:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 11 on hub 0
2023-10-04T22:33:26.775803+03:00 mavi-ThinkPad-T14s kernel: [ 1087.481035] amdgpu 0000:06:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 12 on hub 0
2023-10-04T22:33:26.775804+03:00 mavi-ThinkPad-T14s kernel: [ 1087.481037] amdgpu 0000:06:00.0: amdgpu: ring kiq_0.2.1.0 uses VM inv eng 13 on hub 0
2023-10-04T22:33:26.775804+03:00 mavi-ThinkPad-T14s kernel: [ 1087.481039] amdgpu 0000:06:00.0: amdgpu: ring sdma0 uses VM inv eng 0 on hub 8
2023-10-04T22:33:26.775805+03:00 mavi-ThinkPad-T14s kernel: [ 1087.481041] amdgpu 0000:06:00.0: amdgpu: ring vcn_dec uses VM inv eng 1 on hub 8
2023-10-04T22:33:26.775805+03:00 mavi-ThinkPad-T14s kernel: [ 1087.481043] amdgpu 0000:06:00.0: amdgpu: ring vcn_enc0 uses VM inv eng 4 on hub 8
2023-10-04T22:33:26.775806+03:00 mavi-ThinkPad-T14s kernel: [ 1087.481044] amdgpu 0000:06:00.0: amdgpu: ring vcn_enc1 uses VM inv eng 5 on hub 8
2023-10-04T22:33:26.775807+03:00 mavi-ThinkPad-T14s kernel: [ 1087.481046] amdgpu 0000:06:00.0: amdgpu: ring jpeg_dec uses VM inv eng 6 on hub 8
2023-10-04T22:33:26.779618+03:00 mavi-ThinkPad-T14s kernel: [ 1087.483645] amdgpu 0000:06:00.0: amdgpu: recover vram bo from shadow start
2023-10-04T22:33:26.779636+03:00 mavi-ThinkPad-T14s kernel: [ 1087.483648] amdgpu 0000:06:00.0: amdgpu: recover vram bo from shadow done
2023-10-04T22:33:26.779638+03:00 mavi-ThinkPad-T14s kernel: [ 1087.483659] amdgpu 0000:06:00.0: amdgpu: GPU reset(2) succeeded!
2023-10-04T22:33:26.779639+03:00 mavi-ThinkPad-T14s kernel: [ 1087.483713] [drm] Skip scheduling IBs!
2023-10-04T22:33:26.779640+03:00 mavi-ThinkPad-T14s kernel: [ 1087.483729] [drm] Skip scheduling IBs!
2023-10-04T22:33:26.779641+03:00 mavi-ThinkPad-T14s kernel: [ 1087.483843] [drm] Skip scheduling IBs!
2023-10-04T22:33:26.779642+03:00 mavi-ThinkPad-T14s kernel: [ 1087.483850] [drm] Skip scheduling IBs!
2023-10-04T22:33:26.779643+03:00 mavi-ThinkPad-T14s kernel: [ 1087.483857] [drm] Skip scheduling IBs!
```
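
For anyone cross-checking the fault lines, here is a minimal sketch of how the raw `VM_L2_PROTECTION_FAULT_STATUS` value could be split into the fields the kernel prints next to it. The bit offsets and widths in `FIELDS` are an assumption, not something stated in the log; they were chosen because they reproduce the decoded values shown above (`PERMISSION_FAULTS: 0x3`, client ID `0x2`, `RW: 0x0`, `vmid:4`). The name `decode_fault_status` is hypothetical.

```python
# Assumed bit layout (shift, width) for VM_L2_PROTECTION_FAULT_STATUS.
# This layout is NOT taken from the log; it is an assumption that happens
# to reproduce the per-field values the kernel printed for 0x00400430.
FIELDS = {
    "MORE_FAULTS": (0, 1),
    "WALKER_ERROR": (1, 3),
    "PERMISSION_FAULTS": (4, 4),
    "MAPPING_ERROR": (8, 1),
    "CID": (9, 9),        # printed in the log as "Faulty UTCL2 client ID"
    "RW": (18, 1),
    "VMID": (20, 4),      # printed in the log as "vmid:4"
}


def decode_fault_status(status):
    """Split a raw status word into named bit fields (hypothetical helper)."""
    return {
        name: (status >> shift) & ((1 << width) - 1)
        for name, (shift, width) in FIELDS.items()
    }


decoded = decode_fault_status(0x00400430)
for name, value in decoded.items():
    print(f"{name}: {value:#x}")
```

Under that assumed layout, both faults in the log decode identically; only the page address differs (`0x0000e5ea2326e000` versus `0x0000e5ea2326f000`).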