Since upgrading to 6.1.0-16-generic on lunar, I hit nearly the exact same error again, minus the "fences timed out" log:
[728875.081136] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=263052, emitted seq=263054
[728875.081681] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process spotify pid 626324 thread spotify:cs0 pid 626354
[728875.082223] amdgpu 0000:1f:00.0: amdgpu: GPU reset begin!
[728875.674379] amdgpu: [powerplay] Failed message: 0x24, input parameter: 0x0, error code: 0xffffffff
[728875.674390] amdgpu: [powerplay] Failed message: 0x9, input parameter: 0xf4, error code: 0xffffffff
[728875.674395] amdgpu: [powerplay] Failed message: 0xa, input parameter: 0x103000, error code: 0xffffffff
[728875.674399] amdgpu: [powerplay] Failed message: 0xe, input parameter: 0x0, error code: 0xffffffff
[728875.674403] amdgpu: [powerplay] Failed message: 0x42, input parameter: 0x1, error code: 0xffffffff
[728875.674407] amdgpu: [powerplay] Failed message: 0x24, input parameter: 0x0, error code: 0xffffffff
[728875.707103] [drm] REG_WAIT timeout 10us * 3000 tries - dce110_stream_encoder_dp_blank line:936
[728895.793037] [drm:atom_op_jump [amdgpu]] *ERROR* atombios stuck in loop for more than 20secs aborting
[728895.793200] [drm:amdgpu_atom_execute_table_locked [amdgpu]] *ERROR* atombios stuck executing E18E (len 824, WS 0, PS 0) @ 0xE30E
[728895.793351] [drm:amdgpu_atom_execute_table_locked [amdgpu]] *ERROR* atombios stuck executing E048 (len 326, WS 0, PS 0) @ 0xE138
[728895.793501] [drm:dce110_link_encoder_disable_output [amdgpu]] *ERROR* dce110_link_encoder_disable_output: Failed to execute VBIOS command table!
[728915.796976] [drm:atom_op_jump [amdgpu]] *ERROR* atombios stuck in loop for more than 20secs aborting
[728915.797146] [drm:amdgpu_atom_execute_table_locked [amdgpu]] *ERROR* atombios stuck executing C30A (len 62, WS 0, PS 0) @ 0xC326
[728935.800920] [drm:atom_op_jump [amdgpu]] *ERROR* atombios stuck in loop for more than 20secs aborting
[728935.801087] [drm:amdgpu_atom_execute_table_locked [amdgpu]] *ERROR* atombios stuck executing E18E (len 824, WS 0, PS 0) @ 0xE30E
[728935.801238] [drm:amdgpu_atom_execute_table_locked [amdgpu]] *ERROR* atombios stuck executing E048 (len 326, WS 0, PS 0) @ 0xE138
[728935.801388] [drm:dce110_link_encoder_disable_output [amdgpu]] *ERROR* dce110_link_encoder_disable_output: Failed to execute VBIOS command table!
[728955.804861] [drm:atom_op_jump [amdgpu]] *ERROR* atombios stuck in loop for more than 20secs aborting
[728955.805022] [drm:amdgpu_atom_execute_table_locked [amdgpu]] *ERROR* atombios stuck executing C30A (len 62, WS 0, PS 0) @ 0xC326
[728975.808802] [drm:atom_op_jump [amdgpu]] *ERROR* atombios stuck in loop for more than 20secs aborting
[728975.808971] [drm:amdgpu_atom_execute_table_locked [amdgpu]] *ERROR* atombios stuck executing B802 (len 1359, WS 12, PS 8) @ 0xBB17