Ubuntu
xserver-xorg-video-amdgpu package

Comment 8 for bug 1882525

L3P3 (l3p3) wrote on 2020-07-18:

Hello, I am using amdgpu and MATE on Debian, and I have the very same issue. The only way to fix it was to downgrade the kernel to 5.3.0. I already wrote to the amdgpu devs but got no response. My message back then was:

Hi, on my netbook (Debian bullseye, AMD A4, Sea Islands), after
updating the kernel to version 5.4.6 and logging into the MATE desktop, my
screen looked like this (see attached picture).
If you look at the image, do you have any idea what is happening
there? To me it looks like the framebuffer data is misinterpreted at
some stage. The picture looks good on the LightDM login screen; it is
only corrupted once I start a user session. When I then switch to a
console (Ctrl+Alt+F1), the screen looks correct for a second before it
switches. When I switch back (Ctrl+Alt+F7), it looks correct for a second
and then turns corrupted again. I looked into these
commits a bit but I don't have any idea... Maybe MATE/marco is doing
something it shouldn't, but then many more people would have problems
by now...
I took a screenshot, but in it everything looked fine, which is why I
took a photo instead. I assume the problem was introduced by one of these
commits, since these were the only amdgpu changes between a working and a
non-working kernel:

commit 9375fa3799293da82490f0f1fa1f1e7fabae2745
Author: changzhu <email address hidden>
Date: Tue Dec 10 22:00:59 2019 +0800

    drm/amdgpu: add invalidate semaphore limit for SRIOV and picasso in gmc9

    commit 90f6452ca58d436de4f69b423ecd75a109aa9766 upstream.

    It may fail to load guest driver in round 2 or cause Xstart problem
    when using invalidate semaphore for SRIOV or picasso. So it needs avoid
    using invalidate semaphore for SRIOV and picasso.

    Signed-off-by: changzhu <email address hidden>
    Reviewed-by: Christian König <email address hidden>
    Reviewed-by: Huang Rui <email address hidden>
    Signed-off-by: Alex Deucher <email address hidden>
    Signed-off-by: Greg Kroah-Hartman <email address hidden>

commit bf8ae461a23577a9884f993b31b31f15dd7d6c0a
Author: changzhu <email address hidden>
Date: Tue Dec 10 10:23:09 2019 +0800

    drm/amdgpu: avoid using invalidate semaphore for picasso

    commit 413fc385a594ea6eb08843be33939057ddfdae76 upstream.

    It may cause timeout waiting for sem acquire in VM flush when using
    invalidate semaphore for picasso. So it needs to avoid using invalidate
    semaphore for piasso.

    Signed-off-by: changzhu <email address hidden>
    Reviewed-by: Huang Rui <email address hidden>
    Signed-off-by: Alex Deucher <email address hidden>
    Signed-off-by: Greg Kroah-Hartman <email address hidden>

commit f45858245286fa901e8abf36776df7e99d9a9581
Author: Xiaojie Yuan <email address hidden>
Date: Wed Nov 20 14:02:22 2019 +0800

    drm/amdgpu/gfx10: re-init clear state buffer after gpu reset

    commit 210b3b3c7563df391bd81d49c51af303b928de4a upstream.

    This patch fixes 2nd baco reset failure with gfxoff enabled on navi1x.

    clear state buffer (resides in vram) is corrupted after 1st baco reset,
    upon gfxoff exit, CPF gets garbage header in CSIB and hangs.

    Signed-off-by: Xiaojie Yuan <email address hidden>
    Reviewed-by: Hawking Zhang <email address hidden>
    Signed-off-by: Alex Deucher <email address hidden>
    Cc: <email address hidden>
    Signed-off-by: Greg Kroah-Hartman <email address hidden>

commit eebab68448a6bbb9b899216b6e889057f6f4498d
Author: Xiaojie Yuan <email address hidden>
Date: Thu Nov 14 16:56:08 2019 +0800

    drm/amdgpu/gfx10: explicitly wait for cp idle after halt/unhalt

    commit 1e902a6d32d73e4a6b3bc9d7cd43d4ee2b242dea upstream.

    50us is not enough to wait for cp ready after gpu reset on some navi asics.

    Signed-off-by: Xiaojie Yuan <email address hidden>
    Suggested-by: Jack Xiao <email address hidden>
    Acked-by: Alex Deucher <email address hidden>
    Signed-off-by: Alex Deucher <email address hidden>
    Cc: <email address hidden>
    Signed-off-by: Greg Kroah-Hartman <email address hidden>

commit 69e0a0d5bcc4dd677ea460b172a1bec4127c650d
Author: changzhu <email address hidden>
Date: Tue Nov 19 11:13:29 2019 +0800

    drm/amdgpu: invalidate mmhub semaphore workaround in gmc9/gmc10

    commit f920d1bb9c4e77efb08c41d70b6d442f46fd8902 upstream.

    It may lose gpuvm invalidate acknowldege state across power-gating off
    cycle. To avoid this issue in gmc9/gmc10 invalidation, add semaphore acquire
    before invalidation and semaphore release after invalidation.

    After adding semaphore acquire before invalidation, the semaphore
    register become read-only if another process try to acquire semaphore.
    Then it will not be able to release this semaphore. Then it may cause
    deadlock problem. If this deadlock problem happens, it needs a semaphore
    firmware fix.

    Signed-off-by: changzhu <email address hidden>
    Acked-by: Huang Rui <email address hidden>
    Signed-off-by: Alex Deucher <email address hidden>
    Cc: <email address hidden>
    Signed-off-by: Greg Kroah-Hartman <email address hidden>

commit b23e536fc4d58308e3e1ae0569327995995ee378
Author: changzhu <email address hidden>
Date: Tue Nov 19 10:18:39 2019 +0800

    drm/amdgpu: initialize vm_inv_eng0_sem for gfxhub and mmhub

    commit 6c2c8972374ac5c35078d36d7559f64c368f7b33 upstream.

    SW must acquire/release one of the vm_invalidate_eng*_sem around the
    invalidation req/ack. Through this way,it can avoid losing invalidate
    acknowledge state across power-gating off cycle.
    To use vm_invalidate_eng*_sem, it needs to initialize
    vm_invalidate_eng*_sem firstly.

    Signed-off-by: changzhu <email address hidden>
    Reviewed-by: Christian König <email address hidden>
    Signed-off-by: Alex Deucher <email address hidden>
    Cc: <email address hidden>
    Signed-off-by: Greg Kroah-Hartman <email address hidden>

----------

Maybe you have an idea and want to share it with me. Apart from that,
huge thanks for your work!
L3P3
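
For anyone who wants to test the suspect commits one by one, a minimal Python sketch follows. It pulls the stable hash, the upstream hash, and the subject out of a log pasted in the layout above. The helper name `parse_stable_log` and the sample text are made up for illustration and are not part of any kernel tooling. It assumes each entry starts with a `commit <40-hex-hash>` line and contains a `commit <hash> upstream.` line.

```python
import re

# Hypothetical helper for illustration only; it assumes the log layout
# quoted in this comment (header line, subject, "commit <hash> upstream.").
HEADER = re.compile(r"^commit ([0-9a-f]{40})$")
UPSTREAM = re.compile(r"^commit ([0-9a-f]{40}) upstream\.$")


def parse_stable_log(text):
    """Return a list of dicts with 'stable', 'upstream' and 'subject' keys."""
    commits = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = HEADER.match(line)
        upstream = UPSTREAM.match(line)
        if header:
            # A new entry begins; the fields are filled in by later lines.
            current = {"stable": header.group(1), "upstream": None, "subject": None}
            commits.append(current)
        elif current is None or not line:
            continue
        elif upstream:
            current["upstream"] = upstream.group(1)
        elif current["subject"] is None and not line.startswith(("Author:", "Date:")):
            # The first remaining non-empty line is taken as the subject.
            current["subject"] = line
    return commits


sample = """commit bf8ae461a23577a9884f993b31b31f15dd7d6c0a
Author: changzhu <email address hidden>
Date: Tue Dec 10 10:23:09 2019 +0800

drm/amdgpu: avoid using invalidate semaphore for picasso

commit 413fc385a594ea6eb08843be33939057ddfdae76 upstream.

    It may cause timeout waiting for sem acquire in VM flush ...
"""

for entry in parse_stable_log(sample):
    print(entry["stable"][:12], entry["upstream"][:12], entry["subject"])
```

Each stable hash found this way could then be reverted individually on a 5.4.y tree with `git revert`, or the range between the working and the non-working kernel could be narrowed down with `git bisect`.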
