Comment 784 for bug 1690085

In , ashesh.ambasta (ashesh.ambasta-linux-kernel-bugs) wrote :

I haven't, and to be honest, I've been procrastinating on this issue.

As a very ugly hack/workaround, I've disabled screen power management in
xscreensaver, so the CPU keeps drawing graphics on my screen instead of
letting my displays go to sleep.

That way, my CPU never really enters the idle states in which the
crashes occur.

I understand that this is /far/ from a satisfactory solution, but I
didn't want to try my luck with the RMA anymore. As long as my system
doesn't crash, I can live with this CPU (although it continues to
frustrate me). I may lose patience in the coming months and go for an
RMA anyway. But I'm deterred by the mixed reports on RMAs as well:
some people claim that an RMA fixes their issues; some people say it
makes no difference. I've even read reports of the RMA'd CPU actually
turning out to be worse.

I don't think I'm prepared for the gamble. I've been burnt pretty badly
by AMD as it is. For now, I'm just making this work. The next time I
buy a CPU, I'll do my research more thoroughly and stay away from AMD.

AMD did publish an erratum in which they state that an issue like this
exists, but a fix is ruled out, which is further bad news. There was
some discussion of a kernel-level fix, but that isn't anywhere in sight
either. I believe these CPUs are plagued by several issues, which
probably makes a kernel-level fix hard. However, Windows seems to have
managed to fix it.

Anyway, rants aside, this is my current take on the CPU.

On 7/12/20 3:58 PM, <email address hidden> wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=196683
>
> --- Comment #698 from <email address hidden> ---
> Did you ask for an RMA? Did it work?
> (In reply to Ashesh Ambasta from comment #692)
>> As a last resort, I've tried `idle=halt` on this machine. And yet my
>> system just crashed after 3 weeks of uptime.
>>
>> I'm done with AMD. I will RMA this processor to try things out, but
>> overall, if that doesn't work, this thing is headed to the junkyard and
>> I'm going to live with Intel.
>>
>> At least in the 13 or so odd Intel systems I've tried, I've not had
>> exasperating issues like these where the company is positively trying
>> to ignore this ongoing issue.
>>
>> This is disgusting from AMD.
>>
>> On 6/11/20 6:03 PM, <email address hidden> wrote:
>>> https://bugzilla.kernel.org/show_bug.cgi?id=196683
>>>
>>> --- Comment #689 from <email address hidden> ---
>>> (In reply to raulvior.bcn from comment #653)
>>>> (In reply to txrx from comment #651)
>>>>
>>>> Typical Current Idle might not be working. Read the sensor output. If
>>>> voltage is not higher than without enabling it, try to increase the core
>>>> voltage.
>>>>
>>>> My Ryzen 7 1800X seems to not produce hangs since I upgraded to 1003ABB
>>>> with an ASUS Crosshair VI Hero and enabled Typical current idle.
>>>>
>>>>
>>>>> I was able to update my BIOS to version 18, but my system still locks up.
>>>>> I tried the following with the new BIOS:
>>>>> - use factory defaults
>>>>> - disable SMT
>>>>> - disable SMT with Typical Current Idle
>>>>> - all of the above with SVM disabled/enabled
>>>>> Right now I set the power supply idle control to "Low ..." and will
>>>>> report back.
>>>>>
>>> The motherboard kept hanging. I had to remove the Vitals GNOME Extension.
>>> It seems that polling voltage values hangs the motherboard... Still, there
>>> are times that the computer does not come back from suspend. There's
>>> something wrong with the BIOS/UEFI.
>>>