Comment 424 for bug 1690085

In , kernel (kernel-linux-kernel-bugs) wrote :

I've sent the aforementioned outlets an email. For completeness, here's the entire thing:

Recipients:
Ars Technica: https://arstechnica.wufoo.com/forms/z7p8x7/
Tom's Hardware: http://www.purch.com/about/#contact-general
HardOCP: <email address hidden>
AnandTech: http://www.purch.com/about/#contact-general
Tweakers.net: <email address hidden>
Phoronix: https://www.phoronix-media.com/?

Subject: Publication of an AMD Ryzen hardware issue

Dear editor,

Since their introduction, AMD Ryzen processors have been plagued by several issues, most notably the segfault issue that occurred under heavy (compilation) loads. AMD responded to that particular issue by replacing affected chips.

However, there is another significant issue that affects both Ryzen 1xxx and Threadripper CPUs, as well as the newer Ryzen 2xxx processors. Epyc appears not to be affected (although the sample size is one in this case).

The most complete account of this issue can be found at the link below [1]; the rest of this email summarises it.

This issue results in a complete system freeze that occurs when the system is fully idle and requires a hard reset. Evidence suggests this is a hardware problem, since several workarounds have been found that mitigate or solve the issue, such as disabling C6 entirely (inefficient), overclocking and tweaking voltages (not for everyone), or running processes that keep the CPU active at all times (again, inefficient and pointless).

AMD has been contacted multiple times but refuses to acknowledge the issue. In one reply, AMD blamed users' PSUs; this is obviously nonsensical, as the issue occurs on a wide variety of PSUs, including brand-new models. The only constant is the Ryzen platform.

In response, AMD has pushed motherboard manufacturers to add an otherwise undocumented BIOS option, "Power Supply Idle Control", as part of their AGESA update. However, at its default value this setting does *not* solve the problem, and with other settings it doesn't *always* solve the problem. To be precise, this setting needs to be changed from "auto" to "typical":
/advanced/amdcbs/zen-common-options/"power supply idle control"
However, it is unknown what this option actually does.

The issue is also present on laptop platforms. For example, one user repeatedly sent his laptop back to Asus for motherboard replacements, but the issue remains.

The root cause of this issue is still unknown, and moreover, the issue is still present in the latest Ryzen 2xxx CPUs. AMD's refusal to acknowledge, let alone resolve, the issue has led to this email, in the hope that media attention to this problem will prompt AMD to take action. Hence, I propose that your outlet publish an article about the issue.

Kind regards,
<email address hidden>

[1] https://bugzilla.kernel.org/show_bug.cgi?id=196683