Comment 0 for bug 1708222

Paulo J. S. Silva (pjssilva) wrote :

I have a Ryzen 1700X on an MSI B350 motherboard with 64 GB of RAM (Corsair LPX 2400). When I run a very intensive multi-threaded compilation session, I sometimes get a segfault. This seems to be a problem with Ryzen itself, and it may be related to the one described in bug #1690085, but I believe it is not the same. This bug affects many Linux users with Ryzen; see for example this thread in the AMD forum: https://community.amd.com/thread/215773?start=0&tstart=0 or the Gentoo Wiki, which discusses the problem in the Troubleshooting section of its Ryzen page: https://wiki.gentoo.org/wiki/Ryzen#Troubleshooting

It is also very easy to check whether your processor has the problem. Fortunately, some smart people have created a simple script that always reproduces it on my system and on the systems of the other people in the thread. The script can be found at

https://github.com/suaefar/ryzen-test

You just have to clone the repository using git, change to the ryzen-test directory and run ./kill_ryzen.sh. It is a very simple script: it downloads the gcc-7.1 source code into a RAM disk and starts one compilation per processor, all running simultaneously. If any compilation fails, it writes a message to the console saying how long it took to get the failure. On my system the build fails after a few minutes unless I turn off SMT. With SMT off it can take many hours, but it still fails in less than one day.
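
For readers who want to see the idea without reading the script, here is a minimal sketch of the pattern described above: start one copy of a job per processor and report how long each failing copy ran. It is not the actual kill_ryzen.sh. The function name run_parallel and the use of make as a stand-in for the gcc build are assumptions made only for illustration.

```shell
#!/bin/sh
# Hypothetical sketch, NOT the real kill_ryzen.sh: run several copies of a
# command in parallel and report how long each failing copy took to fail.

# run_parallel N CMD [ARGS...]
# Starts N background copies of CMD and waits for all of them to finish.
run_parallel() {
    n=$1
    shift
    start=$(date +%s)
    i=0
    while [ "$i" -lt "$n" ]; do
        (
            # Discard the job's own output; only failures are reported.
            if ! "$@" >/dev/null 2>&1; then
                echo "job $i failed after $(( $(date +%s) - start )) s"
            fi
        ) &
        i=$((i + 1))
    done
    wait
}

# Example (assumption): one job per logical processor, with `make` in the
# current directory standing in for the gcc build.
# run_parallel "$(nproc)" make
```

A healthy system should print nothing for a build that is known to be good. Unlike this sketch, the real script also handles downloading the sources and setting up the RAM disk.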

I am opening this bug report because I believe we should try to verify whether this is a widespread problem and inform potential users about it. Hopefully AMD or the kernel developers can find a workaround. I have also already opened a bug report in the Linux Kernel Bugzilla (https://bugzilla.kernel.org/show_bug.cgi?id=196481), but unfortunately it has not attracted the attention of the kernel developers.