I have a Ryzen 1700X on an MSI B350 motherboard with 64 GB of RAM (Corsair LPX 2400). During very intensive multi-threaded compilation sessions I sometimes get a segfault. This seems to be a problem with Ryzen itself. It may be related to bug #1690085, but I believe it is not the same issue. It affects many Linux users with Ryzen; see, for example, this thread in the AMD forum: https://community.amd.com/thread/215773?start=0&tstart=0 or the Gentoo Wiki, which discusses the problem in the Troubleshooting section of its Ryzen page: https://wiki.gentoo.org/wiki/Ryzen#Troubleshooting
It is also very easy to check whether your processor has the problem. Fortunately, some smart people have created a simple script that always reproduces the problem on my system and on the systems of the other people in the thread. The script can be found at
https://github.com/suaefar/ryzen-test
You just have to clone the repository using git, move to the ryzen-test directory and run ./kill_ryzen.sh. It is a very simple script: it downloads the gcc-7.1 source code into a RAM disk and starts as many simultaneous compilations of it as there are processors. If any compilation fails, it writes a message to the console saying how long it took to hit the failure. On my system the build fails after a few minutes unless I turn off SMT. With SMT off it can take many hours, but it still fails in less than one day.
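For anyone who wants to see the idea without reading the shell script, here is a rough Python sketch of the same approach: run one build loop per logical processor and report how long it took until the first one failed. This is not the code from the ryzen-test repository; the function name time_to_first_failure and its parameters are made up for illustration, and the command to run is whatever build you choose.

```python
import os
import subprocess
import threading
import time


def time_to_first_failure(cmd, workers=None, max_rounds=None):
    """Run `cmd` repeatedly in parallel loops until one run fails.

    Hypothetical helper, not part of ryzen-test. Returns a tuple
    (worker index, return code, elapsed seconds) for the earliest
    failure, or None if every run succeeded within `max_rounds`.
    """
    workers = workers or os.cpu_count() or 1
    stop = threading.Event()
    lock = threading.Lock()
    failures = []
    start = time.monotonic()

    def loop(index):
        rounds = 0
        while not stop.is_set() and (max_rounds is None or rounds < max_rounds):
            result = subprocess.run(
                cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
            )
            rounds += 1
            # On POSIX a negative return code means the process was
            # killed by a signal, e.g. -11 for SIGSEGV.
            if result.returncode != 0:
                with lock:
                    failures.append(
                        (index, result.returncode, time.monotonic() - start)
                    )
                stop.set()  # tell the other loops not to start new runs

    threads = [threading.Thread(target=loop, args=(i,)) for i in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()

    return min(failures, key=lambda f: f[2]) if failures else None
```

You would call it with the build command as a list, for example time_to_first_failure(["make"]) from inside a configured source tree (the choice of command is an assumption here). Note that a non-zero exit only says the build failed; you still need to check the build log to confirm that a segfault was the cause.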
I am opening this bug report because I believe we should try to verify whether this is a widespread problem and inform potential users about it. Hopefully AMD or the kernel developers can find a workaround. I have also opened a bug report in the Linux Kernel Bugzilla (https://bugzilla.kernel.org/show_bug.cgi?id=196481), but unfortunately it has not attracted the attention of the kernel developers.