Comment 190 for bug 1690085

sfearghail (sfearghail-linux-kernel-bugs) wrote:

Just adding a data point, as I've been following along ever since this first happened to me about 6 days ago. For context, I am running Debian 9 with the 4.13 kernel from backports. After upgrading from 4.12 I went several days on a very idle Ryzen 1700 without any issues. I had 2 qemu-kvm VMs running, also mostly idle. I hit the soft lockup CPU bug out of the blue on Nov 15th while not doing anything interactive with the system.

I'm now running 6 VMs on this system, as it replaced an old Intel system. I simply cannot have it crashing on me unattended, especially while I am remote.

Here are the measures I took. Frankly, I don't know yet whether they will prevent this, as the system simply hasn't been up long enough.

Board: Gigabyte GA-AB350-GAMING 3
BIOS version: F7

Disabled Global C-State Control in the BIOS. The manual also says there should be a C6 Mode option in the BIOS, but it simply doesn't exist.

Disabled AMD Cool&Quiet in the BIOS.

Additionally disabled C6 with the zenstates.py script, run at boot from systemd. I disabled both Package and Core, as I wasn't certain what the difference between them is.

# zenstates.py -l
P0 - Enabled - FID = 78 - DID = 8 - VID = 3A - Ratio = 30.00 - vCore = 1.18750
P1 - Enabled - FID = 87 - DID = A - VID = 50 - Ratio = 27.00 - vCore = 1.05000
P2 - Enabled - FID = 7C - DID = 10 - VID = 6C - Ratio = 15.50 - vCore = 0.87500
P3 - Disabled
P4 - Disabled
P5 - Disabled
P6 - Disabled
P7 - Disabled
C6 State - Package - Disabled
C6 State - Core - Disabled
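
For anyone comparing their own listing, here is a rough Python sketch of how the Ratio and vCore columns seem to follow from the hex FID/DID/VID fields. The two formulas (ratio = 2 * FID / DID, vCore = 1.55 V - 0.00625 V * VID) are my assumption about the Family 17h encoding; they do reproduce all three enabled rows above, but treat them as unverified. The function name decode_pstate is made up for this example.

```python
def decode_pstate(fid_hex, did_hex, vid_hex):
    """Turn the hex FID/DID/VID fields into (ratio, vcore).

    Assumed formulas, not taken from official documentation:
      ratio = 2 * FID / DID
      vcore = 1.55 - 0.00625 * VID
    """
    fid = int(fid_hex, 16)
    did = int(did_hex, 16)
    vid = int(vid_hex, 16)
    ratio = 2.0 * fid / did
    vcore = 1.55 - 0.00625 * vid
    return ratio, vcore


# FID/DID/VID values copied from the three enabled P-states in the listing.
listing = [
    ("P0", "78", "8", "3A"),
    ("P1", "87", "A", "50"),
    ("P2", "7C", "10", "6C"),
]

for name, fid, did, vid in listing:
    ratio, vcore = decode_pstate(fid, did, vid)
    print(f"{name}: Ratio = {ratio:.2f} - vCore = {vcore:.5f}")
```

If the assumed formulas hold, the printed Ratio and vCore values should line up with the columns zenstates.py reports.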

A question for clarification: does CONFIG_RCU_NOCB_CPU need to be set to yes in the kernel config to be able to use the boot parameter? I get the impression that the boot parameter doesn't actually do anything unless the kernel option is configured.
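
Whichever way the answer goes, it is easy enough to check how the running kernel was built. Below is a minimal Python sketch. It assumes the config is available at /boot/config-<release> (as on Debian) or at /proc/config.gz, and that the boot parameter in question is rcu_nocbs=. The helper names config_option and read_kernel_config are made up for this example.

```python
import gzip
import os
import platform


def config_option(config_text, name):
    """Return the value of a kernel config option, or None if it is not set."""
    for line in config_text.splitlines():
        line = line.strip()
        # Unset options appear as "# CONFIG_FOO is not set" and do not match.
        if line.startswith(name + "="):
            return line.split("=", 1)[1]
    return None


def read_kernel_config():
    """Read the running kernel's config text, or return None if unavailable."""
    candidates = [f"/boot/config-{platform.release()}", "/proc/config.gz"]
    for path in candidates:
        try:
            if path.endswith(".gz"):
                with gzip.open(path, "rt") as fh:
                    return fh.read()
            with open(path) as fh:
                return fh.read()
        except OSError:
            continue  # missing or unreadable, try the next location
    return None


if __name__ == "__main__":
    text = read_kernel_config()
    if text is None:
        print("kernel config not found in the assumed locations")
    else:
        print("CONFIG_RCU_NOCB_CPU =", config_option(text, "CONFIG_RCU_NOCB_CPU"))
        try:
            with open("/proc/cmdline") as fh:
                print("rcu_nocbs on command line:", "rcu_nocbs=" in fh.read())
        except OSError:
            print("could not read /proc/cmdline")
```

A result of None for CONFIG_RCU_NOCB_CPU means the option is not set in that config file.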