Just to add another data point, as I've been following along ever since this first happened to me about 6 days ago. To preface, I am running Debian 9 with the 4.13 kernel from backports. After upgrading from 4.12, I went several days with a very idle Ryzen 1700 and didn't have any issues. I did have 2 qemu-kvm VMs running, and they are also mostly idle. I hit the soft lockup CPU bug out of the blue on Nov 15th while not doing anything interactive with the system.
I'm now running 6 VMs on this system, as it is the replacement for an old Intel system. I simply cannot have this thing crashing on me unattended, especially while I am remote.
Here are the measures I took. Frankly, I don't know yet whether they will prevent the lockups, as the system simply hasn't been up long enough.
Board: Gigabyte GA-AB350-GAMING 3
BIOS version: F7
Disabled Global C-State Control in the BIOS. The manual also says there should be a C6 Mode option in the BIOS, but it simply doesn't exist.
Disabled AMD Cool&Quiet in the BIOS.
Additionally, disabled C6 via the zenstates.py script, run at boot from systemd. I disabled both Package and Core, as I wasn't certain what the difference between them is.
# zenstates.py -l
P0 - Enabled - FID = 78 - DID = 8 - VID = 3A - Ratio = 30.00 - vCore = 1.18750
P1 - Enabled - FID = 87 - DID = A - VID = 50 - Ratio = 27.00 - vCore = 1.05000
P2 - Enabled - FID = 7C - DID = 10 - VID = 6C - Ratio = 15.50 - vCore = 0.87500
P3 - Disabled
P4 - Disabled
P5 - Disabled
P6 - Disabled
P7 - Disabled
C6 State - Package - Disabled
C6 State - Core - Disabled
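Since the box runs unattended, it may be worth checking after each boot that the C6 settings actually stuck. The following is only a sketch: the function name c6_all_disabled is made up for illustration, and it assumes the listing keeps the "C6 State - ... - Disabled" line format shown above.

```shell
#!/bin/sh
# Hypothetical helper: reads `zenstates.py -l` output on stdin.
# Succeeds (exit 0) only if at least one "C6 State" line is present
# and every such line ends in "Disabled"; fails (exit 1) otherwise.
c6_all_disabled() {
    awk '
        /^C6 State/ { seen++; if ($NF != "Disabled") bad++ }
        END { exit ((seen > 0 && bad == 0) ? 0 : 1) }
    '
}
```

It could be used as `zenstates.py -l | c6_all_disabled || echo "C6 is not fully disabled"`, for example from the same systemd unit that runs the script at boot.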
Question for clarification: does CONFIG_RCU_NOCB_CPU need to be set to yes in the kernel config to be able to use the boot parameter? I get the impression that the boot parameter doesn't actually do anything unless that kernel option is enabled.
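For anyone wanting to check their own system, here is a minimal sketch of how both halves of that question could be inspected. It assumes the boot parameter in question is rcu_nocbs and that the distribution installs the kernel config at /boot/config-$(uname -r), as Debian does. The helper names kconfig_value and has_boot_param are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the value of a kernel build option
# ("y", "m", or a literal value), or "unset" if it is not enabled.
# $1 = option name, $2 = config file (defaults to the running kernel's).
kconfig_value() {
    opt="$1"
    cfg="${2:-/boot/config-$(uname -r)}"
    val=$(sed -n "s/^${opt}=//p" "$cfg" | head -n 1)
    echo "${val:-unset}"
}

# Hypothetical helper: print "yes" if a name=value boot parameter
# appears on the kernel command line, "no" otherwise.
# $1 = parameter name, $2 = command line (defaults to /proc/cmdline).
has_boot_param() {
    param="$1"
    cmdline="${2:-$(cat /proc/cmdline)}"
    case " $cmdline " in
        *" $param="*) echo yes ;;
        *) echo no ;;
    esac
}
```

Running `kconfig_value CONFIG_RCU_NOCB_CPU` followed by `has_boot_param rcu_nocbs` shows whether the option was built in and whether the parameter was passed. As far as I understand the kernel-parameters documentation, rcu_nocbs only takes effect in kernels built with CONFIG_RCU_NOCB_CPU=y, so if the first command prints "unset" the parameter is most likely being ignored.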