Comment 111 for bug 1690085

In , oyvinds (oyvinds-linux-kernel-bugs) wrote :

I got hit by this bug on Fedora. After upgrading to kernel 4.13.4, my Ryzen 1600X system would randomly hang a short while after boot. I looked at various things that could be the cause and thought I had fixed it, but then I rebooted and it happened again. And again. I quickly figured out that there's no problem with kernel 4.12.3 on Fedora. I tried kernel 4.13.5 and got the same problem, so I went back to 4.12.3 until today, when I spent far too much time looking into this.

config-4.12.14-300.fc26.x86_64 on Fedora has
CONFIG_RCU_NOCB_CPU=y
CONFIG_RCU_NOCB_CPU_ALL=y

and there's no problem. config-4.13.5-300.fc27.x86_64 only has
CONFIG_RCU_NOCB_CPU=y

and with that kernel there's a problem _unless_ I add rcu_nocbs=0-11 to the kernel command line - which I only figured out after looking at this bug.
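
For anyone who wants to check their own system, here is a rough Python sketch of the comparison above. It is not an official tool; the function names (config_enables, rcu_nocbs_param, cmdline_has_rcu_nocbs) are made up for illustration. It assumes the config text has the same format as the /boot/config-* files quoted above and that CPUs are numbered contiguously from 0, which gives 0-11 for the 12 threads of a 1600X.

```python
import os


def config_enables(config_text, option):
    """Return True if the kernel config text contains the line '<option>=y'."""
    wanted = f"{option}=y"
    return any(line.strip() == wanted for line in config_text.splitlines())


def rcu_nocbs_param(ncpus):
    """Build an rcu_nocbs= argument covering CPUs 0..ncpus-1.

    Assumes contiguous CPU numbering starting at 0.
    """
    if ncpus < 1:
        raise ValueError("need at least one CPU")
    if ncpus == 1:
        return "rcu_nocbs=0"
    return f"rcu_nocbs=0-{ncpus - 1}"


def cmdline_has_rcu_nocbs(cmdline):
    """Return True if a kernel command line already carries rcu_nocbs=."""
    return any(arg.startswith("rcu_nocbs=") for arg in cmdline.split())


if __name__ == "__main__":
    try:
        with open("/proc/cmdline") as f:
            current = f.read()
    except OSError:
        current = ""  # not on Linux, or /proc not readable
    if cmdline_has_rcu_nocbs(current):
        print("rcu_nocbs= is already on the kernel command line")
    else:
        print("suggested parameter:", rcu_nocbs_param(os.cpu_count() or 1))
```

The suggested parameter still has to be added to the boot loader configuration by hand; the script only reads /proc/cmdline and prints a suggestion.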

Thank you James Le Cuirot.

This bug is listed as "Regression: No". It should be Yes in the case of Fedora; 4.12.x kernels work, 4.13.x do not work without a kernel boot parameter fix.

Please revert the commit that removed CONFIG_RCU_NOCB_CPU_ALL. The statement "The CONFIG_RCU_NOCB_CPU_ALL, CONFIG_RCU_NOCB_CPU_NONE, and CONFIG_RCU_NOCB_CPU_ZERO Kconfig options are used only in testing" is clearly false, since these options are/were used by distributions like Fedora and removing CONFIG_RCU_NOCB_CPU_ALL *breaks* kernel 4.13.5. You can't really expect distributions to ship with/add the rcu_nocbs= parameter as an alternative.

Just to repeat this point: my first conclusion was simply that 4.13.x kernels are broken, so I just stuck with 4.12.x, which doesn't have this new problem with Ryzen CPUs, until I spent the time to look into this.