Just ran across this bug in LP.
Note: 60c8144afc28 only masks the real issue. It fixes a real bug, but the reason that bug is being hit at all in this specific instance is an AMD CPU erratum that causes spurious MCEs early enough to trigger it.
However, while the crash is fixed, the thresholding interrupts will still be coming in fast and furious. It is better to disable them on affected CPUs, which the following commits do:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=45d4b7b9cb88526f6d5bd4c03efab88d75d10e4f
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=71a84402b93e5fbd8f817f40059c137e10171788
If the above 2 commits are in place, 60c8144afc28 becomes less critical, as you should no longer hit that condition.
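If you want to check whether a given kernel tree already contains both commits, something along these lines should work. This is only a rough sketch: it assumes a git checkout with mainline history, and the helper names (is_ancestor, missing_fixes) are made up for illustration. Distro kernels usually carry backports under different SHAs, so a "missing" result there does not prove the fix is absent.

```python
import subprocess

# The two erratum workaround commits linked above (mainline SHAs).
FIX_COMMITS = [
    "45d4b7b9cb88526f6d5bd4c03efab88d75d10e4f",
    "71a84402b93e5fbd8f817f40059c137e10171788",
]


def is_ancestor(sha, ref="HEAD", repo=".", run=subprocess.run):
    """Return True if commit `sha` is contained in the history of `ref`."""
    # `git merge-base --is-ancestor A B` exits 0 if A is an ancestor of B,
    # 1 if it is not, and with another status on error (e.g. unknown SHA).
    result = run(
        ["git", "-C", repo, "merge-base", "--is-ancestor", sha, ref],
        capture_output=True,
    )
    if result.returncode in (0, 1):
        return result.returncode == 0
    raise RuntimeError(f"git failed for {sha}: {result.stderr!r}")


def missing_fixes(ref="HEAD", repo=".", run=subprocess.run):
    """List the workaround commits that are NOT in the history of `ref`."""
    return [sha for sha in FIX_COMMITS if not is_ancestor(sha, ref, repo, run)]


if __name__ == "__main__":
    missing = missing_fixes()
    if missing:
        print("missing:", ", ".join(missing))
    else:
        print("both erratum workaround commits are present")
```

Run it from inside the kernel tree. An empty "missing" list means both workarounds are in the history of HEAD, in which case 60c8144afc28 matters less.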