Comment 96 for bug 1505564

Junien F (axino) wrote :

Hi Rafael,

For starters, the server Chris mentioned above didn't panic because kernel.softlockup_panic wasn't set to 1 on reboot. This is now fixed.
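
For reference, here is a minimal sketch of how one could check that the setting both persists across reboots and is live on a node. It is only an illustration, not what we actually ran: the config file name is hypothetical, and the parser is a simplification of the sysctl.conf format (plain "key = value" lines only).

```python
from pathlib import Path


def parse_sysctl_conf(text):
    """Parse sysctl.conf-style text into a dict of key -> value (simplified)."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments ('#' and ';' both start a comment).
        if not line or line[0] in "#;":
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a "key = value" line
        settings[key.strip()] = value.strip()
    return settings


def persists(key, wanted, conf_text):
    """True if conf_text sets key to wanted, i.e. it would be applied at boot."""
    return parse_sysctl_conf(conf_text).get(key) == wanted


def runtime_value(key, proc_root="/proc/sys"):
    """Return the live value of a sysctl key, or None if it cannot be read."""
    path = Path(proc_root) / key.replace(".", "/")
    try:
        return path.read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    key = "kernel.softlockup_panic"
    # Hypothetical file name; use whichever file holds the setting on your nodes.
    conf = Path("/etc/sysctl.d/60-softlockup.conf")
    conf_text = conf.read_text() if conf.exists() else ""
    print("persistent:", persists(key, "1", conf_text))
    print("runtime:", runtime_value(key))
```

A node where "persistent" is False but "runtime" is 1 would lose the setting on its next reboot, which is the situation described above.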

Also, we're still running 3.19 (all the nodes were rebooted into 3.19.0-33-generic). Let me know if you want us to go back to 3.13.

I verified that all firmware versions were the most recent ones, and they were.

I rebooted all the nodes with the proper x2apic kernel options. I also disabled all C-states and set everything relevant to "performance". You can see the changes here: http://paste.ubuntu.com/13312776/ (this paste shows all possible settings on G7 and Gen8; of course, I could only apply the settings that exist on each hardware generation).

Unfortunately, even with all this, a G7 panicked and crashdumped about 1h after I put it back in the compute pool. You will find the apport and crashdump below.

Let me know what the next steps are.

Thanks!