Hi Rafael,
I've been continuing Junien's investigations into this problem. The machines have had all the BIOS and firmware updates I could find on HP's website (although in the case of a DL385-G7 the latest appears to be February 2014!). One of them only lasted a day before crashing again.
So, step 2 was to add "nox2apic intremap=off" to the DL385-G7s. I added it to only one of them initially. That machine lasted 9 days before we had another kernel panic ("NMI watchdog: BUG: soft lockup - CPU#27 stuck for 23s! [migration/27:200]"), but after the panic it seems to have settled back down again (without any reboot).
I've also added "intremap=no_x2apic_optout" to one of the DL360-G8s after it crashed a couple of days ago. So far, it's doing OK.
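For anyone repeating this on other machines, here is a rough sketch of how the boot parameters could be double-checked after a reboot. It assumes the running kernel's command line is readable from /proc/cmdline (standard on Linux); the function names are made up for illustration and are not part of any existing tool.

```python
def missing_kernel_params(cmdline, wanted):
    """Return the entries of `wanted` that do not appear in `cmdline`.

    `cmdline` is the kernel command line as a single string; parameters
    are compared as whole whitespace-separated tokens.
    """
    present = set(cmdline.split())
    return [param for param in wanted if param not in present]


def read_cmdline(path="/proc/cmdline"):
    """Read the running kernel's command line (assumes a Linux /proc)."""
    with open(path) as handle:
        return handle.read().strip()
```

On a DL385-G7 one might call missing_kernel_params(read_cmdline(), ["nox2apic", "intremap=off"]), and on the DL360-G8 pass ["intremap=no_x2apic_optout"] instead; an empty list means every requested parameter is on the running kernel's command line.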
I'm tempted to try upgrading them to linux-image-generic-lts-wily (currently 4.2.0.18.13) unless there's any information from the current setup that could be useful.