Comment 89 for bug 1011792

Jérôme Petazzoni (jerome-petazzoni) wrote :

For the record: we also experienced this issue on EC2 PV instances (m2.2xlarge in us-east-1), using a very intensive workload (~1000 LXC containers running a mix of web apps and databases). Running any kind of 3.X kernel (3.2, 3.4, 3.6) causes our test workload to crash in less than one hour.

The same workload, with the same kernel, on bare metal, works fine.
The same workload, with the same kernel, on HVM instances, works fine.
The same workload, with the 3.2.0-39-virtual kernel in "proposed", on PV instances (m3.2xlarge), works fine.
The same workload, with a custom 3.2.40+grsec kernel including the spinlock IPI patch, on PV instances, works fine.

We'll update this ticket if we see other problems appear, but it looks like this (1 line!) patch did the trick.

Congratulations to those involved with the debugging: this thing has kept us locked to pre-3.0 kernels on our old fleet of EC2 instances, and seeing it go away is a huge relief!