Initially this looked similar to bug 1413540.
That bug was addressed in 3.13 with commit 9242b5b to _mitigate_ the issue, but that patch is already present in 3.16, so perhaps we're hitting a different failure mode.
It would be good to know whether the smp_call_function_* path in the backtrace actually leads to an IPI that gets lost, leaving us spinning in csd_lock_wait.
Are you running nested KVM instances? How often does this lockup occur? Can you get crashdumps of this issue?
Thanks,
--chris j arges