Hello @pooja-9, @kphatak-pf9 and @vlee,
You wouldn't happen to have Kernel Samepage Merging (KSM) enabled on your compute nodes, would you?
You can check by looking at the value of:
$ cat /sys/kernel/mm/ksm/run
If it prints 1, KSM is enabled on that node; if it prints 0, or the file is missing, KSM is not running.
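If you need to check several nodes, the same test can be wrapped in a small script. This is only a rough sketch: the helper name ksm_status is made up for illustration, and the optional path argument is there purely so the function can be pointed at a different file for testing. It treats any value other than 1 as "not running", which covers 0 and also 2 (stopped, with previously merged pages unmerged).

```shell
#!/bin/sh
# Report whether KSM is running, based on the sysfs "run" flag.
# ksm_status is a hypothetical helper name, not an existing tool.
ksm_status() {
    # Default to the real sysfs path; an argument overrides it (for testing).
    ksm_run_file="${1:-/sys/kernel/mm/ksm/run}"

    if [ ! -r "$ksm_run_file" ]; then
        # No such file: the kernel does not expose KSM on this node.
        echo "not available"
        return 0
    fi

    case "$(cat "$ksm_run_file")" in
        1) echo "enabled" ;;
        *) echo "disabled" ;;   # 0 (stopped) or 2 (stopped, pages unmerged)
    esac
}

ksm_status
```

Running it on each compute node should print one of enabled, disabled or not available.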
We have just hit the problem, and I think I have found a fix for it. I will fix the 4.15 kernel once I have analysed the problem a bit more.