Comment 56 for bug 1861359

Sultan Alsawaf (kerneltoast) wrote :

This problem is caused by an upstream memory management feature called watermark boosting. Normally, when a memory allocation fails and falls back to the page allocator, the page allocator will wake up kswapd to free up pages in order to make the memory allocation succeed. kswapd then keeps freeing memory until every memory zone has at least a certain amount of free memory, a per-zone threshold called the high watermark.
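To see the watermarks being talked about here, the kernel lists them per zone in /proc/zoneinfo, in units of pages. The following is only a rough sketch for pulling them out; the helper name zone_watermarks is made up for illustration, and it assumes the usual zoneinfo layout ("Node N, zone NAME" headers followed by "pages free", "min", "low" and "high" lines).

```shell
#!/bin/sh
# zone_watermarks is a hypothetical helper name, not an existing tool.
# It prints one line per zone with the free page count and the
# min/low/high watermarks (all values are in pages).
# An alternative input file can be passed as $1; default is /proc/zoneinfo.
zone_watermarks() {
    awk '
        # Header line looks like: "Node 0, zone      DMA"
        $1 == "Node" { zone = $2 " " $4; sub(",", "", zone) }
        # "pages free N" starts the block that holds the watermarks
        $1 == "pages" && $2 == "free" { printf "node %s free=%s", zone, $3 }
        # Per-CPU pageset lines use "high:" with a colon, so they do not match
        $1 == "min" || $1 == "low" || $1 == "high" { printf " %s=%s", $1, $2 }
        # "high" is the last watermark listed, so end the line there
        $1 == "high" { print "" }
    ' "${1:-/proc/zoneinfo}"
}

# Only run against the live file when it exists and is readable.
if [ -r /proc/zoneinfo ]; then
    zone_watermarks
fi
```

When free memory in a zone sits well above its "high" value, kswapd has nothing left to do for that zone.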

Watermark boosting tries to fire up kswapd preemptively, before any allocation has failed. It does this by raising kswapd's high watermark goal and then waking kswapd. This causes freezes because, with the raised goal, kswapd steals memory from processes that need it in order to make forward progress. Those processes then try to allocate memory again, which causes kswapd to steal the pages they need again, in a positive feedback loop known as page thrashing. While page thrashing is occurring, your system is essentially livelocked until enough forward progress is made that processes stop continuously allocating memory and triggering kswapd to steal it back.

This problem already occurs with kswapd *without* watermark boosting, but it's usually only encountered on machines with a small amount of memory and/or a slow CPU. Watermark boosting just makes the existing problem bad enough to be noticeable on higher-spec machines.

To fix the issue in this bug, watermark boosting can be disabled by running the following as root (the setting only lasts until the next reboot):
# echo 0 > /proc/sys/vm/watermark_boost_factor
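If you want to see what the value was before changing it, something like the sketch below should do. The helper names (boost_factor, disable_boost) and the KNOB variable are made up for illustration; KNOB is only there so the helpers can be pointed at a scratch file instead of the real tunable. Writing to the real file needs root.

```shell
#!/bin/sh
# KNOB, boost_factor and disable_boost are hypothetical names for this sketch.
# KNOB defaults to the real tunable but can be overridden for a dry run.
KNOB="${KNOB:-/proc/sys/vm/watermark_boost_factor}"

# Print the current boost factor (0 means watermark boosting is disabled).
boost_factor() {
    cat "$KNOB"
}

# Write 0 to the tunable and report the old and new values.
disable_boost() {
    old=$(boost_factor)
    echo 0 > "$KNOB"
    echo "watermark_boost_factor: $old -> $(boost_factor)"
}

# To keep the setting across reboots, the same value can go in a sysctl
# drop-in file. The file name below is an assumption; any name under
# /etc/sysctl.d ending in .conf should be picked up at boot:
#   echo 'vm.watermark_boost_factor = 0' > /etc/sysctl.d/99-watermark-boost.conf
```

Running disable_boost as root then prints the previous value next to the new one, so it is easy to restore the old value later if needed.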

There's really no harm in doing so, because watermark boosting is an inherently broken feature...