Comment 8 for bug 1680390

Seth Forshee (sforshee) wrote :

We've been getting a number of other reports of this problem. I've been trying to reproduce it locally, without any luck so far. However, it does appear that the problem happens only in zesty kernels and not with upstream 4.10 stable kernels, which suggests that one of the backports or sauce patches we've applied is responsible. The stack trace suggests a problem with migration (or possibly KSM).

Going through the backports and sauce patches that touch the kernel mm code, a few stand out based on the size of the changes and the code they're touching. All of them are backports requested for power9 on bug #1671613.

6e2a092a48d3 mm: introduce page_vma_mapped_walk()
3000e033152a mm, ksm: convert write_protect_page() to use page_vma_mapped_walk()
c228a1037cd6 mm/ksm: handle protnone saved writes when making page write protect

Some upstream bug fixes reference these patches too (though none of them mention the BUG we're hitting):

d19469e84158 powerpc/mm: update pte_write and pte_wrprotect to handle savedwrite
d75450ff40df mm: fix page_vma_mapped_walk() for ksm pages

I'm going to try to set up the Avocado tests today to see if that allows me to reproduce the problem. If you are able to reproduce it reliably, you could try applying the fixes above to see if they help, or try bisecting the patches applied to zesty on top of upstream 4.10 to identify the patch which causes the issue.
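
For anyone bisecting the patch series by hand, the search itself can be sketched roughly as below. This is only an illustration, not a tool we ship: first_bad_patch and the reproduces callback are hypothetical names, and the sketch assumes the BUG is absent on plain upstream 4.10, present with every zesty patch applied, and stays present once the offending patch is included. In practice reproduces would mean "build a kernel with this prefix of the series, boot it, and run the reproducer".

```python
from typing import Callable, Sequence


def first_bad_patch(patches: Sequence[str],
                    reproduces: Callable[[Sequence[str]], bool]) -> str:
    """Return the first patch in the series whose inclusion triggers the bug.

    Assumptions (hypothetical, for illustration only):
      - with no patches applied the bug does not reproduce;
      - with all patches applied it does;
      - once the offending patch is applied, later patches do not hide it.
    """
    if not patches:
        raise ValueError("need at least one patch to bisect")

    # Invariant: bug is absent with patches[:lo], present with patches[:hi].
    lo, hi = 0, len(patches)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if reproduces(patches[:mid]):
            hi = mid   # culprit is among the first `mid` patches
        else:
            lo = mid   # culprit was applied after the first `mid` patches
    return patches[hi - 1]
```

Each step halves the range, so a series of N patches needs about log2(N) kernel builds and test runs. If the series is kept as commits in a git tree, git bisect performs the same search without any extra scripting.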