Comment 10 for bug 1837810

Matthew Ruffell (mruffell) wrote :

Verification steps for Bionic:

First, I made sure I could reproduce the problem on 4.15.0-115-generic.

I made a fresh Bionic VM, and copied over the ksm_refcnt_overflow.sh and zero_page_refcount.c files.

I built the kernel module, and inserted it into the kernel.

From there, I checked the zero_page reference counter.

$ sudo insmod zero_page_refcount.ko
[sudo] password for ubuntu:
ubuntu@ubuntu:~/module$ cat /proc/zero_page_refcount
Zero Page Refcount: 0x1 or 1

From there, in another terminal, I ran the script ksm_refcnt_overflow.sh, and
checked that the VMs were running:

$ virsh list
 Id Name State
----------------------------------------------------
 1 instance-0 running
 2 instance-1 running
 3 instance-2 running
 4 instance-3 running
 5 instance-4 running

From there, we can see the reference counter increment:

$ cat /proc/zero_page_refcount
Zero Page Refcount: 0x1158 or 4440
ubuntu@ubuntu:~/module$ cat /proc/zero_page_refcount
Zero Page Refcount: 0x1622 or 5666
ubuntu@ubuntu:~/module$ cat /proc/zero_page_refcount
Zero Page Refcount: 0x163a or 5690

I issued the set command to bring the counter close to overflow:

$ cat /proc/zero_page_refcount_set
Zero Page Refcount set to 0x1FFFFFFFFF000

I then checked and saw it overflow:

ubuntu@ubuntu:~/module$ cat /proc/zero_page_refcount
Zero Page Refcount: 0x7fffff27 or 2147483431
ubuntu@ubuntu:~/module$ cat /proc/zero_page_refcount
Zero Page Refcount: 0x7fffff92 or 2147483538
ubuntu@ubuntu:~/module$ cat /proc/zero_page_refcount
Zero Page Refcount: 0x80000000 or -2147483648
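
The last reading is consistent with the counter being treated as a signed 32-bit value: 0x7fffffff is the largest positive value, and one more increment flips the sign bit. A minimal sketch of that interpretation, using the readings above (the helper name as_signed32 is only for illustration and is not part of the module):

```python
def as_signed32(value: int) -> int:
    """Interpret the low 32 bits of value as a two's-complement signed integer."""
    value &= 0xFFFFFFFF
    # If the sign bit (bit 31) is set, the value is negative.
    return value - 0x100000000 if value & 0x80000000 else value


# The three readings taken just before and at the overflow.
for raw in (0x7FFFFF27, 0x7FFFFF92, 0x80000000):
    print(f"{raw:#x} or {as_signed32(raw)}")
```

This prints the same hex/decimal pairs as the /proc output above, ending in 0x80000000 or -2147483648.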

The instances became paused, and virtualisation was broken:

$ virsh list
 Id Name State
----------------------------------------------------
 5 instance-4 paused
 6 instance-5 paused
 7 instance-6 paused
 8 instance-7 paused
 9 instance-0 paused
 10 instance-1 paused
 11 instance-2 paused
 12 instance-3 paused

From there, we see the usual call trace in dmesg:

https://paste.ubuntu.com/p/wpJkGCH3fJ/

I rebooted, and enabled -proposed. I then installed the 4.15.0-116-generic kernel, and rebooted again.

I rebuilt the zero_page_refcount kernel module with the new headers, and inserted it into the running kernel.

$ uname -rv
4.15.0-116-generic #117-Ubuntu SMP Fri Aug 28 16:04:22 UTC 2020
$ sudo insmod zero_page_refcount.ko
ubuntu@ubuntu:~/module$ cat /proc/zero_page_refcount
Zero Page Refcount: 0x1 or 1

From there, I started the script ksm_refcnt_overflow.sh in another terminal.

We can see that VMs are running:

$ virsh list
 Id Name State
----------------------------------------------------
 1 instance-1 running
 2 instance-2 running
 3 instance-3 running
 4 instance-4 running

Checking the value of the zero_page reference counter:

$ cat /proc/zero_page_refcount
Zero Page Refcount: 0x1 or 1

We are still at 1. Now attempting to trigger overflow:

$ cat /proc/zero_page_refcount_set
Zero Page Refcount set to 0x1FFFFFFFFF000

$ cat /proc/zero_page_refcount
Zero Page Refcount: 0x7fffff00 or 2147483392
ubuntu@ubuntu:~/module$ cat /proc/zero_page_refcount
Zero Page Refcount: 0x7fffff00 or 2147483392
ubuntu@ubuntu:~/module$ cat /proc/zero_page_refcount
Zero Page Refcount: 0x7fffff00 or 2147483392

The reference counter is never incremented, and will not overflow.
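
As a rough cross-check of the readings above, the sketch below parses the three output lines, confirms they are identical, and works out how many increments would still be needed to overflow. The line format is taken from the output above; the helper names are hypothetical, and a signed 32-bit counter is assumed.

```python
import re

# Matches lines such as "Zero Page Refcount: 0x7fffff00 or 2147483392".
LINE_RE = re.compile(r"Zero Page Refcount: (0x[0-9a-fA-F]+) or (-?\d+)")

INT_MAX = 2**31 - 1  # assumption: the counter is a signed 32-bit integer


def parse_refcount(line: str) -> int:
    """Return the decimal refcount from one line of /proc output."""
    match = LINE_RE.search(line)
    if match is None:
        raise ValueError(f"unrecognised line: {line!r}")
    return int(match.group(2))


def is_stable(lines) -> bool:
    """True if every sampled line reports the same refcount."""
    return len({parse_refcount(line) for line in lines}) == 1


samples = ["Zero Page Refcount: 0x7fffff00 or 2147483392"] * 3
print(is_stable(samples))                    # True
print(INT_MAX - parse_refcount(samples[0]))  # 255
```

With the counter parked 255 below the maximum, a kernel that was still taking references on the zero page would overflow almost at once, so three identical readings are a meaningful result.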

The problem is solved, and I am happy to mark this bug as verified for Bionic.