Avoid migration issues with aligned 2MB THP

Bug #1788098 reported by Christian Ehrhardt on 2018-08-21
This bug affects 1 person
Affects                            Importance   Assigned to
The Ubuntu-power-systems project   Medium       bugproxy
linux (Ubuntu)                     Medium       Unassigned
  Bionic                           Medium       Unassigned
  Cosmic                           Medium       Unassigned
qemu (Ubuntu)                      Undecided    Unassigned

Bug Description

FYI: This blocks bug 1781526 - once this one here is resolved we can go on with SRU considerations for 1781526

------- Comment From <email address hidden> 2018-08-20 17:12 EDT-------

Hi, in some environments it was observed that this qemu patch to enable THP made it more likely to hit guest migration issues. However, the following kernel patch resolves those migration issues:

https://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc.git/commit/?h=kvm-ppc-next&id=c066fafc595eef5ae3c83ae3a8305956b8c3ef15
KVM: PPC: Book3S HV: Use correct pagesize in kvm_unmap_radix()

Once merged upstream, it would be good to include that change as well to avoid potential migration problems. Should I open a new bug for that or is it better to track here?

Note Paelzer: I have not seen related migration issues myself, but it seems reasonable and confirmed by IBM.

Oh, I just realized that, although this was initially reported against qemu in bug 1781526, it is a kernel patch and not a qemu patch.

That spreads the timeline a bit:
- this should be in Cosmic before Release to avoid issues due to the fix of 1781526.
  - since that is kind of short I'll bump priority there.
- This has to be in Bionic before a fix for bug 1781526 (I'll wait with a qemu change until this one is complete)

I'm marking the qemu task invalid (no action there other than to track the Bionic release of this which will finally unblock the SRU of bug 1781526 to Bionic).

I'm adding a kernel task to reflect that this is a kernel change that is needed.
Finally I'm adding a Cosmic and Bionic Task.

Changed in qemu (Ubuntu):
status: New → Invalid
no longer affects: qemu (Ubuntu Bionic)
no longer affects: qemu (Ubuntu Cosmic)
Changed in linux (Ubuntu Cosmic):
importance: Undecided → Critical
Changed in linux (Ubuntu Bionic):
importance: Undecided → Medium

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1788098

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete

For this particular case the log files are not needed and/or applicable.
After discussing in #stable-kernel I set it to confirmed.

Changed in linux (Ubuntu Bionic):
status: New → Confirmed
Changed in linux (Ubuntu Cosmic):
status: Incomplete → Confirmed

FYI: this is essentially an IBM request, reverse mirroring will happen at some point, but I wanted to make you aware right now

no longer affects: qemu
Changed in ubuntu-power-systems:
assignee: nobody → Canonical Kernel Team (canonical-kernel-team)
importance: Undecided → Critical
tags: added: triage-g
description: updated
Manoj Iyer (manjo) on 2018-08-22
Changed in ubuntu-power-systems:
assignee: Canonical Kernel Team (canonical-kernel-team) → bugproxy (bugproxy)
Changed in linux (Ubuntu Bionic):
assignee: nobody → Joseph Salisbury (jsalisbury)
Changed in linux (Ubuntu Cosmic):
assignee: nobody → Joseph Salisbury (jsalisbury)
Changed in linux (Ubuntu Bionic):
importance: Medium → Critical
status: Confirmed → In Progress
Changed in linux (Ubuntu Cosmic):
status: Confirmed → In Progress
Joseph Salisbury (jsalisbury) wrote :

I built a test kernel with the following patch:
KVM: PPC: Book3S HV: Use correct pagesize in kvm_unmap_radix()

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1788098

Can you test this kernel and see if it resolves this bug?

Note about installing test kernels:
• If the test kernel is prior to 4.15(Bionic) you need to install the linux-image and linux-image-extra .deb packages.
• If the test kernel is 4.15(Bionic) or newer, you need to install the linux-modules, linux-modules-extra and linux-image-unsigned .deb packages.
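For illustration, the rule in the note above can be written as a small shell helper. This is only a sketch: the function name `test_kernel_packages` is made up here, and it assumes GNU `sort -V` is available for the version comparison.

```shell
# Hypothetical helper: print the .deb package families to install for a
# test kernel, following the 4.15 cut-over described in the note.
test_kernel_packages() {
    version="$1"
    # sort -V orders version strings; if 4.15 sorts first (or is equal),
    # the given version is 4.15 or newer.
    oldest=$(printf '%s\n' "4.15" "$version" | sort -V | head -n 1)
    if [ "$oldest" = "4.15" ]; then
        echo "linux-modules linux-modules-extra linux-image-unsigned"
    else
        echo "linux-image linux-image-extra"
    fi
}
```

Calling `test_kernel_packages 4.18` prints the three package families for newer kernels; the matching .deb files from the download location would then be installed with `dpkg -i`.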

Thanks in advance!

------- Comment From <email address hidden> 2018-08-30 10:29 EDT-------
Thanks, I've asked for some testing assistance from our KVM team, but I will note here some of the details from the original report of this problem.

The repro steps are just a simple local host migration.

They later noted that increasing the speed was a workaround:
(qemu) migrate_set_speed 1G

So you would want to test with the default speed to confirm the issue is resolved:

(qemu) migrate -d tcp:localhost:4444

Use cosmic qemu version 1:2.12+dfsg-3 from Bug 169712 / LP 1781526 (which enables qemu to use 2MB THP backing for powerpc), plus the test kernel build from this bug.

Note that without the kernel fix discussed in this bug, a migration problem might still happen even without that qemu THP patch, if you happen to get a 2MB alignment by chance.
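As a minimal sketch of what "2MB alignment" means here, assuming it refers to the start address of a memory region being a multiple of 2 MiB, the check is simple modulo arithmetic. The helper name `is_2mb_aligned` is hypothetical and not part of qemu or the kernel.

```shell
# Hypothetical helper: succeed if the given address (decimal or 0x-hex)
# is a multiple of 2 MiB, fail otherwise.
is_2mb_aligned() {
    addr=$(( $1 ))
    [ $(( addr % (2 * 1024 * 1024) )) -eq 0 ]
}
```

For example, `is_2mb_aligned 0x200000` succeeds, while `is_2mb_aligned 0x201000` fails because the address is only 4 KiB past a 2 MiB boundary.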

tags: added: architecture-ppc64le bugnameltc-170805 severity-critical targetmilestone-inin---
bugproxy (bugproxy) on 2018-08-30
tags: added: targetmilestone-inin1804
removed: targetmilestone-inin---
Changed in ubuntu-power-systems:
status: New → In Progress
Manoj Iyer (manjo) on 2018-09-24
tags: added: triage-a
removed: triage-g
Manoj Iyer (manjo) on 2018-10-01
Changed in ubuntu-power-systems:
status: In Progress → Incomplete
Andrew Cloke (andrew-cloke) wrote :

Marking as incomplete while awaiting the IBM testing assistance described in comment #6.

Nothing has happened here yet.
I also declared the related qemu fix that is blocked by this as incomplete.
@manoj/jfh - maybe time for triage-r here?

tags: added: triage-r
removed: triage-a
Andrew Cloke (andrew-cloke) wrote :

After discussions with IBM, reducing the priority.

Changed in ubuntu-power-systems:
importance: Critical → Medium
Changed in linux (Ubuntu):
importance: Critical → Medium
Changed in linux (Ubuntu Bionic):
importance: Critical → Medium
Changed in linux (Ubuntu Cosmic):
importance: Critical → Medium
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-12-21 12:10 EDT-------
Hello,

I have been trying to reproduce this bug over this week, but I couldn't do so on Ubuntu.

Could anyone verify what I have been doing wrong?

#################

## QEMU

I have built QEMU version 3.1.0 and made sure the patch that enables THP was included:
../configure --target-list=ppc-linux-user,ppc64-linux-user,ppc64le-linux-user,ppc-softmmu,ppc64-softmmu --enable-debug-info --enable-trace-backends=log --python=/usr/bin/python3 && make -j $(nproc)

./ppc-softmmu/qemu-system-ppc -version
QEMU emulator version 3.1.0 (v3.1.0-dirty)

## Kernel

uname -a
Linux NAME 4.15.0-20-generic #21-Ubuntu SMP Tue Apr 24 06:14:44 UTC 2018 ppc64le ppc64le ppc64le GNU/Linux

cat /sys/kernel/mm/transparent_hugepage/enabled
[always] madvise never
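The active THP mode is the value shown in square brackets. A small sketch for extracting it in a script follows; the function name `thp_active_mode` is made up for this example, and it takes the file contents as an argument.

```shell
# Hypothetical helper: print the active THP mode, i.e. the bracketed word.
# $1: contents of /sys/kernel/mm/transparent_hugepage/enabled
thp_active_mode() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([a-z]*\)].*/\1/p'
}
```

On a host, it could be called as `thp_active_mode "$(cat /sys/kernel/mm/transparent_hugepage/enabled)"`.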

## CLI command

Both commands were run on the same host; (1) is the "migrating from" instance and (2) is the "migrate to" instance.

(1)
MALLOC_PERTURB_=1 /home/leonardo/qemu/build/ppc64-softmmu/qemu-system-ppc64 \
-nographic \
-serial mon:stdio \
-S \
-name 'avocado-vt-vm1' \
-machine pseries \
-nodefaults \
-vga std \
-device pci-bridge,id=pci_bridge,bus=pci.0,addr=0x3,chassis_nr=1 \
-device virtio-serial-pci,id=virtio_serial_pci0,bus=pci.0,addr=0x4 \
-object rng-random,filename=/dev/random,id=passthrough-RHq4nIpF \
-device virtio-rng-pci,id=virtio-rng-pci-aXCni2OX,rng=passthrough-RHq4nIpF,bus=pci.0,addr=0x5 \
-device nec-usb-xhci,id=usb1,bus=pci.0,addr=0x6 \
-device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=0x7 \
-drive id=drive_image1,if=none,snapshot=off,aio=native,cache=none,format=qcow2,file=/home/leonardo/images/ubuntu-18.04-ppc64le.qcow2 \
-device scsi-hd,id=image1,drive=drive_image1 \
-m 8192 \
-smp 4,maxcpus=4,cores=2,threads=1,sockets=2 \
-device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
-vnc :0 \
-rtc base=utc,clock=host \
-boot order=cdn,once=c,menu=off,strict=off \
-enable-kvm \
-watchdog i6300esb \
-watchdog-action reset \
-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x9

(2) Same as above, with only a few changes:
- -name 'avocado-vt-vm1' \
+ -name 'avocado-vt-vm2' \
- -vnc :0 \
+ -vnc :1 \
+ -incoming tcp:0:5801 \

## Testing and Results

(1) On guest :
# stress --io 5 --cpu 4
stress: info: [812] dispatching hogs: 4 cpu, 5 io, 0 vm, 0 hdd

(1) on Qemu Terminal:
(qemu) migrate_set_speed 256
(qemu) migrate -d tcp:0:5801
(qemu) info migrate
globals:
store-global-state: on
only-migratable: off
send-configuration: on
send-section-footer: on
capabilities: xbzrle: off rdma-pin-all: off auto-converge: off zero-blocks: off compress: off events: off postcopy-ram: off x-colo: off release-ram: off block: off return-path: off pause-before-switchover: off x-multifd: off dirty-bitmaps: off
Migration status: completed
total time: 1776 milliseconds
downtime: 61 milliseconds
setup: 9 milliseconds
transferred ram: 422571 kbytes
throughput: 1964.89 mbps
remaining ram: 0 kbytes
total ram: 8405056 kbytes
duplicate: 2006371 pages
skipped: 0 pages
normal: 101037 pages
normal bytes: 404148 kbytes
dirty sync count: 3
page size: 4 kbytes
(qemu) info status
VM status: paused (postmigrate)

It's all over in ~2 seconds...


bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2019-01-04 06:12 EDT-------
I have tried the following test in order to reproduce the bug:

##
root@localhost:~# uname -a
Linux localhost 4.15.0-20-generic #21-Ubuntu SMP Tue Apr 24 06:14:44 UTC 2018 ppc64le ppc64le ppc64le GNU/Linux
root@localhost:~# cat /sys/kernel/mm/transparent_hugepage/enabled
[always] madvise never
##

dd if=/dev/urandom of=/dev/shm/img bs=2M count=2000
md5sum /dev/shm/img > test.md5

After the migration, I ran:
md5sum -c test.md5
The result was OK (memory not corrupted).

I also modified the above test to allocate separate 2M chunks, like this:

for i in {0001..2000} ; do dd if=/dev/urandom of=/dev/shm/img_${i} bs=2M count=1 ; done
md5sum /dev/shm/* > test.md5

After the migration, I ran:
md5sum -c test.md5
The result was OK for every file (memory not corrupted).
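The checksum-before, verify-after procedure above could be wrapped in two small functions, roughly as follows. This is a sketch only: the function names, the `img_` file prefix, and the `checksums.md5` file name are assumptions for the example, and the directory is passed in rather than hard-coded to /dev/shm.

```shell
# Sketch: fill a directory with random blocks and record their checksums.
# Arguments: target directory, number of blocks, block size (dd syntax).
make_blocks() {
    dir="$1"; count="$2"; bs="$3"
    i=1
    while [ "$i" -le "$count" ]; do
        dd if=/dev/urandom of="$dir/img_$i" bs="$bs" count=1 2>/dev/null
        i=$((i + 1))
    done
    # Record checksums with paths relative to the directory
    ( cd "$dir" && md5sum img_* > checksums.md5 )
}

# Sketch: re-check the recorded checksums; fails if any block changed.
verify_blocks() {
    ( cd "$1" && md5sum -c checksums.md5 >/dev/null 2>&1 )
}
```

In the guest, something like `make_blocks /dev/shm 2000 2M` would run before the migration and `verify_blocks /dev/shm` after it, so that the data sits in guest memory as in the test above.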

Conclusion:
- I have found no difference between the patched and unpatched kernels during the tests.
- The memory after the migration seems fine, returning the same memory block (tested with md5sum)

Is there any other suggestion about how to reproduce the bug?

Thanks!

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2019-01-04 14:29 EDT-------
Test: Verify all memory after migration

###################
Host:
###################

# uname -a
Linux host 4.15.0-20-generic #21-Ubuntu SMP Tue Apr 24 06:14:44 UTC 2018 ppc64le ppc64le ppc64le GNU/Linux

#cat /sys/kernel/mm/transparent_hugepage/enabled
[always] madvise never

#cat /proc/cpuinfo
[...]
processor : 159
cpu : POWER9, altivec supported
clock : 2300.000000MHz
revision : 2.2 (pvr 004e 1202)

timebase : 512000000
platform : PowerNV
model : 8375-42A
machine : PowerNV 8375-42A
firmware : OPAL
MMU : Radix

As previously, I have built QEMU version 3.1.0 and made sure the patch that enables THP was included:
#../configure --target-list=ppc-linux-user,ppc64-linux-user,ppc64le-linux-user,ppc-softmmu,ppc64-softmmu --enable-debug-info --enable-trace-backends=log --python=/usr/bin/python3 && make -j $(nproc)

#./ppc-softmmu/qemu-system-ppc -version
QEMU emulator version 3.1.0 (v3.1.0-dirty)

###################
Guest:
###################

### CLI 1: Migrating from:
MALLOC_PERTURB_=1 /home/leonardo/qemu/build/ppc64-softmmu/qemu-system-ppc64 \
-nographic \
-serial mon:stdio \
-name 'avocado-vt-vm1' \
-machine pseries \
-nodefaults \
-vga std \
-device pci-bridge,id=pci_bridge,bus=pci.0,addr=0x3,chassis_nr=1 \
-device virtio-serial-pci,id=virtio_serial_pci0,bus=pci.0,addr=0x4 \
-object rng-random,filename=/dev/random,id=passthrough-RHq4nIpF \
-device virtio-rng-pci,id=virtio-rng-pci-aXCni2OX,rng=passthrough-RHq4nIpF,bus=pci.0,addr=0x5 \
-device nec-usb-xhci,id=usb1,bus=pci.0,addr=0x6 \
-device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=0x7 \
-drive id=drive_image1,if=none,snapshot=off,aio=native,cache=none,format=qcow2,file=/home/leonardo/images/ubuntu-18.04-ppc64le.qcow2 \
-device scsi-hd,id=image1,drive=drive_image1 \
-m 8192 \
-smp 4,maxcpus=4,cores=2,threads=1,sockets=2 \
-device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
-vnc :0 \
-rtc base=utc,clock=host \
-boot order=cdn,once=c,menu=off,strict=off \
-enable-kvm \
-watchdog i6300esb \
-watchdog-action reset \
-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x9 \
-initrd /boot/initrd.img-4.15.0-20-generic \
-kernel /boot/vmlinux-4.15.0-20-generic \
-append "root=UUID=b4ef9412-06d6-4947-9969-f15c7cc2c986 ro quiet splash"

### CLI 2: Migrating To
Copy of CLI 1, changing:

- -name 'avocado-vt-vm1' \
+ -name 'avocado-vt-vm2' \
+ -S
- -vnc :0 \
+ -vnc :1 \
+ -incoming tcp:0:5801 \

### Inside Guest:

#uname -a
Linux localhost 4.15.0-20-generic #21-Ubuntu SMP Tue Apr 24 06:14:44 UTC 2018 ppc64le ppc64le ppc64le GNU/Linux

# cat /sys/kernel/mm/transparent_hugepage/enabled
[always] madvise never

#cat /proc/cpuinfo
processor : 3
cpu : POWER9 (architected), altivec supported
clock : 2900.000000MHz
revision : 2.2 (pvr 004e 1202)

timebase : 512000000
platform : pSeries
model : IBM pSeries (emulated by qemu)
machine : CHRP IBM pSeries (emulated by qemu)
MMU : Radix

###################
Test Software:
###################
I created a simple C file to:
- allocate 2MB blocks,
- write urandom to them,
- md5sum all the blocks together,
- stops,...


------- Comment (attachment only) From <email address hidden> 2019-01-04 14:32 EDT-------

Hi Leonardo,
thanks for your efforts trying to verify that.
Given that you couldn't trigger it I wonder what to do.
Currently the bug is incomplete, waiting for such a test, but as it seems to elude you I'd suggest we call the bug invalid until we know otherwise.

For the related bug 1781526 I would think it faces a similar fate.
There, too, the test/verification was more or less left with Jhopper.
It was said that this bug here might occur more often if 1781526 were applied.
While we couldn't trigger the bug here, I'm reluctant to push a nice but minor performance fix while we know it might trigger more crashes.

Therefore I'll set BOTH bugs to invalid and would ask the kernel Team to stop working on this one here until one can provide a working trigger&verification.

Changed in linux (Ubuntu Cosmic):
status: In Progress → Invalid
Changed in linux (Ubuntu Bionic):
status: In Progress → Invalid
Changed in linux (Ubuntu):
status: In Progress → Invalid
Changed in ubuntu-power-systems:
status: Incomplete → Invalid

------- Comment From <email address hidden> 2019-01-24 09:26 EDT-------
By suggestion of Michael Ranweiler, I did some concurrent migration tests.
In fact, I just repeated the procedure used before, but did it twice at roughly the same time (in parallel).

The results are attached.
Migration 1: from1.txt to1.txt
Migration 2: from2.txt to2.txt

bugproxy (bugproxy) wrote : from1

------- Comment (attachment only) From <email address hidden> 2019-01-24 09:28 EDT-------

------- Comment From <email address hidden> 2019-01-24 09:34 EDT-------
Judging by the test results, the problem doesn't seem to reproduce.

Are there any other suggestions to reproduce it?

bugproxy (bugproxy) wrote : from2

------- Comment (attachment only) From <email address hidden> 2019-01-24 09:29 EDT-------

bugproxy (bugproxy) wrote : to1

------- Comment (attachment only) From <email address hidden> 2019-01-24 09:30 EDT-------

bugproxy (bugproxy) wrote : to2

------- Comment (attachment only) From <email address hidden> 2019-01-24 09:33 EDT-------

Thanks for your continuous efforts on this Leonardo, I have no further suggestion.
I think to stay on the safe side we will keep everything as-is for now.

I'd say it is now IBM's call to decide between these options:
a) Speed: Call 1781526 unblocked by the evaluation here. We'd then re-consider SRUing that bug, based on your call that this won't cause issues on ppc64el.
b) Safety: Since it was only a minor performance improvement but has potential hidden breakage associated with it, we keep 1781526 in Won't Fix.

Changed in linux (Ubuntu):
assignee: Joseph Salisbury (jsalisbury) → nobody
Changed in linux (Ubuntu Bionic):
assignee: Joseph Salisbury (jsalisbury) → nobody
Changed in linux (Ubuntu Cosmic):
assignee: Joseph Salisbury (jsalisbury) → nobody

------- Comment From <email address hidden> 2019-02-08 14:05 EDT-------
In a meeting with lagarcia, I was informed this patch is very important, and that it is already on kernel 4.18-15 onwards.

In fact, including this one, there are two important patches on this subject:

https://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc.git/commit/?h=kvm-ppc-next&id=c066fafc595eef5ae3c83ae3a8305956b8c3ef15
https://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc.git/commit/?h=kvm-ppc-next&id=6579804c431712d56956a63b1a01509441cc6800

As I said before, for 18.10 onwards (kernel >= 4.18), the patches are available from the upstream kernel source, but for Ubuntu 18.04 they may not be so easily applied.

So I will work on backporting them to v4.15.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2019-02-19 20:52 EDT-------
(In reply to comment #34)
> In a meeting with lagarcia, I was informed this patch is very important, and
> that it is already on kernel 4.18-15 onwards.
>
> In fact, including this one. there are two important patches on this subject:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc.git/commit/?h=kvm-ppc-next&id=c066fafc595eef5ae3c83ae3a8305956b8c3ef15
> https://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc.git/commit/?h=kvm-ppc-next&id=6579804c431712d56956a63b1a01509441cc6800

To get those you will need to cherry-pick the following patches from upstream:

39c983ea0f96 KVM: PPC: Remove unused kvm_unmap_hva callback
c4c8a7643e74 KVM: PPC: Book3S HV: Radix page fault handler optimizations
f7caf712d885 KVM: PPC: Book3S HV: Streamline setting of reference and change bits
58c5c276b4c2 KVM: PPC: Book3S HV: Handle 1GB pages in radix page fault handler
31c8b0d0694a KVM: PPC: Book3S HV: Use __gfn_to_pfn_memslot() in page fault handler
e2560b108fb1 KVM: PPC: Book3S HV: Make radix use correct tlbie sequence in kvmppc_radix_tlbie_page
7e3d9a1d0f2c KVM: PPC: Book3S HV: Make radix clear pte when unmapping
df158189dbcc KVM: PPC: Book 3S HV: Do ptesync in radix guest exit path
21828c99ee91 powerpc/kvm: Switch kvm pmd allocator to custom allocator
99491e2d0e50 powerpc/mm/radix: Remove unused code
0078778a86b1 powerpc/mm/radix: implement LPID based TLB flushes to be used by KVM (note that this one will generate some conflicts)
a5fad1e95952 KVM: PPC: Book3S HV: Use a helper to unmap ptes in the radix fault path
a5704e83aa3d KVM: PPC: Book3S HV: Recursively unmap all page table entries when unmapping
d91cb39ffa7b KVM: PPC: Book3S HV: Make radix use the Linux translation flush functions for partition scope
9a4506e11b97 KVM: PPC: Book3S HV: Make radix handle process scoped LPID flush in C, with relocation on
bc64dd0e1c4e KVM: PPC: Book3S HV: radix: Refine IO region partition scope attributes
878cf2bb2d8d KVM: PPC: Book3S HV: radix: Do not clear partition PTE when RC or write bits do not match
c066fafc595e KVM: PPC: Book3S HV: Use correct pagesize in kvm_unmap_radix()
71d29f43b633 KVM: PPC: Book3S HV: Don't use compound_order to determine host mapping size
6579804c4317 KVM: PPC: Book3S HV: Avoid crash from THP collapse during radix page fault
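A rough sketch of how such a list could be turned into cherry-pick commands, assuming the commits are listed in the order they should be applied. The function name `cherry_pick_cmds` is made up for this example; it only prints the commands, so they can be reviewed before anything touches the tree.

```shell
# Sketch: read "<short-hash> <subject>" lines on stdin and print one
# "git cherry-pick -x" command per commit, in the order given.
cherry_pick_cmds() {
    while read -r hash _subject; do
        if [ -n "$hash" ]; then
            printf 'git cherry-pick -x %s\n' "$hash"
        fi
    done
}
```

With the list saved to a file (here a hypothetical `commits.txt`), `cherry_pick_cmds < commits.txt` prints the sequence to run inside a checkout of the target 4.15 tree. Conflicts, such as the ones noted for 0078778a86b1, would still have to be resolved by hand before continuing.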
