Comment 0 for bug 1792501

bugproxy (bugproxy) wrote :

== Comment: #0 - Satheesh Rajendran <email address hidden> - 2018-09-11 04:10:09 ==
---Problem Description---
Package installation segfaults inside a Debian chroot environment in a P9 KVM guest with HTM enabled.

---Additional Hardware Info---
FW with tm-suspend-mode enabled
#cd /sys/firmware/devicetree/base/ibm,opal/fw-features/
#ls -1 tm-suspend-mode
enabled
name
phandle

qemu-kvm 1:2.11+dfsg-1ubuntu7.4

Machine Type = Power9 DD2.2

---Steps to Reproduce---
 1. Boot a P9 KVM guest running Ubuntu 18.04 (with cap-htm=on, which is the default).
Also tried with an upstream kernel (same results).

Create a tap device on the host:
# tunctl -t tap1 -u `whoami`;brctl addif virbr0 tap1;ifconfig tap1 up
#qemu-system-ppc64 -enable-kvm -M pseries -m 8192 -smp 4 -drive file=/home/sath/ubuntu-18.04-ppc64le.qcow2,format=qcow2,if=none,id=drive-scsi0 -device virtio-scsi-pci,id=drive-scsi0 -device scsi-hd,drive=drive-scsi0 -serial mon:stdio -enable-kvm -vga none -nographic -kernel /home/sath/vmlinux_4.19 -append "root=/dev/sda2 rw console=tty0 console=ttyS0,115200 init=/sbin/init initcall_debug" -netdev tap,id=mynet1,ifname=tap1,script=no,downscript=no -device virtio-net,netdev=mynet1,mac=52:55:00:d1:55:42

Run dhclient inside the guest.

2. # mkdir -p stretch
# debootstrap stretch /stretch http://httpredir.debian.org/debian
# chroot /stretch
/# apt-get update && apt-get install -y make gcc ruby python

...
[ 32.029474] random: crng init done
[ 32.029477] random: 7 urandom warning(s) missed due to ratelimiting
[ 500.300835] dpkg-deb[8704]: segfault (11) at c0000000000037fa nip 7fffac2d098c lr 7fffac2d08c4 code 1 in libc-2.24.so[7fffac170000+190000]
[ 500.300863] dpkg-deb[8704]: code: 48000028 eb090010 2eb80000 4096006c 419e0074 85270004 394a0001 794a0020
[ 500.300881] dpkg-deb[8704]: code: 71280001 408200a0 1d2a0018 7d2b4a14 <a1090006> 2ea80000 40960010 e9090008
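
The segfault line above can be unpacked mechanically. The following is a minimal sketch, not part of the original report: the helper names `parse_segfault` and `decode_d_form` are hypothetical, and the field layout assumed for the marked instruction word is the standard Power ISA D-form (6-bit opcode, two 5-bit register fields, 16-bit signed displacement).

```python
import re

# Log line copied from the report above (timestamp dropped).
LOG = ("dpkg-deb[8704]: segfault (11) at c0000000000037fa nip 7fffac2d098c "
       "lr 7fffac2d08c4 code 1 in libc-2.24.so[7fffac170000+190000]")


def parse_segfault(line):
    """Extract fault address, nip, lr and the mapped object from a segfault line."""
    m = re.search(
        r"at ([0-9a-f]+) nip ([0-9a-f]+) lr ([0-9a-f]+) code \d+ "
        r"in (\S+)\[([0-9a-f]+)\+([0-9a-f]+)\]",
        line,
    )
    if m is None:
        raise ValueError("not a segfault line in the expected format")
    addr, nip, lr, lib, base, size = m.groups()
    return {
        "addr": int(addr, 16),
        "nip": int(nip, 16),
        "lr": int(lr, 16),
        "lib": lib,
        "base": int(base, 16),
        "size": int(size, 16),
    }


def decode_d_form(word):
    """Split a 32-bit D-form instruction word into (opcode, rt, ra, displacement)."""
    opcode = word >> 26
    rt = (word >> 21) & 0x1F
    ra = (word >> 16) & 0x1F
    disp = word & 0xFFFF
    if disp & 0x8000:          # sign-extend the 16-bit displacement
        disp -= 0x10000
    return opcode, rt, ra, disp


info = parse_segfault(LOG)
nip_offset = info["nip"] - info["base"]      # offset of the faulting instruction in libc
faulting = decode_d_form(0xA1090006)         # the word marked <...> in the code dump
base_reg_value = info["addr"] - faulting[3]  # value the base register must have held

print(hex(nip_offset), faulting, hex(base_reg_value))
```

Under these assumptions the faulting instruction sits at offset 0x16098c in libc-2.24.so, and the marked word decodes to primary opcode 40 (lhz) with rt=8, ra=9 and displacement 6, i.e. a halfword load from r9+6 with r9 = 0xc0000000000037f4. The two preceding words, 1d2a0018 and 7d2b4a14, decode by hand to `mulli r9,r10,24` and `add r9,r11,r9`, so r9 appears to be derived from r11. That would be consistent with the r11 corruption identified later in this bug, though the report itself does not state this.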

---uname output---
4.15.0-34,4.19.0-rc3

---Debugger---
A debugger is not configured

Contact Information = <email address hidden>

Userspace tool common name:

KVM Guest: (Ubuntu GLIBC 2.27-3ubuntu1) stable release version 2.27,
Chroot inside KVM Guest: (Debian GLIBC 2.24-11+deb9u3) stable release version 2.24

Userspace rpm:

KVM Guest: (Ubuntu GLIBC 2.27-3ubuntu1) stable release version 2.27,
Chroot inside KVM Guest: (Debian GLIBC 2.24-11+deb9u3) stable release version 2.24

The userspace tool has the following bit modes: both

Userspace tool obtained from project website: na

*Additional Instructions for <email address hidden>:
-Post a private note with access information to the machine that the bug is occurring on.
-Attach ltrace and strace of userspace application.

Latest update, taken from https://github.ibm.com/powercloud/icp-ppc64le/issues/470:

I was able to recreate the segfault using the TM test cases in:

/linux/tools/testing/selftests/powerpc/tm

# ./tm-vmxcopy
test: tm_vmxcopy
tags: git_version:v4.19-rc3-0-g11da3a7f84f1-dirty
!! child died by signal 11
failure: tm_vmxcopy

When run, this particular test dies with signal 11:

[267132.434651] tm-vmxcopy[641]: unhandled signal 11 at 0000000000000001 nip 0000000104ba122c lr 0000000104ba11e4 code 30001
[267253.708795] tm-vmxcopy[7861]: unhandled signal 11 at 0000000000000001 nip 000000012a31122c lr 000000012a3111e4 code 30001
[267385.064533] tm-vmxcopy[13314]: unhandled signal 11 at 0000000000000001 nip 00000001235f122c lr 00000001235f11e4 code 30001
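
The three reports above show different nip and lr values. A quick check, sketched below, suggests they nevertheless point at the same code location in every run. This is only an illustration added for clarity: the helper name `nip_lr` is hypothetical, and the reading that the differing upper bits come from address randomization of the test binary is an assumption, not something the report states.

```python
import re

# Kernel log lines copied from the report above (timestamps dropped).
LINES = [
    "tm-vmxcopy[641]: unhandled signal 11 at 0000000000000001 "
    "nip 0000000104ba122c lr 0000000104ba11e4 code 30001",
    "tm-vmxcopy[7861]: unhandled signal 11 at 0000000000000001 "
    "nip 000000012a31122c lr 000000012a3111e4 code 30001",
    "tm-vmxcopy[13314]: unhandled signal 11 at 0000000000000001 "
    "nip 00000001235f122c lr 00000001235f11e4 code 30001",
]


def nip_lr(line):
    """Return (nip, lr) as integers from an 'unhandled signal' log line."""
    m = re.search(r"nip ([0-9a-f]+) lr ([0-9a-f]+)", line)
    if m is None:
        raise ValueError("no nip/lr pair found")
    return int(m.group(1), 16), int(m.group(2), 16)


pairs = [nip_lr(line) for line in LINES]

# Low 12 bits of nip: the offset within a 4 KiB page, unaffected by a
# page-aligned load address.
page_offsets = {nip & 0xFFF for nip, _ in pairs}

# Distance from the link register to the faulting instruction.
deltas = {nip - lr for nip, lr in pairs}

print([hex(v) for v in page_offsets], [hex(v) for v in deltas])
```

All three runs share the page offset 0x22c and the same nip - lr distance of 0x48, which suggests the failure is deterministic at one instruction rather than a random crash.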

== Comment: #12 - Michael Neuling <email address hidden> - 2018-09-13 00:34:16 ==
Fixes r11 corruption.

== Comment: #14 - Satheesh Rajendran <email address hidden> - 2018-09-13 03:15:46 ==
Tested with the above patch on the KVM host; the reported issue is fixed.

# git log -1
commit 72664e47565f5de0a1fead1d9111c97b9b537713 (HEAD -> fix)
Author: Michael Neuling <email address hidden>
Date: Thu Sep 13 15:33:47 2018 +1000

    KVM: PPC: Book3S HV: Fix guest r11 corruption with POWER9 TM workarounds

    When we come into the softpatch handler (0x1500), we use r11 to store
    the HSRR0 for later use by the denorm handler.

    We also use the softpatch handler for the TM workarounds for
    POWER9. Unfortunately, in kvmppc_interrupt_hv we later store r11 out
    to the vcpu assuming it's still what we got from userspace.

    This causes r11 to be corrupted in the VCPU and hence when we restore
    the guest, we get a corrupted r11. We've seen this when running TM
    tests inside guests on P9.

    This fixes the problem by only touching r11 in the denorm case.

    Fixes: 4bb3c7a020 ("KVM: PPC: Book3S HV: Work around transactional memory bugs in POWER9")
    Cc: <email address hidden> # 4.17+
    Tested-by: Suraj Jitindar Singh <email address hidden>
    Reviewed-by: Paul Mackerras <email address hidden>
    Signed-off-by: Michael Neuling <email address hidden>

Regards,
-Satheesh

http://patchwork.ozlabs.org/patch/969256/
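
The mechanism described in the commit message quoted above can be illustrated with a toy model. This is a deliberately simplified sketch, not the kernel code: the real handler is powerpc assembly, and the function name `softpatch_then_save`, the `fixed` flag and the `HSRR0` placeholder value are all hypothetical. It only models the ordering problem the commit message describes: a scratch use of r11 followed by a later save that assumes r11 still holds the guest value.

```python
HSRR0 = 0xDEAD0000  # hypothetical placeholder for the value stashed in r11


def softpatch_then_save(guest_regs, is_denorm, fixed):
    """Toy model: return the r11 value that ends up saved in the vcpu.

    fixed=False models the behaviour described as buggy: r11 is always
    overwritten on entry to the handler.
    fixed=True models the described fix: r11 is only touched in the
    denorm case.
    """
    live = dict(guest_regs)            # register state at interrupt entry
    if is_denorm or not fixed:
        live["r11"] = HSRR0            # handler uses r11 as scratch
    # Later save path: stores r11 assuming it is still the guest's value.
    return live["r11"]


guest = {"r11": 0x1234}

# TM workaround path (not a denorm): the case that hit this bug.
saved_buggy = softpatch_then_save(guest, is_denorm=False, fixed=False)
saved_fixed = softpatch_then_save(guest, is_denorm=False, fixed=True)

print(hex(saved_buggy), hex(saved_fixed))
```

In the toy model, the unfixed path saves the scratch value in place of the guest's r11, while the fixed path leaves the guest value intact for the TM workaround case.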