Comment 4 for bug 1670315

Michael Hohnbaum (hohnbaum) wrote : Re: [Bug 1670315] [NEW] Ubuntu 17.04: Guest does not reflect all the cpus hotplugged

Leann,

A kernel patch is referenced below as the fix for this issue. Please have the kernel team take a look.

Thanks.

                     Michael

On 03/06/2017 02:51 AM, Launchpad Bug Tracker wrote:
> bugproxy (bugproxy) has assigned this bug to you for Ubuntu:
>
> == Comment: #0 - Satheesh Rajendran <email address hidden> - 2017-02-28 06:00:53 ==
> ---Problem Description---
> The guest does not reflect all the hotplugged CPUs.
> Hotplugging vcpus with setvcpus from a small initial count (1) to a larger count (~256) returns
> no error from setvcpus (libvirt), but the guest does not show all the CPUs.
>
>
> Contact Information = <email address hidden>
>
> ---uname output---
> Linux ltc-test-ci1 4.10.0-9-generic #11-Ubuntu SMP Mon Feb 20 13:45:11 UTC 2017 ppc64le ppc64le ppc64le GNU/Linux
>
> Machine Type = power 8 ppc64le
>
> ---Debugger---
> A debugger is not configured
>
> ---Steps to Reproduce---
> 1. Start the guest(Ubuntu 17.04) with 1 current vcpu and 255 maxvcpus
> ...
> <vcpu placement='static' current='1'>255</vcpu>
> ...
> <cpu>
> <topology sockets='1' cores='255' threads='1'/>
> </cpu>
> ....
>
> # lscpu
> Architecture: ppc64le
> Byte Order: Little Endian
> CPU(s): 1
> On-line CPU(s) list: 0
> Thread(s) per core: 1
> Core(s) per socket: 1
> Socket(s): 1
> NUMA node(s): 1
> Model: 2.1 (pvr 004b 0201)
> Model name: POWER8E (raw), altivec supported
> Hypervisor vendor: KVM
> Virtualization type: para
> L1d cache: 64K
> L1i cache: 32K
> NUMA node0 CPU(s): 0
>
> 2.# time virsh setvcpus virt-tests-vm1 255 --live --config
>
>
> real 0m4.460s
> user 0m0.012s
> sys 0m0.000s
> root@ltc-test-ci1:/var/lib/libvirt/images/workspace/runAvocadoFVTTest/avocado-fvt-wrapper# echo $?
> 0
>
> 3. Check inside the guest after some time (10-15 mins).
> The guest dmesg shows all the RTAS (255) events, but the guest showed only 90 vcpus (it is consistently around 100).
>
> root@ubuntu:~# lscpu
> Architecture: ppc64le
> Byte Order: Little Endian
> CPU(s): 97
> On-line CPU(s) list: 0-96
> Thread(s) per core: 1
> Core(s) per socket: 97
> Socket(s): 1
> NUMA node(s): 1
> Model: 2.1 (pvr 004b 0201)
> Model name: POWER8E (raw), altivec supported
> Hypervisor vendor: KVM
> Virtualization type: para
> L1d cache: 64K
> L1i cache: 32K
> NUMA node0 CPU(s): 0-96
> root@ubuntu:~# tail /proc/cpuinfo
>
> processor : 96
> cpu : POWER8E (raw), altivec supported
> clock : 3425.000000MHz
> revision : 2.1 (pvr 004b 0201)
>
> timebase : 512000000
> platform : pSeries
> model : IBM pSeries (emulated by qemu)
> machine : CHRP IBM pSeries (emulated by qemu)
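
One way to script the check in step 3 is to count the CPUs in the guest's online list and compare the result with the count requested from libvirt. The following is a minimal sketch, not part of the original report: the helper name count_cpus is hypothetical, and it assumes the list has the same format that lscpu prints above (for example "0-96"), which is also what /sys/devices/system/cpu/online contains.

```shell
#!/bin/sh
# Hypothetical helper: count the CPUs named by a Linux CPU list string
# such as "0-96" or "0,2-5".
count_cpus() {
    list=$1
    total=0
    old_ifs=$IFS
    IFS=,
    for range in $list; do
        start=${range%-*}   # text before the dash (whole entry if no dash)
        end=${range#*-}     # text after the dash (whole entry if no dash)
        total=$((total + end - start + 1))
    done
    IFS=$old_ifs
    echo "$total"
}
```

Inside the guest this could be called as `count_cpus "$(cat /sys/devices/system/cpu/online)"` and the result compared with the value passed to virsh setvcpus.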
>
>
> Userspace tool common name: libvirt, qemu
>
> The userspace tool has the following bit modes: both
>
> Userspace rpm: qemu-kvm 1:2.8+dfsg-2ubuntu1 ppc64el, ii libvirt-bin 2.5.0-3ubuntu2 ppc64el
>
> Userspace tool obtained from project website: na
>
> Guest Details:
> #cat /etc/os-release |grep VERSION=
> VERSION="17.04 (Zesty Zapus)"
> # uname -a
> Linux ubuntu 4.10.0-8-generic #10-Ubuntu SMP Mon Feb 13 14:00:06 UTC 2017 ppc64le ppc64le ppc64le GNU/Linux
> root@ubuntu:~# dpkg -l |grep rtas
> ii librtas-dev 2.0.0-2 ppc64el userspace RTAS library development files
> ii librtas2 2.0.0-2 ppc64el userspace RTAS library
> ii librtasevent-dev 2.0.0-2 ppc64el RTAS events library development files
> ii librtasevent2 2.0.0-2 ppc64el RTAS events library
> ii ppc64-diag 2.7.1-6 ppc64el Platform error log analysis tool and rtas_errd daemon
>
>
> *Additional Instructions for <email address hidden>:
> -Post a private note with access information to the machine that the bug is occurring on.
> -Attach ltrace and strace of userspace application.
>
>
> == Comment: #9 - BHARATA BHASKER RAO <email address hidden> - 2017-03-03 04:32:14 ==
> When a large number of hotplug requests are generated too quickly, the guest misses some RTAS events because of a buffer overrun. As a result, the guest does not see all the hotplugged CPUs. This was raised earlier in bz 142499 with the following resolution:
>
> - We need in-kernel CPU hotplug feature in the guest for this to work.
> - Until in-kernel CPU hotplug is available in the guest kernel, user should be careful not to overload the guest with so many successive hotplug requests.
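
The advice about not overloading the guest could be followed by raising the vCPU count in small batches with a pause between requests. The sketch below is only an illustration of that idea: the function names, the batch size, and the pause length are all assumptions and have not been tested against this bug. It uses --live only, whereas the reproduction steps also pass --config.

```shell
#!/bin/sh
# Print the intermediate vCPU totals to request when growing from
# $1 (current) to $2 (target) in steps of at most $3 (batch).
vcpu_steps() {
    current=$1 target=$2 batch=$3
    [ "$batch" -gt 0 ] || return 1      # a zero batch would never finish
    while [ "$current" -lt "$target" ]; do
        current=$((current + batch))
        [ "$current" -gt "$target" ] && current=$target
        echo "$current"
    done
}

# Hypothetical wrapper: request each step with virsh, then pause so the
# guest can process the hotplug events before the next batch arrives.
hotplug_in_batches() {
    domain=$1 current=$2 target=$3 batch=$4 pause=$5
    for count in $(vcpu_steps "$current" "$target" "$batch"); do
        virsh setvcpus "$domain" "$count" --live || return 1
        sleep "$pause"
    done
}
```

For the guest in this report, a call might look like `hotplug_in_batches virt-tests-vm1 1 255 16 5`, where 16 and 5 are arbitrary illustrative values.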
>
> I reproduced the problem with ubuntu-1704 guest (with default kernel)
> and was able to get over the problem with a self-compiled guest kernel
> from latest linux git that has in-kernel CPU hotplug.
>
> bharata@ubuntu-1704:~$ uname -a
> Linux ubuntu-1704 4.10.0+ #1 SMP Fri Mar 3 15:42:46 IST 2017 ppc64le ppc64le ppc64le GNU/Linux
>
> bharata@ubuntu-1704:~$ lscpu
> Architecture: ppc64le
> Byte Order: Little Endian
> CPU(s): 255
> On-line CPU(s) list: 0-254
> Thread(s) per core: 1
> Core(s) per socket: 1
> Socket(s): 255
> NUMA node(s): 1
> Model: 2.0 (pvr 004d 0200)
> Model name: POWER8 (raw), altivec supported
> Hypervisor vendor: KVM
> Virtualization type: para
> L1d cache: 64K
> L1i cache: 32K
> NUMA node0 CPU(s): 0-254
>
>
> == Comment: #10 - BHARATA BHASKER RAO <email address hidden> - 2017-03-06 02:28:42 ==
>
> commit 3dbbaf200f532e01e56168b8339f2981f2cb1d67
> Author: Michael Roth <email address hidden>
> Date: Mon Feb 20 19:12:18 2017 -0600
>
> powerpc/pseries: Advertise Hot Plug Event support to firmware
>
> With the inclusion of commit 333f7b76865b ("powerpc/pseries: Implement
> indexed-count hotplug memory add") and commit 753843471cbb
> ("powerpc/pseries: Implement indexed-count hotplug memory remove"), we
> now have complete handling of the RTAS hotplug event format as described
> by PAPR via ACR "PAPR Changes for Hotplug RTAS Events".
>
> This capability is indicated by byte 6, bit 2 (5 in IBM numbering) of
> architecture option vector 5, and allows for greater control over
> cpu/memory/pci hot plug/unplug operations.
>
> Existing pseries kernels will utilize this capability based on the
> existence of the /event-sources/hot-plug-events DT property, so we
> only need to advertise it via CAS and do not need a corresponding
> FW_FEATURE_* value to test for.
>
> Signed-off-by: Michael Roth <email address hidden>
> Signed-off-by: Michael Ellerman <email address hidden>
>
> $ git tag --contains 3dbbaf200
> v4.11-rc1
> $
>
> Commit 3dbbaf200 is available upstream from kernel v4.11-rc1 onwards and is therefore
> missing from the 17.04 kernel. This commit is needed for memory unplug support.
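
The commit message says kernels key off the /event-sources/hot-plug-events device-tree entry, so a guest-side check for that path may help when triaging. The sketch below assumes the device tree is exposed under /proc/device-tree, and the function name is hypothetical. It only reports whether the path exists. It does not prove that the running kernel advertised or uses the capability.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the device tree exposes the hot-plug
# event source. $1 optionally overrides the device-tree root
# (assumed default: /proc/device-tree).
has_hotplug_event_source() {
    dt_root=${1:-/proc/device-tree}
    [ -e "$dt_root/event-sources/hot-plug-events" ]
}
```

In the guest, `has_hotplug_event_source && echo present` would print "present" only when the path exists.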
>
> ** Affects: ubuntu
> Importance: Undecided
> Assignee: Taco Screen team (taco-screen-team)
> Status: New
>
>
> ** Tags: architecture-ppc64le bugnameltc-152070 severity-critical targetmilestone-inin1704

--
Michael Hohnbaum
OIL Program Manager
Power (ppc64el) Development Project Manager
Canonical, Ltd.