------- Comment From <email address hidden> 2017-04-27 16:58 EDT-------
I've finally had the opportunity to test, using libvirt, the patch set I sent to the qemu mailing list ("[PATCH 0/4 v7] migration/ppc: migrating DRC, ccs_list and pending_events?") that fixes this issue. Until now I had only tested with QEMU alone. virsh still reports the same error, but the hot unplug succeeds in the VM after the migration. Apparently my QEMU patch set alone is not enough to fix this virsh behavior.
In my test I've used 2 Ubuntu 17.04 P8 hosts. I had to compile libvirt from scratch to make it work with the compiled upstream QEMU + my patch set:
- source host:
root@source:/home/danielhb/usr/bin#
root@source:/home/danielhb/usr/bin# ./virsh start dhb_ub1704_nfs
Domain dhb_ub1704_nfs started
root@source:/home/danielhb/usr/bin# ./virsh console dhb_ub1704_nfs
Connected to domain dhb_ub1704_nfs
Escape character is ^]
Password:
Last login: Thu Apr 27 13:48:41 CDT 2017 on hvc0
danielhb@Ub1704NFS:~$ lscpu
Architecture: ppc64le
Byte Order: Little Endian
CPU(s): 1
On-line CPU(s) list: 0
Thread(s) per core: 1
Core(s) per socket: 1
Socket(s): 1
NUMA node(s): 1
Model: 2.0 (pvr 004d 0200)
Model name: POWER8 (architected), altivec supported
Hypervisor vendor: KVM
Virtualization type: para
L1d cache: 64K
L1i cache: 32K
NUMA node0 CPU(s): 0
danielhb@Ub1704NFS:~$
root@source:/home/danielhb/usr/bin# ./virsh setvcpus dhb_ub1704_nfs 2 --live
root@source:/home/danielhb/usr/bin# ./virsh -c 'qemu:///system' migrate --live --domain dhb_ub1704_nfs --desturi qemu+ssh://<target_ip>/system --timeout 60 --verbose
Migration: [100 %]
root@source:/home/danielhb/usr/bin#
- In the destination host:
root@target:/home/danielhb/usr/bin# ./virsh console dhb_ub1704_nfs
Connected to domain dhb_ub1704_nfs
Escape character is ^]
Ub1704NFS login: danielhb
Password:
danielhb@Ub1704NFS:~$
danielhb@Ub1704NFS:~$ lscpu
Architecture: ppc64le
Byte Order: Little Endian
CPU(s): 2
On-line CPU(s) list: 0,1
Thread(s) per core: 1
Core(s) per socket: 1
Socket(s): 2
NUMA node(s): 1
Model: 2.0 (pvr 004d 0200)
Model name: POWER8 (architected), altivec supported
Hypervisor vendor: KVM
Virtualization type: para
L1d cache: 64K
L1i cache: 32K
NUMA node0 CPU(s): 0,1
Migration was successful and the VM is reporting 2 CPUs, one of which was hotplugged before the migration.
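For anyone repeating the source-side steps, they can be wrapped in a small helper. This is only a sketch: the function name `hotplug_and_migrate` and the `VIRSH` variable are hypothetical, the virsh commands and flags are the ones shown in the session above, and the domain name and destination URI are passed in as arguments.

```shell
# Path to the virsh binary; defaults to the locally built one used above.
VIRSH="${VIRSH:-./virsh}"

# Start the domain, hotplug a second vCPU, then live-migrate it.
# $1 = domain name, $2 = destination URI (e.g. qemu+ssh://<target_ip>/system)
hotplug_and_migrate() {
    "$VIRSH" start "$1" &&
    "$VIRSH" setvcpus "$1" 2 --live &&
    "$VIRSH" -c 'qemu:///system' migrate --live --domain "$1" \
        --desturi "$2" --timeout 60 --verbose
}
```

It would be called as `hotplug_and_migrate dhb_ub1704_nfs 'qemu+ssh://<target_ip>/system'`. Each step only runs if the previous one succeeded.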
- Hot unplugged one CPU using the source host:
root@source:/home/danielhb/usr/bin# ./virsh -c 'qemu+ssh://<target_ip>/system' setvcpus dhb_ub1704_nfs 1 --live
error: operation failed: vcpu unplug request timed out
Same error as reported in the bug.
- Back on the target host, here is the message that appears in the libvirtd log:
/home/danielhb/usr/sbin# 2017-04-27 19:58:31.779+0000: 52854: error : qemuDomainHotplugDelVcpu:5403 : operation failed: vcpu unplug request timed out
- However, the VM reports that the hot unplug was successful:
root@target:/home/danielhb/usr/bin# ./virsh console dhb_ub1704_nfs
Connected to domain dhb_ub1704_nfs
Escape character is ^]
danielhb@Ub1704NFS:~$
danielhb@Ub1704NFS:~$ dmesg | tail -n 5
[ 5.113376] audit: type=1400 audit(1493322876.956:9): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/lib/connman/scripts/dhclient-script" pid=613 comm="apparmor_parser"
[ 5.122892] audit: type=1400 audit(1493322876.964:10): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/bin/lxc-start" pid=629 comm="apparmor_parser"
[ 5.453058] cgroup: new mount options do not match the existing superblock, will be ignored
[ 235.125486] pseries-hotplug-cpu: CPU with drc index 10000008 already exists
[ 235.144708] cpu 1 (hwid 8) Ready to die...
danielhb@Ub1704NFS:~$ lscpu
Architecture: ppc64le
Byte Order: Little Endian
CPU(s): 1
On-line CPU(s) list: 0
Thread(s) per core: 1
Core(s) per socket: 1
Socket(s): 1
NUMA node(s): 1
Model: 2.0 (pvr 004d 0200)
Model name: POWER8 (architected), altivec supported
Hypervisor vendor: KVM
Virtualization type: para
L1d cache: 64K
L1i cache: 32K
NUMA node0 CPU(s): 0
danielhb@Ub1704NFS:~$
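The guest-side check above comes down to reading the "CPU(s):" field of lscpu. A minimal sketch of a helper that extracts just that number, so the value before and after the unplug can be compared in a script (the function name `cpu_count_from_lscpu` is hypothetical, and it assumes the lscpu layout shown in this comment):

```shell
# Print the value of the "CPU(s):" field from lscpu output read on stdin.
# Lines such as "On-line CPU(s) list:" or "NUMA node0 CPU(s):" are ignored
# because the field name must match "CPU(s)" exactly.
cpu_count_from_lscpu() {
    awk -F: '$1 == "CPU(s)" { gsub(/[ \t]/, "", $2); print $2; exit }'
}
```

Inside the guest it would be used as `lscpu | cpu_count_from_lscpu`.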
As a reference, here's the VM console after the whole process in the scenario where QEMU isn't patched with my patch set:
danielhb@Ub1704NFS:~$ dmesg | tail -n 5
[ 5.212861] audit: type=1400 audit(1493318877.052:10): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/bin/lxc-start" pid=569 comm="apparmor_parser"
[ 5.476860] cgroup: new mount options do not match the existing superblock, will be ignored
[ 250.352364] pseries-hotplug-cpu: CPU with drc index 10000008 already exists
[ 250.370854] cpu 1 (hwid 8) Ready to die...
[ 250.391898] pseries-hotplug-cpu: Failed to release drc (10000008) for CPU <NULL>, rc: -1
danielhb@Ub1704NFS:~$ lscpu
Architecture: ppc64le
Byte Order: Little Endian
CPU(s): 2
On-line CPU(s) list: 0,1
Thread(s) per core: 1
Core(s) per socket: 1
Socket(s): 2
NUMA node(s): 1
Model: 2.0 (pvr 004d 0200)
Model name: POWER8 (architected), altivec supported
Hypervisor vendor: KVM
Virtualization type: para
L1d cache: 64K
L1i cache: 32K
NUMA node0 CPU(s): 0,1
danielhb@Ub1704NFS:~$
The libvirtd error message is the same with or without the patch:
2017-04-27 18:52:07.904+0000: 45123: error : qemuDomainHotplugDelVcpu:5403 : operation failed: vcpu unplug request timed out
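Since the libvirtd message is identical in both cases, counting its occurrences in the log is only useful to confirm that the timeout was hit, not to tell the patched and unpatched runs apart. A minimal sketch of such a check (the function name `count_unplug_timeouts` is hypothetical, and the location of the libvirtd log depends on the local build, so it is read from stdin here):

```shell
# Count libvirtd log lines (read from stdin) that report the vcpu unplug timeout.
# grep -c exits non-zero when the count is 0, so "|| true" keeps the helper
# usable under "set -e" while still printing 0.
count_unplug_timeouts() {
    grep -c 'operation failed: vcpu unplug request timed out' || true
}
```

For example: `count_unplug_timeouts < /path/to/libvirtd.log`, where the path is whatever the local libvirtd is configured to log to.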