Comment 9 for bug 1677552

bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2017-05-03 09:11 EDT-------
Update: it turns out that Libvirt isn't at fault here. I followed up on Mike's suspicion about the situation:

----
MICHAEL D. ROTH 2017-05-02
One thing to verify would be whether or not unplug is completely done on the target. Even though the guest might show unplug successful, depending on how it sets the DRC states for the device QEMU may or may not have finalized the object, which would result in QEMU never emitting the device-deleted QMP event.
----

And he was right. The patch set wasn't allowing the hot unplug to complete as expected on the target system after the migration. The QMP event was never fired, so Libvirt would simply hang, waiting for a response until the timeout.
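To illustrate the failure mode, here is a minimal Python sketch of what a management layer looks for. It is not Libvirt's actual code. It scans a stream of QMP messages (one JSON object per line) for a DEVICE_DELETED event that names a given device. The function name `saw_device_deleted` and the device id `vcpu1` are hypothetical, chosen only for this example. If QEMU never finalizes the object, the event never shows up and the caller is left waiting until its own timeout.

```python
import json


def saw_device_deleted(qmp_lines, device_id):
    """Return True if a DEVICE_DELETED event for device_id appears in qmp_lines.

    qmp_lines is any iterable of strings, each holding one QMP JSON message.
    Returning False models the case described above: the stream ends (or the
    caller's timeout expires) without the event ever being emitted.
    """
    for raw in qmp_lines:
        raw = raw.strip()
        if not raw:
            continue  # skip blank lines between messages
        msg = json.loads(raw)
        if msg.get("event") != "DEVICE_DELETED":
            continue  # command replies and unrelated events
        if msg.get("data", {}).get("device") == device_id:
            return True
    return False


if __name__ == "__main__":
    # Hypothetical message stream: a command reply followed by the event.
    stream = [
        '{"return": {}}',
        '{"event": "DEVICE_DELETED", "data": {"device": "vcpu1", '
        '"path": "/machine/peripheral/vcpu1"}}',
    ]
    print(saw_device_deleted(stream, "vcpu1"))
```

With the broken patch set, the target QEMU behaved like a stream that only ever contains the command reply, so a check of this kind never succeeds.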

I've fixed the issue on the QEMU side. Here is my latest test with my new patch set and Libvirt upstream:

-- source host:

# ./virsh start dhb_ub1704_nfs
Domain dhb_ub1704_nfs started
# ./virsh setvcpus dhb_ub1704_nfs 2 --live
#
# ./virsh -c 'qemu:///system' migrate --live --domain dhb_ub1704_nfs --desturi qemu+ssh://9.40.193.37/system --timeout 60 --verbose
Migration: [100 %]
# ./virsh -c 'qemu+ssh://9.40.193.37/system' setvcpus dhb_ub1704_nfs 1 --live

#

--- destination host after migration and remote hot unplug:

# ./virsh console dhb_ub1704_nfs
Connected to domain dhb_ub1704_nfs
Escape character is ^]

danielhb@Ub1704NFS:~$ lscpu
Architecture: ppc64le
Byte Order: Little Endian
CPU(s): 1
On-line CPU(s) list: 0
Thread(s) per core: 1
Core(s) per socket: 1
Socket(s): 1
NUMA node(s): 1
Model: 2.0 (pvr 004d 0200)
Model name: POWER8 (architected), altivec supported
Hypervisor vendor: KVM
Virtualization type: para
L1d cache: 64K
L1i cache: 32K
NUMA node0 CPU(s): 0
danielhb@Ub1704NFS:~$ dmesg | tail -n 5
[ 5.361640] audit: type=1400 audit(1493815576.200:9): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/lib/connman/scripts/dhclient-script" pid=651 comm="apparmor_parser"
[ 5.363306] audit: type=1400 audit(1493815576.200:10): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/bin/lxc-start" pid=691 comm="apparmor_parser"
[ 5.880997] cgroup: new mount options do not match the existing superblock, will be ignored
[ 115.040357] pseries-hotplug-cpu: CPU with drc index 10000008 already exists
[ 115.058023] cpu 1 (hwid 8) Ready to die...
danielhb@Ub1704NFS:~$
#
# ./virsh qemu-monitor-command dhb_ub1704_nfs --hmp info cpus ; ./virsh qemu-monitor-command dhb_ub1704_nfs --hmp info hotpluggable-cpus
* CPU #0: nip=0xc00000000009f22c thread_id=183233

Hotpluggable CPUs:
  type: "host-spapr-cpu-core"
  vcpus_count: "1"
  CPUInstance Properties:
    core-id: "3"
  type: "host-spapr-cpu-core"
  vcpus_count: "1"
  CPUInstance Properties:
    core-id: "2"
  type: "host-spapr-cpu-core"
  vcpus_count: "1"
  CPUInstance Properties:
    core-id: "1"
  type: "host-spapr-cpu-core"
  vcpus_count: "1"
  qom_path: "/machine/unattached/device[0]"
  CPUInstance Properties:
    core-id: "0"
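The check on this output can also be done mechanically. The following is a rough Python sketch, assuming the text layout of `info hotpluggable-cpus` shown above. It treats an entry as plugged when it carries a `qom_path` line and returns the core ids of those entries. The helper name `plugged_core_ids` is made up for this example and is not part of QEMU or Libvirt.

```python
def plugged_core_ids(hmp_output):
    """Return the sorted core-ids of entries that have a qom_path line.

    Assumes each entry starts with a 'type:' line, as in the output above.
    Leading whitespace is ignored, so indented and flattened output both work.
    """
    entries = []
    current = None
    for line in hmp_output.splitlines():
        line = line.strip()
        if line.startswith("type:"):
            # A new entry begins; assume unplugged until a qom_path is seen.
            current = {"plugged": False, "core_id": None}
            entries.append(current)
        elif current is None:
            continue  # header or other text before the first entry
        elif line.startswith("qom_path:"):
            current["plugged"] = True
        elif line.startswith("core-id:"):
            value = line.split(":", 1)[1].strip().strip('"')
            current["core_id"] = int(value)
    return sorted(
        e["core_id"]
        for e in entries
        if e["plugged"] and e["core_id"] is not None
    )


if __name__ == "__main__":
    sample = '''Hotpluggable CPUs:
type: "host-spapr-cpu-core"
vcpus_count: "1"
CPUInstance Properties:
core-id: "1"
type: "host-spapr-cpu-core"
vcpus_count: "1"
qom_path: "/machine/unattached/device[0]"
CPUInstance Properties:
core-id: "0"
'''
    print(plugged_core_ids(sample))
```

For the output above, only the core-id "0" entry has a `qom_path`, which matches the single CPU reported by `lscpu` and `info cpus` after the unplug.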

I'll clean up the code and resend the patch set to the mailing list for approval. This alone will fix the issue; no libvirt changes will be needed. Sorry for the false libvirt bug alarm I may have triggered. And thanks, Mike, for pointing out the QMP event issue.

Daniel