CPU hotplug fails in the system with empty numa nodes, "Invalid value '0-1,16-17' for 'cpuset.mems': Invalid argument"

Bug #1709877 reported by bugproxy
Affects                            Status        Importance  Assigned to                             Milestone
Release Notes for Ubuntu           Fix Released  Undecided   Unassigned
The Ubuntu-power-systems project   Fix Released  High        Canonical Server
libvirt (Ubuntu)                   Fix Released  Undecided   Ubuntu on IBM Power Systems Bug Triage
  Xenial                           Won't Fix     Low         Christian Ehrhardt

Bug Description

== Comment: #0 - Satheesh Rajendran <email address hidden> - 2017-07-19 04:13:18 ==
CPU hotplug fails on a host with empty NUMA nodes (nodes with no memory), even though the VM's vCPU placement is static, and regardless of whether numad is running.
..
 <vcpu placement='static' current='4'>32</vcpu>
...

# virsh setvcpus virt-tests-vm1 6 --live
error: Invalid value '0-1,16-17' for 'cpuset.mems': Invalid argument

# numactl --hardware
available: 4 nodes (0-1,16-17)
node 0 cpus: 0 8 16 24 32 40
node 0 size: 16188 MB
node 0 free: 1119 MB
node 1 cpus: 48 56 64 72 80 88
node 1 size: 32630 MB
node 1 free: 13233 MB
node 16 cpus: 96 104 112 120 128 136
node 16 size: 0 MB
node 16 free: 0 MB
node 17 cpus: 144 152 160 168 176 184
node 17 size: 0 MB
node 17 free: 0 MB
node distances:
node 0 1 16 17
  0: 10 20 40 40
  1: 20 10 40 40
 16: 40 40 10 20
 17: 40 40 20 10

# cat /sys/fs/cgroup/cpuset/cpuset.mems
0-1

Host:
#uname -a
Linux powerkvm4-lp1 4.10.0-27-generic #30~16.04.2-Ubuntu SMP Thu Jun 29 16:06:52 UTC 2017 ppc64le ppc64le ppc64le GNU/Linux

ii libvirt-bin 1.3.1-1ubuntu10.11
ii numad 0.5+20150602-4
qemu-kvm 1:2.5+dfsg-5ubuntu10.14
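
The memoryless nodes in the `numactl --hardware` output above can be picked out mechanically. The following is a minimal Python sketch and not part of the original report; the helper name `memoryless_nodes` is hypothetical, and it assumes the `node N size: X MB` line format shown in this report.

```python
import re


def memoryless_nodes(numactl_output):
    """Return the sorted IDs of NUMA nodes whose 'size' line reports 0 MB."""
    empty = []
    for line in numactl_output.splitlines():
        # Only the "node N size: X MB" lines matter; cpus/free/distances are skipped.
        match = re.match(r"\s*node (\d+) size: (\d+) MB", line)
        if match and int(match.group(2)) == 0:
            empty.append(int(match.group(1)))
    return sorted(empty)


# Size lines copied from the host in this report.
SAMPLE = """\
available: 4 nodes (0-1,16-17)
node 0 size: 16188 MB
node 1 size: 32630 MB
node 16 size: 0 MB
node 17 size: 0 MB
"""

print(memoryless_nodes(SAMPLE))
```

On this host the result is nodes 16 and 17, which are exactly the nodes missing from the root `cpuset.mems` value of `0-1`.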

bugproxy (bugproxy)
tags: added: architecture-ppc64le bugnameltc-156806 severity-high targetmilestone-inin16043
Changed in ubuntu:
assignee: nobody → Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage)
affects: ubuntu → libvirt (Ubuntu)
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2017-08-10 09:35 EDT-------
From Nitesh:

-----------------------------

The following commit resolves the issue:

commit 77cb01bc0fec4d0da02e1d4df75d28870b0ef926
Author: Peter Krempa <email address hidden>
Date: Tue Sep 13 15:55:06 2016 +0200

    numa: Rename virNumaGetHostNodeset and make it return only nodes with memory

    Name it virNumaGetHostMemoryNodeset and return only NUMA nodes which
    have memory installed. This is necessary as the kernel is not very happy
    to set the memory cgroup setting for nodes which do not have any memory.
    This would break vcpu hotplug with following message on such
    configuration:

    Invalid value '0,8' for 'cpuset.mems': Invalid argument

    Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1375268
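
The idea behind the commit (pass the kernel only the nodes that actually have memory) can be illustrated with a small Python sketch. This is not libvirt's implementation, which is written in C; the helper names `parse_nodelist`, `format_nodelist`, and `memory_nodeset` are hypothetical, and the inputs are the node-list strings seen in this report.

```python
def parse_nodelist(text):
    """Expand a kernel-style node list such as '0-1,16-17' into a set of ints."""
    nodes = set()
    for part in text.strip().split(","):
        if not part:
            continue
        low, _, high = part.partition("-")
        # A bare number like '8' is a range of one node.
        nodes.update(range(int(low), int(high or low) + 1))
    return nodes


def format_nodelist(nodes):
    """Compress a set of node IDs back into the 'a-b,c' list notation."""
    ranges = []
    for node in sorted(nodes):
        if ranges and node == ranges[-1][1] + 1:
            ranges[-1][1] = node  # extend the current run
        else:
            ranges.append([node, node])  # start a new run
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)


def memory_nodeset(all_nodes, nodes_with_memory):
    """Keep only the nodes that also appear in the with-memory list."""
    return format_nodelist(parse_nodelist(all_nodes) & parse_nodelist(nodes_with_memory))


# Host from this report: four nodes, but only 0-1 have memory.
print(memory_nodeset("0-1,16-17", "0-1"))
```

With the values from the original report, the filtered set matches the root `cpuset.mems` value (`0-1`) rather than the rejected `0-1,16-17`.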

Christian Ehrhardt  (paelzer) wrote :

The change is in libvirt >= v2.3.0, so this has been Fix Released in current Ubuntu for a while.
Let's add a task for the affected older release and consider an SRU from there.

Changed in libvirt (Ubuntu):
status: New → Fix Released
Christian Ehrhardt  (paelzer) wrote :

The first release to have a version >= 2.3 was Zesty, which implies this is fixed in UCA-Ocata and later, and is thereby available in a supported way to LTS users as well if they opt into the UCA.

Christian Ehrhardt  (paelzer) wrote :

There is some noise when applying this to Xenial.
Nothing too big, but I think at least
commit 5555dc0d7fe0267e2ff6e5a9625164f2896f9cc5 (HEAD)
Author: Peter Krempa <email address hidden>
Date: Tue Sep 13 14:28:33 2016 +0200

    util: numa: Remove impossible error handling

Would be needed to apply better, yet OTOH the code back in Xenial might not fulfill this condition.

Not rocket science, but I see some work and regression potential, which means that for the SRU I'd like to have a really good case.
In that sense I wonder how "real" or "artificial" a system with an empty NUMA node is.
Is that a thing that really exists outside of a lab?
If so, great: let's work on the SRU, and please help me add an SRU Template with your arguments to make a convincing case for the SRU Team.

Frank Heimes (fheimes)
Changed in ubuntu-power-systems:
status: New → In Progress
Changed in ubuntu-power-systems:
importance: Undecided → High
status: In Progress → Incomplete
Changed in libvirt (Ubuntu Xenial):
status: New → Incomplete
Changed in ubuntu-power-systems:
assignee: nobody → David Britton (davidpbritton)
Christian Ehrhardt  (paelzer) wrote :

We are still lacking feedback on how real the case is, which is needed to make a compelling statement for the SRU Team.
Please see my comment #4 and reply with the details needed to make an SRU possible.

Until that is provided I am setting this from Incomplete to Invalid (no offense, consider it a timeout on the "Incomplete" to clear the view for currently actionable items). Please set it back to New once the data has been provided.

Changed in libvirt (Ubuntu Xenial):
status: Incomplete → Invalid
Changed in ubuntu-power-systems:
status: Incomplete → Invalid
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2017-09-13 00:51 EDT-------
(In reply to comment #12)
> There is some noise when applying this to Xenial.
> Nothing too big, but I think at least
> commit 5555dc0d7fe0267e2ff6e5a9625164f2896f9cc5 (HEAD)
> Author: Peter Krempa <email address hidden>
> Date: Tue Sep 13 14:28:33 2016 +0200
>
> util: numa: Remove impossible error handling
>
> Would be needed to apply better, yet OTOH the code back in Xenial might not
> fulfill this condition.
>
> Not rocket science but I see some work and regression potential which means
> for the SRU I'd like to have a really good case.
> In that sense I wonder how "real" or "artificial" a system with an empty
> numa node is.
> Is that a thing that really exists outside of a lab?
> If so great - lets work on the SRU and please help me to add a SRU Template
> with your arguments to make a case for it convincing the SRU Team.

A memoryless NUMA node is a real possibility and a valid configuration whenever the system is not fully populated (that is, it has less than the maximum memory possible for that system). Such a configuration causes these functional issues, which this fix resolves.

Moreover, a host NUMA node configuration affecting guest functionality is unacceptable, so it would be good to have this fix applied. Thanks.

Regards,
-Satheesh

Changed in libvirt (Ubuntu Xenial):
status: Invalid → Triaged
importance: Undecided → Low
Changed in ubuntu-power-systems:
status: Invalid → Triaged
Christian Ehrhardt  (paelzer) wrote :

I checked and I don't have such a system, so I'll rely on you testing the code.
I can do the general regression checks but on the case I will need you to confirm it working.

I'll first provide a PPA with the fix, which you should verify against your case.
If that passes your verification and my regression checks, we will move on to the actual SRU.

There you will then need to verify what we have in -proposed.

At any time if there is any way to "construct" such a case artificially please post how to do so.

Changed in libvirt (Ubuntu Xenial):
assignee: nobody → ChristianEhrhardt (paelzer)
Christian Ehrhardt  (paelzer) wrote :

Hi,
There is a test build of a backport available at [1].

It passed some basic checks, but a full regression test will take some more time.

Could you please check whether it fixes the issue you have with the empty NUMA node setup?

If it does, we can go on with the SRU. It would be great if you could add as much detail as possible for the SRU Template [2] to this bug's description; I'll then help to add the rest.

[1]: https://launchpad.net/~ci-train-ppa-service/+archive/ubuntu/2959
[2]: https://wiki.ubuntu.com/StableReleaseUpdates#SRU_Bug_Template

Changed in libvirt (Ubuntu Xenial):
status: Triaged → In Progress
Frank Heimes (fheimes)
Changed in ubuntu-power-systems:
status: Triaged → In Progress
Christian Ehrhardt  (paelzer) wrote :

Updating the status to make it clear that we are waiting on a PPA verification before going on; see comment #8.

Changed in libvirt (Ubuntu Xenial):
status: In Progress → Incomplete
Manoj Iyer (manjo)
Changed in ubuntu-power-systems:
status: In Progress → Fix Released
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2017-12-22 12:47 EDT-------
(In reply to comment #17)
> Hi,
> There is a test build of a backport available at [1].
>
> This did work through some basic checks, but a full regression test will
> take some more time.
>
> Please could you check if that fixes the issue you have with the empty numa
> node setup?
>
> If it does we can go on with the SRU. It would be great if you could provide
> as much Detail for the SRU Template [2] to this bugs description, I'll then
> help to add the rest.
>
> [1]: https://launchpad.net/~ci-train-ppa-service/+archive/ubuntu/2959
> [2]: https://wiki.ubuntu.com/StableReleaseUpdates#SRU_Bug_Template

I am still able to reproduce the issue with this PPA:

# virsh dumpxml virt-tests-vm1|grep vcpu
<vcpu placement='static' current='4'>32</vcpu>

# virsh setvcpus virt-tests-vm1 6 --live
error: Invalid value '0-1,16-17' for 'cpuset.mems': Invalid argument

# numactl -H
available: 4 nodes (0-1,16-17)
node 0 cpus: 0 8 16 24 32
node 0 size: 32590 MB
node 0 free: 24161 MB
node 1 cpus: 40 48 56 64 72
node 1 size: 0 MB
node 1 free: 0 MB
node 16 cpus: 80 88 96 104 112
node 16 size: 32722 MB
node 16 free: 32126 MB
node 17 cpus: 120 128 136 144 152
node 17 size: 32564 MB
node 17 free: 31433 MB
node distances:
node 0 1 16 17
0: 10 20 40 40
1: 20 10 40 40
16: 40 40 10 20
17: 40 40 20 10

#uname -a
Linux c158f2u07os 4.4.0-105-generic #128-Ubuntu SMP Thu Dec 14 12:38:44 UTC 2017 ppc64le ppc64le ppc64le GNU/Linux

ii libvirt-bin 1.3.1-1ubuntu10.15 ppc64el programs for the libvirt library
ii libvirt0:ppc64el 1.3.1-1ubuntu10.15 ppc64el library for interfacing with different virtualization systems
ii numad 0.5+20150602-4 ppc64el User-level daemon that monitors NUMA topology and usage
ii qemu-kvm 1:2.5+dfsg-5ubuntu10.16 ppc64el QEMU Full virtualization

Installing ppa...
# sudo add-apt-repository ppa:ci-train-ppa-service/2959
https://bileto.ubuntu.com/#/ticket/2959

/
More info: https://launchpad.net/~ci-train-ppa-service/+archive/ubuntu/2959
Press [ENTER] to continue or ctrl-c to cancel adding it

gpg: keyring `/tmp/tmpkodr59wf/secring.gpg' created
gpg: keyring `/tmp/tmpkodr59wf/pubring.gpg' created
gpg: requesting key ECF1204C from hkp server keyserver.ubuntu.com
gpg: /tmp/tmpkodr59wf/trustdb.gpg: trustdb created
gpg: key ECF1204C: public key "Launchpad PPA for CI Train PPA Service Team" imported
gpg: Total number processed: 1
gpg: imported: 1 (RSA: 1)
OK

# apt update
Hit:1 http://us.ports.ubuntu.com/ubuntu-ports xenial-proposed InRelease
Hit:2 http://us.ports.ubuntu.com/ubuntu-ports xenial InRelease
Hit:3 http://us.ports.ubuntu.com/ubuntu-ports xenial-updates InRelease
Hit:4 http://us.ports.ubuntu.com/ubuntu-ports xenial-backports InRelease
Hit:5 http://ports.ubuntu.com/ubuntu-ports xenial-security InRelease
Hit:6 http://ppa.launchpad.net/ci-train-ppa-service/2959/ubuntu xenial InRelease
Reading package lists... Done
Building dependency tree
Re...


Christian Ehrhardt  (paelzer) wrote :

Hi Satheesh,
since this was waiting for verification for so long, the version number used in the PPA was consumed by another (unrelated) update. I think there is no reason to go through a full rebuild just for this, as there is an easier way to test against the version in the PPA as-is.

You will need to force the install of the version from the PPA, which is 1.3.1-1ubuntu10.15~ppa1.
Fortunately I see in your report above that the newer libvirt-bin 1.3.1-1ubuntu10.15 was in use.
So after adding the PPA, instead of just the normal apt update and upgrade, please force the version like this:
$ apt install libvirt-bin=1.3.1-1ubuntu10.15~ppa1 libvirt0=1.3.1-1ubuntu10.15~ppa1

This would allow you to check from the PPA without any collision with this or other updates.
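
Why the version has to be forced comes down to Debian version ordering, where `~` sorts before everything, even the end of the string. The following Python sketch illustrates that rule; it is a simplification (no epoch handling, and the whole string is compared in one pass instead of being split into upstream version and revision), and `compare_versions` is a hypothetical helper, not an apt or dpkg API.

```python
import re


def _char_order(char):
    """Sort key for one non-digit character in Debian version ordering."""
    if char == "~":
        return -1  # tilde sorts before everything, even end of string
    if char == "":
        return 0  # end of string
    if char.isalpha():
        return ord(char)  # letters sort before other characters
    return ord(char) + 256


def compare_versions(a, b):
    """Return -1, 0, or 1 when a sorts before, equal to, or after b."""
    while a or b:
        # Compare the leading non-digit runs character by character.
        text_a = re.match(r"\D*", a).group()
        text_b = re.match(r"\D*", b).group()
        a, b = a[len(text_a):], b[len(text_b):]
        for i in range(max(len(text_a), len(text_b))):
            key_a = _char_order(text_a[i] if i < len(text_a) else "")
            key_b = _char_order(text_b[i] if i < len(text_b) else "")
            if key_a != key_b:
                return -1 if key_a < key_b else 1
        # Then compare the following digit runs numerically.
        num_a = re.match(r"\d*", a).group()
        num_b = re.match(r"\d*", b).group()
        a, b = a[len(num_a):], b[len(num_b):]
        if int(num_a or 0) != int(num_b or 0):
            return -1 if int(num_a or 0) < int(num_b or 0) else 1
    return 0


print(compare_versions("1.3.1-1ubuntu10.15~ppa1", "1.3.1-1ubuntu10.15"))
```

Under this ordering the PPA build sorts below the archive version 1.3.1-1ubuntu10.15, so a plain upgrade would not select it, and installing it explicitly counts as a downgrade.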

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-01-02 05:07 EDT-------
(In reply to comment #22)
> Hi Satheera,
> since this was waiting for verification so long the version number used in
> the ppa was consumed by another (unrelated) update. I think there is no
> reason to drive all of the rebuild just for this as there is an easier way
> to do the testing against the version in the ppa as-is.
>
> You will need to force the install of the version from the ppa being
> 1.3.1-1ubuntu10.15~ppa1
> Fortunately I see in your report above that the newer libvirt-bin
> 1.3.1-1ubuntu10.15 was in use.
> So if after adding the ppa you could not just normally apt update&upgrade
> but then force the version like:
> $ apt install libvirt-bin=1.3.1-1ubuntu10.15~ppa1
> libvirt0=1.3.1-1ubuntu10.15~ppa1
>
> This would allow you to check from the ppa without any collision with this
> or other updates.
This helped to get the right package installed. I no longer see the reported issue, but I now hit a different one: hotplug is reported as not supported.

# virsh setvcpus virt-tests-vm1 6 --live
error: internal error: unable to execute QEMU command 'cpu-add': Not supported---------???

# dpkg -l|grep libvirt
ii libvirt-bin 1.3.1-1ubuntu10.15~ppa1 ppc64el programs for the libvirt library
ii libvirt0:ppc64el 1.3.1-1ubuntu10.15~ppa1 ppc64el library for interfacing with different virtualization systems

package install logs:

# apt install libvirt-bin=1.3.1-1ubuntu10.15~ppa1 libvirt0=1.3.1-1ubuntu10.15~ppa1
Reading package lists... Done
Building dependency tree
Reading state information... Done
Suggested packages:
radvd
Recommended packages:
dmidecode
The following packages will be DOWNGRADED:
libvirt-bin libvirt0
0 upgraded, 0 newly installed, 2 downgraded, 0 to remove and 0 not upgraded.
Need to get 2,916 kB of archives.
After this operation, 0 B of additional disk space will be used.
Do you want to continue? [Y/n] y
Get:1 http://ppa.launchpad.net/ci-train-ppa-service/2959/ubuntu xenial/main ppc64el libvirt-bin ppc64el 1.3.1-1ubuntu10.15~ppa1 [1,945 kB]
Get:2 http://ppa.launchpad.net/ci-train-ppa-service/2959/ubuntu xenial/main ppc64el libvirt0 ppc64el 1.3.1-1ubuntu10.15~ppa1 [971 kB]
Fetched 2,916 kB in 4s (695 kB/s)
dpkg: warning: downgrading libvirt-bin from 1.3.1-1ubuntu10.15 to 1.3.1-1ubuntu10.15~ppa1
(Reading database ... 93134 files and directories currently installed.)
Preparing to unpack .../libvirt-bin_1.3.1-1ubuntu10.15~ppa1_ppc64el.deb ...
Unpacking libvirt-bin (1.3.1-1ubuntu10.15~ppa1) over (1.3.1-1ubuntu10.15) ...
dpkg: warning: downgrading libvirt0:ppc64el from 1.3.1-1ubuntu10.15 to 1.3.1-1ubuntu10.15~ppa1
Preparing to unpack .../libvirt0_1.3.1-1ubuntu10.15~ppa1_ppc64el.deb ...
Unpacking libvirt0:ppc64el (1.3.1-1ubuntu10.15~ppa1) over (1.3.1-1ubuntu10.15) ...
Processing triggers for ureadahead (0.100.0-19) ...
ureadahead will be reprofiled on next reboot
Processing triggers for systemd (229-4ubuntu21) ...
Processing triggers for man-db (2.7.5-1) ...
Processing triggers for libc-bin (2.23-0ubuntu9) ...
Setting up libvirt0:ppc64el (1.3.1-1ubuntu10.15~ppa1) ...
Setting u...


Christian Ehrhardt  (paelzer) wrote :

These are the fixes you suggested on top of libvirt.
If there is something else missing, I don't know of it yet.

Lacking a test system with an empty NUMA node, I can't cross-check this easily.
One might construct such a system with KVM, I think.
If you happen to have a guest XML for libvirt that makes this testable, that would be useful to allow me to test it on my own.

If you don't have one I might have to try creating one from scratch.
Actually (if not KVM), how do you create such a system with an empty NUMA node?
Is it in phyp or via some odd HW combination (like really only mem in a few slots)?

Frank Heimes (fheimes)
tags: added: triage-g
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-08-31 15:29 EDT-------
Last update was 7 months ago, perhaps because it probably should have been moved to NEEDINFO. Satheesh?

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2019-01-04 06:51 EDT-------
I finally managed to get hardware with the configuration for which I raised this issue, and I can still reproduce the failure with the latest software.

Host env:
#uname -a
Linux abc 4.4.0-141-generic #167-Ubuntu SMP Wed Dec 5 10:33:00 UTC 2018 ppc64le ppc64le ppc64le GNU/Linux

# dpkg -l|grep libvirt-bin
ii libvirt-bin 1.3.1-1ubuntu10.24 ppc64el programs for the libvirt library

# dpkg -l|grep qemu
ii ipxe-qemu 1.0.0+git-20150424.a25a16d-1ubuntu1.2 all PXE boot firmware - ROM images for qemu
ii qemu-block-extra:ppc64el 1:2.5+dfsg-5ubuntu10.33 ppc64el extra block backend modules for qemu-system and qemu-utils
ii qemu-kvm 1:2.5+dfsg-5ubuntu10.33 ppc64el QEMU Full virtualization
ii qemu-slof 20151103+dfsg-1ubuntu1.1 all Slimline Open Firmware -- QEMU PowerPC version
ii qemu-system-common 1:2.5+dfsg-5ubuntu10.33 ppc64el QEMU full system emulation binaries (common files)
ii qemu-system-ppc 1:2.5+dfsg-5ubuntu10.33 ppc64el QEMU full system emulation binaries (ppc)

# numactl -H
available: 4 nodes (0-1,16-17)
node 0 cpus: 0 8 16 24 32 40
node 0 size: 65318 MB
node 0 free: 48506 MB
node 1 cpus: 48 56 64 72 80 88
node 1 size: 65456 MB
node 1 free: 64685 MB
node 16 cpus: 96 104 112 120 128 136
node 16 size: 65169 MB
node 16 free: 63412 MB
node 17 cpus: 144 152 160 168 176 184
node 17 size: 0 MB
node 17 free: 0 MB
node distances:
node 0 1 16 17
0: 10 20 40 40
1: 20 10 40 40
16: 40 40 10 20
17: 40 40 20 10

#cat /sys/devices/system/node/has_normal_memory
0-1,16

I still see the issue:
# virsh setvcpus virt-tests-vm1 6 --live
error: Invalid value '0-1,16-17' for 'cpuset.mems': Invalid argument

I tried to install the package from the suggested PPA, but I guess it has been deleted as outdated:

# add-apt-repository ppa:ci-train-ppa-service/2959

Cannot add PPA: 'ppa:~ci-train-ppa-service/ubuntu/2959'.
The team named '~ci-train-ppa-service' has no PPA named 'ubuntu/2959'
Please choose from the following available PPAs:
* '1669': 1669 - 2017-03-14
* '1924': 1924 - 2017-01-19
* '1961': 1961 - 2017-04-03
* '1982': 1982 - 2017-04-03
* '1996': 1996 - 2017-04-03

Please advise on the way forward.
Thanks.

Regards,
-Satheesh

Changed in ubuntu-power-systems:
status: Fix Released → Confirmed
Changed in libvirt (Ubuntu Xenial):
status: Incomplete → Confirmed
Christian Ehrhardt  (paelzer) wrote :

Well, if it is a real issue and not just something that exists in the lab on a very special machine (it seems hard for you to get one as well), then I can re-port the fix to the current version of Xenial's libvirt and you can retest with that.

Anything later is already fixed, but I'd ask you to confirm that to be sure.
So next steps:
1. please test the same with e.g. Ubuntu 18.04 - is the bug fixed in there?
2. let me know if you still consider this important, then I'll prep a new build for Xenial to test
   (If it is not important we can mark it won't fix)

Changed in ubuntu-power-systems:
assignee: David Britton (davidpbritton) → Canonical Server Team (canonical-server)
status: Confirmed → Incomplete
Andrew Cloke (andrew-cloke) wrote :

Moving to "Incomplete" while awaiting information about whether this issue is restricted to a single lab machine or is impacting production workloads.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2019-01-08 01:08 EDT-------
(In reply to comment #28)
> Well, if it is a real issue and not just existing in the Lab with a very
> special machine (it seems hard to get one for you as well) then I can
> re-port the fix to the current version of Xenials Libvirt and you can retest
> with that.
>
> Anything later is already fixed, but I'd ask you to confirm that to be sure.
> So next steps:
> 1. please test the same with e.g. Ubuntu 18.04 - is the bug fixed in there?
I can try that out, need some more time.
> 2. let me know if you still consider this important, then I'll prep a new
> build for Xenial to test
> (If it is not important we can mark it won't fix)
In general, libvirt/cgroup should handle different system configurations. This happens on one of my test systems; I am not sure whether we have similar production configurations with memoryless NUMA nodes.

I am fine with closing this as Won't Fix, but then we need to document for users that they should not expect vCPU hotplug to work on hosts with memoryless NUMA nodes. I doubt that is what we want to do, though.

Regards,
-Satheesh

Christian Ehrhardt  (paelzer) wrote :

A proper documentation of a limitation is just as much effort as a fix since it is known in this case. And I like to fix things, I just wanted to make sure that the case is real.

You will have to check the PPA and later in the processing of the bug you'll have to do the SRU verification as you are the only one with such a system to trigger and verify the issue.
Please could you explicitly state that you are also willing and able to do the SRU verification later on? Otherwise the SRU hangs around forever and blocks other things.

I have rebased (wow 10 versions since our last try) the fix that I had and created a new PPA for you at [1].
Please could you verify that PPA, if the fix is ok we can go on to start the SRU process on this.

[1]: https://launchpad.net/~paelzer/+archive/ubuntu/bug-1709877-libvirt-empty-numa

Changed in libvirt (Ubuntu Xenial):
status: Confirmed → Incomplete
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2019-01-09 04:58 EDT-------
(In reply to comment #31)
> A proper documentation of a limitation is just as much effort as a fix since
> it is known in this case. And I like to fix things, I just wanted to make
> sure that the case is real.
>
> You will have to check the PPA and later in the processing of the bug you'll
> have to do the SRU verification as you are the only one with such a system
> to trigger and verify the issue.
> Please could you explicitly state that you are also willing and able to do
> the SRU verification later on? Otherwise the SRU hangs around forever and
> blocks other things.
>
> I have rebased (wow 10 versions since our last try) the fix that I had and
> created a new PPA for you at [1].
> Please could you verify that PPA, if the fix is ok we can go on to start the
> SRU process on this.
>
> [1]:
> https://launchpad.net/~paelzer/+archive/ubuntu/bug-1709877-libvirt-empty-numa

With this PPA I now hit the error below during vCPU hotplug. It looks the same as the error I got when I last tested with a PPA (IBM comment #23; Launchpad comment link: https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1709877/comments/12).

# virsh setvcpus virt-tests-vm1 32 --live
error: internal error: unable to execute QEMU command 'cpu-add': Not supported

# dpkg -l|grep libvirt-bin
ii libvirt-bin 1.3.1-1ubuntu10.25~ppa1 ppc64el programs for the libvirt library

this is how I installed the package from ppa.

# add-apt-repository ppa:paelzer/bug-1709877-libvirt-empty-numa
More info: https://launchpad.net/~paelzer/+archive/ubuntu/bug-1709877-libvirt-empty-numa
Press [ENTER] to continue or ctrl-c to cancel adding it

gpg: keyring `/tmp/tmpi1qel9vg/secring.gpg' created
gpg: keyring `/tmp/tmpi1qel9vg/pubring.gpg' created
gpg: requesting key B6832E30 from hkp server keyserver.ubuntu.com
gpg: /tmp/tmpi1qel9vg/trustdb.gpg: trustdb created
gpg: key B6832E30: public key "Launchpad PPA for ChristianEhrhardt" imported
gpg: Total number processed: 1
gpg: imported: 1 (RSA: 1)
OK

# apt-get update
Hit:1 http://us.ports.ubuntu.com/ubuntu-ports xenial InRelease
Get:2 http://us.ports.ubuntu.com/ubuntu-ports xenial-updates InRelease [109 kB]
Get:3 http://us.ports.ubuntu.com/ubuntu-ports xenial-backports InRelease [107 kB]
Get:4 http://ports.ubuntu.com/ubuntu-ports xenial-security InRelease [107 kB]
Get:5 http://us.ports.ubuntu.com/ubuntu-ports xenial-updates/main ppc64el Packages [688 kB]
Get:6 http://ports.ubuntu.com/ubuntu-ports xenial-proposed InRelease [260 kB]
Get:7 http://ppa.launchpad.net/paelzer/bug-1709877-libvirt-empty-numa/ubuntu xenial InRelease [18.1 kB]
Get:8 http://us.ports.ubuntu.com/ubuntu-ports xenial-updates/universe ppc64el Packages [600 kB]
Get:9 http://ppa.launchpad.net/paelzer/bug-1709877-libvirt-empty-numa/ubuntu xenial/main ppc64el Packages [1,808 B]
Get:10 http://ports.ubuntu.com/ubuntu-ports xenial-security/main ppc64el Packages [417 kB]
Get:11 http://ports.ubuntu.com/ubuntu-ports xenial-security/main Translation-en [248 kB]
Get:12 http://ports.ubuntu.com/ubuntu-ports xenial-security/universe ppc64el Packages [321 kB]
Get:13 http://ppa.l...

Changed in ubuntu-power-systems:
status: Incomplete → Confirmed
Christian Ehrhardt  (paelzer) wrote :

This is just the libvirt as in Xenial plus the requested patches.
I don't see anything obvious to be wrong except if the requested changes were incomplete.

Could it be that the use case you are testing is affected by two things:
1. the one we are trying to fix here
2. a second one that is exposed once the first is fixed?

In that case one would have to identify the further changes that would be needed.
If you or one of your dev's knows about extra changes needed for #2 let me know and we can give it a try.

But given that this seems to be more of a lab issue than a field issue (there have been no other examples so far), this is getting more and more complex, and I wonder what for.

Changed in ubuntu-power-systems:
status: Confirmed → Incomplete
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2019-01-10 00:15 EDT-------
(In reply to comment #33)
> This is just the libvirt as in Xenial plus the requested patches.
> I don't see anything obvious to be wrong except if the requested changes
> were incomplete.
>
> Could it be that the use case you are testing is affected by two things:
> 1. the one we are trying to fix here
> 2. a second one that is exposed once the first is fixed?
>
It looks like we are hitting the second one. According to this related comment for 17.04,
https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1667805/comments/9, the CPU hotplug feature for pseries was not complete and had to use a different approach (device_add instead of cpu-add) in QEMU 2.5, which is the version in Xenial. With the initial issue reported here fixed, we now hit the actual failure, though the error message is misleading.

From my previous bug reports, I have learnt that vCPU hotplug for pseries works from Zesty (17.04) onwards; see
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1670315. I could not find any working reference for vCPU hotplug on Xenial for pseries, nor a supported/not-supported statement on the Ubuntu wiki.

If there are no plans to support vCPU hotplug for pseries in Xenial, my suggestion would be to improve the error
message shown on a hotplug attempt so that it says hotplug is not supported with this version of qemu/libvirt. Alternatively, we can bring all the changes required for vCPU hotplug into Xenial as well.

Please advise.

Thanks a lot for the continued support on this.

Regards,
-Satheesh

> In that case one would have to identify the further changes that would be
> needed.
> If you or one of your dev's knows about extra changes needed for #2 let me
> know and we can give it a try.
>
> But given that this seems more Lab than Field issue (there were no other
> examples so far) this gets more and more complex but I wonder for what.

Christian Ehrhardt  (paelzer) wrote :

I agree to your summary.
I was considering this being "open" so long without being a major issue for anyone combined with the required work getting bigger to implement/fix it for real. So of the two choices:

a) having the error statement while trying hotplug be improved saying hotplug is not supported with that version of qemu/libvirt
b) bring in all the necessary changes required for vcpu hotplug in xenial aswell.

b) imho seems like a rabbithole of changes and even potential regression for more important cases. Especially that there is an LTS released where things are fine (18.04)
a) seems to be less invasive and the right situation given the situation.

Therefore after that being "selected", would you mind prepping a change that you'd want us to add for that improved message when people are trying to use Hotplug on Xenial?

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2019-01-21 04:57 EDT-------
(In reply to comment #35)
> I agree to your summary.
> I was considering this being "open" so long without being a major issue for
> anyone combined with the required work getting bigger to implement/fix it
> for real. So of the two choices:
>
> a) having the error statement while trying hotplug be improved saying
> hotplug is not supported with that version of qemu/libvirt
> b) bring in all the necessary changes required for vcpu hotplug in xenial
> aswell.
>
> b) imho seems like a rabbithole of changes and even potential regression for
> more important cases. Especially that there is an LTS released where things
> are fine (18.04)
> a) seems to be less invasive and the right situation given the situation.
>
> Therefore after that being "selected", would you mind prepping a change that
> you'd want us to add for that improved message when people are trying to use
> Hotplug on Xenial?

It can be the message below:
"Currently vcpu hotplug not supported on Xenial, please try upgrade to Bionic to avail the support."

The second part of the message is optional.

Regards,
-Satheesh.

Frank Heimes (fheimes) wrote :

Let me recommend the following to get this issue addressed:
Since there are always features that are unsupported in some situation, version, or platform, the usual way to address this is to add a statement/disclaimer to the release notes to get it documented, if necessary.
The message that is given is IMHO not too bad and pretty common:
"error: internal error: unable to execute QEMU command 'cpu-add': Not supported"
And with the availability of 18.04.1, the recommendation to upgrade from 16.04 to 18.04 is in place, too.
Creating, testing, and rolling out a new package just for a modified message is not appropriate, and SRU-ing it is probably not possible at all, since SRUs are meant to address high-impact bugs (https://wiki.ubuntu.com/StableReleaseUpdates).
So if there are no further hard requirements for a patched package (that would justify an SRU), I strongly recommend adding an appropriate message to the 16.04 release notes and getting this ticket closed.

Christian Ehrhardt  (paelzer) wrote :

Thanks for the discussion and suggestion Frank.
I agree that the 16.04 release notes are much more appropriate and easier to change in that regard.

@Satheesh - would that work and be ok for you?

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2019-01-23 01:49 EDT-------
(In reply to comment #38)
> Thanks for the discussion and suggestion Frank.
> I agree that the 16.04 release notes are much more appropriate and easier to
> change in that regard.
>
> @Satheera - would that work and be ok for you?

Am fine with that, makes sense.

Regards,
-Satheesh

Christian Ehrhardt  (paelzer) wrote :
Changed in libvirt (Ubuntu Xenial):
status: Incomplete → Won't Fix
Changed in ubuntu-release-notes:
status: New → Fix Released
Changed in ubuntu-power-systems:
status: Incomplete → Fix Released
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2019-02-15 01:12 EDT-------
Closing as per the previous comment.

Regards,
-Satheesh
