virt-manager fails to start on Huawei ARM server

Bug #1811198 reported by Bin Li on 2019-01-10
This bug affects 1 person
Affects             Importance   Assigned to
libvirt (Ubuntu)    Undecided    Unassigned
Bionic              Undecided    Unassigned
Cosmic              Undecided    Unassigned

Bug Description

[Impact]

 * Some ARM systems can report unexpectedly high socket_id/core_id values,
   which exceed libvirt's arbitrary limit of 4096 (valid IDs 0-4095).

 * The upstream fix drops this limit and relies on libvirt's ability to
   size the bitmaps automatically as needed.
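
To illustrate the two points above, here is a minimal Python sketch (not libvirt's actual C code) that contrasts a bitmap with a hard upper bound against one that grows on demand. The class names `FixedBitmap` and `GrowableBitmap` are hypothetical, and the error text only imitates the message reported in this bug.

```python
class FixedBitmap:
    """Bitmap with a hard upper bound, imitating the old 4096-entry limit."""

    def __init__(self, size):
        self.size = size
        self.bits = 0

    def set_bit(self, pos):
        # Any ID at or above the fixed size is rejected outright.
        if pos >= self.size:
            raise ValueError(
                f"Socket {pos} can't be handled (max socket is {self.size - 1})"
            )
        self.bits |= 1 << pos


class GrowableBitmap:
    """Bitmap that extends its backing storage whenever a higher bit is set."""

    def __init__(self):
        self.data = bytearray()

    def set_bit(self, pos):
        byte_index, bit = divmod(pos, 8)
        # Grow the storage with zero bytes until the target byte exists.
        if byte_index >= len(self.data):
            self.data.extend(bytes(byte_index + 1 - len(self.data)))
        self.data[byte_index] |= 1 << bit

    def is_set(self, pos):
        byte_index, bit = divmod(pos, 8)
        return byte_index < len(self.data) and bool(self.data[byte_index] & (1 << bit))

    def count(self):
        # Number of distinct IDs recorded so far.
        return sum(bin(byte).count("1") for byte in self.data)
```

With the fixed variant, `FixedBitmap(4096).set_bit(5418)` raises an error, while `GrowableBitmap().set_bit(5418)` simply enlarges the storage. Hosts with small IDs behave the same either way, which is why the change should be a no-op for them.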

[Test Case]

 * This is the hard part: special HW is needed both to trigger the bug
   and to verify the fix.

 * We can ask the reporter to verify on his platform, but cannot rely on
   that 100% in case he is unavailable.
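
As a rough aid for finding suitable hardware, the following Python sketch scans the Linux sysfs CPU topology files and reports the highest IDs seen. It assumes the IDs are exposed as `topology/core_id` and `topology/physical_package_id` under `/sys/devices/system/cpu/cpuN/`. The helper names and the `OLD_LIMIT` constant are hypothetical; the value 4096 is taken from the error message in this report.

```python
from pathlib import Path

# Assumed old limit: IDs 0..4095 were accepted, anything higher failed.
OLD_LIMIT = 4096


def max_topology_ids(cpu_root="/sys/devices/system/cpu"):
    """Return the highest core_id and physical_package_id found under cpu_root."""
    result = {"core_id": -1, "physical_package_id": -1}
    # Match cpu0, cpu1, ... but not directories such as cpufreq or cpuidle.
    for topo in Path(cpu_root).glob("cpu[0-9]*/topology"):
        for name in result:
            try:
                value = int((topo / name).read_text().strip())
            except (OSError, ValueError):
                # File missing or unreadable on this platform: skip it.
                continue
            result[name] = max(result[name], value)
    return result


def exceeds_old_limit(ids, limit=OLD_LIMIT):
    """True if any reported ID would have been rejected by the old limit."""
    return any(value >= limit for value in ids.values())
```

Calling `exceeds_old_limit(max_topology_ids())` on a candidate machine gives a quick hint whether it is likely to reproduce the problem. It does not replace running virt-manager against the unfixed and fixed packages.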

[Regression Potential]

 * The change removes a limit that was only hit on very special HW. For
   this special HW it is a fix; for all others it should be no functional
   change.
 * The one regression I could think of is that the dynamic bitmap
   handling might have not-yet-identified issues that would affect other
   cases (the feature is almost three years old now, since libvirt 1.3.3).

[Other Info]

 * n/a

---

We hit an issue when running virt-manager on a Huawei ARM server with Ubuntu 18.04.1.

Error polling connection 'qemu:///system': internal error: Socket 5418 can't be handled (max socket is 4095)

Bin Li (binli) wrote :

https://www.redhat.com/archives/libvir-list/2018-August/msg00798.html

While in most cases the values are going to be much
smaller than our arbitrary 4096 limit, there is really
no guarantee that would be the case: in fact, a few
aarch64 servers have been spotted in the wild with
core_id as high as 6216.

Take advantage of virBitmap's ability to automatically
alter its size at runtime to accomodate such values.

Bin Li (binli) on 2019-01-10
summary: - Remove arbitrary limit on socket_id/core_id
+ virt-manager fail to start on huawei arm server
Bin Li (binli) wrote :

I made a patch for it; after testing, the issue is fixed and
virt-manager runs successfully.

Bin Li (binli) wrote :

Here is the original patch from https://github.com/libvirt/libvirt:

commit ba35ac2ebbc7f94abc50ffbf1d681458e2406444
Author: Andrea Bolognani <email address hidden>
Date: Fri Aug 3 10:15:16 2018 +0200

    utils: Remove arbitrary limit on socket_id/core_id

    While in most cases the values are going to be much
    smaller than our arbitrary 4096 limit, there is really
    no guarantee that would be the case: in fact, a few
    aarch64 servers have been spotted in the wild with
    core_id as high as 6216.

    Take advantage of virBitmap's ability to automatically
    alter its size at runtime to accomodate such values.

    Signed-off-by: Andrea Bolognani <email address hidden>

The attachment "libvirt-utils-remove-arbitrary-limit.patch" seems to be a patch. If it isn't, please remove the "patch" flag from the attachment, remove the "patch" tag, and if you are a member of the ~ubuntu-reviewers, unsubscribe the team.

[This is an automated message performed by a Launchpad user owned by ~brian-murray, for any issues please contact him.]

tags: added: patch

Hi Bin Li,
thank you a lot for your report and for identifying and verifying the fix.

This is upstream as of libvirt 4.7.
OTOH I think the limit has been in place long enough that it is not worth touching Xenial and adding backport noise there.

Therefore I'll define Bionic and Cosmic as affected, but for the SRU [1] I'll need the fix in the latest release first. I'm working on libvirt 4.10 anyway atm, so that is already ongoing.

But I'm on a business trip next week so things will take a while.
Once the newer libvirt completes I'll take a look at also creating an SRU for it.

Even though that will take a short while, since this needs special HW to test I wanted to ask whether you would then be willing and able to do the SRU verification on Bionic and Cosmic, as you have the right HW to do so.

For now there is not much more to do than wait until the new libvirt completes. If anything, you might help by copying the SRU template from [1] and editing it into the bug description here.
This would be nice, but if you don't I can do it when I prepare the SRU.

[1]: https://wiki.ubuntu.com/StableReleaseUpdates

Changed in libvirt (Ubuntu):
status: New → In Progress
Changed in libvirt (Ubuntu Bionic):
status: New → Triaged
Changed in libvirt (Ubuntu Cosmic):
status: New → Triaged
tags: added: libvirt-19.04
Bin Li (binli) wrote :

Hi Christian,

 Fine, that's okay for Bionic and Cosmic. When you are ready, just let me know; I would like to help. Thanks a lot!

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package libvirt - 5.0.0-1ubuntu1

---------------
libvirt (5.0.0-1ubuntu1) disco; urgency=medium

  * Merged with Debian unstable
    Among many other new features and fixes this includes fixes for:
    LP: #1754871 - 1799446 zPCI passthrough support for KVM
    LP: #1811198 - remove arbitrary limit on socket_id/core_id
    Remaining changes:
    - Disable libssh2 support (universe dependency)
    - Disable firewalld support (universe dependency)
    - Set qemu-group to kvm (for compat with older ubuntu)
    - Additional apport package-hook
    - Autostart default bridged network (As upstream does, but not Debian).
      In addition to just enabling it our solution provides:
      + do not autostart if subnet is already taken (e.g. in guests).
      + iterate some alternative subnets before giving up
    - d/p/ubuntu/Allow-libvirt-group-to-access-the-socket.patch: This is
      the group based access to libvirt functions as it was used in Ubuntu
      for quite long.
      + d/p/ubuntu/daemon-augeas-fix-expected.patch fix some related tests
        due to the group access change.
      + d/libvirt-daemon-system.postinst: add users in sudo to the libvirt
        group.
    - ubuntu/parallel-shutdown.patch: set parallel shutdown by default.
    - Update Vcs-Git and Vcs-Browser fields to point to launchpad
    - Xen related
      - d/p/ubuntu/ubuntu-libxl-qemu-path.patch: this change was split. The
        section that adapts the path of the emulator to the Debian/Ubuntu
        packaging is kept.
      - d/p/ubuntu/ubuntu-libxl-Fix-up-VRAM-to-minimum-requirements.patch: auto
        set VRAM to minimum requirements
      - d/p/ubuntu/xen-default-uri.patch: set default URI on xen hosts
      - Add libxl log directory
      - libvirt-uri.sh: Automatically switch default libvirt URI for users on
        Xen dom0 via user profile (was missing on changelogs before)
    - d/p/ubuntu/apibuild-skip-libvirt-common.h: drop libvirt-common.h from
      included_files to avoid build failures due to duplicate definitions.
    - Update README.Debian with Ubuntu changes
    - Enable some additional features on ppc64el and s390x (for arch parity)
      + systemtap, zfs, numa and numad on s390x.
      + systemtap on ppc64el.
    - d/t/control, d/t/smoke-qemu-session: fixup smoke-qemu-session by making
      vmlinuz available and accessible (Debian bug 848314)
    - d/t/control, d/t/smoke-lxc: fix up lxc smoke test isolation
    - d/p/ubuntu/ubuntu_machine_type.patch: accept ubuntu types as pci440fx
    - Further upstreamed apparmor Delta, especially any new one
      Our former delta is split into logical pieces and is either Ubuntu only
      or is part of a continuous upstreaming effort.
      Listing related remaining changes in debian/patches/ubuntu-aa/:
      + 0001-apparmor-Allow-pygrub-to-run-on-Debian-Ubuntu.patch: apparmor:
        Allow pygrub to run on Debian/Ubuntu
      + 0003-apparmor-libvirt-qemu-Allow-read-access-to-overcommi.patch:
        apparmor, libvirt-qemu: Allow read access to overcommit_memory
      + 0007-apparmor-libvirt-qemu-Allow-owner-read-access-to-PRO.patch:
        apparmor, libvirt-qemu: Allow owner rea...

Changed in libvirt (Ubuntu):
status: In Progress → Fix Released
Bin Li (binli) wrote :

Christian,

Thanks a lot!

@Bin Li,
I'm prepping this for the stable release updates and wanted to check with you if you will be willing and able to verify the fix on your hardware?
That would be two times - once with a PPA I will provide soon and once more with a package from Bionic/Cosmic-proposed when the actual SRU takes place.

Let me know if you have the necessary HW to verify this as it would help to raise the confidence for the SRU team [1] to accept the change.

[1]: https://wiki.ubuntu.com/StableReleaseUpdates

description: updated

FYI - the fix for this bug is ready as an SRU and testable from a PPA for
Cosmic [1] and Bionic [2].

Since the verification of this bug requires special hardware, I'd appreciate it if you could pre-check whether these PPAs fix the issue. That would ensure that:
a) the fix is most likely to work when pushed as SRU
b) our plan to verify the actual SRU by your testing will work

In addition I'll push these PPAs through the automated regression tests for qemu/libvirt.

[1]: https://launchpad.net/~ci-train-ppa-service/+archive/ubuntu/3620
[2]: https://launchpad.net/~ci-train-ppa-service/+archive/ubuntu/3621

Christian,

I am on the Chinese Spring Festival holiday now and will be back on 15th Feb. Thanks.

Ping, any chance to pre-check - I really want to get this going.

Bin Li (binli) wrote :

Christian,
 Sorry for the late reply. I tried PPA 3621 for Bionic; after installing the new packages below, it works fine.
 I couldn't test the Cosmic one, because we have only one machine and we only need to support Bionic on it. I think the Cosmic packages should be okay if the patches are the same as for Bionic.
 Thanks a lot!

libvirt0:arm64 (4.0.0-1ubuntu8.7~ppa1)
qemu:arm64 (1:2.11+dfsg-1ubuntu7.10~ppa2)

Thanks in advance.
@Bin - For the final verification in -proposed, which will be requested by the SRU team, I'd really appreciate it if you could then also give Cosmic a test (even if you / your project only need it for Bionic).

That said - all pre-checks are done and the packages are uploaded to the SRU queue, waiting for approval.

Hello Bin, or anyone else affected,

Accepted libvirt into cosmic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/libvirt/4.6.0-2ubuntu3.3 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-cosmic to verification-done-cosmic. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-cosmic. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in libvirt (Ubuntu Cosmic):
status: Triaged → Fix Committed
tags: added: verification-needed verification-needed-cosmic
Changed in libvirt (Ubuntu Bionic):
status: Triaged → Fix Committed
tags: added: verification-needed-bionic
Brian Murray (brian-murray) wrote :

Hello Bin, or anyone else affected,

Accepted libvirt into bionic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/libvirt/4.0.0-1ubuntu8.7 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-bionic to verification-done-bionic. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-bionic. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Bin Li (binli) wrote :

I enabled -proposed on Bionic and updated; it works fine now. Thanks!

Get:2 http://us.ports.ubuntu.com/ubuntu-ports bionic-proposed/universe arm64 qemu arm64 1:2.11+dfsg-1ubuntu7.10
Get:1 http://us.ports.ubuntu.com/ubuntu-ports bionic-proposed/main arm64 libvirt0 arm64 4.0.0-1ubuntu8.7

tags: added: verification-done-bionic
removed: verification-needed-bionic

@Bin - thanks for the quick test.
I know you personally only need it for Bionic - but for this to complete overall we also would need a verification on Cosmic. Any chance you can upgrade one of your devices to give it a test as well?

Bin Li (binli) wrote :

Christian,

 Okay, I will try. I am going to verify it on amd64 first for Cosmic, so that I can make sure libvirt works correctly. Thanks!

Bin Li (binli) wrote :

First I tried Cosmic amd64 on a ThinkPad T480s and upgraded libvirt0 to 4.6.0-2ubuntu3.3. virt-manager runs successfully.

Thank you so much, setting verified then

tags: added: verification-done verification-done-cosmic
removed: verification-needed verification-needed-cosmic
Bin Li (binli) wrote :

I installed Cosmic on the Huawei ARM server and could reproduce the issue with libvirt0 4.6.0-2ubuntu3.2. After enabling -proposed and upgrading to 4.6.0-2ubuntu3.3, it works fine. Thanks!

tags: added: verification-needed
removed: verification-done
tags: added: verification-done
removed: verification-needed

The verification of the Stable Release Update for libvirt has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package libvirt - 4.6.0-2ubuntu3.3

---------------
libvirt (4.6.0-2ubuntu3.3) cosmic; urgency=medium

  * d/p/ubuntu/lp-1811198-utils-Remove-arbitrary-limit-on-socket_id-core_id
    .patch: fix arm servers with high core_id (LP: #1811198)
  * d/p/ubuntu/lp-1771662-*: fix assumption that all VFs have PFs assigned
    (LP: #1771662)

 -- Christian Ehrhardt <email address hidden> Thu, 31 Jan 2019 12:29:37 +0100

Changed in libvirt (Ubuntu Cosmic):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package libvirt - 4.0.0-1ubuntu8.7

---------------
libvirt (4.0.0-1ubuntu8.7) bionic; urgency=medium

  * d/p/ubuntu/lp-1811198-utils-Remove-arbitrary-limit-on-socket_id-core_id
    .patch: fix arm servers with high core_id (LP: #1811198)
  * d/p/ubuntu/lp-1771662-*: fix assumption that all VFs have PFs assigned
    (LP: #1771662)

 -- Christian Ehrhardt <email address hidden> Thu, 31 Jan 2019 12:45:18 +0100

Changed in libvirt (Ubuntu Bionic):
status: Fix Committed → Fix Released