libvirt / cgroups v2: cannot boot instance with more than 9 CPUs

Bug #1978489 reported by Artom Lifshitz
This bug affects 8 people
Affects                    Status        Importance  Assigned to  Milestone
OpenStack Compute (nova)   In Progress   Undecided   Unassigned   -
Ubuntu Cloud Archive       Invalid       Undecided   Unassigned   -
  Yoga                     Fix Released  High        Unassigned   -
nova (Ubuntu)              Fix Released  Undecided   Unassigned   -
  Jammy                    Fix Released  High        Unassigned   -

Bug Description

Description
===========

Using the libvirt driver on a host OS that uses cgroups v2 (RHEL 9, Ubuntu Jammy), an instance with more than 9 CPUs cannot be booted.

Steps to reproduce
==================

1. Boot an instance with 10 (or more) CPUs on RHEL 9 or Ubuntu Jammy using Nova with the libvirt driver.
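For example, a minimal reproduction sketch (the flavor, image, and network names here are hypothetical):

    openstack flavor create --vcpus 10 --ram 4096 --disk 20 cgv2-test
    openstack server create --flavor cgv2-test --image jammy --network private cgv2-vm
    # on a cgroups v2 host the instance goes to ERROR (see 'Actual result')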

Expected result
===============

Instance boots.

Actual result
=============

Instance fails to boot with a 'Value specified in CPUWeight is out of range' error.

Environment
===========

Originally reported as a libvirt bug on RHEL 9 [1].

Additional information
======================

This is happening because Nova defaults to 1024 * (# of CPUs) for the value of domain/cputune/shares in the libvirt XML. This value is passed directly by libvirt to the cgroups API, but cgroups v2 caps it at 10000; since 10000 / 1024 ≈ 9.76, any instance with 10 or more CPUs exceeds the limit.
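As a quick illustration of the arithmetic:

    $ echo $((1024 * 9))     # 9216  -> within the 10000 cgroups v2 cap, boots
    $ echo $((1024 * 10))    # 10240 -> exceeds the cap, fails to boot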

[1] https://bugzilla.redhat.com/show_bug.cgi?id=2035518

====================================

Ubuntu SRU Details:

[Impact]
See above.

[Test Case]
See above.

[Regression Potential]
We've had this change in other jammy-based versions of the nova package for a while now, including zed, antelope, and bobcat.

Changed in nova:
status: New → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.opendev.org/c/openstack/nova/+/824048
Committed: https://opendev.org/openstack/nova/commit/f77a9fee5b736899ecc39d33e4f4e4012cee751c
Submitter: "Zuul (22348)"
Branch: master

commit f77a9fee5b736899ecc39d33e4f4e4012cee751c
Author: Artom Lifshitz <email address hidden>
Date: Mon Jan 10 13:36:36 2022 -0500

    libvirt: remove default cputune shares value

    Previously, the libvirt driver defaulted to 1024 * (# of CPUs) for the
    value of domain/cputune/shares in the libvirt XML. This value is then
    passed directly by libvirt to the cgroups API. Cgroups v2 imposes a
    maximum value of 10000 that can be passed in. This makes Nova
    unable to launch instances with more than 9 CPUs on hosts that run
    cgroups v2, like Ubuntu Jammy or RHEL 9.

    Fix this by just removing the default entirely. Because there is no
    longer a guarantee that domain/cputune will contain at least a shares
    element, we can stop always generating the former, and only generate
    it if it will actually contain something.

    We can also make operators' lives easier by leveraging the fact that
    we update the XML during live migration, so this patch also adds a
    method to remove the shares value from the live migration XML if one
    was not set as the quota:cpu_shares flavor extra spec.

    For operators that *have* set this extra spec to something greater
    than 10000, their flavors will have to get updates, and their
    instances resized.

    Partial-bug: 1978489
    Change-Id: I49d757f5f261b3562ada27e6cf57284f615ca395

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/nova/+/898326

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/yoga)

Fix proposed to branch: stable/yoga
Review: https://review.opendev.org/c/openstack/nova/+/898554

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (master)

Change abandoned by "Tobias Urdin <email address hidden>" on branch: master
Review: https://review.opendev.org/c/openstack/nova/+/898326

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/yoga)

Reviewed: https://review.opendev.org/c/openstack/nova/+/898554
Committed: https://opendev.org/openstack/nova/commit/0a6b57a9a24a0936383aaf444c690772aacc3245
Submitter: "Zuul (22348)"
Branch: stable/yoga

commit 0a6b57a9a24a0936383aaf444c690772aacc3245
Author: Artom Lifshitz <email address hidden>
Date: Mon Jan 10 13:36:36 2022 -0500

    libvirt: remove default cputune shares value

    Previously, the libvirt driver defaulted to 1024 * (# of CPUs) for the
    value of domain/cputune/shares in the libvirt XML. This value is then
    passed directly by libvirt to the cgroups API. Cgroups v2 imposes a
    maximum value of 10000 that can be passed in. This makes Nova
    unable to launch instances with more than 9 CPUs on hosts that run
    cgroups v2, like Ubuntu Jammy or RHEL 9.

    Fix this by just removing the default entirely. Because there is no
    longer a guarantee that domain/cputune will contain at least a shares
    element, we can stop always generating the former, and only generate
    it if it will actually contain something.

    We can also make operators' lives easier by leveraging the fact that
    we update the XML during live migration, so this patch also adds a
    method to remove the shares value from the live migration XML if one
    was not set as the quota:cpu_shares flavor extra spec.

    For operators that *have* set this extra spec to something greater
    than 10000, their flavors will have to get updates, and their
    instances resized.

    Partial-bug: 1978489
    Change-Id: I49d757f5f261b3562ada27e6cf57284f615ca395
    (cherry picked from commit f77a9fee5b736899ecc39d33e4f4e4012cee751c)

tags: added: in-stable-yoga
Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in nova (Ubuntu):
status: New → Confirmed
Revision history for this message
Serhat Rıfat Demircan (demircan-serhat) wrote :

Hello,

We are hitting this bug in production too. When are you planning to publish a new Jammy package that contains this fix?

Changed in nova (Ubuntu Jammy):
status: New → Triaged
description: updated
Revision history for this message
Corey Bryant (corey.bryant) wrote :

A new version of nova has been uploaded to the jammy unapproved queue, where it is awaiting SRU team review: https://launchpad.net/ubuntu/jammy/+queue?queue_state=1&queue_text=nova

Revision history for this message
Jan Graichen (jgraichen) wrote :

Hello,

We're affected by this bug too. Unfortunately, the patch changes the behavior for instances by completely removing the default cputune, so instances are no longer weighted relative to each other at all.

We tried adding `quota:cpu_shares` to our flavors (vcpus * 100), but that isn't applied to any existing instances. They stay unweighted and are now crowded out by new instances.
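For reference, what we ran was roughly the following (the flavor name is an example):

    openstack flavor set --property quota:cpu_shares=1200 my-12vcpu-flavor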

As far as we know, updating flavors was never intended to affect existing instances, even for changes to just the extra specs; but here a change breaks all existing instances, and fixing the flavor doesn't help at all.

Some other flavors already had `quota:cpu_shares` > 10000. Those broke completely too, and cannot be fixed without patching the nova database in about three places.

Is there any workaround, short of rebuilding hundreds of instances, such as forcing nova to override the flavors of existing instances?

Revision history for this message
Steve Langasek (vorlon) wrote : Please test proposed package

Hello Artom, or anyone else affected,

Accepted nova into jammy-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/nova/3:25.2.1-0ubuntu2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-jammy to verification-done-jammy. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-jammy. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in nova (Ubuntu Jammy):
status: Triaged → Fix Committed
tags: added: verification-needed verification-needed-jammy
Revision history for this message
Stefan Lupsa (stefanlupsacbsl) wrote :

Hello, I've tried testing this in the following setup: a Juju-deployed OpenStack on Focal with openstack-origin=cloud:focal-yoga on the nova-compute charm, with a 12-vCPU flavor in nova.

Test cases covered: launching a 12-CPU instance on a jammy host, and live-migrating a 12-CPU instance from a focal-yoga host to a jammy host.

Initial nova compute host packages:
# dpkg -l | grep nova
ii nova-api-metadata 3:25.2.1-0ubuntu1~cloud0 all OpenStack Compute - metadata API frontend
ii nova-common 3:25.2.1-0ubuntu1~cloud0 all OpenStack Compute - common files
ii nova-compute 3:25.2.1-0ubuntu1~cloud0 all OpenStack Compute - compute node base
ii nova-compute-kvm 3:25.2.1-0ubuntu1~cloud0 all OpenStack Compute - compute node (KVM)
ii nova-compute-libvirt 3:25.2.1-0ubuntu1~cloud0 all OpenStack Compute - compute node libvirt support
ii python3-nova 3:25.2.1-0ubuntu1~cloud0 all OpenStack Compute Python 3 libraries
ii python3-novaclient 2:17.6.0-0ubuntu1~cloud0 all client library for OpenStack Compute API - 3.x

Upgrading one compute node's OS to jammy with do-release-upgrade results in these nova packages:
# dpkg -l | grep nova
ii nova-api-metadata 3:25.2.1-0ubuntu1 all OpenStack Compute - metadata API frontend
ii nova-common 3:25.2.1-0ubuntu1 all OpenStack Compute - common files
ii nova-compute 3:25.2.1-0ubuntu1 all OpenStack Compute - compute node base
ii nova-compute-kvm 3:25.2.1-0ubuntu1 all OpenStack Compute - compute node (KVM)
ii nova-compute-libvirt 3:25.2.1-0ubuntu1 all OpenStack Compute - compute node libvirt support
ii python3-nova 3:25.2.1-0ubuntu1 all OpenStack Compute Python 3 libraries
ii python3-novaclient 2:17.6.0-0ubuntu1 all client library for OpenStack Compute API - 3.x

Creating and migrating instances of the test flavor to the upgraded compute node replicates the bug:

Live migration to jammy host:
2024-01-23 13:26:07.290 15908 ERROR nova.virt.libvirt.driver [-] [instance: 41b66168-6d5d-44d8-92fe-cf51dfddeac6] Live Migration failure: error from service: GDBus.Error:org.freedesktop.DBus.Error.InvalidArgs: Value specified in CPUWeight is out of range: libvirt.libvirtError: error from service: GDBus.Error:org.freedesktop.DBus.Error.InvalidArgs: Value specified in CPUWeight is out of range

Instance creation on jammy host:
2024-01-23 13:24:19.772 7695 ERROR nova.compute.manager [req-867b6b88-b27f-43b3-8c36-07fd6ab7b5c1 7...

tags: added: verification-done-jammy
removed: verification-needed-jammy
Revision history for this message
Stefan Lupsa (stefanlupsacbsl) wrote :

For our needs the current patch/package is enough of a fix, as we have already upgraded the deployment in question to jammy and just need the clean version of the package; however, for posterity, the cloud:focal-yoga package should still include a fix for migrating to a jammy node with cgroups v2.

Revision history for this message
Robie Basak (racb) wrote :

What's the status of this in Noble please?

Revision history for this message
Andreas Hasenack (ahasenack) wrote :

In addition to what was asked in comment #14, I would also like some clarification on:

- comment #10: is the patch breaking backwards compatibility?
- comment #12: it shows a live migration problem which I'm not sure is due to this bug, or something else.

Revision history for this message
Stefan Lupsa (stefanlupsacbsl) wrote :

> - comment #12: it shows a live migration problem which I'm not sure is due to this bug, or something else.

The patch fixes the problem for the jammy distro.

The same patch should also be available in the cloud archive cloud:focal-yoga so that VMs can be migrated to a jammy node during an upgrade. Any instance with >= 10 vCPUs will fail to live-migrate to an upgraded node (running the jammy distro) with the same error, even if that node already has the patch. It's the same problem, and the patch addresses it with functionality that updates the instance's CPU shares during the migration - that is, on the migration source, which in a Juju environment would be an Ubuntu Focal host running the yoga cloud archive.

Revision history for this message
Jorge Merlino (jorge-merlino) wrote :

Hi Andreas and Robie,

Regarding the question about the state of Noble: this patch is currently merged in Noble and Mantic. It is also present in uca-zed and newer versions.

I think the verification done in comment #12 is not very clear. The point is that the bug shown there should be fixed by this patch; that was confirmed in comment #16 by the author of #12, but not before.

I think comment #10 has a valid point regarding backwards compatibility, as this changes the default behavior of migrated VMs. Now (by default) all VMs have the same weight when migrated, whereas before a VM's weight grew with its vCPU count. This behavior can be recovered by setting the quota:cpu_shares flavor extra spec. This seems to be acceptable to upstream, as the change has been merged to master and also backported to stable/yoga (https://review.opendev.org/c/openstack/nova/+/898554).
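For an existing instance, picking up the changed flavor requires a resize; roughly, with example names (and noting the resize implies workload downtime):

    openstack server resize --flavor my-flavor-weighted my-instance
    openstack server resize confirm my-instance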

Please let me know if you have more questions in order to release this package.

Revision history for this message
Andreas Hasenack (ahasenack) wrote :

> This behavior can be recovered by setting the quota:cpu_shares flavor extra spec.

You are the openstack experts here, but I will point out that it looks like comment #10 already tried this.

That comment also ends with: "Is there any workaround to rebuilding hundreds of instances like force nova to override flavors of existing instances?"

Do we have a concrete answer for that? Like, "here is what you do, step by step".

Revision history for this message
James Page (james-page) wrote (last edit ):

I think that the challenge of how to update the cpu tuning for all existing running instances is solvable.

a) quota:cpu_* is an additional property for a flavor and as such can be updated (applying to newly created instances).

b) Using the virsh tool, it's possible to live-set the scheduling tuning on a running instance - for example:

sudo virsh schedinfo instance-0008905c --config --live --set cpu_shares=2048

That obviously needs tailoring for the actual running environment/instances.
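A rough (untested) sketch of that tailoring for all running domains on one hypervisor, capped at the cgroups v2 maximum of 10000:

    for dom in $(sudo virsh list --name); do
        vcpus=$(sudo virsh dominfo "$dom" | awk '/^CPU\(s\)/ {print $2}')
        shares=$((1024 * vcpus))
        [ "$shares" -gt 10000 ] && shares=10000  # cgroups v2 cpu.weight cap
        sudo virsh schedinfo "$dom" --config --live --set cpu_shares="$shares"
    done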

That does not, however, deal with the imbalance between instances created before and after the update with no flavor extra-specs defined - the default cpu_shares value of 1024 should be used for existing instances with no explicit cpu_shares extra spec.

Revision history for this message
James Page (james-page) wrote :

Re:

> The same patch should also be available on cloud archive cloud:focal-yoga

This will happen alongside the changes being made into 22.04 - the updates are in the yoga-proposed pocket at the moment.

Changed in cloud-archive:
status: New → Invalid
Changed in nova (Ubuntu Jammy):
importance: Undecided → High
Revision history for this message
Edward Hope-Morley (hopem) wrote :

As a recap, this patch addresses the problem of moving VMs between hosts running cgroups v1 (e.g. Ubuntu Focal) and cgroups v2 (Ubuntu Jammy), where cpu.weight now has a cap of 10K [1], resulting in VMs with > 9 vCPUs not being able to boot if they use Nova's default of 1024 * guest.vcpus. The patch addresses this by no longer applying a default weight to instances, while keeping the option to apply quota:cpu_shares from a flavor's extra-specs.

The consequences of this are:
1. VMs booted without the quota:cpu_shares extra-spec after upgrading to this patch will have the default cgroups v2 weight of 100.
2. New VMs can get a higher weight if they use a flavor with the quota:cpu_shares extra-spec, BUT this will only apply to existing VMs if they are resized so as to switch to the new/modified flavor, which requires workload downtime - a VM reboot will not pick up the new value.
3. VMs created from a flavor with the quota:cpu_shares extra-spec set to a value > 10K will fail to boot; fixing this requires a new/modified flavor with an adjusted value, then a VM resize to consume it, and therefore workload downtime.

It is important to note that point 3 is not a consequence of this patch: it is neither introduced nor resolved by it, and will require a separate patch. One way to resolve it could be to have Nova cap quota:cpu_shares at the cgroups cpu.weight maximum and log a warning saying so; that way instances will at least boot, with the maximum weight. I am therefore in favour of proceeding with this SRU to give users a way to migrate from v1 to v2, and suggest we propose a new patch to address the flavor extra-specs issue. As @jamespage has pointed out, there are some interim manual solutions that can be used as a stop-gap until this is fully resolved in Nova.
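To verify what weight an instance actually ended up with (the domain name and cgroup scope path below are examples and will vary per host):

    virsh schedinfo instance-0008905c | grep cpu_shares
    # or read the cgroup directly on a cgroups v2 host:
    cat /sys/fs/cgroup/machine.slice/machine-qemu*.scope/cpu.weight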

[1] https://www.kernel.org/doc/Documentation/cgroup-v2.txt

Revision history for this message
Edward Hope-Morley (hopem) wrote (last edit ):

Forgot to add to ^: instead of removing the default weight (1024 * guest.vcpus), might it not have made sense to simply cap it at the maximum allowed value? Again, perhaps something that could be proposed to Nova as a new patch.

Revision history for this message
Andreas Hasenack (ahasenack) wrote :

Comment #17 stated that noble and mantic have the patch, so I'm marking the noble (devel) task as Fix Released.

Changed in nova (Ubuntu):
status: Confirmed → Fix Released
Revision history for this message
Andreas Hasenack (ahasenack) wrote : Update Released

The verification of the Stable Release Update for nova has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package nova - 3:25.2.1-0ubuntu2

---------------
nova (3:25.2.1-0ubuntu2) jammy; urgency=medium

  * d/p/libvirt-remove-default-cputune-shares-value.patch:
    Enable launch of instances with more than 9 CPUs on Jammy
    (LP: #1978489).

 -- Corey Bryant <email address hidden> Tue, 16 Jan 2024 12:30:33 -0500

Changed in nova (Ubuntu Jammy):
status: Fix Committed → Fix Released
Revision history for this message
James Page (james-page) wrote :

The verification of the Stable Release Update for nova has completed successfully and the package has now been released to -updates. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Revision history for this message
James Page (james-page) wrote :

This bug was fixed in the package nova - 3:25.2.1-0ubuntu2~cloud0
---------------

 nova (3:25.2.1-0ubuntu2~cloud0) focal-yoga; urgency=medium
 .
   * New update for the Ubuntu Cloud Archive.
 .
 nova (3:25.2.1-0ubuntu2) jammy; urgency=medium
 .
   * d/p/libvirt-remove-default-cputune-shares-value.patch:
     Enable launch of instances with more than 9 CPUs on Jammy
     (LP: #1978489).

tags: added: verification-done
removed: verification-needed