Comment 21 for bug 1978489

Edward Hope-Morley (hopem) wrote :

As a recap, this patch addresses the problem of moving VMs between hosts running cgroups v1 (e.g. Ubuntu Focal) and cgroups v2 (e.g. Ubuntu Jammy). Under v2, cpu.weight is capped at 10K [1], so VMs with more than 9 vcpus cannot boot if they use the Nova default of 1024 * guest.vcpus (10 vcpus already gives 10240). The patch addresses this by no longer applying a default weight to instances, while keeping the option to set quota:cpu_shares in a flavor's extra-specs.
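To make the arithmetic concrete, here is a minimal sketch of the old default against the v2 cap. The constant and function names are made up for illustration and are not taken from the Nova source; only the 1024 multiplier and the 10K cap come from the description above.

```python
# Illustrative only: names are hypothetical, not Nova's.
CGROUP_V2_CPU_WEIGHT_MAX = 10000  # cap on cpu.weight under cgroups v2, per [1]
LEGACY_SHARES_PER_VCPU = 1024     # multiplier in the old default, 1024 * guest.vcpus


def legacy_default_shares(vcpus: int) -> int:
    """Return the default weight that used to be applied to an instance."""
    return LEGACY_SHARES_PER_VCPU * vcpus


def exceeds_v2_cap(vcpus: int) -> bool:
    """True if the old default would be above the cgroups v2 cpu.weight cap."""
    return legacy_default_shares(vcpus) > CGROUP_V2_CPU_WEIGHT_MAX


if __name__ == "__main__":
    # 9 vcpus is the last size that fits under the cap; 10 vcpus does not.
    for vcpus in (8, 9, 10, 16):
        print(vcpus, legacy_default_shares(vcpus), exceeds_v2_cap(vcpus))
```

So 9 vcpus gives 9216, which fits under the cap, and 10 vcpus gives 10240, which does not.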

The consequences of this are:
1. VMs booted without quota:cpu_shares extra-specs after upgrading to this patch will have the default cgroups v2 weight of 100.
2. New VMs can get a higher weight if they use a flavor with extra-specs quota:cpu_shares, BUT this will only apply to existing VMs if they are resized to switch to the new/modified flavor, which requires workload downtime. A VM reboot will not pick up the new value.
3. VMs created from a flavor with extra-specs quota:cpu_shares set to a value > 10K will fail to boot. Fixing this requires a new/modified flavor with an adjusted value and then a VM resize to pick it up, and therefore workload downtime.

It is important to note that point 3 is not a consequence of this patch: it is neither introduced nor resolved by it, and will require a separate patch. One way to resolve it could be to have Nova cap quota:cpu_shares at the cgroup cpu.weight max value and log a warning to say that was done; that way instances will at least boot and have the max weight. I am therefore in favour of proceeding with this SRU to give users a way to migrate from v1 to v2, and suggest we propose a new patch to address the flavor extra-specs issue. As @jamespage has pointed out, there are some interim manual solutions that can be used as a stop-gap until this is fully resolved in Nova.
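For the sake of discussion, the capping idea could look roughly like the sketch below. This is not Nova code: the function name, constant and log message are hypothetical, and where exactly such a check would live in Nova is left open.

```python
import logging

LOG = logging.getLogger(__name__)

# Hypothetical constant: cap on cpu.weight under cgroups v2, per [1].
CGROUP_V2_CPU_WEIGHT_MAX = 10000


def cap_cpu_shares(requested: int) -> int:
    """Clamp a flavor's quota:cpu_shares value to the cgroups v2 maximum.

    Values at or below the cap are returned unchanged. Larger values are
    reduced to the cap and a warning is logged, so the instance can still
    boot with the maximum weight.
    """
    if requested > CGROUP_V2_CPU_WEIGHT_MAX:
        LOG.warning(
            "quota:cpu_shares=%d exceeds the cgroups v2 cpu.weight maximum; "
            "capping to %d",
            requested,
            CGROUP_V2_CPU_WEIGHT_MAX,
        )
        return CGROUP_V2_CPU_WEIGHT_MAX
    return requested
```

With something like this, a flavor that asks for 20480 would boot with a weight of 10000 instead of failing, and the warning leaves a trace for operators.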

[1] https://www.kernel.org/doc/Documentation/cgroup-v2.txt