Cannot separately enable cpu_power_management and cpu pinning

Bug #2043707 reported by Balazs Gibizer
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

If [libvirt]cpu_power_management is set to true but [compute]cpu_dedicated_set is empty, nova-compute fails to start with:

2023-11-16 10:42:42.444 2 ERROR oslo_service.service [None req-56dbf76c-524c-455d-9c64-d3474509e8d0 - - - - - -] Error starting thread.: nova.exception.InvalidConfiguration: '[compute]/cpu_dedicated_set' is mandatory to be set if '[libvirt]/cpu_power_management' is set.Please provide the CPUs that can be pinned or don't use the power management if you only use shared CPUs.
2023-11-16 10:42:42.444 2 ERROR oslo_service.service Traceback (most recent call last):
2023-11-16 10:42:42.444 2 ERROR oslo_service.service   File "/usr/lib/python3.9/site-packages/oslo_service/service.py", line 806, in run_service
2023-11-16 10:42:42.444 2 ERROR oslo_service.service     service.start()
2023-11-16 10:42:42.444 2 ERROR oslo_service.service   File "/usr/lib/python3.9/site-packages/nova/service.py", line 162, in start
2023-11-16 10:42:42.444 2 ERROR oslo_service.service     self.manager.init_host(self.service_ref)
2023-11-16 10:42:42.444 2 ERROR oslo_service.service   File "/usr/lib/python3.9/site-packages/nova/compute/manager.py", line 1608, in init_host
2023-11-16 10:42:42.444 2 ERROR oslo_service.service     self.driver.init_host(host=self.host)
2023-11-16 10:42:42.444 2 ERROR oslo_service.service   File "/usr/lib/python3.9/site-packages/nova/virt/libvirt/driver.py", line 831, in init_host
2023-11-16 10:42:42.444 2 ERROR oslo_service.service     libvirt_cpu.power_down_all_dedicated_cpus()
2023-11-16 10:42:42.444 2 ERROR oslo_service.service   File "/usr/lib/python3.9/site-packages/nova/virt/libvirt/cpu/api.py", line 122, in power_down_all_dedicated_cpus
2023-11-16 10:42:42.444 2 ERROR oslo_service.service     raise exception.InvalidConfiguration(msg)
2023-11-16 10:42:42.444 2 ERROR oslo_service.service nova.exception.InvalidConfiguration: '[compute]/cpu_dedicated_set' is mandatory to be set if '[libvirt]/cpu_power_management' is set.Please provide the CPUs that can be pinned or don't use the power management if you only use shared CPUs.
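
For readers who want to see the failing combination in isolation, here is a minimal sketch of the kind of strict check the traceback points at. It is not nova's code: check_power_management_config and the local InvalidConfiguration class are hypothetical stand-ins, and the config is parsed with the standard library's configparser rather than oslo.config. Only the two option names come from this report.

```python
import configparser


class InvalidConfiguration(Exception):
    """Hypothetical stand-in for nova.exception.InvalidConfiguration."""


def check_power_management_config(conf_text):
    """Model of the strict startup check described above (not nova's code).

    Returns (power_management_enabled, cpu_dedicated_set_string) or raises
    InvalidConfiguration when power management is on without dedicated CPUs.
    """
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    # Missing sections or options fall back to "disabled" / "empty".
    power_mgmt = parser.getboolean(
        "libvirt", "cpu_power_management", fallback=False)
    dedicated = parser.get(
        "compute", "cpu_dedicated_set", fallback="").strip()
    if power_mgmt and not dedicated:
        raise InvalidConfiguration(
            "cpu_dedicated_set is required when cpu_power_management is set")
    return power_mgmt, dedicated


# The combination from this report: power management on, no dedicated CPUs.
FAILING_CONF = "[libvirt]\ncpu_power_management = true\n"
try:
    check_power_management_config(FAILING_CONF)
except InvalidConfiguration as exc:
    print("refused to start:", exc)
```

With a non-empty cpu_dedicated_set in the [compute] section the same helper returns normally, which is the coupling between the two options that this report asks to remove.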

This is not a functional bug, but it is a UX bug. I would like to enable the CPU power management feature independently of configuring pinned CPU cores, even if that means no CPU cores are power managed while cpu_dedicated_set is empty.

Imagine a deployment engine that would like to enable cpu_power_management automatically by default, but cannot define the list of pinned CPU cores at the same time because that list is hypervisor hardware dependent. The current strict validation prevents enabling cpu_power_management before defining the list of PCPUs.

The actual power management logic can gracefully handle the case when zero PCPUs are defined: it manages all the configured PCPUs, which in this case simply means managing none.
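
To illustrate why the empty case is harmless, here is a minimal sketch of a per-CPU loop. The function name power_down_dedicated_cpus and the power_down callback are hypothetical and do not reflect nova's implementation; the only assumption taken from this report is that each PCPU is handled individually.

```python
def power_down_dedicated_cpus(dedicated_cpus, power_down):
    """Call power_down once per dedicated CPU and return the IDs handled.

    Hypothetical helper: with an empty collection the loop body never
    runs, so nothing is powered down and no error is raised.
    """
    handled = []
    for cpu_id in sorted(dedicated_cpus):
        power_down(cpu_id)
        handled.append(cpu_id)
    return handled


offlined = []
# No dedicated CPUs configured: the callback is never invoked.
power_down_dedicated_cpus(set(), offlined.append)
# Two dedicated CPUs configured: each one is handled in turn.
power_down_dedicated_cpus({3, 2}, offlined.append)
```

Because the loop is driven entirely by the configured set, no separate guard against an empty cpu_dedicated_set is needed for correctness in a design like this.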

Tags: libvirt
Balazs Gibizer (balazs-gibizer) wrote :

The PR enabling the feature in a deployment and showing the error: https://github.com/openstack-k8s-operators/nova-operator/pull/597

tags: added: libvirt
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/nova/+/901188

Changed in nova:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.opendev.org/c/openstack/nova/+/901188
Committed: https://opendev.org/openstack/nova/commit/b1a0aee1abca0ed61c156dd99544adeaebaf0960
Submitter: "Zuul (22348)"
Branch: master

commit b1a0aee1abca0ed61c156dd99544adeaebaf0960
Author: Balazs Gibizer <email address hidden>
Date: Thu Nov 16 18:01:29 2023 +0100

    Allow enabling cpu_power_management with 0 dedicated CPUs

    The CPU power management feature of the libvirt driver, enabled with
    [libvirt]cpu_power_management, only manages dedicated CPUs and does not
    touch shared CPUs. Today nova-compute refuses to start if configured
    with [libvirt]cpu_power_management=true [compute]cpu_dedicated_set=None.
    While this is functionally not limiting, it does limit the possibility to
    independently enable the power management and define the
    cpu_dedicated_set. E.g. there might be a need to enable the former in
    the whole cloud in a single step, while not all nodes of the cloud will
    have dedicated CPUs configured.

    This patch removes the strict config check. The implementation already
    handles each PCPU individually, so if there is an empty list of PCPUs
    then it does nothing.

    Closes-Bug: #2043707
    Change-Id: Ib070e1042c0526f5875e34fa4f0d569590ec2514

Changed in nova:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/2023.2)

Fix proposed to branch: stable/2023.2
Review: https://review.opendev.org/c/openstack/nova/+/901656

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/2023.1)

Fix proposed to branch: stable/2023.1
Review: https://review.opendev.org/c/openstack/nova/+/901660

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/2023.2)

Reviewed: https://review.opendev.org/c/openstack/nova/+/901656
Committed: https://opendev.org/openstack/nova/commit/4549e3479250cec7889c7809e719f77d19514222
Submitter: "Zuul (22348)"
Branch: stable/2023.2

commit 4549e3479250cec7889c7809e719f77d19514222
Author: Balazs Gibizer <email address hidden>
Date: Thu Nov 16 18:01:29 2023 +0100

    Allow enabling cpu_power_management with 0 dedicated CPUs

    Closes-Bug: #2043707
    Change-Id: Ib070e1042c0526f5875e34fa4f0d569590ec2514
    (cherry picked from commit b1a0aee1abca0ed61c156dd99544adeaebaf0960)

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/2023.1)

Reviewed: https://review.opendev.org/c/openstack/nova/+/901660
Committed: https://opendev.org/openstack/nova/commit/3ce1bd5225e17bc23b5a2662f1ca8040617b2709
Submitter: "Zuul (22348)"
Branch: stable/2023.1

commit 3ce1bd5225e17bc23b5a2662f1ca8040617b2709
Author: Balazs Gibizer <email address hidden>
Date: Thu Nov 16 18:01:29 2023 +0100

    Allow enabling cpu_power_management with 0 dedicated CPUs

    Closes-Bug: #2043707
    Change-Id: Ib070e1042c0526f5875e34fa4f0d569590ec2514
    (cherry picked from commit b1a0aee1abca0ed61c156dd99544adeaebaf0960)
    (cherry picked from commit 4549e3479250cec7889c7809e719f77d19514222)

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 28.0.1

This issue was fixed in the openstack/nova 28.0.1 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 27.2.0

This issue was fixed in the openstack/nova 27.2.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 29.0.0.0rc1

This issue was fixed in the openstack/nova 29.0.0.0rc1 release candidate.
