AIO System Controller CPU assignment changes required

Bug #1855920 reported by Jim Gauld
Affects    Status        Importance  Assigned to  Milestone
StarlingX  Fix Released  High        Jim Gauld

Bug Description

Brief Description
-----------------
The following changes are required for an AIO running as a DC system controller:
- All cores are assigned as platform cores.
- kubelet uses the same config options as on a standard controller.

Severity
--------
Major: System/Feature is usable but degraded

Steps to Reproduce
------------------
Launch AIO with DC system controller.

Expected Behavior
------------------
All sysinv-managed cores should have the 'Platform' function. This implies that the CPU affinity of the platform and kubernetes processes floats across all logical cpus and numa nodes. The kubelet process should have CPU manager policy 'none' with no reserved cpus.

Actual Behavior
----------------
Only 2 cores from numa node 0 are assigned the 'Platform' function, so this system is highly constrained because the CPU affinity of processes is limited to these cores. The kubelet process has CPU manager policy 'none' with reserved cpus.
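
The difference between the expected and actual assignment can be illustrated with a minimal Python sketch. This is not sysinv code: the topology layout, the helper names, and the 'Application' label for non-platform cores are assumptions made only for illustration.

```python
from typing import Dict, List

# Hypothetical labels for illustration; sysinv's real names may differ.
PLATFORM = "Platform"
APPLICATION = "Application"


def assign_actual(topology: Dict[int, List[int]],
                  platform_cores: int = 2) -> Dict[int, str]:
    """Actual behavior: only the first cores of numa node 0 are Platform."""
    assignment = {}
    for node, cores in sorted(topology.items()):
        for index, core in enumerate(cores):
            is_platform = node == 0 and index < platform_cores
            assignment[core] = PLATFORM if is_platform else APPLICATION
    return assignment


def assign_expected(topology: Dict[int, List[int]]) -> Dict[int, str]:
    """Expected behavior: every core on every numa node is Platform."""
    return {core: PLATFORM for cores in topology.values() for core in cores}


# Made-up topology: two numa nodes with four cores each.
topology = {0: [0, 1, 2, 3], 1: [4, 5, 6, 7]}
actual = assign_actual(topology)
expected = assign_expected(topology)
platform_actual = sorted(c for c, f in actual.items() if f == PLATFORM)
platform_expected = sorted(c for c, f in expected.items() if f == PLATFORM)
print(platform_actual)    # [0, 1]
print(platform_expected)  # [0, 1, 2, 3, 4, 5, 6, 7]
```

With the actual assignment, platform and kubernetes processes are confined to cores 0 and 1 of this example; with the expected assignment they can float across all eight.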

Reproducibility
---------------
Reproducible.

System Configuration
--------------------
AIO DC System controller.

Branch/Pull Time/Commit
-----------------------
NA.

Last Pass
---------
None; this is a day-one issue.

Timestamp/Logs
--------------
NA.

Test Activity
-------------
Other - System scalability engineering.

Jim Gauld (jgauld)
Changed in starlingx:
assignee: nobody → Jim Gauld (jgauld)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to stx-puppet (master)

Fix proposed to branch: master
Review: https://review.opendev.org/698321

Changed in starlingx:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix proposed to config (master)

Fix proposed to branch: master
Review: https://review.opendev.org/698323

Ghada Khalil (gkhalil)
tags: added: stx.distcloud
Ghada Khalil (gkhalil) wrote :

stx.3.0 / high priority, given that this is a key optimization for distributed cloud.
The fix can be included in an upcoming maintenance release.

Changed in starlingx:
importance: Undecided → High
tags: added: stx.3.0
OpenStack Infra (hudson-openstack) wrote : Fix merged to stx-puppet (master)

Reviewed: https://review.opendev.org/698321
Committed: https://git.openstack.org/cgit/starlingx/stx-puppet/commit/?id=615689aca8e7c003f417e96d79f6104684b5826b
Submitter: Zuul
Branch: master

commit 615689aca8e7c003f417e96d79f6104684b5826b
Author: Jim Gauld <email address hidden>
Date: Tue Dec 10 15:24:43 2019 -0500

    AIO System Controller CPU manager changes for kubelet

    This changes Puppet manifest for kubernetes. kubelet options for AIO
    system controller are made to match a standard controller. The CPU
    manager policy is configured as 'none' with no reserved cpus.

    Change-Id: I10d41cd9974d0a9255f634f3b3fc09212774bdec
    Partial-Bug: 1855920
    Signed-off-by: Jim Gauld <email address hidden>
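
The kubelet setting described in the commit message above (CPU manager policy 'none', no reserved cpus) can be sketched in Python as follows. `--cpu-manager-policy` and `--reserved-cpus` are upstream kubelet flags, but how the StarlingX Puppet manifest actually passes its options is not shown in this report, so the helper name and the reserved cpu list below are assumptions for illustration only.

```python
from typing import List, Optional


def kubelet_cpu_args(policy: str = "none",
                     reserved_cpus: Optional[str] = None) -> List[str]:
    """Build the CPU-related kubelet arguments (hypothetical helper)."""
    if policy not in ("none", "static"):
        raise ValueError("unsupported CPU manager policy: %s" % policy)
    args = ["--cpu-manager-policy=%s" % policy]
    # Only add a reservation when one is requested.
    if reserved_cpus:
        args.append("--reserved-cpus=%s" % reserved_cpus)
    return args


# AIO system controller after the fix: policy 'none', nothing reserved.
print(kubelet_cpu_args())
# Policy 'none' with a reservation; "0-1" is a made-up cpu list.
print(kubelet_cpu_args(reserved_cpus="0-1"))
```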

OpenStack Infra (hudson-openstack) wrote : Fix merged to config (master)

Reviewed: https://review.opendev.org/698323
Committed: https://git.openstack.org/cgit/starlingx/config/commit/?id=539c29d717402efd44b634d653a4ff75c5b9b8fa
Submitter: Zuul
Branch: master

commit 539c29d717402efd44b634d653a4ff75c5b9b8fa
Author: Jim Gauld <email address hidden>
Date: Tue Dec 10 15:45:50 2019 -0500

    AIO System Controller CPU assignment changes

    This changes AIO running DC system controller CPU assignment so that
    all logical cpus spanning all numa nodes are configured as Platform
    function.

    Change-Id: I6be3c8f63661786b193b0ed7a72781a7c48808cb
    Closes-Bug: 1855920
    Depends-On: https://review.opendev.org/#/c/698321/
    Signed-off-by: Jim Gauld <email address hidden>

Changed in starlingx:
status: In Progress → Fix Released
John Kung (john-kung) wrote :

An update is required to scope this fix to only the controller personality.

That is, the change in sysinv/conductor/manager.py::_get_default_platform_cpu_count needs an additional condition that limits it to controller hosts:

        # Reserve all logical cpus on all numa nodes for AIO systemcontroller
        system = self.dbapi.isystem_get_one()
        system_type = system.system_type
        dc_role = system.distributed_cloud_role
        if (system_type == constants.TIS_AIO_BUILD and
                dc_role == constants.DISTRIBUTED_CLOUD_ROLE_SYSTEMCONTROLLER and
                cutils.host_has_function(ihost, constants.CONTROLLER)):
            return cpu_count
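
The snippet above depends on sysinv internals (`self.dbapi`, `constants`, `cutils`). A self-contained sketch of the same scoping logic might look like the following; the constant values, the function signature, and the `fallback` parameter are stand-ins chosen for illustration and are not taken from sysinv.

```python
# Stand-in values; the real ones live in sysinv's constants module
# and may differ.
TIS_AIO_BUILD = "All-in-one"
DISTRIBUTED_CLOUD_ROLE_SYSTEMCONTROLLER = "systemcontroller"
CONTROLLER = "controller"


def default_platform_cpu_count(system_type, dc_role, host_functions,
                               cpu_count, fallback):
    """Return cpu_count only for controller hosts of an AIO system controller.

    `fallback` stands for whatever the real function returns otherwise;
    that part is not shown in this report.
    """
    if (system_type == TIS_AIO_BUILD and
            dc_role == DISTRIBUTED_CLOUD_ROLE_SYSTEMCONTROLLER and
            CONTROLLER in host_functions):
        return cpu_count
    return fallback


# Controller host on an AIO system controller: all cpus are reserved.
print(default_platform_cpu_count(
    TIS_AIO_BUILD, DISTRIBUTED_CLOUD_ROLE_SYSTEMCONTROLLER,
    {"controller", "worker"}, cpu_count=32, fallback=2))  # 32
# Worker-only host on the same system: the added condition excludes it.
print(default_platform_cpu_count(
    TIS_AIO_BUILD, DISTRIBUTED_CLOUD_ROLE_SYSTEMCONTROLLER,
    {"worker"}, cpu_count=32, fallback=2))  # 2
```

Without the third condition, the worker-only host in the second call would also have all of its cpus reserved for the platform, which is the problem this comment reports.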

Frank Miller (sensfan22) wrote :

Re-opening given the issue reported above by John Kung.

The fix will need to go into master, and all commits for this LP will need to be cherry-picked to r/stx.3.0 for the first maintenance release.

Changed in starlingx:
status: Fix Released → Confirmed
OpenStack Infra (hudson-openstack) wrote : Fix proposed to stx-puppet (master)

Fix proposed to branch: master
Review: https://review.opendev.org/698769

Changed in starlingx:
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix proposed to config (master)

Fix proposed to branch: master
Review: https://review.opendev.org/698772

OpenStack Infra (hudson-openstack) wrote : Fix merged to stx-puppet (master)

Reviewed: https://review.opendev.org/698769
Committed: https://git.openstack.org/cgit/starlingx/stx-puppet/commit/?id=cb50d8046130992c1fe7c602b6bbdd9279d83a49
Submitter: Zuul
Branch: master

commit cb50d8046130992c1fe7c602b6bbdd9279d83a49
Author: Jim Gauld <email address hidden>
Date: Thu Dec 12 12:07:36 2019 -0500

    AIO System Controller CPU assignment changes

    This changes AIO running DC system controller CPU assignment so that
    all logical cpus spanning all numa nodes are configured as Platform
    function.

    This update scopes the change to controller personality.

    Change-Id: I9eeac00f92948d9de2edea450754047c0d9a0ce2
    Partial-Bug: 1855920
    Signed-off-by: Jim Gauld <email address hidden>

OpenStack Infra (hudson-openstack) wrote : Fix merged to config (master)

Reviewed: https://review.opendev.org/698772
Committed: https://git.openstack.org/cgit/starlingx/config/commit/?id=1ae7898020390d91be65a6d3bba66ae38faf1692
Submitter: Zuul
Branch: master

commit 1ae7898020390d91be65a6d3bba66ae38faf1692
Author: Jim Gauld <email address hidden>
Date: Thu Dec 12 12:16:18 2019 -0500

    AIO System Controller CPU assignment changes

    This changes AIO running DC system controller CPU assignment so that
    all logical cpus spanning all numa nodes are configured as Platform
    function.

    This update scopes the change to controller personality.

    Closes-Bug: 1855920
    Depends-On: https://review.opendev.org/#/c/698769/
    Signed-off-by: Jim Gauld <email address hidden>

    Change-Id: I4e55da3049a0e62d82a08e0181a2d50b6ec3a3ce

Changed in starlingx:
status: In Progress → Fix Released
Ghada Khalil (gkhalil) wrote :

@Jim Gauld, please cherry-pick the changes to the r/stx.3.0 branch for inclusion in the first maintenance release.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to config (r/stx.3.0)

Fix proposed to branch: r/stx.3.0
Review: https://review.opendev.org/700448

OpenStack Infra (hudson-openstack) wrote : Fix proposed to stx-puppet (r/stx.3.0)

Fix proposed to branch: r/stx.3.0
Review: https://review.opendev.org/700450

OpenStack Infra (hudson-openstack) wrote : Fix merged to stx-puppet (r/stx.3.0)

Reviewed: https://review.opendev.org/700450
Committed: https://git.openstack.org/cgit/starlingx/stx-puppet/commit/?id=f5d06c0bd3eb1d98f9ea42e128b4ca0169d420b2
Submitter: Zuul
Branch: r/stx.3.0

commit f5d06c0bd3eb1d98f9ea42e128b4ca0169d420b2
Author: Jim Gauld <email address hidden>
Date: Tue Dec 10 15:24:43 2019 -0500

    AIO System Controller CPU manager changes for kubelet

    This changes Puppet manifest for kubernetes. kubelet options for AIO
    system controller are made to match a standard controller. The CPU
    manager policy is configured as 'none' with no reserved cpus.

    Change-Id: I10d41cd9974d0a9255f634f3b3fc09212774bdec
    Partial-Bug: 1855920
    Signed-off-by: Jim Gauld <email address hidden>
    (cherry picked from commit 615689aca8e7c003f417e96d79f6104684b5826b)

OpenStack Infra (hudson-openstack) wrote : Fix merged to config (r/stx.3.0)

Reviewed: https://review.opendev.org/700448
Committed: https://git.openstack.org/cgit/starlingx/config/commit/?id=be92ebe6aa992a2cf01ceb5b60a799e395e2a9cb
Submitter: Zuul
Branch: r/stx.3.0

commit be92ebe6aa992a2cf01ceb5b60a799e395e2a9cb
Author: Jim Gauld <email address hidden>
Date: Tue Dec 10 15:45:50 2019 -0500

    AIO System Controller CPU assignment changes

    This changes AIO running DC system controller CPU assignment so that
    all logical cpus spanning all numa nodes are configured as Platform
    function.

    Change-Id: I6be3c8f63661786b193b0ed7a72781a7c48808cb
    Closes-Bug: 1855920
    Depends-On: https://review.opendev.org/#/c/700450/
    Signed-off-by: Jim Gauld <email address hidden>
    (cherry picked from commit 539c29d717402efd44b634d653a4ff75c5b9b8fa)

OpenStack Infra (hudson-openstack) wrote : Fix proposed to stx-puppet (r/stx.3.0)

Fix proposed to branch: r/stx.3.0
Review: https://review.opendev.org/702267

OpenStack Infra (hudson-openstack) wrote : Fix proposed to config (r/stx.3.0)

Fix proposed to branch: r/stx.3.0
Review: https://review.opendev.org/702268

OpenStack Infra (hudson-openstack) wrote : Fix merged to stx-puppet (r/stx.3.0)

Reviewed: https://review.opendev.org/702267
Committed: https://git.openstack.org/cgit/starlingx/stx-puppet/commit/?id=b3fbee154fe99ca4bd8612d15367e875a710c24e
Submitter: Zuul
Branch: r/stx.3.0

commit b3fbee154fe99ca4bd8612d15367e875a710c24e
Author: Jim Gauld <email address hidden>
Date: Thu Dec 12 12:07:36 2019 -0500

    AIO System Controller CPU assignment changes

    This changes AIO running DC system controller CPU assignment so that
    all logical cpus spanning all numa nodes are configured as Platform
    function.

    This update scopes the change to controller personality.

    Change-Id: I9eeac00f92948d9de2edea450754047c0d9a0ce2
    Partial-Bug: 1855920
    Signed-off-by: Jim Gauld <email address hidden>
    (cherry picked from commit cb50d8046130992c1fe7c602b6bbdd9279d83a49)

OpenStack Infra (hudson-openstack) wrote : Fix merged to config (r/stx.3.0)

Reviewed: https://review.opendev.org/702268
Committed: https://git.openstack.org/cgit/starlingx/config/commit/?id=62f336947d28526a9b8d5d960547b6a4dac5ced4
Submitter: Zuul
Branch: r/stx.3.0

commit 62f336947d28526a9b8d5d960547b6a4dac5ced4
Author: Jim Gauld <email address hidden>
Date: Thu Dec 12 12:16:18 2019 -0500

    AIO System Controller CPU assignment changes

    This changes AIO running DC system controller CPU assignment so that
    all logical cpus spanning all numa nodes are configured as Platform
    function.

    This update scopes the change to controller personality.

    Closes-Bug: 1855920
    Depends-On: https://review.opendev.org/#/c/702267/
    Signed-off-by: Jim Gauld <email address hidden>

    Change-Id: I4e55da3049a0e62d82a08e0181a2d50b6ec3a3ce
    (cherry picked from commit 1ae7898020390d91be65a6d3bba66ae38faf1692)

Ghada Khalil (gkhalil)
tags: added: in-r-stx30