FluxCD platform pods are requesting CPU resources

Bug #1975713 reported by Thiago Paiva Brito
This bug affects 1 person
Affects    Status        Importance  Assigned to         Milestone
StarlingX  Fix Released  Low         Thiago Paiva Brito

Bug Description

Brief Description
-----------------

While running tests in the lab I noticed that the helm-controller and source-controller pods in the flux-helm namespace are requesting CPU resources. Platform pods are supposed to make no CPU resource requests, because such requests distort the reported CPU allocations and reduce the amount of CPU available to customer applications.

Severity
--------

Minor: System/Feature is usable with minor issue (unless the customer wants to use every available CPU)

Note: because customers can easily test this, Matt has mentioned that he views this issue as gating.

Steps to Reproduce
------------------

Boot up the current StarlingX, run "kubectl describe node <nodename>" for either an AIO node or both controller and worker nodes, then look at the "CPU Requests" column in the table for "Non-terminated Pods".
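
The same check can be scripted instead of reading the table by eye. The following is a minimal sketch, not part of the original report: it assumes `kubectl` is on the PATH with cluster access, and the set of "platform" namespaces is a hypothetical list taken from the namespaces that appear in this report. It only handles plain and milli CPU quantities such as `1` or `100m`.

```python
import json
import subprocess

# Assumption: namespaces treated as "platform" for this check; adjust as needed.
PLATFORM_NAMESPACES = {"armada", "cert-manager", "flux-helm", "kube-system"}


def is_zero(quantity):
    """Return True if a CPU quantity is absent or zero (e.g. None, '0', '0m')."""
    if quantity is None:
        return True
    text = str(quantity)
    number = text[:-1] if text.endswith("m") else text
    return float(number) == 0.0


def pods_with_cpu_requests(pod_list, namespaces=PLATFORM_NAMESPACES):
    """Return (namespace, pod, container, cpu) tuples with non-zero CPU requests."""
    found = []
    for pod in pod_list.get("items", []):
        meta = pod["metadata"]
        if meta["namespace"] not in namespaces:
            continue
        for container in pod["spec"].get("containers", []):
            cpu = container.get("resources", {}).get("requests", {}).get("cpu")
            if not is_zero(cpu):
                found.append(
                    (meta["namespace"], meta["name"], container["name"], cpu)
                )
    return found


def fetch_pods():
    """Fetch all pods as a parsed JSON document via kubectl."""
    result = subprocess.run(
        ["kubectl", "get", "pods", "--all-namespaces", "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)
```

On a live system, `pods_with_cpu_requests(fetch_pods())` would be expected to return an empty list once no platform pod requests CPU.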

Expected Behavior
-----------------

There should be no CPU requests for pods in the platform namespaces.

Actual Behavior
-----------------

The helm-controller pod requests 100m (10% of one CPU) and the source-controller pod requests 50m (5% of one CPU).

Non-terminated Pods: (18 in total)
  Namespace     Name                                             CPU Requests  CPU Limits  Memory Requests  Memory Limits  Age
  ---------     ----                                             ------------  ----------  ---------------  -------------  ---
  armada        armada-api-56fc59fd7c-6brgr                      0 (0%)        0 (0%)      0 (0%)           0 (0%)         3d4h
  cert-manager  cm-cert-manager-c9865d867-6mdwg                  0 (0%)        0 (0%)      0 (0%)           0 (0%)         3d7h
  cert-manager  cm-cert-manager-cainjector-d5f589f98-jxskp       0 (0%)        0 (0%)      0 (0%)           0 (0%)         3d7h
  cert-manager  cm-cert-manager-webhook-846559f4c8-k29ps         0 (0%)        0 (0%)      0 (0%)           0 (0%)         3d7h
  flux-helm     helm-controller-64cdcd69c8-jpfpk                 100m (0%)     1 (6%)      64Mi (0%)        1Gi (6%)       3d4h
  flux-helm     source-controller-6d7db457f4-8lv7h               50m (0%)      1 (6%)      64Mi (0%)        1Gi (6%)       3d4h
  kube-system   calico-node-8xzsg                                250m (1%)     0 (0%)      0 (0%)           0 (0%)         3d11h
  kube-system   cephfs-provisioner-bd46f868-lws8l                0 (0%)        0 (0%)      0 (0%)           0 (0%)         3d7h
  kube-system   coredns-5f4fcc5c76-dxwsg                         0 (0%)        0 (0%)      70Mi (0%)        170Mi (1%)     3d7h
  kube-system   ic-nginx-ingress-ingress-nginx-controller-2d6k7  100m (0%)     0 (0%)      90Mi (0%)        0 (0%)         3d7h
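
The percentages in the table are relative to the node's allocatable CPU, which is why a 100m request shows as 0% there even though it is 10% of a single CPU. The sketch below works through that arithmetic. The node size of 16 allocatable CPUs is an assumption chosen only because it is consistent with a 1-CPU limit displaying as 6%; the report does not state the actual value. The truncation to a whole number mirrors how the table appears to round.

```python
def cpu_to_millicores(quantity):
    """Convert a CPU quantity such as '100m' or '1' to integer millicores."""
    text = str(quantity)
    if text.endswith("m"):
        return int(text[:-1])
    return int(round(float(text) * 1000))


def percent_of_one_cpu(quantity):
    """Percentage of a single CPU (1000m == 100%)."""
    return cpu_to_millicores(quantity) / 10


def percent_of_node(quantity, allocatable_cpus):
    """Whole-number percentage of the node's allocatable CPU, truncated."""
    return cpu_to_millicores(quantity) * 100 // (allocatable_cpus * 1000)


ASSUMED_ALLOCATABLE_CPUS = 16  # hypothetical, not stated in the report

for name, request in (("helm-controller", "100m"), ("source-controller", "50m")):
    print(
        name,
        request,
        f"{percent_of_one_cpu(request)}% of one CPU,",
        f"{percent_of_node(request, ASSUMED_ALLOCATABLE_CPUS)}% of the node",
    )
```

Under the same assumption, the 250m request of calico-node works out to 1% of the node and a 1-CPU limit to 6%, matching the table.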

Reproducibility
-----------------

100% Reproducible

System Configuration
--------------------

All

Last Pass
---------

FluxCD is relatively new, so the issue was probably introduced along with it.

Timestamp/Logs
--------------

N/A

Alarms
------

N/A

Test Activity
-------------

Developer testing.

Workaround
----------

The system remains usable; it just cannot allocate as many application CPUs as expected.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to ansible-playbooks (master)
Changed in starlingx:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to ansible-playbooks (master)

Reviewed: https://review.opendev.org/c/starlingx/ansible-playbooks/+/843283
Committed: https://opendev.org/starlingx/ansible-playbooks/commit/b4cb6f05e5b252a04a845190ad0fdbd83e06109d
Submitter: "Zuul (22348)"
Branch: master

commit b4cb6f05e5b252a04a845190ad0fdbd83e06109d
Author: Thiago Brito <email address hidden>
Date: Wed May 25 10:18:59 2022 -0300

    Remove CPU/memory requests for FluxCD deployments

    All platform pods are supposed to have no CPU resource requests as it
    messes up the reported CPU allocations and reduces the amount of CPU
    available to the customer applications. Nonetheless, the deployments
    of FluxCD are requesting resources.

    This change fixes it, leaving the limits as a safeguard so those pods
    don't take too many resources from the system.

    TEST PLAN
    PASS Build CentOS ISO with change and verify that flux-helm resources
         were created without resource requests and are all up and running

    Logs: https://paste.opendev.org/show/bEarxJu6DQ7g6MSfmJ3C/

    Closes-Bug: #1975713

    Signed-off-by: Thiago Brito <email address hidden>
    Change-Id: I31b29649e3011682dc7d9b0fe987d54c7232dc1a
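
The content of the merged change is not reproduced in this report, so the following is only a sketch of the general idea: keep the limits and make the requests zero. It sets the requests explicitly to `0` rather than deleting them, because Kubernetes copies the limit into the request when a container specifies a limit but no request. The function name and the dict-based manifest are illustrative assumptions, not the code from the commit.

```python
import copy


def zero_resource_requests(deployment):
    """Return a copy of a Deployment manifest with CPU/memory requests set to 0.

    Limits are left untouched. Requests are set explicitly instead of being
    removed, since a missing request defaults to the limit when one is set.
    """
    patched = copy.deepcopy(deployment)
    containers = patched["spec"]["template"]["spec"].get("containers", [])
    for container in containers:
        resources = container.setdefault("resources", {})
        resources["requests"] = {"cpu": "0", "memory": "0"}
    return patched


# Hypothetical manifest fragment using the values reported for helm-controller.
helm_controller = {
    "kind": "Deployment",
    "metadata": {"name": "helm-controller", "namespace": "flux-helm"},
    "spec": {"template": {"spec": {"containers": [{
        "name": "manager",
        "resources": {
            "limits": {"cpu": "1", "memory": "1Gi"},
            "requests": {"cpu": "100m", "memory": "64Mi"},
        },
    }]}}},
}

patched = zero_resource_requests(helm_controller)
print(patched["spec"]["template"]["spec"]["containers"][0]["resources"])
```

The input manifest is deep-copied, so the original dictionary is left unchanged and the two can be compared after patching.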

Changed in starlingx:
status: In Progress → Fix Released
Ghada Khalil (gkhalil)
Changed in starlingx:
importance: Undecided → Low
tags: added: stx.7.0 stx.apps
Changed in starlingx:
assignee: nobody → Thiago Paiva Brito (outbrito)