collectd top 10 k8s system process list incorrectly has addon processes

Bug #2009877 reported by Cesar Bombonate
This bug affects 1 person
Affects    Status        Importance  Assigned to      Milestone
StarlingX  Fix Released  Low         Cesar Bombonate

Bug Description

Brief Description
-----------------
The collectd top 10 memory process logs for "Kubernetes System" incorrectly contain processes that should appear only in the "Kubernetes Addon" logs.
collectd separates pods by namespace into Kubernetes System for platform pods and Kubernetes Addon for openstack pods.
While this breakdown is done correctly in the overall platform memory usage log, the top 10 process list for Kubernetes System shows addon processes (e.g. java, autodetect, etc.).
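
As an illustration of the intended behavior, here is a minimal Python sketch of grouping processes by namespace before taking the top N by RSS. It is not the collectd plugin's actual code: the namespace sets, the `classify` and `top_rss_by_group` helpers, and the record layout (`name`, `namespace`, `rss_bytes`) are all assumptions made for this example.

```python
import heapq
from collections import defaultdict

# Hypothetical namespace groupings; the real plugin's lists may differ.
SYSTEM_NAMESPACES = {"kube-system"}
ADDON_NAMESPACES = {"monitor", "openstack"}


def classify(namespace):
    """Map a pod namespace to a log group, or None if it belongs to neither."""
    if namespace in SYSTEM_NAMESPACES:
        return "Kubernetes System"
    if namespace in ADDON_NAMESPACES:
        return "Kubernetes Addon"
    return None


def top_rss_by_group(processes, n=10):
    """Return the n largest (name, rss_bytes) entries for each group.

    Each process is placed in exactly one group, so the resulting
    lists cannot share an entry.
    """
    groups = defaultdict(list)
    for proc in processes:
        group = classify(proc["namespace"])
        if group is not None:
            groups[group].append((proc["name"], proc["rss_bytes"]))
    return {
        group: heapq.nlargest(n, entries, key=lambda entry: entry[1])
        for group, entries in groups.items()
    }


# Made-up sample records, for illustration only.
sample = [
    {"name": "java", "namespace": "monitor", "rss_bytes": 4 * 1024**3},
    {"name": "kube-apiserver", "namespace": "kube-system", "rss_bytes": 800 * 1024**2},
    {"name": "metricbeat", "namespace": "monitor", "rss_bytes": 250 * 1024**2},
]
result = top_rss_by_group(sample)
```

The point of the sketch is that the grouping happens before the ranking, so a process from an addon namespace never enters the Kubernetes System candidates.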

This will hamper debugging of memory usage issues.

Severity
--------
Minor

Steps to Reproduce
------------------
Install a 22.12 build, apply an addon package, and observe the collectd "top 10 memory rss process" logs.

Expected Behavior
------------------
Processes in Kubernetes System and Kubernetes Addon should not overlap

Actual Behavior
----------------
Kubernetes System logs contain addon processes

Reproducibility
---------------
Reproducible

System Configuration
--------------------
Kernel: Real-time (low latency)
Hyperthreading: Disabled
Platform cores: 2
Application cores: 34
Labels: kube-cpu-mgr-policy=static
Huge pages: One 1 GB huge page was configured on each processor

Branch/Pull Time/Commit
-----------------------
StarlingX/Master Dec. 19, 2022

Last Pass
---------
The top 10 memory logs were introduced in 22.12

Timestamp/Logs
--------------
2023-01-09T22:25:32.172 controller-0 collectd[153770]: info The top 10 memory rss processes for the Kubernetes System are :[('java', '36.72 GiB'), ('java', '26.87 GiB'), ('java', '4.25 GiB'), ('java', '2.71 GiB'), ('autodetect', '860.24 MiB'), ('java', '826.97 MiB'), ('kube-apiserver', '801.15 MiB'), ('autodetect', '606.67 MiB'), ('java', '363.57 MiB'), ('metricbeat', '249.55 MiB')]
2023-01-09T22:25:32.172 controller-0 collectd[153770]: info The top 10 memory rss processes Kubernetes Addon are :[('java', '36.70 GiB'), ('java', '26.87 GiB'), ('java', '4.25 GiB'), ('java', '2.71 GiB'), ('autodetect', '860.24 MiB'), ('java', '826.97 MiB'), ('autodetect', '606.67 MiB'), ('java', '363.57 MiB'), ('metricbeat', '251.21 MiB'), ('filebeat', '186.35 MiB')]
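
When triaging logs like these, a quick way to spot the symptom is to parse both lines and compare process names. The following Python sketch does that, assuming the log format shown above. The `parse_top10` and `shared_names` helpers are hypothetical, and the sample lines are abbreviated copies of the two entries above. A shared name is only a hint, since the same binary could legitimately run in both groups.

```python
import ast
import re

# Matches both variants seen in the logs ("... processes for the Kubernetes
# System are :[...]" and "... processes Kubernetes Addon are :[...]").
LOG_RE = re.compile(
    r"top 10 memory rss processes (?:for the )?"
    r"(?P<group>Kubernetes \w+) are :(?P<items>\[.*\])"
)


def parse_top10(line):
    """Return (group, [(name, size), ...]) or None if the line does not match."""
    match = LOG_RE.search(line)
    if match is None:
        return None
    # The list is printed as a Python literal, so literal_eval can read it.
    return match.group("group"), ast.literal_eval(match.group("items"))


def shared_names(first, second):
    """Process names that appear in both top-10 lists."""
    return sorted({name for name, _ in first} & {name for name, _ in second})


# Abbreviated versions of the two log lines in this report.
system_line = (
    "2023-01-09T22:25:32.172 controller-0 collectd[153770]: info The top 10 "
    "memory rss processes for the Kubernetes System are :[('java', '36.72 GiB'), "
    "('kube-apiserver', '801.15 MiB'), ('metricbeat', '249.55 MiB')]"
)
addon_line = (
    "2023-01-09T22:25:32.172 controller-0 collectd[153770]: info The top 10 "
    "memory rss processes Kubernetes Addon are :[('java', '36.70 GiB'), "
    "('metricbeat', '251.21 MiB'), ('filebeat', '186.35 MiB')]"
)

system_group, system_items = parse_top10(system_line)
addon_group, addon_items = parse_top10(addon_line)
overlap = shared_names(system_items, addon_items)
```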

Test Activity
-------------
Performance Testing

Workaround
----------
NA

Changed in starlingx:
status: New → In Progress
summary: - collectd top 10 k8s system process list incorrectly has WRA processes
+ collectd top 10 k8s system process list incorrectly has addon processes
description: updated
OpenStack Infra (hudson-openstack) wrote : Fix merged to monitoring (master)

Reviewed: https://review.opendev.org/c/starlingx/monitoring/+/876778
Committed: https://opendev.org/starlingx/monitoring/commit/d5aa0bf73771186c67506cfa766bd12371f5228e
Submitter: "Zuul (22348)"
Branch: master

commit d5aa0bf73771186c67506cfa766bd12371f5228e
Author: cpompeud <email address hidden>
Date: Tue Mar 7 16:26:31 2023 -0300

    Collectd top 10 k8s system process list incorrectly has k8s addon

    This change corrects the process list so that only
    processes from the kube_system are displayed.

    The list was changed from this:
    2023-01-09T22:25:32.172 controller-0 collectd[153770]: info The top
    10 memory rss processes for the Kubernetes System are :
    [('java', '36.72 GiB')
    , ('java', '26.87 GiB')
    , ('java', '4.25 GiB')
    , ('java', '2.71 GiB')
    , ('autodetect', '860.24 MiB')
    , ('java', '826.97 MiB')
    , ('kube-apiserver', '801.15 MiB')
    , ('autodetect', '606.67 MiB')
    , ('java', '363.57 MiB')
    , ('metricbeat', '249.55 MiB')
    ]

    To this after this fix was implemented.
    2023-03-07T16:40:49.669 controller-0 collectd[65421]: info The top
    10 memory rss processes for the Kubernetes System are :
    [('kube-apiserver', '609.29 MiB')
    , ('kube-controller', '137.29 MiB')
    , ('helm-controller', '93.80 MiB')
    , ('uwsgi', '88.61 MiB')
    , ('uwsgi', '88.60 MiB')
    , ('uwsgi', '88.60 MiB')
    , ('uwsgi', '88.55 MiB')
    , ('cephcsi', '81.06 MiB')
    , ('cephcsi', '80.25 MiB')
    , ('source-controll', '79.47 MiB')
    ]

    Closes-Bug: 2009877

    Test Plan:

    PASS: Build an image, install and bootstrap successfully
    PASS: Apply monitor pods so addon logs would be installed.
    PASS: Ensure only Kubernetes System processes are displayed in the
    top 10 Kubernetes System list.

    Signed-off-by: cpompeud <email address hidden>
    Change-Id: I1361de835003fdaa7f70941f83b9dd79bfe75c60

Changed in starlingx:
status: In Progress → Fix Released
Bruce Jones (bejones)
tags: added: stx.9.0 stx.monitor
Ghada Khalil (gkhalil)
Changed in starlingx:
importance: Undecided → Low
assignee: nobody → Cesar Bombonate (cpompeud)