Very high platform CPU usage on AIO-DX active controller with stx-openstack installed

Bug #1837426 reported by Bart Wensley
Affects: StarlingX
Status: Fix Released
Importance: High
Assigned to: Gerry Kopec

Bug Description

Brief Description
-----------------
After installing an AIO-DX (two node) system (including stx-openstack application), the platform CPU usage on the active controller is very high (90-100%) resulting in major/critical CPU alarms. This is in steady state with no nova instances launched and no other activity.
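
For reference, a quick way to confirm the alarms and the per-core load on the platform cores (cpu0/cpu1 on this configuration) is shown below; the 100.101 alarm ID is the usual StarlingX platform CPU alarm and sar requires the sysstat package, so treat both as assumptions here:

    fm alarm-list | grep 100.101    # platform CPU threshold crossed alarms
    sar -P 0,1 1 5                  # per-core utilization of the two platform cores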

Severity
--------
Critical: system shows critical CPU alarms

Steps to Reproduce
------------------
Install an AIO-DX system

Expected Behavior
------------------
Platform CPUs should not be running at critical levels

Actual Behavior
----------------
Platform CPUs are running at critical levels

Reproducibility
---------------
Reproducible. May be dependent on the hardware being used.

System Configuration
--------------------
AIO-DX (two node) system with the following CPU:

[root@controller-1 ~(keystone_admin)]# lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 36
On-line CPU(s) list: 0-35
Thread(s) per core: 1
Core(s) per socket: 18
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 85
Model name: Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz
Stepping: 4
CPU MHz: 2300.000
BogoMIPS: 4600.00
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 1024K
L3 cache: 25344K
NUMA node0 CPU(s): 0-17
NUMA node1 CPU(s): 18-35
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb cat_l3 cdp_l3 intel_ppin intel_pt ssbd mba ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm mpx rdt_a avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm arat pln pts pku ospke md_clear spec_ctrl intel_stibp flush_l1d

Branch/Pull Time/Commit
-----------------------
Designer load:
BUILD_DATE="2019-07-19 09:53:25 -0500"

Last Pass
---------
I did not see the major/critical CPU alarms (at least to this degree) in the same lab with a load built on July 16th.

Timestamp/Logs
--------------
The collect logs and schedtop output will be attached.

Test Activity
-------------
Developer testing

Revision history for this message
Ghada Khalil (gkhalil) wrote :

Marking as stx.2.0 - All-in-one systems reporting critical alarms

Changed in starlingx:
importance: Undecided → High
status: New → Triaged
tags: added: stx.2.0 stx.containers
Changed in starlingx:
assignee: nobody → Al Bailey (albailey1974)
Revision history for this message
Al Bailey (albailey1974) wrote :

The investigation has shown several things:

- High number of nginx threads (fixed through another launchpad)
- High number of rabbitmq threads (fixed through other launchpads)
- High number of radosgw threads (fixed by making radosgw optional)

At the moment there are no specific high-runner processes.

However, there are many short-lived processes, and the load average and occupancy are both high on the two platform CPUs (0 and 1).

There are also many short-lived processes related to Kubernetes metrics (these cannot be disabled) and to the OCF scripts (these could perhaps be improved, but that would likely not have much impact).
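
A rough way to see both the occupancy and the per-process thread counts referred to above (the detailed per-task breakdown for this bug came from schedtop, whose output is attached):

    uptime                                               # load average on the controller
    ps -eo pcpu,pid,nlwp,comm --sort=-pcpu | head -15    # top CPU consumers and their thread counts (NLWP)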

Gerry experimented with disabling the readiness/liveness probes for the OpenStack components and the load dropped significantly. It appears that the rabbitmq probes are the most expensive of these.

For bare metal, the OCF script for rabbitmq runs every 20 seconds, but containerized probes run two equivalent rabbit status commands every 10 seconds.
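
A sketch of how the probe commands and their 10-second period can be confirmed on a running system; the openstack namespace and the rabbitmq pod-name pattern are assumptions based on a typical stx-openstack deployment:

    POD=$(kubectl -n openstack get pods | awk '/rabbitmq-rabbitmq/{print $1; exit}')
    kubectl -n openstack describe pod "$POD" | grep -E 'Liveness:|Readiness:'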

Frank Miller (sensfan22)
Changed in starlingx:
assignee: Al Bailey (albailey1974) → Gerry Kopec (gerry-kopec)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to upstream (master)

Fix proposed to branch: master
Review: https://review.opendev.org/677041

Changed in starlingx:
status: Triaged → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to config (master)

Fix proposed to branch: master
Review: https://review.opendev.org/677043

Revision history for this message
Gerry Kopec (gerry-kopec) wrote :

Implemented changes to increase the rabbitmq pod probe period from 10s to 30s per the reviews in comments 6 & 7. These should recover about 20% CPU (out of 200%) across the two platform CPUs.

Other suggested areas of investigation:
- Increase period of cinder-volume-usage-audit and heat-engine-cleaner. These currently run every 5 minutes.
- Saw number of erlang beam.smp threads in openstack rabbitmq container drop from 631 after initial install to 151 after subsequent application remove/apply. This corresponded with a decrease in cpu usage for those threads. Commit https://review.opendev.org/#/c/676035/ may address this but that should be confirmed.
- Saw the CPU usage of the kubelet process slowly increasing over time (10% to 14% of cpu0/cpu1 over a week); see the spot-check commands below.
- Subsequent application-applies may fail because the nova-db-sync job fails under system overload and then cannot create tables on subsequent retries as they already exist; the nova databases have to be dropped to recover. Compute-kit (libvirt, nova, nova-api-proxy, neutron, placement) startup could be smoothed out by not running all the charts in parallel.
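
A rough way to spot-check two of the items above (the erlang VM thread count and kubelet CPU usage); the process names are as observed on this kind of system, not guaranteed:

    ps -C beam.smp -o pid,nlwp,comm             # NLWP = thread count of each erlang VM (beam.smp)
    ps -C kubelet -o pid,etime,time,pcpu,comm   # accumulated CPU time and current %CPU of kubelet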

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to upstream (master)

Reviewed: https://review.opendev.org/677041
Committed: https://git.openstack.org/cgit/starlingx/upstream/commit/?id=f84470ad0807d9e7582c5745d2ad96eee4058f17
Submitter: Zuul
Branch: master

commit f84470ad0807d9e7582c5745d2ad96eee4058f17
Author: Gerry Kopec <email address hidden>
Date: Fri Aug 16 14:51:07 2019 -0400

    Update rabbitmq chart to enable probe overrides

    Add variables for initial delay, period and timeout for rabbitmq
    liveness and readiness probes. Default to current upstream settings.

    Do not recommend this for upstreaming to openstack-helm-infra as
    enhancements have been added since the last starlingx rebase to enable
    more generic override of probes. On next rebase of starlingx on
    openstack-helm-infra, recommend refactoring this change based on these
    upstream commits (assuming upstream hasn't done it already):
    https://review.opendev.org/#/c/668710/
    https://review.opendev.org/#/c/631597/

    Partial-Bug: 1837426
    Change-Id: I0a8d8f466c4b8482cc9161d28de37bff6fc7ced3
    Signed-off-by: Gerry Kopec <email address hidden>

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to config (master)

Reviewed: https://review.opendev.org/677043
Committed: https://git.openstack.org/cgit/starlingx/config/commit/?id=bc62558b3767112520b6578cc312bfe17f807b93
Submitter: Zuul
Branch: master

commit bc62558b3767112520b6578cc312bfe17f807b93
Author: Gerry Kopec <email address hidden>
Date: Fri Aug 16 15:55:54 2019 -0400

    Increase rabbitmq pod probe period from 10 to 30s

    To reduce cpu usage on platform cores (especially on AIO), reduce the
    frequency of the rabbitmq readiness and liveness probes from every 10s
    to 30s. These probes both run the command "rabbitmqctl status" which
    seems to have significant cpu impact.

    For reference, the platform rabbitmq process status check runs every
    20s.

    Partial-Bug: 1837426
    Depends-On: https://review.opendev.org/#/c/677041
    Change-Id: Ie8eea35b9ed268f4156d1cdc884a6d5004e87018
    Signed-off-by: Gerry Kopec <email address hidden>
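
As a rough gauge of how expensive each probe invocation is, the probe command can be timed directly against the running pod (the namespace and pod-name pattern below are assumptions):

    POD=$(kubectl -n openstack get pods | awk '/rabbitmq-rabbitmq/{print $1; exit}')
    time kubectl -n openstack exec "$POD" -- rabbitmqctl status > /dev/null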

Revision history for this message
Frank Miller (sensfan22) wrote :

Marking LP as Fix Released. The platform CPU usage on the active controller of an idle AIO-DX with stx-openstack applied has been reduced from 90-100% to under 80%. This was addressed by a series of commits including:
https://review.opendev.org/677043
https://review.opendev.org/677041
https://review.opendev.org/#/c/676035/
https://review.opendev.org/#/c/673218/

Changed in starlingx:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to upstream (r/stx.2.0)

Fix proposed to branch: r/stx.2.0
Review: https://review.opendev.org/678052

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to config (r/stx.2.0)

Fix proposed to branch: r/stx.2.0
Review: https://review.opendev.org/678054

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to upstream (r/stx.2.0)

Reviewed: https://review.opendev.org/678052
Committed: https://git.openstack.org/cgit/starlingx/upstream/commit/?id=138a38449a44be9c81ca13a5d2699f8a8b0fbbfb
Submitter: Zuul
Branch: r/stx.2.0

commit 138a38449a44be9c81ca13a5d2699f8a8b0fbbfb
Author: Gerry Kopec <email address hidden>
Date: Fri Aug 16 14:51:07 2019 -0400

    Update rabbitmq chart to enable probe overrides

    Add variables for initial delay, period and timeout for rabbitmq
    liveness and readiness probes. Default to current upstream settings.

    Do not recommend this for upstreaming to openstack-helm-infra as
    enhancements have been added since the last starlingx rebase to enable
    more generic override of probes. On next rebase of starlingx on
    openstack-helm-infra, recommend refactoring this change based on these
    upstream commits (assuming upstream hasn't done it already):
    https://review.opendev.org/#/c/668710/
    https://review.opendev.org/#/c/631597/

    Partial-Bug: 1837426
    Change-Id: I0a8d8f466c4b8482cc9161d28de37bff6fc7ced3
    Signed-off-by: Gerry Kopec <email address hidden>
    (cherry picked from commit f84470ad0807d9e7582c5745d2ad96eee4058f17)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to config (r/stx.2.0)

Reviewed: https://review.opendev.org/678054
Committed: https://git.openstack.org/cgit/starlingx/config/commit/?id=1e4df889928466fc61b8330ada1eade1bb872acb
Submitter: Zuul
Branch: r/stx.2.0

commit 1e4df889928466fc61b8330ada1eade1bb872acb
Author: Gerry Kopec <email address hidden>
Date: Fri Aug 16 15:55:54 2019 -0400

    Increase rabbitmq pod probe period from 10 to 30s

    To reduce cpu usage on platform cores (especially on AIO), reduce the
    frequency of the rabbitmq readiness and liveness probes from every 10s
    to 30s. These probes both run the command "rabbitmqctl status" which
    seems to have significant cpu impact.

    For reference, the platform rabbitmq process status check runs every
    20s.

    Partial-Bug: 1837426
    Depends-On: https://review.opendev.org/#/c/678052
    Change-Id: Ie8eea35b9ed268f4156d1cdc884a6d5004e87018
    Signed-off-by: Gerry Kopec <email address hidden>
    (cherry picked from commit bc62558b3767112520b6578cc312bfe17f807b93)

Ghada Khalil (gkhalil)
tags: added: in-r-stx20