StarlingX

Bug #1837426
Comment #5

Al Bailey (albailey1974) wrote on 2019-08-14:

The investigation has shown several things.

High number of nginx threads (this has been fixed through another launchpad)
High number of rabbitmq threads (this has been fixed through other launchpads)
High number of radosgw threads (this has been fixed by making radosgw optional)

At the moment no single process stands out as a high runner.

However, there are many short-lived processes, and the load average and occupancy are both high on the two platform CPUs (0 and 1).

There are also many short-lived processes related to Kubernetes metrics (these cannot be disabled) and OCF scripts (these can perhaps be improved, but that will likely not have much impact).

Gerry experimented with disabling the readiness/liveness probes for the openstack components, and the load dropped significantly. It appears that the rabbitmq probe is the most expensive of these.

For bare metal, the OCF script for rabbitmq runs every 20 seconds, but the containerized probes run two equivalent rabbitmq status commands every 10 seconds.
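
To put the two schedules side by side, here is a rough back-of-the-envelope sketch in Python. It assumes the bare metal OCF script issues one status command per run (implied by "two equivalent" above, but not measured here), and the helper name invocations_per_minute is made up for illustration.

```python
def invocations_per_minute(commands_per_run: int, period_seconds: float) -> float:
    """Status commands launched per minute for a periodic check."""
    return commands_per_run * 60.0 / period_seconds


# Bare metal: OCF script every 20 seconds.
# Assumption: one rabbitmq status command per run.
bare_metal = invocations_per_minute(commands_per_run=1, period_seconds=20)

# Containerized: two equivalent status commands every 10 seconds.
containerized = invocations_per_minute(commands_per_run=2, period_seconds=10)

ratio = containerized / bare_metal

print(f"bare metal:    {bare_metal:.0f} status commands per minute")
print(f"containerized: {containerized:.0f} status commands per minute")
print(f"ratio:         {ratio:.0f}x")
```

Under that assumption the containerized probes launch about four times as many rabbitmq status commands as the bare metal OCF script (12 per minute versus 3). This only counts invocations; the actual CPU cost of each command would still need to be measured.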