high CPU load, controller barely responding
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
Canonical Juju | Incomplete | High | Joseph Phillips | |
Bug Description
On an HA-enabled Juju controller running 2.8.9, one controller unit suddenly shows high CPU usage and the controller barely responds, blocking any useful work.
The logs show a lot of:
2021-07-02 12:33:37 ERROR juju.worker.
2021-07-02 12:33:37 ERROR juju.worker.
2021-07-02 12:33:37 ERROR juju.worker.
and:
2021-07-02 12:37:33 ERROR juju.worker.
2021-07-02 12:37:33 ERROR juju.worker.
2021-07-02 12:37:34 ERROR juju.worker.
A restart of the unit with high CPU usage solves the issue.
@jameinel, I left logs on mombin for your attention.
tags: | added: canonical-is-upgrades |
Changed in juju: | |
milestone: | none → 2.9-next |
Changed in juju: | |
status: | Triaged → In Progress |
assignee: | nobody → Simon Richardson (simonrichardson) |
Changed in juju: | |
assignee: | Simon Richardson (simonrichardson) → Harry Pidcock (hpidcock) |
assignee: | Harry Pidcock (hpidcock) → Joseph Phillips (manadart) |
Other symptoms we saw during that time were a significant growth in the number of goroutines and a lot of API calls to Leadership (which could just be worker routines that run hooks being restarted and trying to re-establish their state).
juju_engine_report was failing to return.
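For triage, it may help to see how the ERROR bursts quoted in the description are distributed over time. The following is a minimal sketch, not part of Juju: it assumes log lines have the shape `YYYY-MM-DD HH:MM:SS LEVEL logger message`, as in the excerpts above, and counts ERROR lines per minute and logger. The function name `tally_errors` is hypothetical, and the sample lines are the truncated excerpts from this report, reused verbatim.

```python
import re
from collections import Counter

# Assumed line shape, inferred from the excerpts in this report:
#   "YYYY-MM-DD HH:MM:SS LEVEL logger.name message..."
# Group 1 keeps the timestamp down to the minute, group 2 is the level,
# group 3 is the logger name (whatever non-space text follows the level).
LINE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}):\d{2} (\w+) (\S+)")


def tally_errors(lines):
    """Count ERROR lines per (minute, logger) pair; skip anything else."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match and match.group(2) == "ERROR":
            minute, _level, logger = match.groups()
            counts[(minute, logger)] += 1
    return counts


if __name__ == "__main__":
    # Truncated lines copied from the bug description.
    sample = [
        "2021-07-02 12:33:37 ERROR juju.worker.",
        "2021-07-02 12:33:37 ERROR juju.worker.",
        "2021-07-02 12:33:37 ERROR juju.worker.",
        "2021-07-02 12:37:33 ERROR juju.worker.",
        "2021-07-02 12:37:33 ERROR juju.worker.",
        "2021-07-02 12:37:34 ERROR juju.worker.",
    ]
    for (minute, logger), count in sorted(tally_errors(sample).items()):
        print(minute, logger, count)
```

With complete log lines, the logger name would identify which worker is producing the errors. Here the names are truncated in the report, so the tally only shows the timing of the bursts.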