Similarly, in our case we have a k8s charm deployed against the same controller, and we saw the number of ready pods in the StatefulSet drop to 0 when the Juju controller primary was changed.
Looking at our Grafana dashboards, the number of ready pods dropped almost to the minute that `rs.stepDown(120)` was run on the primary, and then started coming back up 4-5 minutes later. In this case, though, the pods didn't enter CrashLoopBackOff.
Here is the output from `juju debug-log --replay` from around that time. I've removed the controller IPs, but out of an abundance of caution it's a Canonical-only pastebin: https://pastebin.canonical.com/p/hQC53MqGr3/
And finally, the logs from the charm-init container aren't particularly helpful:
$ kubectl logs <unit-0> -n <namespace> -c charm-init
starting containeragent init command
This behaviour is identical to what we see when the Juju controllers are restarted, as mentioned in https://bugs.launchpad.net/juju/+bug/2036594