Juju incorrectly thinks K8s unit still exists
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Canonical Juju | Expired | High | Unassigned | | |
Bug Description
An example can be found here:
https:/
Notice that in the `Deploy Kubeflow` step, these messages get repeated until `juju-wait` times out:
DEBUG:
DEBUG:
Then under the `Debug failures` step, `juju status` shows this output:
dex-auth/2 terminated failed 10.1.89.52 5556/TCP unit stopped by the cloud
dex-auth/4* active idle 10.1.89.54 5556/TCP
Meanwhile, listing all pods in microk8s shows only this dex-auth pod:
kubeflow dex-auth-
So Juju seems to think that dex-auth/2 is still around, even though it isn't.
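The mismatch described above can be checked mechanically. The following is a minimal sketch, not part of Juju: `find_stale_units` is a hypothetical helper that compares the unit addresses reported by `juju status` against the IPs of pods that actually exist. The unit addresses are copied from the status output above. The pod IP set is an assumption, because the pod listing in this report is truncated; it assumes the one remaining pod is the one backing dex-auth/4.

```python
def find_stale_units(juju_units, pod_ips):
    """Return the names of units whose address matches no existing pod.

    juju_units: dict mapping unit name -> address as shown by `juju status`.
    pod_ips:    set of IP addresses of pods that currently exist in the cluster.
    """
    return sorted(
        name for name, address in juju_units.items() if address not in pod_ips
    )


# Unit addresses taken from the `juju status` output in this report.
juju_units = {
    "dex-auth/2": "10.1.89.52",
    "dex-auth/4": "10.1.89.54",
}

# Assumption: the pod listing is truncated in this report, so this set is
# illustrative. It assumes the single remaining pod belongs to dex-auth/4.
pod_ips = {"10.1.89.54"}

print(find_stale_units(juju_units, pod_ips))  # ['dex-auth/2']
```

Matching on IP address is only one possible key. It is used here because the address is the one identifier visible in both outputs above.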
Changed in juju: | |
milestone: | none → 2.8.5 |
status: | New → Triaged |
importance: | Undecided → High |
Changed in juju: | |
assignee: | nobody → Thomas Miller (tlmiller) |
status: | Triaged → In Progress |
Changed in juju: | |
milestone: | 2.8.5 → 2.8.6 |
Changed in juju: | |
milestone: | 2.8.6 → 2.8.7 |
Changed in juju: | |
milestone: | 2.8.7 → none |
Hey Ken,
I have been looking into the bug. I have tracked it down to a reconciliation problem in Juju: it compares how many units were around with how many it should have, and then puts the extra units into this removed state, which requires manual cleanup. I am working with Ian now to figure out what the correct logic should be, and will then submit the PR.
Cheers
Tom