I've ended up with a failed k8s workload charm unit in a different model from the one in LP:1882146. However, there are differences.
1) The error is:
application-mattermost: 2020-06-08 21:37:32 DEBUG juju.worker.leadership mattermost/15 waiting for mattermost leadership release gave err: error blocking on leadership release: connection is shut down
application-mattermost: 2020-06-08 21:37:32 DEBUG juju.worker.caasoperator killing "mattermost/15"
application-mattermost: 2020-06-08 21:37:32 INFO juju.worker.caasoperator stopped "mattermost/15", err: leadership failure: error making a leadership claim: connection is shut down
application-mattermost: 2020-06-08 21:37:32 DEBUG juju.worker.caasoperator "mattermost/15" done: leadership failure: error making a leadership claim: connection is shut down
application-mattermost: 2020-06-08 21:37:32 ERROR juju.worker.caasoperator exited "mattermost/15": leadership failure: error making a leadership claim: connection is shut down
application-mattermost: 2020-06-08 21:37:32 DEBUG juju.worker.caasoperator no restart, removing "mattermost/15" from known workers
application-mattermost: 2020-06-08 21:37:40 DEBUG juju.worker.uniter starting uniter for "mattermost/15"
application-mattermost: 2020-06-08 21:37:40 DEBUG juju.worker.caasoperator start "mattermost/15"
application-mattermost: 2020-06-08 21:37:40 INFO juju.worker.caasoperator start "mattermost/15"
application-mattermost: 2020-06-08 21:37:40 DEBUG juju.worker.caasoperator "mattermost/15" started
application-mattermost: 2020-06-08 21:37:40 DEBUG juju.worker.leadership mattermost/15 making initial claim for mattermost leadership
application-mattermost: 2020-06-08 21:37:40 INFO juju.worker.uniter unit "mattermost/15" started
application-mattermost: 2020-06-08 21:37:50 INFO juju.worker.leadership mattermost leadership for mattermost/15 denied
application-mattermost: 2020-06-08 21:37:50 DEBUG juju.worker.leadership mattermost/15 is not mattermost leader
application-mattermost: 2020-06-08 21:37:50 DEBUG juju.worker.leadership mattermost/15 waiting for mattermost leadership release
application-mattermost: 2020-06-08 21:37:51 INFO juju.worker.uniter unit "mattermost/15" shutting down: open /var/lib/juju/agents/unit-mattermost-15/charm/metadata.yaml: no such file or directory
application-mattermost: 2020-06-08 21:37:51 DEBUG juju.worker.uniter.remotestate got leadership change for mattermost/15: leader
application-mattermost: 2020-06-08 21:37:51 INFO juju.worker.caasoperator stopped "mattermost/15", err: open /var/lib/juju/agents/unit-mattermost-15/charm/metadata.yaml: no such file or directory
application-mattermost: 2020-06-08 21:37:51 DEBUG juju.worker.caasoperator "mattermost/15" done: open /var/lib/juju/agents/unit-mattermost-15/charm/metadata.yaml: no such file or directory
application-mattermost: 2020-06-08 21:37:51 ERROR juju.worker.caasoperator exited "mattermost/15": open /var/lib/juju/agents/unit-mattermost-15/charm/metadata.yaml: no such file or directory
application-mattermost: 2020-06-08 21:37:51 INFO juju.worker.caasoperator restarting "mattermost/15" in 3s
application-mattermost: 2020-06-08 21:37:54 INFO juju.worker.caasoperator start "mattermost/15"
application-mattermost: 2020-06-08 21:37:54 DEBUG juju.worker.caasoperator "mattermost/15" started
2) Restarting the controller does not fix the problem.
I thought I'd try copying the charm back into the unit directories on the modeloperator unit, to see what would happen. This triggered a panic in one of the units, and put the other two into a state in which bouncing the controller did remove them.
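For reference, the copy-back step was roughly the following. This is a sketch only: the namespace, pod name, and local charm directory below are hypothetical placeholders, and the agent path is taken from the error in the log above; adjust all of them to your deployment before running anything.

```shell
#!/bin/sh
# Hypothetical names -- substitute your own model namespace, operator pod,
# and a local copy of the charm.
MODEL_NS="my-model"                                   # k8s namespace backing the Juju model (assumption)
OPERATOR_POD="mattermost-operator-0"                  # operator pod name (assumption)
UNIT_CHARM_DIR="/var/lib/juju/agents/unit-mattermost-15/charm"  # path from the log

# Copy the local charm tree back into the unit's charm directory inside the pod.
kubectl cp ./mattermost-charm "${MODEL_NS}/${OPERATOR_POD}:${UNIT_CHARM_DIR}"
```

As noted above, doing this panicked one unit and left the others in a state where a controller bounce cleaned them up, so treat it as a last-resort workaround rather than a supported procedure.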
So at least we have a workaround, although since this model is hosting a soon-to-be-production service, it'd be nice not to have to rely on it.