inconsistent message on unit teardown

Bug #1979292 reported by Pietro Pasotti
This bug affects 1 person
Affects: Canonical Juju
Status: Fix Released
Importance: Medium
Assigned to: Nicolas Vinuesa
Milestone: 2.9.42

Bug Description

After running `juju remove-application`, the unit status shows 'agent lost' and the message tells me to check `juju show-status-log`, but when I do that I get the error "no status history available".

To reproduce: (on microk8s)

juju deploy something
juju remove-application something
(wait for unknown/lost)
juju show-status-log something
(get error)
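
For anyone trying to reproduce this, the "(wait for unknown/lost)" step can be scripted. This is only a minimal sketch, assuming `juju status <unit>` prints the "agent lost" text while the unit is in that state. `JUJU`, `POLL_INTERVAL` and `wait_for_agent_lost` are made-up names for illustration and are not part of Juju.

```shell
# Sketch only: JUJU, POLL_INTERVAL and wait_for_agent_lost are hypothetical names.
# JUJU is the command used to talk to the controller (override it for testing).
JUJU="${JUJU:-juju}"

# Poll `juju status <unit>` until its output mentions "agent lost".
# Returns 0 as soon as the text is seen, 1 if it never appears within <tries> polls.
wait_for_agent_lost() {
    unit="$1"
    tries="${2:-30}"
    i=0
    while [ "$i" -lt "$tries" ]; do
        # Assumption: the lost-agent message contains the words "agent lost".
        if $JUJU status "$unit" 2>/dev/null | grep -q "agent lost"; then
            return 0
        fi
        i=$((i + 1))
        sleep "${POLL_INTERVAL:-2}"
    done
    return 1
}
```

It could be used as `wait_for_agent_lost something/0 && juju show-status-log something/0`, assuming the unit is numbered 0. Note the unit name rather than the application name in the last command.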

Changed in juju:
importance: Undecided → Medium
milestone: none → 3.0.0
status: New → Triaged
Changed in juju:
milestone: 3.0.0 → 3.0.1
Changed in juju:
milestone: 3.0.1 → 3.0.2
Changed in juju:
milestone: 3.0.2 → 3.0.3
Changed in juju:
assignee: nobody → Nicolas Vinuesa (nvinuesa)
Nicolas Vinuesa (nvinuesa) wrote :

I've been trying to reproduce this bug with no success.

For the microk8s setup I followed https://juju.is/docs/olm/get-started-with-juju (using the microk8s sections) and only deployed `postgresql-k8s`.
After waiting for the app to start up and reach a healthy state, I removed the application and waited until it was fully removed.

```
ubuntu@tutorial-vm:~$ juju bootstrap microk8s tutorial-controller

Creating Juju controller "tutorial-controller" on microk8s/localhost
Bootstrap to Kubernetes cluster identified as microk8s/localhost
Creating k8s resources for controller "controller-tutorial-controller"
Downloading images
Starting controller pod
Bootstrap agent now started
Contacting Juju controller at 10.152.183.27 to verify accessibility...

Bootstrap complete, controller "tutorial-controller" is now available in namespace "controller-tutorial-controller"

Now you can run
 juju add-model <model-name>
to create a new model to deploy k8s workloads.
ubuntu@tutorial-vm:~$ juju add-model tutorial-model
Added 'tutorial-model' model on microk8s/localhost with credential 'microk8s' for user 'admin'
ubuntu@tutorial-vm:~$ juju deploy postgresql-k8s
Located charm "postgresql-k8s" in charm-hub, revision 20
Deploying "postgresql-k8s" from charm-hub charm "postgresql-k8s", revision 20 in channel stable on ubuntu@20.04/stable
ubuntu@tutorial-vm:~$ juju remove-application postgresql-k8s
removing application postgresql-k8s
- will detach storage logs/0
- will detach storage pgdata/1
ubuntu@tutorial-vm:~$ juju show-status-log postgresql-k8s
ERROR "postgresql-k8s" is not a valid name for a unit
```

Since the app was removed cleanly, the unit status never showed anything unusual, so of course `juju show-status-log postgresql-k8s` failed with `ERROR "postgresql-k8s" is not a valid name for a unit`.

I tried reproducing it with both Juju 2.9 and 3.0, on microk8s and on a localhost LXD cloud.

Am I missing something? What was the exact setup (juju version, microk8s version, application + version) used to reproduce this bug?

Changed in juju:
status: Triaged → Incomplete
Jordan Barrett (barrettj12) wrote :

YESSS, I run into this all the time. It's not easy to reproduce but I notice it often when deploying COS Lite to MicroK8s. @nvinuesa maybe try deploying COS Lite or its components (traefik, grafana, loki, etc) to see if you get this.

@nvinuesa I'm also not sure how long you waited between
    $ juju remove-application postgresql-k8s
    $ juju show-status-log postgresql-k8s
Often with k8s applications, the agent shuts down quickly but k8s takes a long time to remove the pod. That intermediate period is where you'll encounter this issue.
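
One way to tell whether you are inside that intermediate period is to ask Kubernetes directly whether the pod is still terminating. A rough sketch, with several assumptions: that the model's namespace has the same name as the model, that the pod is named `<application>-<unit number>`, and that `microk8s kubectl` is how you reach the cluster. `KUBECTL` and `pod_is_terminating` are hypothetical names used only here.

```shell
# Sketch only: KUBECTL and pod_is_terminating are hypothetical names.
# Left unquoted on purpose below so that "microk8s kubectl" splits into two words.
KUBECTL="${KUBECTL:-microk8s kubectl}"

# Succeeds (returns 0) if the pod exists and has a deletionTimestamp set,
# i.e. Kubernetes has been asked to delete it but has not finished yet.
pod_is_terminating() {
    namespace="$1"
    pod="$2"
    ts="$($KUBECTL get pod "$pod" -n "$namespace" \
        -o jsonpath='{.metadata.deletionTimestamp}' 2>/dev/null)"
    [ -n "$ts" ]
}
```

For the session above this might be called as `pod_is_terminating tutorial-model postgresql-k8s-0`, again assuming those namespace and pod names.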

I guess since we've intentionally called remove-application here, the error message
    agent lost: check `juju show-status-log ...`
is not really useful. Maybe it would be better for the controller to record that we have intentionally removed the application (rather than the agent failing), and give a more useful status message like `pod shutting down`. Or, don't even show the unit anymore in `juju status`.
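
To make the suggestion concrete, the decision could look roughly like the sketch below. It is only an illustration of the idea, not how the controller is implemented and not the change that eventually landed. `unit_status_message` is a made-up function, and the life values `alive`, `dying` and `dead` are assumed to be what the controller tracks for the unit.

```shell
# Sketch only: unit_status_message is a hypothetical helper, not Juju code.
# Arguments: life (alive|dying|dead), agent_present (yes|no), unit name.
unit_status_message() {
    life="$1"
    agent_present="$2"
    unit="$3"
    if [ "$agent_present" = "yes" ]; then
        # The agent is still reporting in, so nothing special to say.
        echo "agent running"
    elif [ "$life" = "alive" ]; then
        # Nobody asked for removal, so a missing agent is a real problem.
        echo "agent lost, see 'juju show-status-log $unit'"
    else
        # Removal was requested, so a missing agent is expected.
        echo "pod shutting down"
    fi
}
```

With this, a unit that is being removed on purpose would get `pod shutting down` instead of a pointer to a status log that may not exist anymore.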

Nicolas Vinuesa (nvinuesa) wrote :

Indeed, @barrettj12, I was able to reproduce it with the cos-lite bundle.
See https://github.com/juju/juju/pull/15116 for a patch.

Changed in juju:
status: Incomplete → In Progress
Changed in juju:
milestone: 3.0.3 → 2.9.39
Changed in juju:
status: In Progress → Fix Committed
Changed in juju:
milestone: 2.9.39 → 2.9.42
Changed in juju:
status: Fix Committed → Fix Released