leftover units when upgrading k8s charm with daemon deployment type

Bug #1864396 reported by Evan Hanson
This bug affects 1 person
    Affects         Status        Importance  Assigned to      Milestone
    Canonical Juju  Fix Released  High        Yang Kelvin Liu

Bug Description

When upgrading a charm with deployment type "daemon", Juju leaves
terminated units behind.

The status below was captured after an upgrade with the
`juju upgrade-charm` command; nothing has been done "behind the scenes"
with `kubectl`:

    Model       Controller         Cloud/Region             Version    SLA          Timestamp
    fluent-bit  catalyst-nz-hlz-1  catalyst-magnum/default  2.8-beta1  unsupported  13:11:03+13:00

    App             Version  Status  Scale  Charm           Store  Rev  OS          Address        Notes
    fluent-bit-k8s           active    4/1  fluent-bit-k8s  local    2  kubernetes  10.254.233.28

    Unit               Workload    Agent  Address       Ports     Message
    fluent-bit-k8s/1*  terminated  idle   192.168.1.28  2020/TCP  unit stopped by the cloud
    fluent-bit-k8s/2   terminated  idle   192.168.2.31  2020/TCP  unit stopped by the cloud
    fluent-bit-k8s/3   terminated  idle   192.168.4.43  2020/TCP  unit stopped by the cloud
    fluent-bit-k8s/4   terminated  idle   192.168.3.51  2020/TCP  unit stopped by the cloud
    fluent-bit-k8s/5   active      idle   192.168.1.29  2020/TCP
    fluent-bit-k8s/6   active      idle   192.168.4.44  2020/TCP
    fluent-bit-k8s/7   active      idle   192.168.2.32  2020/TCP
    fluent-bit-k8s/8   active      idle   192.168.3.52  2020/TCP

These terminated units accumulate over time and continue to run the juju
agent software. So, if this application were upgraded again, there would
be eight terminated units and four running ones. This is similar to [1],
but from the other side of things.
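
To spot the leftover units without reading the table by eye, the unit
rows from the status output could be tallied with a small script. This is
only a sketch: it assumes the whitespace-separated tabular layout shown
in this report (unit name first, workload state second), and the helper
names `count_workload_states` and `terminated_units` are hypothetical,
not part of Juju.

```python
from collections import Counter

# Unit rows copied from the status output in this report.
STATUS_UNITS = """\
Unit Workload Agent Address Ports Message
fluent-bit-k8s/1* terminated idle 192.168.1.28 2020/TCP unit stopped by the cloud
fluent-bit-k8s/2 terminated idle 192.168.2.31 2020/TCP unit stopped by the cloud
fluent-bit-k8s/3 terminated idle 192.168.4.43 2020/TCP unit stopped by the cloud
fluent-bit-k8s/4 terminated idle 192.168.3.51 2020/TCP unit stopped by the cloud
fluent-bit-k8s/5 active idle 192.168.1.29 2020/TCP
fluent-bit-k8s/6 active idle 192.168.4.44 2020/TCP
fluent-bit-k8s/7 active idle 192.168.2.32 2020/TCP
fluent-bit-k8s/8 active idle 192.168.3.52 2020/TCP
"""


def _unit_rows(status_text):
    """Yield (unit_name, workload_state) for each unit row.

    Assumption: a unit row starts with "<app>/<number>", optionally
    followed by "*" for the leader, and the second field is the
    workload state. The header row has no "/" and is skipped.
    """
    for line in status_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and "/" in fields[0]:
            yield fields[0].rstrip("*"), fields[1]


def count_workload_states(status_text):
    """Count units per workload state (e.g. terminated, active)."""
    return Counter(state for _, state in _unit_rows(status_text))


def terminated_units(status_text):
    """Return the names of units whose workload state is 'terminated'."""
    return [name for name, state in _unit_rows(status_text)
            if state == "terminated"]


if __name__ == "__main__":
    print(count_workload_states(STATUS_UNITS))
    print(terminated_units(STATUS_UNITS))
```

On the output above this reports four terminated and four active units,
which matches the 4/1 scale anomaly shown for the application.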

I'm not sure whether it is possible to remove the units without resorting
to `kubectl`, since attempts to use the `remove-unit` command are refused:

    $ juju remove-unit fluent-bit-k8s/1
    ERROR k8s models do not support removing named units.
    Instead specify an application with --num-units (defaults to 1).

    $ juju remove-unit fluent-bit-k8s --num-units=4
    ERROR cannot remove more units than currently exist not valid

    $ juju remove-unit fluent-bit-k8s --num-units=1
    scaling down to 0 units

I guess the behaviour of the last two commands has something to do with
[2], and it makes sense for the first one to be refused as well. I include
them only to show that these units may be stuck in "terminated" if the
user wants to stick to the juju CLI and not go behind its back.

[1]: https://bugs.launchpad.net/juju/+bug/1864394
[2]: https://bugs.launchpad.net/juju/+bug/1864395

Ian Booth (wallyworld)
Changed in juju:
milestone: none → 2.8-beta1
status: New → Triaged
importance: Undecided → High
Yang Kelvin Liu (kelvin.liu) wrote :

https://github.com/juju/juju/pull/11276 will land in 2.8 to fix this bug.

Changed in juju:
assignee: nobody → Yang Kelvin Liu (kelvin.liu)
status: Triaged → In Progress
Changed in juju:
status: In Progress → Fix Committed
Harry Pidcock (hpidcock)
Changed in juju:
status: Fix Committed → Fix Released