change in podspec causes removal of juju units and resources

Bug #1871388 reported by Narinder Gupta
Affects         Status  Importance  Assigned to  Milestone
Canonical Juju  New     Undecided   Unassigned

Bug Description

hi,
 I am working on k8s charms for ZooKeeper and Kafka and am stuck on a problem where a change in the pod spec causes the pods to stop and start, which ends up deleting the Juju units and their resources, including the persistent volumes, and creating new ones. Since I am using persistent volumes, each persistent volume also gets deleted and added again when its pod stops and starts.

To reproduce the problem:
Deploy zookeeper with 3 units:
juju deploy cs:~narindergupta/charm-k8s-zookeeper-1 -n3
Wait for the units to be ready; the status update might take 5-10 minutes.

juju status
Model  Controller          Cloud/Region        Version  SLA          Timestamp
look   microk8s-localhost  microk8s/localhost  2.7.5    unsupported  17:15:45Z

App            Version                         Status  Scale  Charm                Store       Rev  OS          Address         Notes
zookeeper-k8s  rocks.canonical.com:443/k8s...  active  3      charm-k8s-zookeeper  jujucharms  1    kubernetes  10.152.183.137

Unit              Workload     Agent  Address     Ports                       Message
zookeeper-k8s/0   maintenance  idle   10.1.31.14  2888/TCP,2181/TCP,3888/TCP  config changing
zookeeper-k8s/1   active       idle   10.1.31.15  2888/TCP,2181/TCP,3888/TCP  ready Not a Leader
zookeeper-k8s/2*  active       idle   10.1.31.16  2888/TCP,2181/TCP,3888/TCP  ready

Enable ha-mode:
juju config zookeeper-k8s ha-mode=true
The above command causes a pod spec change, which exhibits the behaviour.
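
For context, a minimal sketch of how a config change typically turns into a new pod spec in an ops-style podspec charm; the hook wiring, the ha-mode handling and the spec layout below are illustrative assumptions, not code taken from charm-k8s-zookeeper:

# Sketch only: any change in the dict passed to set_spec() updates the
# StatefulSet's pod template, which makes k8s restart the workload pods.
from ops.charm import CharmBase
from ops.main import main


class ZookeeperK8sCharm(CharmBase):
    def __init__(self, *args):
        super().__init__(*args)
        self.framework.observe(self.on.config_changed, self._on_config_changed)

    def _on_config_changed(self, event):
        # Only the leader may set the pod spec.
        if not self.unit.is_leader():
            return
        self.model.pod.set_spec(self._build_pod_spec())

    def _build_pod_spec(self):
        # Hypothetical spec layout; the real charm's pod spec differs.
        args = []
        if self.config["ha-mode"]:
            args.append("--servers=%d" % self.app.planned_units())
        return {
            "version": 2,
            "containers": [{
                "name": "zookeeper",
                "imageDetails": {"imagePath": "rocks.canonical.com:443/k8s/zookeeper"},
                "args": args,
                "ports": [
                    {"containerPort": 2181, "protocol": "TCP"},
                    {"containerPort": 2888, "protocol": "TCP"},
                    {"containerPort": 3888, "protocol": "TCP"},
                ],
            }],
        }


if __name__ == "__main__":
    main(ZookeeperK8sCharm)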

juju status
Model  Controller          Cloud/Region        Version  SLA          Timestamp
look   microk8s-localhost  microk8s/localhost  2.7.5    unsupported  17:24:48Z

App            Version                         Status  Scale  Charm                Store       Rev  OS          Address         Notes
zookeeper-k8s  rocks.canonical.com:443/k8s...  active  3      charm-k8s-zookeeper  jujucharms  1    kubernetes  10.152.183.137

Unit              Workload  Agent  Address     Ports                       Message
zookeeper-k8s/3*  active    idle   10.1.31.17  2888/TCP,2181/TCP,3888/TCP  ready
zookeeper-k8s/4   active    idle   10.1.31.18  2888/TCP,2181/TCP,3888/TCP  ready Not a Leader
zookeeper-k8s/5   active    idle   10.1.31.19  2888/TCP,2181/TCP,3888/TCP  ready Not a Leader

Thanks and Regards,
Narinder Gupta

Narinder Gupta (narindergupta) wrote :

The Zookeeper pod spec is at http://paste.ubuntu.com/p/mqxjxZyBxh/. Adding another unit changes the pod spec, because --servers=%(zookeeper-units)s is rendered from the number of units; the changed spec then causes the pods to be stopped and started.
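
In other words, the rendered args depend on the unit count, so scaling changes the spec itself; a tiny illustration (the exact argument rendering in the real spec may differ):

# Illustration only: the --servers value is interpolated from the number of
# zookeeper units, so adding or removing a unit yields a different pod spec.
def zookeeper_args(zookeeper_units):
    return ["--servers=%s" % zookeeper_units]

print(zookeeper_args(3))  # ['--servers=3']
print(zookeeper_args(4))  # different args -> different StatefulSet template -> pod restarts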

Narinder Gupta (narindergupta) wrote :

You can see the persistent volumes' status is Terminating even though they are actually still mounted on the ZooKeeper pods. A change in the pod spec should not delete them (a small inspection sketch follows the output below).

 microk8s.kubectl get pods -n look
NAME                       READY   STATUS    RESTARTS   AGE
zookeeper-k8s-0            1/1     Running   0          6m54s
zookeeper-k8s-1            1/1     Running   0          7m54s
zookeeper-k8s-2            1/1     Running   0          8m54s
zookeeper-k8s-operator-0   1/1     Running   0          14m

ubuntu@juju-acf53c-default-0:~$ microk8s.kubectl get pv,pvc -n look
NAME                                                        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS        CLAIM                                                 STORAGECLASS        REASON   AGE
persistentvolume/pvc-083ecfe5-b2fd-4893-8004-cd41ce1fd5f2   1Gi        RWO            Delete           Bound         apps/charm-kafka-k8s-operator-0                       microk8s-hostpath            35m
persistentvolume/pvc-4f5b779d-6f43-44ca-b3ba-5482f50472c4   1Gi        RWO            Delete           Bound         apps/charm-kafka-k8s-new-operator-0                   microk8s-hostpath            30m
persistentvolume/pvc-86ecd602-c542-45c0-95bc-60dc8accfaa3   1Gi        RWO            Delete           Terminating   look/database-0ebc24c1-zookeeper-k8s-0                microk8s-hostpath            15m
persistentvolume/pvc-a36387a4-9754-406b-93cf-6ff74982e5e5   1Gi        RWO            Delete           Terminating   look/database-0ebc24c1-zookeeper-k8s-2                microk8s-hostpath            14m
persistentvolume/pvc-a43b7628-935d-4fc1-918d-3caf3591fe4f   20Gi       RWO            Delete           Bound         controller-microk8s-localhost/storage-controller-0    microk8s-hostpath            38m
persistentvolume/pvc-b9199d70-79ac-43ba-9800-90c91692a538   1Gi        RWO            Delete           Bound         look/charm-zookeeper-k8s-operator-0                   microk8s-hostpath            15m
persistentvolume/pvc-d264d6e1-b9f6-4691-b725-f307052caa1f   1Gi        RWO            Delete           Terminating   look/database-0ebc24c1-zookeeper-k8s-1                microk8s-hostpath            14m

NAME                                                        STATUS        VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS        AGE
persistentvolumeclaim/charm-zookeeper-k8s-operator-0        Bound         pvc-b9199d70-79ac-43ba-9800-90c91692a538   1Gi        RWO            microk8s-hostpath   15m
persistentvolumeclaim/database-0ebc24c1-zookeeper-k8s-0     Terminating   pvc-86ecd602-c542-45c0-95bc-60dc8accfaa3   1Gi        RWO            microk8s-hostpath   15m
persistentvolumeclaim/database-0ebc24c1-zookeeper-k8s-1     Terminating   pvc-d264d6e1-b9f6-4691-b725-f307052caa1f   1Gi        RWO            microk8s-hostpath   14m
persistentvolumeclaim/database-0ebc24c1-zookeeper-k8s-2     Terminating   pvc-a36387a4-9754-406b-93cf-6ff74982e5e5   1Gi        RWO            microk8s-hostpath   14m
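
If useful, here is a rough way to inspect that state with the kubernetes Python client (namespace and names taken from the output above). A PVC that is still mounted keeps its pvc-protection finalizer, so kubectl shows it as Terminating without it actually being removed yet:

# Sketch: list the PVCs in the model's namespace and show why they report
# Terminating (deletionTimestamp set, finalizers still blocking removal).
# Assumes the kubernetes Python client and a working kubeconfig.
from kubernetes import client, config

config.load_kube_config()
core = client.CoreV1Api()

for pvc in core.list_namespaced_persistent_volume_claim("look").items:
    print(
        pvc.metadata.name,
        pvc.status.phase,
        "deletionTimestamp=%s" % pvc.metadata.deletion_timestamp,
        "finalizers=%s" % pvc.metadata.finalizers,
    )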

description: updated
Ian Booth (wallyworld)
Changed in juju:
milestone: none → 2.8-rc1
Ian Booth (wallyworld) wrote :

When a charm updates the pod spec, that flows through to the StatefulSet used to manage the workload pods. Updating a StatefulSet's pod template will cause k8s to do a rolling update, and thus each pod will be stopped and started. But each pod's PV is reattached.
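
As a rough way to see this from outside Juju, one can check the StatefulSet directly; a sketch with the kubernetes Python client, using the names from this report:

# Sketch: confirm the workload pods are managed by a StatefulSet with a
# RollingUpdate strategy and per-pod volume claim templates, so a pod
# template change restarts the pods while the claims are meant to be reused.
from kubernetes import client, config

config.load_kube_config()
apps = client.AppsV1Api()

sts = apps.read_namespaced_stateful_set("zookeeper-k8s", "look")
print("update strategy:", sts.spec.update_strategy.type)
print("volume claim templates:",
      [t.metadata.name for t in (sts.spec.volume_claim_templates or [])])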

There was a bug in Juju where this would result in duplicate units appearing in the model. It may be the cause of your problem.

Are you able to try again with either the 2.7.6 candidate snap or 2.8 edge snap?

Narinder Gupta (narindergupta) wrote : Re: [Bug 1871388] Re: change in podspec causing removing resource and juju units

Ian,
I can confirm that this issue is not seen with the 2.7.6 candidate: all new units come up fine with the same unit numbers, and the persistent volumes just get reattached.

Thanks and Regards,
Narinder Gupta
Canonical, Ltd.

Ian Booth (wallyworld)
Changed in juju:
milestone: 2.8-rc1 → 2.7.6