Juju model which includes Pebble Charms could not be deleted.

Bug #1972712 reported by gulsum atici
This bug affects 1 person
Affects: Canonical Juju
Status: Expired
Importance: High
Assigned to: Unassigned
Milestone: none

Bug Description

Hello,

juju version: 2.9.22

All Juju models are deleted using the command below:

juju destroy-controller --release-storage --destroy-all-models -y osm-vca

However, after several retries, it reported that 2 Pebble charms could not be deleted.

~/osm-repo/devops/installers$ juju status
Model Controller Cloud/Region Version SLA Timestamp Notes
osm osm-vca k8s-cloud/localhost 2.9.22 unsupported 14:26:35+03:00 attempt 23 to destroy model failed (will retry): model not empty, found 2 applications (model not empty)

App Version Status Scale Charm Channel Rev Address Exposed Message
kafka active 0/1 kafka-k8s edge 5 10.152.183.144 no
keystone active 0/1 osm-keystone edge 4 10.152.183.102 no

They could only be deleted after passing the --force --no-wait parameters.

~/osm-repo/devops/installers$ juju destroy-controller --release-storage --destroy-all-models -y osm-vca --force --no-wait
Destroying controller
Waiting for hosted model resources to be reclaimed
Waiting for 1 model, 2 applications
Waiting for 1 model, 2 applications

The full logs are here:

https://pastebin.ubuntu.com/p/fDk69kNS6C/

Could you please check it?

Thanks,

Revision history for this message
Juan M. Tirado (tiradojm) wrote :

Is this happening all the time? Can it be reproduced?

Changed in juju:
status: New → Triaged
Revision history for this message
gulsum atici (gatici) wrote :

Hello Juan,

It's not happening all the time; we only encounter it sometimes. Today we also had the same problem with Juju 2.9.29: OSM is deployed in a model, and while destroying the OSM model it got stuck in the removal process.
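For reference, a rough sketch of the commands that can show what is still blocking the removal (assuming the stuck model is named osm and runs on microk8s):

# Hedged sketch: inspect a model that is stuck in teardown (model name "osm" assumed).
juju status -m osm                             # which applications/units are still present
juju debug-log -m osm --replay --level ERROR   # errors reported by the stuck units
microk8s kubectl get pods -n osm               # whether the pods still exist on the Kubernetes side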

Revision history for this message
Mark Beierl (mbeierl) wrote :

This is a bug recreation scenario:

JUJU_VERSION=2.9
JUJU_AGENT_VERSION=2.9.29
MICROK8S_VERSION=1.23
MY_IP="10.0.2.241"

sudo snap install microk8s --classic --channel=${MICROK8S_VERSION}/stable
sudo snap install juju --classic --channel=$JUJU_VERSION/stable
microk8s enable storage dns
echo "${MY_IP}-${MY_IP}" | microk8s.enable metallb

juju bootstrap microk8s osm-vca --config controller-service-type=loadbalancer --agent-version=$JUJU_AGENT_VERSION
juju add-model osm microk8s

juju deploy kafka-k8s --channel latest/edge

juju destroy-model osm -y --destroy-storage

and then wait for it to time out :(
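While waiting, a second terminal can watch the teardown from both sides; a rough sketch (model name osm assumed):

# Hedged sketch: watch teardown progress from another terminal (model "osm" assumed).
watch -n 5 juju status -m osm            # Juju's view of what is left in the model
microk8s kubectl get pods -n osm -w      # Kubernetes' view; -w streams pod deletions as they happen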

Revision history for this message
Mark Beierl (mbeierl) wrote :

Oops, posted that too soon. The model eventually did go away. Working on an update.

Revision history for this message
gulsum atici (gatici) wrote :

Here are some logs regarding the unterminated applications:

gatici@gaticipc:~/OSM/osm-packages/charm-packages/temp/simple/src$ juju status
Model Controller Cloud/Region Version SLA Timestamp Notes
osm osm-vca microk8s/localhost 2.9.22 unsupported 22:21:56+03:00 attempt 23 to destroy model failed (will retry): model not empty, found 2 applications (model not empty)

App Version Status Scale Charm Channel Rev Address Exposed Message
kafka active 0/1 kafka-k8s edge 5 10.152.183.5 no
keystone active 0/1 osm-keystone edge 4 10.152.183.78 no

###################

Kafka juju debug-log:

unit-kafka-0: 22:00:53 INFO juju.worker.uniter.operation ran "update-status" hook (via hook dispatching script: dispatch)
unit-kafka-0: 22:02:55 WARNING unit.kafka/0.juju-log kafka:19: Invalid Prometheus alert rules folder at /var/lib/juju/agents/unit-kafka-0/charm/src/prometheus_alert_rules: directory does not exist
unit-kafka-0: 22:02:56 INFO juju.worker.caasunitterminationworker terminating due to SIGTERM
unit-kafka-0: 22:02:56 INFO juju.worker.uniter.operation ran "kafka-relation-departed" hook (via hook dispatching script: dispatch)
unit-kafka-0: 22:02:59 WARNING unit.kafka/0.juju-log Invalid Prometheus alert rules folder at /var/lib/juju/agents/unit-kafka-0/charm/src/prometheus_alert_rules: directory does not exist
unit-kafka-0: 22:02:59 INFO juju.worker.uniter.operation ran "stop" hook (via hook dispatching script: dispatch)
application-pla: 22:02:59 INFO juju.worker.caasoperator.uniter.pla/0.operation ran "kafka-relation-departed" hook (via hook dispatching script: dispatch)
unit-kafka-0: 22:02:59 INFO juju.worker.uniter unit "kafka/0" shutting down: agent should be terminated
application-ro: 22:02:59 INFO juju.worker.caasoperator.uniter.ro/6.operation ran "kafka-relation-departed" hook (via hook dispatching script: dispatch)
application-mon: 22:02:59 INFO juju.worker.caasoperator.uniter.mon/1.operation ran "kafka-relation-departed" hook (via hook dispatching script: dispatch)
application-lcm: 22:03:00 INFO juju.worker.caasoperator.uniter.lcm/0.operation ran "kafka-relation-departed" hook (via hook dispatching script: dispatch)
application-nbi: 22:03:00 INFO juju.worker.caasoperator.uniter.nbi/0.operation ran "kafka-relation-departed" hook (via hook dispatching script: dispatch)
application-pol: 22:03:00 INFO juju.worker.caasoperator.uniter.pol/0.operation ran "kafka-relation-departed" hook (via hook dispatching script: dispatch)
unit-kafka-0: 22:03:02 ERROR juju.worker.uniter pebble poll failed for container "kafka": failed to get pebble info: cannot obtain system details: cannot communicate with server: Get "http://localhost/v1/system-info": dial unix /charm/containers/kafka/pebble.socket: connect: no such file or directory
application-ro: 22:03:06 INFO juju.worker.caasoperator.uniter.ro/6.operation ran "kafka-relation-broken" hook (via hook dispatching script: dispatch)
unit-kafka-0: 22:03:07 ERROR juju.worker.uniter pebble poll failed for container "kafka": failed to get pebble info: cannot obtain system details: cannot communic...


Changed in juju:
milestone: none → 2.9.31
Revision history for this message
Mark Beierl (mbeierl) wrote :

Paste the following commands all at once:

juju add-model osm microk8s

juju deploy kafka-k8s --channel latest/edge kafka
juju deploy zookeeper-k8s --channel latest/edge zookeeper
juju deploy charmed-osm-mariadb-k8s mariadb
juju deploy mongodb-k8s --channel latest/stable mongodb
juju deploy osm-prometheus --channel latest/edge prometheus
juju deploy osm-keystone --channel latest/edge --resource keystone-image=opensourcemano/keystone:testing-daily keystone
juju deploy osm-nbi --channel latest/edge --resource image=opensourcemano/nbi:testing-daily
juju deploy osm-mon --channel latest/edge --resource image=opensourcemano/mon:testing-daily

juju add-relation kafka zookeeper
juju add-relation keystone:db mariadb:mysql
juju add-relation osm-nbi mongodb:database
juju add-relation osm-nbi kafka
juju add-relation osm-nbi keystone
juju add-relation osm-nbi prometheus
juju add-relation osm-mon:mongodb mongodb:database
juju add-relation osm-mon kafka
juju add-relation osm-mon keystone
juju add-relation osm-mon prometheus

Once everything is stable, destroy the model; keystone remains as an application, but everything else is removed:

juju destroy-model osm -y --destroy-storage
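If keystone is the only thing left behind, a forced removal of just that application (the same --force/--no-wait idea as the controller-level workaround above) is one possible way to clear it; a hedged sketch, with the application and model names assumed:

# Hedged sketch: force-remove the lingering application, then retry the teardown.
juju remove-application keystone -m osm --force --no-wait
juju destroy-model osm -y --destroy-storage --force --no-wait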

Revision history for this message
Harry Pidcock (hpidcock) wrote :

This might already be fixed in 2.9.30, but I'm assigning Ben to validate it since he was working in this area recently.

Changed in juju:
importance: Undecided → High
assignee: nobody → Ben Hoyt (benhoyt)
Revision history for this message
Ben Hoyt (benhoyt) wrote :

I tried this several times on Microk8s using the latest 2.9 commit (a461c98) with the model Mark gave in comment #6, and at first I couldn't reproduce, but then it happened after about the 5th attempt.

So it seems like a different issue to the ones I fixed in my recently-merged commits (which will be included in Juju 2.9.30 coming out this week or early next). I'll continue to investigate this between other items over the next few days.

Revision history for this message
Ben Hoyt (benhoyt) wrote :

Hmmm, I've tried to reproduce this again today using the latest 2.9 commit (1ba597d), and I can't get it to happen at all, trying 25-30 times with different timings. So either I've been "unlucky" today, or something between a461c98 and 1ba597d has fixed it (I scanned the list of commits and nothing jumps out).

Mark, can you please upgrade to the latest 2.9.30 (currently available in the snap as latest/candidate, but should be stable soon) and try to reproduce this again?

Revision history for this message
Mark Beierl (mbeierl) wrote :

JUJU_VERSION=2.9
JUJU_AGENT_VERSION=2.9.30
MICROK8S_VERSION=1.23
MY_IP="10.0.2.92"

sudo snap install microk8s --classic --channel=${MICROK8S_VERSION}/stable
sudo snap install juju --classic --channel=$JUJU_VERSION/stable

sudo usermod -a -G microk8s ubuntu
sudo chown -f -R ubuntu ~/.kube
sg microk8s -c "microk8s enable storage dns"
echo "${MY_IP}-${MY_IP}" | sg microk8s -c "microk8s enable metallb"

juju bootstrap microk8s osm-vca --config controller-service-type=loadbalancer --agent-version=$JUJU_AGENT_VERSION

===========================================================
And then ran the steps in comment #6.

It took a very long time for pebble to start, and now three units are stuck "waiting for container":

kafka/0* active idle 10.1.173.8
keystone/0* active idle 10.1.173.14
mariadb/0* waiting idle waiting for container
mongodb/0* active idle 10.1.173.15 27017/TCP
osm-mon/0* active idle 10.1.173.20 8000/TCP ready
osm-nbi/0* waiting idle waiting for container
prometheus/0* waiting idle waiting for container
zookeeper/0* active idle 10.1.173.10

microk8s kubectl describe pod doesn't give much more info:

Containers:
  osm-nbi:
    Container ID:
    Image: opensourcemano/nbi:testing-daily
    Image ID:
    Port: 9999/TCP
    Host Port: 0/TCP
    State: Waiting
      Reason: PodInitializing
    Ready: False
    Restart Count: 0
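The Waiting/PodInitializing state usually means an init container has not completed; a couple of hedged kubectl checks that may show more (the namespace osm and pod name osm-nbi-0 are assumptions):

# Hedged sketch: dig into why a pod is stuck in PodInitializing (namespace and pod name assumed).
microk8s kubectl get events -n osm --sort-by=.lastTimestamp
microk8s kubectl get pod osm-nbi-0 -n osm -o yaml   # full status, including init container state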

Revision history for this message
Ben Hoyt (benhoyt) wrote :

Thanks, Mark. A couple of questions:

* This is a different issue than above, right: the issue above is a teardown problem, but the issue you just posted is a deployment problem, correct?
* I think this must have been 2.9.29, which is currently the latest in the Snap store, so the "snap install" wouldn't have been able to install 2.9.30. (I wonder what the --agent-version=2.9.30 does in that case).

John A Meinel (jameinel)
Changed in juju:
milestone: 2.9.31 → 2.9.32
status: Triaged → In Progress
Revision history for this message
Mark Beierl (mbeierl) wrote :

Sorry, the commands I pasted were wrong. I installed Juju with this:

sudo snap install juju --classic --channel=$JUJU_VERSION/candidate

Confirming it was 2.9.30:

Model Controller Cloud/Region Version SLA Timestamp
osm osm-vca microk8s/localhost 2.9.30 unsupported 18:55:27Z

Revision history for this message
Ben Hoyt (benhoyt) wrote :

Hi Mark, unfortunately I haven't been able to reproduce this again (either the original teardown issue or the deployment issue you mentioned in comment #10). Would you please be able to try to reproduce the teardown issue, and when you have, send the output of `JUJU_DEV_FEATURE_FLAGS=developer-mode juju dump-db` as well as of `juju debug-log -n1000` and any relevant output from kubectl (for example, describe pods in the namespace)? That would help us get to the bottom of this.
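Something along these lines should capture all of that in one go (a sketch; the model name osm and a matching namespace are assumptions):

# Hedged sketch: collect the requested diagnostics (model "osm" assumed).
JUJU_DEV_FEATURE_FLAGS=developer-mode juju dump-db > dump-db.txt
juju debug-log -m osm -n1000 > debug-log.txt
microk8s kubectl describe pods -n osm > describe-pods.txt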

Revision history for this message
Mark Beierl (mbeierl) wrote :

Thanks, Ben. I have tried with 2.9.31 now, and it also appears that I cannot reproduce it.

John A Meinel (jameinel)
Changed in juju:
milestone: 2.9.32 → 2.9-next
status: In Progress → Incomplete
Revision history for this message
Harry Pidcock (hpidcock) wrote :

Please raise this bug again if you encounter it.

Changed in juju:
assignee: Ben Hoyt (benhoyt) → nobody
Harry Pidcock (hpidcock)
Changed in juju:
milestone: 2.9-next → none
Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for Canonical Juju because there has been no activity for 60 days.]

Changed in juju:
status: Incomplete → Expired