Juju model which includes Pebble Charms could not be deleted.

Bug #1972712 reported by gulsum atici
This bug affects 1 person
Affects: Canonical Juju
Status: Expired
Importance: High
Assigned to: Unassigned
Milestone: none

Bug Description

Hello,

juju version: 2.9.22

All Juju models are deleted using the command below:

juju destroy-controller --release-storage --destroy-all-models -y osm-vca

However, after several retries, it reported that 2 Pebble charms could not be deleted.

~/osm-repo/devops/installers$ juju status
Model Controller Cloud/Region Version SLA Timestamp Notes
osm osm-vca k8s-cloud/localhost 2.9.22 unsupported 14:26:35+03:00 attempt 23 to destroy model failed (will retry): model not empty, found 2 applications (model not empty)

App Version Status Scale Charm Channel Rev Address Exposed Message
kafka active 0/1 kafka-k8s edge 5 10.152.183.144 no
keystone active 0/1 osm-keystone edge 4 10.152.183.102 no

They could only be deleted after passing the --force --no-wait parameters.

~/osm-repo/devops/installers$ juju destroy-controller --release-storage --destroy-all-models -y osm-vca --force --no-wait
Destroying controller
Waiting for hosted model resources to be reclaimed
Waiting for 1 model, 2 applications
Waiting for 1 model, 2 applications

The full logs are here:

https://pastebin.ubuntu.com/p/fDk69kNS6C/

Could you please check it?

Thanks,

Revision history for this message
Juan M. Tirado (tiradojm) wrote :

Is this happening all the time? Can it be reproduced?

Changed in juju:
status: New → Triaged
Revision history for this message
gulsum atici (gatici) wrote :

Hello Juan,

It's not happening all the time; we only encounter it sometimes. Today we also had the same problem with Juju 2.9.29: OSM is deployed in a model, and while destroying the OSM model it got stuck in the removal process.
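For reference, a rough sketch of the commands that can show what is still blocking the removal (assuming the stuck model is named osm and runs on microk8s):

# Hedged sketch: inspect a model that is stuck in teardown (model name "osm" assumed).
juju status -m osm                             # which applications/units are still present
juju debug-log -m osm --replay --level ERROR   # errors reported by the stuck units
microk8s kubectl get pods -n osm               # whether the pods still exist on the Kubernetes side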

Revision history for this message
Mark Beierl (mbeierl) wrote :

This is a bug recreation scenario:

JUJU_VERSION=2.9
JUJU_AGENT_VERSION=2.9.29
MICROK8S_VERSION=1.23
MY_IP="10.0.2.241"

sudo snap install microk8s --classic --channel=${MICROK8S_VERSION}/stable
sudo snap install juju --classic --channel=$JUJU_VERSION/stable
microk8s enable storage dns
echo "${MY_IP}-${MY_IP}" | microk8s.enable metallb

juju bootstrap microk8s osm-vca --config controller-service-type=loadbalancer --agent-version=$JUJU_AGENT_VERSION
juju add-model osm microk8s

juju deploy kafka-k8s --channel latest/edge

juju destroy-model osm -y --destroy-storage

and then wait for it to time out :(
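While waiting, a second terminal can watch the teardown from both sides; a rough sketch (model name osm assumed):

# Hedged sketch: watch teardown progress from another terminal (model "osm" assumed).
watch -n 5 juju status -m osm            # Juju's view of what is left in the model
microk8s kubectl get pods -n osm -w      # Kubernetes' view; -w streams pod deletions as they happen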

Revision history for this message
Mark Beierl (mbeierl) wrote :

Oops, posted that too soon. The model eventually did go away. Working on an update.

Revision history for this message
gulsum atici (gatici) wrote :

Here are some logs regarding the unterminated applications:

gatici@gaticipc:~/OSM/osm-packages/charm-packages/temp/simple/src$ juju status
Model Controller Cloud/Region Version SLA Timestamp Notes
osm osm-vca microk8s/localhost 2.9.22 unsupported 22:21:56+03:00 attempt 23 to destroy model failed (will retry): model not empty, found 2 applications (model not empty)

App Version Status Scale Charm Channel Rev Address Exposed Message
kafka active 0/1 kafka-k8s edge 5 10.152.183.5 no
keystone active 0/1 osm-keystone edge 4 10.152.183.78 no

###################

Kafka juju debug-log:

unit-kafka-0: 22:00:53 INFO juju.worker.uniter.operation ran "update-status" hook (via hook dispatching script: dispatch)
unit-kafka-0: 22:02:55 WARNING unit.kafka/0.juju-log kafka:19: Invalid Prometheus alert rules folder at /var/lib/juju/agents/unit-kafka-0/charm/src/prometheus_alert_rules: directory does not exist
unit-kafka-0: 22:02:56 INFO juju.worker.caasunitterminationworker terminating due to SIGTERM
unit-kafka-0: 22:02:56 INFO juju.worker.uniter.operation ran "kafka-relation-departed" hook (via hook dispatching script: dispatch)
unit-kafka-0: 22:02:59 WARNING unit.kafka/0.juju-log Invalid Prometheus alert rules folder at /var/lib/juju/agents/unit-kafka-0/charm/src/prometheus_alert_rules: directory does not exist
unit-kafka-0: 22:02:59 INFO juju.worker.uniter.operation ran "stop" hook (via hook dispatching script: dispatch)
application-pla: 22:02:59 INFO juju.worker.caasoperator.uniter.pla/0.operation ran "kafka-relation-departed" hook (via hook dispatching script: dispatch)
unit-kafka-0: 22:02:59 INFO juju.worker.uniter unit "kafka/0" shutting down: agent should be terminated
application-ro: 22:02:59 INFO juju.worker.caasoperator.uniter.ro/6.operation ran "kafka-relation-departed" hook (via hook dispatching script: dispatch)
application-mon: 22:02:59 INFO juju.worker.caasoperator.uniter.mon/1.operation ran "kafka-relation-departed" hook (via hook dispatching script: dispatch)
application-lcm: 22:03:00 INFO juju.worker.caasoperator.uniter.lcm/0.operation ran "kafka-relation-departed" hook (via hook dispatching script: dispatch)
application-nbi: 22:03:00 INFO juju.worker.caasoperator.uniter.nbi/0.operation ran "kafka-relation-departed" hook (via hook dispatching script: dispatch)
application-pol: 22:03:00 INFO juju.worker.caasoperator.uniter.pol/0.operation ran "kafka-relation-departed" hook (via hook dispatching script: dispatch)
unit-kafka-0: 22:03:02 ERROR juju.worker.uniter pebble poll failed for container "kafka": failed to get pebble info: cannot obtain system details: cannot communicate with server: Get "http://localhost/v1/system-info": dial unix /charm/containers/kafka/pebble.socket: connect: no such file or directory
application-ro: 22:03:06 INFO juju.worker.caasoperator.uniter.ro/6.operation ran "kafka-relation-broken" hook (via hook dispatching script: dispatch)
unit-kafka-0: 22:03:07 ERROR juju.worker.uniter pebble poll failed for container "kafka": failed to get pebble info: cannot obtain system details: cannot communic...


Changed in juju:
milestone: none → 2.9.31
Revision history for this message
Mark Beierl (mbeierl) wrote :

Paste the following commands all at once:

juju add-model osm microk8s

juju deploy kafka-k8s --channel latest/edge kafka
juju deploy zookeeper-k8s --channel latest/edge zookeeper
juju deploy charmed-osm-mariadb-k8s mariadb
juju deploy mongodb-k8s --channel latest/stable mongodb
juju deploy osm-prometheus --channel latest/edge prometheus
juju deploy osm-keystone --channel latest/edge --resource keystone-image=opensourcemano/keystone:testing-daily keystone
juju deploy osm-nbi --channel latest/edge --resource image=opensourcemano/nbi:testing-daily
juju deploy osm-mon --channel latest/edge --resource image=opensourcemano/mon:testing-daily

juju add-relation kafka zookeeper
juju add-relation keystone:db mariadb:mysql
juju add-relation osm-nbi mongodb:database
juju add-relation osm-nbi kafka
juju add-relation osm-nbi keystone
juju add-relation osm-nbi prometheus
juju add-relation osm-mon:mongodb mongodb:database
juju add-relation osm-mon kafka
juju add-relation osm-mon keystone
juju add-relation osm-mon prometheus

Once everything is stable, destroy the model; keystone remains as an application, but everything else is removed:

juju destroy-model osm -y --destroy-storage
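If keystone is the only thing left behind, a forced removal of just that application (the same --force/--no-wait idea as the controller-level workaround above) is one possible way to clear it; a hedged sketch, with the application and model names assumed:

# Hedged sketch: force-remove the lingering application, then retry the teardown.
juju remove-application keystone -m osm --force --no-wait
juju destroy-model osm -y --destroy-storage --force --no-wait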

Revision history for this message
Harry Pidcock (hpidcock) wrote :

This might already be fixed in 2.9.30, but I'm assigning Ben to validate it since he was working in this area recently.

Changed in juju:
importance: Undecided → High
assignee: nobody → Ben Hoyt (benhoyt)
Revision history for this message
Ben Hoyt (benhoyt) wrote :

I tried this several times on Microk8s using the latest 2.9 commit (a461c98) with the model Mark gave in comment #6, and at first I couldn't reproduce, but then it happened after about the 5th attempt.

So it seems like a different issue to the ones I fixed in my recently-merged commits (which will be included in Juju 2.9.30 coming out this week or early next). I'll continue to investigate this between other items over the next few days.

Revision history for this message
Ben Hoyt (benhoyt) wrote :

Hmmm, I've tried to reproduce this again today using the latest 2.9 commit (1ba597d), and I can't get it to happen at all, trying 25-30 times with different timings. So either I've been "unlucky" today, or something between a461c98 and 1ba597d has fixed it (I scanned the list of commits and nothing jumps out).

Mark, can you please upgrade to the latest 2.9.30 (currently available in the snap as latest/candidate, but should be stable soon) and try to reproduce this again?

Revision history for this message
Mark Beierl (mbeierl) wrote :

JUJU_VERSION=2.9
JUJU_AGENT_VERSION=2.9.30
MICROK8S_VERSION=1.23
MY_IP="10.0.2.92"

sudo snap install microk8s --classic --channel=${MICROK8S_VERSION}/stable
sudo snap install juju --classic --channel=$JUJU_VERSION/stable

sudo usermod -a -G microk8s ubuntu
sudo chown -f -R ubuntu ~/.kube
sg microk8s -c "microk8s enable storage dns"
echo "${MY_IP}-${MY_IP}" | sg microk8s -c "microk8s enable metallb"

juju bootstrap microk8s osm-vca --config controller-service-type=loadbalancer --agent-version=$JUJU_AGENT_VERSION

===========================================================
And then ran the steps in comment #6.

It took a very long time for pebble to start, and now three units are stuck "waiting for container":

kafka/0* active idle 10.1.173.8
keystone/0* active idle 10.1.173.14
mariadb/0* waiting idle waiting for container
mongodb/0* active idle 10.1.173.15 27017/TCP
osm-mon/0* active idle 10.1.173.20 8000/TCP ready
osm-nbi/0* waiting idle waiting for container
prometheus/0* waiting idle waiting for container
zookeeper/0* active idle 10.1.173.10

microk8s kubectl describe pod doesn't give much more info:

Containers:
  osm-nbi:
    Container ID:
    Image: opensourcemano/nbi:testing-daily
    Image ID:
    Port: 9999/TCP
    Host Port: 0/TCP
    State: Waiting
      Reason: PodInitializing
    Ready: False
    Restart Count: 0
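The Waiting/PodInitializing state usually means an init container has not completed; a couple of hedged kubectl checks that may show more (the namespace osm and pod name osm-nbi-0 are assumptions):

# Hedged sketch: dig into why a pod is stuck in PodInitializing (namespace and pod name assumed).
microk8s kubectl get events -n osm --sort-by=.lastTimestamp
microk8s kubectl get pod osm-nbi-0 -n osm -o yaml   # full status, including init container state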

Revision history for this message
Ben Hoyt (benhoyt) wrote :

Thanks, Mark. A couple of questions:

* This is a different issue than above, right: the issue above is a teardown problem, but the issue you just posted is a deployment problem, correct?
* I think this must have been 2.9.29, which is currently the latest in the Snap store, so the "snap install" wouldn't have been able to install 2.9.30. (I wonder what the --agent-version=2.9.30 does in that case).

John A Meinel (jameinel)
Changed in juju:
milestone: 2.9.31 → 2.9.32
status: Triaged → In Progress
Revision history for this message
Mark Beierl (mbeierl) wrote :

Sorry, the commands I pasted were wrong. I installed Juju with this:

sudo snap install juju --classic --channel=$JUJU_VERSION/candidate

Confirming it was 2.9.30:

Model Controller Cloud/Region Version SLA Timestamp
osm osm-vca microk8s/localhost 2.9.30 unsupported 18:55:27Z

Revision history for this message
Ben Hoyt (benhoyt) wrote :

Hi Mark, unfortunately I haven't been able to reproduce this again (either the original teardown issue or the deployment issue you mentioned in comment #10). Would you please be able to try to reproduce the teardown issue, and when you have, send the output of `JUJU_DEV_FEATURE_FLAGS=developer-mode juju dump-db` as well as of `juju debug-log -n1000` and any relevant output from kubectl (for example, describe pods in the namespace)? That would help us get to the bottom of this.
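Something along these lines should capture all of that in one go (a sketch; the model name osm and a matching namespace are assumptions):

# Hedged sketch: collect the requested diagnostics (model "osm" assumed).
JUJU_DEV_FEATURE_FLAGS=developer-mode juju dump-db > dump-db.txt
juju debug-log -m osm -n1000 > debug-log.txt
microk8s kubectl describe pods -n osm > describe-pods.txt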

Revision history for this message
Mark Beierl (mbeierl) wrote :

Thanks, Ben. I have tried with 2.9.31 now, and it also appears that I cannot reproduce it.

John A Meinel (jameinel)
Changed in juju:
milestone: 2.9.32 → 2.9-next
status: In Progress → Incomplete
Revision history for this message
Harry Pidcock (hpidcock) wrote :

Please raise this bug again if you encounter it.

Changed in juju:
assignee: Ben Hoyt (benhoyt) → nobody
Harry Pidcock (hpidcock)
Changed in juju:
milestone: 2.9-next → none
Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for Canonical Juju because there has been no activity for 60 days.]

Changed in juju:
status: Incomplete → Expired