Activity log for bug #1768064

Date Who What changed Old value New value Message
2018-04-30 15:38:51 Dmitriy Kropivnitskiy bug added bug
2018-04-30 15:39:30 Dmitriy Kropivnitskiy description (the old and new values are identical except that "bug in Juju where when it is trying" was corrected to "bug in Juju when it is trying"; a reproduction sketch based on this description follows the log below). New value:

Looks like there is some basic "order of actions" bug in Juju when it is trying to terminate multiple AWS instances. I have seen this happen with both the destroy-model and the remove-unit commands. It seems that the instance gets terminated before juju marks the machine as stopped (I can observe the instance being terminated in the AWS console while the machine is still marked as "started" in juju status), so juju repeatedly tries to communicate with a dead instance. As a result, shutting down even a single instance takes a long time, since juju does a lot of retries.

A few specifics of my setup should be noted. I am using an existing VPC, so I bootstrapped my controller with vpc-id-force=true. I have set up multiple spaces (two: public and private) and my machines are spread between them (this does not seem to make any difference; the issue happens to machines in either space). Not sure if it matters, but I am also using "instance-type" constraints. The Juju version is 2.3.7 on both the controller and the model.

The model is as follows: 1 machine is a t2.small running easyrsa and kubernetes-master, and 3 machines are t2.large running 3 units of etcd and 3 units of kubernetes-worker, all tied together with flannel, using the latest charms from "containers" for everything. This should be fairly easy to replicate; once I am done bringing my cluster back up, I will try to create a minimal repeatable setup for this issue.
2018-07-09 23:31:38 Anastasia bug task added juju
2018-07-09 23:31:43 Anastasia bug task deleted juju-core
2018-07-10 12:14:37 Anastasia juju: status New → Triaged
2018-07-10 12:14:40 Anastasia juju: importance Undecided → Medium
2018-07-10 12:14:47 Anastasia tags (none) → usability
2022-11-03 16:51:45 Canonical Juju QA Bot juju: importance Medium → Low
2022-11-03 16:51:46 Canonical Juju QA Bot tags usability → expirebugs-bot usability
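For reference, a minimal reproduction sketch based on the setup described in the 2018-04-30 description above. It is an illustration, not the reporter's exact commands: the controller and model names, region, VPC ID (vpc-0abc123), subnet CIDRs, and the cs:~containers/* charm names are placeholders or assumptions, and placement and charm relations are simplified.

    # Bootstrap into an existing VPC (region and VPC ID are placeholders):
    juju bootstrap aws/us-east-1 aws-ctrl --config vpc-id=vpc-0abc123 --config vpc-id-force=true

    # Model and network layout roughly matching the report (CIDRs are placeholders):
    juju add-model k8s --config vpc-id=vpc-0abc123 --config vpc-id-force=true
    juju add-space private 172.31.16.0/20
    juju add-space public 172.31.0.0/20

    # Workload from the report: 1 x t2.small (easyrsa + kubernetes-master),
    # 3 x t2.large (etcd + kubernetes-worker), flannel throughout.
    # Placement is simplified; relate the charms as in the standard
    # Canonical Kubernetes bundle (relations omitted here).
    juju deploy cs:~containers/kubernetes-master --constraints instance-type=t2.small
    juju deploy cs:~containers/easyrsa --to 0
    juju deploy cs:~containers/kubernetes-worker -n 3 --constraints instance-type=t2.large
    juju deploy cs:~containers/etcd -n 3 --to 1,2,3
    juju deploy cs:~containers/flannel

    # Teardown paths where the reporter observed the ordering problem:
    juju remove-unit kubernetes-worker/0
    juju status        # machine may still show "started" after the EC2 instance is gone
    juju destroy-model k8s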