[gce] odd error messages are printed with kill-controller

Bug #1831527 reported by Peter Matulis
This bug affects 1 person
Affects         Status        Importance  Assigned to  Milestone
Canonical Juju  Fix Released  Low         Anastasia
2.5             Fix Released  Low         Anastasia
2.6             Fix Released  Low         Anastasia

Bug Description

juju bootstrap --credential gandalf google gce
juju deploy cs:canonical-kubernetes
juju status

Output:

https://paste.ubuntu.com/p/zcJrcxsZ8f/

juju kill-controller --debug -y -t0 gce

Output:

https://paste.ubuntu.com/p/5vNzxKJqRV/

The cloud vendor's dashboard shows all instances have been removed.

Although its output is lengthy, destroy-controller works without these errors:

juju destroy-controller -y --destroy-all-models gce

Output:

https://paste.ubuntu.com/p/R9YhQT3pwM/

Anastasia (anastasia-macmood) wrote :

I am not sure why you have observed the errors. I am in the process of trying to reproduce what you were seeing...

I've noticed that the instances that were 'not found' but destroyed anyway were in either us-east1-b or us-east1-c. Maybe these zones were more efficient at the time? :)

Anyway, at a cursory glance it looks like some kind of timing issue. The good news is that everything gets destroyed successfully in the end, so I'll triage this as Low priority.

Changed in juju:
status: New → Triaged
importance: Undecided → Low
assignee: nobody → Anastasia (anastasia-macmood)
Anastasia (anastasia-macmood) wrote :

So I got focused on this and am going to propose a fix soon.

You are seeing this because of timing, and using -t 0 highlights it.
This 'not found' error is not actually an error in this case (hence the operation succeeds). It happens because we first attempt to destroy everything on the Juju side, which involves removing machines and their instances, and then, without waiting for that to complete (because -t 0 was specified), we attempt to destroy the cloud instances directly. In other words, we try to destroy cloud instances twice, almost simultaneously, so there is a good chance that some instances listed for destruction by the second pass are already being destroyed by the first.

The code does try to cater for this and has a code path that ignores 'not found' errors. Unfortunately, the check that inspects errors does not correctly recognize GCE errors. This is the part I'll correct, so that users are no longer notified of errors that are not actually relevant.

I'll also double-check the other providers to ensure that they handle this case correctly.

Changed in juju:
status: Triaged → In Progress
Anastasia (anastasia-macmood) wrote :

Big merge PR that includes this change, going into develop towards 2.7: https://github.com/juju/juju/pull/10367

Changed in juju:
status: In Progress → Fix Committed
milestone: none → 2.7-beta1
Changed in juju:
status: Fix Committed → Fix Released