juju controllers --refresh fails (very slowly)

Bug #1732349 reported by Paul Gear
This bug affects 1 person
Affects: Canonical Juju
Status: Fix Released
Importance: High
Assigned to: Anastasia

Bug Description

On one of our juju controllers, 'juju controllers' reports out-of-date information. 'juju controllers --refresh' fails to refresh the data due to the following error:

error updating cached details for "CONTROLLER-NAME": model UUID has been removed

It also takes 1m 48.340s to fail: https://pastebin.canonical.com/203269/

Tags: canonical-is
Paul Gear (paulgear) wrote :

For the record, the controller mentioned above has 197 models, not 2 as the pastebin claims, and has 3 machines, presumably configured in HA mode (although this hasn't been confirmed).

Paul Gear (paulgear) wrote :

See also https://bugs.launchpad.net/juju/+bug/1732353

John A Meinel (jameinel) wrote :

I know in 2.3 we've done quite a few rounds of fixes for models that
disappear while the command is running.
I'm not sure whether we've fixed this particular issue, but how we cache
controller details, etc. has changed in 2.3.

On Wed, Nov 15, 2017 at 10:44 AM, Paul Gear <email address hidden> wrote:

> See also https://bugs.launchpad.net/juju/+bug/1732353

Anastasia (anastasia-macmood) wrote :

@Paul Gear (paulgear),

This bug has been addressed in 2.3.

Essentially, it comes from a place where we ask for the status of a number of models but error out if we fail to retrieve it for any model in the list.

[Extra detail if you are curious:
Every time we reach out to a model, we get a model connection from the pool. The error you are seeing comes from the pool, for a model that has been marked for destruction. However, not all of the connections previously acquired for that model were returned, so the pool keeps the model in its own cache instead of removing it.]

The fix was to process all models despite any errors. The commit with the fix is https://github.com/juju/juju/commit/1c2d96883b41a01317412c29937e60e25d41eb8a and it went into 2.3-beta2.

The only workarounds I can suggest at the moment are to surgically remove the model that is causing the issue (there must still be some artifacts hanging around) or to restart the controller (which will clear out the connections to the model that have not been released). However, I strongly suspect that this could straighten itself out even on 2.2, given enough time for the model's connections to be released.

Changed in juju:
status: New → Fix Committed
milestone: none → 2.3-beta2
assignee: nobody → Anastasia (anastasia-macmood)
importance: Undecided → High
Paul Gear (paulgear) wrote :

On 15/11/17 17:54, John A Meinel wrote:
> I know in 2.3 we've done quite a few rounds of fixes for models that
> disappear while the command is running.
> I'm not sure whether we've fixed this particular issue, but how we cache
> controller details, etc. has changed in 2.3.

For the record, this is not due to a model which disappears during the
running of the list-models command.  This has been tried multiple times,
days apart.  So it seems like it might be triggered by any model which
is in the middle of destruction, which could last a long time if the
machines themselves have disappeared.

Paul Gear (paulgear)
tags: added: canonical-is
Changed in juju:
status: Fix Committed → Fix Released