Comment 1 for bug 2003135

Erik Lönroth (erik-lonroth) wrote :

More input.

Today I upgraded the controller from 2.9.37 to 2.9.38:

    juju upgrade-controller

That worked.

From there, I normally upgrade all models with this command:

    for m in $(juju models --all --format json | jq -r '.models[]["model-uuid"]'); do echo $m; juju upgrade-model -m $m; done

It basically loops over all models in the controller and tries to upgrade them.

This works only partially. On the first round, the output looks like this:

e444c390-4053-4838-868a-8436dd861b20
best version:
    2.9.38
started upgrade to 2.9.38
e444c390-4053-4838-868a-8436dd861b20
best version:
    2.9.38
started upgrade to 2.9.38
2c7a52c8-76e3-4b49-8f0d-d4e7f75ddc9e
no upgrades available
997bd7de-5062-4a92-8ee9-627b16c3c3d4
best version:
    2.9.38
started upgrade to 2.9.38
ERROR cannot find tool version from simple streams: creating environ for model "controller" (2eb4342a-966c-446d-8fec-3e06bd45c61b): Get "https://192.168.211.2:8443/1.0/profiles?project=default": x509: certificate is valid for 127.0.0.1, ::1, not 192.168.211.2
ERROR some agents have not upgraded to the current model version 2.9.37: machine-0, machine-3, unit-besu-0, unit-prysm-beacon-1
cdccba01-df55-493a-8f80-23e376840d4c
best version:
    2.9.38
started upgrade to 2.9.38
a68e2aae-e590-494e-8f0f-c193ba07101a
best version:
    2.9.38
started upgrade to 2.9.38

... and so on, a mix of successful upgrades and ERRORs.

So, I continue to run this command, over and over, until most models are upgraded.

The "certificate errors" go away during the upgrade (after multiple runs), and eventually all models are upgraded.
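
For reference, the "run it over and over" step could be scripted roughly as below. This is only a sketch, not something I have verified against Juju itself: the function name upgrade_with_retry and the round limit are made up for illustration, and it assumes that juju upgrade-model exits non-zero whenever it prints an ERROR. Note that a zero exit only means the upgrade was started, not that it has finished.

```shell
# Sketch: retry `juju upgrade-model` for the models that failed, up to a
# maximum number of rounds. Assumption: juju exits non-zero on ERROR.
#
# Usage (hypothetical helper, not part of juju):
#   upgrade_with_retry 5 $(juju models --all --format json | jq -r '.models[]["model-uuid"]')
#
# Returns 0 if every model eventually accepted the upgrade command, 1 otherwise.
# Plain POSIX sh has no local variables, so the names below are global.
upgrade_with_retry() {
    max_rounds=$1
    shift
    round=1
    while [ "$#" -gt 0 ] && [ "$round" -le "$max_rounds" ]; do
        echo "round $round: $# model(s) pending" >&2
        failed=""
        for m in "$@"; do
            echo "$m" >&2
            # Remember the models whose upgrade command reported an error.
            juju upgrade-model -m "$m" >&2 || failed="$failed $m"
        done
        # Model UUIDs contain no whitespace, so word splitting is safe here.
        set -- $failed
        round=$((round + 1))
    done
    [ "$#" -eq 0 ]
}
```

A model whose machines are powered off will never succeed, so the round limit is what stops the loop in that case. A pause between rounds would probably be kinder to the controller.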

===== Unrelated =====

At this point, only one model remains which gives an error like this:

    juju upgrade-model -m d84f172a-9f81-4cd5-8759-2cc786cdec41
    ERROR some agents have not upgraded to the current model version 2.9.37: machine-0, machine-3, unit-besu-0, unit-prysm-beacon-1

So, I inspected the model and saw that some agents have lost communication with the controller (see the attached screenshot).

This is all fine, since the machines in that model are turned off, but I think the "ERROR" should be downgraded to a WARNING.