juju upgrade-model "can not get manifests for jujud-operator"

Bug #2011639 reported by Haw Loeung
Affects: Canonical Juju
Status: Triaged
Importance: Medium
Assigned to: Unassigned
Milestone: none

Bug Description

Hi,

We're working through upgrading our Juju 2.x controllers to the latest release, 2.9.42. One model is stuck:

| $ juju upgrade-model -m admin/prod-mattermost --agent-version=2.9.42 --debug --agent-stream=proposed
| 23:30:13 INFO juju.cmd supercommand.go:56 running juju [2.9.42 7b871e782195bdac9c90f8a8f01723cc3e08ab92 gc go1.18.10]
| 23:30:13 DEBUG juju.cmd supercommand.go:57 args: []string{"juju", "upgrade-model", "-m", "admin/prod-mattermost", "--agent-version=2.9.42", "--debug", "--agent-stream=proposed"}
| 23:30:13 INFO juju.juju api.go:86 connecting to API addresses: [10.25.0.183:17070 10.25.0.182:17070 10.25.0.184:17070]
| 23:30:13 DEBUG juju.api apiclient.go:1152 successfully dialed "wss://10.25.0.183:17070/api"
| 23:30:13 INFO juju.api apiclient.go:687 connection established to "wss://10.25.0.183:17070/api"
| 23:30:13 INFO juju.juju api.go:86 connecting to API addresses: [10.25.0.183:17070 10.25.0.182:17070 10.25.0.184:17070]
| 23:30:13 DEBUG juju.api apiclient.go:1152 successfully dialed "wss://10.25.0.182:17070/api"
| 23:30:13 INFO juju.api apiclient.go:687 connection established to "wss://10.25.0.182:17070/api"
| 23:30:13 INFO juju.juju api.go:340 API endpoints changed from [10.25.0.182:17070 10.25.0.184:17070 10.25.0.183:17070] to [10.25.0.182:17070 10.25.0.183:17070 10.25.0.184:17070]
| 23:30:13 INFO juju.juju api.go:86 connecting to API addresses: [10.25.0.182:17070 10.25.0.183:17070 10.25.0.184:17070]
| 23:30:13 DEBUG juju.api apiclient.go:1152 successfully dialed "wss://10.25.0.184:17070/model/d5f2c078-362f-43c2-8c3d-36fabadfba11/api"
| 23:30:13 INFO juju.api apiclient.go:687 connection established to "wss://10.25.0.184:17070/model/d5f2c078-362f-43c2-8c3d-36fabadfba11/api"
| 23:30:15 DEBUG juju.api monitor.go:35 RPC connection died
| 23:30:15 DEBUG juju.api monitor.go:35 RPC connection died
| 23:30:15 DEBUG juju.cmd.juju.commands upgrademodel.go:553 upgradeModel failed cannot get architecture for jujud-operator:2.9.33: can not get manifests for jujud-operator:2.9.33: Get "https://index.docker.io/v2/jujusolutions/jujud-operator/manifests/2.9.33": non-successful response status=429
| 23:30:15 DEBUG juju.api monitor.go:35 RPC connection died
| ERROR cannot get architecture for jujud-operator:2.9.33: can not get manifests for jujud-operator:2.9.33: Get "https://index.docker.io/v2/jujusolutions/jujud-operator/manifests/2.9.33": non-successful response status=429
| 23:30:15 DEBUG cmd supercommand.go:537 error stack:
| cannot get architecture for jujud-operator:2.9.33: can not get manifests for jujud-operator:2.9.33: Get "https://index.docker.io/v2/jujusolutions/jujud-operator/manifests/2.9.33": non-successful response status=429
| github.com/juju/juju/rpc.(*Conn).Call:178:
| github.com/juju/juju/api.(*state).APICall:1251:
| github.com/juju/juju/api/client/modelupgrader.(*Client).UpgradeModel:66:

I'm specifying 2.9.42, but the upgrade appears to be querying 2.9.33 instead. I also tried 2.9.38, with the same result.

Haw Loeung (hloeung)
tags: added: canonical-is canonical-is-upgrades
Ian Booth (wallyworld) wrote :

Out of interest, why agent-stream=proposed?

Looks like we're hitting the Docker Hub API request limits. I thought we had paid to raise the number of allowed requests. This isn't a Juju issue per se, but we need to understand why docker.io is refusing to service the request.

Haw Loeung (hloeung) wrote :

Oh, I first tried without --agent-stream, then tried again with --agent-stream=proposed.

Harry Pidcock (hpidcock) wrote :

The problem here is that jujusolutions/jujud-operator has too many tags: toolVersionsForCAAS tries to resolve the available architectures for every tag, which is not really optimal.

We'll probably need to change toolVersionsForCAAS to filter out tags we no longer care about.
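
To make the proposed filtering concrete, here is a minimal Python sketch. It is not Juju's actual toolVersionsForCAAS logic (which is written in Go); the function names, the plain major.minor.patch tag format, and the "same minor series, newer than current" rule are all assumptions chosen for illustration. The point is only that narrowing the tag list first reduces the number of manifest requests sent to the registry.

```python
import re

# Assumption: release tags look like "2.9.42". Other tag shapes are ignored here.
_VERSION = re.compile(r"^(\d+)\.(\d+)\.(\d+)$")


def parse_version(tag):
    """Return (major, minor, patch) for a plain release tag, else None."""
    match = _VERSION.match(tag)
    return tuple(int(part) for part in match.groups()) if match else None


def candidate_tags(tags, current, target=None):
    """Keep only tags worth a manifest lookup (hypothetical rule).

    A tag is kept when it is in the same major.minor series as `current`
    and newer than it. When an explicit `target` is given, only that
    tag is kept, so a single manifest request would be needed.
    """
    cur = parse_version(current)
    if cur is None:
        raise ValueError(f"unparseable current version: {current!r}")
    tgt = parse_version(target) if target else None

    keep = []
    for tag in tags:
        version = parse_version(tag)
        if version is None:
            continue  # e.g. "latest" or pre-release tags
        if version[:2] != cur[:2] or version <= cur:
            continue  # different series, or not an upgrade
        if tgt is not None and version != tgt:
            continue  # caller asked for one specific version
        keep.append(tag)
    return sorted(keep, key=parse_version)
```

With tags `["2.8.9", "2.9.33", "2.9.38", "2.9.42", "3.1.0", "latest"]` and a current version of `2.9.29`, this keeps the three 2.9.x tags; passing `target="2.9.42"` narrows that to one.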

Ian Booth (wallyworld) wrote :

We do filter out all the irrelevant tags.

The issue is simply that we're hitting the Docker Hub API rate limit for unauthenticated requests. We've paid to increase the allowed requests, but I believe that only applies to authenticated requests.
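
One way to check how much anonymous quota remains is to read the rate-limit headers that Docker Hub returns on manifest requests. The sketch below follows the procedure from Docker's rate-limit documentation as best I recall it; the token URL, the `ratelimitpreview/test` repository, and the `ratelimit-limit` / `ratelimit-remaining` header names should be treated as assumptions and checked against the current docs. The helper names are my own.

```python
import json
import urllib.request

# Assumed endpoints, taken from Docker's rate-limit documentation.
TOKEN_URL = (
    "https://auth.docker.io/token"
    "?service=registry.docker.io&scope=repository:ratelimitpreview/test:pull"
)
MANIFEST_URL = "https://registry-1.docker.io/v2/ratelimitpreview/test/manifests/latest"


def parse_ratelimit(value):
    """Parse a header value such as '100;w=21600' into (count, window_seconds)."""
    count, _, params = value.partition(";")
    window = None
    for param in params.split(";"):
        key, _, val = param.strip().partition("=")
        if key == "w" and val.isdigit():
            window = int(val)
    return int(count), window


def fetch_anonymous_limits():
    """Return the anonymous pull limits reported for the caller's IP address."""
    with urllib.request.urlopen(TOKEN_URL, timeout=10) as resp:
        token = json.load(resp)["token"]
    # A HEAD request is used so that the check itself does not pull anything.
    request = urllib.request.Request(
        MANIFEST_URL, method="HEAD", headers={"Authorization": f"Bearer {token}"}
    )
    with urllib.request.urlopen(request, timeout=10) as resp:
        return {
            name: parse_ratelimit(resp.headers[name])
            for name in ("ratelimit-limit", "ratelimit-remaining")
            if resp.headers.get(name)
        }
```

For example, `parse_ratelimit("100;w=21600")` returns `(100, 21600)`, meaning 100 requests per 21600-second window. Running `fetch_anonymous_limits()` from a controller machine would show the quota seen from that machine's address, which is what matters for this failure.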

There's no short-term fix other than to wait a bit and retry. Longer term, we want to host the jujud agent OCI images on a different repo without such limitations.
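
The "wait a bit and retry" advice can be sketched as a generic retry loop that backs off when the server answers 429. This illustrates the technique only and is not something Juju does internally; the function names, the default delays, and the assumption that a `Retry-After` header may be present are all mine (Docker Hub may not send one, in which case the exponential delay is used).

```python
import random
import time
import urllib.error
import urllib.request


def retry_delay(attempt, retry_after=None, base=2.0, cap=300.0):
    """Seconds to wait before the next try; `attempt` is 0-based."""
    # Honour a numeric Retry-After header when the server supplies one.
    if retry_after is not None and retry_after.strip().isdigit():
        return min(float(retry_after), cap)
    # Otherwise back off exponentially: base, 2*base, 4*base, ... up to cap.
    return min(base * (2 ** attempt), cap)


def get_with_retry(url, attempts=5, opener=urllib.request.urlopen, sleep=time.sleep):
    """GET `url`, retrying only on HTTP 429 and re-raising anything else."""
    for attempt in range(attempts):
        try:
            with opener(url) as resp:
                return resp.read()
        except urllib.error.HTTPError as err:
            if err.code != 429 or attempt == attempts - 1:
                raise
            delay = retry_delay(attempt, err.headers.get("Retry-After"))
            # A little jitter keeps several clients from retrying in lockstep.
            sleep(delay + random.uniform(0, 1))
```

The `opener` and `sleep` parameters exist only so the loop can be exercised without network access or real waiting. For the command in this report, the manual equivalent is simply rerunning `juju upgrade-model` after a pause.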

Ian Booth (wallyworld) wrote :

It also seems that docker buildx has "helpfully" changed the OCI image format, and Docker Hub does not support older clients. They have also broken their own CLI tool.

https://bugs.launchpad.net/cloud-images/+bug/2007408

We'll need to regenerate the 2.9.42 images using the old format.

Changed in juju:
importance: Undecided → Medium
milestone: none → 2.9.43
status: New → Triaged
tags: added: jujud
Ian Booth (wallyworld) wrote :

This isn't medium. This is a critical blocker to upgrades. We need to repack the 2.9.42 OCI images so that people can upgrade to 2.9.42, and ensure we generate old-style images going forward until Juju can handle both formats.

Changed in juju:
importance: Medium → Critical
Harry Pidcock (hpidcock) wrote :

Changing this to high and moving it off 2.9. For now, we will support old Juju controllers that don't understand OCI manifests with attestations, SBOMs, etc., and we will need to keep doing so for a long time on our Docker Hub repo. Moving forward, we will need to start publishing new images with all these features, so this will likely be fixed in a later Juju release, and we will likely move away from Docker Hub at the same time.
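
For readers wondering what handling both formats involves when resolving architectures, here is a minimal sketch. It is not Juju's implementation. It assumes the registry returned a multi-platform document, either a Docker manifest list or an OCI image index, and that buildx attestation entries are recognisable by the `vnd.docker.reference.type` annotation and an `unknown` platform, which matches my understanding of how buildx publishes them. The function name is hypothetical.

```python
def architectures(manifest):
    """List the CPU architectures named in a manifest list or OCI image index.

    Entries that describe attestations (provenance, SBOMs) rather than
    runnable images are skipped.
    """
    if "manifests" not in manifest:
        # Single-image manifests keep the architecture in the image config blob.
        raise ValueError("not a manifest list or image index")

    found = []
    for entry in manifest["manifests"]:
        annotations = entry.get("annotations") or {}
        if annotations.get("vnd.docker.reference.type") == "attestation-manifest":
            continue  # buildx attestation entry, not an image
        arch = (entry.get("platform") or {}).get("architecture")
        if arch and arch != "unknown" and arch not in found:
            found.append(arch)
    return found
```

A client that only understands the older Docker media types, or that treats every entry as an image, can stumble on the newer index documents, which is consistent with the incompatibility described in this thread.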

Changed in juju:
importance: Critical → High
milestone: 2.9.43 → none
Changed in juju:
milestone: none → 3.4.1
assignee: nobody → Simon Richardson (simonrichardson)
assignee: Simon Richardson (simonrichardson) → Harry Pidcock (hpidcock)
Harry Pidcock (hpidcock)
Changed in juju:
assignee: Harry Pidcock (hpidcock) → nobody
importance: High → Medium
milestone: 3.4.1 → none