OpenStack provider: retry-provisioning doesn't work for `Quota exceeded for ...`
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Canonical Juju | Fix Released | High | Ian Booth |
Bug Description
Hi,
My model is running Juju 2.8.9 and my client is 2.9.9. While deploying a Juju bundle, I ran out of quota. I quickly corrected that, but there seems to be no way to retry provisioning of the failed machine:
| 10 down pending focal cannot run instance: Unauthorised URL https:/
caused by: request (https:/
A `juju retry-provisioning 10` doesn't work. As discussed with Ian, this is due to the error code being a 403 indicating permissions/
Any chance we could allow retrying provisioning of machines in this state? Maybe allow retry-provisioning for all 4XX error codes or with a `--force` option to `retry-
Changed in juju:
status: Fix Committed → Fix Released
Juju automatically retries provisioning for machines whose provisioning errors it considers transient. I looked into this in more detail: Juju decides whether to retry automatically by interpreting the error code. The retry-provisioning command, by contrast, signals that provisioning should be retried at the user's request, but only if the machine status is "error" or "provisioning error".
juju show-machine does seem to indicate the machine is in error
$ juju show-machine 16
machines:
  "10":
    juju-status:
      current: down
      message: agent is not communicating with the server
      since: 03 Aug 2021 22:37:33Z
    instance-id: pending
    machine-status:
      current: provisioning error
      message: |-
        cannot run instance: Unauthorised URL https://xxxx:8774/v2.1/servers
        caused by: request (https://xxxx:8774/v2.1/servers) returned unexpected status: 403; error info: {"forbidden": {"code": 403, "message": "Quota exceeded for cores: Requested 2, but already used 30 of 31 cores"}}
      since: 03 Aug 2021 22:37:33Z
    modification-status:
      current: idle
      since: 03 Aug 2021 22:34:33Z
    series: bionic
    constraints: root-disk-source=volume
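
For illustration only, the eligibility rule described above (a user-requested retry is accepted only when the machine status is "error" or "provisioning error") could be sketched as follows. This is not Juju's code: the function name `can_retry_provisioning`, the `RETRYABLE_STATUSES` set, and the dictionary layout are assumptions that loosely mirror the `show-machine` output.

```python
# Hypothetical sketch of the retry-provisioning eligibility rule.
# Names and data layout are illustrative, not taken from the Juju source.

RETRYABLE_STATUSES = {"error", "provisioning error"}


def can_retry_provisioning(machine: dict) -> bool:
    """Return True if the machine's status allows a user-requested retry."""
    # A missing machine-status section is treated as "not eligible".
    status = machine.get("machine-status", {}).get("current", "")
    return status in RETRYABLE_STATUSES


# Values copied from the show-machine output for machine "10".
machine_10 = {
    "juju-status": {"current": "down"},
    "instance-id": "pending",
    "machine-status": {"current": "provisioning error"},
}

print(can_retry_provisioning(machine_10))
```

Under this rule the machine shown above would qualify for a retry, which is why the reported status alone does not explain the failure of `juju retry-provisioning 10`.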
I am surprised, though, that the 403 does not put the model into a suspended state, since a 403 should be interpreted as an invalid credential. That's not what we want here, but it's what I would have expected to see. Ideally, a different HTTP status code would be used for quota exceeded.
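
Since the same 403 status covers both cases, one way to tell them apart would be to inspect the error body. The sketch below is an assumption about how such a check might look, not a description of the fix that was released; the function name `classify_provisioning_error` and the labels `"quota"`, `"credential"`, and `"other"` are hypothetical. The sample body is the one from the error message above.

```python
import json

# Hypothetical sketch: separate a quota failure from a credential failure
# when both arrive as HTTP 403. Not taken from the Juju source.


def classify_provisioning_error(status_code: int, body: str) -> str:
    """Return 'quota', 'credential', or 'other' for a failed request."""
    if status_code != 403:
        return "other"
    try:
        payload = json.loads(body)
    except ValueError:
        # An unparseable 403 body falls back to the default reading of 403.
        return "credential"
    message = ""
    if isinstance(payload, dict):
        forbidden = payload.get("forbidden")
        if isinstance(forbidden, dict):
            message = str(forbidden.get("message", ""))
    if message.lower().startswith("quota exceeded"):
        return "quota"
    return "credential"


body = (
    '{"forbidden": {"code": 403, "message": "Quota exceeded for cores: '
    'Requested 2, but already used 30 of 31 cores"}}'
)
print(classify_provisioning_error(403, body))
```

Matching on the message text is fragile, which is why a distinct status code for quota errors would be the cleaner solution.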