Juju keeps creating OpenStack VMs if it cannot allocate a floating IP

Bug #1969309 reported by Angelos Kolaitis
This bug affects 1 person
Affects: Canonical Juju
Status: Fix Released
Importance: High
Assigned to: Angelos Kolaitis
Milestone: 2.9.31

Bug Description

### Summary

When creating a new machine on an OpenStack cloud, Juju uses the `allocate-public-ip` constraint to decide whether it should allocate and/or attach a floating IP to the Nova instance.

If it fails to allocate a public IP, it abandons the machine creation attempt but does not clean up the Nova instance it has already created. It then retries, so each failed attempt creates and abandons another instance.

### Reproduce

1. In an OpenStack project, allocate and attach floating IPs until the project's floating IP quota is reached.
2. On a controller configured for this cloud, try to add a new machine with `juju add-machine --constraints 'allocate-public-ip=true'`.
3. Wait for Juju to attempt to start the machine.

### juju status output

The output of `juju status` alternates between:

```
Machine State DNS Inst id Series AZ Message
0 pending pending focal instance "83951fe2-36bb-409a-ab66-973be946c925" has status BUILD, wait 10 seconds before retry, attempt 1
```

and

```
Machine State DNS Inst id Series AZ Message
0 pending pending focal failed to start machine 0 (cannot allocate a public IP as needed: failed to allocate a floating ip
caused by: request (https://zerostack.open-cloud.xyz:9696/v2.0/floatingips) returned unexpected status: 409; error info: {"NeutronError": {"type": "OverQuota", "message": "Quota exceeded for resources: ['floatingip'].", "detail": ""}}), retrying in 10s (9 more attempts)
```

until it finally times out and becomes:

```
0 down pending focal cannot allocate a public IP as needed: failed to allocate a floating ip
caused by: request (https://zerostack.open-cloud.xyz:9696/v2.0/floatingips) returned unexpected status: 409; error info: {"NeutronError": {"type": "OverQuota", "message": "Quota exceeded for resources: ['floatingip'].", "detail": ""}}
```

### openstack server list

Even though the machine is never created successfully, Juju creates and abandons a number of Nova instances:

```
ubuntu@wayfarer ~ $ o server list
+--------------------------------------+--------------------------+--------+---------------------------------+--------------+--------------+
| ID | Name | Status | Networks | Image | Flavor |
+--------------------------------------+--------------------------+--------+---------------------------------+--------------+--------------+
| 7128bfa0-054f-4921-b63c-311278cf46cb | juju-ba629f-default-0 | ACTIVE | dev=10.0.2.50 | ubuntu-20.04 | m1.small |
| 51cf173a-9f3f-4ffa-bdac-1c017dace5a5 | juju-ba629f-default-0 | ACTIVE | dev=10.0.2.60 | ubuntu-20.04 | m1.small |
| 4f0c2f91-cca8-4df0-a05a-fe77a73b9e15 | juju-ba629f-default-0 | ACTIVE | dev=10.0.2.107 | ubuntu-20.04 | m1.small |
| e3851eee-be1f-4517-a925-50b1b968ff61 | juju-ba629f-default-0 | ACTIVE | dev=10.0.3.109 | ubuntu-20.04 | m1.small |
| d6a33325-3d16-4847-9f1a-585d4bfb6fa8 | juju-ba629f-default-0 | ACTIVE | dev=10.0.2.122 | ubuntu-20.04 | m1.small |
| fc5266f3-bfd1-4282-9cf3-05c9fec56bca | juju-ba629f-default-0 | ACTIVE | dev=10.0.2.176 | ubuntu-20.04 | m1.small |
| edfe3afb-c87a-46ac-9a59-415f280aebea | juju-ba629f-default-0 | ACTIVE | dev=10.0.0.4 | ubuntu-20.04 | m1.small |
| 8b5bcea2-db05-485c-bd7c-13c2224a3473 | juju-ba629f-default-0 | ACTIVE | dev=10.0.2.229 | ubuntu-20.04 | m1.small |
| 83951fe2-36bb-409a-ab66-973be946c925 | juju-ba629f-default-0 | ACTIVE | dev=10.0.0.43 | ubuntu-20.04 | m1.small |
| 14659ea9-c8a2-46bb-a09c-29867b3a9b81 | juju-ba629f-default-0 | ACTIVE | dev=10.0.1.137 | ubuntu-20.04 | m1.small |
| 0866922f-c6b8-4f98-984b-6ca0e6adbce9 | juju-ba629f-default-0 | ACTIVE | dev=10.0.2.70 | ubuntu-20.04 | m1.small |
...
```
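The abandoned instances all share one machine name, so a quick way to spot them is to group the server list by name. The following is a minimal sketch, not part of Juju. It assumes the list is exported as JSON (for example with `openstack server list -f json`) and that each entry carries `ID` and `Name` keys. The `server` type and the `duplicateNames` function are hypothetical names introduced for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"os"
	"sort"
)

// server holds the fields this sketch needs from each entry of the
// JSON server list. Other fields in the input are ignored.
type server struct {
	ID   string `json:"ID"`
	Name string `json:"Name"`
}

// duplicateNames decodes a JSON array of servers and returns, for every
// name used by more than one server, the IDs of the servers sharing it.
func duplicateNames(data []byte) (map[string][]string, error) {
	var servers []server
	if err := json.Unmarshal(data, &servers); err != nil {
		return nil, fmt.Errorf("cannot decode server list: %w", err)
	}
	byName := make(map[string][]string)
	for _, s := range servers {
		byName[s.Name] = append(byName[s.Name], s.ID)
	}
	// Keep only names that occur at least twice.
	for name, ids := range byName {
		if len(ids) < 2 {
			delete(byName, name)
		}
	}
	return byName, nil
}

func main() {
	data, err := io.ReadAll(os.Stdin)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	dups, err := duplicateNames(data)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// Sort the names so the report is stable between runs.
	names := make([]string, 0, len(dups))
	for name := range dups {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		fmt.Printf("%s: %d servers\n", name, len(dups[name]))
		for _, id := range dups[name] {
			fmt.Printf("  %s\n", id)
		}
	}
}
```

A shared name is only a hint: check which instance ID, if any, Juju reports for the machine before deleting anything.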

### Debugging

I believe the issue comes from https://github.com/juju/juju/blob/develop/provider/openstack/provider.go#L1339-L1356

I see that Juju terminates the instance (`e.terminateInstances`) if it fails to assign the public IP, but not if it fails to allocate a new one.

### Notes

This bug does not occur when bootstrapping new Juju controllers for OpenStack clouds.

Changed in juju:
milestone: none → 2.9-next
status: New → Triaged
Harry Pidcock (hpidcock) wrote :
Changed in juju:
milestone: 2.9-next → 2.9.30
assignee: nobody → Angelos Kolaitis (aggkolaitis)
status: Triaged → Fix Committed
status: Fix Committed → In Progress
Ian Booth (wallyworld)
Changed in juju:
importance: Undecided → High
status: In Progress → Fix Committed
Changed in juju:
milestone: 2.9.30 → 2.9.31
Changed in juju:
status: Fix Committed → Fix Released