Destroying LXD containers does not appear in MAAS machines

Bug #2076969 reported by Natalia Litvinova

This bug affects 1 person
 Affects        | Status     | Importance | Assigned to | Milestone
----------------+------------+------------+-------------+-----------
 Canonical Juju | Incomplete | Undecided  | Unassigned  |
 MAAS           | Invalid    | Undecided  | Unassigned  |

Bug Description

I have a Juju controller based on MAAS. After deploying and then removing containers on a bare-metal machine, I can still see those containers in the instances tab, even though lxc list --all-projects comes back empty on the machine. The biggest problem is that MAAS does not free the containers' IP addresses, and my small OAM network has run out of IPs.
Is there a way to make MAAS refresh the container list?

Versions:
MAAS 3.4.3
Juju 3.5.1

Changed in maas:
assignee: nobody → Anton Troyanov (troyanov)
status: New → In Progress
Anton Troyanov (troyanov) wrote :

After inspecting the logs on the affected environment, I don't think this is a MAAS issue.

Juju calls the MAAS API with POST /MAAS/api/2.0/devices/ to notify MAAS of newly created containers.
When a container is deleted, it should call DELETE /MAAS/api/2.0/devices/<system_id>/:

regiond.log.1:2024-08-12 10:44:31 regiond: [info] 127.0.0.1 POST /MAAS/api/2.0/devices/?op= HTTP/1.1 --> 200 OK (referrer: -; agent: Go-http-client/2.0)
regiond.log.1:2024-08-12 10:44:50 regiond: [info] 127.0.0.1 DELETE /MAAS/api/2.0/devices/a8kbwg/ HTTP/1.1 --> 204 NO_CONTENT (referrer: -; agent: Go-http-client/2.0)
regiond.log.1:2024-08-12 10:45:09 regiond: [info] 127.0.0.1 DELETE /MAAS/api/2.0/devices/qtqyx6/ HTTP/1.1 --> 204 NO_CONTENT (referrer: -; agent: Go-http-client/2.0)

However, for some unknown reason, some devices were never removed, and according to the nginx and regiond logs no DELETE calls for them ever reached the MAAS API. Maybe Juju failed to call MAAS? I didn't find any relevant logs on the Juju side, though.

Some of the orphaned devices (there are plenty of them):

 system_id | hostname
-----------+----------------------
 qp4dmb | juju-d66aa4-11-lxd-0
 wkp673 | juju-d66aa4-0-lxd-19
 qcy7gd | juju-d66aa4-1-lxd-10
 pf37pa | juju-d66aa4-0-lxd-3
 4ama3t | juju-d66aa4-1-lxd-17
 gfmaws | juju-d66aa4-0-lxd-34
 nxg873 | juju-d66aa4-0-lxd-2
 wqtbfh | juju-d66aa4-2-lxd-2
 mesrsb | juju-d66aa4-1-lxd-0
 fhge4e | juju-d66aa4-0-lxd-5
 ge3g4s | juju-d66aa4-0-lxd-35
 yraf7w | juju-d66aa4-0-lxd-29
 kawc36 | juju-d66aa4-0-lxd-36
 ...
 ...

We picked one device to check whether there had been a delete attempt, but the search returned no results:
$ grep -r "DELETE /MAAS/api/2.0/devices/kawc36"
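
The same check can be run over the whole list of suspected orphans at once. The following is a minimal sketch, not a MAAS or Juju tool: the function names are hypothetical, the sample lines are trimmed copies of the regiond entries quoted above, and it assumes that any logged DELETE request for a device counts as a delete attempt regardless of its status code.

```python
import re

# Matches the request part of a regiond or nginx access-log line, e.g.
# "DELETE /MAAS/api/2.0/devices/kawc36/ HTTP/1.1".
DELETE_RE = re.compile(r"DELETE /MAAS/api/2\.0/devices/([a-z0-9]+)/ HTTP")


def deleted_ids(log_lines):
    """Return the set of system_ids that appear in a logged DELETE request."""
    found = set()
    for line in log_lines:
        match = DELETE_RE.search(line)
        if match:
            found.add(match.group(1))
    return found


def never_deleted(system_ids, log_lines):
    """Return the system_ids (in input order) with no DELETE request logged."""
    seen = deleted_ids(log_lines)
    return [sid for sid in system_ids if sid not in seen]


# Trimmed sample lines based on the regiond log excerpt in this comment.
sample_log = [
    "2024-08-12 10:44:50 regiond: [info] 127.0.0.1 DELETE /MAAS/api/2.0/devices/a8kbwg/ HTTP/1.1 --> 204 NO_CONTENT",
    "2024-08-12 10:45:09 regiond: [info] 127.0.0.1 DELETE /MAAS/api/2.0/devices/qtqyx6/ HTTP/1.1 --> 204 NO_CONTENT",
]

print(never_deleted(["a8kbwg", "kawc36", "qp4dmb"], sample_log))
# prints ['kawc36', 'qp4dmb']
```

In practice the log lines would be read from the regiond and nginx log files rather than from an inline list.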

To solve the problem, we manually removed the orphaned devices using the MAAS CLI:
$ maas admin device delete kawc36

http/access.log:10.4.26.244 - - [14/Aug/2024:09:38:12 +0000] "DELETE /MAAS/api/2.0/devices/kawc36/ HTTP/1.1" 204 0 "-" "Python-httplib2/0.20.2 (gzip)"
regiond.log:2024-08-14 09:38:12 regiond: [info] 127.0.0.1 DELETE /MAAS/api/2.0/devices/kawc36/ HTTP/1.1 --> 204 NO_CONTENT (referrer: -; agent: Python-httplib2/0.20.2 (gzip))
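
With many orphaned devices, the manual removal above can be scripted. This is a rough sketch only: it assumes the MAAS CLI is already logged in under the profile name admin, as in the command above, and the helper names and the dry_run flag are hypothetical. It defaults to a dry run so the commands can be reviewed before anything is deleted.

```python
import subprocess


def build_delete_commands(system_ids, profile="admin"):
    """Return one `maas <profile> device delete <system_id>` argv list per ID."""
    return [["maas", profile, "device", "delete", sid] for sid in system_ids]


def delete_devices(system_ids, profile="admin", dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in build_delete_commands(system_ids, profile):
        print(" ".join(argv))
        if not dry_run:
            # check=True stops at the first failing command.
            subprocess.run(argv, check=True)


# Dry run over two of the orphaned system_ids listed above: prints the
# commands without calling the MAAS CLI.
delete_devices(["qp4dmb", "wkp673"])
```

Passing dry_run=False runs the same commands for real, so the ID list should first be checked against lxc list --all-projects on each host to confirm that the containers are really gone.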

Changed in maas:
status: In Progress → Invalid
Joseph Phillips (manadart) wrote :

Juju has logic in its machine undertaker to release container addresses, which in the MAAS provider deletes the devices as Anton has alluded to.

Questions:
- Were the containers both provisioned and removed using Juju?
- What is the configured container-networking-method for the model in question?

Changed in juju:
status: New → Incomplete
Changed in maas:
assignee: Anton Troyanov (troyanov) → nobody