Juju doesn't remove KVM virtual machines on maas nodes when using "juju remove-unit"
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Canonical Juju | Fix Released | High | Joseph Phillips |
Bug Description
Hello,
Let me know if you need additional information. I'm not sure whether this bug only happens in a nested virtualized environment, so please double-check.
Environment description:
Juju version: 2.9.32
MaaS version: 3.2
I used 3 virtual machines on my Ubuntu 20.04 workstation: one is the MaaS server, one is the Juju controller, and the third is "node1", a node used for deploying Juju applications.
I also used my local workstation as the Juju client.
I installed MaaS 3.2 as a snap on the first node, then added 2 machines using the MAC addresses of the VMs I created on my workstation. I used the "virsh" power type, and it works without any problem.
At this point, I created the environment from my workstation with:
$ juju clouds --local # Using MaaS
$ juju add-credential maas-cloud
$ juju bootstrap maas-cloud --to juju-controller
When the bootstrap completed, I created a new application with the following command:
$ juju deploy ubuntu --series focal
The third VM has been deployed by MaaS without any problem.
Then I ran the following:
$ juju deploy ubuntu --series bionic testubuntu --to kvm:0
This command creates a new VM inside machine 0 (machine 0 is itself a VM, so this is nested virtualization).
When it finished, the output of "juju status" was the following:
marino-
Model    Controller         Cloud/Region       Version  SLA          Timestamp
default  maascloud-default  maascloud/default  2.9.32   unsupported  13:10:46+02:00

App         Version  Status  Scale  Charm   Channel  Rev  Exposed  Message
testubuntu  18.04    active      1  ubuntu  stable    20  no
ubuntu      20.04    active      1  ubuntu  stable    20  no

Unit           Workload  Agent  Machine  Public address  Ports  Message
testubuntu/5*  active    idle   0/kvm/3  192.168.100.60
ubuntu/0*      active    idle   0        192.168.100.54

Machine  State    DNS             Inst id              Series  AZ       Message
0        started  192.168.100.54  node1                focal   default  Deployed
0/kvm/3  started  192.168.100.60  juju-a7ae43-0-kvm-3  bionic           Container started
Also, let's check the status of running VMs on machine 0:
marino-
setlocale: No such file or directory
 Id   Name                  State
-------------------------------------
 4    juju-a7ae43-0-kvm-3   running
Now, let's try to remove the unit:
marino-
removing unit testubuntu/5
Juju status again:
marino-
Model    Controller         Cloud/Region       Version  SLA          Timestamp
default  maascloud-default  maascloud/default  2.9.32   unsupported  13:12:58+02:00

App         Version  Status   Scale  Charm   Channel  Rev  Exposed  Message
testubuntu           unknown      0  ubuntu  stable    20  no
ubuntu      20.04    active       1  ubuntu  stable    20  no

Unit       Workload  Agent  Machine  Public address  Ports  Message
ubuntu/0*  active    idle   0        192.168.100.54

Machine  State    DNS             Inst id  Series  AZ       Message
0        started  192.168.100.54  node1    focal   default  Deployed
That looks right, but if you check the running VMs again, 0/kvm/3 is still running:
marino-
setlocale: No such file or directory
 Id   Name                  State
-------------------------------------
 4    juju-a7ae43-0-kvm-3   running
Also, I can still ping the IP:
marino-
PING 192.168.100.60 (192.168.100.60) 56(84) bytes of data.
64 bytes from 192.168.100.60: icmp_seq=1 ttl=64 time=1.01 ms
64 bytes from 192.168.100.60: icmp_seq=2 ttl=64 time=0.958 ms
--- 192.168.100.60 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1000ms
rtt min/avg/max/mdev = 0.958/0.
Also, in the MaaS web interface and the database, the IP, the interface and the instance have all been removed. So we are left with a "zombie" VM that is still running with a reachable IP address that MaaS no longer tracks, which can lead to duplicate IPs on the network.
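For anyone who wants to check their own hosts for the same leftover, here is a minimal sketch of the comparison described above: the guests libvirt still runs versus the instance IDs Juju still reports. The function name find_orphan_domains is hypothetical, and the "juju-" name prefix is an assumption based only on the instance ID seen in this report (juju-a7ae43-0-kvm-3). It only lists candidates and does not stop or delete anything.

```shell
# find_orphan_domains DOMAINS KNOWN   (hypothetical helper, not part of Juju)
#   DOMAINS: newline-separated libvirt domain names found on the host
#   KNOWN:   newline-separated instance IDs that Juju still reports
# Prints every domain whose name starts with "juju-" (assumed naming
# convention) but that is absent from KNOWN.
find_orphan_domains() {
    printf '%s\n' "$1" | while IFS= read -r domain; do
        case "$domain" in
            juju-*) ;;          # looks like a Juju-created guest
            *) continue ;;      # skip blank lines and unrelated VMs
        esac
        # -F fixed string, -x whole-line match, -q quiet
        if ! printf '%s\n' "$2" | grep -Fxq -- "$domain"; then
            printf '%s\n' "$domain"
        fi
    done
}

# Values taken from the outputs in this report, after "juju remove-unit":
domains='juju-a7ae43-0-kvm-3'   # still listed by virsh on machine 0
known='node1'                   # the only "Inst id" left in juju status
find_orphan_domains "$domains" "$known"   # prints juju-a7ae43-0-kvm-3
```

On a real system the first argument could presumably come from "virsh list --name" run on machine 0, and the second from the "Inst id" column of "juju status"; how those are collected is left out here because it depends on the setup.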
Note that this doesn't happen if I use an LXD container instead of a KVM machine (the container is removed, according to the output of "sudo lxc list").
Thank you.
Regards,
Marco
Changed in juju:
status: New → Triaged
importance: Undecided → Medium
assignee: nobody → Joseph Phillips (manadart)
Changed in juju:
status: Triaged → In Progress
importance: Medium → High
milestone: none → 2.9.34
Changed in juju:
status: In Progress → Fix Committed
Changed in juju:
status: Fix Committed → Fix Released
Do the juju logs show any errors or other indication that there was an issue?