Removing a lxd remote leaves services on the lxd worker

Bug #1978353 reported by Brian Murray
Affects: Auto Package Testing
Status: New
Importance: Undecided
Assigned to: Unassigned

Bug Description

I recently removed the lxd remote unit with IP address 10.44.124.140 and replaced it with 10.44.124.197. After executing `mojo run`, I noticed that the lxd worker still had stray services on it:

ubuntu@juju-4d1272-prod-proposed-migration-5:~$ ls -lh /<email address hidden>*
/<email address hidden>:
total 4.0K
-rw-r--r-- 1 root root 132 May 11 16:34 autopkgtest-lxd-remote.conf

/<email address hidden>:
total 4.0K
-rw-r--r-- 1 root root 132 May 11 16:34 autopkgtest-lxd-remote.conf

By contrast, a working lxd remote has the following services on the lxd worker:

ubuntu@juju-4d1272-prod-proposed-migration-5:~$ ls -lh /<email address hidden>*
lrwxrwxrwx 1 root root 83 Jun 10 19:14 /<email address hidden> -> /var/lib/juju/agents/unit-autopkgtest-lxd-worker-0/charm/units/autopkgtest@.service
lrwxrwxrwx 1 root root 83 Jun 10 19:14 /<email address hidden> -> /var/lib/juju/agents/unit-autopkgtest-lxd-worker-0/charm/units/autopkgtest@.service
lrwxrwxrwx 1 root root 83 Jun 10 19:14 /<email address hidden> -> /var/lib/juju/agents/unit-autopkgtest-lxd-worker-0/charm/units/autopkgtest@.service

/<email address hidden>:
total 4.0K
-rw-r--r-- 1 root root 132 Jun 10 19:14 autopkgtest-lxd-remote.conf

/<email address hidden>:
total 4.0K
-rw-r--r-- 1 root root 132 Jun 10 19:14 autopkgtest-lxd-remote.conf

/<email address hidden>:
total 4.0K
-rw-r--r-- 1 root root 132 Jun 10 19:14 autopkgtest-lxd-remote.conf

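Clearly something is trying to clean up the services for the removed remote, but it is incomplete: the service symlinks are gone, while the directories containing `autopkgtest-lxd-remote.conf` are left behind. This then results in errors in the `cloud-worker-maintenance` service: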

May 22 13:52:19 juju-4d1272-prod-proposed-migration-5 /home/ubuntu/autopkgtest-cloud/worker/worker[482763]: WARNING: Testbed failure. Retrying in 5 minutes... Log follows:
May 22 13:52:19 juju-4d1272-prod-proposed-migration-5 /home/ubuntu/autopkgtest-cloud/worker/worker[482763]: ERROR: autopkgtest [13:52:17]: starting date: 2022-05-22
                                                                                                            autopkgtest [13:52:17]: git checkout: 167b209 lxd: Increase various timeouts
                                                                                                            autopkgtest [13:52:17]: host juju-4d1272-prod-proposed-migration-5; command line: /home/ubuntu/autopkgte>
                                                                                                            <VirtSubproc>: failure: ['lxc', 'launch', '--ephemeral', 'lxd-armhf-10.44.124.140:autopkgtest/ubuntu/kin>
                                                                                                            Fatal Python error: Cannot recover from stack overflow.
                                                                                                            Python runtime state: initialized

                                                                                                            Current thread 0x00007fa3c0b46740 (most recent call first):
                                                                                                              File "/home/ubuntu/autopkgtest/lib/adtlog.py", line 96 in debug
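
To illustrate the kind of check that would catch these leftovers, here is a minimal sketch. It is not the charm's actual cleanup code. The function name `find_orphan_dropins` is hypothetical, and because the paths in the listings above are redacted, the naming scheme is an assumption: each service is taken to be an entry ending in `.service`, with its configuration directory named the same plus `.d`.

```python
import os
from pathlib import Path


def find_orphan_dropins(unit_dir):
    """Return configuration directories that have no matching service entry.

    ASSUMPTION (paths are redacted in the report): a service is an entry
    named "<name>.service" and its configuration directory is
    "<name>.service.d" in the same parent directory.
    """
    unit_dir = Path(unit_dir)
    orphans = []
    for entry in sorted(unit_dir.iterdir()):
        # Only consider real directories that follow the assumed naming.
        if entry.is_symlink() or not entry.is_dir():
            continue
        if not entry.name.endswith(".service.d"):
            continue
        # Strip the trailing ".d" to get the service entry it belongs to.
        unit = unit_dir / entry.name[: -len(".d")]
        # lexists() is also true for dangling symlinks, so a service
        # symlink counts as present even if its target has gone away.
        if not os.path.lexists(unit):
            orphans.append(entry)
    return orphans
```

Run against the directory holding the worker's service files, this would return the two directories shown in the first listing. A cleanup step could then remove them, or at least log them, before new service files are created.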

Tags: adt-32
Brian Murray (brian-murray) wrote :

Also, if we regularly replace lxd remote systems, we could end up re-using an IP address. Any service files left over from the previous remote with that address could then cause issues when the new service files are created.

tags: added: adt-32