ubuntu_lxc hang on Impish 5.13 with api_test.py test

Bug #1952036 reported by Po-Hsu Lin
This bug affects 1 person
Affects                 Status      Importance   Assigned to   Milestone
ubuntu-kernel-tests     New         Undecided    Unassigned
python3-lxc (Ubuntu)    New         Undecided    Unassigned
  Impish                Won't Fix   Undecided    Unassigned

Bug Description

The ubuntu_lxc test times out on some Impish test systems:
  * baltar.ppc64el.9
  * kernel03.s390x.zvm
  * kuzzle.arm64
  * rizzo.amd64
  * s2lp4.s390x.lpar

It did not hang on these systems, for the following reasons:
  * dryden.ppc64el - test setup failed at the very beginning
  * scobee-kernel.arm64 - unable to fetch GPG key on this maas (bug 1939871)
  * vought.amd64 - unable to fetch GPG key on this maas (bug 1939871)
  * appleton-kernel.arm64 - unable to fetch GPG key on this maas (bug 1939871)

The test log shows the api_test.py test timing out and being nuked:
 START ubuntu_lxc.api_test.py ubuntu_lxc.api_test.py timestamp=1637718036 timeout=900 localtime=Nov 24 01:40:36
 Persistent state client._record_indent now set to 2
 Persistent state client.unexpected_reboot now set to ('ubuntu_lxc.api_test.py', 'ubuntu_lxc.api_test.py')
 Waiting for pid 98177 for 900 seconds
 System python is too old, crash handling disabled
 Running 'python3 /usr/share/doc/python3-lxc/examples/api_test.py'
 Using image from local cache
 Unpacking the rootfs

 ---
 You just created an Ubuntu xenial s390x (20211106_07:55) container.

 To enable SSH, run: apt install openssh-server
 No default root or user password are set by LXC.
 Timer expired (900 sec.), nuking pid 98177

However, the process is still running:
$ ps aux | grep lxc
root 1857 0.0 0.5 27720 11588 pts/0 S 02:19 0:00 /usr/bin/python2 -u autotest/client/autotest-local --verbose autotest/client/tests/ubuntu_lxc/control
root 1858 0.0 0.5 27720 11588 pts/0 S 02:19 0:00 /usr/bin/python2 -u autotest/client/autotest-local --verbose autotest/client/tests/ubuntu_lxc/control
root 2980 0.0 0.0 225104 1640 ? Ssl 02:21 0:00 /usr/bin/lxcfs /var/lib/lxcfs
lxc-dns+ 3223 0.0 0.1 14572 2436 ? S 02:21 0:00 dnsmasq --conf-file=/dev/null -u lxc-dnsmasq --strict-order --bind-interfaces --pid-file=/run/lxc/dnsmasq.pid --listen-address 10.0.3.1 --dhcp-range 10.0.3.2,10.0.3.254 --dhcp-lease-max=253 --dhcp-no-override --except-interface=lo --interface=lxcbr0 --dhcp-leasefile=/var/lib/misc/dnsmasq.lxcbr0.leases --dhcp-authoritative
root 103371 0.0 0.5 19304 11728 pts/0 S 02:31 0:01 python3 /usr/share/doc/python3-lxc/examples/api_test.py
root 103423 0.0 0.3 19304 7304 ? Ss 02:32 0:00 [lxc monitor] /var/lib/lxc ae2b9000-4cce-11ec-a2a2-23d8ce9e6048
ubuntu 106780 0.0 0.1 8548 2368 pts/1 S+ 03:09 0:00 grep --color=auto lxc

This causes the whole test suite to hang until it is eventually killed by the Jenkins timeout setting.

I tried running this test manually on a KVM node; it hangs at "Getting the IP addresses":
$ time sudo /usr/share/doc/python3-lxc/examples/api_test.py
Getting instance for '5b086620-4cea-11ec-8f02-f9df989d7e3f'
Creating rootfs using 'download', arch=amd64
Using image from local cache
Unpacking the rootfs

---
You just created an Ubuntu xenial amd64 (20211123_16:46) container.

To enable SSH, run: apt install openssh-server
No default root or user password are set by LXC.
Testing the configuration
Testing the networking
Starting the container
Getting the interface names
Getting the IP addresses

Po-Hsu Lin (cypressyew)
tags: added: impish sru-20211108 ubuntu-lxc
Po-Hsu Lin (cypressyew)
description: updated
Po-Hsu Lin (cypressyew)
summary: - ubuntu_lxc hang on Impish 5.13 (api_test.py)
+ ubuntu_lxc hang on Impish 5.13 with api_test.py test
Francis Ginther (fginther) wrote :

This is expected due to the switch to cgroup v2 in Impish. This is really a test issue, not an issue with lxc.

Changed in python3-lxc (Ubuntu):
status: New → Invalid
Po-Hsu Lin (cypressyew) wrote :

This hang is caused by the following while loop:
    while not ips or count == 10:

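The test can't get the IP address with container.get_ips(), so ips is always empty. Because the condition is "not ips or count == 10", it stays true for as long as ips is empty no matter what count is, which makes the loop infinite.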

https://github.com/lxc/python3-lxc/blob/master/examples/api_test.py#L121

I've created an MP to fix this.
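
For illustration only, here is a minimal sketch of a bounded polling loop that gives up after a fixed number of attempts. It is not the upstream patch: wait_for_ips, its parameters, and the never_ready stand-in are hypothetical names made up for this example, and get_ips is assumed to be any zero-argument callable such as container.get_ips.

```python
import time


def wait_for_ips(get_ips, attempts=10, delay=1.0, sleep=time.sleep):
    """Poll get_ips() until it returns addresses or attempts run out.

    get_ips is assumed to be a zero-argument callable (hypothetical
    helper, not part of python3-lxc). Returns the addresses found as a
    list, or an empty list if none appeared within `attempts` polls.
    """
    for _ in range(attempts):
        ips = get_ips()
        if ips:
            return list(ips)
        sleep(delay)
    return []


# Stand-in for a container that never obtains an address.
def never_ready():
    return ()


# The bounded helper stops after `attempts` polls instead of spinning
# forever the way "while not ips or count == 10" does when ips is empty.
result = wait_for_ips(never_ready, attempts=10, delay=0, sleep=lambda _: None)
print(result)
```

With a loop like this the test would still fail when no address is assigned, but it would fail on its own instead of hanging until an outer timeout fires.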

Francis Ginther (fginther) wrote :

Restoring the state for python3-lxc as we're using the test from this project.

Changed in python3-lxc (Ubuntu):
status: Invalid → New
Po-Hsu Lin (cypressyew) wrote :

The fix has been applied upstream. It does not make this test pass, but it resolves the hang. We might need to SRU it to the python3-lxc package if the maintainer is not going to release it.
https://github.com/lxc/python3-lxc/commit/37ac09d7a4b58d96e1a89a2304abcc8972712da7

Brian Murray (brian-murray) wrote :

Ubuntu 21.10 (Impish Indri) has reached end of life, so this bug will not be fixed for that specific release.

Changed in python3-lxc (Ubuntu Impish):
status: New → Won't Fix
Cory Todd (corytodd)
tags: added: jammy sru-20230320
tags: added: oracle