lxc-test-api-reboot will hang with autopkgtest

Bug #1776381 reported by Po-Hsu Lin
This bug affects 1 person
Affects               Status     Importance  Assigned to  Milestone
ubuntu-kernel-tests   Confirmed  Undecided   Unassigned
lxc (Ubuntu)          New        Undecided   Unassigned

Bug Description

Steps:
  1. Deploy Bionic on a bare-metal system.
  2. Enable deb-src, install the autopkgtest package
  3. sudo autopkgtest lxc -- null

Result:
  * The test hangs and a "reboot" LXC container is created. The last messages on the screen are:
    make[1]: Entering directory '/tmp/autopkgtest.JxRLRE/build.bSQ/src'
    make[1]: Nothing to be done for 'all-am'.
    make[1]: Leaving directory '/tmp/autopkgtest.JxRLRE/build.bSQ/src'
  * Connecting to the "reboot" container with "sudo lxc-console reboot" shows nothing beyond the console banner:
    Connected to tty 1
    Type <Ctrl+a q> to exit the console, <Ctrl+a Ctrl+a> to enter Ctrl+a itself
  * If you kill the job and run it again, the test proceeds this time, but lxc-test-api-reboot fails because the container has already been created.

This issue can be reproduced on an amd64 box and an s390x zKVM.

Po-Hsu Lin (cypressyew)
description: updated
Po-Hsu Lin (cypressyew) wrote :

If you leave it there for a long period, it will time out in the end:
make[1]: Leaving directory '/tmp/autopkgtest.ZiY11u/build.Nic/src'
FAIL: lxc-tests: lxc-test-api-reboot (9845s)
---
Terminated
---

Session terminated, terminating shell...bash: line 1: 15305 Terminated /tmp/autopkgtest.ZiY11u/build.Nic/src/debian/tests/exercise 2> >(tee -a /tmp/autopkgtest.ZiY11u/exercise-stderr >&2) > >(tee -a /tmp/autopkgtest.ZiY11u/exercise-stdout)
 ...terminated.
autopkgtest [06:26:24]: ERROR: timed out on command "su -s /bin/bash root -c set -e; export USER=`id -nu`; . /etc/profile >/dev/null 2>&1 || true; . ~/.profile >/dev/null 2>&1 || true; buildtree="/tmp/autopkgtest.ZiY11u/build.Nic/src"; mkdir -p -m 1777 -- "/tmp/autopkgtest.ZiY11u/exercise-artifacts"; export AUTOPKGTEST_ARTIFACTS="/tmp/autopkgtest.ZiY11u/exercise-artifacts"; export ADT_ARTIFACTS="$AUTOPKGTEST_ARTIFACTS"; mkdir -p -m 755 "/tmp/autopkgtest.ZiY11u/autopkgtest_tmp"; export AUTOPKGTEST_TMP="/tmp/autopkgtest.ZiY11u/autopkgtest_tmp"; export ADTTMP="$AUTOPKGTEST_TMP"; export DEBIAN_FRONTEND=noninteractive; export LANG=C.UTF-8; export DEB_BUILD_OPTIONS=parallel=16; unset LANGUAGE LC_CTYPE LC_NUMERIC LC_TIME LC_COLLATE LC_MONETARY LC_MESSAGES LC_PAPER LC_NAME LC_ADDRESS LC_TELEPHONE LC_MEASUREMENT LC_IDENTIFICATION LC_ALL;rm -f /tmp/autopkgtest_script_pid; set -C; echo $$ > /tmp/autopkgtest_script_pid; set +C; trap "rm -f /tmp/autopkgtest_script_pid" EXIT INT QUIT PIPE; cd "$buildtree"; export AUTOPKGTEST_NORMAL_USER=; export ADT_NORMAL_USER=; chmod +x /tmp/autopkgtest.ZiY11u/build.Nic/src/debian/tests/exercise; touch /tmp/autopkgtest.ZiY11u/exercise-stdout /tmp/autopkgtest.ZiY11u/exercise-stderr; /tmp/autopkgtest.ZiY11u/build.Nic/src/debian/tests/exercise 2> >(tee -a /tmp/autopkgtest.ZiY11u/exercise-stderr >&2) > >(tee -a /tmp/autopkgtest.ZiY11u/exercise-stdout);" (kind: test)
autopkgtest [06:26:24]: test exercise: -----------------------]
autopkgtest [06:26:24]: test exercise: - - - - - - - - - - results - - - - - - - - - -
exercise FAIL timed out
autopkgtest [06:26:24]: @@@@@@@@@@@@@@@@@@@@ summary
exercise FAIL timed out
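
The run above sat for 9845 seconds before autopkgtest gave up. A shorter per-test limit would make the hang surface sooner. The following is a minimal sketch, assuming the "--timeout-test" option described in autopkgtest(1); the helper name "run_lxc_autopkgtest", the "DRY_RUN" switch and the 1800-second default are illustrative and not part of the original report.

```shell
#!/bin/sh
# Sketch: run the lxc autopkgtest with a shorter per-test timeout.
# Assumption: --timeout-test=SECONDS as documented in autopkgtest(1).
# DRY_RUN=1 prints the command instead of executing it.

run_lxc_autopkgtest() {
    limit="${1:-1800}"   # hypothetical default, in seconds
    set -- sudo autopkgtest "--timeout-test=$limit" lxc -- null
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

Calling "run_lxc_autopkgtest 600" would run the same command as step 3 of the description, with a ten-minute limit per test.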

Free Ekanayaka (free.ekanayaka) wrote :

It might be a duplicate of https://github.com/lxc/lxd/issues/4485 (which is fixed in 3.0.1, now in -proposed I believe).

We'd need to see the logs of the LXD daemon to really tell, though.

Christian Brauner (cbrauner) wrote : Re: [Bug 1776381] Re: lxc-test-api-reboot will hang with autopkgtest

On Tue, Jun 12, 2018 at 12:46 PM, Free Ekanayaka
<email address hidden> wrote:
> It might be a duplicate of https://github.com/lxc/lxd/issues/4485 (which
> is fixed in 3.0.1, now in -proposed I believe).

This is an LXC integration test that is failing, not an LXD one. :)

>
> We'd need to see the logs of the LXD daemon to really tell, though.

Christian Brauner (cbrauner) wrote :

On Tue, Jun 12, 2018 at 8:39 AM, Po-Hsu Lin <email address hidden> wrote:
> If you leave it there for a long period, it will time out in the end:
> make[1]: Leaving directory '/tmp/autopkgtest.ZiY11u/build.Nic/src'
> FAIL: lxc-tests: lxc-test-api-reboot (9845s)

The API reboot tests will hang indefinitely if the container fails to
reboot properly. This can happen for a variety of reasons. The most
likely are that the container in question either does not have its
signal handlers set up correctly at the time it receives the reboot
signal, or that this is a broken busybox implementation that doesn't
handle reboots correctly.
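
Since a container that never completes its reboot blocks the caller indefinitely, a manual run of the test can be bounded with the coreutils "timeout" command. This is only a sketch: the helper name "bounded" is hypothetical, and invoking "lxc-test-api-reboot" directly assumes the test binary is available on PATH, which this report does not confirm.

```shell
#!/bin/sh
# Sketch: run a command and give up after a fixed number of seconds.
# coreutils timeout exits with status 124 when the limit is reached.

bounded() {
    limit="$1"
    shift
    status=0
    timeout "$limit" "$@" || status=$?
    if [ "$status" -eq 124 ]; then
        echo "timed out after ${limit}s: $*" >&2
    fi
    return "$status"
}

# Example (assumption: the test binary is on PATH and is run as root):
#   bounded 300 lxc-test-api-reboot
```

An exit status of 124 then distinguishes a hang from an ordinary test failure.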


Po-Hsu Lin (cypressyew) wrote :

Is there anything that I can do for debugging this?

Thank you.

Christian Brauner (cbrauner) wrote :

On Thu, Jun 14, 2018 at 04:19:39AM -0000, Po-Hsu Lin wrote:
> Is there anything that I can do for debugging this?

Hm, you could try manually creating a busybox container and trying to:
- shut it down
- reboot it
with lxc-stop

Christian
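
The suggestion above can be sketched as a small script. It is a hedged outline rather than a verified procedure: the container name passed in, the "step" helper and the "DRY_RUN" switch are illustrative, the template path is the one used elsewhere in this report, and "lxc-stop -r" is assumed to request a reboot as described in lxc-stop(1).

```shell
#!/bin/sh
# Sketch: shut down and then reboot a throwaway busybox container.
# DRY_RUN=1 prints each command instead of running it.
TEMPLATE=/usr/share/lxc/templates/lxc-busybox

step() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

probe_container() {
    name="$1"
    step lxc-create -t "$TEMPLATE" -n "$name" || return 1
    step lxc-start -n "$name" || return 1

    rc=0
    step lxc-stop -n "$name" || rc=$?        # clean shutdown
    echo "shutdown exit status: $rc"

    step lxc-start -n "$name" || return 1
    rc=0
    step lxc-stop -n "$name" -r || rc=$?     # reboot request
    echo "reboot exit status: $rc"

    step lxc-info -n "$name" -s              # print the resulting state
}
```

Run as root, for example "probe_container probe"; with DRY_RUN=1 it only lists the commands it would execute.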

Po-Hsu Lin (cypressyew)
Changed in ubuntu-kernel-tests:
status: New → Confirmed
Po-Hsu Lin (cypressyew)
tags: added: ubuntu-lxc
tags: added: bionic
Stéphane Graber (stgraber) wrote :

It's not currently failing based on recent logs anyway.

Changed in lxc (Ubuntu):
status: New → Invalid
Kelsey Steele (kelsey-steele) wrote :

Seeing this on bionic/fips 4.15.0-1061.69, s390x host s2lp3.

tags: added: 4.15 fips s390x sru-20210531
Po-Hsu Lin (cypressyew) wrote :

Reopening this bug. It went unreported for a while because of bug 1788574.

On the first attempt to run the test, it hangs with this issue.
On the second attempt, the test fails because the container already exists.

With the container properly removed, this is hitting us again on Bionic 4.15 s390x.
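
Because a leftover "reboot" container changes the outcome of the next run, removing it before each attempt keeps the results comparable. The sketch below assumes that lxc-info exits non-zero when the named container does not exist and that "lxc-destroy -f" stops a running container before destroying it; the function name "cleanup_stale" and the overridable "LXC_INFO" and "LXC_DESTROY" variables are illustrative.

```shell
#!/bin/sh
# Sketch: remove a stale container left behind by an interrupted run.
# The command names can be overridden, e.g. for a dry run.
LXC_INFO="${LXC_INFO:-lxc-info}"
LXC_DESTROY="${LXC_DESTROY:-lxc-destroy}"

cleanup_stale() {
    name="$1"
    # Assumption: lxc-info fails when the container is not defined.
    if "$LXC_INFO" -n "$name" >/dev/null 2>&1; then
        echo "removing stale container: $name"
        "$LXC_DESTROY" -f -n "$name"
    else
        echo "no stale container: $name"
    fi
}
```

Run as root before starting the test, for example "cleanup_stale reboot".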

Changed in lxc (Ubuntu):
status: Invalid → New
Po-Hsu Lin (cypressyew) wrote :

I tried the following on an s390x LPAR instance with Bionic 4.15 kernel:
$ sudo lxc-create -t /usr/share/lxc/templates/lxc-busybox -n test
$ sudo lxc-ls
test
$ sudo lxc-start test
$ sudo lxc-attach test
# Make sure the container is running here

# Use another terminal on this instance to do lxc-stop
$ sudo lxc-stop test
lxc-stop: test: commands_utils.c: lxc_cmd_sock_rcv_state: 70 Resource temporarily unavailable - Failed to receive message
$ echo $?
0

The container terminated.

Christian Brauner (cbrauner) wrote :

Hm, what is the LXC version used here? Is it the one in Bionic?

Po-Hsu Lin (cypressyew) wrote :

ubuntu@s2lp3:~$ dpkg -l | grep lxc
ii liblxc-common 3.0.3-0ubuntu1~18.04.1 s390x Linux Containers userspace tools (common tools)
ii liblxc-dev 3.0.3-0ubuntu1~18.04.1 s390x Linux Containers userspace tools (development)
ii liblxc1 3.0.3-0ubuntu1~18.04.1 s390x Linux Containers userspace tools (library)
ii lxc 3.0.3-0ubuntu1~18.04.1 all Transitional package - lxc -> lxc-utils
rc lxc-common 2.1.1-0ubuntu1 s390x Linux Containers userspace tools (common tools)
ii lxc-dev 3.0.3-0ubuntu1~18.04.1 all Transitional package - lxc-dev -> liblxc-dev
ii lxc-utils 3.0.3-0ubuntu1~18.04.1 s390x Linux Containers userspace tools
ii lxc1 3.0.3-0ubuntu1~18.04.1 all Transitional package - lxc1 -> lxc-utils
ii lxcfs 3.0.3-0ubuntu1~18.04.2 s390x FUSE based filesystem for LXC
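
In the listing above, the "rc" entry for lxc-common 2.1.1 is a removed package whose configuration files remain, so it does not reflect the LXC version in use. A small filter, sketched here with the hypothetical name "installed_versions", keeps only fully installed ("ii") packages:

```shell
#!/bin/sh
# Sketch: reduce "dpkg -l" output to name and version of installed packages.
# Lines whose status column is not "ii" (for example "rc") are dropped.

installed_versions() {
    awk '$1 == "ii" { print $2, $3 }'
}

# Example:
#   dpkg -l | grep lxc | installed_versions
```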

Po-Hsu Lin (cypressyew) wrote :

Note: s2lp3 here is our Bionic s390x LPAR instance
