systemd autopkgtest regression on arm64 and s390x on mantic

Bug #2038433 reported by Andrea Righi
This bug affects 1 person
Affects            Status     Importance  Assigned to  Milestone
linux (Ubuntu)     Confirmed  Undecided   Unassigned
  Mantic           Confirmed  Undecided   Unassigned
systemd (Ubuntu)   New        Undecided   Unassigned
  Mantic           New        Undecided   Unassigned

Bug Description

Looking at the log, tests-in-lxd seems to fail to download the LXD image used for the test (or something similar):
...
Publishing instance: Image pack: 85% (5.51MB/s)

Instance published with fingerprint: 83b511be44395f2afc7d4d1ee0708cb18834c8b15cb208c9d218964ed8479114
6845s autopkgtest [09:11:14]: starting date and time: 2023-10-03 09:11:14+0000
6845s autopkgtest [09:11:14]: version 5.28ubuntu1
6845s autopkgtest [09:11:14]: host autopkgtest; command line: /usr/bin/autopkgtest -U -B . --test-name=unit-tests -- lxd autopkgtest/ubuntu/mantic/arm64
7063s <VirtSubproc>: failure: (down) ['mktemp', '--directory', '--tmpdir', 'autopkgtest.XXXXXX'] failed (exit status 1, stderr '')
7063s autopkgtest [09:14:52]: ERROR: testbed failure: unexpected eof from the testbed
7063s autopkgtest [09:14:52]: test tests-in-lxd: -----------------------]
7064s autopkgtest [09:14:53]: test tests-in-lxd: - - - - - - - - - - results - - - - - - - - - -
7064s tests-in-lxd FAIL non-zero exit status 1
...

This introduces a regression that is blocking the promotion of new kernels in Mantic.

ProblemType: Bug
DistroRelease: Ubuntu 23.10
Package: systemd 253.5-1ubuntu6
ProcVersionSignature: Ubuntu 6.5.0-5.5.1-lowlatency 6.5.0
Uname: Linux 6.5.0-5-lowlatency x86_64
ApportVersion: 2.27.0-0ubuntu4
Architecture: amd64
CasperMD5CheckResult: pass
CurrentDesktop: ubuntu:GNOME
Date: Wed Oct 4 14:38:15 2023
InstallationDate: Installed on 2023-09-25 (9 days ago)
InstallationMedia: Ubuntu 23.10 "Mantic Minotaur" - Beta amd64 (20230924)
MachineType: GPD G1621-02
ProcEnviron:
 LANG=en_US.UTF-8
 PATH=(custom, no user)
 SHELL=/bin/bash
 TERM=xterm-256color
 XDG_RUNTIME_DIR=<set>
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-6.5.0-5-lowlatency root=/dev/mapper/ubuntu--vg-ubuntu--lv ro quiet splash vt.handoff=7
SourcePackage: systemd
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 09/01/2022
dmi.bios.release: 5.19
dmi.bios.vendor: American Megatrends International, LLC.
dmi.bios.version: 2.04
dmi.board.asset.tag: Default string
dmi.board.name: G1621-02
dmi.board.vendor: GPD
dmi.board.version: Default string
dmi.chassis.asset.tag: Default string
dmi.chassis.type: 10
dmi.chassis.vendor: Default string
dmi.chassis.version: Default string
dmi.ec.firmware.release: 2.3
dmi.modalias: dmi:bvnAmericanMegatrendsInternational,LLC.:bvr2.04:bd09/01/2022:br5.19:efr2.3:svnGPD:pnG1621-02:pvrDefaultstring:rvnGPD:rnG1621-02:rvrDefaultstring:cvnDefaultstring:ct10:cvrDefaultstring:skuDefaultstring:
dmi.product.family: Default string
dmi.product.name: G1621-02
dmi.product.sku: Default string
dmi.product.version: Default string
dmi.sys.vendor: GPD

Andrea Righi (arighi) wrote :
Nick Rosbrook (enr0n) wrote :

Taking a quick look at the logs for arm64, https://autopkgtest.ubuntu.com/packages/systemd/mantic/arm64, I only see this specific failure for newer kernels. Have you tried to reproduce this locally to see if it is in fact introduced by a kernel change?

Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
Juerg Haefliger (juergh) wrote :

It seems the container fails to boot. From a bad run:

9109s autopkgtest [10:39:17]: test tests-in-lxd: [-----------------------
9145s 2023-11-28T10:39:50Z INFO Waiting for automatic snapd restart...
9259s lxd 5.19-8635f82 from Canonical** installed
9288s Creating autopkgtest-prepare-jhI
9475s Retrieving image: metadata: 100% (323.81MB/s)
  Retrieving image: rootfs: 1% (3.53MB/s)
  <SNIP>
  Retrieving image: rootfs: 100% (8.38MB/s)
  Retrieving image: Unpack: 100% (226.67MB/s)
  Starting autopkgtest-prepare-jhI
9486s Created symlink /<email address hidden> → /dev/null.
9490s Failed to connect to bus: No such file or directory
9632s Timed out waiting for container to boot

Juerg Haefliger (juergh) wrote :
Juerg Haefliger (juergh) wrote :

So the workaround for the above in systemd/debian/tests/tests-in-lxd is problematic?

Juerg Haefliger (juergh) wrote :

OK, that's not it. It looks like a container reboot takes longer than 1 minute with newer kernels, resulting in a timeout error and a subsequent test failure.
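
To illustrate how a fixed boot deadline turns a slower reboot into a hard failure, here is a minimal sketch of a poll-with-timeout loop. It is not the code from tests-in-lxd: the function name `wait_for_boot`, the 5-second polling interval, the fake clock, and the 40 s and 75 s boot durations are all assumptions made for the example. Only the 1-minute limit comes from the comment above.

```python
import time
from typing import Callable

def wait_for_boot(is_booted: Callable[[], bool],
                  timeout: float = 60.0,
                  interval: float = 5.0,
                  clock: Callable[[], float] = time.monotonic,
                  sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll is_booted() until it returns True or `timeout` seconds elapse."""
    deadline = clock() + timeout
    while True:
        if is_booted():
            return True
        if clock() >= deadline:
            return False  # caller would report "timed out waiting for boot"
        sleep(interval)

class FakeClock:
    """Simulated clock so the example runs instantly (hypothetical helper)."""
    def __init__(self) -> None:
        self.now = 0.0
    def time(self) -> float:
        return self.now
    def sleep(self, seconds: float) -> None:
        self.now += seconds

def simulate(boot_seconds: float, timeout: float) -> bool:
    """Pretend the container becomes ready after `boot_seconds`."""
    fake = FakeClock()
    return wait_for_boot(lambda: fake.now >= boot_seconds,
                         timeout=timeout, clock=fake.time, sleep=fake.sleep)

print(simulate(40, 60))   # True: ready before the deadline
print(simulate(75, 60))   # False: same container, just slower to boot
print(simulate(75, 120))  # True: a longer deadline hides the slowdown
```

With a loop of this shape, nothing has to be broken inside the container for the test to fail. A reboot that is merely slower than the deadline produces the same result.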

Juerg Haefliger (juergh) wrote :

2023-11-29T11:00:34.713689+00:00 mantic-test systemd[1]: Reached target cloud-init.target - Cloud-init target.
2023-11-29T11:00:34.713731+00:00 mantic-test systemd[1]: Startup finished in 5.934s.
2023-11-29T11:00:34.766612+00:00 mantic-test systemd[1]: dmesg.service: Deactivated successfully.
2023-11-29T11:01:00.318282+00:00 mantic-test systemd[1]: systemd-hostnamed.service: Deactivated successfully.
2023-11-29T11:01:03.018207+00:00 mantic-test systemd[1]: systemd-timedated.service: Deactivated successfully.
2023-11-29T11:01:51.230419+00:00 mantic-test systemd[1]: Created slice user-0.slice - User Slice of UID 0.
2023-11-29T11:01:51.231018+00:00 mantic-test systemd[1]: Starting user-runtime-dir@0.service - User Runtime Directory /run/user/0...
2023-11-29T11:01:51.235439+00:00 mantic-test systemd[1]: Finished user-runtime-dir@0.service - User Runtime Directory /run/user/0.
2023-11-29T11:01:51.236741+00:00 mantic-test systemd[1]: Starting user@0.service - User Manager for UID 0...
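
The delay is easiest to see by measuring the gaps between consecutive journal entries. The following is a minimal sketch that assumes the ISO 8601 timestamp format shown in the excerpt above. The helper name `largest_gap` is hypothetical, and only a subset of the quoted lines is reproduced.

```python
from datetime import datetime

# Subset of the journal lines quoted above.
JOURNAL = """\
2023-11-29T11:00:34.766612+00:00 mantic-test systemd[1]: dmesg.service: Deactivated successfully.
2023-11-29T11:01:00.318282+00:00 mantic-test systemd[1]: systemd-hostnamed.service: Deactivated successfully.
2023-11-29T11:01:03.018207+00:00 mantic-test systemd[1]: systemd-timedated.service: Deactivated successfully.
2023-11-29T11:01:51.230419+00:00 mantic-test systemd[1]: Created slice user-0.slice - User Slice of UID 0.
"""

def largest_gap(journal: str):
    """Return (seconds, message_before, message_after) for the longest silence."""
    entries = []
    for line in journal.splitlines():
        if not line.strip():
            continue
        # The timestamp is the first space-separated field of each line.
        stamp, rest = line.split(" ", 1)
        entries.append((datetime.fromisoformat(stamp), rest))
    best = None
    for (t0, m0), (t1, m1) in zip(entries, entries[1:]):
        gap = (t1 - t0).total_seconds()
        if best is None or gap > best[0]:
            best = (gap, m0, m1)
    return best

gap, before, after = largest_gap(JOURNAL)
print(f"{gap:.1f}s of silence between:\n  {before}\n  {after}")
```

On these lines the longest silence is about 48 seconds, between systemd-timedated.service deactivating and user-0.slice being created.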

Juerg Haefliger (juergh) wrote :

From my limited testing, it looks like it was introduced in 6.5.0-10.10.

Juerg Haefliger (juergh) wrote :

The first container reboot fails because seeding takes too long (which is bug 1878225). Then the hack kicks in and a second image build is attempted, which also fails. The hack masks the problematic services:

9811s Created symlink /etc/systemd/system/snapd.service → /dev/null.
9817s Created symlink /etc/systemd/system/snapd.seeded.service → /dev/null.

but the subsequent reboot still times out. Huh?
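
For context, the "Created symlink ... → /dev/null" lines are what masking a unit looks like on disk: the unit file path becomes a symlink to /dev/null. Below is a minimal sketch of checking for that state. The function name `is_masked` and the `example.service` file are hypothetical, and the demo uses a temporary directory rather than the real /etc/systemd/system.

```python
import os
import tempfile

def is_masked(unit_path: str) -> bool:
    """Treat a unit as masked when its file is a symlink resolving to /dev/null."""
    return os.path.islink(unit_path) and os.path.realpath(unit_path) == os.devnull

with tempfile.TemporaryDirectory() as unit_dir:
    # Same shape as the symlink reported in the log for snapd.service.
    masked_unit = os.path.join(unit_dir, "snapd.service")
    os.symlink(os.devnull, masked_unit)

    # An ordinary unit file for comparison (hypothetical name).
    regular_unit = os.path.join(unit_dir, "example.service")
    with open(regular_unit, "w") as fh:
        fh.write("[Unit]\nDescription=Example\n")

    results = {
        os.path.basename(path): is_masked(path)
        for path in (masked_unit, regular_unit)
    }

print(results)
```

A check like this, run inside the rebuilt image, would confirm whether both snapd units really were masked before the reboot that timed out.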

Juerg Haefliger (juergh) wrote :

A couple of retries and the test finally passes. There's some flakiness.
