Disco autopkgtest @ armhf fails root-unittests -> test-execute -> exec-dynamicuser-statedir.service

Bug #1845337 reported by Christian Ehrhardt 
This bug affects 2 people
Affects              Status         Importance   Assigned to   Milestone
qemu (Ubuntu)        Invalid        Undecided    Unassigned
  Disco              Invalid        Undecided    Unassigned
systemd (Ubuntu)     Fix Released   Undecided    Unassigned
  Disco              Fix Released   Undecided    Unassigned

Bug Description

[impact]

due to a recent change to allow armhf tests to run lxd containers, autopkgtest for systemd on disco fails consistently.

[test case]

see test results, linked in original description below.

[regression potential]

very low, autopkgtest fix only.

[other info]

original description:
---

For the past few weeks the systemd autopkgtest on armhf for disco has been failing [1].

The log is very long and partially interleaved due to concurrent execution.
Somewhere in between we see that the failing subcase is: root-unittests
Of this test (which again has many subtests) it is: test-execute
And of this again it is (always): exec-dynamicuser-statedir.service

I'll attach bad and good case full and stripped logs.

The diff between them comes down to just:
1. execute a find in a shell
2. shell exits
3. good case: exec-dynamicuser-statedir.service: Main process exited, code=exited, status=0/SUCCESS
   bad case: exec-dynamicuser-statedir.service: Main process exited, code=exited, status=1/FAILURE
4. in the bad case, that triggers an assertion

The find that fails is:

find / -path /var/tmp -o -path /tmp -o -path /proc -o -path /dev/mqueue -o -path /dev/shm -o -path /sys/fs/bpf

The good and bad cases both ran the same, most recent version: systemd/240-6ubuntu5.7.

Maybe something is bad in the containers we have for armhf in regard to these paths?
Was there any change we'd know of?

If there is nothing known, could we force-badtest it to get it out of the way of ongoing migrations?

[1]: http://autopkgtest.ubuntu.com/packages/s/systemd/disco/armhf

Christian Ehrhardt  (paelzer) wrote :

Also add a qemu task which is blocked by it from migrating through an SRU

tags: added: update-excuse
Dan Streetman (ddstreet) wrote :

The 'good' tests mount a directory at '/dev/.lxd-mounts', while the 'bad' tests mount '/dev/.lxc' as well as '/dev/.lxc/proc'. In the 'bad' case, the test checks that there are no writable directories other than the single one it expects (known ones such as /tmp and /var/tmp are ignored), and it finds a writable dir under /dev/.lxc/ that it of course wasn't expecting:
+ test /dev/.lxc/proc/1079/fd/dev/.lxc/proc/1079/map_files/dev/.lxc/proc/1079/task/1079/fd/var/lib/private/quux/pief/var/lib/private/waldo = /var/lib/private/quux/pief/var/lib/private/waldo

Something (maybe lxc itself?) seems to be mounting /proc under the /dev/.lxc dir, or something like that. When using lxd, that problem doesn't seem to happen. I'd be inclined to blame lxc for this, not the test itself.
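
A minimal sketch of why a single stray writable directory breaks the check, assuming a scratch tree under a temporary directory instead of the real `/`. The names `state`, `stray` and the helper `writable_dirs` are hypothetical and only serve the illustration; `-writable` needs GNU find, as in the original test command.

```shell
#!/bin/sh
# Sketch only: a scratch tree stands in for "/", and all names are hypothetical.
# Requires GNU find for -writable, as the original test command does.
set -eu
LC_ALL=C; export LC_ALL

root=$(mktemp -d)
mkdir -p "$root/state" "$root/tmp/inner"
cd "$root"

# Same shape as the test: prune known locations, list writable directories,
# then glue the sorted result into a single string.
writable_dirs() {
    find . -mindepth 1 \( -path ./tmp \) -prune -o -type d -writable -print \
        2>/dev/null | sort -u | tr -d '\n'
}

good=$(writable_dirs)      # only ./state is reported
mkdir stray                # plays the role of the unexpected /dev/.lxc entry
bad=$(writable_dirs)       # ./state and ./stray are concatenated

if test "$good" = ./state; then echo "good case: match"; fi
if test "$bad" = ./state; then
    echo "bad case: match"
else
    echo "bad case: mismatch ($bad)"
fi

cd /
rm -rf "$root"
```

Because the output lines are joined with `tr -d '\n'`, any extra directory changes the whole string, so the `test ... = ...` comparison fails rather than ignoring the addition.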

Did the armhf testbeds get changed from lxd to lxc recently?

Dan Streetman (ddstreet) wrote :

@laney maybe you know if the armhf testbeds were recently moved from lxd to lxc containers?

Iain Lane (laney) wrote :

Hey thanks for subscribing me!

We haven't had an LXD update recently (the instances are using LXD from bionic-updates and that's not been changed for a long time). The only things I can think of are that Adam recently deployed a config change to set 'security.nesting=true' on our instances (https://git.launchpad.net/autopkgtest-cloud/commit/?id=b8c9165686c7598b3f1a68aa4684e7f382ad935c), and that we recently (last week, while in Paris) dist-upgraded and rebooted them all to pick up a newer kernel (4.15.0-62-generic).

I'm not sure if either of these changes might relate to what you're seeing here - my first suggestion would be to talk to the LXD team. If you need help connecting with them, please let me know. Hope that helps.

Christian Ehrhardt  (paelzer) wrote :

Thanks Dan for going deeper on these logs - interesting path differences that you have spotted!

Thanks Laney for the info on recent changes!

I subscribed stgraber and will give him and the other LXD folks a ping to chime in here if the mentioned configs/updates ring a bell in regard to the paths that were identified to be changing between good/bad case.

Stéphane Graber (stgraber) wrote :

/dev/.lxc/* shows up when nesting is enabled, so that's indeed related to the change Adam did.
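
A small sketch for checking from inside a test environment whether this directory is present. The helper name `has_lxc_nesting_dir` and its optional root-prefix argument are hypothetical, added only so the check can be tried on a scratch tree; it merely looks for the directory and does not query LXD itself.

```shell
#!/bin/sh
# Sketch only: has_lxc_nesting_dir is a hypothetical helper, not part of any tool.
# The optional argument is a root prefix so the check can be tried on a scratch tree.
has_lxc_nesting_dir() {
    prefix=${1:-}
    [ -d "$prefix/dev/.lxc" ]
}

if has_lxc_nesting_dir; then
    echo "/dev/.lxc present: nesting is probably enabled for this container"
else
    echo "/dev/.lxc absent"
fi
```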

Christian Ehrhardt  (paelzer) wrote :

Great, thanks Stephane for confirming that.

@Rbalinx / @xnox - would you want to fix that up as part of the next systemd upload to Disco then?

Until then we could mark it badtest on armhf as that reflects the current state correctly and unblocks others until fixed.

Dan Streetman (ddstreet)
tags: added: next-ddstreet systemd
Christian Ehrhardt  (paelzer) wrote :

Until this is resolved, I added a commit to the MP [1] that masks the currently failing systemd tests in Disco.
That would unblock everyone until this is hopefully resolved in the next upload.

[1]: https://code.launchpad.net/~paelzer/britney/hints-ubuntu-disco-fix-systemd-ppc-hint-that-never-works/+merge/373200

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in qemu (Ubuntu Disco):
status: New → Confirmed
Changed in qemu (Ubuntu):
status: New → Confirmed
Changed in systemd (Ubuntu Disco):
status: New → Confirmed
Changed in systemd (Ubuntu):
status: New → Confirmed
Balint Reczey (rbalint) wrote :

Thanks all! I'm uploading it to Eoan first, and will then schedule it for an SRU to Disco.

Eoan package is tested in ppa:ci-train-ppa-service/3797 .

Dan Streetman (ddstreet)
tags: added: ddstreet disco
removed: next-ddstreet
tags: removed: ddstreet
Paride Legovini (paride)
Changed in systemd (Ubuntu):
status: Confirmed → Triaged
Changed in systemd (Ubuntu Disco):
status: Confirmed → Triaged
Changed in qemu (Ubuntu Disco):
status: Confirmed → Triaged
Changed in qemu (Ubuntu):
status: Confirmed → Triaged
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package systemd - 242-7ubuntu1

---------------
systemd (242-7ubuntu1) eoan; urgency=medium

  * Merge from unstable
  * UBUNTU: drop setting fs.protected_regular and fs.protected_fifos from
    sysctl defaults shipped by systemd (LP: #1845637)
    File: debian/patches/debian/UBUNTU-drop-kernel.-settings-from-sysctl-defaults-shipped.patch
    https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=6e583847b04c3f83a50f3bd6947dcae6a73d8388
  * test-execute: Filter /dev/.lxc in exec-dynamicuser-statedir.service.
    It appears in nested LXC containers and broke the armhf autopkgtest.
    (LP: #1845337)
    File: debian/patches/test-execute-Filter-dev-.lxc-in-exec-dynamicuser-statedir.patch
    https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=75af888d5552f706b86182a56f12ccc8e83ca04e

systemd (242-7) unstable; urgency=medium

  * sleep: properly pass verb to sleep script
  * core: factor root_directory application out of apply_working_directory.
    Fixes RootDirectory not working when used in combination with User.
    (Closes: #939408)
  * shared/bus-util: drop trusted annotation from
    bus_open_system_watch_bind_with_description().
    This ensures that access controls on systemd-resolved's D-Bus interface
    are enforced properly.
    (CVE-2019-15718, Closes: #939353)

 -- Balint Reczey <email address hidden> Wed, 02 Oct 2019 14:13:28 +0200

Changed in systemd (Ubuntu):
status: Triaged → Fix Released
Dan Streetman (ddstreet)
description: updated
Balint Reczey (rbalint)
Changed in qemu (Ubuntu):
status: Triaged → Invalid
Changed in qemu (Ubuntu Disco):
status: Triaged → Invalid
Steve Langasek (vorlon) wrote : Please test proposed package

Hello Christian, or anyone else affected,

Accepted systemd into disco-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/systemd/240-6ubuntu5.8 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-disco to verification-done-disco. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-disco. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

description: updated
Changed in systemd (Ubuntu Disco):
status: Triaged → Fix Committed
tags: added: verification-needed verification-needed-disco
Ubuntu SRU Bot (ubuntu-sru-bot) wrote : Autopkgtest regression report (systemd/240-6ubuntu5.8)

All autopkgtests for the newly accepted systemd (240-6ubuntu5.8) for disco have finished running.
The following regressions have been reported in tests triggered by the package:

prometheus-bind-exporter/unknown (armhf)
php7.2/7.2.24-0ubuntu0.19.04.1 (armhf)
gvfs/1.40.1-1ubuntu0.1 (ppc64el)
pdns-recursor/unknown (armhf)
webhook/unknown (armhf)
munin/2.0.47-1ubuntu3 (armhf, arm64)
systemd/240-6ubuntu5.8 (ppc64el)

Please visit the excuses page listed below and investigate the failures, proceeding afterwards as per the StableReleaseUpdates policy regarding autopkgtest regressions [1].

https://people.canonical.com/~ubuntu-archive/proposed-migration/disco/update_excuses.html#systemd

[1] https://wiki.ubuntu.com/StableReleaseUpdates#Autopkgtest_Regressions

Thank you!

Adam Conrad (adconrad) wrote :

I bumped the systemd/ppc64el hint and retried the rest, and autopkgtests look clear now.

Balint Reczey (rbalint) wrote :

Verified with systemd/240-6ubuntu5.8 on Disco.

https://objectstorage.prodstack4-5.canonical.com/v1/AUTH_77e2ada1e7a84929a74ba3b87153c0ac/autopkgtest-disco/disco/armhf/s/systemd/20191109_024443_a141a@/log.gz :

...
exec-dynamicuser-statedir.service: Executing: /usr/bin/sh -x -c 'test $(find / \( -path /var/tmp -o -path /tmp -o -path /proc -o -path /dev/mqueue -o -path /dev/shm -o -path /sys/fs/bpf -o -path /dev/.lxc \) -prune -o -type d -writable -print 2>/dev/null | sort -u | tr -d \\n) = /var/lib/private/quux/pief/var/lib/private/waldo'
+ sort -u
+ tr -d \n
+ find / ( -path /var/tmp -o -path /tmp -o -path /proc -o -path /dev/mqueue -o -path /dev/shm -o -path /sys/fs/bpf -o -path /dev/.lxc ) -prune -o -type d -writable -print
+ test /var/lib/private/quux/pief/var/lib/private/waldo = /var/lib/private/quux/pief/var/lib/private/waldo
Received SIGCHLD from PID 1077 (sh).
...
root-unittests PASS
...
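
The effect of the extra prune entry seen in the log above can be reproduced in miniature. This is only a sketch under stated assumptions: a scratch tree stands in for `/`, the top-level `.lxc` directory stands in for `/dev/.lxc`, and `state` stands in for the expected state directory; none of these names come from the real test.

```shell
#!/bin/sh
# Sketch only: a scratch tree stands in for "/", and its layout is hypothetical.
# ".lxc" plays the role of /dev/.lxc, "state" the expected writable directory.
# Requires GNU find for -writable, as the original test command does.
set -eu
LC_ALL=C; export LC_ALL

root=$(mktemp -d)
mkdir -p "$root/state" "$root/tmp" "$root/.lxc/proc"
cd "$root"

# Pre-fix prune list: the nested-LXC directory is not filtered.
before=$(find . -mindepth 1 \( -path ./tmp \) -prune \
    -o -type d -writable -print 2>/dev/null | sort -u | tr -d '\n')

# Post-fix prune list: one more -path entry, mirroring "-o -path /dev/.lxc".
after=$(find . -mindepth 1 \( -path ./tmp -o -path ./.lxc \) -prune \
    -o -type d -writable -print 2>/dev/null | sort -u | tr -d '\n')

echo "before: $before"
echo "after:  $after"

cd /
rm -rf "$root"
```

Pruning the directory, rather than filtering individual paths beneath it, keeps find from descending into it at all, so nothing under it can leak into the compared string.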

tags: added: verification-done verification-done-disco
removed: verification-needed verification-needed-disco
tags: added: update-excuse-disco
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package systemd - 240-6ubuntu5.8

---------------
systemd (240-6ubuntu5.8) disco; urgency=medium

  [ Victor Tapia ]
  * d/p/resolved_disable-connection-downgrade-when-DNSSEC-yes.patch
    Fix regression introduced by
    resolved-Mitigate-DVE-2018-0001-by-retrying-NXDOMAIN-with.patch when
    DNSSEC=yes (LP: #1796501)

  [ Dan Streetman ]
  * d/p/lp1840640-shared-seccomp-add-sync_file_range2.patch:
    allow sync_file_range2 in nspawn container (LP: #1840640)
  * d/p/lp1847527-journal-remote-do-not-request-Content-Length-if-Tran.patch:
    do not request Content-Length if Transfer-Encoding is chunked
    (LP: #1847527)
  * d/t/storage: fix flaky test
    (LP: #1847815)
  * d/p/lp1843381-dell_passthrough_skip_rename_retry.patch,
    debian/extra/rules/73-usb-net-by-mac.rules:
    fix rename delay for systems using "Dell MAC passthrough"
    (LP: #1843381)
  * d/p/lp1849733/0001-resolved-if-we-can-t-append-EDNS-OPT-RR-then-indicat.patch,
    d/p/lp1849733/0002-resolved-don-t-let-EDNS0-OPT-dgram-size-affect-TCP.patch:
    ignore EDNS0 payload limit when responding over TCP (LP: #1849733)
  * d/p/lp1849658-resolved-set-stream-type-during-DnsStream-creation.patch:
    - Fix bug in refcounting TCP stream types (LP: #1849658)
  * d/extra/dhclient-enter-resolved-hook:
    - only restart resolved if dhclient conf changed (LP: #1805183)

  [ Balint Reczey ]
  * d/p/test-execute-Filter-dev-.lxc-in-exec-dynamicuser-statedir.patch:
    fix test breakage due to running in nested lxd container
    (LP: #1845337)

 -- Dan Streetman <email address hidden> Fri, 04 Oct 2019 09:06:58 -0400

Changed in systemd (Ubuntu Disco):
status: Fix Committed → Fix Released
Łukasz Zemczak (sil2100) wrote : Update Released

The verification of the Stable Release Update for systemd has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.
