Please backport "fix race between daemon-reload and other commands #8803" to 16.04 (for UC16) and 18.04 (for UC18)

Bug #1819728 reported by Michael Vogt on 2019-03-12
Affects            Importance   Assigned to
systemd (Ubuntu)   Undecided    Unassigned
  Xenial           Undecided    Unassigned
  Bionic           Undecided    Unassigned

Bug Description

[Impact]
On Ubuntu Core we recently hit a race between daemon-reload and systemctl twice. This race is fixed in systemd upstream: "fix race between daemon-reload and other commands #8803" and a subsequent fix in "PR#11121".

Note that this is a general problem in systemd with daemon-reload and
systemctl commands; we just happen to hit it more often on Ubuntu Core,
but the test case below explodes just fine on a normal Ubuntu release
like 16.04 or 18.04 (not on 18.10+, as it's fixed there).

[TEST CASE]
To reproduce, it's enough to run:

for i in $(seq 50); do
  systemctl daemon-reload &
  systemctl start ssh &
done

This will result in "systemctl start ssh" hanging in ppoll. With the patch applied the hangs go away.
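
Because the reproducer above simply blocks when the race is hit, it can help to bound each command and count the hangs instead. The following is a minimal sketch under stated assumptions, not part of the original report: `count_hangs` and its parameters are hypothetical names, and it relies on GNU `timeout` exiting with status 124 when the limit is reached.

```shell
#!/bin/sh
# Sketch only: count_hangs and its arguments are illustrative names,
# not part of systemd or of the original report.
#
# count_hangs ROUNDS LIMIT RELOAD_CMD START_CMD
#   Starts RELOAD_CMD and START_CMD concurrently ROUNDS times, bounding
#   each START_CMD with timeout(1), and prints how many timed out.
count_hangs() {
    rounds=$1
    limit=$2
    reload_cmd=$3
    start_cmd=$4
    pids=""
    i=0
    while [ "$i" -lt "$rounds" ]; do
        sh -c "$reload_cmd" >/dev/null 2>&1 &
        timeout "$limit" sh -c "$start_cmd" >/dev/null 2>&1 &
        pids="$pids $!"
        i=$((i + 1))
    done
    hangs=0
    for pid in $pids; do
        status=0
        wait "$pid" || status=$?
        # GNU timeout exits with status 124 when the time limit was hit.
        if [ "$status" -eq 124 ]; then
            hangs=$((hangs + 1))
        fi
    done
    wait    # reap the background reload commands as well
    echo "$hangs"
}
```

On a test machine this could be invoked as `count_hangs 50 10 "systemctl daemon-reload" "systemctl start ssh"` (as root); the 10 second limit is an arbitrary choice. A non-zero result would suggest that some "systemctl start" calls did not return in time.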

[REGRESSION POTENTIAL]
Medium/High. This change is already in systemd upstream and in use in disco and later, but the backport required some manual conflict resolution because the code changed between 229, 237 and the fixed code in 240. It's also
not fully clear whether the fix relies on the new systemd "coldplug" functionality that was added in more recent git revisions.

The upstream fixes are https://github.com/systemd/systemd/pull/8803 and https://github.com/systemd/systemd/pull/11121

This change is already in systemd in cosmic+.

summary: - Please backport " fix race between daemon-reload and other commands
- #8803 "
+ Please backport "fix race between daemon-reload and other commands
+ #8803" to 16.04 (for UC16)
tags: added: patch
Michael Vogt (mvo) on 2019-03-13
description: updated
Michael Vogt (mvo) wrote :
summary: Please backport "fix race between daemon-reload and other commands
- #8803" to 16.04 (for UC16)
+ #8803" to 16.04 (for UC16) and 18.04 (for UC18)
Michael Vogt (mvo) wrote :

I ran the snapd autopkgtest against a bionic systemd deb build with this and noticed no regressions.

Michael Vogt (mvo) on 2019-03-13
description: updated
Michael Vogt (mvo) wrote :

This is now uploaded to the ppa:snappy-dev/edge

Michael Vogt (mvo) on 2019-03-13
description: updated
Michael Vogt (mvo) wrote :

The xenial build of the updated systemd was tested using a full spread run with no regressions, and a new test was added in https://github.com/snapcore/snapd/pull/6595 to test that the regression is fixed.

This test shows that core/edge is fixed but core/beta, which does not yet have the fix, is hanging (as expected).

Michael Vogt (mvo) wrote :

The xenial version of this is NOT ready yet, a second run produced a CRASH at startup on UC16 with the updated systemd.

Michael Vogt (mvo) wrote :

This version fixes a subtle bug added by me when de-conflicting the diff.

Michael Vogt (mvo) wrote :

The xenial version linked in comment #9 successfully completed a full spread run with UC16. This includes the regression test that systemctl start is not hanging.

Michael Vogt (mvo) wrote :

I uploaded both the xenial and bionic version to -proposed now.

Michael Vogt (mvo) on 2019-03-13
description: updated

Hello Michael, or anyone else affected,

Accepted systemd into bionic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/systemd/237-3ubuntu10.16 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-bionic to verification-done-bionic. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-bionic. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in systemd (Ubuntu Bionic):
status: New → Fix Committed
tags: added: verification-needed verification-needed-bionic
Łukasz Zemczak (sil2100) wrote :

Hello Michael, or anyone else affected,

Accepted systemd into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/systemd/229-4ubuntu21.18 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-xenial to verification-done-xenial. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-xenial. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in systemd (Ubuntu Xenial):
status: New → Fix Committed
tags: added: verification-needed-xenial
Michael Vogt (mvo) on 2019-03-15
tags: added: verification-failed-xenial
removed: verification-needed-xenial
Michael Vogt (mvo) wrote :

Unfortunately we need to pull the xenial update. We see failures like this:

https://objectstorage.prodstack4-5.canonical.com/v1/AUTH_77e2ada1e7a84929a74ba3b87153c0ac/autopkgtest-xenial/xenial/i386/p/python-systemd/20190314_173156_e1077@/log.gz

on various autopkgtest runs. E.g. for http://autopkgtest.ubuntu.com/packages/p/python-systemd

It is also very inconsistent, i.e. not all arches are affected; for python-systemd it is just i386, ppc64el and s390x. But it's also visible in the docker.io xenial amd64 test, so it's not arch specific.

Changed in systemd (Ubuntu Xenial):
status: Fix Committed → Triaged
Changed in systemd (Ubuntu):
status: New → Fix Released
Michael Vogt (mvo) wrote :

I can reproduce the autopkgtest failure with:

autopkgtest -sU --apt-pocket proposed python-systemd_234-2build1.dsc -- qemu ~/VM/ubuntu-16.04-32.img

on a local qemu. When it pulls in the systemd from -proposed I see:
...
Failed to execute operation: Failed to activate service 'org.freedesktop.systemd1': timed out
...
Trying to debug now.
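
Since the failure is intermittent across architectures, scanning the collected autopkgtest logs for the message quoted above can help triage which runs were hit. This is a small sketch, not something from the original comment: `has_activation_timeout` is a hypothetical helper name, and it assumes GNU `gzip` and `grep` are available.

```shell
#!/bin/sh
# Sketch only: has_activation_timeout is an illustrative helper name.
# The message text is the one quoted in the comment above.
has_activation_timeout() {
    # gzip -cdf decompresses gzip input; GNU gzip passes plain text through.
    gzip -cdf -- "$1" |
        grep -qF "Failed to activate service 'org.freedesktop.systemd1': timed out"
}
```

For example, `has_activation_timeout log.gz && echo "activation timeout seen"` checks one downloaded log; a match only indicates the symptom, not its cause.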

Michael Vogt (mvo) wrote :

I managed to capture the crash in xenial while running the ADT tests for python-systemd.

Michael Vogt (mvo) wrote :

The xenial crash turns out to be https://github.com/systemd/systemd/issues/10716 - there is a fix in git, and I will look into backporting it. We will also need a bionic update with that and a cosmic update.

Michael Vogt (mvo) wrote :
Michael Vogt (mvo) wrote :
tags: added: verification-failed-bionic
removed: verification-needed-bionic
description: updated
Michael Vogt (mvo) wrote :

The bionic version of this systemd update was used on an Ubuntu 18.04 system that ran the full spread test suite (>300 tests). There are hundreds of mount units created, started and stopped, a bunch of system services created and removed, and tons of daemon-reloads. No systemd-related issues were found.

Michael Vogt (mvo) wrote :

This updated debdiff for xenial ran a full autopkgtest run of snapd on 16.04 without errors.

Michael Vogt (mvo) wrote :

The xenial version was also run on i386 with the full snapd testsuite without issues.

Michael Vogt (mvo) wrote :

This is now uploaded (together with #1816753) to xenial-proposed and is currently in the UNAPPROVED queue.

Łukasz Zemczak (sil2100) wrote :

Hello Michael, or anyone else affected,

Accepted systemd into bionic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/systemd/237-3ubuntu10.17 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-bionic to verification-done-bionic. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-bionic. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

tags: added: verification-needed-bionic
removed: verification-failed-bionic
Łukasz Zemczak (sil2100) wrote :

Hello Michael, or anyone else affected,

Accepted systemd into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/systemd/229-4ubuntu21.19 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-xenial to verification-done-xenial. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-xenial. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in systemd (Ubuntu Xenial):
status: Triaged → Fix Committed
tags: added: verification-needed-xenial
removed: verification-failed-xenial
Michael Vogt (mvo) wrote :

We validated the fix via spread in https://github.com/snapcore/snapd/pull/6595 (both xenial and bionic).

Validation completed. No errors detected on:

xenial: 229-4ubuntu21.19
bionic: 237-3ubuntu10.17
core18: systemd 237
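
To check whether a given machine already runs one of the validated package versions listed above, a comparison along these lines may be useful. It is a sketch under assumptions: `fixed_version_for` and `has_race_fix` are hypothetical helper names, it needs `dpkg` and `dpkg-query`, and it assumes that later package versions keep the patch.

```shell
#!/bin/sh
# Sketch only: fixed_version_for and has_race_fix are illustrative names.

# Print the systemd package version that was validated with the backport,
# as listed in the validation comment above.
fixed_version_for() {
    case "$1" in
        xenial) echo "229-4ubuntu21.19" ;;
        bionic) echo "237-3ubuntu10.17" ;;
        *) return 1 ;;
    esac
}

# Succeed if the installed systemd package is at least the validated version
# for the given release codename (requires dpkg, so Debian/Ubuntu only).
has_race_fix() {
    fixed=$(fixed_version_for "$1") || return 2
    installed=$(dpkg-query -W -f='${Version}' systemd) || return 2
    dpkg --compare-versions "$installed" ge "$fixed"
}
```

A possible invocation is `has_race_fix "$(lsb_release -cs)" && echo "validated version or newer installed"`. An exit status of 2 means the release is not covered here or the package version could not be read.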

tags: added: verification-done verification-done-bionic verification-done-xenial
removed: verification-needed verification-needed-bionic verification-needed-xenial

The verification of the Stable Release Update for systemd has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package systemd - 229-4ubuntu21.19

---------------
systemd (229-4ubuntu21.19) xenial; urgency=medium

  [ Michael Vogt ]
  * d/p/fix-race-daemon-reload-11121.patch:
    - backport systemd upstream PR#8803 and PR#11121 to fix race
      when doing systemctl and systemctl daemon-reload at the
      same time LP: #1819728

  [ Balint Reczey ]
  * d/p/virt-detect-WSL-environment-as-a-container.patch:
    - virt: detect WSL environment as a container (LP: #1816753)

 -- Michael Vogt <email address hidden> Mon, 25 Mar 2019 16:04:56 +0100

Changed in systemd (Ubuntu Xenial):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package systemd - 237-3ubuntu10.17

---------------
systemd (237-3ubuntu10.17) bionic; urgency=medium

  [ Michael Vogt ]
  * d/p/Support-system-image-read-only-etc.patch:
    - re-add support for /etc/writable for core18 (LP: #1778936)
  * d/p/fix-race-daemon-reload-8803.patch:
    - backport systemd upstream PR#8803 and PR#11121 to fix race
      when doing systemctl and systemctl daemon-reload at the
      same time LP: #1819728

  [ Balint Reczey ]
   * d/p/virt-detect-WSL-environment-as-a-container.patch:
     - virt: detect WSL environment as a container (LP: #1816753)

 -- Michael Vogt <email address hidden> Mon, 18 Mar 2019 08:40:44 +0100

Changed in systemd (Ubuntu Bionic):
status: Fix Committed → Fix Released
Paul van Tilburg (paulvt) wrote :

This fix caused a regression in our Dracut-based boot setup.
Since this patch was incorporated (237-3ubuntu10.17), several of the (PC-type) models we provide to our customers, those that tend to have "slow I/O", no longer boot: Dracut times out waiting for the root file system to come online because the systemd-crypt@ service can no longer be started.
We are currently using a build of this source package that doesn't include this patch to fix this temporarily.

I have filed a bug report at systemd's GitHub too, which contains more information:
https://github.com/systemd/systemd/issues/12371.
