cloud-init generates ordering cycle via After=cloud-init in systemd-fsck

Bug #1717477 reported by thermoman on 2017-09-15
This bug affects 1 person
Affects              Importance  Assigned to
cloud-init           High        Unassigned
cloud-init (Ubuntu)  High        Unassigned
  Xenial             High        Scott Moser
  Zesty              High        Scott Moser
  Artful             High        Unassigned

Bug Description

http://pad.lv/1717477
https://bugs.launchpad.net/ubuntu/+source/cloud-init/+bug/1717477

=== Begin SRU Template ===
[Impact]
Cloud-init's inclusion of a systemd drop-in file
  /lib/systemd/system/systemd-fsck@.service.d/cloud-init.conf
caused a regression on systems that had entries in /etc/fstab
that were not authored by cloud-init (specifically, entries that did
not have something like 'x-systemd.requires=cloud-init.service' in
their filesystem options).

[Test Case]
The test can be done on any cloud that has space to put a non-root
filesystem.

a.) launch instance
b.) upgrade cloud-init to the version in the -updates pocket
c.) create a filesystem and put it in /etc/fstab
    bdev="/dev/sdb1"
    mkdir -p /mnt
    mkfs.ext4 -F "$bdev"
    echo "$bdev /mnt auto defaults 0 2" >> /etc/fstab

    reboot
d.) see mention of 'ordering cycle' in journal

    $ journalctl -o short-precise | grep -i ordering.cycle
    Sep 15 14:08:48.331033 xenial-20170911-174122 systemd[1]: local-fs.target: Found ordering cycle on local-fs.target/start
    Sep 15 14:08:48.331097 xenial-20170911-174122 systemd[1]: local-fs.target: Breaking ordering cycle by deleting job mnt.mount/start
    Sep 15 14:08:48.331108 xenial-20170911-174122 systemd[1]: mnt.mount: Job mnt.mount/start deleted to break ordering cycle starting with local-fs.target/start

e.) upgrade to proposed
f.) reboot
g.) expect no mention of ordering cycle as seen in 'd'
    $ journalctl -o short-precise | grep -i ordering.cycle || echo "no cycles"
    no cycles
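
To see at a glance which jobs systemd dropped, the 'Breaking ordering cycle' lines from step d can be reduced to a list of unit names. This is only a minimal sketch and not part of the SRU test case: the function name deleted_jobs is made up for illustration, and it assumes the message wording shown in the journal excerpt above.

```shell
#!/bin/sh
# deleted_jobs: hypothetical helper. Reads journal text on stdin and prints
# each unit whose start job systemd deleted to break an ordering cycle.
deleted_jobs() {
    sed -n 's|.*Breaking ordering cycle by deleting job \([^ ]*\)/start.*|\1|p' | sort -u
}

# Sample input copied from the journal excerpt in step d.
deleted_jobs <<'EOF'
Sep 15 14:08:48.331033 xenial-20170911-174122 systemd[1]: local-fs.target: Found ordering cycle on local-fs.target/start
Sep 15 14:08:48.331097 xenial-20170911-174122 systemd[1]: local-fs.target: Breaking ordering cycle by deleting job mnt.mount/start
Sep 15 14:08:48.331108 xenial-20170911-174122 systemd[1]: mnt.mount: Job mnt.mount/start deleted to break ordering cycle starting with local-fs.target/start
EOF
```

On a live system the same function could be fed from the journal, for example with journalctl -b -o short-precise | deleted_jobs.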

[Regression Potential]
This change will mean that bug 1691489 is present again.
That bug is much less severe and affects a much smaller set of users.

[Other Info]
Upstream commit at
  https://git.launchpad.net/cloud-init/commit/?id=a2f8ce9c80

=== End SRU Template ===

We're running several machines with

  cloud-init_0.7.9-153-g16a7302f-0ubuntu1~16.04.2

without problems.

Just upgraded all machines to

  cloud-init_0.7.9-233-ge586fe35-0ubuntu1~16.04.1

and rebooted them all.

All machines report ordering cycles in their dmesg, resulting in systemd breaking the
loop by NOT starting some important services, e.g. mounting local filesystems:

Sep 14 15:43:52.487945 noname systemd[1]: networking.service: Found ordering cycle on networking.service/start
Sep 14 15:43:52.487952 noname systemd[1]: networking.service: Found dependency on local-fs.target/start
Sep 14 15:43:52.487960 noname systemd[1]: networking.service: Found dependency on home.mount/start
Sep 14 15:43:52.487968 noname systemd[1]: networking.service: Found dependency on systemd-fsck@dev-disk-by\x2dlabel-Home.service/start
Sep 14 15:43:52.487975 noname systemd[1]: networking.service: Found dependency on cloud-init.service/start
Sep 14 15:43:52.487982 noname systemd[1]: networking.service: Found dependency on networking.service/start
Sep 14 15:43:52.488297 noname systemd[1]: networking.service: Breaking ordering cycle by deleting job local-fs.target/start
Sep 14 15:43:52.488306 noname systemd[1]: local-fs.target: Job local-fs.target/start deleted to break ordering cycle starting with networking.service/start

% cat /etc/fstab
LABEL=cloudimg-rootfs / ext4 defaults,discard 0 1
LABEL=Home /home xfs defaults,logbufs=8 0 2

In this case /home isn't mounted as a result of systemd breaking the loop, resulting in services depending on /home not being started.
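
The chain in the log above can be modelled as a small ordering graph to show why one extra After= edge closes the loop. The sketch below is a toy model, not how systemd itself computes this: it uses tsort from GNU coreutils, the function names are made up for illustration, and the edge directions are my reading of the 'Found dependency on' lines.

```shell
#!/bin/sh
# Toy model of the ordering edges named in the journal excerpt above.
# Each line "A B" means "A must start before B" (B has After=A).
base_edges() {
    cat <<'EOF'
local-fs.target networking.service
home.mount local-fs.target
systemd-fsck@dev-disk-by\x2dlabel-Home.service home.mount
networking.service cloud-init.service
EOF
}

# The extra edge contributed by the cloud-init drop-in
# (After=cloud-init.service applied to systemd-fsck@.service).
dropin_edge() {
    printf '%s\n' 'cloud-init.service systemd-fsck@dev-disk-by\x2dlabel-Home.service'
}

# has_cycle: succeeds when tsort reports a loop in the edges on stdin.
# Assumption: GNU tsort, which exits non-zero when the input contains a loop.
has_cycle() {
    ! tsort >/dev/null 2>&1
}

if base_edges | has_cycle; then
    echo "without drop-in: cycle"
else
    echo "without drop-in: no cycle"
fi

if { base_edges; dropin_edge; } | has_cycle; then
    echo "with drop-in: cycle"
else
    echo "with drop-in: no cycle"
fi
```

Without the drop-in edge the five units sort cleanly; adding it makes every unit in the chain wait on itself, which is the situation systemd resolves by deleting a job.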

1. Tell us your cloud provider

AWS

2. dpkg-query -W -f='${Version}' cloud-init

0.7.9-233-ge586fe35-0ubuntu1~16.04.1

3. Any appropriate cloud-init configuration you can provide us

Nothing special - worked with 0.7.9-153-g16a7302f-0ubuntu1~16.04.2 on all machines without hassle.

The problem is this change:

diff -uaNr 153/lib/systemd/system/systemd-fsck@.service.d/cloud-init.conf 233/lib/systemd/system/systemd-fsck@.service.d/cloud-init.conf
--- 153/lib/systemd/system/systemd-fsck@.service.d/cloud-init.conf 1970-01-01 01:00:00.000000000 +0100
+++ 233/lib/systemd/system/systemd-fsck@.service.d/cloud-init.conf 2017-07-28 22:28:47.000000000 +0200
@@ -0,0 +1,2 @@
+[Unit]
+After=cloud-init.service

WORKAROUND
==========

I just did a

  rm /lib/systemd/system/systemd-fsck@.service.d/cloud-init.conf

on all machines and rebooted them: no more dependency loops reported, everything works again.
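
Before removing anything, it may help to confirm that the drop-in is actually present on a machine. The following is a minimal sketch under stated assumptions: the helper names and the ROOT variable are hypothetical, added only so the check can be pointed at a mounted image or a test directory instead of the live root.

```shell
#!/bin/sh
# dropin_path: prints the drop-in location under an optional root prefix.
dropin_path() {
    printf '%s\n' "${1:-}/lib/systemd/system/systemd-fsck@.service.d/cloud-init.conf"
}

# report_dropin: hypothetical helper. Says whether the drop-in exists and,
# if it does, whether it orders systemd-fsck@ after cloud-init.service.
report_dropin() {
    f=$(dropin_path "${1:-}")
    if [ ! -f "$f" ]; then
        echo "absent: $f"
    elif grep -q '^After=.*cloud-init\.service' "$f"; then
        echo "present with After=cloud-init.service: $f"
    else
        echo "present without After=cloud-init.service: $f"
    fi
}

# ROOT is an assumed, optional prefix; leave it unset to check the live system.
report_dropin "${ROOT:-}"
```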

Related bugs:
 * bug 1686514: Azure: cloud-init does not handle reformatting GPT partition ephemeral disks
 * bug 1691489: fstab entries written by cloud-config may not be mounted


thermoman (thermoman) on 2017-09-15
tags: added: cycle ordering
tags: added: systemd
removed: cycle ordering
Scott Moser (smoser) on 2017-09-15
Changed in cloud-init:
importance: Undecided → High
Scott Moser (smoser) on 2017-09-15
Changed in cloud-init:
status: New → Confirmed
Scott Moser (smoser) wrote :

I can reproduce this fairly easily with a standard ubuntu image.

a.) launch an instance on openstack (or somewhere with an additional disk)

   upgrade cloud-init if necessary.

b.) create a filesystem and put it in /etc/fstab with simple options.
  on my OpenStack instance, cloud-init sets up an entry like:

   /dev/vdb /mnt auto defaults,nofail,x-systemd.requires=cloud-init.service,comment=cloudconfig 0 2

  but what we want is just a simple line:

   /dev/vdb /mnt auto defaults 0 2

c.) reboot

I'm attaching 'journalctl -o short-precise' output for first boot and after reboot (with the failure).
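
To find the /etc/fstab entries of the second kind, those without the x-systemd.requires=cloud-init.service option, something like the following could be used. It is a rough sketch: plain_entries is a made-up name, and the output only lists candidates. Whether a given entry actually triggers the cycle is not decided here.

```shell
#!/bin/sh
# plain_entries: hypothetical helper. Reads fstab text on stdin and prints
# "device mountpoint passno" for every entry whose options do not include
# x-systemd.requires=cloud-init.service.
plain_entries() {
    awk '
        /^[ \t]*(#|$)/ { next }    # skip comments and blank lines
        $4 !~ /(^|,)x-systemd\.requires=cloud-init\.service(,|$)/ {
            print $1, $2, ($6 == "" ? 0 : $6)
        }
    '
}

# Sample input: the two fstab lines quoted above.
plain_entries <<'EOF'
/dev/vdb /mnt auto defaults,nofail,x-systemd.requires=cloud-init.service,comment=cloudconfig 0 2
/dev/vdb /mnt auto defaults 0 2
EOF
```

Only the plain line is printed for this sample; on a real machine the function would read from /etc/fstab instead of the inline text.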

Scott Moser (smoser) wrote :
description: updated
Changed in cloud-init (Ubuntu):
status: New → Confirmed
importance: Undecided → High
Scott Moser (smoser) on 2017-09-15
Changed in cloud-init (Ubuntu Xenial):
status: New → Confirmed
Changed in cloud-init (Ubuntu Zesty):
status: New → Confirmed
Changed in cloud-init (Ubuntu Xenial):
importance: Undecided → High
Changed in cloud-init (Ubuntu Zesty):
importance: Undecided → High
tags: added: regression-released
tags: added: regression-release
removed: regression-released
Balint Reczey (rbalint) wrote :

I suggest reverting the commit causing the regression and fixing LP: #1691489 in another way.

Scott Moser (smoser) on 2017-09-15
description: updated
Scott Moser (smoser) on 2017-09-15
Changed in cloud-init (Ubuntu Xenial):
status: Confirmed → In Progress
Changed in cloud-init (Ubuntu Zesty):
status: Confirmed → In Progress
Changed in cloud-init (Ubuntu Artful):
status: Confirmed → In Progress
Changed in cloud-init (Ubuntu Xenial):
assignee: nobody → Scott Moser (smoser)
Changed in cloud-init (Ubuntu Zesty):
assignee: nobody → Scott Moser (smoser)
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package cloud-init - 0.7.9-280-ge626966e-0ubuntu1

---------------
cloud-init (0.7.9-280-ge626966e-0ubuntu1) artful; urgency=medium

  * debian/rules: install rsyslog file with 0644 mode instead of 0755.
  * debian/rules, debian/apport-launcher.py: add an apport hook. (LP: #1607345)
  * New upstream snapshot.
    - cmdline: add collect-logs subcommand. [Chad Smith] (LP: #1607345)
    - CloudStack: consider dhclient lease files named with a hyphen.
      (LP: #1717147)
    - resizefs: Drop check for read-only device file, do not warn on
      overlayroot. [Chad Smith]
    - Do not provide systemd-fsck drop-in which could cause ordering cycles.
      [Balint Reczey] (LP: #1717477)
    - tests: Enable the NoCloud KVM platform [Joshua Powers]
    - resizefs: pass mount point to xfs_growfs [Dusty Mabe]
    - vmware: Enable nics before sending the SUCCESS event. [Sankar Tanguturi]
    - cloud-config modules: honor distros definitions in each module
      [Chad Smith] (LP: #1715738, #1715690)
    - chef: Add option to pin chef omnibus install version
      [Ethan Apodaca] (LP: #1462693)
    - tests: execute: support command as string [Joshua Powers]
    - schema and docs: Add jsonschema to resizefs and bootcmd modules
      [Chad Smith]
    - tools: Add xkvm script, wrapper around qemu-system [Joshua Powers]
    - vmware customization: return network config format
      [Sankar Tanguturi] (LP: #1675063)

 -- Scott Moser <email address hidden> Fri, 15 Sep 2017 16:09:07 -0400

Changed in cloud-init (Ubuntu Artful):
status: In Progress → Fix Released

Hello thermoman, or anyone else affected,

Accepted cloud-init into zesty-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/cloud-init/0.7.9-233-ge586fe35-0ubuntu1~17.04.2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-zesty to verification-done-zesty. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-zesty. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in cloud-init (Ubuntu Zesty):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-zesty
Łukasz Zemczak (sil2100) wrote :

Since this basically re-introduces LP: #1691489, we need to remember to revert the 'Fix Released' fields for each series in that bug once this goes out to -updates.

Changed in cloud-init (Ubuntu Xenial):
status: In Progress → Fix Committed
tags: added: verification-needed-xenial
Łukasz Zemczak (sil2100) wrote :

Hello thermoman, or anyone else affected,

Accepted cloud-init into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/cloud-init/0.7.9-233-ge586fe35-0ubuntu1~16.04.2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-xenial to verification-done-xenial. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-xenial. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

This bug is believed to be fixed in cloud-init in 17.1. If this is still a problem for you, please make a comment and set the state back to New.

Thank you.

Changed in cloud-init:
status: Confirmed → Fix Released
Scott Moser (smoser) wrote :

## launched instance on Openstack
$ cat /etc/cloud/build.info
build_name: server
serial: 20170929

$ git clone http://git.launchpad.net/~smoser/cloud-init/+git/sru-info

$ sudo apt-get update -q && sudo apt-get install -q cloud-init
..
cloud-init is already the newest version (0.7.9-233-ge586fe35-0ubuntu1~16.04.1).

$ dpkg-query --show cloud-init
0.7.9-233-ge586fe35-0ubuntu1~16.04.1

$ ./sru-info/bin/save-old-data orig-boot

$ bdev="/dev/vdb";
$ sudo umount $bdev || :
$ sudo sed -i '/comment=cloudconfig/d' /etc/fstab
$ sudo mkdir -p /mnt
$ sudo mkfs.ext4 -F "$bdev"
$ echo "$bdev /mnt auto defaults 0 2" | sudo tee -a /etc/fstab

$ cat /etc/fstab
LABEL=cloudimg-rootfs / ext4 defaults 0 0
/dev/vdb /mnt auto defaults 0 2

$ ./sru-info/bin/do-reboot

### ssh back in and see the bug.
$ journalctl -o short-precise | grep -i ordering.cycle
Oct 02 19:40:09.174479 xenial-1717477 systemd[1]: networking.service: Found ordering cycle on networking.service/start
Oct 02 19:40:09.174552 xenial-1717477 systemd[1]: networking.service: Breaking ordering cycle by deleting job apparmor.service/start
Oct 02 19:40:09.174565 xenial-1717477 systemd[1]: apparmor.service: Job apparmor.service/start deleted to break ordering cycle starting with networking.service/start
Oct 02 19:40:09.174576 xenial-1717477 systemd[1]: networking.service: Found ordering cycle on networking.service/start
Oct 02 19:40:09.174638 xenial-1717477 systemd[1]: networking.service: Breaking ordering cycle by deleting job local-fs.target/start
Oct 02 19:40:09.174650 xenial-1717477 systemd[1]: local-fs.target: Job local-fs.target/start deleted to break ordering cycle starting with networking.service/start

### enable proposed to show fix.
$ sudo ./sru-info/bin/enable-proposed
deb http://nova.clouds.archive.ubuntu.com/ubuntu/ xenial-proposed main universe

$ sudo apt-get update -q && sudo apt-get install -qy cloud-init
...
Unpacking cloud-init (0.7.9-233-ge586fe35-0ubuntu1~16.04.2) over (0.7.9-233-ge586fe35-0ubuntu1~16.04.1) ...
Processing triggers for ureadahead (0.100.0-19) ...
Setting up cloud-init (0.7.9-233-ge586fe35-0ubuntu1~16.04.2) ...
Leaving 'diversion of /etc/init/ureadahead.conf to /etc/init/ureadahead.conf.disabled by cloud-init'

$ sudo ./sru-info/bin/save-old-data show-bug

$ sudo ./sru-info/bin/do-reboot

## ssh back in
$ journalctl -o short-precise | grep -i ordering.cycle || echo "no cycles"
no cycles

$ grep WARN /var/log/cloud-init || echo "no warnings"
no warnings

$ cat /etc/fstab
LABEL=cloudimg-rootfs / ext4 defaults 0 0
/dev/vdb /mnt auto defaults 0 2

$ grep vdb /proc/mounts
/dev/vdb /mnt ext4 rw,relatime,data=ordered 0 0

tags: added: verification-done-xenial
removed: verification-needed-xenial
Scott Moser (smoser) wrote :

## launch instance on OpenStack
$ cat /etc/cloud/build.info
build_name: server
serial: 20170922

$ git clone http://git.launchpad.net/~smoser/cloud-init/+git/sru-info

$ sudo apt-get update -q && sudo apt-get install -q cloud-init
cloud-init is already the newest version (0.7.9-233-ge586fe35-0ubuntu1~17.04.1).

$ dpkg-query --show cloud-init
cloud-init 0.7.9-233-ge586fe35-0ubuntu1~17.04.1

$ sudo ./sru-info/bin/save-old-data orig-boot

$ bdev="/dev/vdb";
$ sudo umount $bdev || :
$ sudo sed -i '/comment=cloudconfig/d' /etc/fstab
$ sudo mkdir -p /mnt
$ sudo mkfs.ext4 -F "$bdev"
$ echo "$bdev /mnt auto defaults 0 2" | sudo tee -a /etc/fstab

LABEL=cloudimg-rootfs / ext4 defaults 0 0
LABEL=UEFI /boot/efi vfat defaults 0 0
/dev/vdb /mnt auto defaults 0 2

$ sudo ./sru-info/bin/do-reboot
$ journalctl -o short-precise | grep -i ordering.cycle
Oct 02 19:53:10.852617 zesty-1717477 systemd[1]: apparmor.service: Found ordering cycle on apparmor.service/start

## enable proposed to show fix
$ sudo ./sru-info/bin/enable-proposed
deb http://nova.clouds.archive.ubuntu.com/ubuntu/ zesty-proposed main universe
ubuntu@zesty-1717477:~$

$ sudo apt-get update -q && sudo apt-get install -qy cloud-init
...
Setting up cloud-init (0.7.9-233-ge586fe35-0ubuntu1~17.04.2) ...

$ sudo ./sru-info/bin/save-old-data show-bug

$ sudo ./sru-info/bin/do-reboot

$ journalctl -o short-precise | grep -i ordering.cycle || echo "no cycles"
no cycles

$ grep WARN /var/log/cloud-init.log || echo "no warnings"
no warnings

$ cat /etc/fstab
LABEL=cloudimg-rootfs / ext4 defaults 0 0
LABEL=UEFI /boot/efi vfat defaults 0 0
/dev/vdb /mnt auto defaults 0 2

$ grep vdb /proc/mounts
/dev/vdb /mnt ext4 rw,relatime,data=ordered 0 0

tags: added: verification-done-zesty
removed: verification-needed verification-needed-zesty

The verification of the Stable Release Update for cloud-init has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package cloud-init - 0.7.9-233-ge586fe35-0ubuntu1~17.04.2

---------------
cloud-init (0.7.9-233-ge586fe35-0ubuntu1~17.04.2) zesty; urgency=medium

  * cherry-pick a2f8ce9c: Do not provide systemd-fsck drop-in which
    could cause systemd ordering cycles (LP: #1717477).

 -- Scott Moser <email address hidden> Fri, 15 Sep 2017 15:30:01 -0400

Changed in cloud-init (Ubuntu Zesty):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package cloud-init - 0.7.9-233-ge586fe35-0ubuntu1~16.04.2

---------------
cloud-init (0.7.9-233-ge586fe35-0ubuntu1~16.04.2) xenial-proposed; urgency=medium

  * cherry-pick a2f8ce9c: Do not provide systemd-fsck drop-in which
    could cause systemd ordering loops (LP: #1717477).

 -- Scott Moser <email address hidden> Fri, 15 Sep 2017 15:23:38 -0400

Changed in cloud-init (Ubuntu Xenial):
status: Fix Committed → Fix Released