Breaking ordering cycle by deleting job NetworkManager.service/start

Bug #1487679 reported by Marc Schmitt on 2015-08-22
This bug affects 31 people
Affects                  | Status       | Importance | Assigned to | Milestone
systemd                  | Fix Released | High       |             |
avahi (Ubuntu)           |              | Undecided  | Unassigned  |
  Xenial                 |              | Undecided  | Unassigned  |
nbd (Debian)             | Fix Released | Unknown    |             |
nbd (Ubuntu)             |              | Undecided  | Unassigned  |
  Xenial                 |              | Undecided  | Unassigned  |
network-manager (Ubuntu) |              | Undecided  | Unassigned  |
  Xenial                 |              | Undecided  | Unassigned  |
nfs-utils (Ubuntu)       |              | Undecided  | Unassigned  |
  Xenial                 |              | Undecided  | Unassigned  |
rpcbind (Ubuntu)         |              | Undecided  | Unassigned  |
  Xenial                 |              | Undecided  | Unassigned  |
systemd (Ubuntu)         |              | Undecided  | Unassigned  |
  Xenial                 |              | Undecided  | Unassigned  |
util-linux (Ubuntu)      |              | Undecided  | Unassigned  |
  Xenial                 |              | Undecided  | Unassigned  |

Bug Description

$ lsb_release -rd
Description: Ubuntu 15.04
Release: 15.04

$ apt-cache policy nbd-client
nbd-client:
  Installed: 1:3.8-4ubuntu0.1
  Candidate: 1:3.8-4ubuntu0.1
  Version table:
 *** 1:3.8-4ubuntu0.1 0
        500 http://ch.archive.ubuntu.com/ubuntu/ vivid-updates/main amd64 Packages
        500 http://security.ubuntu.com/ubuntu/ vivid-security/main amd64 Packages
        100 /var/lib/dpkg/status
     1:3.8-4 0
        500 http://ch.archive.ubuntu.com/ubuntu/ vivid/main amd64 Packages

I'm using the nbd-client to mount some raw disk images over the network but starting the nbd-client automatically during bootup does not happen due to the following:

Aug 22 08:54:20 fractal kernel: [ 11.875885] systemd[1]: Found dependency on nbd-client.service/start
Aug 22 08:54:20 fractal kernel: [ 11.875890] systemd[1]: Breaking ordering cycle by deleting job nbd-client.service/start
Aug 22 08:54:20 fractal kernel: [ 11.875891] systemd[1]: Job nbd-client.service/start deleted to break ordering cycle starting with basic.target/start

This means that after boot, I have to manually run `sudo nbd-client start` every time I want to access these images. Note that this is not a diskless system: the images I mount via NBD do not contain the local system; they are totally unrelated.

-----------------------------------------------------------------------------
Bug with NFS-server and RPC-bind is indicated by messages:

$ journalctl | grep -i break
Nov 06 22:49:57 norbert-vaio systemd[1]: network.target: Breaking ordering cycle by deleting job NetworkManager.service/start
Nov 06 22:49:57 norbert-vaio systemd[1]: NetworkManager.service: Job NetworkManager.service/start deleted to break ordering cycle starting with network.target/start
Nov 06 22:49:57 norbert-vaio systemd[1]: nfs-server.service: Breaking ordering cycle by deleting job rpcbind.socket/start
Nov 06 22:49:57 norbert-vaio systemd[1]: rpcbind.socket: Job rpcbind.socket/start deleted to break ordering cycle starting with nfs-server.service/start

Description of problem:
With NFS mounts in an fstab it appears that NetworkManager and the autogenerated mount units get into a dependency loop which causes a number of important units to get discarded while trying to resolve the issue.

Version-Release number of selected component (if applicable):
Seen in FC16,17,18

How reproducible:
Every boot with NFS mounts in the fstab

Steps to Reproduce:
1. Try and boot a system with NFS mounts in the fstab.

Actual results:
Several units are deleted and never attempt to stop, several others fail because the root filesystem doesn't get remounted RW until other units time out.

Expected results:
System should come up, remount the root filesystem RW, mount NFS volumes in the fstab without dependency loops or units timing out.

Additional info:

[ 6.943559] systemd[1]: Found ordering cycle on NetworkManager-wait-online.service/start
[ 6.943707] systemd[1]: Walked on cycle path to NetworkManager.service/start
[ 6.943810] systemd[1]: Walked on cycle path to dbus.socket/start
[ 6.943912] systemd[1]: Walked on cycle path to sysinit.target/start
[ 6.944039] systemd[1]: Walked on cycle path to local-fs.target/start
[ 6.944141] systemd[1]: Walked on cycle path to var-storage-fuse-mp3.mount/start
[ 6.944287] systemd[1]: Walked on cycle path to var-storage.mount/start
[ 6.944388] systemd[1]: Walked on cycle path to remote-fs-pre.target/start
[ 6.944490] systemd[1]: Walked on cycle path to network.target/start
[ 6.944588] systemd[1]: Walked on cycle path to NetworkManager-wait-online.service/start
[ 6.944735] systemd[1]: Breaking ordering cycle by deleting job NetworkManager.service/start
[ 6.944883] systemd[1]: Job NetworkManager.service/start deleted to break ordering cycle starting with NetworkManager-wait-online.service/start
[ 6.945103] systemd[1]: Deleting job NetworkManager-wait-online.service/start as dependency of job NetworkManager.service/verify-active

Same problem here during reboot on Fedora 19:

Aug 4 16:18:43 edison systemd[1]: Found ordering cycle on local-fs.target/stop
Aug 4 16:18:43 edison systemd[1]: Walked on cycle path to nfs-darwin.mount/stop
Aug 4 16:18:43 edison systemd[1]: Walked on cycle path to network.target/stop
Aug 4 16:18:43 edison systemd[1]: Walked on cycle path to NetworkManager.service/stop
Aug 4 16:18:43 edison systemd[1]: Walked on cycle path to dbus.socket/stop
Aug 4 16:18:43 edison systemd[1]: Walked on cycle path to sysinit.target/stop
Aug 4 16:18:43 edison systemd[1]: Walked on cycle path to local-fs.target/stop
Aug 4 16:18:43 edison systemd[1]: Breaking ordering cycle by deleting job nfs-darwin.mount/stop
Aug 4 16:18:43 edison systemd[1]: Job nfs-darwin.mount/stop deleted to break ordering cycle starting with local-fs.target/stop
Aug 4 16:18:43 edison systemd[1]: Found ordering cycle on local-fs.target/stop
Aug 4 16:18:43 edison systemd[1]: Walked on cycle path to nfs-darwin-home.mount/stop

/etc/fstab contains:

darwin:/ /nfs/darwin nfs4 rw,soft,intr,noauto 0 0

This should be fixed by the latest f18 systemd update.

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in nbd (Ubuntu):
status: New → Confirmed
Echo Nolan (echonolan) wrote :

I have this problem as well, or at least a related one. On my system systemd resolves the cycle by disabling NetworkManager, so I have no internet until I systemctl start NetworkManager. Here's the relevant section of the log:

Feb 04 12:17:32 Behemoth systemd[1]: network-online.target: Found ordering cycle on network-online.target/start
Feb 04 12:17:31 Behemoth mtp-probe[583]: bus: 2, device: 4 was not an MTP device
Feb 04 12:17:32 Behemoth systemd[1]: network-online.target: Found dependency on NetworkManager-wait-online.service/start
Feb 04 12:17:33 Behemoth nvidia-persistenced[684]: Started (684)
Feb 04 12:17:32 Behemoth systemd[1]: network-online.target: Found dependency on basic.target/start
Feb 04 12:17:33 Behemoth nvidia-persistenced[684]: Failed to query NVIDIA devices. Please ensure that the NVIDIA device files (/dev/nvidia*) exist, and that user 121 has read and write permissions for those files
Feb 04 12:17:32 Behemoth systemd[1]: network-online.target: Found dependency on paths.target/start
Feb 04 12:17:33 Behemoth nvidia-persistenced[684]: The daemon no longer has permission to remove its runtime data directory /var/run/nvidia-persistenced
Feb 04 12:17:32 Behemoth systemd[1]: network-online.target: Found dependency on cups.path/start
Feb 04 12:17:33 Behemoth nvidia-persistenced[684]: Shutdown (684)
Feb 04 12:17:32 Behemoth systemd[1]: network-online.target: Found dependency on sysinit.target/start
Feb 04 12:17:32 Behemoth systemd[1]: network-online.target: Found dependency on nbd-client.service/start
Feb 04 12:17:32 Behemoth systemd[1]: network-online.target: Found dependency on network-online.target/start
Feb 04 12:17:32 Behemoth systemd[1]: network-online.target: Breaking ordering cycle by deleting job NetworkManager-wait-online.service/start

This is a regression from 15.04 to 15.10.
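Because the systemd messages in logs like the one above are interleaved with output from other daemons, the cycle can be hard to read. The following is a minimal sketch, not part of the original report, that pulls the cycle path and the deleted job out of such lines. The function name `extract_cycles` and the abridged `SAMPLE` excerpt are illustrative assumptions; the regular expression only covers the message wordings quoted in this bug.

```python
import re

# Matches the systemd messages quoted in this report, with or without the
# "unit-name: " prefix that newer systemd versions add. Other lines are ignored.
CYCLE_RE = re.compile(
    r"systemd\[1\]: (?:\S+: )?"
    r"(?P<kind>Found ordering cycle on|Found dependency on|"
    r"Walked on cycle path to|Breaking ordering cycle by deleting job) "
    r"(?P<job>\S+)"
)


def extract_cycles(lines):
    """Return a list of {'path': [jobs...], 'deleted': job or None} dicts."""
    cycles = []
    current = None
    for line in lines:
        m = CYCLE_RE.search(line)
        if not m:
            continue  # unrelated log line (e.g. mtp-probe, nvidia-persistenced)
        kind, job = m.group("kind"), m.group("job")
        if kind == "Found ordering cycle on":
            current = {"path": [job], "deleted": None}
            cycles.append(current)
        elif current is None:
            continue  # cycle message without a preceding "Found ordering cycle"
        elif kind.startswith("Breaking"):
            current["deleted"] = job
            current = None
        else:
            current["path"].append(job)
    return cycles


# Abridged excerpt of the log quoted above (hypothetical selection of lines).
SAMPLE = [
    "Feb 04 12:17:32 Behemoth systemd[1]: network-online.target: Found ordering cycle on network-online.target/start",
    "Feb 04 12:17:31 Behemoth mtp-probe[583]: bus: 2, device: 4 was not an MTP device",
    "Feb 04 12:17:32 Behemoth systemd[1]: network-online.target: Found dependency on NetworkManager-wait-online.service/start",
    "Feb 04 12:17:32 Behemoth systemd[1]: network-online.target: Found dependency on nbd-client.service/start",
    "Feb 04 12:17:32 Behemoth systemd[1]: network-online.target: Found dependency on network-online.target/start",
    "Feb 04 12:17:32 Behemoth systemd[1]: network-online.target: Breaking ordering cycle by deleting job NetworkManager-wait-online.service/start",
]

if __name__ == "__main__":
    for cycle in extract_cycles(SAMPLE):
        print(" -> ".join(cycle["path"]))
        print("deleted:", cycle["deleted"])
```

Feeding it the output of `journalctl -b` (read line by line) should work the same way, assuming the messages use one of the wordings above.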

Alkis Georgopoulos (alkisg) wrote :

I'm experiencing this in Xenial 16.04 fully updated.
I have nbd-client installed but I'm not using its sysvinit service at all (LTSP).
One out of 10 boots, systemd decides to:
Apr 07 08:20:15 srv1-dide systemd[1]: network.target: Breaking ordering cycle by deleting job NetworkManager.service/start

...and we have to manually start network-manager to get network access.

Martin Pitt (pitti) wrote :

This is indeed a too strong/impossible (with NetworkManager) requirement in /etc/init.d/nbd-client:

# Required-Start: $network $local_fs
# Default-Start: S

A service can't simultaneously run in early boot *and* require the network to be up, as that does not work with NetworkManager, connman, or similar "heavy" machinery to bring up the network. It works with ifupdown as that can run sufficiently early in the boot.

If nbd-client really can't work without the network being up, then it needs to run later, i. e. "Default-Start: 2 3 4 5". I don't know this package, would that be reasonable?
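The combination described above (an early-boot `Default-Start: S` together with `$network` in `Required-Start`) can be checked for mechanically. This is a rough sketch under stated assumptions: the helper names are made up for illustration, and the `Provides:` line in the example header is an assumption, since only the `Required-Start` and `Default-Start` lines are quoted in this report. As later comments note, changing `Default-Start` alone did not resolve the cycle, so a negative result here does not prove a script is safe.

```python
import re


def parse_lsb_header(script_text):
    """Return {keyword: [values]} from the '### BEGIN/END INIT INFO' block."""
    fields = {}
    inside = False
    for line in script_text.splitlines():
        if line.startswith("### BEGIN INIT INFO"):
            inside = True
        elif line.startswith("### END INIT INFO"):
            break
        elif inside:
            m = re.match(r"#\s*([A-Za-z-]+):\s*(.*)", line)
            if m:
                fields[m.group(1)] = m.group(2).split()
    return fields


def starts_early_but_needs_network(fields):
    """True if the script runs in runlevel S and also requires $network."""
    needs_network = "$network" in fields.get("Required-Start", [])
    early_boot = "S" in fields.get("Default-Start", [])
    return needs_network and early_boot


# Example header; only Required-Start and Default-Start come from the report.
EXAMPLE = """\
#!/bin/sh
### BEGIN INIT INFO
# Provides:          nbd-client
# Required-Start:    $network $local_fs
# Default-Start:     S
### END INIT INFO
"""

if __name__ == "__main__":
    print(starts_early_but_needs_network(parse_lsb_header(EXAMPLE)))
```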

Changed in nbd (Ubuntu):
status: Confirmed → Triaged
Alkis Georgopoulos (alkisg) wrote :

The same issue (mostly) was reported in https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=796633

Wouter plans to replace the init script with a generator in time for Debian Stretch.

I've asked him to comment on whether what pitti suggested is acceptable until then (for Ubuntu Xenial's release).

Alkis Georgopoulos (alkisg) wrote :

Hmm. The "2 3 4 5" change doesn't seem to be enough to break the dependency cycle.

I played a bit to see what could break the cycle but I couldn't find it.
By completely removing /etc/init.d/nbd-client, all is fine.
By putting *just* this minimal header:

# Required-Start: $network

...and removing all the other ones, I'm still getting
> nbd-client.service: Job nbd-client.service/start deleted
> to break ordering cycle starting with basic.target/start.

Alkis Georgopoulos (alkisg) wrote :

I can't reproduce the issue in Debian Stretch.

Alkis Georgopoulos (alkisg) wrote :

I was able to reproduce the issue in Debian Stretch after applying an Ubuntu-specific patch to it. So I suspect that Ubuntu's network-manager might be involved in this, and I've added it to the affects list.

Specifically, by editing Debian's /lib/systemd/system/NetworkManager.service like this:
-After=network-pre.target dbus.service
+Wants=network.target
+Before=network.target

...the dependency cycle error was affecting Debian too.

Tomorrow I'll try the opposite, to revert the Ubuntu .diff from /lib/systemd/system/NetworkManager.service and see if it solves the issue.

Martin Pitt (pitti) wrote :

The main difference between D and U is that Debian's NetworkManager does not enable NetworkManager-wait-online.service by default. This is very likely to create an ordering cycle like this, as this pulls in NetworkManager (which is late boot) into network-online.target, which nbd-client requires.
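To illustrate how such a cycle closes, here is a small sketch that models the ordering edges from the Ubuntu 15.10 log quoted earlier in this bug and finds the loop with a depth-first search. It assumes each "Found dependency on" line means "the previous job has to wait for this one". The function `find_cycle` is illustrative only and is not how systemd itself is implemented.

```python
def find_cycle(after):
    """Return one ordering cycle as a list of units, or None if there is none.

    `after` maps each unit to the units it has to wait for.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    state = {}
    stack = []

    def visit(unit):
        state[unit] = GREY
        stack.append(unit)
        for dep in after.get(unit, ()):
            dep_state = state.get(dep, WHITE)
            if dep_state == GREY:
                # dep is already on the current path: the loop is closed.
                return stack[stack.index(dep):] + [dep]
            if dep_state == WHITE:
                found = visit(dep)
                if found:
                    return found
        stack.pop()
        state[unit] = BLACK
        return None

    for unit in after:
        if state.get(unit, WHITE) == WHITE:
            found = visit(unit)
            if found:
                return found
    return None


# Edges read off the log in this report (network-online.target cycle).
ORDERING = {
    "network-online.target": ["NetworkManager-wait-online.service"],
    "NetworkManager-wait-online.service": ["basic.target"],
    "basic.target": ["paths.target"],
    "paths.target": ["cups.path"],
    "cups.path": ["sysinit.target"],
    "sysinit.target": ["nbd-client.service"],
    "nbd-client.service": ["network-online.target"],
}

if __name__ == "__main__":
    print(" -> ".join(find_cycle(ORDERING)))
```

In this model, dropping the edge from `nbd-client.service` to `network-online.target` leaves no cycle, which is consistent with the observation in this thread that removing the init script (or its `$network` requirement) makes the problem go away.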

Martin Pitt (pitti) wrote :

+Before=network.target

This is absolutely necessary too, otherwise NetworkManager can be stopped (on shutdown/reboot) before services that require the network to be up get shut down. Thus these services often hang then.

Changed in network-manager (Ubuntu):
status: New → Won't Fix
Alkis Georgopoulos (alkisg) wrote :

I confirm that by enabling NetworkManager-wait-online.service in Debian Stretch, I started getting dependency cycles.
Replacing "# Default-Start: S" with "# Default-Start: 2 3 4 5" didn't solve the cycles.

Any fixes or workarounds for Xenial?

For now, I removed "Required-Start: $network" from all the sysvinit services that had it (e.g. /etc/init.d/nbd-client).

Alkis Georgopoulos (alkisg) wrote :

@pitti, since
1) wouter isn't planning to fix this upstream soonish (Debian freeze is months ahead),
2) the issue is serious, i.e. people that install nbd-client, then randomly don't have network-manager running after reboots,
3) the included patch is surely better than the existing situation, i.e. it solves the ordering cycle issue by just starting nbd-client later on,

would you approve an SRU for the included patch,
or do you have some better idea on how to handle this?

Thanks!

Martin Pitt (pitti) wrote :

@Alkis: Doesn't that break nbd-client for people who use ifupdown? With that patch it will start much earlier, way before the network is up. Does nbd-client get along with that? I. e. it doesn't fail on startup without a network connection and listens for new network connections coming up?

no longer affects: systemd (Ubuntu)
tags: added: patch
Alkis Georgopoulos (alkisg) wrote :

Hi Martin,

I don't have access to any systems without network-manager (that's ubuntu-server installations, I imagine?) so I cannot say if it currently works with plain ifupdown or not, and if the patch will break it.

Let's assume the worst, that it now works there and that the patch will break it. In that case we have:

1) The nbd-client sysvinit service will not work as expected; users will notice and hopefully will find this bug report telling them that they need to revert the $network patch.

But now we have:

2) The nbd-client sysvinit service breaking not only itself but other packages as well! Network-manager or other services randomly not starting!
This even affects persons that don't use the nbd-client sysvinit service at all, but just need the nbd-client executable (like LTSP or QEMU users or persons wishing to use nbd-client to temporarily access a remote disk).

I believe that (1) is a lesser evil than (2), since I imagine (1) would only affect a minority of the ubuntu-server installations that also happen to need the nbd-client service.

In any case, if this won't be fixed for Xenial, let's please mention it in this bug report, so that we develop a workaround in LTSP, for all LTSP users that need it immediately.

Thanks a lot,
Alkis

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in nfs-utils (Ubuntu):
status: New → Confirmed
Changed in systemd (Ubuntu):
status: New → Confirmed
Norbert (nrbrtx) wrote :

Got this bug again on another Ubuntu 16.04.3 LTS system:

$ journalctl | grep -i network
Nov 06 22:49:57 norbert-vaio kernel: FUJITSU Extended Socket Network Device Driver - version 1.0 - Copyright (c) 2015 FUJITSU LIMITED
Nov 06 22:49:57 norbert-vaio systemd[1]: network.target: Found ordering cycle on network.target/start
Nov 06 22:49:57 norbert-vaio systemd[1]: network.target: Found dependency on NetworkManager.service/start
Nov 06 22:49:57 norbert-vaio systemd[1]: network.target: Found dependency on basic.target/start
Nov 06 22:49:57 norbert-vaio systemd[1]: network.target: Found dependency on sockets.target/start
Nov 06 22:49:57 norbert-vaio systemd[1]: network.target: Found dependency on uuidd.socket/start
Nov 06 22:49:57 norbert-vaio systemd[1]: network.target: Found dependency on sysinit.target/start
Nov 06 22:49:57 norbert-vaio systemd[1]: network.target: Found dependency on console-cyrillic.service/start
Nov 06 22:49:57 norbert-vaio systemd[1]: network.target: Found dependency on remote-fs.target/start
Nov 06 22:49:57 norbert-vaio systemd[1]: network.target: Found dependency on remote-fs-pre.target/start
Nov 06 22:49:57 norbert-vaio systemd[1]: network.target: Found dependency on nfs-server.service/start
Nov 06 22:49:57 norbert-vaio systemd[1]: network.target: Found dependency on network.target/start
Nov 06 22:49:57 norbert-vaio systemd[1]: network.target: Breaking ordering cycle by deleting job NetworkManager.service/start
Nov 06 22:49:57 norbert-vaio systemd[1]: NetworkManager.service: Job NetworkManager.service/start deleted to break ordering cycle starting with network.target/start

$ journalctl | grep -i break
Nov 06 22:49:57 norbert-vaio systemd[1]: network.target: Breaking ordering cycle by deleting job NetworkManager.service/start
Nov 06 22:49:57 norbert-vaio systemd[1]: NetworkManager.service: Job NetworkManager.service/start deleted to break ordering cycle starting with network.target/start
Nov 06 22:49:57 norbert-vaio systemd[1]: nfs-server.service: Breaking ordering cycle by deleting job rpcbind.socket/start
Nov 06 22:49:57 norbert-vaio systemd[1]: rpcbind.socket: Job rpcbind.socket/start deleted to break ordering cycle starting with nfs-server.service/start

I'm trying to remove nfs-kernel-server, but this is still a critical bug.

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in avahi (Ubuntu):
status: New → Confirmed
Changed in util-linux (Ubuntu):
status: New → Confirmed
Norbert (nrbrtx) on 2017-11-06
summary: - Breaking ordering cycle by deleting job nbd-client.service/start
+ metabug: Breaking ordering cycle by deleting job
+ NetworkManager.service/start
summary: - metabug: Breaking ordering cycle by deleting job
+ CRITICAL BUG: Breaking ordering cycle by deleting job
NetworkManager.service/start
tags: added: xenial
Norbert (nrbrtx) on 2017-11-06
description: updated
Changed in systemd:
importance: Unknown → High
status: Unknown → Fix Released

There is another bug, 1717459, about "Job local-fs.target/start deleted to break ordering cycle starting with networking.service/start".

All these bugs make a system with systemd useless (whether on a desktop, in the cloud, or on a server).

description: updated
tags: added: systemd-boot

There are many potentially affected packages that I have no insight into, but for nbd this should be fixed since 1:3.14-1, as upstream now provides a native systemd service.

That means >=Zesty should be fixed in that regard.
I'm not sure about backporting that: one would need to check what further context would have to go back to Xenial, but that might be the proper solution to try.

Changed in nbd (Ubuntu Xenial):
status: New → Triaged
Changed in nbd (Ubuntu):
status: Triaged → Fix Released
Changed in nbd (Debian):
status: Unknown → Fix Released
adm (alexm-) wrote :

was really glad to read it $-)
take my 5 cents:

$ sudo dmesg |grep dnscrypt
[ 10.963163] systemd[1]: dnscrypt-proxy2.socket: Found ordering cycle on dnscrypt-proxy2.socket/start
[ 10.964298] systemd[1]: dnscrypt-proxy2.socket: Found dependency on network.target/start
[ 10.965055] systemd[1]: dnscrypt-proxy2.socket: Found dependency on NetworkManager.service/start
[ 10.965801] systemd[1]: dnscrypt-proxy2.socket: Found dependency on basic.target/start
[ 10.966562] systemd[1]: dnscrypt-proxy2.socket: Found dependency on sockets.target/start
[ 10.967313] systemd[1]: dnscrypt-proxy2.socket: Found dependency on dnscrypt-proxy2.socket/start
[ 10.968087] systemd[1]: dnscrypt-proxy2.socket: Breaking ordering cycle by deleting job network.target/start

adm (alexm-) wrote :

Sorry, I forgot to mention the OS: Mint 18.2 (= Ubuntu Xenial)

Norbert (nrbrtx) wrote :

Guys, I can't understand why you are ignoring this bug.
Today my system booted offline again:

dmesg | grep break

[ 4.634456] systemd[1]: sockets.target: Job sockets.target/start deleted to break ordering cycle starting with basic.target/start
[ 4.634893] systemd[1]: acpid.path: Job acpid.path/start deleted to break ordering cycle starting with paths.target/start
[ 4.635273] systemd[1]: NetworkManager.service: Job NetworkManager.service/start deleted to break ordering cycle starting with NetworkManager-wait-online.service/start

Is Ubuntu really an enterprise-grade OS?

Norbert (nrbrtx) wrote :

aoetools is affected via bug 1596178.

Norbert (nrbrtx) wrote :

cloud-init is described in bug 1713104.

Norbert (nrbrtx) wrote :

xe-guest-utilities is described in bug 1669755.

Norbert (nrbrtx) wrote :

cloud-initramfs-tools is described in bug 1666573.

Norbert (nrbrtx) wrote :

rpcbind is described in bug 1580523.

Norbert (nrbrtx) wrote :

ipsec-tools is described in bug 1574833.

Norbert (nrbrtx) wrote :

shorewall is described in bug 1511869.

Norbert (nrbrtx) wrote :

open-iscsi is described in bug 1453331.

Norbert (nrbrtx) wrote :

On next boot my system is online, but nbd-client failed:

$ cat dmesg_net.txt | grep break
[ 4.676982] systemd[1]: dbus.service: Job dbus.service/start deleted to break ordering cycle starting with NetworkManager.service/start
[ 4.677730] systemd[1]: sockets.target: Job sockets.target/start deleted to break ordering cycle starting with NetworkManager.service/start
[ 4.678408] systemd[1]: nbd-client.service: Job nbd-client.service/start deleted to break ordering cycle starting with NetworkManager.service/start

Steve Langasek (vorlon) on 2017-12-10
no longer affects: shorewall (Ubuntu)
no longer affects: shorewall (Ubuntu Xenial)
no longer affects: ipsec-tools (Ubuntu Xenial)
no longer affects: ipsec-tools (Ubuntu)
no longer affects: shorewall6 (Ubuntu Xenial)
no longer affects: shorewall6 (Ubuntu)
no longer affects: cloud-initramfs-tools (Ubuntu)
no longer affects: cloud-initramfs-tools (Ubuntu Xenial)
no longer affects: open-iscsi (Ubuntu Xenial)
no longer affects: open-iscsi (Ubuntu)
no longer affects: cloud-init (Ubuntu)
Steve Langasek (vorlon) on 2017-12-10
no longer affects: cloud-init (Ubuntu Xenial)
no longer affects: xe-guest-utilities (Ubuntu Xenial)
no longer affects: xe-guest-utilities (Ubuntu)
no longer affects: aoetools (Ubuntu Xenial)
no longer affects: aoetools (Ubuntu)
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in avahi (Ubuntu Xenial):
status: New → Confirmed
Changed in network-manager (Ubuntu Xenial):
status: New → Confirmed
Changed in nfs-utils (Ubuntu Xenial):
status: New → Confirmed
Changed in rpcbind (Ubuntu Xenial):
status: New → Confirmed
Changed in rpcbind (Ubuntu):
status: New → Confirmed
Changed in systemd (Ubuntu Xenial):
status: New → Confirmed
Changed in util-linux (Ubuntu Xenial):
status: New → Confirmed
Robie Basak (racb) wrote :

Note that the workaround here is to edit the file in /etc/init.d/ manually and remove the circular dependency. If somebody familiar could detail the exact edit here for the benefit of others, that would be appreciated.

Local changes to /etc will be preserved on package upgrades.
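One edit of this kind, reported in an earlier comment, was to remove `$network` from the `Required-Start` line of the affected init script (e.g. /etc/init.d/nbd-client). The sketch below shows that text change on a string rather than on the real file; `drop_network_requirement` is a hypothetical helper name, and, as cautioned earlier in this thread, the service may then start before the network is up, so this is an illustration of the workaround and not a recommended fix.

```python
import re


def drop_network_requirement(script_text):
    """Return script_text with `$network` removed from Required-Start lines."""
    out = []
    for line in script_text.splitlines(keepends=True):
        if re.match(r"#\s*Required-Start:", line):
            # Remove the facility together with the whitespace in front of it.
            line = re.sub(r"[ \t]*\$network(?!\S)", "", line)
        out.append(line)
    return "".join(out)


if __name__ == "__main__":
    header = "# Required-Start: $network $local_fs\n# Default-Start: S\n"
    print(drop_network_requirement(header), end="")
```

Keeping a backup copy of the original script before applying such a change by hand makes it easy to revert.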

Since there is a workaround, this issue is not critical. See https://wiki.ubuntu.com/Bugs/Importance for definitions. I'm editing the subject line of the bug to avoid users getting unrealistic expectations of an imminent fix.

If somebody would like to volunteer debdiffs that we can upload to fix this, please do. nbd is already fixed in newer releases, so see https://wiki.ubuntu.com/StableReleaseUpdates#Procedure for what you need to do for stable releases.

I don't understand what exactly the problem is with the other packages that others have added tasks for. It would be helpful if someone could explain each one in the form of a normal bug report (steps to reproduce, etc.). In particular, it is not helpful to assume that the same error message means that it is the same bug, since if that turns out not to be the case, it just leads to a whole bunch of confusion in the same bug. If in doubt, please use separate bugs to track different issues initially.

summary: - CRITICAL BUG: Breaking ordering cycle by deleting job
- NetworkManager.service/start
+ Breaking ordering cycle by deleting job NetworkManager.service/start
Changed in systemd (Ubuntu Xenial):
status: Confirmed → New