NFSd does not start due to systemd ordering cycle with setserial

Bug #1452644 reported by kex on 2015-05-07
14
This bug affects 2 people
Affects Status Importance Assigned to Milestone
setserial (Debian)
Fix Released
Unknown
setserial (Ubuntu)
High
Martin Pitt
Vivid
High
Unassigned
Wily
High
Martin Pitt

Bug Description

NFSd does not start due to systemd ordering cycle.

Manually starting with "systemctl start nfs-kernel-server" after boot works fine.

here systemd lines from dmesg:

[ 43.569341] systemd[1]: Cannot add dependency job for unit gssproxy.service, ignoring: Unit gssproxy.service failed to load: No such file or directory.
[ 43.595080] systemd[1]: Found ordering cycle on nfs-server.service/start
[ 43.620748] systemd[1]: Found dependency on network.target/start
[ 43.646476] systemd[1]: Found dependency on NetworkManager.service/start
[ 43.672292] systemd[1]: Found dependency on basic.target/start
[ 43.697952] systemd[1]: Found dependency on sockets.target/start
[ 43.723381] systemd[1]: Found dependency on cups.socket/start
[ 43.748530] systemd[1]: Found dependency on sysinit.target/start
[ 43.773533] systemd[1]: Found dependency on setserial.service/start
[ 43.798185] systemd[1]: Found dependency on remote-fs.target/start
[ 43.822739] systemd[1]: Found dependency on remote-fs-pre.target/start
[ 43.871768] systemd[1]: Found dependency on nfs-server.service/start
[ 43.896442] systemd[1]: Breaking ordering cycle by deleting job network.target/start
[ 43.921641] systemd[1]: Job network.target/start deleted to break ordering cycle starting with nfs-server.service/start

SRU test case
=============
 - Install network-manager, nfs-kernel-server, and setserial into a clean vivid VM.
 - Reboot, and you'll find the above ordering cycle (check "sudo journalctl -p warning"). "systemctl status nfs-server.service" will be not running (it could also be that the cycle gets broken on a different point, but that's the most likely one).
 - With the fixed setserial package (upgrade, then reboot) there should be no cycle and all services start up normally.
 - Double-check "systemctl status setserial" that the unit started and output from the init.d script looks reasonable.

Regression potential: The new .service has less strict startup dependencies/ordering than the autogenerated LSB one, so I don't see a potential problem with introducing new cycles. The new unit could potentially fail to start if there's a bug in unit installation (this is covered by the test case). So all in all, very low.

ProblemType: Bug
DistroRelease: Ubuntu 15.04
Package: nfs-kernel-server 1:1.2.8-9ubuntu8.1
Uname: Linux 4.1.0-040100rc1-generic x86_64
ApportVersion: 2.17.2-0ubuntu1
Architecture: amd64
CurrentDesktop: Unity
Date: Thu May 7 12:26:09 2015
InstallationDate: Installed on 2012-12-22 (865 days ago)
InstallationMedia: Ubuntu 12.10 "Quantal Quetzal" - Release amd64 (20121017.5)
SourcePackage: nfs-utils
UpgradeStatus: Upgraded to vivid on 2015-04-23 (13 days ago)

kex (kex-iki) wrote :
kex (kex-iki) wrote :

I wonder if some update arrived between the bug reporting...

After reboot:
[ 43.777385] systemd[1]: Found dependency on nfs-server.service/start
[ 43.803123] systemd[1]: Found dependency on network.target/start
[ 43.828678] systemd[1]: Breaking ordering cycle by deleting job NetworkManager.service/start

the rpcinfo shows it started OK.

kex (kex-iki) wrote :

There is something weird going on, since every boot behaves a bit differently:

[ 44.264445] systemd[1]: Found ordering cycle on nfs-server.service/start
[ 44.264447] systemd[1]: Found dependency on nfs-mountd.service/start
[ 44.264448] systemd[1]: Found dependency on network.target/start
[ 44.264449] systemd[1]: Found dependency on NetworkManager.service/start
[ 44.264450] systemd[1]: Found dependency on basic.target/start
[ 44.264451] systemd[1]: Found dependency on sockets.target/start
[ 44.264452] systemd[1]: Found dependency on acpid.socket/start
[ 44.264453] systemd[1]: Found dependency on sysinit.target/start
[ 44.264455] systemd[1]: Found dependency on setserial.service/start
[ 44.264457] systemd[1]: Found dependency on remote-fs.target/start
[ 44.264460] systemd[1]: Found dependency on remote-fs-pre.target/start
[ 44.264461] systemd[1]: Found dependency on nfs-server.service/start
[ 44.264462] systemd[1]: Breaking ordering cycle by deleting job nfs-mountd.service/start
[ 44.264463] systemd[1]: Job nfs-mountd.service/start deleted to break ordering cycle starting with nfs-server.service/start

This time NFS is not working.

rpcinfo
   program version netid address service owner
    100000 4 tcp6 ::.0.111 portmapper superuser
    100000 3 tcp6 ::.0.111 portmapper superuser
    100000 4 udp6 ::.0.111 portmapper superuser
    100000 3 udp6 ::.0.111 portmapper superuser
    100000 4 tcp 0.0.0.0.0.111 portmapper superuser
    100000 3 tcp 0.0.0.0.0.111 portmapper superuser
    100000 2 tcp 0.0.0.0.0.111 portmapper superuser
    100000 4 udp 0.0.0.0.0.111 portmapper superuser
    100000 3 udp 0.0.0.0.0.111 portmapper superuser
    100000 2 udp 0.0.0.0.0.111 portmapper superuser
    100000 4 local /run/rpcbind.sock portmapper superuser
    100000 3 local /run/rpcbind.sock portmapper superuser

But sudo systemctl start nfs-kernel-server
starts nfs server beautifully

Steve Langasek (vorlon) on 2015-05-07
tags: added: systemd-boot
Steve Langasek (vorlon) wrote :

The nfs-server unit has a wrong dependency on 'network.target'. The only services that should depend on this are those which implement the network management service itself.

Changed in nfs-utils (Ubuntu):
importance: Undecided → High
status: New → Triaged
Steve Langasek (vorlon) wrote :

I assumed that this would be fixed by removing the dependency on network.target (which is certainly wrong), but I'm not sure that it does, because nfs-server is Before=remote-fs-pre.target.

More investigation is needed.

Changed in nfs-utils (Ubuntu Vivid):
status: New → Triaged
importance: Undecided → High

I am getting the same problem here, but using the open-iscsi initiator (which - of course - depends on the network being initialized):

Found ordering cycle on sysinit.target/start
[ 3.657825] systemd[1]: Found dependency on open-iscsi.service/start
[ 3.657827] systemd[1]: Found dependency on network-online.target/start
[ 3.657828] systemd[1]: Found dependency on NetworkManager-wait-online.service/start
[ 3.657830] systemd[1]: Found dependency on basic.target/start
[ 3.657831] systemd[1]: Found dependency on sockets.target/start
[ 3.657833] systemd[1]: Found dependency on acpid.socket/start
[ 3.657834] systemd[1]: Found dependency on sysinit.target/start
[ 3.657836] systemd[1]: Breaking ordering cycle by deleting job open-iscsi.service/start
[ 3.657837] systemd[1]: Job open-iscsi.service/start deleted to break ordering cycle starting with sysinit.target/start

So, it is quite a systematic problem with the System-V init scripts and their dependency setup in systemd.

Martin Pitt (pitti) wrote :

I can confirm this, by installing network-manager, nfs-kernel-server, and setserial into a clean VM. setserial's init.d script is very demanding as it both wants to start in run level "S" (sysinit.target in systemd's mapping) *and* requires $remote_fs. But we can't start remote file systems that early if we need NetworkManager for them, this only works with very simple networking like ifupdown.

I suppose /etc/init.d/setserial wants $remote_fs as it reads stuff from /var/lib/setserial/ which might potentially be on NFS. Having setserial on an NFS /var with NetworkManager is certainly a very special case, and I suppose we could just ignore it. But I think we can relax the early startup instead.

As far as I can see, the most appropriate course here is to provide a systemd unit for setserial with more fine-grained ordering, or just a drop-in to relax the overly harsh "Default-Start: S" (as it really just Before=system-getty.slice). I don't want to completely replace the init.d script as it has lots of code (which really doesn't belong there, but I leave that to the Debian maintainer, and this is too intrusive for an SRU).

affects: nfs-utils (Ubuntu Wily) → setserial (Ubuntu Wily)
Changed in setserial (Ubuntu Wily):
assignee: nobody → Martin Pitt (pitti)
status: Triaged → In Progress
summary: - vivid NFSd does not start due to systemd ordering cycle
+ NFSd does not start due to systemd ordering cycle with setserial
description: updated
Martin Pitt (pitti) wrote :

Vivid and Wily fixes uploaded.

Changed in setserial (Ubuntu Wily):
status: In Progress → Fix Committed
Changed in setserial (Ubuntu Vivid):
status: Triaged → In Progress
description: updated
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package setserial - 2.17-48ubuntu1

---------------
setserial (2.17-48ubuntu1) wily; urgency=medium

  * Add debian/setserial.service systemd unit, mostly to relax the very
    demanding Before=sysinit.target plus After=remote-fs.target combination
    resulting from the init.d script's $remote_fs dependency. This causes an
    insatisfiable dependency loop when using NFS and NetworkManager.
    (LP: #1452644)
  * Add dh-systemd build dependency and call dh_systemd_enable in
    debian/rules.

 -- Martin Pitt <email address hidden> Wed, 27 May 2015 11:57:47 +0200

Changed in setserial (Ubuntu Wily):
status: Fix Committed → Fix Released
Changed in setserial (Debian):
status: Unknown → New
Bob Dooby (squid12345678) wrote :

Is this going to be fixed in Vivid? It's randomly crippling my server on reboots. Can I install the Wily version on 15.04?

Martin Pitt (pitti) wrote :

Oh, my vivid SRU got rejected: https://launchpad.net/ubuntu/vivid/+queue?queue_state=4&queue_text=setserial

I don't know why, I'll just reupload it.

Hello kex, or anyone else affected,

Accepted setserial into vivid-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/setserial/2.17-48ubuntu0.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in setserial (Ubuntu Vivid):
status: In Progress → Fix Committed
tags: added: verification-needed
kex (kex-iki) wrote :
Download full text (3.6 KiB)

hello.

The fix removes setserial from the dependency loop, but there are other dependencies, so nfsd is still not starting:

kesä 10 22:29:41 xpc systemd[1]: Cannot add dependency job for unit gssproxy.service, ignoring: Unit gssproxy.service failed to load: No such file or directory.
kesä 10 22:29:41 xpc systemd[1]: Cannot add dependency job for unit fancontrol.service, ignoring: Unit fancontrol.service is masked.
kesä 10 22:29:41 xpc systemd[1]: Found ordering cycle on remote-fs-pre.target/start
kesä 10 22:29:41 xpc systemd[1]: Found dependency on nfs-server.service/start
kesä 10 22:29:41 xpc systemd[1]: Found dependency on network.target/start
kesä 10 22:29:41 xpc systemd[1]: Found dependency on NetworkManager.service/verify-active
kesä 10 22:29:41 xpc systemd[1]: Found dependency on basic.target/start
kesä 10 22:29:41 xpc systemd[1]: Found dependency on sockets.target/start
kesä 10 22:29:41 xpc systemd[1]: Found dependency on cups.socket/start
kesä 10 22:29:41 xpc systemd[1]: Found dependency on sysinit.target/start
kesä 10 22:29:41 xpc systemd[1]: Found dependency on later-readahead.service/start
kesä 10 22:29:41 xpc systemd[1]: Found dependency on remote-fs.target/start
kesä 10 22:29:41 xpc systemd[1]: Found dependency on remote-fs-pre.target/start
kesä 10 22:29:41 xpc systemd[1]: Breaking ordering cycle by deleting job nfs-server.service/start
kesä 10 22:29:41 xpc systemd[1]: Job nfs-server.service/start deleted to break ordering cycle starting with remote-fs-pre.target/start
kesä 10 22:29:41 xpc systemd[1]: Dependency failed for pNFS block layout mapping daemon.
kesä 10 22:29:41 xpc systemd[1]: ureadahead.service failed.

sudo systemctl status setserial
● setserial.service - controls configuration of serial ports
   Loaded: loaded (/lib/systemd/system/setserial.service; enabled; vendor preset: enabled)
   Active: active (exited) since ke 2015-06-10 22:29:46 EEST; 31s ago
     Docs: man:setserial(8)
  Process: 862 ExecStart=/etc/init.d/setserial start (code=exited, status=0/SUCCESS)
 Main PID: 862 (code=exited, status=0/SUCCESS)
   CGroup: /system.slice/setserial.service

kesä 10 22:29:46 xpc systemd[1]: Starting controls configuration of serial.....
kesä 10 22:29:46 xpc setserial[862]: Loading the saved-state of the serial.....
kesä 10 22:29:46 xpc setserial[862]: /dev/ttyS0 at 0x03f8 (irq = 4) is a 16550A
kesä 10 22:29:46 xpc systemd[1]: Started controls configuration of serial ...s.
Hint: Some lines were ellipsized, use -l to show in full.
kex@xpc:~$ sudo systemctl status nfs-kernel-server
● nfs-server.service - NFS server and services
   Loaded: loaded (/lib/systemd/system/nfs-server.service; enabled; vendor preset: enabled)
   Active: inactive (dead)
kex@xpc:~$ rpcinfo
   program version netid address service owner
    100000 4 tcp6 ::.0.111 portmapper superuser
    100000 3 tcp6 ::.0.111 portmapper superuser
    100000 4 udp6 ::.0.111 portmapper superuser
    100000 3 udp6 ::.0.111 portmapper superuser
    100000 4 tcp 0.0.0.0.0.111 portmapper superuser
    100000 3 tcp 0.0.0...

Read more...

tags: added: verification-failed
removed: verification-needed
Martin Pitt (pitti) wrote :

I wouldn't call this v-failed. The setserial fix might not be sufficient for your particular system, but it's necessary. As that cycle is gone for you, I mark this as v-done.

etc/init.d/later-readahead comes from the "readahead-fedora" package. I suggest to not install that -- it's not supported, we already have "readahead", and it causes that trouble for you. Please open a bug about that, but I'd rather just remove it from wily entirely.

tags: added: verification-needed
removed: verification-failed
tags: added: verification-done
removed: verification-needed
kex (kex-iki) wrote :

I can confirm that the proposed package fixes the bug for me.

I did not have readahead-fedora package, but ureadahead. However, I had fedora-readahead with 14.04. It is possible that some files were not removed during upgrade to 15.04. apt-get remove ureadahead; apt-get purge readahead-fedora fixed this for me (in addition to using the proposed setserial). Now nfsd starts.

Changed in setserial (Debian):
status: New → Incomplete
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package setserial - 2.17-48ubuntu0.1

---------------
setserial (2.17-48ubuntu0.1) vivid-proposed; urgency=medium

  * Add debian/setserial.service systemd unit, mostly to relax the very
    demanding Before=sysinit.target plus After=remote-fs.target combination
    resulting from the init.d script's $remote_fs dependency. This causes an
    insatisfiable dependency loop when using NFS and NetworkManager.
    (LP: #1452644)
  * Add dh-systemd build dependency and call dh_systemd_enable in
    debian/rules.

 -- Martin Pitt <email address hidden> Wed, 27 May 2015 11:57:47 +0200

Changed in setserial (Ubuntu Vivid):
status: Fix Committed → Fix Released

The verification of the Stable Release Update for setserial has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Changed in setserial (Debian):
status: Incomplete → New
Changed in setserial (Debian):
status: New → Fix Released
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.