ssh server doesn't start when irrelevant filesystems are not available

Bug #583542 reported by Jeffrey Baker on 2010-05-20
This bug affects 5 people
Affects: openssh (Ubuntu)
Importance: Medium
Assigned to: Unassigned
Declined for Lucid by Mathias Gug
Declined for Maverick by Mathias Gug

Bug Description

In Lucid, the SSH daemon won't start at boot unless all filesystems listed in fstab can be mounted. This is annoying to the administrator because some fstab entries are irrelevant and/or could be expected to have transient failures. When SSH doesn't start, it's impossible for the admin to do an in-band fix of these filesystems.

Examples of when filesystems might not mount:

Underlying device not attached
NFS server unavailable
iSCSI target unavailable
RAID without a quorum of member devices
Kernel package upgrade disabled certain filesystem modules

And so forth. The line "start on filesystem" should probably be edited to something a bit more robust.

Eric Hammond (esh) wrote :

This is especially important for remotely controlled servers which have no console access (e.g., Amazon EC2).

Colin Watson (cjwatson) wrote :

I don't believe mountall emits any event that would be suitable for this. The only other plausible one is local-filesystems, whose manual page notes that it may well not cover /usr so it's not suitable for use by the ssh job.

'filesystem' is documented as being appropriate for most normal services, so surely many other services have the same problem? Most notably, rc-sysinit starts on filesystem, so you'll never reach runlevel 2 if that event is never emitted. It seems to me that any change I might make in ssh would tend to make matters worse, not better.

Can't you use the nobootwait option in /etc/fstab to avoid holding up boot for filesystems that aren't needed to get up and running? This is documented in fstab(5).
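
To make the nobootwait suggestion concrete, here is a minimal sketch that scans fstab-style text and lists the mount points that would still hold up boot. The sample entries (the UUID, the NFS host, /dev/sdb1) are hypothetical, and the rule that an entry is waited for unless it carries nobootwait or noauto is a simplifying assumption, not a full description of what mountall does.

```python
# Sketch only: flag fstab entries that boot would wait for.
# Assumption: an entry is waited for unless its options include
# "nobootwait" (see fstab(5) on Ubuntu) or "noauto".

# Hypothetical fstab contents, for illustration only.
SAMPLE_FSTAB = """\
# <file system>         <mount point>  <type>  <options>            <dump> <pass>
UUID=hypothetical-root  /              ext4    errors=remount-ro    0 1
nfs.example.com:/export /mnt/data      nfs     defaults,nobootwait  0 0
/dev/sdb1               /mnt/scratch   ext4    defaults             0 2
"""


def parse_fstab(text):
    """Return one dict per fstab entry, skipping comments and blank lines."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 4:
            continue  # malformed line; ignore it in this sketch
        entries.append({
            "device": fields[0],
            "mount_point": fields[1],
            "fstype": fields[2],
            "options": fields[3].split(","),
        })
    return entries


def blocks_boot(entry):
    """True if the entry has neither nobootwait nor noauto."""
    return not ({"nobootwait", "noauto"} & set(entry["options"]))


def blocking_mount_points(text):
    """Mount points that boot would wait for, in file order."""
    return [e["mount_point"] for e in parse_fstab(text) if blocks_boot(e)]


if __name__ == "__main__":
    for mount_point in blocking_mount_points(SAMPLE_FSTAB):
        print(mount_point)
```

In the sample, the NFS entry carries nobootwait and so is not reported, while the root filesystem and /mnt/scratch are. Adding nobootwait to the /mnt/scratch line would be the workaround for a disk that may not be attached.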

Scott Moser (smoser) wrote :

> 'filesystem' is documented as being appropriate for most normal
> services, so surely many other services have the same problem? Most
> notably, rc-sysinit starts on filesystem, so you'll never reach runlevel
> 2 if that event is never emitted. It seems to me that any change I
> might make in ssh would tend to make matters worse, not better.

I agree that this is likely to affect other services or jobs also. I'm
not aware of any event that would be better.

That said, this is a real issue. 'nobootwait' may be a suitable
workaround for lucid, but there needs to be some way of starting services
that is reliable. All sorts of things could result in an /etc/fstab that
wasn't perfect (failed disk, '/dev/sdXX' entry rather than UUID= and
changed kernel, ...). Having ssh not start means a physical touch to the
machine or an out-of-band interface has to be used to service it. In
EC2/UEC, there *is* no out-of-band interface or physical touch.

Jeffrey Baker (jwbaker) wrote :

This may be out of scope for a bug report, but why not change the way an upstart job describes its start conditions? ssh, for example, could supply a script which checks if /usr is mounted. The script(s) can be run after every upstart job completes, and when all conditions are met the new jobs are started.
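
A rough sketch of that idea, purely as an illustration: this is not how upstart works, and every name in it (ConditionScheduler, usr_is_mounted, the set of mounted paths passed in as state) is hypothetical. Each job registers condition functions, and all pending jobs are re-checked whenever some other job completes.

```python
# Illustration of the proposal only; not upstart's actual design.
# Assumption: "state" is a set of currently mounted paths that the
# caller supplies (a real check might read /proc/mounts instead).


def usr_is_mounted(state):
    """Condition an ssh job might supply: /usr must be available."""
    return "/usr" in state


class ConditionScheduler:
    def __init__(self):
        self.pending = {}   # job name -> list of condition callables
        self.started = []   # job names, in the order they were started

    def register(self, job, conditions):
        """Queue a job until all of its conditions hold."""
        self.pending[job] = list(conditions)

    def job_completed(self, state):
        """Re-check pending jobs; start those whose conditions all hold."""
        ready = [job for job, conditions in self.pending.items()
                 if all(check(state) for check in conditions)]
        for job in ready:
            del self.pending[job]
            self.started.append(job)
        return ready


if __name__ == "__main__":
    scheduler = ConditionScheduler()
    scheduler.register("ssh", [usr_is_mounted])
    print(scheduler.job_completed({"/"}))          # /usr missing: nothing starts
    print(scheduler.job_completed({"/", "/usr"}))  # /usr present: ssh starts
```

With this shape, an unmountable NFS or iSCSI entry would never appear in any ssh condition, so it could not keep ssh from starting.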

In the meantime, I'll check out the nobootwait workaround.

Scott Moser (smoser) wrote :

Hm... now that I'm reading the man page you directed me at, the
nobootwait and optional flags do seem to solve this issue.

At the very least, though, there is an educational problem here. I was unaware of these options, as I'm sure several sysadmins or users are.

Scott Moser (smoser) on 2010-05-27
Changed in openssh (Ubuntu):
status: New → Triaged
importance: Undecided → Medium
Tokuko (launchpad-net-tokuko) wrote :

This issue has hit me multiple times now. I usually work on Solaris, HP-UX and AIX. All of these simply issue a big loud warning on the console but try to continue booting, which I guess is what most administrators (at least I) expect.
The last entry was 3 years ago. Has any decision been reached?
