Boot hangs and unable to continue when automount disk in fstab is not available (Off or Disconnected)

Bug #571444 reported by Michael.S.G
This bug affects 75 people
Affects: mountall (Ubuntu)
Status: Fix Released
Importance: Medium
Assigned to: Unassigned
Nominated for Lucid by Erik Reuter

Bug Description

On 10.04 Lucid server installations, if a filesystem that is specified in /etc/fstab cannot be mounted for some reason (USB disk not connected, partition deleted, samba share not available etc.), the boot process hangs, displaying
"mountall: Filesystem could not be mounted: ..."
Screenshot: http://launchpadlibrarian.net/48612932/Screenshot%20mountall%20hang.png
The user is not informed what to do at this point.

== WORKAROUND ==
Press "s" to skip the missing filesystem and continue booting.
   - This workaround only works for the first missing drive. If more drives are missing, press Alt-SysRq-i and edit your fstab to remove them.

== ORIGINAL DESCRIPTION ==
I have attached an external drive to mount on boot up of 10.04 server (Lucid).
Its label is "WAD" and it is located and mounted via the following fstab entry:

LABEL=WAD /mnt/WAD auto relatime,errors=remount-ro 0 2

It's pretty generic and usually mounts either an ext3 or ext4 file system.

However, if this drive is detached or powered down for some reason, the entire boot process hangs!
There is no way to continue. You can switch terminals, but cannot drop to a command line or even log in.
So you can't easily fix the problem either. Ctrl-Alt-Del will initiate a shutdown/restart.
Booting into recovery mode exhibits the same problem.

The only way I could recover this server (as I had no live CD available) was to hit Alt-SysRq-i (kill all tasks).
The processes it killed were as follows (this may give a hint as to where it is hanging):
udev
portmap
statd
mountall
plymouth
Then it stated: General error mounting filesystems.
And it dropped into a terminal (thank God), so I could fix fstab (remove the external drive entry).

I feel this is a pretty serious issue. It almost forced me to reinstall the server.
(Only chance made me press Alt-SysRq-ScrollLock to give me Magic Key Help)

affects: ubuntu → mountall (Ubuntu)
Revision history for this message
Scott James Remnant (Canonical) (canonical-scott) wrote :

Thanks for the report,

In order to debug further I'll need you to collect a bit more information for me. First can you run "apport-collect 571444" on your system, this will attach various files from your system to the bug which I need to analyse.

This is a server install, right? Could you run "ls -l /lib/plymouth/themes" for me and provide the output?

What happens if you push the "S" key at the point of the hang?

Changed in mountall (Ubuntu):
status: New → Incomplete
importance: Undecided → Medium
Michael.S.G (ozfalcon) wrote :

Ok. I'll do that asap. (~12-24h)

Michael.S.G (ozfalcon) wrote :

I have been delayed in getting the information (10.04 desktop reinstall/setup eating my time)(Plus Wife factor of x3~6 delays)

However I will add to this bug's info.

This bug ALSO affects ubuntu 10.04 desktop edition.

I reproduced this bug by accident on my Desktop install.

I was experimenting with moving the Firefox cache to a RAM disk.
I put the line in fstab, but FORGOT to make the directories in the file system.
On reboot, the Desktop hung (same as the server with the external drive).
I used the magic key to boot and fixed the problem (made the /var/firefox/Cache folder).

This was the entry I had in fstab

 none /var/firefox/Cache tmpfs defaults,noatime 0 0

or

 tmpfs /var/firefox/Cache tmpfs defaults,noatime 0 0

Sebastian Krysmanski (manski) wrote :

Same here (using Ubuntu 10.04 server). One of my hard drives is formatted with JFS but I forgot to install jfsutils. Now when I try to boot it says:

fsck: fsck.jfs: not found
fsck: Error 2 while executing fsck.jfs for /dev/sdb1
linuxroot: clean, ...
mount: wrong fs type, bad option, bad superblock on /dev/sdb1
       missing codepage or helper program, ...

mountall: mount /mnt/backup [650] terminated with status 32
mountall: Filesystem could not be mounted: /mnt/backup

And there the boot process is stuck. No shell, no login prompt, just nothing. Happens with "normal" boot as well as "recovery mode".

After installing "jfsutils" by using a live cd the problem has disappeared.

tyldis (tyldis) wrote :

Hit similar issue:
Removed a drive from the system but forgot to remove the line in fstab referencing it. Hence no 'filesystem' event was emitted. System hangs.

Jakob Unterwurzacher (jakobunt) wrote : apport information

Architecture: i386
DistroRelease: Ubuntu 10.04
InstallationMedia: Ubuntu-Server 10.04 LTS "Lucid Lynx" - Release i386 (20100427)
Package: mountall 2.15
PackageArchitecture: i386
ProcEnviron:
 PATH=(custom, no user)
 LANG=de_DE.UTF-8
 SHELL=/bin/bash
Tags: lucid
Uname: Linux 2.6.34-020634rc7-generic i686
UserGroups:

tags: added: apport-collected
Jakob Unterwurzacher (jakobunt) wrote : Dependencies.txt

apport information

Jakob Unterwurzacher (jakobunt) wrote :

@Scott:
1) I apport-collect'ed it.
2) Pressing "s" continues the boot process :)
3) # ls -l /lib/plymouth/themes
   total 8
   drwxr-xr-x 2 root root 4096 May 16 15:12 details
   lrwxrwxrwx 1 root root 31 May 16 15:28 text.plymouth -> /etc/alternatives/text.plymouth
   drwxr-xr-x 2 root root 4096 May 16 15:26 ubuntu-text

Additionally, the attached screenshot shows the hang.

Changed in mountall (Ubuntu):
status: Incomplete → New
Von (daaxix) wrote :

This happened to me because for some reason 10.04 does not load the wireless modules prior to attempting to connect to network shares that are in fstab. The network (samba) shares are not able to be found and then the boot just hangs. No shell, nothing, I can only reboot or use the work around, thanks OP for the workaround!

This is an extremely serious bug for any non-power user, as fstab must be edited to fix the initial problem, and it basically makes an upgraded system unusable.

description: updated
description: updated
bvdb (bvdb-bvdb) wrote :

The problem with this bug is that it interferes with a backup script I use for several customers, which senses the right drive and, when it is found, executes the correct backup ...

Moreover, pressing S only works for the first drive it does not find; when pressing S a second time, the system really hangs ...
(at least I think it does)

Changed in mountall (Ubuntu):
status: New → Confirmed
Erik Reuter (misc71) wrote :

This bug bit me also. I had a line in /etc/fstab for an iscsi volume that was not available at boot time, and the system just hangs during boot.

Workaround was to put the noauto option on the volume in /etc/fstab.

But I think this is a serious bug. I suspect most Ubuntu users would be very frustrated and confused by this behavior, and not know what to do -- only power users would be able to figure out that it is related to mounting an unavailable volume.

When searching the bugs, I found a lot of people posting about this bug in unrelated bug reports, since they did not know the cause of the problem.

The importance should be "high", not "medium".

Jim Willsher (jim-jimwillsher) wrote :

Importance should be high. In my own scenario I'm mounting to CIFS shares on a server. If that server is unavailable then the Ubuntu server will not boot at all. Pressing S seems to achieve nothing.

description: updated
Rupert H. (ruphe) wrote :

Ubuntu 10.04 (Server). Drobo connected via USB. Same problem. If I press "s" the system will continue to boot. Putting "noauto" in /etc/fstab as mentioned by Erik does the trick. Additionally, I also put a mount command into /etc/rc.local. Not perfect, but it will do me for now.

/etc/fstab
  LABEL=DROBO /media/drobo ext4 defaults,noauto 0 2
  /media/drobo/home /home ext4 defaults,noauto,bind 0 2

/etc/rc.local
  mount /media/drobo
  mount /home

timandjulz (fromubuntu-timandjulz) wrote :

This is a critical problem for me. I connect to my Ubuntu machine remotely. If the drive is not found (OK if it is not) then the machine hangs during boot. I cannot remote x-win and cannot ssh to the machine without getting physical access. Have to connect a monitor and keyboard to see what is going on.

Stefan Pappalardo (sjuk) wrote :

Hi folks, I have the same problem with mounting nfs-shares on my working machine to my htpc in the living room. There are 4 mount points for my mythtv home entertainment and backup strategy.

/etc/fstab:
obelix:/var/lib/mythtv/music /var/lib/mythtv/music nfs _netdev,auto 0 0
obelix:/var/lib/mythtv/mythexport /var/lib/mythtv/mythexport nfs _netdev,auto 0 0
obelix:/var/lib/mythtv/pictures /var/lib/mythtv/pictures nfs _netdev,auto 0 0
obelix:/var/lib/mythtv/videos /var/lib/mythtv/videos nfs _netdev,auto 0 0

When obelix (htpc) is down my working machine does not boot!

Is there any workaround?

Is there anything I can do to help?

Erik Reuter (misc71) wrote :

It seems to me that mountall must be modified to timeout after X number of seconds, where timeout means the equivalent of pressing 'S' for skipping the unmountable volume.

The man page for mountall gives the author as "Scott James Remnant <email address hidden>".

Scott appears to be an employee of Canonical:

https://launchpad.net/~scott

I notice that this bug is still unassigned. Is it possible to assign it to Scott? Someone else? It would seem that coding the timeout should be straightforward, if only the right person knows about it...
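
A minimal shell sketch of the timeout idea, applied from /etc/rc.local rather than inside mountall itself. The helper name wait_for_path and the 15-second limit are assumptions made for illustration; the label WAD and the mount point come from the original report, and the fstab entry is assumed to carry "noauto" so that mountall does not wait on it.

```shell
#!/bin/sh
# wait_for_path PATH [SECONDS]
# Returns 0 as soon as PATH exists, or 1 once SECONDS (default 10) have elapsed.
# Hypothetical helper: not part of mountall or any Ubuntu package.
wait_for_path() {
    path=$1
    remaining=${2:-10}
    while [ ! -e "$path" ]; do
        if [ "$remaining" -le 0 ]; then
            return 1
        fi
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}

# Possible use in /etc/rc.local (commented out so the sketch has no side effects):
# if wait_for_path /dev/disk/by-label/WAD 15; then
#     mount /mnt/WAD
# else
#     echo "WAD not present, continuing without it" >&2
# fi
```

This gives "skip after X seconds" behaviour for one drive without touching mountall, at the cost of mounting the drive late in the boot sequence.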

Ralph Corderoy (ralph-inputplus) wrote :

This also affected a 9.10 -> 10.04 Desktop, not Server, machine that had removed the `splash' from the kernel line some time earlier. The non-free virtualbox had placed the line

    none /proc/bus/usb usbfs devgid=127,devmode=664 0 0

at the end of /etc/fstab some months earlier. Its presence caused the 10.04 boot to hang with no prompting, and various unrelated lines of output since mountall started meant the cause wasn't clear.

Google shows this is biting quite a few people, e.g. http://ubuntugenius.wordpress.com/2010/05/07/fix-an-error-occurred-while-mounting-procbususb-bootup-error-after-upgrade-to-lucid/ I think medium is too low a priority for something that stops the system from booting unless you accidentally press `s' when prodding the keyboard for a response, e.g. a friend typed `ls<Return>' on his machine on one of the failures and was puzzled that Return made it continue that time but never again.

elineis - Pierre-Mael Cretinon (elineis.pmc) wrote :

Hi,
I confirm this pesky bug also affects sshfs entries in fstab (when the remote ssh server is unavailable, running Ubuntu server 10.04). Pressing the "s" key solves the problem, but as already said in a previous comment, recovery mode doesn't work... I won't comment on the "medium" importance, but please, someone should be assigned to this bug!

Paul (b01) wrote :

This is a serious bug which means I will recommend that my customers not upgrade to 10.04 for server environments till this is fixed. Same problem occurs when a softRAID error prevents the array from being mounted.

Unfortunately I can't develop a fix, but here are a few recommendations:

1. Timeout on failure to mount non-root filesystems (mentioned above).
2. Text during boot to indicate that pressing 's' skips the filesystem.

SJI (sji) wrote :

I've just been hit by this bug on 2.6.32-24-server #38-Ubuntu SMP x86_64 GNU/Linux

Here it's on a machine in a rack that has no video output or keyboard so all I initially observe is a machine that didn't restart from power down.

Remove the fstab entry and attempt to mount the remote drive after boot and everything is fine.

The error occurs whether the device that failed to mount is remote or local.

Matthew Smith (gizmosmith) wrote :

Grrr. This is a medium priority bug and it's still unassigned.

Joseph Brasch (jbrasch) wrote :

Pretty annoying bug and crippling to those of us that are dealing with remote servers.

Erik Reuter (misc71) wrote :

I sent a message to Scott Remnant a couple weeks ago, referencing this bug. It does not seem to have gotten any attention.

I'm not sure what Ubuntu is coming to when an important bug like this seems to be ignored by the developers.

Michael.S.G (ozfalcon) wrote : Re: [Bug 571444] Re: Boot hangs and unable to continue when automount disk in fstab is not available (Off or Disconnected)

On 12/08/10 08:23, Erik Reuter wrote:
> I sent a message to Scott Remnant a couple weeks ago, referencing this
> bug. It does not seem to have gotten any attention.
>
> I'm not sure what Ubuntu is coming to when an important bug like this
> seems to be ignored by the developers.
>
>
Yes, it's a bit concerning. I have a feeling too much effort is put into
the next releases, rather than fixing problems.

Tim Oertel (toertel-launchpad) wrote :

After being frustrated with this bug for a while, I grabbed the mountall source... it's a lot of stuff, but one thing I found was the "nobootwait" option.

Adding "nobootwait" to all my "auto" mounts allowed my system to boot up without intervention.

Although the fstab manpage is a little terse describing nobootwait, it seems like a reasonable solution. Still, it would certainly be better for everyone if Ubuntu dealt with this more automatically...
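
To find which entries might still need the option, a small audit can help. The sketch below is a hypothetical helper, not part of mountall: the function name list_blocking_mounts is made up, and it only looks for the literal options "noauto" and "nobootwait", so essential entries such as the root filesystem are listed too and have to be judged by hand.

```shell
#!/bin/sh
# list_blocking_mounts [FSTAB]
# Prints the mount point of every fstab entry whose options contain neither
# "noauto" nor "nobootwait". Defaults to /etc/fstab; the file is only read.
list_blocking_mounts() {
    awk '
        $1 ~ /^#/ || NF < 4 { next }          # skip comments, blank and short lines
        {
            opts = "," $4 ","                  # pad so every option is comma-delimited
            if (index(opts, ",noauto,") || index(opts, ",nobootwait,")) next
            print $2
        }
    ' "${1:-/etc/fstab}"
}
```

Called with no argument it reads /etc/fstab; the external or network mounts it reports are the candidates for "nobootwait".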

bvdb (bvdb-bvdb) wrote :

Thank you Tim,
this workaround seems to work for me ... (the last 24hours at least).
Please Ubuntu, or Mint, do something about this very annoying bug!!!
I'm a linux teacher, and my negativity on the latest bugs in 10.04
is unconsciously influencing my students.

scm (scm) wrote :

I commented in a bug now marked as a duplicate of this one, but I'll re-post here:

Anything with the 'user' fs option, should probably be ignored/skipped by mountall to begin with, as those filesystems are generally not wanted at boot time. I'm not sure how the current design of mountall could otherwise be more intelligent about what to skip, though there is Bug #614662 with an enhanced mountall.

tags: added: lucid
scm (scm)
tags: added: glucid
removed: lucid
Philip Muškovac (yofel)
tags: added: lucid
removed: glucid
Joel Ebel (jbebel)
tags: added: glucid
CharlesA (charlesa) wrote :

Hello,

I ran into the same thing a few months ago and automagically fixed it by mashing on the keyboard.

I did file a couple of bug reports, but I guess they got no real attention since they weren't filed against the right package.

There are https://bugs.launchpad.net/ubuntu/+source/linux/+bug/609864 and https://bugs.launchpad.net/ubuntu/+source/linux/+bug/576001

I have to build my RAID drivers after every kernel upgrade and since the array is set to be mounted in fstab, this causes a bit of pain. At least I do have physical access to the machine, but it's still a pain if I forget to comment that line out before rebooting into the new kernel.

Btw: Thanks Tim for that option, I had never heard of it before.

Gerald Villemure (gvillemure) wrote :

I wrote a small script to make booting a system a little less scary. You can find it here, https://bugs.launchpad.net/bugs/614662

Gérald

Fionn (fbe) wrote :

See also: bug #638228

uxeng (listmail) wrote :

I'm booting some test installs that use vmdks that point to raw devices. It just so happened one non-crucial drive /dev/sdc was not available and the boot process would hang. The twist is I don't see any error message or prompt for input like this bug suggests. If it wasn't for this bug report I would not have learned that I needed to press 'S' to finish booting.

Mathieu Alorent (kumy) wrote :

For the specific problem : "Press "s" to skip the missing filesystem and continue booting. - This workaround only works for the first missing drive. For more missing drives press alt-sysrq-i and edit your fstab to remove missing drives"

Please see bug #655616, I've made a patch and built a new package, could you test it please ?

bvdb (bvdb-bvdb) wrote :

# dpkg -l | grep mountall
ii mountall 2.15.3~kumy1 filesystem mounting tool

I had to comment these out again to boot.
No solution on this computer.

#LABEL=HOME_IMAGES /images auto defaults 0 0
#LABEL=BACK_SG /backsg auto defaults 0 0
#LABEL=BACK_WD /backwd auto defaults 0 0

What works for me is NOBOOTWAIT

uxeng (listmail) wrote :

Mathieu Alorent: I don't see a description on your PPA as to what you implemented. Does your patch allow you to press "S" to skip more than just the first missing drive?

From what I can see (including other bugs), not everyone is seeing the interactive prompt or errors when this hang occurs. This may be due to things like fb/video drivers loading around that time, which typically clear the screen, or to not using splash; I'm not sure. This is also a major issue on servers with remote-only access, where interactive prompts would not be desired anyway. It seems to me, at least in this state, that mountall's default functionality should actually be NOBOOTWAIT, and those who want to take advantage of this functionality could specify BOOTWAIT manually.

Mathieu Alorent (kumy) wrote :

@bvdb: my case was about NFS mounts, I'll try with your config

@dblade: yes, the patch allows me to "press "S" to skip more than just the first missing drive"; my case was on an NFS mount point.
@dblade: about the messages not being displayed, we had this problem too when we started to install our servers (via FAI). I can't remember how we corrected this... I'll check and keep you informed. Maybe it was because "plymouth-theme-ubuntu-text" wasn't installed.
@dblade: I'll try to see if I can change the default behavior to nobootwait instead of bootwait as you suggest. But will it be accepted by the Ubuntu team?

Mathieu Alorent (kumy) wrote :

@bvdb: I've tested with your mount points and I was able to skip all partitions

Here is an extract of my fstab:

inexistant.machine:/var/opt/hosting/data /var/opt/hosting/data nfs defaults 0 0
inexistant.machine:/var/opt/hosting/toto /var/opt/hosting/toto nfs defaults 0 0

LABEL=HOME_IMAGES /images auto defaults 0 0
LABEL=BACK_SG /backsg auto defaults 0 0
LABEL=BACK_WD /backwd auto defaults 0 0

# dpkg -l |grep mountall
ii mountall 2.15.3~kumy1 filesystem mounting tool

I'll do some more test with a fresh reinstall

uxeng (listmail) wrote :

Mathieu Alorent: my guess would be that there would be resistance to this change, as I think the goal is to empower the user to correct a situation without having to resort to technical recovery procedures or find out about the problem after the fact. It is good that it is configurable via fstab, but my opinion stands that the *better* default at this point in time would be NOBOOTWAIT if feasible.

Mathieu Alorent (kumy) wrote :

@bvdb: for the missing splash screen, you just need to add "splash" to your grub conf!

edit /etc/default/grub

modify line:
GRUB_CMDLINE_LINUX_DEFAULT="quiet"
to
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"

then regen grub's conf (update-grub2)

that does the trick for me :)

Also, I've tested my patched mountall, and I was able to skip the three lines you specified (/images, /backsg, /backwd), which do NOT exist on my system.

Now just need to have nobootwait the default behavior...

Mathieu Alorent (kumy) wrote :

I've uploaded a new revision, and included other patches : (it's currently building)

mountall (2.15.3~kumy2) lucid; urgency=low

  * Default behavior for missing part set to nobootwait (LP #571444).
  * mountall does not honor nobootwait flag on /var/* and /usr/* filesystems (LP #654545)
  * gfs2 mount not seen as remote (LP #582278)
  * Disk check message is grammatically incorrect (LP #572016)

mountall (2.15.3~kumy1) lucid; urgency=low

  * Press S or M does not work for more than one partition on error (LP: #655616).

uxeng (listmail) wrote :

Thanks. I will try to provide feedback on your package hopefully tonight.

I also like Erik Reuter's idea of a timeout, which I think would be a viable alternative in the event the consensus was that BOOTWAIT should be the default. This is assuming that prompt/error messages were corrected to display in all scenarios. This would also accommodate server configurations.

bvdb (bvdb-bvdb) wrote :

thanks,
all 3 absent SATA disks, labeled in fstab, were automatically skipped at boot without intervention.

$ dpkg -l | grep mountall
ii mountall 2.15.3~kumy2 filesystem mounting tool

for me the problem is completely solved.
thank you and best regards,
bvdb

Mathieu Alorent (kumy) wrote :

My morning thoughts about "Default behavior for missing part set to nobootwait": in my opinion, this should not be done! On a production system, your machine should only start if all prerequisites are satisfied. Imagine what happens if /usr is missing while all the programs that should start (apache, mysql, etc.) depend on it.

dblade: you said "This also major issue on servers with remote only access where interactive prompts would not be desired anyway". I have just tested what happens with Debian Etch if there are wrong (missing) devices in /etc/fstab: the system doesn't boot, with a problem from fsck ("the superblock could not be read......."). Same behavior on Hardy.

I think that if users really want to boot with missing parts, they should explicitly add "nobootwait" to the fstab lines concerned.

So for me:
* waiting for the user to press S/M is NOT a bug
* missing "splash" in the grub2 conf IS a bug, as it doesn't let users see why their system won't boot
* Press S or M does not work for more than one partition on error IS a bug (LP: #655616)

uxeng (listmail) wrote :

>I have just tested what happens with Debian Etch if there is wrong (missing) devices in /etc/fstab : system don't boot, problem
> with fsck ("the superblock could not be read......."). Same behavior on Hardy. I think that if users really want to boot with
> missing parts, they should explicitly add "nobootwait" to concerned fstab lines.
I don't see it as that simple. Yes root is one thing and your argument applies there, but hanging because of external or extraneous drives coupled with not seeing the error/prompt text and possibly not having the option to interact is another problem.

>* missing "splash" in grub2 conf IS a bug, that don't let users see why their system won't boot
To me, specifically requiring splash to see the error/prompt text is either a design flaw or a bug. I expect text mode (read: console) to work before splash does. I manually set my systems to boot in text mode.

Tim Oertel (toertel-launchpad) wrote :

I believe the correct policy for mountall is to ignore missing filesystems that are marked with options "owner", "user" or "noauto". Everything else should properly cause a hang (unless an admin explicitly says a filesystem isn't required via nobootwait). "noauto" should already be ignored and "user"/"owner" says it's definitely not required for booting. Anything else seems reasonable to block on.
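
The proposed policy can be written down as a small shell function. This is only an illustration of the proposal, not how mountall actually decides: the function name mount_policy and the labels skip, nowait and block are made up here, and giving "noauto", "user" and "owner" precedence over "nobootwait" is an assumption.

```shell
#!/bin/sh
# mount_policy OPTIONS
# OPTIONS is the fourth fstab field, e.g. "defaults,noauto".
# Prints one of:
#   skip    entry is not required for boot (noauto, user or owner)
#   nowait  admin marked it optional with nobootwait
#   block   anything else: boot should wait for it
mount_policy() {
    case ",$1," in
        *,noauto,*|*,user,*|*,owner,*) echo skip ;;
        *,nobootwait,*)                echo nowait ;;
        *)                             echo block ;;
    esac
}
```

For example, mount_policy "relatime,errors=remount-ro" prints block, which matches the behaviour the original reporter ran into with the WAD entry.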

uxeng (listmail) wrote :

@Tim: I agree with this policy and perhaps mountall is doing a better job now at adhering to it.

My problem is with informational messages and prompts not showing up on the console. This isn't as big a deal when you are aware of the issue; you said yourself that you went through the source code to learn of "nobootwait". Other users could search for the output text, if it shows up, and then learn what they can do about it.

Tim Oertel (toertel-launchpad) wrote :

@dblade I wasn't taking a stand on the other issues. But I agree not displaying the message is/was a bug.

Philip Muškovac (yofel) wrote :

The issue that it doesn't show any messages on the console is tracked in bug 563916 and has been fixed in maverick, lucid is still being worked on.

Paul (b01) wrote :

@Philip: That's great news! Should save some from headaches in the future. :-)

@Mathieu: Yes, I definitely wouldn't want nobootwait to be the default for my servers. A common scenario for me is a separate RAID array with essential files for services on the server.

@Tim: A policy like that sounds good to me.

Gerald Villemure (gvillemure) wrote :

I think we need to include the very real scenario where Ubuntu Server is used in a headless capacity. For example, I have a dedicated server with GoDaddy and I have absolutely ZERO access to the console. In truth, I rarely have access to the console on pretty much all the servers I admin.

If I reboot a system, and SSH is no longer available, then that server is DEAD to me. I do NOT have the luxury of walking up to console to investigate.

I would like to have an option in Ubuntu to indicate that the server is headless and that it should NEVER, I mean NEVER, pause during boot regardless of the problem, with a boot script that sends a notification email if something did go wrong at boot.

Gérald

Gerald Villemure (gvillemure) wrote :

To summarize, when booting a system, we have 2 options:
 - We use "bootwait" on all local filesystems (default)
 - We use "nobootwait" on all local filesystems.

The problem with "bootwait" is that it outright prevents the use of Ubuntu on a headless system.
The problem with "nobootwait" is that it breaks possible dependencies between server applications and the underlying filesystem. There is also the problem of being blind to what’s going on. You can tell there is an "fsck -a" running, and that’s it. If it fails you only get a message in /var/log/boot.log.

In short, assuming we don’t have a luxury of sitting in front of the server every time it boots, we need a slightly smarter "nobootwait" option.

Fix1, If the system is running a background FSCK we NEED to be able to see what’s going on, both for the guy sitting at the console and for the admin connected via SSH. For this the best solution is to run FSCK in a screen session on TTY13+
Fix2, We need to let server applications KNOW to WAIT for the right filesystem to come online before they start. In upstart you can put a line like this: "start on mounted MOUNTPOINT=/home"
Fix3, We the admins, NEED to know if a filesystem has not come online. A quick email to let us know.
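
The third fix can be approximated with a check that runs late in boot. What follows is a rough sketch and not the script referenced in this comment: the function name missing_mounts is hypothetical, the mount table is assumed to be in /proc/mounts format, and details such as octal-escaped spaces in mount points are ignored.

```shell
#!/bin/sh
# missing_mounts [FSTAB] [MOUNT_TABLE]
# Prints every fstab mount point that should be mounted at boot (no "noauto",
# not swap) but does not appear in the mount table.
missing_mounts() {
    fstab=${1:-/etc/fstab}
    table=${2:-/proc/mounts}
    awk '
        FILENAME == ARGV[1] { mounted[$2] = 1; next }   # first file: mount table
        $1 ~ /^#/ || NF < 4 { next }                    # skip comments and blanks
        $3 == "swap" { next }                           # swap has no mount point
        index("," $4 ",", ",noauto,") { next }          # never mounted at boot
        !($2 in mounted) { print $2 }
    ' "$table" "$fstab"
}

# Possible use from /etc/rc.local, assuming a working "mail" command:
# missing=$(missing_mounts)
# if [ -n "$missing" ]; then
#     echo "$missing" | mail -s "Unmounted filesystems on $(hostname)" root
# fi
```

Combined with "nobootwait" on the non-essential entries, a check like this lets a headless machine finish booting and still report which filesystems did not come online.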

All 3 fixes are in place in the small sample script available here: https://bugs.launchpad.net/bugs/609864

Hope this helps,
Gérald

Dawning (dawning) wrote :

Interesting history of discussion on this topic. I also stumbled on a small thread regarding users confused about what to do for this here: http://ubuntuforums.org/showthread.php?p=10094269#post10094269

I'm a fairly big fan of having a timeout for bootwait. Alternatively, perhaps we need another option in addition to bootwait and nobootwait... "bootwaitdelay" or something like that. I'd then suggest that at install time, there be an "advanced options" area people can view to change some defaults for things like this.

I suppose another simple 'solution' could be to add a comment to the default /etc/fstab file that informs people of 'nobootwait', as I think for most Ubuntu users that'd be enough. Granted, the more casual users would still be stuck "enjoying" the annoying nature of their experience.

Those are my two cents anyway. Gerald's comments look quite nice too.

Bill Hughes (whughes) wrote :

Has there been any progress on this issue?

I run a headless server with an external USB drive attached. I found this issue during a test power failure.

The server was shutdown properly and the power was cut to the external drive. When the power returned, the server restarted but the drive did not. Because the usb drive was missing, the server did not boot.

It would be great to mark the usb drive as non-critical so as not to prevent boot up.

CharlesA (charlesa) wrote :

I haven't heard anything.

You can add "nobootwait" to fstab to tell the OS to not wait for that drive if it fails to mount.

You can also use the script that Gerald wrote up:
https://bugs.launchpad.net/ubuntu/+source/mountall/+bug/614662

dir schneid (d-schneid)
Changed in mountall (Ubuntu):
status: Confirmed → New
Andreas Ntaflos (daff)
Changed in mountall (Ubuntu):
status: New → Confirmed
Tim Cutts (timc) wrote :

We've also been bitten by this. We run approximately 2000 servers (currently mostly Debian, but increasing numbers of Ubuntu Lucid). We're finding our servers hang if, for example, a SAN disk has failed, for whatever reason. Now, yes, we usually have some sort of remote console access, but it's a royal PITA to use, and if I have several hundred machines affected because of a mistake in SAN zoning or something, it's really impractical to start a console session on each server to press S. I'd be there for weeks. Of course in reality I'd just power cycle them all again.

There needs to be some sort of remote administration access using ssh or something similar, presumably started from the initrd, from which people can recover from this sort of situation. In the meantime I think I'll probably use kumy's mountall that proceeds anyway, which restores the Debian behaviour. At least then I get a booted system which I can log into and diagnose.

modjoa (modjoa) wrote :

I also had the same problem, but I found the solution:

sudo nano /etc/fstab
 "nobootwait" - add this option to all external partition descriptions

The "bug" is resolved! :)

Laurent Dinclaux (dreadlox) wrote :

nobootwait breaks the mount command

Steven T. Snyder (7-launchpad-steventsnyder-com) wrote :

If nobootwait doesn't work, try the nofail option.

Steve Langasek (vorlon) wrote :

In 12.04, mountall supports both 'nobootwait' and 'timeout' options to control whether mountall should wait for a given filesystem before signalling for the boot to proceed. That seems to be the issue raised in this bug, so I think this is resolved.

There were other comments about the plymouth prompts not working if more than one disk has failed. If someone is seeing such an issue with 12.04, please file a separate bug report.

Changed in mountall (Ubuntu):
status: Confirmed → Fix Released
Steve Langasek (vorlon) wrote :

actually, no separate bug report needed - that appears to be bug #655616.

ullix (ullix) wrote :

This issue has bugged me many times, but I think the real bug is not the hang during boot under those circumstances, but the fact that during boot process a message is given to the user, but the user can't see it!

Here the original problem was the hang in the boot process because the disk was not found. Had the message been shown, I could have pressed "s". This specific problem might now be overcome with new options, however, if during the boot process other issues come up, I still won't see them!

So, what needs to be addressed is that messages to the user, where the user needs to take action, are still shown, while the remainder of the boot can go silent.

hankwang (hknienhuys) wrote :

I got bitten by this in Ubuntu Server 12.04 LTS because of an fstab entry for an external USB disk (user-mountable).

The 'nobootwait' option in /etc/fstab did the job, but it is really poorly documented. Because the issue occurred during automatic fs checking, the logical place to search was 'man fsck', which mentions the 'nofail' option -- which didn't resolve the problem. Neither is it described in 'man mountall' or 'man 8 mount'; one has to check in 'man fstab', which is typically not where I look for documentation on mount options.

If there is no way to make the boot process more intelligent on a headless server, then can it at least be considered to update the various man pages that a user might check?

hankwang (hknienhuys) wrote :

I got bitten by this in Ubuntu Server 12.04 LTS because of an fstab entry for an external USB disk (user-mountable).

The 'nobootwait' option in /etc/fstab did the job, but it is really poorly documented. Because the issue occurred around the time of automatic fs checking, the logical place for me to search was 'man fsck', which mentions the 'nofail' option -- which didn't resolve the problem. Neither is it described in 'man mountall' or 'man 8 mount'; one has to check in 'man fstab', which is typically not where I look for documentation on mount options.

If there is no way to make the boot process more intelligent on a headless server, then can it at least be considered to make it easier to find the solution? E.g. mention it in the man pages for 'mountall' and 'mount (8)'?

Sandeep Gautham (lookforsandy) wrote :

Do we have a fix for this bug yet?

JEREMY B. KEITH (jkeith) wrote :

This bug has gotten me also.... I have found the undocumented workaround, but unfortunately after the damage has already been done. I have a system 5 states away, and only one person in the area that is 60 miles away. In order for me to fix this, I must ask them to drive two hours round trip just so they can go press the "s" key. This is a real killer on a headless system...
