separate /var and /var/tmp tmpfs dependency loop

Bug #431040 reported by GiuseppeVerde on 2009-09-16
This bug affects 5 people
Affects: mountall (Ubuntu); Importance: Low; Assigned to: Scott James Remnant (Canonical)
Affects: mountall (Ubuntu Karmic); Importance: Low; Assigned to: Scott James Remnant (Canonical)

Bug Description

I have an eee 901 with a 64GB SSD (sda) and the default 4GB SSD (sdb), both internal (the livecd-on-a-stick is sdc). After upgrading last night (my private mirror updated at 4pm CDT), I can no longer boot the eee, although my desktop boots fine.

root is on /dev/sda5, home is on /dev/sda6, and there are a few other partitions (var and boot).
sdb1 and 2 are encrypted partitions that I unlock manually after logging in.

When booting into rescue mode, the last messages are:
===
Begin: Running /scripts/local-bottom ...
Done.
Done.
Begin: Running /scripts/init-bottom ...
Done.
init: sreadahead main process (1175) terminated with status 1
[timestamp] scsi 2:0:0:0: Direct-Access Generic 6000 PQ: 0 ANSI: 0 CCS
sd 2:0:0:0: Attached scsi generic sg2 type 0
[scd initialization follows as it gets recognized]
sd 2:0:0:0: [sdc] Attached SCSI removable disk
_
===
(the underscore is the blinking cursor)
Upon hitting ctrl-alt-del, I get a flurry of messages. Some of them are
===
checking for unattended upgrades
===
general error mounting filesystems
===
mountall-shell main process stopped by STOP signal
mountall-shell main process continued by CONT signal
===

This last message occurs frequently until rebooted.
The console is clearly set, since the font changes to what I specified, and the keymap is modified as set up.

The same thing happens with kernel -9 as with -10. I don't have others to test.

I have been accessing the system via a livecd-on-a-stick, hoping that an update will fix things (not so far). :(

I'm highly motivated to help fix, but will lose email contact very shortly until tomorrow (CDT)

David He (dhe128) wrote :

I also have this problem on a Lenovo Thinkpad T500. All disks, partitions, and lvm logical volumes are detected, but the booting process stops before the password prompt for cryptsetup.

Gabe Gorelick (gabegorelick) wrote :

Marking this as a duplicate of bug #430496, since it has very similar symptoms. The fix for that bug should be released by the first karmic beta release. If that doesn't solve this, then unmark this as a duplicate.

If the updates yesterday fixed things, then my bug is separate. Un-linking with #430496; is not a dup.

How do I go about debugging this monkey?

Gabe Gorelick (gabegorelick) wrote :

Guess what was said here: https://bugs.launchpad.net/ubuntu/+source/mountall/+bug/430496/comments/18
was correct, namely that the fix for bug #430496 may not solve this problem for everyone since there are a lot of different parts that could potentially cause the same symptoms.

Just wish I knew what I could do to fix the problem. :(

the initrd break points may be of great help:
https://help.ubuntu.com/community/BootOptions

I tracked down the problem. It's completely unrelated to encryption. Rather, mountall.sh seems to hang if you have /var/tmp mounted (specifically, I have /var on a separate partition from / and /var/tmp is a tmpfs fs).

from /etc/fstab:
UUID=699c5b98-4d12-4a14-97d9-91d67e07cd05 / reiserfs noatime,user_xattr,acl 0 1
UUID=1c7dc52d-7abb-4462-a606-59e036824aee /var reiserfs noatime,nodev,nosuid,acl 0 2
#tmpfs /var/tmp tmpfs relatime,nodev,noexec,nosuid,mode=1777 0 3

Uncomment /var/tmp to cause the hang.

Daniel Hahler (blueyed) wrote :

Might this be the case for /tmp on tmpfs, too? (bug 432070)

Naturally, the workaround until this is fixed is to comment out the /var/tmp line in fstab. :)

summary: - [karmic] booting on eee901 with encrypted partitions broken
+ [karmic] booting broken when /var/tmp is in fstab

Not on my system, anyway. I have /tmp on tmpfs too (reduce unnecessary flash wear)

I'm including my now-working fstab.

affects: ubuntu → mountall (Ubuntu)
tags: added: karmic mount
William (william-cobradevil) wrote :

Hello

I can confirm this bug with a home partition on LVM.

It worked before alpha 6, but after the upgrade I get a hang at boot time.

When I boot with init=/bin/bash and comment out my home partition on LVM, the system works again. I have set up the rc.local file to mount my home partition, which works OK, but I don't think that should be the solution.

This is my fstab:

proc /proc proc defaults 0 0
# / was on /dev/sda2 during installation
UUID=a694176b-9aca-44be-9429-0d286a2df093 / ext4 errors=remount-ro,user_xattr 0 1
# /boot was on /dev/sda1 during installation
UUID=a14da3df-b6e6-401a-b48a-c2a9bd1417d2 /boot ext2 defaults 0 2
/dev/sda3 none swap noatime 0 1
/dev/scd0 /media/cdrom0 udf,iso9660 user,noauto,exec,utf8 0 0
/dev/fd0 /media/floppy0 auto rw,user,noauto,exec,utf8 0 0
#/dev/data/homes /home ext4 rw,exec,user,noatime 0 0

With kind regards

William

Giuseppe: could you attach your /etc/fstab for me?

Other people - unless you have /var/tmp on fstab, you do not have this bug - please file new ones for your problems

Changed in mountall (Ubuntu):
importance: Undecided → Medium
status: New → Incomplete
summary: - [karmic] booting broken when /var/tmp is in fstab
+ /var/tmp in fstab hangs boot

Erm, it's right there. "My (working) fstab"

Changed in mountall (Ubuntu):
status: Incomplete → New
Changed in mountall (Ubuntu):
status: New → Triaged
Luke (lukekuhn) wrote :

BIND MOUNTS OK-even on /var/tmp:

  /var/tmp as a bind mount doesn't seem to cause a problem. I use directories in /home, mounted with -o bind, for these things to allow use of full home directory space (unlike a separate LUKS volume) while sealing leaks of encrypted data.

Some time back I worked up the "Bootcrypt" method of using bind mounts on an encrypted home partition to close data leaks in /tmp, /var/tmp, etc. Currently /home and swap are LUKS partitions; other "sensitive" directories are subdirectories on /home, bind mounted to the filesystem.

As of September 18 I have been able to use mountall with this-even with usplash, which I rolled back and pinned when the splash packages broke. I also use a custom splash theme based on ubuntustudio, with added armed penguins warning that all data is encrypted. In initramfs-tools/scripts/top , I had to substitute an older framebuffer script or usplash would freeze on usplash_write.

Can't use fsck yet (pass number set to 0 in fstab), due to another reported bug causing mountall to refuse to deal properly with a failed fsck run.

The partitions are specified by UUID, the bind mounts by file names in /home. Here is my fstab:

# /etc/fstab: static file system information.
#
# Use 'vol_id --uuid' to print the universally unique identifier for a
# device; this may be used with UUID= as a more robust way to name devices
# that works even if disks are added and removed. See fstab(5).
#
# <file system> <mount point> <type> <options> <dump> <pass>
proc /proc proc defaults 0 0
# / was on /dev/sda1 during installation
UUID=c6ecb774-1add-408f-95b2-16d263cadec1 / ext4 relatime,errors=remount-ro 0 0#TEMP
/dev/scd0 /media/cdrom0 udf,iso9660 user,noauto,exec,utf8 0 0
#
####### CHANGES ADDED BY BOOTCRYPT V 1.1 #######
#
UUID=8213ad0a-269b-492a-8d30-94b5bac12942 /home ext3 rw,relatime,nofail 0 0#TEMP
#
/home/TMP /tmp ext3 rw,bind,relatime,nofail 0 0
/home/VAR_TMP /var/tmp ext3 rw,bind,relatime,nofail 0 0
/home/VAR_SPOOL /var/spool ext3 rw,bind,relatime,nofail 0 0
/home/VAR_MAIL /var/mail ext3 rw,bind,relatime,nofail 0 0
/home/VAR_CACHE_CUPS /var/cache/cups ext3 rw,bind,relatime,nofail 0 0
UUID=5d09cd8b-61a7-4e86-94f8-c85a406217d7 none swap swap 0 0

Here is the crypttab that goes with it:

# <target name> <source device> <key file> <options>

vgbase UUID=5b9711af-64fa-4cda-89b1-ffc637e6359c none luks,tries=1000

I've tested a /var/tmp mount, and this works for me so far:

wing-commander scott% mount | grep tmpfs
udev on /dev type tmpfs (rw,mode=0755)
none on /dev/shm type tmpfs (rw,nosuid,nodev)
none on /var/run type tmpfs (rw,nosuid,mode=0755)
none on /var/lock type tmpfs (rw,noexec,nosuid,nodev)
none on /lib/init/rw type tmpfs (rw,nosuid,mode=0755)
none on /var/tmp type tmpfs (rw)

Giuseppe: could you make sure you're up to date, and uncomment /var/tmp
from your fstab again and try again.

If this is still hanging for you, could you try the following:

 edit /etc/init/mountall.conf
 add "--debug >/dev/mountall.log" to the mountall command
 reboot

This should give you a /dev/mountall.log file - if you could attach
that, that would be appreciated.

 status incomplete

Scott
--
Scott James Remnant
<email address hidden>

Changed in mountall (Ubuntu):
status: Triaged → Incomplete

Thanks!

What's happening here is that because you have a separate /var and a tmpfs on /var/tmp, the "virtual-filesystems" event will not be emitted until /var/tmp is mounted. But /var/tmp cannot be mounted until /var is mounted. And /var cannot be mounted until udev has created the block device for it. But udev is not started until the virtual-filesystems event, and so on.

So basically you have a dependency loop.

Codewise, the solution would be to consider virtual filesystems under local filesystems as local filesystems from the point of view of the event. We've already started to grow this code for the root filesystem which is always considered local, so maybe I just need to expand on that.
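
The loop described above arises when a virtual filesystem (tmpfs) has its mount point beneath a separately mounted local filesystem. As a rough illustration only, and not how mountall itself detects this, the following shell sketch scans an fstab for that layout. The function name find_nested_virtual is hypothetical. The sketch assumes that "virtual" simply means type tmpfs, and that any non-root entry that is not tmpfs, proc, sysfs, or swap counts as local.

```shell
#!/bin/bash
# Hypothetical helper, not part of mountall.
# Prints each tmpfs mount point that lies under a separate local mount point.
find_nested_virtual() {
    awk '
        $1 ~ /^#/ || NF < 3 { next }            # skip comments and short lines
        $3 == "tmpfs" { virt[$2] = 1; next }    # assumption: virtual == tmpfs
        $2 != "/" && $2 != "none" && $3 != "proc" && $3 != "sysfs" && $3 != "swap" {
            loc[$2] = 1                         # assumption: everything else is local
        }
        END {
            for (v in virt)
                for (l in loc)
                    if (index(v, l "/") == 1)   # l is a proper path prefix of v
                        print v " is under " l
        }
    ' "$1"
}
```

Run against an fstab like the reporter's, with the /var/tmp line uncommented, this should print "/var/tmp is under /var". With the line commented out it prints nothing.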

For you, there's a workaround you can use for now:

  boot with init=/bin/bash
  mount -o rw,remount /
  mkdir /var/tmp
(note that this makes a /var/tmp on the root filesystem under /var)
  edit /etc/fstab, add "showthrough" to the options for /var/tmp
  mount -o ro,remount /
  reboot

This will cause /var/tmp to be mounted before /var is mounted, and carried over once /var is mounted as well
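
The fstab edit in the workaround above can also be sketched as a small shell function. This is only an illustration: the name add_showthrough is hypothetical, and it prints the modified fstab to stdout rather than editing /etc/fstab in place, so the result can be reviewed first. Note that awk rewrites the changed line with single spaces between fields.

```shell
#!/bin/bash
# Hypothetical helper: print an fstab with "showthrough" appended to the
# options (4th field) of one mount point. Writes to stdout only.
add_showthrough() {
    local mountpoint=$1 fstab=$2
    awk -v mp="$mountpoint" '
        # Only touch uncommented entries for mp that lack the option already.
        $1 !~ /^#/ && $2 == mp && $4 !~ /(^|,)showthrough(,|$)/ {
            $4 = $4 ",showthrough"
        }
        { print }
    ' "$fstab"
}
```

For example, add_showthrough /var/tmp /etc/fstab would turn the options relatime,nodev,noexec,nosuid,mode=1777 into relatime,nodev,noexec,nosuid,mode=1777,showthrough and leave every other line unchanged.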

summary: - /var/tmp in fstab hangs boot
+ separate /var and /var/tmp tmpfs dependency loop
Changed in mountall (Ubuntu):
status: Incomplete → Triaged
importance: Medium → Low
assignee: nobody → Scott James Remnant (scott)
milestone: none → ubuntu-9.10
tags: added: ubuntu-boot

Last I tried this (yesterday or the day before or so) it still failed to boot.

  Code might be machine dependent in some way. Fed up with the poor performance (SLOW, plus fsck issues), I borrowed a mountall script someone else wrote, partially debugged it, and modified it to put all fsck runs in one place and give partial usplash support. Attached are the replacement script (still immature code that might act up if no encrypted device is delaying things) and the mountall.conf file.

 This script sure as hell boots faster, and on both my intel Atom netbook and AMD Athlon single core machines mounts the bind mounted directories just fine.

  If the various problems in the mountall binary are not fixed by release, every user of Karmic may well have to use a script-and edit the damned thing to suit their machines in a worst case scenario.

  I've looked at the C code for the binary, and it sure as hell is long, though it's only one file. Given that its job is to call external programs (mount and fsck), I'm not sure how much of an advantage the use of a compiled binary gives.

  Of course, a well-made mountall binary that treated fsck and mount as though they were shared libraries could give a hell of a speed boost once it's properly worked out, but writing this is far beyond my C skills.

Changed in mountall (Ubuntu Karmic):
status: Triaged → In Progress
Changed in mountall (Ubuntu Karmic):
status: In Progress → Fix Committed

I've uploaded a new mountall package to the ubuntu-boot PPA:

https://launchpad.net/~ubuntu-boot/+archive/ppa

I would appreciate it if you could install this and try it out. *BEFORE* you reboot though, could you run "sudo mountall --debug > mountall.log 2>&1" and attach that to this bug - then after you reboot, let me know whether it worked or not.

Thanks

Luke (lukekuhn) wrote :

  I just tried this on the Intel Atom(the less buggy of my two machines about boot mount) and it was a huge flop. I did not use the "debug" you mentioned, having not seen it yet, but it refused to remount / rw, leaving it read-only. My /dev/mapper/vg0-home (LUKS) partition did not mount-and calling df /home showed /mnt/RAMDISK/.mozilla instead of /dev/mapper/vg0-home-even though the /mnt/RAMDISK tmpfs for Firefox isn't even in my fstab, nor mounted at boot. Needless to say, X won't open that way, so I recovered from vt2.

For now, I have thrown in the towel on the mountall binary and have been playing with scripts. I use a single mountall script (your mountall.conf file works fine and is in fact the same as what I wrote except what I commented out you removed) to mount all the stuff in fstab.

I have been playing with the attached script for days, and have gotten it to the point of having its own fsck recovery shell, from which booting can actually proceed, bypassing checks if necessary, and indicating forced checks (by mount count ONLY) through usplash_write. For some reason, I cannot get console output with echo in a script run at that stage on ANY vt! Therefore, all log messages depend on usplash being set verbose, and I use a lot of them for debugging. Without terminal output, getting progress indication of forced checks has been beyond me, so I have set the 140+GB home partition not to force checks unless marked dirty.

Interestingly, mount itself seems buggy, sometimes writing to mtab on the AMD 64(with 32 bit Ubuntu), but NOT the Intel Atom, that it has mounted the LUKS partition while in fact the root partition's mount point comes up on opening.

Sometimes mount -a will work right, sometimes not. The attached script calls mount -a twice in an effort to run the bind mounts last(I found a better way to exclude the binds on first run but it didn't help the AMD) so it does not get confused-but due to mount bugs on the AMD I had to hardcode the home partition for it. I could write code to parse this, but that version is in only one machine. The script as attached is what I am using on the Intel atom.

Maybe some of the bugs plaguing the mountall binary are actually mount bugs?
Anyway, with Beta out and the release just weeks away, I hope there is a backup plan.

  I tried your debugging routine on a second try but could not find the logging output.
I don't know where it could have been written, as / was mounting read-only and not getting remounted.

Again I recovered from console 2, this time remounting / rw and then going straight to gdm (which worked and indeed sent me to my LUKS home directory), ignoring the garbage from df /home. df /home reported the tmpfs again, but with the CORRECT percentage used for the LUKS partition shown.

Here's what turned up in mtab:

/dev/sda1 / ext4 rw,relatime,errors=remount-ro 0 0
proc /proc proc rw 0 0
binfmt_misc /proc/sys/fs/binfmt_misc binfmt_misc rw,noexec,nosuid,nodev 0 0
none /sys sysfs rw,noexec,nosuid,nodev 0 0
udev /dev tmpfs rw,mode=0755 0 0
none /dev/pts devpts rw,noexec,nosuid,gid=5,mode=0620 0 0
none /dev/shm tmpfs rw,nosuid,nodev 0 0
none /var/run tmpfs rw,nosuid,mode=0755 0 0
none /var/lock tmpfs rw,noexec,nosuid,nodev 0 0
none /lib/init/rw tmpfs rw,nosuid,mode=0755 0 0
gvfs-fuse-daemon /home/luke/.gvfs fuse.gvfs-fuse-daemon rw 0 0
/dev/fuse /home/luke/.gvfs fuse rw,nosuid,nodev,user=luke 0 0
tmpfs /mnt/RAMDISK tmpfs rw,size=10% 0 0
/mnt/RAMDISK/.mozilla /home/luke/.mozilla none rw,bind 0 0

RAMDISK now belongs there, as it is now mounted for volatile (secure) web browsing, but mtab does not show /dev/mapper/vg0-home (mounted ANYWHERE), even though it is obviously mounted as I am using it! This behavior is the exact opposite of the write-but-not-mount behavior I got on the AMD while playing with the script. Again, I suspect some sort of context-sensitive bug in mount itself. If I get such problems with simple scripts, they will make unending trouble in debugging compiled binaries.

Mount needs to be fixed, then mountall can be fixed. Meanwhile, that fsck progress support should be added to the existing mountall code so it can be debugged in parallel to the mount bugs, given the tight schedule.

Since the changelog for mount doesn't show any likely causes, the underlying trouble has probably been there all along, just waiting for faster routines to trigger them.

Here's what's in fstab:

# /etc/fstab: static file system information.
#
# Use 'vol_id --uuid' to print the universally unique identifier for a
# device; this may be used with UUID= as a more robust way to name devices
# that works even if disks are added and removed. See fstab(5).
#
# <file system> <mount point> <type> <options> <dump> <pass>
proc /proc proc defaults 0 0
# / was on /dev/sda1 during installation
UUID=c6ecb774-1add-408f-95b2-16d263cadec1 / ext4 relatime,errors=remount-ro 0 1
/dev/scd0 /media/cdrom0 udf,iso9660 user,noauto,exec,utf8 0 0
#
####### CHANGES ADDED BY BOOTCRYPT V 1.1 #######
#
UUID=8213ad0a-269b-492a-8d30-94b5bac12942 /home ext3 rw,relatime,nofail 0 2
#
/home/TMP /tmp ext3 rw,bind,relatime,nofail 0 0
/home/VAR_TMP /var/tmp ext3 rw,bind,relatime,nofail 0 0
/home/VAR_SPOOL /var/spool ext3 rw,bind,relatime,nofail 0 0
/home/VAR_MAIL /var/mail ext3 rw,bind,relatime,nofail 0 ...

  Here it is as cut and pasted text:

#!/bin/bash
# As re-edited by Luke
#
#WARNING: little error handling, some fs not supported!
#
#For Linux mounting / read-only, /proc, and /sys from the initramfs ONLY!
#
# Faster-running replacement for the binary mountall(which is still new code, slow running right now).

usplash_write "TEXT Mounting virtual filesystems..."

#mount_remotes() { :; }

#trap 'mount_remotes' USR1
# Second, mount the rest of our virtual filesystems.
# Since these probably won't be in /etc/fstab, mount them by hand.
#REMOVED LINE: /sys/fs/fuse/connections fusectl defaults
#REMOVED LINE /spu spufs gid=spu
#REMOVED LINE /sys sysfs nodev,noexec,nosuid

echo "/proc/sys/fs/binfmt_misc binfmt_misc defaults
/sys/kernel/debug debugfs defaults
/sys/kernel/security securityfs defaults
/dev tmpfs mode=0755
/dev/shm tmpfs nosuid,nodev
/dev/pts devpts noexec,nosuid,gid=tty,mode=0620
/var/run tmpfs mode=0755,nosuid
/var/lock tmpfs nodev,noexec,nosuid
/lib/init/rw tmpfs mode=0755,nosuid" |while read mntpt fs opts rest; do
    # if this virtual filesystem type is not supported, skip it.
    grep -q "$fs" /proc/filesystems || continue
    # if the filesystem is already mounted, don't mount it again
    grep -q " $mntpt " /proc/mounts && continue
    # if the directory does not exist, try to make it.
    [[ -d $mntpt ]] || mkdir -p "$mntpt" || exit 3
    mount -t $fs -o $opts none $mntpt || exit 3
done
# exit if something failed mounting virtual filesystems
ret=$?; (($ret > 1)) && exit $ret
# let the world know our virtual filesystems are mounted.
initctl emit virtual-filesystems
usplash_write "SUCCESS OK"

# by now udev should have kicked off. Wait for it to detect all our devices.
udevadm settle

#Check filesystems

usplash_write "TEXT Checking Filesystems..."

#find rootdev

IDTYPE=$(cat /proc/cmdline | tail -c +6 | head -c 4)
if test $IDTYPE = UUID; then
        CMDLINE=$(tail -c +6 /proc/cmdline | head -c +41)
else
        CMDLINE=$(tail -c +6 /proc/cmdline |head -c 9)
fi

#find mount count

COUNT=$(tune2fs -l $CMDLINE | grep "Mount count" | tail -c +27)

#find maximum number of mounts of root filesystem-WARNING:will not indicate on forced check of any other fs!

MAXCOUNT=$(tune2fs -l $CMDLINE | grep "Maximum mount count" | tail -c +28)

if test $COUNT -ge $MAXCOUNT; then

   usplash_write "TEXT fsck: maximum mount count reached, check forced..."

   usplash_write "SUCCESS WAIT"

fi

   fsck -aA > /lib/init/rw/checkall
   ret=$?
usplash_write "SUCCESS DONE"

if test $ret -ge 2 ; then
               echo "a filesystem failed fsck"
               usplash_write "FAILURE FAIL"
               usplash_write "TEXT ctl-alt-F8 for console to repair and continue"
               sleep 2
               usplash_write "QUIT"
               exec </dev/console >/dev/co...

Sorry about the issues with the previous PPA versions, as usual things worked just fine when I tested it in the various rigs I have here - of course it flatly failed when installed on normal systems because I hadn't actually tested that ;)

I've uploaded a new ~boot4 version, this one feels much better (and I'm running it on my laptop now :p)

As before, after installing the package but *before* you reboot, please run with --debug and attach the log to the bug - then after rebooting, let me know how it works out.

Thanks for all your help with testing, this is a big change and it's good to know that it's now working for 95% of people and your help getting it work for the final 5% is greatly appreciated!

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package mountall - 0.2.0

---------------
mountall (0.2.0) karmic; urgency=low

  [ Colin Watson ]
  * Always check the root filesystem if --force-fsck is used, regardless of
    passno. LP: #435707.

  [ Johan Kiviniemi ]
  * Have each fsck instance create a lock for each underlying physical device.
    If you have a single disk or RAID, all filesystem checks will happen
    sequentially in order to avoid thrashing. On more complex configurations,
    you’ll benefit from the parallel checks mountall has been doing all along.
    LP: #434974.

  [ Scott James Remnant ]
  * Flush standard output and error before spawning processes to make
    capturing logs easier (otherwise we end up repeating things still in
    the buffer), and before calling exec().
  * Turn the code upside down so that each mount knows what it's waiting
    for, and allow multiple dependencies. This makes the code much more
    readable putting the "policy" in a single function, and will make it
    much easier in future when this is done by Upstart.
  * For kernel filesystems listed in fstab, honour the order that they
    are listed in fstab. LP: #432571, #433537, #436796
  * Always create new swap partition mounts for each fstab entry, don't
    treat them as updating the same. LP: #435027.
  * Virtual filesystems under local or remote filesystems (and local under
    remote) don't delay the virtual or local events. LP: #431040.
  * Simplify event emission, this has the advantage that we can now output
    what mount points we're waiting for and what they are waiting for as
    well.
  * Fixed issue with trailing slashes. LP: #443035.
  * Only run hooks if the filesystem was not already mounted. LP: #444252.
  * Don't clean up /tmp when run without --tmptime argument.
  * Ignore loop and ram devices until ready. LP: #441454.
  * Add options to binfmt_misc filesystem, which will probably cause it to
    be mounted on boot as well.
  * Synchronously mount local and virtual filesystems, I suspect this is
    the real cause behind the XFS races as one will modprobe and the other
    will not (and fail). LP: #432620.
  * Synchronously activate swap to avoid out of memory issues when checking
    the root filesystem.
  * Enumerate existing udev devices on startup, so we don't always have to
    see udev be coldplugged.
  * Don't break on general errors for non-essential filesystems.
    LP: #441144.
  * Don't repeat attempts to mount a filesystem without having first
    succeded to mount another.
  * Still restart mountall even if the recovery shell fails.
  * Don't queue filesystem check when device is "none", or missing, or the
    filesystem is marked nodev.
  * Generate a "mount" event before mounting a filesystem, and wait for its
    effects to complete.

 -- Scott James Remnant <email address hidden> Fri, 09 Oct 2009 16:50:46 +0100

Changed in mountall (Ubuntu Karmic):
status: Fix Committed → Fix Released

  This version worked, though I still need to test with a forced fsck run. This version was fast, unlike the hangs that forced me to write the scripts.

Log file attached as per request.

I've done more testing of this with forced fsck runs (generated by the utc clock bug-reconfigure tzdata to "utc", then BIOS clock to local time to avoid this), and the shell works, though the first login prompt refuses the root password, echoing it to the console(DANGEROUS in some environments) and you must try again.

There is still one serious problem: because the shell is external to mountall, fsck re-runs and errors out again if you try to skip with control-d! This could be especially bad if someone has not manually set a root password, assuming the system still demands one.

I am about to roll back the time again and see if it is even possible to control-d out of the shell with no root password at all.

If this is not fixed, Ubuntu Karmic should prompt for a root password by default on installation AND on upgrade, otherwise end users without rescue flash drives/disks could get seriously locked out.

  With ctrl-d and without a root password, though it does little good if you cannot get past a bad fsck run like in the past. Workaround if you have no rescue flash drive would be to go to the shell (assuming you DO have a root password set), run mount -o remount,rw / , edit /etc/fstab with nano and set the fsck pass numbers to zero, then run mount -o remount,ro / and exit the shell. This will disable fsck and allow you to boot. Just be sure to change it back after fixing the problems!
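
The fstab part of that workaround (setting every fsck pass number to zero) could be sketched as below. This is an illustration under assumptions, not a tested recovery procedure: the function name disable_fsck_passes is hypothetical, it prints the result to stdout instead of editing /etc/fstab, and awk rewrites each changed line with single spaces between fields.

```shell
#!/bin/bash
# Hypothetical helper: print an fstab with the fsck pass number (6th field)
# of every uncommented entry set to 0. Comment lines are left untouched.
disable_fsck_passes() {
    awk '
        $1 !~ /^#/ && NF >= 6 { $6 = 0 }
        { print }
    ' "$1"
}
```

A line such as "UUID=... / ext4 relatime,errors=remount-ro 0 1" would come out ending in "0 0". Keeping a copy of the original fstab makes it easy to restore the pass numbers afterwards.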

Again, the fact that Ubuntu does not by default encourage use of a root password could be a real mess if an fsck bypass is not included when mountall is restarted after leaving the maintenance shell. Either fsck was just run, or it cannot be run at the moment, and either way it should not be run again.

Perhaps the shell upon closing could pass a flag variable to mountall that would cause it to skip a second fsck run?

On Sat, 2009-10-10 at 03:03 +0000, Luke wrote:

> There is still one serious problem: because the shell is external to
> mountall, fsck re-runs and errors out again if you try to skip with
> control-d! This could be especially bad if someone has not manually set
> a root password, assuming the system still demands one.
>
I don't understand - you can't skip this with ^D.

If you don't repair the filesystem, you cannot boot. That's precisely
why it's given you a shell.

Scott
--
Scott James Remnant
<email address hidden>

Luke (lukekuhn) wrote :

   Issues are as follows:

1: if someone has not set a root password but is asked for a root password for the shell, they cannot fix the filesystem and can only try to skip it with ^d. Unless the shell does not ask for a root password when none or a random one is set, this is real trouble, as mountall won't let them try to boot.

 If mountall just brings them back to the shell and again asks for the root password, they need a rescue disk/flash drive. How many converts from Windoze will be able to handle that?

2: If someone has a big /home partition with errors, they should still be able to boot and put off fsck repairs, because they may not have 15 minutes to wait on a 160GB disk being checked if they are, say, checking email before going out the door. An end user with a 1TB drive will have real trouble if they run into, say, the clock/utc fsck bug.

Daniel Hahler (blueyed) wrote :

JFI: The clock/utc bug is bug 432070.

On Sat, 2009-10-10 at 21:30 +0000, Luke wrote:

> Issues are as follows:
>
> 1: if someone has not set a root password but is asked for a root
> password for the shell, they cannot fix the filesystem and can only try
> to skip it with ^d. Unless the shell does not ask for a root password
> when none or a random one is set, this is real trouble, as mountall
> won't let them try to boot.
>
If they do not have a root password set, they are given the shell
straight away (this is the default configuration so well tested :p)

> 2: If someone has a big /home partition with errors they should still be
> able to boot and put off fsck repairs because they may not have 15
> minutes to wait on a 160GB disk being checked if they are say, checking
> email before going out the door. An end user with a 1TB drive will have
> real trouble is they run into, say, the clock/utc fsck bug.
>
fsck skipping is coming back, don't worry (that code is basically
unchanged since jaunty - just in a separate branch while we deal with
the big bugs)

Scott
--
Scott James Remnant
<email address hidden>

Changed in mountall (Ubuntu):
status: Fix Released → New

IgnorantGuru (ignorantguru) wrote :

This bug does not appear to be fixed in karmic final. Attempting to mount tmpfs to /var or /var/log in fstab results in a hung "waiting for tmpfs" at boot. This was NOT the case in karmic alpha3 and prior, so perhaps it is the use of upstart that is triggering this problem? Mounting tmpfs to /tmp does NOT cause this problem, just /var or a /var subdir.

To reproduce, add this line to fstab:
tmpfs /var tmpfs defaults,noatime,size=1000M,mode=1777 0 0

or

tmpfs /var/log tmpfs defaults,noatime,size=1000M,mode=1777 0 0

This is an important ability for systems with SSDs, so I don't think the importance should be low. mountall --version reports "mountall 1.0". I installed from kubuntu-9.10-alternate-amd64.iso
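A minimal sketch, not part of mountall, for spotting the kind of fstab entry described above: a tmpfs mounted on /var or on one of its subdirectories. The function name `tmpfs_under_var` and the sample fstab text are hypothetical; the sketch assumes the usual whitespace-separated fstab fields (device, mount point, type, options, dump, pass).

```python
def tmpfs_under_var(fstab_text):
    """Return the mount points of tmpfs entries located at /var or below it."""
    hits = []
    for line in fstab_text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        under_var = mount_point == "/var" or mount_point.startswith("/var/")
        if fs_type == "tmpfs" and under_var:
            hits.append(mount_point)
    return hits


# Illustrative input only; the /var/log line is the one quoted in this report.
sample = """\
# <device> <mount point> <type> <options> <dump> <pass>
/dev/sda5 / ext3 defaults 0 1
tmpfs /tmp tmpfs defaults,noatime,mode=1777 0 0
tmpfs /var/log tmpfs defaults,noatime,size=1000M,mode=1777 0 0
"""
print(tmpfs_under_var(sample))  # prints ['/var/log']
```

To check a real system, pass the contents of /etc/fstab instead of `sample`. The /tmp entry is deliberately not reported, since mounting tmpfs there does not trigger the hang.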

Changed in mountall (Ubuntu Karmic):
status: Fix Released → New

  Could this be triggered by the mounting of the /var/run subdirectory onto /var?

> Date: Tue, 3 Nov 2009 15:34:21 +0000
> From: <email address hidden>
> To: <email address hidden>
> Subject: [Bug 431040] Re: separate /var and /var/tmp tmpfs dependency loop
>
> This bug does not appear to be fixed in karmic final. Attempting to
> mount tmpfs to /var or /var/log in fstab results in a hung "waiting for
> tmpfs" at boot. This was NOT the case in karmic alpha3 and prior, so
> perhaps it is the use of upstart that is triggering this problem?
> Mounting tmpfs to /tmp does NOT cause this problem, just /var or a /var
> subdir.
>
> To reproduce, add this line to fstab:
> tmpfs /var tmpfs defaults,noatime,size=1000M,mode=1777 0 0
>
> or
>
> tmpfs /var/log tmpfs defaults,noatime,size=1000M,mode=1777 0 0
>
> This is an important ability for systems with SSDs, so I don't think the
> importance should be low. mountall --version reports "mountall 1.0". I
> installed from kubuntu-9.10-alternate-amd64.iso
>
>
>
> ** Changed in: mountall (Ubuntu Karmic)
> Status: Fix Released => New
>

Steve Langasek (vorlon) wrote :

The bug that was originally reported has been fixed.

Mounting /var on tmpfs is not a supported configuration for Ubuntu; /var/lib must be persistent.

Changed in mountall (Ubuntu):
status: New → Fix Released
Changed in mountall (Ubuntu Karmic):
status: New → Fix Released

IgnorantGuru (ignorantguru) wrote :

>The bug that was originally reported has been fixed.

It hasn't been fixed at all, it's just been ignored and listed as fixed, like many bugs lately, to get it off the buglist quick.

>Mounting /var on tmpfs is not a supported configuration for Ubuntu; /var/lib must be persistent.

What does that have to do with mounting /var/log and /var/tmp on tmpfs? That is what most people want to do for the sake of their SSDs, not mount all of /var, and you're not addressing it. Mounting /var on tmpfs is not really the issue.

Why is it that this worked fine in all previous versions? That is, until people started using the ability, and then suddenly it was 'broken'.

Changed in mountall (Ubuntu):
status: Fix Released → New
Changed in mountall (Ubuntu Karmic):
status: Fix Released → New

IgnorantGuru (ignorantguru) wrote :

At any rate, for those who actually want to use the capabilities of their OS and protect their SSD, I have posted a method here which works around this bug in mountall to some extent.
http://ubuntuforums.org/showpost.php?p=8232599&postcount=11

I guess in these days of Microsoft's... excuse me, CANONICAL's higher level devs being hostile to its users, this is the kind of thing we have to work with. That method mounts /var/tmp in rc.local, a little later than fstab, but still before KDE and Gnome are started. (All of the data in my /var/tmp belongs to KDE.) /var/log is more of a problem because the system logs are opened in it before rc.local is run (even in runlevel 0). The best I think can be done until this mountall bug is actually addressed (as in something actually being DONE on it and an explanation of why it suddenly broke between karmic alpha3 and final) is to have rc.local create symlinks in /var/log for some of the log folders. That won't protect the SSD from the system log writes, though.

Also, eventually I'll post a more polished version of that SSD script here... http://igurublog.wordpress.com/

Changed in mountall (Ubuntu):
status: New → Fix Released
Changed in mountall (Ubuntu Karmic):
status: New → Fix Released

Was the fix released just an hour ago? I got hit by this bug on my server systems, which oddly enough weren't affected until the kernel update came through last night and I rebooted. I don't think it's the kernels (it didn't matter which kernel I chose the one time I tried a different one), but rather a package updated between that reboot and the previous one. Both my servers hit this bug, hard. Additionally, /var being mounted has nothing to do with hitting the bug:

$mount
/dev/sda1 on / type ext3 (rw,noatime,nodiratime,acl,user_xattr,errors=remount-ro)
proc on /proc type proc (rw)
none on /sys type sysfs (rw,noexec,nosuid,nodev)
none on /sys/fs/fuse/connections type fusectl (rw)
none on /sys/kernel/debug type debugfs (rw)
none on /sys/kernel/security type securityfs (rw)
udev on /dev type tmpfs (rw,mode=0755)
none on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=0620)
none on /dev/shm type tmpfs (rw,nosuid,nodev)
tmpfs on /tmp type tmpfs (rw,noexec,nosuid,nodev,relatime,mode=1777)
tmpfs on /var/run type tmpfs (rw,noexec,nosuid,nodev,relatime,mode=1777)
none on /var/lock type tmpfs (rw,noexec,nosuid,nodev)
none on /lib/init/rw type tmpfs (rw,nosuid,mode=0755)
/dev/sda5 on /home type ext4 (rw,noexec,nosuid,nodev,noatime,nodiratime,acl,user_xattr)
/dev/sda6 on /usr/local/web type ext4 (rw,noexec,nosuid,nodev,noatime,nodiratime,acl,user_xattr)
binfmt_misc on /proc/sys/fs/binfmt_misc type binfmt_misc (rw,noexec,nosuid,nodev)

(I used to have /var/tmp mounted tmpfs like /tmp and /var/run and things)
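As a rough sketch of how a listing like the one above can be checked mechanically, the following parses `mount` output and reports whether /var is a separate mount and which tmpfs mounts sit below it. The names `parse_mounts` and `var_summary` are hypothetical, and the pattern assumes mount points without spaces.

```python
import re

# Matches lines such as: "tmpfs on /var/run type tmpfs (rw,noexec)"
MOUNT_LINE = re.compile(r"^(\S+) on (\S+) type (\S+) \((.*)\)$")


def parse_mounts(mount_output):
    """Return a list of (device, mount_point, fs_type) tuples."""
    entries = []
    for line in mount_output.splitlines():
        match = MOUNT_LINE.match(line.strip())
        if match:
            device, mount_point, fs_type, _options = match.groups()
            entries.append((device, mount_point, fs_type))
    return entries


def var_summary(mount_output):
    """Report whether /var is its own mount and list tmpfs mounts below it."""
    entries = parse_mounts(mount_output)
    var_is_separate = any(mp == "/var" for _, mp, _ in entries)
    tmpfs_below = [mp for _, mp, fs in entries
                   if fs == "tmpfs" and mp.startswith("/var/")]
    return var_is_separate, tmpfs_below


# A few lines taken from the listing above.
sample = """\
/dev/sda1 on / type ext3 (rw,noatime,nodiratime,acl,user_xattr,errors=remount-ro)
tmpfs on /tmp type tmpfs (rw,noexec,nosuid,nodev,relatime,mode=1777)
tmpfs on /var/run type tmpfs (rw,noexec,nosuid,nodev,relatime,mode=1777)
none on /var/lock type tmpfs (rw,noexec,nosuid,nodev)
"""
print(var_summary(sample))  # prints (False, ['/var/run', '/var/lock'])
```

For these lines the result shows that /var is not a separate mount, while /var/run and /var/lock are tmpfs.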

If log writes are an issue for you, you may want to send them elsewhere instead of killing them on reboot. (see the syslog-ng manpage, for instance, on how to do this; you can do it securely)

The bug that was originally reported has been fixed.

Mounting /var on tmpfs is not a supported configuration for Ubuntu; /var/lib must be persistent.

Umm, please, please *please* read my fstab. It's like nobody's looking at it at all. I have /var mounted from the SSD (reiserfs); it's persistent. apt's cache is tmpfs, /tmp is tmpfs, /var/run is tmpfs (on the server system, not my eee), and /var/tmp is tmpfs.

(The first two lines of the previous should've been quoted; they're not my words, but pasted from above)

OK, my server was last rebooted on Oct 30:
reboot system boot 2.6.31-14-generi Fri Oct 30 12:50 - 13:30 (10+01:40)

My local mirror is updated at 4am and 4pm. I'm attaching the relevant /var/log/apt/term.log files.

Attaching a list of packages that were installed in the range of Oct. 30 - Today (inclusive). Generated by
$ grep -i 'setting up' term.log | awk '{print $3 $4;}' > packages.list.txt
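Note that the awk program above prints `$3 $4` with no separator, so the package name and the parenthesised version are joined into a single token. A hedged Python equivalent that keeps them apart might look like this; it assumes dpkg's usual "Setting up NAME (VERSION) ..." lines, and the function name and sample log text are made up for illustration.

```python
import re

# dpkg prints lines such as "Setting up NAME (VERSION) ..." while configuring.
SETTING_UP = re.compile(r"^Setting up (\S+) \((\S+)\)", re.IGNORECASE)


def installed_packages(term_log_text):
    """Return (package, version) pairs for every 'Setting up' line."""
    pairs = []
    for line in term_log_text.splitlines():
        match = SETTING_UP.match(line.strip())
        if match:
            pairs.append((match.group(1), match.group(2)))
    return pairs


# Invented sample lines, not taken from the attached term.log.
sample = """\
Unpacking replacement examplepkg ...
Setting up examplepkg (1.0-1) ...
Setting up otherpkg (2.3) ...
"""
for name, version in installed_packages(sample):
    print(name, version)
```

Reading the real file is a matter of passing `open("term.log").read()` instead of `sample`.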

IgnorantGuru (ignorantguru) wrote :

> If log writes are an issue for you, you may want to send them elsewhere instead of killing them on reboot. (see the syslog-ng manpage, for instance, on how to do this; you can do it securely)

Thanks for that suggestion - I'll take a look. Generally I don't need persistent logs, and if there's a problem I can always temporarily make them persistent.

I have also submitted a new bug here which narrows the bug report to subdirs of /var, instead of /var itself (I did this because none of the other bug reports seem to cover this explicitly with tmpfs, and this bug has been marked "fix released" despite the whole problem not being addressed.)
https://bugs.launchpad.net/ubuntu/+source/mountall/+bug/479429

Goran Zec (zecg) wrote :

I am seeing the error "general error mounting filesystems" at boot, but a second or two later it mounts everything anyway and all is well. It's an SSD (Eee 901), I have an ext4 root (4GB) and ext3 home (8GB) partition, no swap.
