12.04 isn't cleanly unmounted

Bug #1103416 reported by pelm
This bug report is a duplicate of: Bug #1101666: inotify fd leak.
This bug affects 3 people
Affects: upstart (Ubuntu)
Status: Confirmed
Importance: Undecided
Assigned to: Unassigned

Bug Description

I noticed this when boot started taking too long. On every reboot the journal has to recover the root filesystem (I have the /home filesystem on a second partition). I see this in dmesg (the details vary slightly, but something like it appears at every boot):

[ 9.252087] EXT3-fs (sda1): recovery required on readonly filesystem
[ 9.252091] EXT3-fs (sda1): write access will be enabled during recovery
[ 10.928126] kjournald starting. Commit interval 5 seconds
[ 10.928181] EXT3-fs (sda1): orphan cleanup on readonly fs
[ 10.928188] ext3_orphan_cleanup: deleting unreferenced inode 338285
[ 10.928222] ext3_orphan_cleanup: deleting unreferenced inode 337758
[ 10.928229] ext3_orphan_cleanup: deleting unreferenced inode 338428
[ 10.928238] ext3_orphan_cleanup: deleting unreferenced inode 1622019
[ 10.936544] ext3_orphan_cleanup: deleting unreferenced inode 1622017
[ 10.936559] EXT3-fs (sda1): 5 orphan inodes deleted
[ 10.936562] EXT3-fs (sda1): recovery complete
[ 11.017444] EXT3-fs (sda1): mounted filesystem with ordered data mode
[ 31.210636] Adding 3903788k swap on /dev/sda2. Priority:-1 extents:1 across:3903788k
[ 32.044232] EXT3-fs (sda1): using internal journal
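
To check quickly whether a given boot needed journal recovery, the messages above can be counted with a small filter. This is only a sketch: `summarize_recovery` is a hypothetical helper name, and the patterns are taken from the log excerpt above, so other kernels or filesystems may word these messages differently.

```shell
#!/bin/sh
# Hypothetical helper: summarize journal recovery messages from kernel log
# text read on stdin. The patterns match the EXT3 messages quoted above.
summarize_recovery() {
    awk '
        /recovery required/           { required = 1 }
        /deleting unreferenced inode/ { orphans++ }
        /recovery complete/           { complete = 1 }
        END {
            printf "recovery_required=%d orphans_deleted=%d recovery_complete=%d\n",
                   required, orphans, complete
        }
    '
}

# Usage on a live system (may need root on some setups):
#   dmesg | summarize_recovery
```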

I have tried many of the workarounds for dbus and network-manager bugs, for example https://bugs.launchpad.net/ubuntu/+source/dbus/+bug/740390, without success. My system is a stock Ubuntu 12.04 with some PPA packages, but none of them are system packages. I have no idea what is going on.

Tags: bot-comment
pelm (pelle-ekh)
description: updated
Revision history for this message
Ubuntu Foundations Team Bug Bot (crichton) wrote :

Thank you for taking the time to report this bug and helping to make Ubuntu better. It seems that your bug report is not filed about a specific source package though, rather it is just filed against Ubuntu in general. It is important that bug reports be filed about source packages so that people interested in the package can find the bugs about it. You can find some hints about determining what package your bug might be about at https://wiki.ubuntu.com/Bugs/FindRightPackage. You might also ask for help in the #ubuntu-bugs irc channel on Freenode.

To change the source package that this bug is filed about visit https://bugs.launchpad.net/ubuntu/+bug/1103416/+editstatus and add the package name in the text box next to the word Package.

[This is an automated message. I apologize if it reached you inappropriately; please just reply to this message indicating so.]

tags: added: bot-comment
Revision history for this message
Stefan Tauner (stefanct) wrote :

I have been seeing this too over the last couple of reboots. My / and home are on separate LVM volumes that share the same encrypted LUKS partition. /home is actually bind-mounted from my data partition:
/dev/mapper/ssd-root / ext4 noatime,errors=remount-ro,discard 0 1
/dev/mapper/ssd-data /data ext4 noatime,discard 0 3
/data/home /home none bind 0 0

fsck from util-linux 2.20.1
fsck from util-linux 2.20.1
/dev/mapper/ssd-root: clean, …
/dev/mapper/ssd-data: Clearing orphaned inode …

Kernel is 3.2.0-36-generic #57 amd64.

I have enabled the standard precise repo, precise-updates, precise-backports, precise-security and partner (and all sections in each of them, i.e. main restricted universe multiverse), but *not* precise-proposed. I am also using the official opera and mate-desktop repositories (I am running MATE as my DE). I have also added the quantal repo to use the updated texlive packages, but this should be unrelated; everything else is pinned back.

The purpose of this paragraph is to isolate the offending package. I don't think that the packages named here are related; they are just named to derive a timeframe for when it broke. I have to admit that I saw this problem earlier (as in months ago) too, but I am quite certain this is something new, i.e. introduced in 2013.
The last updates I installed are firefox-globalmenu 18.0.1+build1-0ubuntu0.12.04.1 (automatically, last night) and desktop-file-utils 0.20-0ubuntu3 (manually, 2013-01-18). The non-security updates between the two just mentioned have not been installed yet, but I will do so now. The problem was certainly introduced by an update *before* them (or by a security update).

There was an NM update on the 16th: network-manager:amd64 0.9.4.0-0ubuntu4.1 -> 0.9.4.0-0ubuntu4.2. At the same time I upgraded xserver, activity-log-manager-common (whatever that is :) and a few others.
On the 13th I upgraded mountall:amd64 2.36 -> 2.36.3, grub2-common:amd64 1.99-21ubuntu3.4 -> 1.99-21ubuntu3.7 (et al.) and a few others. I would guess the culprit is one of the above.

Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in ubuntu:
status: New → Confirmed
Revision history for this message
pelm (pelle-ekh) wrote :

@Stefan Tauner

Yes, this may be related, I don't know. There was a problem with "clean unmounting" in network-manager some time ago, so the last network-manager update could be the cause. It is rather disturbing in any case, and could lead to data corruption over time.

Revision history for this message
pelm (pelle-ekh) wrote :
Revision history for this message
Stefan Tauner (stefanct) wrote :

I have added some debug prints to umountfs and umountroot (in /etc/init.d/).
Namely:
 mount > /umountrootfs-mount.log
 lsof > /umountrootfs-lsof.log
 ps auxf > /umountrootfs-ps.log
and similarly in umountfs before they run their umount sequences.
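
The debug prints described above could be wrapped in one function and called from both scripts. This is a minimal sketch, not the code actually added to the init scripts: `snapshot_state`, its arguments and the file-name scheme are assumptions chosen to mirror the log names above.

```shell
#!/bin/sh
# Hypothetical helper: dump the mount table, open files and process tree
# to log files so the state just before unmounting survives the reboot.
# Arguments: $1 = directory for the logs, $2 = tag used in the file names.
snapshot_state() {
    dir=$1
    tag=$2
    mount > "$dir/$tag-mount.log" 2>&1 || true
    if command -v lsof >/dev/null 2>&1; then
        lsof > "$dir/$tag-lsof.log" 2>&1 || true
    else
        echo "lsof not installed" > "$dir/$tag-lsof.log"
    fi
    ps auxf > "$dir/$tag-ps.log" 2>&1 || true
}

# Example call, placed before the umount sequence of a script:
#   snapshot_state / umountrootfs
```

The `|| true` guards are there so that a failing diagnostic command cannot abort the shutdown script itself.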

umountrootfs-mount.log shows that one of my bind mounts on the /data filesystem is not unmounted cleanly by umountfs:
/data/home on /home type none (ro)
/data, on the other hand, is *not* listed anymore(!)

The original state of /data/home is this: /data/home on /home type none (rw,bind)
as is to be expected with the fstab line I posted in comment #2.
Other bind-mounted directories are unmounted correctly, though, even though they use the same options in fstab, e.g.:
/data/tftpboot /tftpboot none bind
which shows up in the mount log (before umounting in umountfs) as
/data/tftpboot on /tftpboot type none (rw,bind)

Neither the ps nor the lsof log indicate any obvious problems.
Because of this I have to assume that the initscripts are missing something.
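
Leftover bind mounts like the one above can be pulled out of a saved mount log with a small filter. This is a sketch under assumptions: `leftover_bind_mounts` is a hypothetical name, it relies on bind mounts appearing as "type none" as in the log lines quoted above, and it does not handle mount points containing spaces.

```shell
#!/bin/sh
# Hypothetical helper: read `mount` output on stdin and print the mount
# point of every "type none" entry (how the bind mounts show up above).
leftover_bind_mounts() {
    awk '$2 == "on" && $4 == "type" && $5 == "none" { print $3 }'
}

# Usage against the log written before umountroot runs:
#   leftover_bind_mounts < /umountrootfs-mount.log
```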

affects: ubuntu → sysvinit (Ubuntu)
Revision history for this message
Stefan Tauner (stefanct) wrote :

I have been digging further.
REG_MTPTS contains the following values in my setup: /tftpboot/loops/sysrescuecd /tftpboot /data/backup/home-recover /var/cache /home /boot /data
I have redirected the output of the umount call (initiated via fstab-decode) to a log file:
 fstab-decode umount -f -v -r -d $REG_MTPTS >> /umountfs.log 2>&1
the output is:
can't delete device /dev/loop0: No such device or address
umount2: Device or resource busy
umount: /data/home busy - remounted read-only
/dev/loop0 has been unmounted
/data/tftpboot has been unmounted
/data/backup/home/hourly.0 has been unmounted
/data/cache has been unmounted
/dev/sda1 has been unmounted
/dev/mapper/ssd-data has been unmounted

The first line is apparently a problem with umount trying to delete the loop device (umount's -d option), but I don't think that is related.
I am not sure about the umount2 line. It is related either to the loop device or to /data/home, which is clearly not unmounted cleanly but remounted read-only instead (due to umount's -r option).

So... why is it busy? I have no idea.
lsof and ps auxf output (called before "Unmount local filesystems") attached
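
To spot such fallbacks without reading the whole log, the "busy" lines can be extracted from the verbose umount output. A sketch only: `busy_mounts` is a hypothetical helper, and the message format is copied from the log above, so other util-linux versions may phrase it differently.

```shell
#!/bin/sh
# Hypothetical helper: read verbose umount output on stdin and print every
# path that umount reported as busy and remounted read-only instead.
busy_mounts() {
    sed -n 's/^umount: \(.*\) busy - remounted read-only$/\1/p'
}

# Usage against the log file written above:
#   busy_mounts < /umountfs.log
```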

Revision history for this message
Stefan Tauner (stefanct) wrote :
affects: sysvinit (Ubuntu) → upstart (Ubuntu)
Revision history for this message
Stefan Tauner (stefanct) wrote :

grep -v ^# /etc/fstab
proc /proc proc nodev,noexec,nosuid 0 0

/dev/mapper/ssd-root / ext4 noatime,errors=remount-ro,discard 0 1
UUID=adfae278-b399-446e-8024-77bd1cef5ef0 /boot ext2 noatime 0 2

/dev/mapper/ssd-data /data ext4 noatime,discard 0 3
/data/home /home none bind 0 0
/data/cache /var/cache none bind 0 0

/data/backup/home/hourly.0 /data/backup/home-recover/ none bind,ro

/data/tftpboot /tftpboot none bind
/tftpboot/isos/systemrescuecd.iso /tftpboot/loops/sysrescuecd

Revision history for this message
Stefan Tauner (stefanct) wrote :

... auto loop,ro

sorry
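
For comparing against the mount logs, the bind mounts declared in an fstab like the one above can be listed as follows. This is a rough sketch: `bind_mounts_from_fstab` is a hypothetical name, and the parsing assumes whitespace-separated fields without escaped spaces.

```shell
#!/bin/sh
# Hypothetical helper: read fstab text on stdin and print "source -> target"
# for every entry whose option list contains "bind".
bind_mounts_from_fstab() {
    awk '
        /^[[:space:]]*#/ || NF < 4 { next }   # skip comments and short lines
        {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == "bind") { print $1 " -> " $2; break }
        }
    '
}

# Usage:
#   bind_mounts_from_fstab < /etc/fstab
```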

Revision history for this message
Stefan Tauner (stefanct) wrote :

The output of lsof -x fl +D /data/home is empty.
losetup -a -v shows only the mounted .iso:
/dev/loop0: [fc02]:6162260 (/tftpboot/isos/systemrescuecd-x86-3.0.0.iso)
losetup -f returns the first unused loop device, which is loop1.
There is no NFS server running.

I am running out of ideas... other than compiling umount myself and adding some debug prints... DONOTWANT.

Revision history for this message
Stefan Tauner (stefanct) wrote :

Hm... I am pretty sure now that this was fixed by linux-image-3.2.0-37-generic (3.2.0-37.58). I can still reproduce the wrong behavior (sometimes even with a second bind mount, apparently) with linux-image-3.2.0-36-generic (3.2.0-36.57). Can anyone confirm this?
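
To check whether a machine already runs a kernel at or past the ABI version named above, something like the following could be used. It is a sketch resting only on the observation in this comment (3.2.0-36 shows the problem, 3.2.0-37 does not); `kernel_has_reported_fix` is a hypothetical name and the check says nothing about other kernel series.

```shell
#!/bin/sh
# Hypothetical helper: given a release string such as "3.2.0-36-generic",
# return 0 if it is a 3.2.0 kernel with ABI number >= 37, 1 if it is an
# older 3.2.0 kernel, and 2 if the string is not a 3.2.0-series release.
kernel_has_reported_fix() {
    release=$1
    case $release in
        3.2.0-*) ;;
        *) return 2 ;;
    esac
    abi=${release#3.2.0-}   # e.g. "36-generic"
    abi=${abi%%-*}          # e.g. "36"
    case $abi in
        ''|*[!0-9]*) return 2 ;;
    esac
    [ "$abi" -ge 37 ]
}

# Usage on the running system:
#   kernel_has_reported_fix "$(uname -r)" && echo "kernel is 3.2.0-37 or later"
```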

Revision history for this message
pelm (pelle-ekh) wrote :

Yes, I no longer see this after the update. Boot is clean now.
