Race with ureadahead can mean that /var/lib/ureadahead/debugfs appears in /etc/mtab

Bug #499773 reported by Ralf Hildebrandt on 2009-12-23
This bug affects 34 people
Affects: mountall (Ubuntu) · Importance: Low · Assigned to: Unassigned

Bug Description

When the system is booted, the mountall process may notice a temporarily-mounted file system at /var/lib/ureadahead/debugfs and record that fact in /etc/mtab. Later in the boot process, this file system is unmounted, but mountall doesn't record unmounts, so it never "forgets" this path when it's unmounted.

Mountall should, if necessary, remove this mount from mtab.

To confirm on your system that it is indeed not really mounted, look for it in /proc/self/mountinfo.
Otherwise, this may be due to a crash of the ureadahead process during boot.
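The mountinfo check above can be scripted as a quick one-shot test (a sketch; it relies only on the mount point being a whitespace-delimited field in /proc/self/mountinfo):

```shell
# Sketch: confirm whether the path is genuinely mounted, or only
# listed in /etc/mtab. The mount point appears as a space-delimited
# field in /proc/self/mountinfo, so a padded grep is a quick check.
MP=/var/lib/ureadahead/debugfs
if grep -q " $MP " /proc/self/mountinfo 2>/dev/null; then
    echo "$MP is really mounted"
else
    echo "$MP is not mounted; any /etc/mtab entry for it is stale"
fi
```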

---- original report ----

Binary package hint: ureadahead

$ mount
/dev/sda1 on / type ext3 (rw,relatime,errors=remount-ro)
proc on /proc type proc (rw)
none on /sys type sysfs (rw,noexec,nosuid,nodev)
none on /sys/fs/fuse/connections type fusectl (rw)
none on /sys/kernel/debug type debugfs (rw)
none on /sys/kernel/security type securityfs (rw)
udev on /dev type tmpfs (rw,mode=0755)
none on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=0620)
none on /dev/shm type tmpfs (rw,nosuid,nodev)
none on /var/run type tmpfs (rw,nosuid,mode=0755)
none on /var/lock type tmpfs (rw,noexec,nosuid,nodev)
none on /lib/init/rw type tmpfs (rw,nosuid,mode=0755)
none on /var/lib/ureadahead/debugfs type debugfs (rw)
/dev/sda6 on /home type ext4 (rw,relatime)
binfmt_misc on /proc/sys/fs/binfmt_misc type binfmt_misc (rw,noexec,nosuid,nodev)
gvfs-fuse-daemon on /home/hildeb/.gvfs type fuse.gvfs-fuse-daemon (rw,nosuid,nodev,user=hildeb)

/var/lib/ureadahead/debugfs is left lingering around. Its use is somewhat limited, since most Ubuntu users are not kernel developers (if it's used as intended: http://kerneltrap.org/node/4394).

ProblemType: Bug
Architecture: amd64
Date: Wed Dec 23 11:04:11 2009
DistroRelease: Ubuntu 9.10
NonfreeKernelModules: nvidia
Package: ureadahead 0.90.3-2
ProcEnviron:
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcVersionSignature: Ubuntu 2.6.31-17.54-generic
SourcePackage: ureadahead
Uname: Linux 2.6.31-17-generic x86_64

This implies that ureadahead has crashed; when you see the lingering debugfs mount, is the ureadahead process still in ps? What does "status ureadahead" say?

If not, could you attach your dmesg, syslog and messages files (from /var/log)? Output of the "dmesg" command wouldn't go amiss either.

Thanks

Changed in ureadahead (Ubuntu):
status: New → Incomplete
importance: Undecided → Medium

Also could you attach the output of "cat /proc/self/mountinfo"

It's possible that this bug is quite different, and that /var/lib/ureadahead/debugfs isn't mounted at all; instead, an errant entry got added to /etc/mtab by mountall because the mount appeared while mountall was running, and that entry doesn't get removed by ureadahead itself.

* Scott James Remnant <email address hidden>:
> This implies that ureadahead has crashed; when you see the lingering
> debugfs mount, is the ureadahead process still in ps?

No.

> What does "status ureadahead say"?

$ status ureadahead
ureadahead stop/waiting

* Scott James Remnant <email address hidden>:
> Also could you attach the output of "cat /proc/self/mountinfo"

$ cat /proc/self/mountinfo
15 18 0:0 / /sys rw,nosuid,nodev,noexec,relatime - sysfs none rw
16 18 0:3 / /proc rw,nosuid,nodev,noexec,relatime - proc none rw
17 18 0:15 / /dev rw,relatime - tmpfs udev rw,mode=755
18 1 8:1 / / rw,relatime - ext3 /dev/disk/by-uuid/17de2e09-6910-4632-85f9-27736e1f0785 rw,errors=remount-ro,data=writeback
19 15 0:8 / /sys/kernel/security rw,relatime - securityfs none rw
21 15 0:16 / /sys/fs/fuse/connections rw,relatime - fusectl none rw
22 15 0:5 / /sys/kernel/debug rw,relatime - debugfs none rw
23 17 0:11 / /dev/pts rw,nosuid,noexec,relatime - devpts none rw,gid=5,mode=620,ptmxmode=000
24 17 0:17 / /dev/shm rw,nosuid,nodev,relatime - tmpfs none rw
25 18 0:18 / /var/run rw,nosuid,relatime - tmpfs none rw,mode=755
26 18 0:19 / /var/lock rw,nosuid,nodev,noexec,relatime - tmpfs none rw
27 18 0:20 / /lib/init/rw rw,nosuid,relatime - tmpfs none rw,mode=755
28 18 8:6 / /home rw,relatime - ext4 /dev/sda6 rw,barrier=1,data=ordered
29 16 0:21 / /proc/sys/fs/binfmt_misc rw,nosuid,nodev,noexec,relatime - binfmt_misc binfmt_misc rw

Great, this confirms that the filesystem isn't really mounted; it's just appearing in /etc/mtab.

summary: - /var/lib/ureadahead/debugfs lingering around
+ /var/lib/ureadahead/debugfs not removed from /etc/mtab
Changed in ureadahead (Ubuntu):
status: Incomplete → Triaged

This is affecting me when trying to copy a really large file from a CIFS mount; the copy then dies as if I had run out of space (but I haven't):

$ time cp Ewrapped.sql /var/lib/mysql/
cp: writing `/var/lib/mysql/Ewrapped.sql': No space left on device

real 25m17.057s
user 0m0.356s
sys 0m43.167s

(fyi, the file is a > 9GB mysql dump)

This is also WEIRD:

$ df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda1 36G 36G 0 100% /
udev 1.5G 256K 1.5G 1% /dev
none 1.5G 184K 1.5G 1% /dev/shm
none 1.5G 164K 1.5G 1% /var/run
none 1.5G 0 1.5G 0% /var/lock
none 1.5G 0 1.5G 0% /lib/init/rw
none 36G 36G 0 100% /var/lib/ureadahead/debugfs
/dev/sda6 111G 11G 95G 10% /home

debugfs with 36G/36G used, 0% free?

$ status ureadahead
ureadahead stop/waiting

$ uname -a
Linux garepa-desktop 2.6.31-16-generic #53-Ubuntu SMP Tue Dec 8 04:01:29 UTC 2009 i686 GNU/Linux

$ cat /proc/self/mountinfo
15 18 0:0 / /sys rw,nosuid,nodev,noexec,relatime - sysfs none rw
16 18 0:3 / /proc rw,nosuid,nodev,noexec,relatime - proc none rw
17 18 0:15 / /dev rw,relatime - tmpfs udev rw,mode=755
18 1 8:1 / / rw,relatime - ext3 /dev/disk/by-uuid/f3dfd3ac-4485-4d7a-b8e0-bfb4f7094a9e rw,errors=remount-ro,data=writeback
19 15 0:8 / /sys/kernel/security rw,relatime - securityfs none rw
21 15 0:16 / /sys/fs/fuse/connections rw,relatime - fusectl none rw
22 15 0:5 / /sys/kernel/debug rw,relatime - debugfs none rw
23 17 0:11 / /dev/pts rw,nosuid,noexec,relatime - devpts none rw,gid=5,mode=620,ptmxmode=000
24 17 0:17 / /dev/shm rw,nosuid,nodev,relatime - tmpfs none rw
25 18 0:18 / /var/run rw,nosuid,relatime - tmpfs none rw,mode=755
26 18 0:19 / /var/lock rw,nosuid,nodev,noexec,relatime - tmpfs none rw
27 18 0:20 / /lib/init/rw rw,nosuid,relatime - tmpfs none rw,mode=755
28 18 8:6 / /home rw,nosuid,nodev,relatime - ext3 /dev/sda6 rw,errors=continue,data=writeback
29 28 0:21 / /home/garen/.gvfs rw,nosuid,nodev,relatime - fuse.gvfs-fuse-daemon gvfs-fuse-daemon rw,user_id=1000,group_id=1000
30 16 0:22 / /proc/sys/fs/binfmt_misc rw,nosuid,nodev,noexec,relatime - binfmt_misc binfmt_misc rw
20 18 0:23 / /mnt/nyquist/DailySnapshots rw,relatime - cifs //nyquist/Daily\040Snapshots rw,mand,unc=\\nyquist\Daily Snapshots,username=garepa,uid=0,noforceuid,gid=0,noforcegid,addr=10.100.0.57,file_mode=0755,dir_mode=0755,serverino,rsize=16384,wsize=57344

Uh - you *did* run out of space on your root filesystem. :P

debugfs is just lying to you and stating that its filesystem is full too.

/dev/sda1 36G 36G 0 100% /

comgen (ndatus) wrote :

I'll add:

On any system where I have Ubuntu 9.10 installed (Ubuntu, Xubuntu, etc.), on boot/reboot
the following is present (df -H): /var/lib/ureadahead/debugfs

System performance is greatly reduced (est. 20%-40%). So far, a simple reboot will remove this entry, and performance returns to normal. The key condition/factor is whether /var/lib/ureadahead/debugfs is present.

General system specs are wide: 400Mhz - 3 GHz Intel and AMD 32/64 bit, 256 MB - 4 GB RAM, 10 GB - 500 GB HDD mixed brands.

As reported by others, ureadahead will fail on boot/reboot as well; without a failure, everything is fine.

If I can be of any further assistance, please provide brief debug / test commands / log info request, etc and I'll be happy to provide them.

Lao Meng (menggr) wrote :

I've got something similar to this going on. Here's the data:
uname -a yields:
   Linux eRuiServer 2.6.31-19-generic-pae #56-Ubuntu SMP Thu Jan 28 02:29:51 UTC 2010 i686 GNU/Linux

status ureadahead yields:
   ureadahead stop/waiting

df yields:
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sda1 24027628 22204940 602152 98% /
udev 253720 208 253512 1% /dev
none 253720 124 253596 1% /dev/shm
none 253720 532 253188 1% /var/run
none 253720 0 253720 0% /var/lock
none 253720 0 253720 0% /lib/init/rw
none 24027628 22204940 602152 98% /var/lib/ureadahead/debugfs
/dev/sda2 128321212 12963948 108838892 11% /home

sudo du -sc * from the / dir yields:
6852 bin
55076 boot
0 cdrom
332 dev
52 drivers
17932 etc
du: cannot access `home/menggr/.gvfs': Permission denied
12771740 home
0 initrd.img
0 initrd.img.old
383696 lib
16 lost+found
0 lpr.org
17630604 media
4 mnt
4 opt
du: cannot access `proc/3189/task/3189/fd/4': No such file or directory
du: cannot access `proc/3189/task/3189/fdinfo/4': No such file or directory
du: cannot access `proc/3189/fd/4': No such file or directory
du: cannot access `proc/3189/fdinfo/4': No such file or directory
du: cannot access `proc/3190': No such file or directory
du: cannot access `proc/3192': No such file or directory
du: cannot access `proc/3193': No such file or directory
0 proc
392 root
8512 sbin
4 selinux
200 srv
0 sys
136 tmp
2842812 usr
1082940 var
0 vmlinuz
0 vmlinuz.old
34801304 total

From the df we can see that the /dev/sda2 partition that has /home mounted is using 12,963,948 kB. This is reasonably consistent with the value reported by du, which is 12,771,740 kB. But for /dev/sda1, which has everything else, df reports it is using 22,204,940 kB and is 98% full, and debugfs mirrors that. Yet if we look at what du reports for everything besides /home (we also ignore media, because that is just an external disk), it totals out to 4,398,960 kB, which isn't even close to 98%. I spent most of the day trying to find out what is going on here but have failed. The debugfs entry led me to this discussion. When I boot my server it complains that the root directory is almost full. Does anybody think my problem is related to this one? Any suggestions on what I can do? Thanks in advance!

Michel Roche (pichel) wrote :

Hello,
I'm experiencing the very same problem as Lao Meng !

Actually, I was led to this bug because of the same message telling me that my root partition was almost full.
The debugfs entry didn't appear at first, but I can see it now repeatedly.

Can it be related to this bug ? Or another one ?

Michel

I think what's happening here is that ureadahead mounts /var/lib/ureadahead/debugfs so that it can work before the actual /sys/kernel/debug path is mounted. mountall is running, and sees that debugfs gets mounted at the new path, and records it to be added to /etc/mtab when the root filesystem is mounted.

Now, mountall doesn't record unmounts, so it never "forgets" this path when it's unmounted. I think that's actually the bug here; mountall should forget about the mount and, if necessary, remove it from mtab.
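The bookkeeping bug described above can be shown with a toy sketch (the function names here are invented for illustration; this is not mountall's actual code): mounts are recorded, unmounts are ignored, so a short-lived mount lingers in the record.

```shell
# Toy sketch of "records mounts, never records unmounts".
MTAB=$(mktemp)
record_mount()  { echo "$1" >>"$MTAB"; }   # noting every new mount...
record_umount() { :; }                     # ...but never removing one

record_mount  /var/lib/ureadahead/debugfs  # ureadahead's temporary mount
record_umount /var/lib/ureadahead/debugfs  # unmounted moments later

cat "$MTAB"   # the long-gone mount is still listed
rm -f "$MTAB"
```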

summary: - /var/lib/ureadahead/debugfs not removed from /etc/mtab
+ Race with mountall can mean that /var/lib/ureadahead/debugfs appears in
+ /etc/mtab
summary: - Race with mountall can mean that /var/lib/ureadahead/debugfs appears in
- /etc/mtab
+ Race with ureadahead can mean that /var/lib/ureadahead/debugfs appears
+ in /etc/mtab
affects: ureadahead (Ubuntu) → mountall (Ubuntu)
Changed in mountall (Ubuntu):
importance: Medium → Low
André Savik (asavik) wrote :

Why has this been set to importance: low when tens of thousands of users are told their hard disks are full, and they can no longer store files on them?

It's still happening in natty:

$ df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda1 106G 62G 39G 62% /
none 2,0G 296K 2,0G 1% /dev
none 2,0G 5,2M 2,0G 1% /dev/shm
none 2,0G 128K 2,0G 1% /var/run
none 2,0G 0 2,0G 0% /var/lock
none 106G 62G 39G 62% /var/lib/ureadahead/debugfs

$ ps auxwww|grep ureadahead
hildeb 26013 0.0 0.0 10396 924 pts/1 S+ 15:18 0:00 grep ureadahead

$ cat /proc/mounts
rootfs / rootfs rw 0 0
none /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
none /proc proc rw,nosuid,nodev,noexec,relatime 0 0
none /dev devtmpfs rw,relatime,size=2016252k,nr_inodes=504063,mode=755 0 0
none /dev/pts devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
fusectl /sys/fs/fuse/connections fusectl rw,relatime 0 0
/dev/disk/by-uuid/d3877693-43c0-45cc-821c-6c4dcd692cde / ext4 rw,noatime,errors=remount-ro,barrier=1,data=ordered 0 0
none /sys/kernel/debug debugfs rw,relatime 0 0
none /sys/kernel/security securityfs rw,relatime 0 0
none /dev/shm tmpfs rw,nosuid,nodev,relatime 0 0
none /var/run tmpfs rw,nosuid,relatime,mode=755 0 0
none /var/lock tmpfs rw,nosuid,nodev,noexec,relatime 0 0
binfmt_misc /proc/sys/fs/binfmt_misc binfmt_misc rw,nosuid,nodev,noexec,relatime 0 0
gvfs-fuse-daemon /home/hildeb/.gvfs fuse.gvfs-fuse-daemon rw,nosuid,nodev,relatime,user_id=1000,group_id=1000 0 0

Mine disappeared after a server reboot (on a fully patched 10.04 system), but will it reappear?

Getting this on a fresh install of Maverick 10.10, fully updated.

$ df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/mapper/shim-root
                     230343852 107685244 110957772 50% /
none 4034012 340 4033672 1% /dev
none 4044064 47660 3996404 2% /dev/shm
none 4044064 752 4043312 1% /var/run
none 4044064 0 4044064 0% /var/lock
none 230343852 107685244 110957772 50% /var/lib/ureadahead/debugfs
/dev/sda1 233191 75352 145398 35% /boot

$ cat /proc/self/mountinfo
14 19 0:14 / /sys rw,nosuid,nodev,noexec,relatime - sysfs none rw
15 19 0:3 / /proc rw,nosuid,nodev,noexec,relatime - proc none rw
16 19 0:5 / /dev rw,relatime - devtmpfs none rw,size=4034012k,nr_inodes=1008503,mode=755
17 16 0:11 / /dev/pts rw,nosuid,noexec,relatime - devpts none rw,gid=5,mode=620,ptmxmode=000
18 14 0:15 / /sys/fs/fuse/connections rw,relatime - fusectl fusectl rw
19 1 252:1 / / rw,relatime - ext4 /dev/mapper/shim-root rw,errors=remount-ro,user_xattr,barrier=1,data=ordered
21 14 0:7 / /sys/kernel/debug rw,relatime - debugfs none rw
22 14 0:10 / /sys/kernel/security rw,relatime - securityfs none rw
23 16 0:16 / /dev/shm rw,nosuid,nodev,relatime - tmpfs none rw
24 19 0:17 / /var/run rw,nosuid,relatime - tmpfs none rw,mode=755
25 19 0:18 / /var/lock rw,nosuid,nodev,noexec,relatime - tmpfs none rw
26 19 8:1 / /boot rw,relatime - ext2 /dev/sda1 rw,errors=continue
27 15 0:19 / /proc/sys/fs/binfmt_misc rw,nosuid,nodev,noexec,relatime - binfmt_misc binfmt_misc rw
20 19 0:20 / /home/aaron/.gvfs rw,nosuid,nodev,relatime - fuse.gvfs-fuse-daemon gvfs-fuse-daemon rw,user_id=1000,group_id=1000

Neal McBurnett (nealmcb) wrote :

Is there a way to clean this up when it happens? E.g. a safe way to audit and fix /etc/mtab?
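One way to audit for such stale entries (a sketch over embedded sample data, so it is self-contained; on a live system you would read the real /etc/mtab and /proc/mounts instead of the sample variables):

```shell
# Sketch: find mount points that /etc/mtab lists but the kernel's own
# table (/proc/mounts) does not; anything printed is a stale entry.
mtab='/dev/sda1 / ext3 rw 0 0
none /var/lib/ureadahead/debugfs debugfs rw 0 0'
mounts='/dev/sda1 / ext3 rw 0 0'

# The mount point is the second whitespace-separated field in both.
for mp in $(printf '%s\n' "$mtab" | awk '{print $2}'); do
    printf '%s\n' "$mounts" | awk '{print $2}' | grep -qxF -- "$mp" \
        || echo "stale: $mp"
done
```

A stale entry found this way can then be cleared with umount on that path, which rewrites mtab even though nothing is really mounted there.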

description: updated
James (guanfenglin) wrote :

Looks like doing a dist-upgrade will fix it:

Processing triggers for ureadahead ...
ureadahead will be reprofiled on next reboot
Setting up linux-image-2.6.32-31-server (2.6.32-31.61) ...

Help me out here, folks: Is this harmless or not? On the one hand, if it's just a matter of /etc/mtab claiming there's something mounted that isn't, that seems (relatively) harmless. On the other hand, we have at least one user reporting here (comment #9) that they see a significant performance impact when this happens, and multiple users reporting that it causes incorrect file system full errors (although I'm having trouble seeing why it would do that).

So:

* Does this really have a performance impact, or is that user mistaken?

* Does this really cause false "full" errors, or are those users mistaken?

10.04 is an LTS release. I'm seeing this on my fully-updated 10.04 server, which surprises me nearly two years later if it really has any impacts.

Håkan Jonsson (hajons) wrote :

It is not harmless. I am also getting file system full errors. For example, Mozilla Firefox can't download anything, since it can't store any temp files.

$ df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/vg_local-root
                       19G 19G 0 100% /
none 5.9G 280K 5.9G 1% /dev
none 5.9G 1.3M 5.9G 1% /dev/shm
none 5.9G 468K 5.9G 1% /var/run
none 5.9G 0 5.9G 0% /var/lock
none 5.9G 0 5.9G 0% /lib/init/rw
none 19G 19G 0 100% /var/lib/ureadahead/debugfs
/dev/sda1 457M 116M 317M 27% /boot
/dev/mapper/vg_local-home
                      159G 47G 105G 31% /home

@Håkan Jonsson (comment #20):

But your root file system really is showing that it's full:

> $ df -h
> Filesystem Size Used Avail Use% Mounted on
> /dev/mapper/vg_local-root
> 19G 19G 0 100% /

Note the 100%, 19G out of 19G used. Have you proven that that's not true? (Pretty easy to fill up 19G these days...)

-- T.J.

QuentinHartman (qhartman) wrote :

This is mainly for the benefit of people who search google for this problem and end up here.

I've seen this on several servers. A simple "umount /var/lib/ureadahead/debugfs" removes the errant mtab entry and cleans up df output if it's bothering you.

It's highly unlikely that this mount lying around is causing any actual problems. The only people who have reported disk-full problems legitimately had full disks, and the performance degradation just doesn't make sense. It's entirely possible that some other coincident problem is causing the slowdown that the above user reported.

If df reports that your disk is full but du can't account for all that space, check for deleted files that are still held open. It's possible for a file to be deleted but held open by a long-running process, which prevents its space from actually being freed; df counts that space as used, while du cannot see it. To check whether this is affecting you, run "lsof | grep deleted". If you see any files listed there, especially large ones, see if you can stop the process holding them open, and the space will become usable again. You will likely see at least a few files in the list, probably log files, temp files, and lock files; those are usually not worth worrying about.
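The deleted-but-open effect described above is easy to reproduce (a sketch; Linux only, since it peeks at /proc instead of relying on lsof being installed):

```shell
# Sketch: a process that keeps a deleted file open pins its blocks,
# so df sees the space as used even though du can no longer find it.
tmp=$(mktemp)
echo "some data" >"$tmp"
exec 3<"$tmp"               # hold the file open on descriptor 3
rm "$tmp"                   # gone from the directory tree...
ls "$tmp" 2>/dev/null || echo "unlinked"
readlink /proc/self/fd/3    # ...but still pinned: path ends "(deleted)"
exec 3<&-                   # closing the descriptor frees the space
```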

Jason Wolosonovich (jmwoloso) wrote :

This bug does cause deja-dup to hang if you're backing up system files.

Thomas M Steenholdt (tmus) wrote :

open-vm-tools also chokes on this, when freezing filesystems for a quiesced snapshot

Maxim Kursin (m0ps) wrote :

Installing linux-image-hwe-virtual-trusty on 14.04 with the latest updates broke Veeam backups. This is an urgent issue; can somebody advise?

Grzegorz Ojrzanowski (szponek) wrote :

For me it's the mtab entry for tracefs, on a non-existent mount point at /var/lib/ureadahead/debugfs/tracing, that is causing problems. It affects VMware snapshots when quiescing is required, too.

Since it's a different entry than the one in this post, I opened a new bug report: #1659684
