overlayfs does not implement inotify interfaces correctly

Bug #882147 reported by Andy Whitcroft on 2011-10-26
This bug affects 39 people
Affects              Importance   Assigned to
coreutils (Ubuntu)   Undecided    Adam Conrad
  Precise            Undecided    Unassigned
linux (Ubuntu)       High         Andy Whitcroft
  Precise            High         Andy Whitcroft

Bug Description

When using tail on the liveCD, some updates are not reported. This seems to be triggered by tail using inotify to identify modified files. Overlayfs does not appear to implement inotify quite the way you might hope: it seems to report events only against the underlying filesystems.

Related bugs:
 * bug 1213925: upstart should notice "/etc" inode change

ProblemType: Bug
DistroRelease: Ubuntu 11.10
Package: linux-image-3.0.0-12-generic 3.0.0-12.20
ProcVersionSignature: Ubuntu 3.0.0-12.20-generic 3.0.4
Uname: Linux 3.0.0-12-generic x86_64
AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.24.
ApportVersion: 1.23-0ubuntu3
Architecture: amd64
ArecordDevices:
 **** List of CAPTURE Hardware Devices ****
 card 0: Intel [HDA Intel], device 0: STAC92xx Analog [STAC92xx Analog]
   Subdevices: 0/1
   Subdevice #0: subdevice #0
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: apw 2296 F.... pulseaudio
 /dev/snd/pcmC0D0c: apw 2296 F...m pulseaudio
 /dev/snd/pcmC0D0p: apw 2296 F...m pulseaudio
Card0.Amixer.info:
 Card hw:0 'Intel'/'HDA Intel at 0xfc700000 irq 47'
   Mixer name : 'Intel Cantiga HDMI'
   Components : 'HDA:111d7675,1028029f,00100103 HDA:80862802,80860101,00100000'
   Controls : 20
   Simple ctrls : 11
Date: Wed Oct 26 17:46:16 2011
EcryptfsInUse: Yes
HibernationDevice: RESUME=UUID=d8328455-deac-4bae-877d-c408d371cefe
MachineType: Dell Inc. Studio 1537
ProcEnviron:
 PATH=(custom, user)
 LANG=en_GB.UTF-8
 SHELL=/bin/bash
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.0.0-12-generic root=UUID=cf503727-25f2-4ecd-b0f3-2b894523bcba ro quiet splash vt.handoff=7
RelatedPackageVersions:
 linux-restricted-modules-3.0.0-12-generic N/A
 linux-backports-modules-3.0.0-12-generic N/A
 linux-firmware 1.60
SourcePackage: linux
UpgradeStatus: Upgraded to oneiric on 2011-10-17 (9 days ago)
WpaSupplicantLog:

dmi.bios.date: 09/22/2008
dmi.bios.vendor: Dell Inc.
dmi.bios.version: A03
dmi.board.vendor: Dell Inc.
dmi.board.version: A03
dmi.chassis.type: 8
dmi.chassis.vendor: Dell Inc.
dmi.chassis.version: A03
dmi.modalias: dmi:bvnDellInc.:bvrA03:bd09/22/2008:svnDellInc.:pnStudio1537:pvrA03:rvnDellInc.:rn:rvrA03:cvnDellInc.:ct8:cvrA03:
dmi.product.name: Studio 1537
dmi.product.version: A03
dmi.sys.vendor: Dell Inc.

Andy Whitcroft (apw) wrote :
Changed in linux (Ubuntu):
status: New → Triaged
assignee: nobody → Andy Whitcroft (apw)
Changed in linux (Ubuntu):
importance: Undecided → Medium
Asel Alshukri (asel-alshukri) wrote :

I got to this bug when I was asked to report an error by ubuntu, then it said that this bug/error has already been reported and then opened me this page.
This is what I was doing:

To run a command as administrator (user "root"), use "sudo <command>".
See "man sudo_root" for details.

ubuntu@ubuntu:~$ sudo su -
root@ubuntu:~# apt-get install nfs-common
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following extra packages will be installed:
  libgssglue1 libnfsidmap2 libtirpc1 rpcbind
The following NEW packages will be installed:
  libgssglue1 libnfsidmap2 libtirpc1 nfs-common rpcbind
0 upgraded, 5 newly installed, 0 to remove and 0 not upgraded.
Need to get 398 kB of archives.
After this operation, 1,565 kB of additional disk space will be used.
Do you want to continue [Y/n]? y
Get:1 http://archive.ubuntu.com/ubuntu/ oneiric/main libgssglue1 amd64 0.3-1ubuntu1 [22.3 kB]
Get:2 http://archive.ubuntu.com/ubuntu/ oneiric/main libtirpc1 amd64 0.2.2-5 [84.2 kB]
Get:3 http://archive.ubuntu.com/ubuntu/ oneiric/main rpcbind amd64 0.2.0-6ubuntu3 [42.2 kB]
Get:4 http://archive.ubuntu.com/ubuntu/ oneiric/main libnfsidmap2 amd64 0.24-1 [28.0 kB]
Get:5 http://archive.ubuntu.com/ubuntu/ oneiric/main nfs-common amd64 1:1.2.4-1ubuntu2 [222 kB]
Fetched 398 kB in 0s (565 kB/s)
W: Duplicate sources.list entry cdrom://Ubuntu 11.10 _Oneiric Ocelot_ - Release amd64 (20111012)/ oneiric/main i386 Packages (/var/lib/apt/lists/Ubuntu%2011.10%20%5fOneiric%20Ocelot%5f%20-%20Release%20amd64%20(20111012)_dists_oneiric_main_binary-i386_Packages)
Selecting previously deselected package libgssglue1.
(Reading database ... 130325 files and directories currently installed.)
Unpacking libgssglue1 (from .../libgssglue1_0.3-1ubuntu1_amd64.deb) ...
Selecting previously deselected package libtirpc1.
Unpacking libtirpc1 (from .../libtirpc1_0.2.2-5_amd64.deb) ...
Selecting previously deselected package rpcbind.
Unpacking rpcbind (from .../rpcbind_0.2.0-6ubuntu3_amd64.deb) ...
Selecting previously deselected package libnfsidmap2.
Unpacking libnfsidmap2 (from .../libnfsidmap2_0.24-1_amd64.deb) ...
Selecting previously deselected package nfs-common.
Unpacking nfs-common (from .../nfs-common_1%3a1.2.4-1ubuntu2_amd64.deb) ...
Processing triggers for man-db ...
Processing triggers for ureadahead ...
Setting up libgssglue1 (0.3-1ubuntu1) ...
Setting up libtirpc1 (0.2.2-5) ...
Setting up rpcbind (0.2.0-6ubuntu3) ...
 Removing any system startup links for /etc/init.d/rpcbind ...
start: Unknown job: portmap
invoke-rc.d: initscript portmap, action "start" failed.
dpkg: error processing rpcbind (--configure):
 subprocess installed post-installation script returned error exit status 1
Setting up libnfsidmap2 (0.24-1) ...
dpkg: dependency problems prevent configuration of nfs-common:
 nfs-common depends on rpcbind (>= 0.2.0-6ubuntu1); however:
  Package rpcbind is not configured yet.
dpkg: error processing nfs-common (--configure):
 dependency problems - leaving unconfigured
Processing triggers for libc-bin ...
No apport report written because the error message indicates its a followup error from a previous failure.
                        ...


C de-Avillez (hggdh2) wrote :

cross-reference: coreutils bug 908354

Kate Stewart (kate.stewart) wrote :

Bumping up the priority on this, as it seems to be the root cause of several problems.

stgraber wrote: Edubuntu would like bug 882147 to be release-targeted and its importance bumped. It's the ultimate cause of all these ltsp-live bugs and the reason why I had to add a bunch of upstart and NM hacks in ltsp-live (as in, force them to rescan/restart their config to work around inotify not working). This is also annoying when doing things like "tail -f /var/log/syslog" in a livecd and it just doesn't work because of that bug.

Changed in linux (Ubuntu):
importance: Medium → High
tags: added: rls-mgr-p-tracking
Brian Murray (brian-murray) wrote :

I installed openssh-server after booting a precise live cd and then was unable to start ssh because upstart uses inotify to scan for new services.

I ended up having to manually run 'sudo initctl reload-configuration' to be able to use 'sudo service ssh start'.

Brian Murray (brian-murray) wrote :

Additionally, I want to be able to fix bug 901381 and have ubiquity create proper apport-crash reports in /var/crash, but unfortunately update-notifier uses inotify, so on the Live CD it won't tell you that there is a crash report for you to report.

Steve Langasek (vorlon) wrote :

Andy, is there a chance this bug might be fixed this cycle? There's an awful lot that depends on inotify nowadays, and it would be a shame to have to revert to polling interfaces to work around this overlayfs bug.

Brian Murray (brian-murray) wrote :

Andy indicated on irc today, in #ubuntu-kernel, that he is planning on starting a conversation with upstream about this.

Mike Mestnik (cheako) wrote :

Well, inotify is great and all. However features like this simply won't always exist or always work properly. A good case in point is locking over NFS.

Thus dpkg or the install scripts need a hook for every file that needs inotify. This callback should be run after the inotify event would have fired, and it should first detect whether the notification worked, either by looking for the results of the inotify event (possibly best) or by testing inotify on its own. If an issue is detected, corrective action needs to be taken, likely via a callback.

Yes, that's right... "Why bother with inotify."

inotify for updates is a poor choice because updates are not done 100 or even 1000s of times per day, so the performance loss of sending a signal(kill -HUP) is preferred over a system that sometimes doesn't work.

However inotify can be useful in fast-paths, that's what it's good for. Start using the right tool for the job.
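The "test inotify on its own" idea in the comment above could look roughly like the following Python sketch. It is only an illustration under stated assumptions: the function name inotify_delivers_events and the scratch-file approach are hypothetical, and the probe writes to a freshly created file, so it is not established here that it would catch the overlayfs failure discussed in this bug (which may only affect files that already existed before the watch was added).

```python
import ctypes
import os
import select
import tempfile

IN_MODIFY = 0x00000002  # mask value from <sys/inotify.h>


def inotify_delivers_events(directory, timeout=1.0):
    """Hypothetical probe: watch a scratch file in `directory`, write to it,
    and report whether an inotify event shows up within `timeout` seconds."""
    try:
        libc = ctypes.CDLL(None, use_errno=True)
        fd = libc.inotify_init()
    except (OSError, AttributeError):
        return False  # no inotify on this platform
    if fd < 0:
        return False
    try:
        with tempfile.NamedTemporaryFile(dir=directory) as scratch:
            wd = libc.inotify_add_watch(fd, os.fsencode(scratch.name), IN_MODIFY)
            if wd < 0:
                return False  # the kernel refused the watch outright
            scratch.write(b"x")
            scratch.flush()  # force the write so IN_MODIFY can fire
            ready, _, _ = select.select([fd], [], [], timeout)
            return bool(ready)
    except OSError:
        return False  # could not create the scratch file
    finally:
        os.close(fd)
```

A caller would run the probe once per directory of interest and switch to a polling or signal-based path when it returns False.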

Stéphane Graber (stgraber) wrote :

Just added bug 956827 as a duplicate. Lack of inotify breaks tail which in turn breaks the installer's log window.

All the user will see during the install is:
[timestamp] Ubiquity 2.9.x
[timestamp] log-output -t ubiquity laptop-detect

Any other relevant information will only show up if the installer crashes and the user sends a bug report...

Colin Watson (cjwatson) wrote :

Mike, the problem is that overlayfs effectively pretends to have inotify but in fact never sends any notifications. tail *does* in fact test that the inotify calls succeed and falls back to other methods if they don't, but the exact manner of the failure here defeats its test.

Mike Mestnik (cheako) wrote :

Colin,
  This is the second time I've replied to this, so I'll be brief.

A filesystem can implement inotify most of the time, while still being unable to implement it all of the time. A good example is a filesystem where inotify is disabled, or unable to be implemented, when flock is in use.

Applications should be able to handle this situation effectively.

Mike Mestnik (cheako) wrote :

Depending on an advertised feature simply because it was advertised is an error.

Steve Langasek (vorlon) wrote :

Mike, I think you may be missing the point that the nature of the overlayfs bug is such that nothing which wants to use inotify (and there are several bits of software in the live CD that do nowadays, because inotify is a *far superior API* to the alternatives) has any way of knowing that it isn't working correctly.

This is a bug in the overlayfs driver, plain and simple.

Mike Mestnik (cheako) wrote :

So, I should open a bug regarding the use of inotify on other filesystems including but not limited to overlayfs?

I understand the need to accurately track separate issues, but the solution is the same.

John Gilmore (gnu-gilmore) wrote :

Mike, if inotify will not actually report all changes on an overlayfs then inotify needs to report an error when someone tries to monitor changes in an overlayfs. Pretty simple, eh?

I mean, you could actually fix it so that inotify DOES report all the changes, but that would be harder than merely erroring out and forcing the userspace program to fall back to polling.

(I don't know how many userspace programs, besides "tail", know how to fall back to polling. But what's the point of having inotify at all, if userspace programs can't depend on it to either produce correct results, or give an error when it can't?)

(I reported bug#977847 in which "apt-get install nfs-common" fails on the latest Ubuntu Beta2 LTS livecd because a critical system facility made the stupid mistake of depending on inotify. Either we have to tell every programmer to pretend inotify doesn't exist as a kernel facility, or we have to fix it. This is not an academic concern; end-users are noticing daily.)

Mike Mestnik (cheako) wrote :

John,
  You make an interesting point about returning with failure. Applications should account for all modes of failure, even when success is reported.

As for the point of having inotify at all, this was addressed in my first post I wrote on 2012-02-25.

My skepticism about reporting an error if inotify will not actually report "all" changes is that this rules out the possibility of reporting a subset of changes via inotify.

John Gilmore (gnu-gilmore) wrote :

Inotify is not defined to return "a subset of changes". It's defined to return all changes. That's why this is a bug.

Mike Mestnik (cheako) wrote :

Very well.

Though there is still one thing that bothers me. If inotify is non-existent because only partial support would be available, how do applications like tail/upstart/etc. work?

Rhetorical I hope, they have to work regardless. So that begs my original question!

"Why bother with inotify."

Just do your fallback and forget about even testing for inotify support, under the guideline that small is better and inotify doesn't make most applications smaller. As indicated, there are a few places where inotify can shine, but it should not be abused by every function wherever it might fit.

Inotify was built for watching large numbers of files and folders. If your application has a manageable number of files, then inotify is overkill and not for you, as your application will break when inotify is missing.

Dimitri John Ledkov (xnox) wrote :

There was indication in the past of some work in progress patches:
http://marc.info/?l=linux-fsdevel&m=133043888730766&w=2
http://marc.info/?l=linux-kernel&m=133043888330760&w=2

Andy, have you published them somewhere? Any amount of inotify support in overlayfs, even with limitations, will greatly improve ubuntu live cd session with&without persistence.

Why bother with inotify: it's cheap, when the alternative of constantly polling is expensive w.r.t. system resources and hence not used.

Mike Mestnik (cheako) wrote :

Why bother with inotify: This is the huge problem with community-driven software. Performance is always given the right of way; stability and usefulness are not universally considered important.

If there is a *single situation where a feature doesn’t function, then every application that makes use of this feature needs to either exit with an error or handle a warning. This bug is about "When using tail on the liveCD some updates are not reported.", not about the lack of inotify support. The lack of inotify is a different bug; the issue here is that this one software package doesn’t appropriately handle cases where inotify doesn’t function as it should.

I've no problem with an application that makes use of inotify; I've an issue with applications that expect inotify to always be available, even on Overlayfs/3.0.0-12-generic.

* A filesystem that made it into a release kernel being a good example.

Mike Mestnik (cheako) wrote :

A simple solution for performance is for tail to scale its polling. Users won't care much if tail checks every 3 seconds on a file that hasn't changed in 30, plus there can be configuration options.

My suggestion is that once all this logic is a core component, tail won't need inotify support anymore.
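As a rough illustration of the scaled polling suggested above, here is a minimal Python sketch. The function name next_interval, the growth factor and the 1-second and 3-second bounds are assumptions chosen for the example; nothing here reflects how tail is actually implemented.

```python
def next_interval(current, changed, minimum=1.0, maximum=3.0, factor=1.5):
    """Return the delay to wait before the next poll.

    Hypothetical policy: snap back to `minimum` as soon as the file
    changes, otherwise grow the delay by `factor` up to `maximum`.
    """
    if changed:
        return minimum
    return min(current * factor, maximum)


def schedule(change_flags, start=1.0):
    """Yield the successive delays for a sequence of 'did it change?' flags."""
    delay = start
    for changed in change_flags:
        delay = next_interval(delay, changed)
        yield delay
```

With these defaults an idle file is polled less and less often until the 3-second ceiling is reached, and a single change resets the delay to 1 second.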

Excerpts from Mike Mestnik's message of 2013-01-06 16:06:49 UTC:
> Why bother with inotify: This is the huge problem with community driven
> software. Performance is always given the right of way, stability and
> usefulness are not universally considered important.
>
> If there is a *single situation where a feature doesn’t function then
> every application that makes use of this feature needs to either exit
> with error or handle a warning. This bug is about "When using tail on
> the liveCD some updates are not reported.", not about the lack of
> inotify support. The lack of inotify is a different bug, the issue here
> is that this one software package doesn’t appropriately handle cases
> where inotify doesn’t function as it should.
>
> I've no problems with an application that makes use of inotify, I've an
> issue with applications that expect inotify to always be available even on
> Overlayfs/3.0.0-12-generic.

It's not that inotify is missing, it's that overlayfs is falsely claiming
that inotify works. Applications which implement inotify can fall back
to polling, *if* the inotify calls fail, which they do not on overlayfs.

So you see, the bug is already fixed in tail, but it is well and truly
broken in overlayfs.

Mike Mestnik (cheako) wrote :

No, if a kernel is released with some behavior then applications must be made to cope with that behavior (if only for the range of kernels released with this feature).

Tail does not cope with the "features" of a released kernel, and this is in itself a bug, separate from that of the kernel having a given feature.

Clint Byrum (clint-fewbar) wrote :

By that logic if select() is broken in the kernel and does not properly handle its arguments or give proper error codes, my program which expects to get proper error codes from the syscall is broken because it doesn't know how to handle something not working as documented? I'm not sure I agree with that.

tail should be able to trust that if inotify_add_watch() returns with a successful return code, then the directory that was given to the kernel *is watched*. If an error were returned, as it is in other filesystems where inotify does not work, then tail would fall back to polling.
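The contract described above (a successful inotify_add_watch() means the path is watched, an error means fall back to polling) can be sketched in Python via ctypes. This is an illustrative sketch only: pick_strategy is a hypothetical helper, not code from tail, and it closes the inotify descriptor immediately where a real program would keep it and read events from it.

```python
import ctypes
import os

IN_MODIFY = 0x00000002  # mask value from <sys/inotify.h>


def pick_strategy(path):
    """Return "inotify" if the kernel accepts a watch on `path`, else "polling"."""
    try:
        libc = ctypes.CDLL(None, use_errno=True)
        fd = libc.inotify_init()
    except (OSError, AttributeError):
        return "polling"  # inotify is not available on this platform
    if fd < 0:
        return "polling"
    try:
        wd = libc.inotify_add_watch(fd, os.fsencode(path), IN_MODIFY)
        # A negative return value is the only failure signal the caller gets.
        return "inotify" if wd >= 0 else "polling"
    finally:
        os.close(fd)
```

The sketch also shows why the overlayfs behaviour is hard to work around: if the watch is accepted but events never arrive, this check still answers "inotify".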

Mike Mestnik (cheako) wrote :

Would we allow an application into the archive/pool that fails to work as it should? Regardless of "WHO" or "WHAT" we can't have applications that don't work, excuses like "It's the kernel's fault." are meaningless to users.

If there is a kernel that fails to provide an API as it's supposed to and that kernel becomes part of a Stable Release, then the Documentation is wrong because it doesn't reflect the API of a Stable Release.

Clint Byrum (clint-fewbar) wrote :

We fix kernels in stable releases when they don't conform to the documentation, not the other way around.

Mike Mestnik (cheako) wrote :

Is there any particular reason for not fixing both, ignoring a known failure mode?

Adam Conrad (adconrad) on 2013-01-28
Changed in coreutils (Ubuntu):
assignee: nobody → Adam Conrad (adconrad)
status: New → In Progress
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in coreutils (Ubuntu Precise):
status: New → Confirmed
Andrea R. (andreran) wrote :

For everyone interested in a quick workaround to the "tail -f" issue, one can use (quoting) «the deliberately undocumented ---disable-inotify option that was introduced in coreutils-7.6». Yes, with three dashes.

http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=583198#20

Mike Mestnik (cheako) wrote :

That sounds like a solution; knowing is half the battle.

A proper fix would be for the option to be enabled when needed, but the above is enough to close this bug or at least lessen the importance. The live CD can make use of this flag in all its scripts, for example.

Bryan Quigley (bryanquigley) wrote :

It seems like ---disable-inotify was added for testing purposes only, hence the "do not document". I think it would be a bad idea to make liveCD scripts dependent on it. As a workaround for me running tail -f command on a liveCD, that sounds fine.

Mike Mestnik (cheako) wrote :

I don't see the logic in having tail depend on this known-buggy kernel feature, while avoiding dependencies on non-documented features of tail on a liveCD.

...It makes more sense to me that we depend on a nominally useful feature add instead of depending on code that always works and perhaps luck. IMHO there will always be a few bugs and there should always be by-passes to allow for working around them. This feature add is a good example, providing a secondary code path for unusual situations.

Scott Moser (smoser) on 2013-06-04
tags: added: overlayfs
removed: running-unity
Scott Moser (smoser) on 2013-08-19
description: updated
Li Jianguo (byjgli) wrote :

inotify support for overlayfs

Li Jianguo (byjgli) wrote :

This is a patch for overlayfs.v19

Li Jianguo (byjgli) wrote :

linux kernel 3.11 overlayfs.v19

The attachment "overlayfs_inotify.patch" seems to be a patch. If it isn't, please remove the "patch" flag from the attachment, remove the "patch" tag, and if you are a member of the ~ubuntu-reviewers, unsubscribe the team.

[This is an automated message performed by a Launchpad user owned by ~brian-murray, for any issues please contact him.]

tags: added: patch
Kamilion (kamilion) wrote :

This is currently causing some problems for me -- I'm using grub2 to boot a LiveISO with TORAM=Yes from a DiskOnModule to implement a hardware appliance.

It's *REALLY* aggravating that tail -f /var/log/<filename> does not work as expected.

Could someone bring that overlayfs inotify kernel patch up to date against vivid before alpha2 strikes so we can kill this annoying little problem once and for all for 15.04 and beyond?

Mike Mestnik (cheako) wrote :

Notice the comment #30 above: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/882147/comments/30

Also note that there is some confusion about known behavior of released kernels. I suggest that if a kernel is released and its select() call does not behave appropriately, every application that uses select() should be patched to detect whether it's running under such a kernel, though there are other opinions.

For example the kernel will detect known problems with FPUs/CPUs and handle these appropriately, such should be the way with applications and released kernel APIs.

Alf Gaida (agaida) wrote :

@Mike Mestnik: you can repeat yourself as often as you want to - this is a bug and should be addressed soon. There is absolutely no need to argue - there is a need for fixing faulty behaviour. Ok. might not be your thing, because may be complicated. Writing nonsense is not, therefore you prefer the last I guess.

Mike Mestnik (cheako) wrote :

@Alf Gaida Most ppl can't read, it's unfortunate. To communicate with ppl one must repeat oneself like an advertiser 100 times; no, even 1000 times is not enough. Computers read very well, and what's more, if you adjust your phrasing slightly enough times, eventually you'll get total obedience. Thus communicating with humans repetitively, even past the point where the audience thinks that your actions are argumentative, is the only way I've ever known to get even the most basic understanding across.

I'll keep pointing out that ---disable-inotify is a good solution and merits being made into a feature. Plus there are a multitude of ways to fix this and there is a deficit in only pursuing one single-mindedly.

The compliance of a single person is irrelevant.

Alf Gaida (agaida) wrote :

@Mike Mestnik: To be true, i have a record that shows that i'm not the nicest and most patient living person in the world - but you should look at the bigger picture. I've used aufs for ISOs long term - with the latest development in debian aufs does more harm than good - so the logical switch is to overlayfs. And the overlay should give feedback about changed files the same way as the most other filesystems do - i think thats called the principle of least surprise.

@Li Jianguo (byjgli): thanks for the patch, it does what it should so far. I should have a closer look at it, but it applies to a 3.18 kernel without problems and quilt refresh does its work. So far it works well with both tail and Qt 5.3.2 too - big thanks, that saved me days.

Jason Meredith (signull) wrote :

According to some here: https://github.com/phusion/baseimage-docker/issues/198 this issue with overlayfs is not allowing CROND to run in a docker container. I would really appreciate some eyes on this issue if that is the case.

Serge Hallyn (serge-hallyn) wrote :

I've seen reports that this is fixed in 4.10?

Serge Hallyn (serge-hallyn) wrote :

Nope, tail -f is still broken at least in 4.12.

Bryan Quigley (bryanquigley) wrote :

tail -f /var/log/syslog worked for me with 17.10 dev | kernel 4.12. (Tested by turning network on and off).

Jasper Koolhaas (morgana2313) wrote :

tail -f works because it checks the file every sleep-interval seconds (defaults to 1 second).
tail -f -s 10 /var/log/syslog gives a burst of new lines every 10 seconds.

With this test I see 10-seconds bursts with linux 4.10.0 and 4.13.0 indicating this bug isn't fixed.
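The polling behaviour described above, where new lines arrive in bursts once per sleep interval, can be mimicked with a small Python sketch. The names read_appended and follow are hypothetical, and this is a simplified model of polling in general, not a description of tail's actual code.

```python
import os
import time


def read_appended(path, offset):
    """Return (new_bytes, new_offset) for data appended to `path` since `offset`."""
    size = os.stat(path).st_size
    if size < offset:
        offset = 0  # file was truncated, start again from the top
    if size == offset:
        return b"", offset
    with open(path, "rb") as handle:
        handle.seek(offset)
        data = handle.read(size - offset)
    return data, offset + len(data)


def follow(path, sleep_interval=1.0):
    """Yield appended data forever, checking once per `sleep_interval` seconds."""
    offset = os.stat(path).st_size  # like tail -f, skip what is already there
    while True:
        data, offset = read_appended(path, offset)
        if data:
            yield data  # everything written since the last check, in one burst
        time.sleep(sleep_interval)
```

With sleep_interval=10, everything written during a 10-second window comes out together, which matches the bursts reported with "tail -f -s 10".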

Bryan Quigley (bryanquigley) wrote :

Good to know. If tail works around this by default is there any other reason to keep the coreutils task open?
