Intermittent file change notification dropouts (~/Desktop, possibly more general)

Bug #13428 reported by Jan
Affects         Status        Importance  Assigned to  Milestone
gamin (Ubuntu)  Fix Released  High        Martin Pitt

Bug Description

Today I noticed files with .gz or .bz2 suffix do not disappear from the desktop
when dragged to the wastebasket. They show up in every Nautilus view of the
desktop, but do not exist for real. One needs to log off to get rid of them.
Other files (.txt) seem to work OK.

Running up to date hoary last updated March 2. Both these files are associated
with File-roller.

http://bugzilla.gnome.org/show_bug.cgi?id=172365

Revision history for this message
Sebastien Bacher (seb128) wrote :

do you use the trash on the desktop or the applet ? what version of gamin do you
use ?

Revision history for this message
Daniel Robitaille (robitaille) wrote :

I have experienced this bug 3 times: once last week, and 2 times tonight while
trying to reproduce it. I still haven't been able to come up with a good
example to prove this bug; but essentially tonight I was randomly creating files
on the desktop, renaming them, and then deleting them; and twice they stayed on
the desktop after a delete.

But the one thing I'm sure of: it is not related to the trash applet. In one
of the cases tonight I removed the file using the command line (i.e., "rm
~/Desktop/test.gz"), and the file was deleted from my filesystem, but its icon
remained on my desktop. And obviously that's why they finally disappear only
after a logout/login since they are not physically present in ~/Desktop
directory after being removed or moved to the trashcan.

Revision history for this message
Jan (debian-gepro) wrote :

(In reply to comment #1)
I use trash applet and my gamin version is 0.0.24-ubuntu1

Revision history for this message
Sebastien Bacher (seb128) wrote :

gamin 0.0.24 is bugged, could you try with 0.0.25 ?

Revision history for this message
Jan (debian-gepro) wrote :

Yes, I can reproduce the same problem with gamin 0.0.25.

It doesn't happen every time; one has to download and delete several archives
from the desktop before seeing this flaw.

I do confirm it has nothing to do with the trash applet. Deleting the files from
the command line has the same effect.

Revision history for this message
Daniel Robitaille (robitaille) wrote :

(In reply to comment #5)
> Yes, I can reproduce the same problem with gamin 0.0.25.

After many days without experiencing this bug I was finally able to reproduce it
once tonight on an up-to-date hoary system. (gamin 0.0.25-0ubuntu1)

Steps that I did to make it happen once: from the command line I created in my
home directory a file named patch.tar.gz containing a few files. Using Nautilus
I dragged that file to my desktop. I then clicked on it to see its content
(using file-roller). Then from the command line I removed it from the Desktop
(cd ~/Desktop; rm patch.tar.gz). The file was gone from the ~/Desktop
directory, but the icon was still on my screen. I then killed the Nautilus
process, which immediately removed the icon from the desktop.

Subsequent attempts (half-a-dozen times) failed to reproduce that bug more than
once.

Revision history for this message
Sebastien Bacher (seb128) wrote :

gamin issue, reassigning

Revision history for this message
Daniel Robitaille (robitaille) wrote :

I haven't been able to reproduce this bug since the upgrade to gamin
0.0.26-0ubuntu1. I don't know if it means it has been solved with this new
upstream version, or simply that I haven't been lucky enough yet to
re-experience the bug.

Revision history for this message
Matt Zimmerman (mdz) wrote :

I think this is supposed to have been fixed; Martin or Sebastien, can you confirm?

Revision history for this message
Sebastien Bacher (seb128) wrote :

That works fine here. The desktop issues are supposed to be fixed with 0.0.25
according to upstream. People having issues, is that on a laptop ?

Revision history for this message
Daniel Robitaille (robitaille) wrote :

(In reply to comment #10)
> That works fine here. The desktop issues are supposed to be fixed with 0.0.25
> according to upstream. People having issues, is that on a laptop ?

I have experienced this bug on an x86 desktop.

Revision history for this message
Martin Pitt (pitti) wrote :

I can't reproduce the bug with deleting a file, but with the opposite: If I copy
a file to ~/Desktop from the shell (or by Save-As in Firefox) the file does not
appear on the Desktop. It does appear if I drag&drop from a Nautilus window, though.

Revision history for this message
Daniel Robitaille (robitaille) wrote :

(In reply to comment #12)
> I can't reproduce the bug with deleting a file, but with the opposite: If I copy
> a file to ~/Desktop from the shell (or by Save-As in Firefox) the file does not
> appear on the Desktop. It does appear if I drag&drop from a Nautilus window,
> though.

it does that all the time or only occasionally? I just tried it a few times
and it works fine on my machine. And could Bug 14257 be the same bug as this
one? They both sound somewhat similar.

Revision history for this message
Martin Pitt (pitti) wrote :

(In reply to comment #13)

> it does that all the time or only occasionally?

I tried about 10 times, it never worked for me.

Revision history for this message
Martin Pitt (pitti) wrote :

(In reply to comment #14)
> (In reply to comment #13)
>
> > it does that all the time or only occasionally?
>
> I tried about 10 times, it never worked for me.

... no matter if the file is a .tar.gz, a .txt, or any other file type.

Revision history for this message
Jeff Waugh (jdub) wrote :

I've been looking at this quite a bit recently. Unfortunately, the upstream
maintainer is not hugely responsive until you can prove that a bug exists. Even
the Red Hat desktop team are not getting traction from him on this one. ;-)

Try this: killall nautilus gam_server, then start doing operations in your
Desktop folder or any other folder. Everything should work as expected. At some
stage, during normal usage, an event happens that stops gamin seeing or
communicating changes (not sure which). The bugs are filed about the desktop,
because that's where we see most file change notification take place.

Revision history for this message
Martin Pitt (pitti) wrote :

(In reply to comment #16)
> Try this: killall nautilus gam_server, then start doing operations in your
> Desktop folder or any other folder. Everything should work as expected. At some
> stage, during normal usage, an event happens that stops gamin seeing or
> communicating changes (not sure which). The bugs are filed about the desktop,
> because that's where we see most file change notification take place.

Indeed, I just tried this. After killall gam_server, cp foo.txt Desktop/ and rm
Desktop/foo.txt work fine three times, then nothing happens any more until I
restart gam_server again. Sometimes it takes three cp/rm rounds, sometimes 5,
sometimes it doesn't work right from the beginning. I'd say this is pretty well
reproducible...

Revision history for this message
Sebastien Bacher (seb128) wrote :

Martin, could you run nautilus with GAM_DEBUG ?

- gnome-session-remove nautilus
- GAM_DEBUG=1 nautilus

and see if there is something weird happening here

Revision history for this message
Martin Pitt (pitti) wrote :

(In reply to comment #18)
> Martin, could you run nautilus with GAM_DEBUG ?
>
> - gnome-session-remove nautilus
> - GAM_DEBUG=1 nautilus
>
> and see if there is something weird happening here

Done. The key difference is the presence of "Failed to find request 1" logs if
it doesn't work. I attach a couple of log files:

nautilus-desktopcp-fails.txt: log when doing "cp kernelsec.txt Desktop/" and the
icon doesn't appear
nautilus-desktopcp-works[2].txt: two different logs when doing the same cp and
the icon appears
nautilus-desktoprm-fails[2].txt: two different logs when doing "rm
Desktop/kernelsec.txt" and the icon doesn't vanish
nautilus-desktoprm-works.txt: log when doing the same rm and the icon vanishes

I hope this helps somehow. Please let me know if you need any additional logs.
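
A quick way to compare such logs is to count the "Failed to find request" lines
in each one. The following is a minimal sketch, assuming the GAM_DEBUG output
was saved to a file. The function name count_failed_requests and the file name
nautilus-debug.log are hypothetical; only the message text comes from the logs
described above.

```shell
#!/bin/sh
# Count "Failed to find request N" lines in a saved GAM_DEBUG log.
# Assumption: the debug output was captured to a file beforehand, e.g.
#   GAM_DEBUG=1 nautilus > nautilus-debug.log 2>&1
# (nautilus-debug.log is a hypothetical name).
count_failed_requests() {
    # grep -c prints the number of matching lines; it exits non-zero
    # when there are none, so "|| true" keeps the function usable
    # under "set -e".
    grep -c 'Failed to find request [0-9][0-9]*' "$1" || true
}
```

Running count_failed_requests on a log from a failing run and on one from a
working run should give a non-zero count only for the failing one, if the
observation above holds.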

Revision history for this message
Martin Pitt (pitti) wrote :

Created an attachment (id=1825)
nautilus-desktopcp-fails.txt

Revision history for this message
Martin Pitt (pitti) wrote :

Created an attachment (id=1826)
nautilus-desktopcp-works2.txt

Revision history for this message
Martin Pitt (pitti) wrote :

Created an attachment (id=1827)
nautilus-desktopcp-works.txt

Revision history for this message
Martin Pitt (pitti) wrote :

Created an attachment (id=1828)
nautilus-desktoprm-fails2.txt

Revision history for this message
Martin Pitt (pitti) wrote :

Created an attachment (id=1829)
nautilus-desktoprm-fails.txt

Revision history for this message
Martin Pitt (pitti) wrote :

Created an attachment (id=1830)
nautilus-desktoprm-works.txt

Revision history for this message
Sebastien Bacher (seb128) wrote :

*** Bug 14615 has been marked as a duplicate of this bug. ***

Revision history for this message
Jeff Waugh (jdub) wrote :

*** Bug 14257 has been marked as a duplicate of this bug. ***

Revision history for this message
Martin Pitt (pitti) wrote :

For the record, I used this loop for testing:

   while true; do echo Hallo > Desktop/baz; sleep 1; rm Desktop/baz; sleep 1; done

This reproduces the bug reliably with Nautilus, i.e. the icons stop
appearing/disappearing pretty fast. However, with test_gam as client everything
works well; I get a never-ending stream of Created/Deleted notifications.

Upstream says that this is a race condition between nautilus (which registers
multiple watches onto a file and accesses gamin concurrently), gnome-vfs, and
gamin. He knows about the Desktop icon test above, maybe he can reproduce it.
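
For comparing how many rounds it takes on different machines, a bounded variant
of the loop above may be handier than the endless one. This is only a sketch:
the function name churn and its arguments are made up for illustration, and the
file name baz and the 1 second delay are taken from the loop above.

```shell
#!/bin/sh
# churn DIR ROUNDS [DELAY]
# Create and remove DIR/baz ROUNDS times, sleeping DELAY seconds
# (default 1) after each step, and report each completed round.
churn() {
    dir=$1
    rounds=$2
    delay=${3:-1}
    i=0
    while [ "$i" -lt "$rounds" ]; do
        echo Hallo > "$dir/baz"
        sleep "$delay"
        rm "$dir/baz"
        sleep "$delay"
        i=$((i + 1))
        echo "round $i done"
    done
}
```

Calling churn ~/Desktop 10 while watching the desktop shows after which round
the icon stops appearing or disappearing.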

Revision history for this message
Martin Pitt (pitti) wrote :
Revision history for this message
Fabio Massimo Di Nitto (fabbione) wrote :

Hi guys,
 I have been testing this bug and experienced the following:

- machine A (very fast processor): cannot reproduce at all.
- machine B (very slow processor): reproducible in 1 Martin's loop.

attempting to isolate the problem, i noticed that killing either gam_server
or nautilus will restore the standard operations.

If i understand correctly Martin's previous message that with test_gam
everything is ok,
I would suspect more a problem in nautilus than the gamin itself or we are
facing 2 bugs here.

I get the impression that nautilus isn't fast enough in grabbing and processing
the info
coming from gamin and it stalls somewhere.

*REALLY WILD GUESS*:
I get the feeling that messages are possibly queued in gamin to ensure their
delivery to the clients.
Restarting gamin might as well flush the queue and start messaging
again to nautilus. A nautilus restart will be registered with a different
client-id and therefore
a new messaging queue in place. Everything works until the stall happens again
due to the slow processing.

I will try to look into gamin/nautilus code during the weekend, but i am not an
expert of any of those.

Fabio

Revision history for this message
Fabio Massimo Di Nitto (fabbione) wrote :

(In reply to comment #30)
> Hi guys,
> I have been testing this bug and experienced the following:
>
> - machine A (very fast processor): cannot reproduce at all.
> - machine B (very slow processor): reproducible in 1 Martin's loop.
>
> attempting to isolate the problem, i noticed that killing either gam_server
> or nautilus will restore the standard operations.
>
> If i understand correctly Martin's previous message that with test_gam
> everything is ok,
> I would suspect more a problem in nautilus than the gamin itself or we are
> facing 2 bugs here.
>
> I get the impression that nautilus isn't fast enough in grabbing and processing
> the info
> coming from gamin and it stalls somewhere.
>
> *REALLY WILD GUESS*:

Let's forget about this crap. I switched gam_server to run in debugging mode
together with nautilus.

The problem is located in the gamin remove_subscription code or around that area.
While running the loop, all of a sudden gamin deregisters my entire home from
being polled.
This also explains Martin's error message from nautilus:
"Failed to find request N" (I get different values of N)

nautilus has no knowledge that gamin has stopped monitoring the dir, and
requests will fail since there is no data entry for it anymore.

Restarting nautilus will re-add /home/$user to the gamin list of dirs to poll.
Restarting gamin will do the same.

So now we need to understand why gamin decides that it is time to stop
monitoring $HOME.

Fabio

Note that I am talking about polling here, since gamin is using dnotify as its
monitoring backend.
I also tested the inotify variant (please do NOT do that unless you are 100%
sure of what you are doing) and the backend tends to survive longer (a few more
loops) but still dies in the end.

Revision history for this message
Fabio Massimo Di Nitto (fabbione) wrote :

(In reply to comment #31)
> (In reply to comment #30)

>
> The problem is located in the gamin remove_subscription code or around that area.
> While running the loop, all of a sudden gamin deregisters my entire home from
> being polled.
> This also explains Martin's error message from nautilus:
> "Failed to find request N" (I get different values of N)
>
> nautilus has no knowledge that gamin has stopped monitoring the dir, and
> requests will fail since there is no data entry for it anymore.
>

I have isolated the problem to the dnotify and inotify backends. Let's skip for
now the inotify one, since the kernel doesn't enable it by default.

I recompiled gamin with --dnotify-enable and everything works as it should
(basically gamin switches to polling mode), but there is some kind of regression:
gnome-panel uses a lot of CPU and takes a long time to start up.
I can't see any obvious error in the dnotify backend code yet.

Fabio

Revision history for this message
Fabio Massimo Di Nitto (fabbione) wrote :

> I recompiled gamin with --dnotify-enable and everything works as it should

I mean --disable-dnotify.

Revision history for this message
Fabio Massimo Di Nitto (fabbione) wrote :

Some more info:

I enabled gamin's debugging mode and noticed the following in the logs while
running the usual test case:

0 shows a remove signal, 1 an add signal.

gam_dnotify_file_handler /home/fabbione/Desktop/baz : 0
gam_dnotify_file_handler /home/fabbione/Desktop/baz : 1
gam_dnotify_file_handler /home/fabbione/Desktop/baz : 0
[here is missing an add event]
gam_dnotify_file_handler /home/fabbione/Desktop/baz : 0
[bum]

It looks like once there are 2 decrements with a missing add in between,
the server simply can't recover from that situation and
stops monitoring the entire directory.
None of the clients seems to care about re-adding it.

Fabio
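
Spotting the missing add events by eye gets tedious in a long log. The sketch
below flags every remove (0) that directly follows another remove for the same
path. It assumes the log lines look exactly like the gam_dnotify_file_handler
lines quoted above and that paths contain no spaces; the function name
find_missing_adds is made up for illustration.

```shell
#!/bin/sh
# find_missing_adds [LOGFILE...]
# Scan gam_server debug output (files or stdin) for lines of the form
#   gam_dnotify_file_handler PATH : FLAG
# where FLAG is 1 (add) or 0 (remove), and report each remove that
# follows another remove for the same PATH with no add in between.
find_missing_adds() {
    awk '
        $1 == "gam_dnotify_file_handler" {
            path = $2
            flag = $NF + 0          # force a numeric comparison
            if (flag == 0 && (path in last) && last[path] == 0)
                printf "line %d: remove without preceding add for %s\n", NR, path
            last[path] = flag
        }
    ' "$@"
}
```

Fed the four log lines above, it reports the fourth line, which is where the
add went missing.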

Revision history for this message
Fabio Massimo Di Nitto (fabbione) wrote :

Getting closer. Using the same scenario I noticed that the refcount is wrong:

Add:

gam_dnotify_file_handler /home/fabbione/Desktop/baz : 1
 not a dir using parent /home/fabbione/Desktop
Jumping to gam_dnotify_directory_handler_internal
Adding /home/fabbione/Desktop to dnotify
  found incremented refcount: 3

Removal:

gam_dnotify_file_handler /home/fabbione/Desktop/baz : 0
 not a dir using parent /home/fabbione/Desktop
Jumping to gam_dnotify_directory_handler_internal
Removing /home/fabbione/Desktop from dnotify
  found decremented refcount: 2

[Missing add]

Removal:

gam_dnotify_file_handler /home/fabbione/Desktop/baz : 0
 not a dir using parent /home/fabbione/Desktop
Jumping to gam_dnotify_directory_handler_internal
Removing /home/fabbione/Desktop from dnotify
  found decremented refcount: 1

Add:

gam_dnotify_file_handler /home/fabbione/Desktop/baz : 1
 not a dir using parent /home/fabbione/Desktop
Jumping to gam_dnotify_directory_handler_internal
Adding /home/fabbione/Desktop to dnotify
  found incremented refcount: 2

Remove:
gam_dnotify_file_handler /home/fabbione/Desktop/baz : 0
 not a dir using parent /home/fabbione/Desktop
Jumping to gam_dnotify_directory_handler_internal
Removing /home/fabbione/Desktop from dnotify
  found decremented refcount: 1

[another missing add!]

Remove:

 gam_dnotify_file_handler /home/fabbione/Desktop/baz : 0
 not a dir using parent /home/fabbione/Desktop
Jumping to gam_dnotify_directory_handler_internal
Removing /home/fabbione/Desktop from dnotify
deactivated DNotify for /home/fabbione/Desktop

goodbye Desktop
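
The arithmetic in the trace above can be replayed outside gamin. The following
is a toy model only, not gamin's code: the function name replay_refcount is
hypothetical, 1 stands for an add the backend saw, 0 for a remove, and a missed
add is simply left out of the event list.

```shell
#!/bin/sh
# replay_refcount START EVENT...
# Replay a sequence of add (1) and remove (0) events against a counter
# and stop as soon as it reaches 0, which is where the trace above shows
# "deactivated DNotify".
replay_refcount() {
    refcount=$1
    shift
    for ev in "$@"; do
        if [ "$ev" -eq 1 ]; then
            refcount=$((refcount + 1))
        else
            refcount=$((refcount - 1))
        fi
        if [ "$refcount" -le 0 ]; then
            echo "refcount $refcount: watch deactivated"
            return 0
        fi
        echo "refcount $refcount"
    done
}
```

The trace above starts from a refcount of 2 and sees add, remove, remove, add,
remove, remove, so replay_refcount 2 1 0 0 1 0 0 walks through 3, 2, 1, 2, 1
and deactivates the watch on the last remove.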

Revision history for this message
Fabio Massimo Di Nitto (fabbione) wrote :

Created an attachment (id=1989)
Workaround.

The proposed patch works around 2 issues in gamin. A long description will follow.

Revision history for this message
Fabio Massimo Di Nitto (fabbione) wrote :

We are experiencing 2 bugs here:

1) Both the dnotify and inotify backends lose some signals during normal operations.
   In my environment they miss some Add messages. This leads to bug number 2.

2) The dnotify and inotify backends remove directories from polling
   automatically when not requested to do so by the client.
   Given that we miss some signals (that could happen for several reasons, but
   is still wrong), the backend automatically removes a directory from being
   polled when the refcount is 0.
   It is the client's task to tell gam_server to stop polling a certain dir,
   since it might want to monitor an empty dir, for example. The server has no
   way to know that.

My patch solves 2) and as a consequence hides 1) from the user, who might notice
slower updates of the icons (since they miss a refresh cycle).

Fabio
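
The policy described in 2) can be modelled in a few lines. This is a toy model
under stated assumptions, not the attached patch: the function name
replay_patched and the "x" event for an explicit client request are made up for
illustration. The counter never drops below 0 and the watch is only removed
when the client asks for it.

```shell
#!/bin/sh
# replay_patched START EVENT...
# Events: 1 = add seen, 0 = remove seen, x = client explicitly cancels.
# Unlike an auto-deregistering counter, a refcount of 0 does not drop
# the watch; only an explicit client request does.
replay_patched() {
    refcount=$1
    shift
    for ev in "$@"; do
        case $ev in
            1) refcount=$((refcount + 1)) ;;
            0) if [ "$refcount" -gt 0 ]; then refcount=$((refcount - 1)); fi ;;
            x) echo "client cancelled: watch removed"
               return 0 ;;
        esac
        echo "refcount $refcount: still watching"
    done
}
```

With the same kind of event sequence that killed the watch before, for example
replay_patched 2 1 0 0 1 0 0 0, the counter bottoms out at 0 and the directory
stays watched, so later events are still delivered.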

Revision history for this message
Martin Pitt (pitti) wrote :

 gamin (0.0.26-0ubuntu3) hoary; urgency=low
 .
   * Added debian/patches/01_no_auto_deregister.patch:
     - Never deregister watches automatically (this should be done by clients
       anyway) never let the reference count drop below 0. gam_server misses
       some signals which causes watches to be dropped, this patch works around
       this.
     - Thanks a lot to Fabio Massimo di Nitto for the patch and his analysis!
     - Ubuntu bug #13428

This makes desktop notifications work nicely again. However, it is not a proper
fix (this is something for upstream).

Revision history for this message
Julien Olivier (julo) wrote :

I'm still able to reproduce this bug by doing the following:
 - create a new folder on your desktop
 - open it by double-clicking on it
 - close it
 - drag the folder to the trash applet

Result -> the folder remains on the desktop.

A few notes:
 - If you don't open the folder just before you put it in the trash applet, the
desktop is refreshed.
 - You don't have to create a new folder to reproduce this bug. But you have to
open and close the folder just before you drag it to the trash applet
 - The bug can only be reproduced with a folder located on the desktop, and by
dragging it to the trash applet.

Revision history for this message
Daniel Robitaille (robitaille) wrote :

(In reply to comment #39)
> - The bug can only be reproduced with a folder located on the desktop, and by
> dragging it to the trash applet.

I can reproduce it on my computer without the use of the trash applet; I just
need to delete the folder using rm from the command line to trigger the bug.

Revision history for this message
Sebastien Bacher (seb128) wrote :

gamin 0.1.0 works fine, I'm closing this bug. Feel free to reopen if you get
this bug again with this version

Revision history for this message
Martin Pitt (pitti) wrote :

*** Bug 17039 has been marked as a duplicate of this bug. ***
