tracker inconsistencies with .tex files

Reported by Romano Giannetti on 2008-05-02
Affects: tracker (Ubuntu)
Importance: Undecided
Assigned to: Unassigned

Bug Description

Binary package hint: tracker

It seems that tracker, even after a re-index, fails to index (or find) contents in .tex files.
For example, after 2 days running on my home dir (10 G), these are the results:

(0)rukbat:~/research/ferrite% tracker-search "Jiles"
/home/romano/research/ferrite/biblio/Jill-Atherton-spice-model.pdf
/home/romano/research/ferrite/biblio/Brachtendorf.pdf
/home/romano/research/ferrite/biblio/verilog-a-mag.pdf
/home/romano/research/ferrite/biblio/simul-hyster-04014515.pdf
/home/romano/research/ferrite/tim1/Attic/bozza0.TEX
/home/romano/research/ferrite/tim1/Attic/bozza1.tex
/home/romano/research/ferrite/tim1/Attic/bozza1.tex.backup

But:

(0)rukbat:~/research/ferrite% grep -l Jiles */*.tex */*/*.tex | uniq | wc -l
24

So it finds just 3 of the 24 .tex files in this subdir alone (there are probably many more elsewhere).

Just to add a bit of data: my .tex files are in a mix of encodings: some are in ASCII, others in utf-8, others in iso-8859-1. But the problem seems to be unrelated to the encoding.
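To see exactly which files are missing, rather than only comparing counts, the two result lists can be diffed. This is a minimal sketch, assuming tracker-search prints one absolute path per line as in the transcript above; the function name missing_from and the list file names are made up for illustration.

```shell
#!/bin/sh
# Print the paths listed in file $1 that do not appear in file $2.
# Both files hold one path per line; order and duplicates do not matter.
missing_from() {
    expected_sorted=$(mktemp) || return 1
    indexed_sorted=$(mktemp) || return 1
    sort -u "$1" > "$expected_sorted"
    sort -u "$2" > "$indexed_sorted"
    # comm -23 keeps only the lines unique to the first (sorted) input.
    comm -23 "$expected_sorted" "$indexed_sorted"
    rm -f "$expected_sorted" "$indexed_sorted"
}

# Possible use (left commented out so sourcing this block has no side effects).
# "$PWD" makes grep print absolute paths, to match the tracker-search output:
#   grep -l "Jiles" "$PWD"/*/*.tex "$PWD"/*/*/*.tex > grep-hits.txt
#   tracker-search "Jiles" > tracker-hits.txt
#   missing_from grep-hits.txt tracker-hits.txt
```

Every path printed is a file that grep matched but that the index did not return.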

Felipe Figueiredo (philsf) wrote :

Additional information regarding upstream progress:

http://osdir.com/ml/freedesktop.tracker/2006-12/msg00039.html
http://<email address hidden>/msg02976.html

Changed in tracker:
status: New → Confirmed
Felipe Figueiredo (philsf) wrote :

Workaround: install the package 'untex' available in universe. Please confirm if this works for you.

I installed untex, I'll report back.

On the other hand, I checked the tracker filters, and I have:

(0)rukbat:~/research/ferrite/imtc07-journal% cat /usr/lib/tracker/filters/text/x-tex_filter
#!/bin/sh
UNTEX=`which untex`
if [ -n "$UNTEX" ]; then nice -n19 $UNTEX "$1" > "$2";
else cat "$1" > "$2";
fi

So untex being absent should not have mattered: without it, the filter falls back to cat and passes the whole file through.
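The fallback in that filter can be exercised by hand. The sketch below wraps the same logic in a function so that it can be run on a single file; it is not the installed script itself, and UNTEX_CMD is a hypothetical override added only to make the no-untex branch easy to trigger (tracker does not read it).

```shell
#!/bin/sh
# Same decision as the x-tex_filter shown above, wrapped in a function:
#   $1 = input .tex file, $2 = output text file.
# UNTEX_CMD is a hypothetical override for testing, not a tracker setting.
tex_filter() {
    untex_bin=$(command -v "${UNTEX_CMD:-untex}" 2>/dev/null || true)
    if [ -n "$untex_bin" ]; then
        # untex found: strip the LaTeX commands at low priority.
        nice -n 19 "$untex_bin" "$1" > "$2"
    else
        # No untex: copy the raw LaTeX source through unchanged.
        cat "$1" > "$2"
    fi
}
```

With no untex on the PATH, the output file is a byte-for-byte copy of the input, so a word such as "Jiles" should still be present in whatever gets handed to the indexer, if this filter is what gets run.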

Checking something more, the .tex files that are found are in the
category "Documents", not "Text". Dunno if this can help pinpointing the
problem...

(notice: I _do_ have some tex indexing. The problem is that it is
missing a lot of files.)

Felipe Figueiredo (philsf) wrote :

I too had some but not all tex files indexed. After installing untex, and giving it a while to catch up, I can now consistently find my .tex files searching by name or by content.

Since untex is already registered as a dependency for tracker (recommends), I'm concluding this is not a bug at all (nothing needs to be changed). I don't think latex users are that common to warrant an automatic installation of untex with tracker (not that it would waste that much space, but still).

Should we set this report to invalid, if you get the expected results from tracker after installing untex? I guess the mere existence of this report will serve as documentation for nextcomers.

> Should we set this report to invalid, if you get the expected results
> from tracker after installing untex? I guess the mere existence of this
> report will serve as documentation for nextcomers.

Maybe. I will be without internet for the next two days, then I will
report. If all is starting to work, setting this as invalid could be the
Right Thing to do.

You need untex installed, and then you need to reindex.

Not a bug, but all dependent apps must be installed prior to indexing, or a reindex performed after installing them.
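A quick way to check for such helpers before starting a re-index is sketched below. The function name missing_helpers is made up for illustration, and the list of tools to check is up to the user; untex is the one helper at issue in this report.

```shell
#!/bin/sh
# Print each named tool that cannot be found on the PATH, one per line.
# Prints nothing when every tool is available.
missing_helpers() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Possible use:
#   missing_helpers untex
```

An empty result means every named helper is installed; anything printed should be installed before re-indexing.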

Changed in tracker:
status: Confirmed → Invalid
Felipe Figueiredo (philsf) wrote :

Jamie, just for completeness of the issue, shouldn't tracker index .tex files even without untex (index with \TeX tags)? This doesn't seem to be happening.

I also have a wishlist item that might fit this bug report: when a .tex file is found, tracker should also ignore the satellite files with the same basename (.toc, .bbl, .log, etc). They are indexed as well, some as Text. I'll look for a complete list of extensions to be ignored.
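As an illustration of what "satellite files" means here, the sketch below lists the existing siblings of a .tex file that share its basename. It only covers the three extensions named above, the function name satellite_files is hypothetical, and it does not change what tracker indexes.

```shell
#!/bin/sh
# Print the existing satellite files of a .tex file, one per line.
#   $1 = path to the .tex file.
# Only the extensions named in this comment thread are checked.
satellite_files() {
    base=${1%.tex}
    for ext in toc bbl log; do
        [ -e "$base.$ext" ] && printf '%s\n' "$base.$ext"
    done
    return 0
}

# Possible use:
#   satellite_files ~/research/ferrite/tim1/Attic/bozza1.tex
```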

On Fri, 2008-05-02 at 20:20 +0000, Felipe Figueiredo wrote:
> Jamie, just for completeness of the issue, shouldn't tracker index .tex
> files even without untex (index with \TeX tags)? This doesn't seem to be
> happening.

Yes, that was my guess, too. The filter script installed should have
indexed the files nevertheless, so there's a bug here, untex installed
or not.

I didn't re-index after installing untex to see if the current state of
my system could be useful. But well, I will try to re-index now, and
comment after that.

I think that the marking of the bug as invalid has been a bit hasty.

Well, sorry for being hasty, but the way tracker works is to spawn untex filename, which will always return nothing if you don't have untex installed.

I'm not sure I want to change that yet, as indexing unfiltered files might lead to garbage going into the index (that applies to all files that need filtering and not just tex ones).

It's likely therefore to remain as is (consider it a wontfix rather than invalid if you wish).

Emilio Pozuelo Monfort (pochu) wrote :

I'll add untex to tracker's recommends

Michael Biebl (mbiebl) wrote :

2008/5/5 Emilio Pozuelo Monfort <email address hidden>:
> I'll add untex to tracker's recommends

It already is (at least in the Debian package). Maybe you should just
sync the Ubuntu package.

Cheers,
Michael

--
Why is it that all of the instruments seeking intelligent life in the
universe are pointed away from Earth?

On Sun, 2008-05-04 at 22:52 +0000, Jamie McCracken wrote:
> Well, sorry for being hasty, but the way tracker works is to spawn untex
> filename, which will always return nothing if you don't have untex
> installed

I understand. I am sorry if I sounded harsh, but probably I am a bit
dense... I supposed that tracker was using the filter that I have in
/usr/lib/tracker/filters/text/x-tex_filter which I posted above, which
says that if there's no untex, the whole latex file is indexed [1]. But
if tracker calls untex directly, you're right.

> I'm not sure I want to change that yet, as indexing unfiltered files might
> lead to garbage going into the index (that applies to all files that
> need filtering and not just tex ones)

That leads to the question I started the thread with: why did I have just
3 latex files indexed when grep found 24 occurrences? I see how that
could have been 0 (no untex --- no index), but why the partial result?

> It's likely therefore to remain as is (consider it a wontfix rather than
> invalid if you wish)

Fair enough, it's your call. I have installed untex and my laptop is
re-indexing; will open another bug if the problem is still here, ok?

Romano

[1] BTW, maybe indexing the whole tex file is the right thing to do. In
the end, a .tex file should be treated like source code in any
language, so sometimes I'd like to search for that \darn macro...

Emilio Pozuelo Monfort (pochu) wrote :

Michael Biebl wrote:
> 2008/5/5 Emilio Pozuelo Monfort <email address hidden>:
>> I'll add untex to tracker's recommends
>
> It already is (at least in the Debian package). Maybe you should just
> sync the Ubuntu package.

Ah, thanks Michael, you're right. It's also in the Ubuntu package. The problem is
that tracker is installed by default as a dependency of the ubuntu-desktop
metapackage, and the Recommends of a metapackage's dependencies aren't installed
by default, so people end up without untex. That's why o3read was moved to
Depends (so that it's installed by default), but to do the same with untex we
will need to get it into main first.

Ok,

installed untex and checked, rebuilt the database (re-index), still no joy.
Look:

(0)rukbat:~/research/ferrite% wajig list untex
ii untex 9210-10 Remove LaTeX commands from input

(0)rukbat:~/research/ferrite% tracker-search "Jiles"
/home/romano/research/ferrite/biblio/Jill-Atherton-spice-model.pdf
/home/romano/research/ferrite/biblio/Brachtendorf.pdf
/home/romano/research/ferrite/biblio/verilog-a-mag.pdf
/home/romano/research/ferrite/biblio/simul-hyster-04014515.pdf
/home/romano/research/ferrite/tim1/Attic/bozza0.TEX
/home/romano/research/ferrite/tim1/Attic/bozza1.tex
/home/romano/research/ferrite/tim1/Attic/bozza1.tex.backup

7 files found, of which 3 are .tex files. But:

(0)rukbat:~/research/ferrite% grep "Jiles" */*.tex */*/*.tex | wc -l
29

29 matching lines (this grep has no -l, so it counts lines rather than files), *just* in the .tex files of that subdirectory.

So, untex is immaterial to this problem. The results are the same with and without it, as they should be, given the filter file...

Tracker is great for searching inside documents like ods or pdf; I suppose I should stay with grep for LaTeX... or should I open another report?
Thanks

BTW, I found this:

May 5 10:49:32 rukbat kernel: [28930.604863] tracker-extract[25166]: segfault at 6562696c ip b7dbb283 sp bfe9c80c error 4 in libc-2.7.so[b7d49000+149000]

in my logs during re-index. Dunno if it could be related.

Installed untex and re-indexed, still the same problem. Moreover, .tex files are akin to program sources, so they should be indexed as text (the same as .py or .c files).

Changed in tracker:
status: Invalid → New

Too few users of LaTeX, I suppose.

jaquemaire (jaquemaire) wrote :

Hello everyone,
I suffer from the same problem as Romano. I used to index my .tex files in winXP with Copernic Desktop, and I wish it could be done (with tracker) in ubuntu. Why can't tracker consider .tex files as text files?

regards,
alex

François Bouffard (fbouffard) wrote :

I'm also affected by the bug. I'm not sure if untexing the .tex files is wise anyway -- I recently found myself searching for LaTeX commands whose syntax I had long forgotten, knowing that I had used them in an old document.

Thomas Hotz (thotz) wrote :

Marking as confirmed then...

Changed in tracker (Ubuntu):
status: New → Confirmed