File search should use Tracker when available

Bug #771513 reported by Alessio Bolognino on 2011-04-26
This bug affects 25 people
Affects: Tracker (Status: New; Importance: Undecided; Assigned to: Unassigned)
Affects: Zeitgeist Extensions (Importance: Undecided; Assigned to: Unassigned)
Affects: unity-lens-files (Importance: Undecided; Assigned to: Unassigned)

Bug Description

This is a feature request, I guess.
I think it's awkward that a search inside Nautilus (which uses Tracker, if installed) finds 8 results for a certain query while the Files lens finds nothing (even when the string is in the filename!). Zeitgeist is not even supposed to be a search tool, so why are we pretending it is?

Neither Zeitgeist nor Tracker is a "search tool". The first is a log of all your activities and the second is an RDF database. You can implement text searching on top of either of them.

For Unity we chose ZG for a few reasons, for example simpler deployment and a more controlled impact on the host system (filesystem crawling is a notoriously difficult problem on Linux, and while the Tracker devs have done an awesome job at it, it is still not unproblematic). Plus, ZG lends itself very naturally to the design requirements of temporal sorting and grouping of results.

Zeitgeist, by nature, only logs stuff you interact with. And for desktop file searching I think this is the right space to be searching. How often are you looking for something on your file system you don't know you have?

The problem as I see it is that ZG fails to log some things you might expect it to. But let's fix *that* instead of throwing the baby out with the bathwater.

Bonus note: The files lens does not use any privileged API or anything. Anyone who wants to can write a files lens backed by Tracker. And if this choice turns out to be popular and superior in every way I am sure that we'll evaluate it for replacing the current one.

Changed in unity-place-files:
status: New → Opinion
Alessio Bolognino (themolok) wrote :

For the record, I was merely suggesting taking advantage of the installed software: if Tracker is installed, the lens could retrieve results from both Zeitgeist and Tracker.

Mikkel Kamstrup Erlandsen (kamstrup) wrote :

Right. Things are deceptively simple, however. We'd need to
deduplicate the results between ZG and T, otherwise the user experience
would be pretty crappy. And since we'd query both of the sources
asynchronously, we'd have to remove the dupes from the slowest of
these.

Adding to that we don't want results to "jump around" when the slower
source arrives with results - in other words - slow results *must* be
appended to the end of the results list. This doesn't play well with
sorting by timestamp obviously (or any kind of sorting in fact) since
this breaks unless you insert results at the right point in time. So
these two issues would basically force us to always show results from
the slowest source - adding to that some pretty non-trivial logic to
keep the results sets sane.
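
The two constraints above (drop dupes from the slower source, and only ever append late results) can be sketched roughly as follows. This is a minimal illustration, not code from the lens: `Hit`, `MergedResults`, `add_batch` and `is_newest_first` are made-up names, and the two batches merely stand in for a fast and a slow source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    uri: str
    timestamp: int  # last-used time in seconds; illustrative field


class MergedResults:
    """Append-only result list shared by several sources (hypothetical)."""

    def __init__(self):
        self.rows = []
        self._seen = set()

    def add_batch(self, hits):
        """Append one source's hits, newest first, skipping known URIs."""
        for hit in sorted(hits, key=lambda h: h.timestamp, reverse=True):
            if hit.uri in self._seen:
                continue  # dupe: an earlier batch already delivered it
            self._seen.add(hit.uri)
            self.rows.append(hit)  # never insert above rows already shown


def is_newest_first(rows):
    """True if the whole list is ordered by descending timestamp."""
    return all(a.timestamp >= b.timestamp for a, b in zip(rows, rows[1:]))


merged = MergedResults()
# The fast source answers first (think Zeitgeist).
merged.add_batch([Hit("file:///a.txt", 300), Hit("file:///b.txt", 100)])
# The slow source answers later (think Tracker) and overlaps on b.txt.
merged.add_batch([Hit("file:///b.txt", 100), Hit("file:///c.txt", 200)])

uris = [hit.uri for hit in merged.rows]
print(uris)
print(is_newest_first(merged.rows))
```

The late hit for c.txt ends up below the older b.txt, so the list is stable on screen but no longer sorted by timestamp, which is the conflict described above.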

So from a technical POV I think it makes more sense to either:

 a) Integrate Tracker and Zeitgeist at a lower level making extra
logic in the lens daemon unnecessary
 b) Make ZG be aware of all the things that users expect to find
 c) Write a standalone Tracker powered files lens

I'm pretty sure there will be people working on all of these points to
some degree - perhaps you can be one of them?! :-) So time will tell
which one prevails. The user experience should be the top priority -
not which libraries or tools we use under the hood.

Seif Lotfy (seif) wrote :

We have a semi-working extension for FTS searching that uses Tracker instead of our FTS extension. It then sorts using the Zeitgeist criteria though. I will package it for our PPA soon...
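
As a rough picture of "search with Tracker, sort with Zeitgeist", here is a minimal sketch. It is not the extension mentioned above and calls neither project's API: `rank_hits`, the hit list and the `last_used` mapping are all hypothetical stand-ins for full-text hits and for last-interaction times taken from an activity log.

```python
def rank_hits(tracker_hits, last_used):
    """Order full-text hits by most recent use; never-used files go last.

    tracker_hits: URIs returned by the full-text index (assumed input).
    last_used: URI -> last interaction time in seconds (assumed input).
    """
    # sorted() is stable, so hits without a log entry keep the index's order.
    return sorted(tracker_hits, key=lambda uri: last_used.get(uri, 0), reverse=True)


tracker_hits = [
    "file:///old-report.pdf",
    "file:///notes.txt",
    "file:///draft.odt",
    "file:///archive.pdf",
]
last_used = {"file:///notes.txt": 200, "file:///draft.odt": 500}

ranked = rank_hits(tracker_hits, last_used)
print(ranked)
```

Files that were indexed but never opened still show up, just below everything the activity log knows about.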

Felipe Castillo (fcastillo.ec) wrote :

I was wondering if this new way of searching with Tracker and sorting with Zeitgeist has made any progress. I find it really annoying that I can't find anything in Ubuntu now.

I don't think anybody has considered what happens when you have a fresh install of your system, you restore your backup, and when you start looking for your files, it turns out you have none, but the truth is that you just haven't interacted with them yet. It's so annoying!!! Tracker is currently the only way out of this problem.

Please keep us posted if there's a PPA for us to install and try the search using Tracker. I'm looking forward to it.

Seif, have you gotten further with https://bugs.launchpad.net/unity-lens-files/+bug/646724/comments/28 or the packaging for the ppa you mentioned above?

First, I'll just repeat what is written in the bug description:
Zeitgeist is not even supposed to be a search tool, so why are we pretending it is?

Second, I refer to the following:

Mikkel Kamstrup Erlandsen (kamstrup) wrote:
"The user experience should be the top priority - not which libraries or tools we use under the hood."

And third, I will give some examples of why Tracker or any other indexing engine should be used with the Files lens:
 I'm an architect, something pretty rare in the open source world. In my work, I need to archive many files to protect myself against whatever a contractor might claim. This means that I have to write hundreds of emails and save them for possible future need. When necessary, which is quite often, I need the search engine to find specific words within those emails or PDFs. And this might involve some quite old projects which I have not accessed for months or years (normally, in between, I would have done a new installation of the system to keep up to date). This case is something that happens to almost any professional in the world, and it requires the contents to be indexed. Those contents have never been accessed ... you know they are there, but you can't start opening and reading 400 emails until you discover which one is the right one.

Having to install a completely separate system to perform the search is an absolute shame, as the Files lens has an amazing power to sort results (size, date modified, ...). Losing that power makes no sense at all.

Another situation would be:
I have many books, catalogs and technical sheets. As I have to look for construction materials, I need to just type, for example, "concrete" and get all documents related to or containing "concrete".

Another situation would be a user who has tons of music, videos, pictures, and files on the computer.
Normally what you expect of a search engine is to find things. So you know what you have on your computer but you can't remember where you saved it ... and you use the Files lens to find ... NOTHING ... as it's only looking at those things whose location you already know.

With the above I just mean that it's absolute nonsense to think that:

"Zeitgeist, by nature, only logs stuff you interact with. And for desktop file searching I think this is the right space to be searching. How often are you looking for something on your file system you don't know you have?"

I'm sorry, but everybody knows more or less what he has on the computer. The problem is knowing where it is when you need it once every few years and you are in a hurry. The problem is ... when are you going to watch that movie again that you have already seen? ... in 2 years?

I understand ZG to be very powerful, but it's far from doing the required job at the moment. I also understand the problems with using both, but at the moment, and until further development of ZG, I find Tracker much more useful for the user experience than ZG.

In my opinion the real solution is to get ZG to become an indexer as well.

Seif Lotfy (seif) wrote :

Yeah, we ported Zeitgeist to Vala, so I think I might port the Zeitgeist
Tracker extension that does searching via Tracker and sorting via
Zeitgeist...
However, be aware that even Tracker doesn't find everything. If, for example,
you have a folder in your home directory that isn't an XDG standard directory,
Tracker will not index it, even after interaction with the files inside it.
Indexing and monitoring are very, very expensive.

(In reply to the comment above, posted by Gabriel.G.Gordillo on Sun, Jan 8, 2012 at 5:48 PM.)

Thanks for such a quick and positive answer!!!

I'm aware that things are not going to be perfect from the beginning, but it's a first step in the right direction. Tracker working inside Zeitgeist sounds like it's going to be a great deal for everybody.

The only thing I have to add is that most people who really need indexing capabilities keep their information on external drives. It would be very useful to add an option to the upcoming Zeitgeist UI to configure the volumes and/or directories to be indexed.

It would be a way to integrate the current tracker-preferences into ZG-preferences. This would happen by default if Ubuntu shipped Tracker as a standard package, something that I'm far from being able to decide.
Otherwise it should not be a problem to keep them separate, although it would be interesting to allow for an extra tab with the Tracker preferences within ZG-preferences once Tracker is installed.

no longer affects: zeitgeist
TomasHnyk (sup) wrote :

BTW: it seems full-text search in the Dash can be achieved with recoll-lens: http://www.webupd8.org/2012/03/recoll-lens-full-text-search-unity-lens.html (for those looking for a workaround).
