Can't recover from FTS index corruption

Reported by Daniel Nögel on 2011-01-21
This bug affects 8 people
Affects                         Importance  Assigned to
Synapse                         Undecided   Unassigned
Zeitgeist Extensions            Medium      Mikkel Kamstrup Erlandsen
zeitgeist-extensions (Ubuntu)   Undecided   Unassigned

Bug Description

The Zeitgeist FTS extension should be able to automatically recover from corrupted indexes by rebuilding the index from the Zeitgeist log.
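A minimal sketch of the requested behaviour, not the extension's actual code. `IndexCorruptError` is a stand-in for `xapian.DatabaseCorruptError` so the example runs without Xapian, and `search_fn` and `rebuild_fn` are hypothetical callables for "query the index" and "rebuild the index from the Zeitgeist log".

```python
class IndexCorruptError(Exception):
    """Stand-in for xapian.DatabaseCorruptError (assumption for this sketch)."""


def search_with_recovery(search_fn, rebuild_fn, query):
    """Run search_fn(query); on corruption, rebuild once and retry.

    search_fn and rebuild_fn are hypothetical hooks, not Zeitgeist APIs.
    """
    try:
        return search_fn(query)
    except IndexCorruptError:
        # The index is unreadable: rebuild it from the event log, then
        # retry exactly once so a persistent failure still surfaces.
        rebuild_fn()
        return search_fn(query)
```

Retrying only once is a deliberate choice in this sketch: if the rebuilt index is still corrupt, the second error propagates to the caller instead of looping.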

-- Original report ---------------------

Hello,

For a few days now, Synapse has not shown any folders. Of course, I made sure that the directory plugin is enabled. I also disabled it, restarted Synapse and enabled it again, but it still did not work. The config.json file seems to be in order, too.

Starting Synapse from the command line, I get the following output each time I press any letter:

** (synapse:26138): WARNING **: zeitgeist-plugin.vala:574: Zeitgeist search failed: Remote Exception invoking org.gnome.zeitgeist.Index.Search() on /org/gnome/zeitgeist/index/activity at name org.gnome.zeitgeist.Engine: org.freedesktop.DBus.Python.xapian.DatabaseCorruptError: Traceback (most recent call last):
  File "/usr/lib/pymodules/python2.6/dbus/service.py", line 702, in _message_cb
    retval = candidate_method(self, *args, **keywords)
  File "/usr/share/zeitgeist/_zeitgeist/engine/extensions/fts.py", line 123, in Search
    offset, count, result_type)
  File "/usr/share/zeitgeist/_zeitgeist/engine/extensions/fts.py", line 253, in search
    self.QUERY_PARSER_FLAGS)
DatabaseCorruptError: Data ran out unexpectedly when reading posting list.
 org.freedesktop.DBus.Python.xapian.DatabaseCorruptError Traceback%20%28most%20recent%20call%20last%29%3A%0A%20%20File%20%22%2Fusr%2Flib%2Fpymodules%2Fpython2.6%2Fdbus%2Fservice.py%22%2C%20line%20702%2C%20in%20_message_cb%0A%20%20%20%20retval%20%3D%20candidate_method%28self%2C%20%2Aargs%2C%20%2A%2Akeywords%29%0A%20%20File%20%22%2Fusr%2Fshare%2Fzeitgeist%2F_zeitgeist%2Fengine%2Fextensions%2Ffts.py%22%2C%20line%20123%2C%20in%20Search%0A%20%20%20%20offset%2C%20count%2C%20result_type%29%0A%20%20File%20%22%2Fusr%2Fshare%2Fzeitgeist%2F_zeitgeist%2Fengine%2Fextensions%2Ffts.py%22%2C%20line%20253%2C%20in%20search%0A%20%20%20%20self.QUERY_PARSER_FLAGS%29%0ADatabaseCorruptError%3A%20Data%20ran%20out%20unexpectedly%20when%20reading%20posting%20list.%0A
** (synapse:26138): DEBUG: zeitgeist-plugin.vala:580: ZG search took 19 ms
SynapseHybridSearchPlugin found 0 extra uris (ZG returned 0)

I've tested synapse 0.2.2.1, 0.2.2.2 from your download section and I also tested the recent version from branch. The folders I tried to find were ordinary folders within the home-partition (like "Documents", "Music", "Desktop", "Videos").

Sincerely,

Daniel

Michal Hruby (mhr3) wrote :

It seems your search index got corrupted somehow. To fix it, open a terminal and run these commands:

zeitgeist-daemon --quit
cd ~/.local/share/zeitgeist/
rm -rvf fts.index/
zeitgeist-daemon &

Afterwards your database will be reindexed. Wait a while for this to finish (there will be lots of output in the terminal), then try again.
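For anyone who wants to script the workaround, the same steps might look roughly like this in Python. This is only a sketch: the function name `reset_fts_index` and its `run` / `spawn` parameters are made up for illustration (they let the steps be exercised without a real daemon), while the commands and the `fts.index` path come from the instructions above.

```python
import shutil
import subprocess
from pathlib import Path

# Default location taken from the commands above.
DEFAULT_DATA_DIR = Path.home() / ".local" / "share" / "zeitgeist"


def reset_fts_index(data_dir=DEFAULT_DATA_DIR,
                    run=subprocess.call, spawn=subprocess.Popen):
    """Stop the daemon, delete fts.index, and start the daemon again.

    Returns True if an index directory existed and was removed.
    `run` and `spawn` are injectable (hypothetical parameters) for testing.
    """
    run(["zeitgeist-daemon", "--quit"])       # zeitgeist-daemon --quit
    index_dir = Path(data_dir) / "fts.index"
    removed = index_dir.is_dir()
    if removed:
        shutil.rmtree(index_dir)              # rm -rf fts.index/
    spawn(["zeitgeist-daemon"])               # zeitgeist-daemon &
    return removed
```

Calling `reset_fts_index()` with no arguments would act on the real index under `~/.local/share/zeitgeist/`, so try it on a copy first if in doubt.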

Changed in synapse-project:
status: New → Incomplete

Thanks, that worked for me.

Sincerely,

Daniel

Michal Hruby (mhr3) on 2011-02-10
Changed in synapse-project:
status: Incomplete → Invalid
Michal Hruby (mhr3) wrote :

FTS should probably catch this and try to do something about it...

Michal Hruby (mhr3) on 2011-02-16
summary: - Synapse don't show any folders anymore
+ FTS index corruption

The solution didn't help me. After the latest Synapse update, it stopped locating a folder called 'systems' when I type 'systems' (the folder's location is ~/Systems). It still doesn't find it after wiping and rebuilding the index. This could be a different issue, I'm not sure, but it is annoying.

Denis Prost (denis-prost) wrote :

I'm not sure this is exactly the same bug (let me know if I should report a new one), but when I search in Synapse using "locate", the results list shows every file in that folder but not the folder itself, even though its name contains the search term. This is with Synapse 0.2.6.

I just pushed some fixes to the Zeitgeist FTS extension that definitely plug some places where we could corrupt the index because of threading issues. That said, I still think it should be a high priority for the FTS extension to recover gracefully from a corrupted index, so I am repurposing this bug for that.

summary: - FTS index corruption
+ Can't recover from FTS index corruption
Changed in zeitgeist-extensions:
assignee: nobody → Mikkel Kamstrup Erlandsen (kamstrup)
importance: Undecided → Medium
status: New → Triaged
description: updated
Changed in zeitgeist-extensions:
status: Triaged → Fix Committed

If you have at least r69 of the fts extension you can test this by killing zeitgeist, then running:

 $ head /dev/urandom > ~/.local/share/zeitgeist/fts.index/postlist.DB

Then restart Zeitgeist. This should cause a reindex. You can verify this by looking in ~/.cache/zeitgeist/daemon.log, where you should see a line like:

  WARNING - zeitgeist.fts - Full text index corrupted: 'Expected block 37 to be level 1, not 49'. Rebuilding index.
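The check can also be scripted. The sketch below assumes the warning text shown above stays stable across versions; the helper names `corrupt_file` and `find_rebuild_warnings` are made up for illustration, and the 4096-byte default is an arbitrary choice rather than what `head /dev/urandom` would write. Only run `corrupt_file` against a real index while Zeitgeist is stopped.

```python
import os
from pathlib import Path

REBUILD_MARKER = "Full text index corrupted"  # text from the log line above


def corrupt_file(path, nbytes=4096):
    """Overwrite `path` with random bytes (nbytes is an arbitrary choice)."""
    Path(path).write_bytes(os.urandom(nbytes))


def find_rebuild_warnings(log_path):
    """Return the log lines that report a corrupted index being rebuilt."""
    with open(log_path, encoding="utf-8", errors="replace") as log:
        return [line.rstrip("\n") for line in log
                if REBUILD_MARKER in line and "Rebuilding index" in line]
```

An empty list from `find_rebuild_warnings` after restarting the daemon would suggest the corruption was not detected, or that the log lives somewhere else on that system.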

Changed in zeitgeist-extensions:
milestone: none → fts-0.0.12
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package zeitgeist-extensions - 0.0.12-0ubuntu1

---------------
zeitgeist-extensions (0.0.12-0ubuntu1) oneiric; urgency=low

  * New upstream release:
    - fts can SIGSEGV ZG during reindex (LP: #617309)
    - zeitgeist-daemon crashed with RuntimeError in _check_index():
      basic_string::assign (LP: #839740)
    - Blowing Xapian max term length corrupts index (LP: #843668)
    - Can't recover from FTS index corruption (LP: #705944)
 -- Didier Roche <email address hidden> Thu, 08 Sep 2011 11:25:16 +0200

Changed in zeitgeist-extensions (Ubuntu):
status: New → Fix Released