Blowing Xapian max term length corrupts index
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Zeitgeist Extensions | Fix Released | High | Mikkel Kamstrup Erlandsen | fts-0.0.12
zeitgeist-extensions (Ubuntu) | Fix Released | Undecided | Unassigned |
Bug Description
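Xapian has a (not very well documented) max term length of 245 bytes. See e.g. http://
For some reason this is not always gracefully handled inside Xapian and busting that limit may occasionally corrupt the index.
This is reproducible by indexing long URLs (at least 245 bytes long). We already had a cap at 2000 characters, but that was apparently not good enough, since it is far above the 245-byte term limit and counts characters rather than bytes.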
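To illustrate the kind of cap the description calls for, here is a minimal sketch of truncating a term to the 245-byte limit before it is handed to the indexer. It is not the code from the linked branch: the names `MAX_TERM_BYTES` and `cap_term` are hypothetical, and it assumes that any term prefix counts toward the limit and that terms are encoded as UTF-8.

```python
# Hypothetical helper, not taken from fts/fts.py.
MAX_TERM_BYTES = 245  # limit quoted in the bug description


def cap_term(term, prefix="", max_bytes=MAX_TERM_BYTES):
    """Return prefix + term, truncated so its UTF-8 encoding fits in max_bytes.

    Assumption: the prefix counts toward the limit, so it is included
    before measuring the length.
    """
    full = prefix + term
    raw = full.encode("utf-8")
    if len(raw) <= max_bytes:
        return full
    # Cut at the byte limit, then drop any multi-byte character that the
    # cut split in half so the result is still valid UTF-8.
    return raw[:max_bytes].decode("utf-8", errors="ignore")
```

The limit is measured in bytes, so a URL containing non-ASCII characters can exceed it with far fewer than 245 characters; that is why the sketch measures the encoded form rather than calling `len()` on the string.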
Related branches
- Zeitgeist Extensions: Pending requested 2011-09-07
  Diff: 114 lines (+41/-13)
  2 files modified
  fts/_tests.py (+1/-0)
  fts/fts.py (+40/-13)
Changed in zeitgeist-extensions:
assignee: nobody → Mikkel Kamstrup Erlandsen (kamstrup)
importance: Undecided → High
status: New → Triaged
Changed in zeitgeist-extensions:
status: Triaged → Fix Committed
Changed in zeitgeist-extensions:
milestone: none → fts-0.0.12
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote: #1
Changed in zeitgeist-extensions (Ubuntu):
status: New → Fix Released
Richard Boulton (richardboulton) wrote: #2
Note from a Xapian developer; the report here says: "For some reason this is not always gracefully handled inside Xapian and busting that limit may occasionally corrupt the index." We're not aware of any situation in which adding a term longer than the limit can result in a corrupted index, and I don't recall any such report. If you have a way to reproduce such a corruption, we'd be interested in it, so that we can fix it.
Richard: Sure - I never personally could reproduce this issue, but one user seemed to get it very reliably. I can check with him to see if we can narrow it down.
This bug was fixed in the package zeitgeist-extensions - 0.0.12-0ubuntu1
---------------
zeitgeist-extensions (0.0.12-0ubuntu1) oneiric; urgency=low
  * New upstream release:
    - fts can SIGSEGV ZG during reindex (LP: #617309)
    - zeitgeist-daemon crashed with RuntimeError in _check_index():
      basic_string::assign (LP: #839740)
    - Blowing Xapian max term length corrupts index (LP: #843668)
    - Can't recover from FTS index corruption (LP: #705944)
 -- Didier Roche <email address hidden>  Thu, 08 Sep 2011 11:25:16 +0200