bug search fails to find results when punctuation is adjacent to regular text in the document (e.g. '"from"', '<div>')
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Launchpad itself | Fix Released | Critical | Abel Deuring | | |
Bug Description
This bug has expanded a bit since it was originally filed in 2006. Here is the current state of things.
The Problem:
============
Doing a bug search can fail (turn up no results) despite the fact that the *exact* search string appears in the titles of some bugs that should be within the scope of the search.
Examples:
=========
* See bug 2753 (now a dupe of this). The thing being searched for was 'div' and the text indexed contained ' <div> '.
* See bug #360642, which is now marked as a dup of this. The
reporter says that searching for "from" failed to find results,
even though that's in the title of
https:/
Here is the title of that bug, using "/" as the delimiter since the
title itself contains both double quotes and parens: /Editing the
"From" field for the current email only (as text, not dropdown)/.
I re-tested on 2010-02-22, and searching for either "from" or
"From", with or without double quotes around it, still fails to
turn up that bug. When I did a search for "from" (with no double
quotes) with the "Across all project" radio button selected, I got
exactly one result: 508760. It seems very unlikely that there'd be
exactly one hit for a search on "from" :-).
* There are two bugs with the string "community-
their titles: bug #513608 (as of 2010-02-22 was in state
"confirmed", with summary "community-
use Launchpad to determine who is not a Canonical employee") and
bug #432742 (state "fix committed", with summary
"community-
input"). Both are in launchpad-
matter).
Anyway, searching for "community-
any results when done across all projects, nor in
"launchpad-
"launchpad".
Removing the ".py" and searching for "community-
launchpad-
"community" and "contributed" separately) and 484824 (which
contains "community" and "contributions" separately), but we still
don't get the bugs that have the exact match in their titles.
Meanwhile, searching for "community-
the ".py") with "Across all projects" checked results in 19 hits
(312766, 374090, 459701, 459701, 265028, 456301, 393407, 418469,
484824, 250402, 263554, 250402, 250402, 357358, 411358, 453775,
495391, 459701, 509094), none of which are the two I'm looking for.
A Possible Non-Example:
=======================
* In the original repro recipe for this bug, the reporter said "If I
search for 'sqlobject' on
https:/
despite this term being in the title of Bug #3096, which is
currently in 'confirmed' status. Interestingly, you can see this
bug in the full bug list."
But bug #3096 is in "launchpad-
searching for it in "launchpad" would work anyway, since
"launchpad" is (AFAICT) just a grab-bag temporary holding area
anyway. So it may be that the original bug report here was a
misunderstan
whose symptoms match those that the original report described!
Possible causes
===============
Tokenisation of terms is done both in-DB and in Python. If the two are mismatched, some terms simply cannot be searched for, because the supplied search query will never match the indexed terms.
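To illustrate the suspected cause, here is a minimal sketch of how two mismatched tokenisers can make a term unsearchable. This is not Launchpad's actual code: the functions `index_tokens`, `query_tokens` and `search` are hypothetical, and the second title is made up for the example (the first is the title quoted in the examples above). The assumption is that the indexing side splits on whitespace and keeps adjacent punctuation, while the query side keeps only word characters.

```python
import re


def index_tokens(text):
    """Hypothetical indexer: split on whitespace only, keeping punctuation."""
    return {tok.lower() for tok in text.split()}


def query_tokens(text):
    """Hypothetical query parser: keep only runs of word characters."""
    return {tok.lower() for tok in re.findall(r"\w+", text)}


def search(query, documents, tokenize_document):
    """Return the documents whose token set contains every query token."""
    wanted = query_tokens(query)
    return [doc for doc in documents if wanted <= tokenize_document(doc)]


titles = [
    'Editing the "From" field for the current email only (as text, not dropdown)',
    "Some text containing <div> markup",  # made-up title for illustration
]

# Mismatched tokenisers: the index holds '"from"' and '<div>',
# so the query tokens 'from' and 'div' can never match.
assert search("from", titles, index_tokens) == []
assert search("div", titles, index_tokens) == []

# Same tokeniser on both sides: the exact matches are found.
assert search("from", titles, query_tokens) == [titles[0]]
assert search("div", titles, query_tokens) == [titles[1]]
```

If the real cause is along these lines, the fix would be to make both sides agree on how punctuation is handled, rather than to change the search terms.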
Related branches
- Raphaël Badin (community): Approve
- Stuart Bishop (community): Approve (db)
- Francesco Banconi (community): Approve (code*)
- Diff: 607 lines (+311/-117), 4 files modified
  database/schema/patch-2209-24-1.sql (+125/-0)
  lib/lp/answers/stories/question-browse-and-search.txt (+1/-1)
  lib/lp/registry/doc/vocabularies.txt (+1/-2)
  lib/lp/services/database/doc/textsearching.txt (+184/-114)
- Abel Deuring (community): Approve
description: | updated |
Changed in malone: | |
assignee: | nobody → stub |
Changed in malone: | |
status: | Confirmed → Triaged |
summary: |
- Search for sqlobject bugs in launchpad product fails to find any results
+ bug search fails to find results despite exact search string being in
+ bug titles
description: | updated |
description: | updated |
description: | updated |
Changed in launchpad: | |
assignee: | Stuart Bishop (stub) → nobody |
tags: | added: search ubuntu-qa |
tags: | removed: lp-bugs |
Changed in launchpad: | |
importance: | Medium → High |
Changed in launchpad: | |
importance: | High → Low |
description: | updated |
Changed in launchpad: | |
assignee: | Graham Binns (gmb) → nobody |
description: | updated |
summary: |
- bug search fails to find results despite exact search string being in
- bug titles
+ bug search fails to find results when punctuation is adjacent to regular
+ text in the document (e.g. '"from"', '<div>')
Changed in launchpad: | |
assignee: | nobody → Abel Deuring (adeuring) |
status: | Triaged → In Progress |
tags: | added: bad-commit-15464 |
Changed in launchpad: | |
status: | Fix Committed → In Progress |
Changed in launchpad: | |
status: | In Progress → Fix Committed |
Changed in launchpad: | |
status: | Fix Committed → Fix Released |
I tried the search with 'sqlobject' and 'sqlobject.select' on https://launchpad.net/products/launchpad/+bugs and got some results, but bug 3096 didn't show up.
Weird.