Blueprints search is totally broken
Bug #1064996 reported by
Ara Pulido
This bug report is a duplicate of:
Bug #1025357: BluePrint searchtext= not returning correct results.
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Launchpad itself | Invalid | Undecided | Unassigned |
Bug Description
Right now searching for blueprints is completely broken. The results of a very simple search are not reliable at all.
Steps to reproduce:
1. Go to https:/
2. Order by blueprint name, so it is easier to spot the bug
3. Look at arm-m (for example) blueprints. Currently there are 4 starting with "arm-m"
4. Search by "arm-m" (or visit https:/
Expected results:
The user gets at least those 4 blueprints
Current results:
No blueprints are returned
tags: added: lp-blueprints specifications
Please search for "arm m" instead.
Your example shows a limit of the full text search features of Postgres:
Let's take the name "arm-m-xdeb-cross-compilation-environment". The full text index data for this name is:
select to_tsvector('arm-m-xdeb-cross-compilation-environment');
to_tsvector
-------------
'arm':2 'arm-m-xdeb-cross-compilation-environ':1 'compil':6 'cross':5 'environ':7 'm':3 'xdeb':4
So the index holds every word separated by a '-' (some of them stemmed), plus the complete name.
If you enter the search term "arm-m", it is transformed into this search expression:
select ftq('arm-m');
ftq
-----------------------
'arm-m' & 'arm' & 'm'
meaning that all three words must appear in the index data, but "arm-m" is not present.
But the search term "arm m" is transformed into
select ftq('arm m');
ftq
-------------
'arm' & 'm'
so those rows are returned where the full text index contains the words "arm" and "m".
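The difference between the two queries can be sketched in a few lines of Python. This is only a toy model of the AND matching described above, not how Postgres evaluates a query internally. The lexemes and query terms are copied from the output quoted in this comment; the names `INDEXED` and `matches` are made up for illustration.

```python
# Lexemes copied from the to_tsvector output quoted above.
INDEXED = {
    "arm",
    "arm-m-xdeb-cross-compilation-environ",
    "compil",
    "cross",
    "environ",
    "m",
    "xdeb",
}


def matches(query_terms, indexed_lexemes):
    """Model an AND query: every term must be present in the index data."""
    return all(term in indexed_lexemes for term in query_terms)


# Terms produced by ftq('arm-m') and ftq('arm m') as shown above.
hyphen_query = ["arm-m", "arm", "m"]
space_query = ["arm", "m"]

print(matches(hyphen_query, INDEXED))  # False: 'arm-m' is not an indexed lexeme
print(matches(space_query, INDEXED))   # True: both 'arm' and 'm' are indexed
```

The hyphenated query fails only because of the extra 'arm-m' term; the other two terms are found.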
Another caveat: Most single characters are indexed, but some are not:
select to_tsvector('a b c d e f g h i j k l m n o p q r s t u v w x y z');
'b':2 'c':3 'd':4 'e':5 'f':6 'g':7 'h':8 'j':10 'k':11 'l':12 'm':13 'n':14 'o':15 'p':16 'q':17 'r':18 'u':21 'v':22 'w':23 'x':24 'y':25 'z':26
So a, i, s and t are missing. They are also omitted from search terms, meaning that a search for "arm s" will return all texts containing "arm", including "arm-m".
You should use names like "arm-strange" (or whichever name will be used for the "s" series) instead of just "arm-s".
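The single-character caveat can be checked against the quoted output with a short Python sketch. It only parses the to_tsvector result shown above and models dropping the missing letters from a query. Presumably Postgres drops them because the text search configuration treats them as stop words, but that is an assumption here. The helper names are hypothetical.

```python
import re
import string

# to_tsvector output copied from the comment above.
TSVECTOR_OUTPUT = (
    "'b':2 'c':3 'd':4 'e':5 'f':6 'g':7 'h':8 'j':10 'k':11 'l':12 "
    "'m':13 'n':14 'o':15 'p':16 'q':17 'r':18 'u':21 'v':22 'w':23 "
    "'x':24 'y':25 'z':26"
)


def indexed_lexemes(tsvector_text):
    """Extract the quoted lexemes from a tsvector text representation."""
    return set(re.findall(r"'([^']+)':\d+", tsvector_text))


def dropped_letters(tsvector_text):
    """Return the letters a-z that do not appear as lexemes."""
    indexed = indexed_lexemes(tsvector_text)
    return [ch for ch in string.ascii_lowercase if ch not in indexed]


def effective_terms(terms, dropped):
    """Model the query side: dropped letters vanish from the search terms."""
    return [term for term in terms if term not in dropped]


dropped = dropped_letters(TSVECTOR_OUTPUT)
print(dropped)                                 # ['a', 'i', 's', 't']
print(effective_terms(["arm", "s"], dropped))  # ['arm']
print(effective_terms(["arm", "m"], dropped))  # ['arm', 'm']
```

In this model a search for "arm s" is reduced to the single term "arm", which is why it would also match the "arm-m" blueprints.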