Inconsistent parsing of '~' in to_tsquery() and to_tsvector()
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Launchpad itself | Triaged | Low | Unassigned |
Bug Description
If a word is used as the term of a full-text search, all texts
containing this word should be returned by this search.
This can fail if a word starts with a '~':
psql -d launchpad_dev
psql (9.1.4)
Type "help" for help.
launchpad_dev=# select to_tsvector('aaa ~bbb ccc') @@ to_tsquery('~bbb');
?column?
----------
f
The reason:
launchpad_dev=# select to_tsvector('aaa ~bbb ccc');
---
'aaa':1 'bbb':2 'ccc':3
(1 row)
So, the '~' is stripped from '~bbb'. But the search term generated by
to_tsquery() retains the '~':
launchpad_dev=# select to_tsquery('~bbb');
to_tsquery
------------
'~bbb'
(1 row)
This is not completely wrong, because to_tsvector() sometimes keeps a
leading '~':
launchpad_dev=# select to_tsvector('~aaa bbb~ccc');
---
'bbb':2 '~aaa':1 '~ccc':3
(1 row)
ts_debug() gives a clue as to what is happening:
launchpad_dev=# select ts_debug('~bbb');
---
(file,"File or path name",~
(1 row)
launchpad_dev=# select ts_debug('~aaa bbb~ccc');
---
(file,"File or path name",~
(blank,"Space symbols"," ",{},,)
(asciiword
(file,"File or path name",~
(4 rows)
So, a '~' at the start of a text or directly following a word is treated
as the first character of a filename, while a '~' preceded by a space is
simply dropped and the following word is treated as an ordinary word.
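
The rule described above can be written down as a small toy model. This is only a sketch that mimics the three results shown in this report; it is not PostgreSQL's parser, and the names toy_lexemes() and toy_matches() are hypothetical helpers invented for illustration. It assumes that a query string is tokenized like the start of a text, which is what the to_tsquery('~bbb') output suggests.

```python
def toy_lexemes(text: str) -> list[str]:
    """Toy model of the observed '~' handling (NOT the real PostgreSQL parser).

    A '~' at the start of the text or directly after a word character is kept
    as the first character of the token; a '~' preceded by whitespace is dropped.
    """
    lexemes = []
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        if ch == "~":
            preceded_by_space = i > 0 and text[i - 1].isspace()
            j = i + 1
            while j < n and text[j].isalnum():
                j += 1
            word = text[i + 1:j]
            if word:
                # Keep the '~' unless it directly follows whitespace.
                lexemes.append(word if preceded_by_space else "~" + word)
            i = j
        elif ch.isalnum():
            j = i
            while j < n and text[j].isalnum():
                j += 1
            lexemes.append(text[i:j])
            i = j
        else:
            # Whitespace and other separators produce no lexeme.
            i += 1
    return lexemes


def toy_matches(document: str, query: str) -> bool:
    """Single-term match: every query lexeme must occur in the document."""
    doc = set(toy_lexemes(document))
    return all(term in doc for term in toy_lexemes(query))


if __name__ == "__main__":
    print(toy_lexemes("aaa ~bbb ccc"))           # document side: '~' dropped
    print(toy_lexemes("~bbb"))                   # query side: '~' kept
    print(toy_matches("aaa ~bbb ccc", "~bbb"))   # the failing search
```

In this model the document side yields 'bbb' while the query side yields '~bbb', so the match fails for the same reason as the first query in this report.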
description: updated
Changed in launchpad:
importance: Critical → Low
tags: added: search
Marked as "critical" since this bug describes one detail of the quite generic bug 29713.