Implicit ANDs should have higher precedence than explicit ORs
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
Evergreen | Triaged | Wishlist | Unassigned | |
Bug Description
When the implicit ANDs added by QueryParser between terms are combined
with explicit ORs without any explicit grouping, the results can be
unexpected and undesirable. For example, the following query:
harry potter and the chamber of secrets || sorcerer's stone
Is translated into a query with two branches:
1. harry potter and the chamber of secrets stone
2. sorcerer's stone
This is of course nothing like what the user was expecting, which was
the following two branches:
1. harry potter and the chamber of secrets
2. sorcerer's stone
(Note: of course the user probably wanted to search for "harry potter
and the chamber of secrets" or "harry potter and the sorcerer's stone"
but we have to draw the line on implicit grouping somewhere, and where
it requires reading minds seems like a good place)
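The requested grouping can be illustrated with a minimal Python sketch. This is not the Evergreen QueryParser (which this report's patch modifies); the function names `parse_branches` and `render` are hypothetical, and the only syntax handled is whitespace-separated terms plus `||` as the explicit OR. It shows implicit AND binding tighter than explicit OR: the token stream is split into branches at each `||`, and everything inside a branch is ANDed.

```python
def parse_branches(query: str) -> list[list[str]]:
    """Split a query into OR branches; terms inside a branch are implicitly ANDed.

    Hypothetical helper for illustration only: whitespace separates terms
    and a bare "||" token is the explicit OR operator.
    """
    branches: list[list[str]] = [[]]
    for token in query.split():
        if token == "||":
            # An explicit OR closes the current AND group and starts a new one.
            branches.append([])
        else:
            branches[-1].append(token)
    # Drop empty groups produced by leading, trailing, or doubled "||".
    return [branch for branch in branches if branch]


def render(branches: list[list[str]]) -> str:
    """Render branches as a parenthesized expression, AND as '&', OR as '|'."""
    return " | ".join("(" + " & ".join(branch) + ")" for branch in branches)


if __name__ == "__main__":
    q = "harry potter and the chamber of secrets || sorcerer's stone"
    for number, branch in enumerate(parse_branches(q), start=1):
        print(number, " ".join(branch))
```

With this grouping, the example query yields exactly the two branches listed above, and a query with no `||` stays a single AND group.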
A patch modifying the QueryParser can be found in working/
tags: | added: wishlist |
Changed in evergreen: | |
importance: | Undecided → Wishlist |
summary: |
- Implicit and should have higher precedence than explicit or
+ Implicit ANDs should have higher precedence than explicit ORs |
Changed in evergreen: | |
status: | New → Triaged |
tags: | removed: wishlist |
tags: |
added: needsrebse removed: needsrepatch |
tags: |
added: needsrebase removed: needsrebse |
With the patch, the driver generates the following SQL:
2012-08-21 16:01:36 EDT LOG: statement: SELECT * -- bib search: #CD_documentLength #CD_meanHarmonic #CD_uniqueWords keyword: code || law depth(0) estimation_strategy(inclusion) limit(1000) core_limit(10000)
FROM search.query_parser_fts(
1::INT,
0::INT,
$core_query_20111$
WITH x9411690_keyword_xq AS (SELECT
to_tsquery('keyword', COALESCE(NULLIF( '(' || btrim(regexp_replace(search_normalize(split_date_range($_20111$code$_20111$)),E'(?:\\s+|:)','&','g'),'&|') || ')', '()'), '')) AS tsq ), x941bc30_keyword_xq AS (SELECT
to_tsquery('keyword', COALESCE(NULLIF( '(' || btrim(regexp_replace(search_normalize(split_date_range($_20111$law$_20111$)),E'(?:\\s+|:)','&','g'),'&|') || ')', '()'), '')) AS tsq )
SELECT m.source AS id,
ARRAY[m.source] AS records,
1.0/((AVG(
(COALESCE(ts_rank_cd(x9411690_keyword.index_vector, x9411690_keyword.tsq, 14) * x9411690_keyword.weight, 0.0)
* /* word_order */ COALESCE(NULLIF( (search_normalize(x9411690_keyword.value) ~ (search_normalize($_20111$code$_20111$))), FALSE )::INT * 10, 1))+
(COALESCE(ts_rank_cd(x941bc30_keyword.index_vector, x941bc30_keyword.tsq, 0) * x941bc30_keyword.weight, 0.0)
* /* word_order */ COALESCE(NULLIF( (search_normalize(x941bc30_keyword.value) ~ (search_normalize($_20111$law$_20111$))), FALSE )::INT * 10, 1))
)+1 * COALESCE( NULLIF( FIRST(mrd.attrs @> hstore('item_lang', $_20111$eng$_20111$)), FALSE )::INT * 5, 1)))::NUMERIC AS rel,
1.0/((AVG(
(COALESCE(ts_rank_cd(x9411690_keyword.index_vector, x9411690_keyword.tsq, 14) * x9411690_keyword.weight, 0.0)
* /* word_order */ COALESCE(NULLIF( (search_normalize(x9411690_keyword.value) ~ (search_normalize($_20111$code$_20111$))), FALSE )::INT * 10, 1))+
(COALESCE(ts_rank_cd(x941bc30_keyword.index_vector, x941bc30_keyword.tsq, 0) * x941bc30_keyword.weight, 0.0)
* /* word_order */ COALESCE(NULLIF( (search_normalize(x941bc30_keyword.value) ~ (search_normalize($_20111$law$_20111$))), FALSE )::INT * 10, 1))
)+1 * COALESCE( NULLIF( FIRST(mrd.attrs @> hstore('item_lang', $_20111$eng$_20111$)), FALSE )::INT * 5, 1)))::NUMERIC AS rank,
FIRST(mrd.attrs->'date1') AS tie_break
FROM metabib.metarecord_source_map m
LEFT JOIN (
SELECT fe.*, fe_weight.weight, x9411690_keyword_xq.tsq /* search */
FROM metabib.keyword_field_entry AS fe
JOIN config.metabib_field AS fe_weight ON (fe_weight.id = fe.field)
JOIN x9411690_keyword_xq ON (fe.index_vector @@ x9411690_keyword_xq.tsq)
) AS x9411690_keyword ON (m.source = x9411690_keyword.source)
LEFT JOIN (
SELECT fe.*, fe_weight.weight, x941bc30_keyword_xq.tsq /* search */
FROM metabib.keyword_field_entry AS fe
JOIN config.metabib_field AS fe_weight ON (fe_weight.id = fe.field)
JOIN x941bc30_keyword_xq ON (fe.index_vector @@ x941bc30_keyword_xq.tsq)
) AS x941bc30_keyword ON (m.source = x941bc30_keyword.source)
INNER JOIN metabib.record_attr mrd ON (m.source = mrd.id AND ((x941bc30_keywo...
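
Each OR branch in the statement above gets its own to_tsquery() argument, built from the search term with regexp_replace() and btrim(). As a rough aid to reading that expression, here is a Python mirror of just the string handling. It assumes the input has already been through search_normalize() and split_date_range(), which are database functions not reproduced here, and the name `tsquery_text` is hypothetical.

```python
import re


def tsquery_text(normalized_term: str) -> str:
    """Mirror the SQL string expression that feeds to_tsquery().

    Assumption: `normalized_term` stands in for the result of
    search_normalize(split_date_range(...)), which this sketch does not implement.
    """
    # regexp_replace(..., E'(?:\\s+|:)', '&', 'g'):
    # every whitespace run or colon becomes an AND operator.
    joined = re.sub(r"(?:\s+|:)", "&", normalized_term)
    # btrim(..., '&|'): strip stray operators from both ends.
    trimmed = joined.strip("&|")
    # '(' || ... || ')': wrap the branch in parentheses.
    wrapped = "(" + trimmed + ")"
    # COALESCE(NULLIF(wrapped, '()'), ''): an empty branch becomes ''.
    return "" if wrapped == "()" else wrapped


if __name__ == "__main__":
    for term in ("code", "law", "  civil law "):
        print(repr(term), "->", repr(tsquery_text(term)))
```

For the logged search `code || law`, this gives one single-term text per branch, matching the two CTEs (x9411690_keyword_xq and x941bc30_keyword_xq) in the statement.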