ProgrammingError: spiexceptions.SyntaxError: syntax error in tsquery

Bug #1020443 reported by Diogo Matsubara
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Launchpad itself
Critical
Abel Deuring

Bug Description

OOPS-74344500aca70d5e087bc6abcbbe1ce7 shows a ProgrammingError: spiexceptions.SyntaxError: syntax error in tsquery asking a question. The minimum query to trigger the error is "?!." which means the +addquestion page is not properly escaping those characters

Steps to reproduce:
1. https://qastaging.launchpad.net/launchpad/+addquestion
2. In Summary fill in: ?!.
3. Click Continue
4. OOPS-74344500aca70d5e087bc6abcbbe1ce7

Related branches

Revision history for this message
Stuart Bishop (stub) wrote :

Its not an attack vector.

The ftq() stored procedure is responsible for turning arbitrary input and our AND, OR, NOT boolean operations into a valid tsearch2 query string. It is failing in this case by leaving in the spurious punctuation; the '!' is a tsearch2 NOT operator and not proceded by an & or | to separate it, so we get a syntax error.

security vulnerability: yes → no
visibility: private → public
description: updated
Curtis Hovey (sinzui)
tags: added: questions
Revision history for this message
Abel Deuring (adeuring) wrote :

We can fix this bug in two ways:

(1) insert spaces on the left and right side of the operators:

select _ftq('? ! .');
 _ftq
------
 ?&!.
(1 row)

So, the required '&' is inserted.

launchpad_dev=# select ftq('? ! .');
NOTICE: text-search query contains only stop words or doesn't
contain lexemes, ignored
CONTEXT: SQL statement "SELECT to_tsquery('default', _ftq($1)) AS x"
PL/Python function "ftq"
 ftq
-----

(1 row)

(2) just replace the operators '&', '|', '!' with spaces in search terms.

I favour the second option: Launchpad is about software development,
so we can expect that a text like

   Shell scripts usually start with #!/bin/sh

appears in an indexed column and that it is used as a search term.

This is indexed as:

select to_tsvector('Shell scripts usually start with #!/bin/sh');
                     to_tsvector
------------------------------------------------------
 '/bin/sh':6 'script':2 'shell':1 'start':4 'usual':3

Using the example text directly as a search expression leads currently
to the OOPS described in this bug. If we insert spaces around the '!',
we get this ts_query:

select ftq('Shell scripts usually start with # ! /bin/sh');
                         ftq
-----------------------------------------------------
 'shell' & 'script' & 'usual' & 'start' & !'/bin/sh'
(1 row)

So, the indexed text will _not_ be found because of the !'/bin/sh'.

Similar oddities might occur for other typical source code lines:

select to_tsvector('x = y | !z');
    to_tsvector
-------------------
 'x':1 'y':2 'z':3

select ftq('x = y | !z');
       ftq
------------------
 'x' & 'y' | !'z'

If we first remove '!', '&', '|' from search terms and then replace only
"AND", "OR", "NOT" with the respective operators, we avoid most of
these oddities, because the words a stop words and thus do not appear
in the FTI data. (This still leaves a problem when somebody tries to
search for an SQL expression like "SELECT ... FROM ... WHERE a=5 AND
(b=6 OR c=7)". I think we can avoid suprises here only by mentioning this
problem on a help page and recommend to remove the anyway useless
stop word from the search term or to use lowercase text...

Another note, not related to the OOPS, but related to possibly suprising
search results: We should not replace a leading '-' with a '!':

select to_tsvector('123 -546');
   to_tsvector
------------------
 '-546':2 '123':1

launchpad_dev=# select ftq('123 -546');
      ftq
----------------
 '123' & !'546'

So again a "non-match".

Abel Deuring (adeuring)
Changed in launchpad:
assignee: nobody → Abel Deuring (adeuring)
status: Triaged → In Progress
Revision history for this message
William Grant (wgrant) wrote :

I've reverted the changes in r15696; they contain an unreviewed DB patch, and model changes that are required to be deployed simultaneously with it, neither of which is permissible.

Revision history for this message
Launchpad QA Bot (lpqabot) wrote :
tags: added: qa-needstesting
Changed in launchpad:
status: In Progress → Fix Committed
William Grant (wgrant)
Changed in launchpad:
status: Fix Committed → In Progress
tags: added: bad-commit-15696 qa-bad
removed: qa-needstesting
Revision history for this message
Abel Deuring (adeuring) wrote :

I see your point about a DB patch -- but what model changes do you mean?

Revision history for this message
William Grant (wgrant) wrote :

There were several non-test, non-DB changes:

  lib/lp/answers/model/faq.py
  lib/lp/answers/model/question.py
  lib/lp/bugs/model/bugtasksearch.py
  lib/lp/registry/model/person.py
  lib/lp/registry/vocabularies.py

Since we don't deploy DB and code changes simultaneously or atomically, the code at the time of the DB patch deployment has to work both with and without the DB patch. To ensure this, model changes are forbidden from landing at the same time as DB patches, unless approved by the TA or DBA (this has never happened before).

Revision history for this message
Launchpad QA Bot (lpqabot) wrote :
tags: added: qa-needstesting
removed: qa-bad
Changed in launchpad:
status: In Progress → Fix Committed
Abel Deuring (adeuring)
Changed in launchpad:
status: Fix Committed → In Progress
Abel Deuring (adeuring)
tags: added: qa-ok
removed: qa-needstesting
Revision history for this message
Launchpad QA Bot (lpqabot) wrote :
tags: added: qa-needstesting
removed: qa-ok
Changed in launchpad:
status: In Progress → Fix Committed
Abel Deuring (adeuring)
tags: added: qa-ok
removed: qa-needstesting
Ian Booth (wallyworld)
Changed in launchpad:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers