QP query generation is incorrect with two or more bool ops in a row
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Evergreen |
Fix Released
|
Medium
|
Unassigned | ||
2.4 |
Fix Released
|
Medium
|
Unassigned |
Bug Description
Dan Wells raised two issues with search here: http://
Of those, the second is a new issue. It's root cause is improper use of "local" variables in perl. In addition to (at least) two other bugs, the branch below addresses this, generating the correct SQL for the examples given.
From the relevant commit message:
First, we didn't need to make $last_type local, and it broke explicit
grouping anyway. That's removed, and we now reset that (and a few more
like it) at calls to the top level parse() method.
Second, we are smarter about finding the boundary of atoms. Previous
to this commit, and curly brace could send the parser into a tailspin
from which it would not recover. Now we use alternation instead of
a character class, which is much safer with the default multi-character
float syntax specifier.
Third, as a catch-all, if we can't parse the remained of a query we
now simply say so (when in debug mode) and go away, instead of risking
an infinite loop. We do this via a final, unqualified "else" clause
in decompose().
description: | updated |
information type: | Public → Private Security |
Changed in evergreen: | |
status: | Fix Committed → Fix Released |
information type: | Private Security → Public Security |
Thanks for your work on this, Mike. This helped my example queries in testing, so I pushed it into production. The situation is generally improved, but now that the queries are actually being generated, I have seen a few cases where "pathological" queries can actually crash the database system. In at least one case, our DB system went into "recovery mode" in such a way that it never recovered.
As a little more background, what the queries are trying to do is determine (with help of xISBN services) whether we have *any* edition of a title represented by a given ISBN. So they are only pathological in the sense that a human would never construct them, but the question itself is reasonable. I am doing these queries via an API request to open-ils. search. biblio. multiclass. query, not using the TPAC search box.
The request which crashed our DB permanently was for "Animal Farm". Apparently, there are at least 300 editions of this book with different ISBNs. So the expanded, generated query was something like 'identifier| isbn:(001000838 1 || 0030554349 || 0134354680 ...(300+ ISBNs later)... || 9992772115)'.
I'm a reasonable guy, so I can accept is such a query is simply too much, and can't easily be made to work. If so, I think we should consider putting some hard limits at some point in the query generation chain to prevent these cases from getting too far along.
That said, I am also concerned about at least one aspect of the generated query for this case. The "meat" of the query looks something like this:
... identifier_ isbn.id IS NOT NULL)
(xd221f38_ identifier_ isbn.id IS NOT NULL) identifier_ isbn.id IS NOT NULL)
(xd2224d8_ identifier_ isbn.id IS NOT NULL)
(
(
(xd221da0_
)
OR
(
(
(
)
OR
(
(
(
(xd222208_
)
OR
(
(
(
)
)
)
)
)
)
)
)
...
Is there a reason the nesting gets deeper and deeper for every added bool? Or is this another aspect to this bug? Maybe there is a reason, with the most likely being that the extra nesting makes the parser simpler in some way, but I thought it at least looked fishy enough to mention, as it seems very possible that this massive nested structure is a major factor in the DB crashes.
Finally, I have moved this to a security bug, since giving an example query which crashes EG makes me uneasy.
Thanks again.