ResultSet.is_empty() raises IndexError with valid query

Bug #296739 reported by Guilherme Salgado
4
Affects Status Importance Assigned to Milestone
Storm
New
Undecided
Unassigned

Bug Description

In Launchpad, our queries that use the LIKE operator are written like this:

>>> query = "englishname ILIKE '%%' || %(text)s || '%%'" % {'text': quote_like('Spanish')}
>>> query
"englishname ILIKE '%' || 'Spanish' || '%'"

When we feed that query to Store.find(), we get a ResultSet object which contains the correct items from the DB and looks like a fully-functional ResultSet.

>>> r = s.find(Language, query)
>>> r.count()
21
>>> r.any()
<Language at 0xb932eec>

However, if we call the .is_empty() method, it'll raise an IndexError.

>>> r.is_empty()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/home/salgado/devel/launchpad/new-storm/lib/storm/store.py", line 1047, in is_empty
    result = self._store._connection.execute(select)
  File "/home/salgado/devel/launchpad/new-storm/lib/storm/databases/postgres.py", line 257, in execute
    return Connection.execute(self, statement, params, noresult)
  File "/home/salgado/devel/launchpad/new-storm/lib/storm/database.py", line 203, in execute
    raw_cursor = self.raw_execute(statement, params)
  File "/home/salgado/devel/launchpad/new-storm/lib/storm/databases/postgres.py", line 267, in raw_execute
    return Connection.raw_execute(self, statement, params)
  File "/home/salgado/devel/launchpad/new-storm/lib/storm/database.py", line 284, in raw_execute
    self._check_disconnect(raw_cursor.execute, *args)
  File "/home/salgado/devel/launchpad/new-storm/lib/storm/database.py", line 325, in _check_disconnect
    return function(*args, **kwargs)
IndexError: tuple index out of range

Tags: tech-debt
Revision history for this message
James Henstridge (jamesh) wrote :

The problem is caused by the way psycopg2 interprets query strings. If you call cursor.execute(query), then psycopg2 passes the query through directly. If you call cursor.execute(query, params), it performs string substitution.

Your query string contains unmatched format characters, so it works if the query Storm produces has no parameters (as the count() query appears to) but fails if it does produce parameters (as is_empty() appears to).

There are two solutions to your problem:

 1. Use Storm's expression compiler instead of raw query strings. The storm.sqlobject.CONTAINSSTRING expression object does most of what you want, although it doesn't turn on case insensitivity.

 2. This one is a hack: double the percent signs. In pattern matching, "%%" will match any number of characters as well as "%", so will work whether string substitution is performed on your query string or not. I think this one is used in one or two places in Launchpad already.

Revision history for this message
Guilherme Salgado (salgado) wrote :

I don't think we should be solving this at the application level. The fact that some queries work with everything but is_empty() will certainly be a source of headaches to other users of storm as well.

Given that we know about that behavior, I think storm should try to workaround it instead have the applications do so.

Revision history for this message
James Henstridge (jamesh) wrote :

You are passing raw query strings to Storm. If you make use of Storm's query building features, it will build correct queries for you.

If you really must pass raw query strings, it would be safest to assume that they will be escaped and double the percent signs.

Revision history for this message
Данило Шеган (danilo) wrote :

After upgrade to Storm 0.14, this bug bit us as well. Apparently, SQLObjectResultSet.__nonzero__ implementation was incompatibly changed in rev 282, since ResultSet.is_empty() seems to behave differently from ResultSet.any() with exactly the same queries.

We see this as bug #408845 with a traceback like

  http://launchpadlibrarian.net/29676954/uK8pt7QInv178EM0SyyuKEuaZai.txt

It's very hard to reproduce in a test case (on production, it fails on roughly 10-20% of our PO imports, and we had 2751 imports fail due to this bug), and this particular query shouldn't be have any percent-signs to replace.

Revision history for this message
Данило Шеган (danilo) wrote :

My guess is that the problem happens in wrapping the subselect inside another select in is_empty(), whereas any() doesn't do that.

Curtis Hovey (sinzui)
tags: added: tech-debt
Revision history for this message
James Henstridge (jamesh) wrote :

Just to reiterate what I said before, if you are hitting this bug then your code is buggy. You are including interpolation markers in your raw query string and they are being treated as such by psycopg2.

Since this bug was filed, Storm's column object has grown a contains_string() method that would let you do a case sensitive version of the query in the original comment as:

    result = store.find(Language, Language.englishname.contains_string('Spanish'))

It should be pretty easy to hook in a "case_senstive=False" optional argument to this function if needed.

If you must do raw SQL, consider interpolating the strings that are out of your control. For example:

    result = store.find(Language, SQL('englishname ILIKE ?', ['%Spanish%']))

This will have psycopg2 interpolate the string '%Spanish%', so you don't need to worry about interpolation markers in the string itself.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.