storm processing of result sets can be very very slow

Bug #618019 reported by Robert Collins on 2010-08-14
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Launchpad itself
Low
Unassigned
Storm
Low
Unassigned

Bug Description

We've had timeouts where a couple thousand objects are returned from a fast SQL query. E.g. the SQL is reported as taking 300ms, but the page times out; changing those queries to a count() stopped the timeout.

One theory is that there is something taking up excessive time in the db layers - dbapi/storm.

Another theory is that the DB time is actually much greater.

Looking at the tracer code that aggregates SQL time, we count as sql time the time to obtain a cursor, not the aggregate time involved in retrieving content from it.

I think that the latter theory is correct, and what we need to do is to also add to the sql time the time spent in fetch* calls to the raw cursor, because with large result sets fetching may be batched, and we are thus blaming Python for time spent in the DB layers.
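A minimal sketch of what that fix might look like (the class and attribute names here are hypothetical illustrations, not Launchpad's actual tracer code): a proxy that forwards everything to the raw DB-API cursor but accumulates the wall-clock time spent in fetch* calls, so it can be added to the reported SQL time.

```python
import time


class FetchTimingCursor:
    """Proxy around a DB-API cursor that accumulates wall-clock time
    spent in fetchone/fetchmany/fetchall calls.

    Hypothetical sketch; not Launchpad's actual implementation.
    """

    _FETCH_METHODS = ("fetchone", "fetchmany", "fetchall")

    def __init__(self, raw_cursor):
        self._raw_cursor = raw_cursor
        self.fetch_seconds = 0.0  # add this to the reported SQL time

    def __getattr__(self, name):
        # Called only for attributes not defined on the proxy itself,
        # so every cursor method/attribute is forwarded transparently.
        attr = getattr(self._raw_cursor, name)
        if name not in self._FETCH_METHODS:
            return attr

        def timed(*args, **kwargs):
            start = time.time()
            try:
                return attr(*args, **kwargs)
            finally:
                self.fetch_seconds += time.time() - start

        return timed
```

The point of the sketch is only that the timing lives at the cursor boundary, where batched fetches happen, rather than at statement execution.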

Related to this we may want to start calculating deserialisation/storm cache processing time, but I'll file a separate bug for this in the future.

IMO this is a 'high' priority bug, because accurate information about performance is an essential tool for the development process.

Related branches

Gary Poster (gary) wrote :

Triaging as low for now, and asking for Stuart's thoughts. If he confirms your suspicions, I agree that getting this fixed is important and will raise the priority.

Changed in launchpad-foundations:
status: New → Triaged
importance: Undecided → Low
assignee: nobody → Stuart Bishop (stub)
Stuart Bishop (stub) on 2010-08-18
affects: launchpad-foundations → storm
Stuart Bishop (stub) wrote :

Storm needs to grow this feature. I think it involves extending the tracer API and invoking the new hooks from inside ResultSet._load_objects somewhere, but someone familiar with Storm internals would need to confirm.
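One way such a tracer extension could be shaped, as a hedged sketch: a tracer object with a pair of new hooks that something like ResultSet._load_objects would call around each raw fetch. The hook names (connection_fetch_start/connection_fetch_done) are invented for illustration; they are not part of Storm's actual tracer API.

```python
import time


class FetchTimeTracer:
    """Hypothetical tracer extension with two invented hooks that the
    result-loading code would call around each raw cursor fetch."""

    def __init__(self):
        self.total_fetch_seconds = 0.0
        self._start = None

    def connection_fetch_start(self, connection, raw_cursor):
        # Called just before a raw fetch* call.
        self._start = time.time()

    def connection_fetch_done(self, connection, raw_cursor, rows):
        # Called just after the fetch returns; accumulate elapsed time.
        if self._start is not None:
            self.total_fetch_seconds += time.time() - self._start
            self._start = None
```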

Changed in storm:
assignee: Stuart Bishop (stub) → nobody
Changed in launchpad-foundations:
status: New → Triaged
importance: Undecided → Low
Stuart Bishop (stub) wrote :

Launchpad will need to update its customized PostgresTimeoutTracer when this facility is landed, to ensure opstats is updated and the transaction is doomed.

Stuart Bishop (stub) wrote :

Not our customized PostgresTimeoutTracer, but LaunchpadStatementTracer since we want this in our statement logs. We probably want two times for each statement (query time and fetch time) so we can separate the load on the db server from the load on the clients.

Gustavo Niemeyer (niemeyer) wrote :

Whatever the cost of retrieving an object from the database is, it is probably close to a fixed O(1) cost per object. Of course, when you modify the query to not return any objects by using count(), you're stopping not only the DB and ORM retrieval of such objects (which certainly has that fixed cost), but any other processing done with these objects. Given a sufficient number of objects in such a context, any trivial query (SELECT *) will cause the application to time out.

Given that, I see no reason for Storm to be tracking the fetch time, unless there's proper evidence confirming that the time spent doing fetches is significant in time and in variance.

Changed in storm:
status: Triaged → Incomplete

As explained in the summary, we are regularly encountering situations
for which a very plausible scenario is 'reading from the raw cursor is
blocking', and requests are spending 10 or more seconds in storm code
paths.

Either it's Storm, or it's the DB itself.

We want to know systematically, not case by case, so that when this
happens developers can immediately address the issue rather than
roundtripping via sysadmins.

There are all sorts of things potentially using time up in reading from cursors:
 - networking
 - work in other threads in python
 - db side buffering

So we won't know which of these is the case - but we don't for our
existing traces either, because we're a threaded app in python, so it's
no *less* accurate, and we find the existing timing data extremely
useful.

I don't understand why you would want to avoid being able to gather
evidence about where the time is going in a systematic way, cleanly
integrated into the codebase.

If iterating over thousands of objects has a fixed cost which you're unhappy about, I wouldn't like to increase that fixed cost by adding persistent debugging logic on that fast path.

This is the logic of iteration of results within Storm's database.py:

            for result in results:
                yield tuple(self.from_database(result))

It doesn't look good to add a tracer+timer in this loop.

If you're wondering about network/db problems, I recommend monitoring these pieces more closely, rather than adding that logic to Storm.

Jamu Kakar (jkakar) wrote :

I would guess time-to-fetch can be an important measurement. A
query to fetch 1000 objects might be very quick to execute, but
pulling the data might be slow. If you see that the query takes
100ms but the reality is that 2s are spent fetching, and you can't
easily know that that's the case, it will be hard to realize that
you ought to find a way to do what you want with fewer than 1000
objects.

I think it would be worth adding this kind of logic to Storm. If
we're unsure about whether it's truly useful we could add it
experimentally and use it in anger to determine how useful it really
is.

On Thu, Aug 19, 2010 at 6:10 AM, Gustavo Niemeyer <email address hidden> wrote:
> If iterating over thousands of objects has a fixed cost which you're
> unhappy about, I wouldn't like to increase that fixed cost by adding
> persistent debugging logic on that fast path.

Iteration is only part of it. I'm not worried about the python
overhead as yet - simple iteration is very fast.

> This is the logic of iteration of results within Storm's database.py:
>
>            for result in results:
>                yield tuple(self.from_database(result))
>
> It doesn't look good to add a tracer+timer in this loop.

That loop looks cheap (and may be), though Storm's complex variable
mapping is inside that loop.

It's not the area that matters though. This is the full loop:
        while True:
            results = self._connection._check_disconnect(
                self._raw_cursor.fetchmany)
            if not results:
                break
            for result in results:
                yield tuple(self.from_database(result))

The second line calls raw_cursor.fetchmany, which AIUI *can block* and
*do networking*.

If I'm wrong, it's a non-issue. Am I wrong?
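For concreteness, a hedged sketch of how the loop quoted above could be instrumented, assuming fetchmany() really is where the blocking happens (written as a standalone function rather than a method, with an invented `fetch_time` tracer hook, purely for illustration):

```python
import time


def iter_rows(raw_cursor, from_database, tracers=()):
    """Standalone sketch of the quoted loop with the fetchmany()
    call timed.  The fetch_time hook name is hypothetical."""
    while True:
        start = time.time()
        results = raw_cursor.fetchmany()
        elapsed = time.time() - start
        for tracer in tracers:
            tracer.fetch_time(elapsed)  # invented hook name
        if not results:
            break
        for result in results:
            yield tuple(from_database(result))
```

Note that only the fetchmany() call is inside the timed region; the per-row conversion loop stays untimed, which keeps the O(n) overhead to one clock read per batch rather than per row.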

> If you're wondering about network/db problems, I recommend monitoring
> these pieces more closely, rather than adding that logic to Storm.

Storm is our connection to the DB; we've done the same thing with
tracers to track the initial time for queries so far. I mean, we could
write a replacement raw_cursor and db implementation, but if we need
to do that, why are we using a Storm tracer at all for the other
identical things?

As we discussed online:

- You're right, it's 1/10th the time for fetchmany() (arraysize will be set to 10 items at a time by default)

- We'll need two trace calls for every fetch, or rowcount/10*2 on fetchmany.

- I'm still a bit unhappy that we're adding O(n) overhead on the retrieval of results to debug an issue which isn't entirely related to the application logic (timeouts will mean network problems, or problems in the database machine).

- I realize it's important to you. Let's move ahead and get an implementation in a straightforward way and discuss from there.

Changed in storm:
status: Incomplete → Confirmed

Gustavo also asked that we guard the tracer iteration with 'if
_tracers' as a cheap tool to reduce some of the overhead in the
non-traced case.

I mentioned this on IRC to Robert:

For psycopg2, the Cursor.fetch*() calls are not network IO bound: the network IO occurs during the PQexec() invocation. The fetch*() calls just involve converting the PGresult to Python values. This is going to be CPU bound, and because it is working with Python objects, it is CPU time spent holding the GIL.

It doesn't represent additional database round trips.

Thanks James; knowing that, I'm certainly less stressed about getting
this in; we now have some more concrete data about where time may
be going - _load_objects on *some* objects seems to be very slow (see
bug 632145).

summary: - OOPS may be underrepresenting storm/sql time
+ storm processing of result sets can be very very slow