Investigate KARL slowness on writes and search

Bug #974360 reported by Paul Everitt
This bug affects 1 person
Affects: KARL3
Status: Fix Released
Importance: High
Assigned to: Shane Hathaway
Milestone: (none shown)

Bug Description

OSF's production KARL has recently become significantly slower on both searches and writes. Perhaps a memcache change or something similar is responsible.

So for this ticket, do some research. This is a vague ticket, obviously, but a real problem.

Paul Everitt (paul-agendaless) wrote :

Leaving this open so that we can respond to the questions about exclusive locks, etc.

Changed in karl3:
status: New → In Progress
Changed in karl3:
milestone: m98 → m99
Changed in karl3:
milestone: m99 → m100
JimPGlenn (jpglenn09)
Changed in karl3:
milestone: m100 → m101
Shane Hathaway (shane-hathawaymix) wrote :

For future reference, here is a typical query that was taking up to 30 seconds when we first started working on this issue::

        SELECT docid,
            coefficient * ts_rank_cd('{0.00003, 0.01, 0.03, 1.0}', text_vector, query) AS rank
        FROM pgtextindex, to_tsquery('english', '''fro'':*') query
        WHERE (text_vector @@ query)
         AND marker = 'Files'
        ORDER BY rank DESC;
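
The doubled quotes in the literal above can be hard to read: `'''fro'':*'` is the SQL spelling of the tsquery text `'fro':*`, a quoted lexeme followed by the prefix-match marker. The following is a minimal sketch of building that text in Python and passing it as a bound parameter, so that only tsquery quoting (not SQL quoting) has to be done by hand. The names `prefix_tsquery` and `build_search` are hypothetical and this is not the code repoze.pgtextindex uses; the `%s` placeholders assume a psycopg2-style driver.

```python
# Hypothetical helpers for illustration only; not taken from repoze.pgtextindex.

SEARCH_SQL = """
SELECT docid,
    coefficient * ts_rank_cd('{0.00003, 0.01, 0.03, 1.0}', text_vector, query) AS rank
FROM pgtextindex, to_tsquery('english', %s) query
WHERE (text_vector @@ query)
 AND marker = %s
ORDER BY rank DESC
"""


def prefix_tsquery(term):
    """Return tsquery text that prefix-matches `term`, e.g. 'fro':*"""
    # Inside a quoted tsquery lexeme, backslashes and apostrophes are doubled.
    escaped = term.replace("\\", "\\\\").replace("'", "''")
    return "'" + escaped + "':*"


def build_search(term, marker):
    """Return (sql, params) suitable for cursor.execute(sql, params)."""
    return SEARCH_SQL, (prefix_tsquery(term), marker)


if __name__ == "__main__":
    sql, params = build_search("fro", "Files")
    print(params)
```

With a real connection, the pair would be used as `cursor.execute(sql, params)`; prefixing the statement with `EXPLAIN ANALYZE` is the usual way to see where the time goes.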

Shane Hathaway (shane-hathawaymix) wrote :

Here are the problems we discovered and the solutions we implemented:

- The database server did not have enough RAM to keep the pgtextindex in memory. Expanding from 2 GB to 4 GB reduced thrashing significantly.

- Apparently, PG's autovacuum had never run, because locks were always held. This caused the query optimizer to make bad choices; some queries were thus unnecessarily expensive. Gocept added a cron job that manually performs the functions we expected from autovacuum.

- The repoze.pgtextindex package was using exclusive locks which hurt write performance. Gocept suggested read committed isolation rather than serializable isolation, avoiding the need for exclusive locks. I implemented that suggestion in repoze.pgtextindex 0.5, which is now in the karlhosting index and ready to deploy with the next release.

- Gocept revealed that Karl's postoffice opens connections frequently and explained that opening a connection causes PG to hold a bunch of locks for a short time. It's important to use connection pooling. Chris Rossi began work on that.
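
To illustrate the connection pooling point in the last item, here is a minimal sketch of a fixed-size pool that opens its connections once and hands them out repeatedly, so that the per-connection startup cost (and the short-lived locks Gocept described) is paid only `size` times. `SimplePool` is a hypothetical name, and this is not the pooling Chris Rossi implemented; `connect` stands for any zero-argument callable that returns a DB-API connection.

```python
import queue
from contextlib import contextmanager


class SimplePool:
    """Tiny fixed-size connection pool (illustrative sketch, not KARL code)."""

    def __init__(self, connect, size=4):
        # Open all connections up front; they are reused from then on.
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())

    @contextmanager
    def connection(self, timeout=None):
        # Block until a connection is free (or raise queue.Empty on timeout).
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
            conn.commit()
        except Exception:
            # Leave the connection in a clean state before it is reused.
            conn.rollback()
            raise
        finally:
            self._idle.put(conn)
```

A caller would write `with pool.connection() as conn:` around each unit of work instead of opening a new connection. A production setup would more likely rely on an existing pooler than on hand-written code like this.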

In summary, I believe we have solved most of the problem. One of the pieces will go out with the next release and the final piece is on Chris Rossi's list.
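
For readers unfamiliar with the isolation-level change that ships with the next release, the sketch below shows one way to make read committed the default for a session over a plain DB-API connection. It is an assumption-laden illustration, not the change made in repoze.pgtextindex 0.5: `use_read_committed` is a hypothetical helper, and some drivers set the isolation level themselves at the start of each transaction, in which case the driver's own setting should be used instead.

```python
# Standard PostgreSQL statement; applies to subsequent transactions in the session.
READ_COMMITTED_SQL = (
    "SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL READ COMMITTED"
)


def use_read_committed(conn):
    """Make READ COMMITTED the default isolation level for this session.

    `conn` is assumed to be a DB-API connection to PostgreSQL.
    """
    cur = conn.cursor()
    try:
        cur.execute(READ_COMMITTED_SQL)
    finally:
        cur.close()
    # Commit so the setting is not undone by a later rollback.
    conn.commit()
```

Under read committed, each statement sees the rows committed before it started, so concurrent writers do not hit the serialization failures that the exclusive lock was guarding against under serializable isolation.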

Changed in karl3:
status: In Progress → Fix Committed
JimPGlenn (jpglenn09) wrote :

I am clearing this.

Changed in karl3:
status: Fix Committed → Fix Released
JimPGlenn (jpglenn09)
tags: added: r3.86