Deadlock between UI and importer

Bug #311175 reported by Jeroen T. Vermeulen
Affects: Launchpad itself
Status: Fix Released
Importance: Medium
Assigned to: Jeroen T. Vermeulen

Bug Description

On 2008-12-23 we observed a deadlock between the importer and the translation UI, leading to an error from the import script and two timeout reports from the app servers. (The affected UI user reported retrying once when the server took too long to respond.)

The deadlock occurred at the end of a template import, when the importer was recomputing statistics for the template's translations. The UI user was editing one of these translations at the time. The deadlock and timeouts were all reported with the same timestamp, which was apparently the moment the database detected and broke the deadlock.

Deadlock traceback: https://pastebin.canonical.com/12314/

Recomputing the statistics for all of a template's translations does not take much time, but it covers a lot of data. We can reduce the risk by moving this work into a separate transaction, or even one transaction per translation.
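
A minimal sketch of the "one transaction per translation" option, not the fix that actually landed. The `pofile` and `message` tables, their columns, and all function names are hypothetical stand-ins, and sqlite3 is used only to keep the example self-contained; Launchpad's real schema and database layer differ.

```python
import sqlite3


def make_demo_db():
    """Build a tiny in-memory database (hypothetical schema, demo data only)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE pofile (
            id INTEGER PRIMARY KEY,
            template INTEGER,
            translated_count INTEGER DEFAULT 0
        );
        CREATE TABLE message (
            id INTEGER PRIMARY KEY,
            pofile INTEGER,
            text TEXT
        );
        INSERT INTO pofile (id, template) VALUES (1, 10), (2, 10), (3, 20);
        INSERT INTO message (pofile, text) VALUES
            (1, 'hallo'), (1, NULL), (2, 'hoi'), (2, 'dag'), (3, 'x');
    """)
    return conn


def recompute_stats(conn, pofile_id):
    """Recount the translated messages of a single translation."""
    (translated,) = conn.execute(
        "SELECT COUNT(*) FROM message WHERE pofile = ? AND text IS NOT NULL",
        (pofile_id,),
    ).fetchone()
    conn.execute(
        "UPDATE pofile SET translated_count = ? WHERE id = ?",
        (translated, pofile_id),
    )


def recompute_template_stats(conn, template_id):
    """Update statistics one translation at a time, committing after each.

    Each commit ends the transaction, so row locks taken by the UPDATE are
    held only for the duration of a single translation's recount.
    """
    pofile_ids = [
        row[0]
        for row in conn.execute(
            # A fixed ordering keeps the update order consistent between runs.
            "SELECT id FROM pofile WHERE template = ? ORDER BY id",
            (template_id,),
        )
    ]
    # Close whatever transaction the import itself was running in first.
    conn.commit()
    for pofile_id in pofile_ids:
        recompute_stats(conn, pofile_id)
        conn.commit()
    return len(pofile_ids)
```

With this shape, a UI transaction editing one translation can only collide with the importer while that particular translation is being recounted, rather than for the whole template.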

Данило Шеган (danilo) wrote :

FWIW, it might be interesting to track whether this behaviour is also the cause of the occasional PO file stats skew in our update-stats cronjob.

Jeroen T. Vermeulen (jtv) wrote :

It's one possible reason; simple concurrency would be another. In theory, concurrent SERIALIZABLE transactions would each run in a near-perfect illusion of being the only transaction currently executing. In practice, that's expensive to achieve, so postgres approximates it. (It's recently been discovered that this approximation doesn't quite meet the standard.)

That means that there may well be some cracks in this case that occasional inconsistencies can sneak through. As long as those are few and small, and the cron job catches them anyway, trying to fix that is probably not worth the effort.

Jeroen T. Vermeulen (jtv) wrote :

It's an easy fix, so might as well deal with it.

Changed in rosetta:
assignee: nobody → jtv
importance: Undecided → Medium
milestone: none → 2.2.2
status: New → In Progress
Changed in rosetta:
status: In Progress → Fix Committed
Jeroen T. Vermeulen (jtv) wrote :

Just happened again: http://launchpadlibrarian.net/22404413/x9AeJEyaFcnj4kzCoDkDtSvzWjE.txt

The fix has landed, but has not been rolled out yet.

Jeroen T. Vermeulen (jtv) wrote :

Happening really frequently now. Fix to roll out next week.

Changed in rosetta:
status: Fix Committed → Fix Released