Bug:EntryResource:linkBranch timeouts due to branch scanner transaction length

Bug #816235 reported by Robert Collins
This bug affects 2 people
Affects: Launchpad itself
Status: Fix Released
Importance: Critical
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

  1 INSERT INTO BugBranch (branch, bug, datecreated, revision_hint, registrant) VALUES ($INT, $INT, CURRENT_TIMESTAMP AT TIME ZONE $STRING, None, $INT) RETURNING BugBranch.id:
   POST: 1 Robots: 0 Local: 1
      1 https://bugs.launchpad.net/api/devel/bugs/721166 (Bug:EntryResource:linkBranch)
       OOPS-2032F24

32. 128 8970ms SQL-launchpad-main-master
INSERT INTO BugBranch (branch, bug, datecreated, revision_hint, registrant)
VALUES (501845, 721166, CURRENT_TIMESTAMP AT TIME ZONE 'UTC', NONE, 67034) RETURNING BugBranch.id

description: updated
Revision history for this message
Robert Collins (lifeless) wrote :

This behaviour - a trivial insert that takes a very long time - will occur when two transactions concurrently attempt inserts that collide on the same unique constraint.

e.g. in launchpad_dev, do this in two sessions:
begin;
INSERT INTO BugBranch (branch, bug, datecreated, revision_hint, registrant) VALUES (76, 13, CURRENT_TIMESTAMP AT TIME ZONE 'UTC', NULL, 2) RETURNING BugBranch.id;

Observe that the second one will block until the first transaction completes, at which point it will either raise a duplicate key error (if the first committed) or complete itself (if the first rolled back).
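
While the second session is blocked, a third session can check who is waiting on whom. This is a minimal sketch, assuming the wait shows up in pg_locks as an ungranted lock on the other transaction's ID, which is how PostgreSQL normally represents waiting for another transaction to end. The column aliases are made up for illustration.

```sql
-- Run in a third session while the second INSERT is hanging.
-- Each row pairs a backend stuck waiting for a transaction to end
-- with the backend that owns that transaction.
SELECT waiter.pid           AS waiting_pid,
       holder.pid           AS blocking_pid,
       waiter.transactionid AS blocking_xid
FROM pg_locks AS waiter
JOIN pg_locks AS holder
  ON holder.locktype = 'transactionid'
 AND holder.transactionid = waiter.transactionid
 AND holder.granted
WHERE waiter.locktype = 'transactionid'
  AND NOT waiter.granted;
```

If the theory holds, the blocked web request should appear as waiting_pid and the session that inserted first as blocking_pid.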

What's odd though, if that is what's happening here, is that the first page reaching the insert must have been ultra-slow (presumably for some other reason).

We probably want some log analysis to confirm this theory.

Revision history for this message
Robert Collins (lifeless) wrote :

Ah, but the branch scanner will also insert bugbranch links, which would explain the > 9 second interval leading up to the commit.

So the answer here is that the scanner needs to be faster - < 1 second between the first update/insert and the commit in all its transactions.
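
A rough way to look for offenders against that target, sketched under assumptions: it relies on the xact_start column of pg_stat_activity, and it measures time since the transaction began rather than since its first update/insert, so it will over-report transactions that spend a long time reading before they write.

```sql
-- List sessions whose current transaction has been open for more than
-- one second, oldest first. The threshold mirrors the target above.
SELECT datname,
       usename,
       xact_start,
       now() - xact_start AS xact_age
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
  AND now() - xact_start > interval '1 second'
ORDER BY xact_start;
```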

summary: - Bug:EntryResource:linkBranch timeouts
+ Bug:EntryResource:linkBranch timeouts due to branch scanner transaction length
Revision history for this message
Robert Collins (lifeless) wrote :

The branch/bug being linked that triggered this case was a Launchpad branch named after the bug - I think it's likely that a rev in it does indeed link to the bug.

Revision history for this message
Robert Collins (lifeless) wrote :

Another blocking case:
backend does:
begin;
update branch set last_scanned = CURRENT_TIMESTAMP AT TIME ZONE 'UTC' where id=76;

web ui does:
begin;
INSERT INTO BugBranch (branch, bug, datecreated, revision_hint, registrant) VALUES (76, 13, CURRENT_TIMESTAMP AT TIME ZONE 'UTC', NULL, 2) RETURNING BugBranch.id;

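This will also block, because the insert's foreign-key check has to lock the branch row that the backend has just updated, and it doesn't need any bzr rev metadata to trigger.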

William Grant (wgrant)
tags: added: branch-scanner
Revision history for this message
William Grant (wgrant) wrote :

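Fixed by FOR KEY SHARE in PostgreSQL 9.3: foreign-key checks now take this weaker row lock, which does not conflict with updates that leave key columns untouched.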
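
The lock-mode difference can be checked by hand on PostgreSQL 9.3 or later. This is only a sketch: it reuses branch id 76 from the launchpad_dev examples above, assumes last_scanned is not part of any unique index on branch, and uses explicit row locks in session B as a stand-in for what the foreign-key check does internally (FOR SHARE before 9.3, FOR KEY SHARE from 9.3 on).

```sql
-- Session A (stand-in for the scanner): update a non-key column and
-- leave the transaction open.
BEGIN;
UPDATE branch SET last_scanned = CURRENT_TIMESTAMP AT TIME ZONE 'UTC' WHERE id = 76;

-- Session B: the weaker mode used by foreign-key checks since 9.3.
-- This should return immediately even though A is still open.
BEGIN;
SELECT 1 FROM branch WHERE id = 76 FOR KEY SHARE;
ROLLBACK;

-- Session B: the mode used by foreign-key checks before 9.3.
-- This should block until session A commits or rolls back.
BEGIN;
SELECT 1 FROM branch WHERE id = 76 FOR SHARE;
ROLLBACK;

-- Session A: release the row so session B can finish.
ROLLBACK;
```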

Changed in launchpad:
status: Triaged → Fix Released
This report contains Public information  
Everyone can see this information.
