mirror-branch using too much memory

Bug #382795 reported by Herb McNew
This bug affects 1 person
Affects           Status        Importance  Assigned to           Milestone
Launchpad itself  Fix Released  Critical    Michael Hudson-Doyle

Bug Description

I assume this only affects large branches, since it seems to be related to a mysql branch, but mirror-branch is using too much memory, causing the codehosting server to go into swap and eventually OOM.

Example ps output:

codehost 15480 86.6 56.8 6123612 4659576 ? R 15:55 5:29 \_ /usr/bin/python2.4 /srv/bazaar.launchpad.net/production/launchpad-rev-8105/cronscripts/../scripts/mirror-branch.py lp-hosted:///~scut-tang/mysql-server/mysql-6.0-infoschema lp-mirrored:///~scut-tang/mysql-server/mysql-6.0-infoschema 49399 ~scut-tang/mysql-server/mysql-6.0-infoschema HOSTED 1 /~mysql/mysql-server/mysql-5.1

As you can see it's > 4.5GB resident.
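
For readers checking the figure, here is a minimal sketch that pulls the memory columns out of a line like the one above. It assumes the line follows the usual `ps aux` column order (USER, PID, %CPU, %MEM, VSZ, RSS, ...) with VSZ and RSS in KiB; the header row is not shown in the report, so that layout is an assumption. The function name and the shortened sample line (command tail omitted) are illustrative only.

```python
def parse_ps_memory(line):
    """Extract the leading numeric columns from one `ps aux`-style line.

    Assumes the column order USER PID %CPU %MEM VSZ RSS (an assumption;
    the report does not include the ps header row).
    """
    fields = line.split()
    return {
        "user": fields[0],
        "pid": int(fields[1]),
        "cpu_pct": float(fields[2]),
        "mem_pct": float(fields[3]),
        "vsz_kib": int(fields[4]),
        "rss_kib": int(fields[5]),
    }


# Sample line from the report, with the command tail omitted for brevity.
SAMPLE = "codehost 15480 86.6 56.8 6123612 4659576 ? R 15:55 5:29 /usr/bin/python2.4"

info = parse_ps_memory(SAMPLE)
rss_bytes = info["rss_kib"] * 1024
print("RSS: %.2f GB (%.2f GiB)" % (rss_bytes / 1e9, rss_bytes / 2**30))
# prints "RSS: 4.77 GB (4.44 GiB)"
```

Under that assumption the resident size is about 4.77 GB in decimal units, or about 4.44 GiB.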

Tags: lp-code
Herb McNew (herb)
Changed in launchpad:
importance: Undecided → Critical
Ursula Junque (ursinha)
affects: launchpad → launchpad-code
Jonathan Lange (jml)
Changed in launchpad-code:
assignee: nobody → Jonathan Lange (jml)
status: New → In Progress
Revision history for this message
Jonathan Lange (jml) wrote :

Although we haven't identified the root cause of this bug, we have disabled the branch that is triggering it.

Revision history for this message
Jonathan Lange (jml) wrote :

Bug appears to be within Bazaar itself. Building the C extensions avoids the massive memory usage.

Revision history for this message
scut_tang (scut-tang) wrote :

Oh, I am sorry to hear this.
I just pushed the branch to
lp:~scut-tang/mysql-server/mysql-6.0-infoschema
But at https://launchpad.net/~scut-tang/mysql-server/mysql-6.0-infoschema
I see "Recent revisions: This branch has not been pushed to yet."
Then I pushed it again, and
bzr returned "No new revisions to push."

Revision history for this message
Ursula Junque (ursinha) wrote :

I guess the problem scut_tang described is related to OOPS-1250XMLP6: BranchTypeError: ~scut-tang/mysql-server/mysql-6.0-infoschema

Revision history for this message
Jonathan Lange (jml) wrote : Re: [Bug 382795] Re: mirror-branch using too much memory

On Fri, Jun 5, 2009 at 2:26 AM, Ursula Junque
<email address hidden> wrote:
> I guess the problem scut_tang described is related to OOPS-1250XMLP6:
> BranchTypeError: ~scut-tang/mysql-server/mysql-6.0-infoschema
>

We're getting that OOPS because of the somewhat crude way we disabled
mirroring of the branch.

jml

Revision history for this message
Jonathan Lange (jml) wrote :

Latest news:

We rolled out a build of Bazaar that had C extensions and then re-enabled this branch. Memory usage skyrocketed so we killed the mirroring job & disabled it again.

Currently, we're not sure whether we failed to roll Bazaar out correctly or whether I misdiagnosed the issue & C extensions are a red herring. We ran out of chances to get a sysop to investigate for us, so we'll have to look into this again on Monday or Tuesday.

jml

Changed in launchpad-code:
milestone: none → 2.2.6
Revision history for this message
Jonathan Lange (jml) wrote :

We have verified the following on staging and on my laptop:
 - Without C extensions, memory usage skyrockets when running 'bzr branch offending-branch /tmp/wibble'.
 - With C extensions, it reaches about 1.1 GB -- a lot, but manageable.
 - Regardless of whether C extensions are installed, the puller cronscript rapidly consumes all available memory when pulling offending-branch.

To resolve this, we will:
 - Try upgrading to 1.15 and see if the bug has been magically fixed.
 - Land a branch that uses setrlimit to cap the amount of memory used.

Once this is done, we'll be able to downgrade this bug from critical to high and to re-enable the offending branch. Note that the branch will probably still fail to mirror, since we have not addressed the underlying bug.
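
To illustrate the setrlimit idea mentioned above, here is a minimal sketch that runs a command in a child process with a capped address space. It is not the code that landed in Launchpad; the function names and the choice of `RLIMIT_AS` are assumptions made for the example, and it is written for a current Python on Linux rather than the python2.4 shown in the ps output.

```python
import resource
import subprocess
import sys


def cap_address_space(limit_bytes):
    """Set this process's soft RLIMIT_AS, never exceeding the hard limit."""
    _soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    if hard != resource.RLIM_INFINITY:
        limit_bytes = min(limit_bytes, hard)
    resource.setrlimit(resource.RLIMIT_AS, (limit_bytes, hard))


def run_capped(argv, limit_bytes):
    """Run argv in a child whose address space is capped at limit_bytes.

    The limit is applied in the child between fork and exec, so the
    parent process keeps its own limits.
    """
    return subprocess.run(
        argv,
        preexec_fn=lambda: cap_address_space(limit_bytes),
        capture_output=True,
        text=True,
    )


if __name__ == "__main__":
    # Hypothetical usage: ask a child Python what soft limit it sees.
    probe = "import resource; print(resource.getrlimit(resource.RLIMIT_AS)[0])"
    result = run_capped([sys.executable, "-c", probe], 2 * 1024**3)
    print(result.stdout.strip())
```

Note that `RLIMIT_AS` caps virtual address space rather than resident memory, so the cap has to be set somewhat above the resident size you are willing to tolerate. A Python process that hits the cap typically sees a `MemoryError` instead of pushing the host into swap.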

To find the underlying bug, we should probably use py_memory_dump to examine how the memory is being consumed. We might also want to pare the puller down into its bare essentials so as to reproduce the problem with minimal code.
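
As a rough illustration of that kind of investigation, the sketch below counts live objects by type using only the standard library. It is a stand-in, not py_memory_dump, whose interface is not described in this report; the function name is hypothetical, and `gc.get_objects()` only sees objects tracked by the garbage collector, so it gives object counts rather than byte sizes.

```python
import gc
from collections import Counter


def count_live_objects_by_type(top=10):
    """Return the `top` most common type names among gc-tracked objects."""
    counts = Counter(type(obj).__name__ for obj in gc.get_objects())
    return counts.most_common(top)


if __name__ == "__main__":
    # Hypothetical usage: call this before and after the suspect
    # operation and compare which type counts grew.
    for name, count in count_live_objects_by_type():
        print("%8d  %s" % (count, name))
```

Comparing two such snapshots taken around a single pull would show which object types are accumulating, which is a starting point for paring the puller down to a minimal reproduction.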

Revision history for this message
Jonathan Lange (jml) wrote :

Filed bug 385040 to track building C extensions.

Revision history for this message
Jonathan Lange (jml) wrote :

Michael's doing the setrlimit work now, I believe.

Changed in launchpad-code:
assignee: Jonathan Lange (jml) → Michael Hudson (mwhudson)
Revision history for this message
Michael Hudson-Doyle (mwhudson) wrote :

r8582 added some code to limit the amount of memory the puller can use.

We still don't know what's special about the branch that was causing problems, but a band-aid is enough for this cycle -- and a good idea anyway.

Changed in launchpad-code:
status: In Progress → Fix Committed
Revision history for this message
Jonathan Lange (jml) wrote :

Robin, I have been able to push branches to the mysql-server project. Here's what I did.

 1. I fetched a branch of the mysql-server project. 'bzr branch lp:mysql-server'
 2. I made my own branch locally. 'bzr branch mysql-server my-branch'
 3. *Important* I upgraded my own branch to branch format 7. 'bzr upgrade --1.9 my-branch'
 4. You can check this with 'bzr info -v my-branch'; look for the line that says "branch: Branch format 7".
 5. I pushed the branch to mysql-server. 'bzr push lp:~jml/mysql-server/my-branch'. Bazaar's response to this is:

$ bzr push lp:~jml/mysql-server/my-branch
Using default stacking branch /~mysql/mysql-server/mysql-5.1 at lp-140046482387216:///~jml/mysql-server
Source branch format does not support stacking, using format:
  Branch format 7
Created new stacked branch referring to /~mysql/mysql-server/mysql-5.1.

From my home in Sydney, this took about 30 seconds.
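
The check in step 4 can also be scripted. The sketch below scans text captured from 'bzr info -v' for the "branch:" line quoted in that step. The function names are hypothetical, and the only output line it relies on is the one quoted above; the rest of the 'bzr info -v' layout is not assumed.

```python
def branch_format(info_text):
    """Return the value of the 'branch:' line in `bzr info -v` output, or None."""
    for line in info_text.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep and key.strip() == "branch":
            return value.strip()
    return None


def is_branch_format_7(info_text):
    """True if the captured output reports 'Branch format 7'."""
    return branch_format(info_text) == "Branch format 7"


if __name__ == "__main__":
    # Only the 'branch:' line is taken from the comment above.
    sample = "      branch: Branch format 7\n"
    print(is_branch_format_7(sample))
```

To use it, capture the output of 'bzr info -v my-branch' (for example by redirecting it to a file) and pass the text to `is_branch_format_7` before pushing.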

I'm sorry that the format / stacking interaction is so subtle. I've raised the issue with the Bazaar developers, and am very confident that the 2.0 release will address these issues.

I hope this helps. Let me know if you have any troubles.

jml

Jonathan Lange (jml)
Changed in launchpad-code:
status: Fix Committed → Fix Released