'leaks' Branch objects

Bug #758652 reported by John A Meinel
This bug affects 4 people
Affects            Status     Importance  Assigned to  Milestone
Bazaar             Confirmed  Low         Unassigned
loggerhead         Triaged    High        Unassigned
loggerhead-breezy  Triaged    High        Unassigned

Bug Description

I think this possibly needs fixes in both bzrlib as well as in Loggerhead.

Basically, we seem to get a lot of stale BzrBranch7 objects lying around in memory, waiting for the garbage collector to run to clear them all out.

After starting a test "bzr serve --http", and then poking at a bunch of pages, memory consumption goes up a fair amount. (Initial memory is 24MB VmRSS / 117MB VmPeak, after browsing just /changes it goes to 35/201MB, after a bunch of browsing it gets to 368/580MB)
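
For anyone wanting to reproduce these numbers, here is a minimal sketch of reading VmRSS and VmPeak from a Linux /proc status file. It is written for Python 3 rather than the Python 2 that bzr used at the time, and the helper names `parse_vm_fields` and `read_own_memory` are made up for illustration; they are not part of bzrlib or loggerhead.

```python
import os


def parse_vm_fields(status_text, fields=("VmRSS", "VmPeak")):
    """Return {field: kilobytes} for the requested lines of a /proc status dump."""
    result = {}
    for line in status_text.splitlines():
        name, _, rest = line.partition(":")
        if name in fields:
            # Lines look like "VmRSS:     24576 kB"; the first token is the size.
            result[name] = int(rest.split()[0])
    return result


def read_own_memory():
    """Read this process's figures; returns {} where /proc is unavailable."""
    path = "/proc/self/status"
    if not os.path.exists(path):
        return {}
    with open(path) as f:
        return parse_vm_fields(f.read())
```

Calling `read_own_memory()` before and after a batch of page requests gives the same kind of before/after comparison as the figures quoted above.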

I wrote this patch:
 lp:~jameinel/loggerhead/meliae-dump

Which lets me hit a "+memdump" URL and get a Meliae dump of the memory. Digging through it shows *5* BzrBranch7 objects, along with their associated CHKInventoryRepository and BTreeIndex caches all fully populated.
Also digging through it shows lots of instancemethod(80765408 80B 3refs 1par) references, which is probably where all of the gc cycles exist.
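
Without Meliae installed, a rough approximation of "how many of these objects are alive" can be had from the standard library alone. The sketch below is not the Meliae dump described above: it only counts gc-tracked objects by type name, and `BranchStub` is a hypothetical stand-in, not a bzrlib class.

```python
import gc
from collections import Counter


class BranchStub:
    """Hypothetical stand-in for a branch object; not bzrlib's BzrBranch7."""


def count_live_instances(type_names):
    """Count gc-tracked objects whose type name is in type_names."""
    counts = Counter()
    for obj in gc.get_objects():
        name = type(obj).__name__
        if name in type_names:
            counts[name] += 1
    return counts
```

In a real server one would pass the names of interest, for example `count_live_instances({"BzrBranch7"})`, and compare the result before and after a `gc.collect()`.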

Off-hand, I'm thinking we should add a Branch.finalize() method, so that loggerhead (and other long-lived apps) can say "I don't need you any more, release all of your references and go away". Ideally it wouldn't be needed in a garbage-collected language like Python, but 300+MB sitting around doing nothing is pretty bad.
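
To make the finalize idea concrete, here is a sketch under stated assumptions. `CachedBranch`, its attributes and its `finalize()` method are all hypothetical; no such method exists in bzrlib, which is the point of this report. The example stores a bound method on the instance (the kind of instancemethod reference seen in the dump), which creates a cycle, and shows that breaking the cycle lets CPython's reference counting free the object without waiting for the cycle collector.

```python
import gc
import weakref


class CachedBranch:
    """Hypothetical stand-in for a branch with caches; not bzrlib's Branch."""

    def __init__(self):
        self._index_cache = {"rev-%d" % i: i for i in range(1000)}
        # A bound method stored on the instance creates a reference cycle:
        # self -> self._on_change -> self
        self._on_change = self._invalidate

    def _invalidate(self):
        self._index_cache.clear()

    def finalize(self):
        """Drop caches and break cycles so refcounting alone frees the object."""
        self._index_cache = None
        self._on_change = None


def freed_without_gc(call_finalize):
    """Return True if the object is freed with the cycle collector turned off."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        branch = CachedBranch()
        ref = weakref.ref(branch)
        if call_finalize:
            branch.finalize()
        del branch
        return ref() is None
    finally:
        if was_enabled:
            gc.enable()
        gc.collect()  # clean up the cycle left behind when finalize was skipped
```

The behaviour relies on CPython's reference counting; other Python implementations may free objects at different times.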

John A Meinel (jameinel) wrote :

Also assigning bzr because I think it is cleanest as a Branch.finalize sort of action.

Changed in bzr:
importance: Undecided → Low
status: New → Confirmed
Martin Packman (gz) wrote :

Have you confirmed these are actually garbage-collectable cycles, for instance by adding `gc.collect()` to your +memdump page and seeing that the branches are no longer present? I'd be surprised if even the Python collector let 300MB stay alive for long. If the objects are unreachable, that needs fixing in bzr. If loggerhead is accidentally stashing them somewhere, however, that needs changing. Adding a finalize may help things for users of bzrlib, but could also cause confusion about whether a Branch object is dead or not.
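
The check suggested here can be sketched with the standard library. The example uses a weak reference to see whether an object survives `gc.collect()`: an unreachable cycle goes away, while an object that something is still holding (the "accidentally stashing" case) does not. `Node` and `_stash` are illustrative names only, not anything from loggerhead.

```python
import gc
import weakref


class Node:
    """Object with a deliberate self-reference, so it forms a cycle."""

    def __init__(self):
        self.me = self


_stash = []  # stands in for a forgotten cache or global list


def survives_collect(stash):
    """Return True if the object is still alive after a full collection."""
    obj = Node()
    ref = weakref.ref(obj)
    if stash:
        _stash.append(obj)  # a live reference the collector must respect
    del obj
    gc.collect()
    return ref() is not None
```

If the branches disappear after `gc.collect()`, they were unreachable cycles; if they remain, something reachable still points at them.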

John A Meinel (jameinel) wrote : Re: [Bug 758652] Re: 'leaks' Branch objects

On 04/13/2011 04:29 PM, Martin [gz] wrote:
> Have you confirmed these are actually garbage collectable cycles, by for
> instance adding `gc.collect()` in you +memdump page and seeing the
> branches no longer present? I'd be surprised if even the Python
> collector let 300MB stay alive for long. If the objects are unreachable,
> that needs fixing in bzr. If loggerhead is accidentally stashing them
> somewhere however, that needs changing. Adding a finalize may help
> things for users of bzrlib, but could also cause confusion about whether
> a Branch object is dead or not.
>

I'm not 100% positive, but when I was testing I did SIGQUIT,
'gc.collect()' and got 20700 back. (Meaning gc found 20,000 items that
it could garbage collect.)

My guess is that the objects lived long enough to make it into the
third-tier, so don't get garbage collected often. (Which isn't very hard
if something actively creates a lot of objects for a while, and then is
done with them. Since creating all those objects triggers the
lightweight gc runs, which moves them into the next tier of gc.)

John
=:->
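
The generation effect guessed at above can be demonstrated in a small sketch, assuming CPython's generational cycle collector. An object that survives one young-generation pass is promoted; if it becomes garbage afterwards, further young-generation passes do not look at it, and only a collection that covers the older generations frees it. `Cyclic` and the function name are made up for illustration.

```python
import gc
import weakref


class Cyclic:
    """Object that refers to itself, so only the cycle collector can free it."""

    def __init__(self):
        self.me = self


def old_garbage_survives_young_pass():
    """Return (alive after young-only pass, alive after full collection)."""
    was_enabled = gc.isenabled()
    gc.disable()  # keep automatic passes from interfering with the demo
    try:
        obj = Cyclic()
        ref = weakref.ref(obj)
        gc.collect(0)  # obj is still referenced, so it survives and is promoted
        del obj        # now unreachable, but no longer in the youngest generation
        gc.collect(0)  # a young-only pass does not examine it
        alive_after_young = ref() is not None
        gc.collect()   # a full collection examines every generation
        alive_after_full = ref() is not None
        return alive_after_young, alive_after_full
    finally:
        if was_enabled:
            gc.enable()
```

The collection thresholds that decide how often each generation is examined can be inspected with `gc.get_threshold()`; the defaults differ between CPython versions.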


Changed in loggerhead:
status: Confirmed → Triaged
importance: Medium → High
tags: added: performance
Jelmer Vernooij (jelmer)
tags: added: check-for-breezy
Jelmer Vernooij (jelmer)
Changed in loggerhead-breezy:
status: New → Triaged
importance: Undecided → High