on MemoryError, log/report memory usage by type
Bug #551391 reported by Martin Pool
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Bazaar | Fix Released | Low | Karl Bielefeldt | |
Bug Description
If bzr aborts with a MemoryError, it might help with debugging if we log to bzr.log and/or to an apport report a summary of allocated objects, available from gc.get_objects. We could just ship a very simple version that prints the count of objects of each type in gc.get_objects(), but that may just tell us there's a lot of strings allocated, and perhaps we'd have to use meliae.
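The "very simple version" could look roughly like the sketch below. It is only an illustration: the helper names summarize_objects_by_type and run_with_memory_report are hypothetical, not part of bzrlib, and the log callable stands in for whatever writes to bzr.log.

```python
import gc
from collections import Counter


def summarize_objects_by_type(limit=10):
    """Return the `limit` most common types among gc-tracked objects."""
    counts = Counter(type(o).__name__ for o in gc.get_objects())
    return counts.most_common(limit)


def run_with_memory_report(func, log):
    """Call func(); on MemoryError, log a per-type summary and re-raise.

    `log` is any callable taking one string (hypothetical stand-in for
    the real logging function).
    """
    try:
        return func()
    except MemoryError:
        for name, count in summarize_objects_by_type():
            log("%8d %s" % (count, name))
        raise
```

The summary only covers objects the garbage collector tracks, so it gives counts of containers and instances rather than a full picture of memory use.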
Related branches
lp:~kbielefe/bzr/551391-log-memory-usage
- Vincent Ladeuil: Needs Information
- Martin Pool: Needs Fixing
- John A Meinel: Needs Fixing
Diff: 106 lines (+41/-1), 4 files modified
NEWS (+4/-0)
bzrlib/help_topics/en/debug-flags.txt (+1/-0)
bzrlib/tests/test_trace.py (+12/-1)
bzrlib/trace.py (+24/-0)
Changed in bzr:
assignee: nobody → Karl Bielefeldt (kbielefe)
status: Confirmed → In Progress
Changed in bzr:
status: In Progress → Fix Committed
Changed in bzr:
status: Fix Committed → Fix Released
milestone: none → 2.3b3
Martin Pool wrote:
> Public bug reported:
>
> If bzr aborts with a MemoryError, it might help with debugging if we log
> to bzr.log and/or to an apport report a summary of allocated objects,
> available from gc.get_objects. We could just ship a very simple version
> that prints the count of objects of each type in gc.get_objects(), but
> that may just tell us there's a lot of strings allocated, and perhaps
> we'd have to use meliae.
>
> ** Affects: bzr
> Importance: Low
> Status: Confirmed
>
>
> ** Tags: apport memory
>
If we are running on Python 2.6 we could also gather some basic stats
using 'sys.getsizeof()', which lets you do something like:

  import gc
  import sys

  info = {}  # type -> [object count, total bytes]
  # sys.getsizeof is new in 2.6; fall back to 0 on older Pythons
  getsizeof = getattr(sys, 'getsizeof', lambda x: 0)
  for o in gc.get_objects():
      x = info.setdefault(type(o), [0, 0])
      x[0] += 1
      x[1] += getsizeof(o)
Though I guess you are starting to grow your memory consumption.
Hopefully the type dict won't get too big (large dicts consume a lot of
memory).
Note, however, that *strings* are not in gc.get_objects() because they
don't hold references to other objects and thus don't participate in
cycles or the garbage collector. As such Meliae had some tricks to walk
some refs to see if it could find more data. In the end, it was both more
efficient and more accurate to spend memory on a set that tracks which
objects have already been found.
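A rough sketch of that "walk the refs and remember what you have seen" idea, using only the stdlib. This is not Meliae's actual code; walk_reachable and string_usage are made-up names for illustration.

```python
import gc
import sys


def walk_reachable(roots):
    """Yield every object reachable from roots, each exactly once."""
    seen = set()            # ids of objects already visited
    pending = list(roots)
    while pending:
        obj = pending.pop()
        if id(obj) in seen:
            continue
        seen.add(id(obj))
        yield obj
        # Follow whatever references the object's type exposes to the gc.
        pending.extend(gc.get_referents(obj))


def string_usage(roots):
    """Return (count, total bytes) of str objects reachable from roots."""
    count = total = 0
    for obj in walk_reachable(roots):
        if isinstance(obj, str):
            count += 1
            total += sys.getsizeof(obj)
    return count, total
```

For example, string_usage(gc.get_objects()) starts from the tracked objects and picks up strings hanging off them. gc.get_referents() only reports what each type's traversal exposes, so the totals are best treated as a lower bound, and the seen set itself costs memory.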
I think with the inclusion of StaticTuple, we broke the trick Meliae was
originally using. Namely:
obj.foo = ST(ST('str1', 'str2'))
At this point, obj is in gc, and references an outer ST, which
references an inner one, but you don't get as far as the actual strings.
If we really wanted to be memory efficient, a bloom filter would
probably get us decent accuracy, costing say 1MB of memory.
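For what it's worth, a bloom filter of that size is not much code. The sketch below is an assumption about how it might be done, not anything that exists in bzr or Meliae: the BloomFilter class, the choice of 4 hash positions, and the use of sha1 with double hashing are all illustrative.

```python
import hashlib
import struct


class BloomFilter(object):
    """Fixed-size approximate set; 1MB of bits by default."""

    def __init__(self, size_bytes=1024 * 1024, num_hashes=4):
        self.num_bits = size_bytes * 8
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bytes)

    def _positions(self, key):
        # Derive two 64-bit values from one sha1 digest and combine them
        # (double hashing) to get num_hashes bit positions.
        digest = hashlib.sha1(str(key).encode("ascii")).digest()
        h1, h2 = struct.unpack(">QQ", digest[:16])
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, key):
        """Insert key; return True if it was (possibly) already present."""
        present = True
        for pos in self._positions(key):
            byte, bit = divmod(pos, 8)
            mask = 1 << bit
            if not self.bits[byte] & mask:
                present = False
                self.bits[byte] |= mask
        return present
```

Used in place of a set of ids, `if seen.add(id(obj)): continue` skips objects already visited. A false positive means an object is wrongly skipped, so the walk can undercount but never counts anything twice.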
So I guess I have to say... far easier to have a few lines like:
  try:
      from meliae import scanner
  except ImportError:
      return
  scanner.dump_all_objects('bzr_memory_reference_dump.json')
John
=:->