Comment 2 for bug 723234

Andrew Bennetts (spiv) wrote : Re: cant pull or check, with big tree

My guess is that the root cause of this is an out-of-memory problem (so it is probably a duplicate of one of the existing reports tagged 'memory'). Does your big tree include any particularly big files?

Whatever is happening, a SystemError raised from the zlib module would be a bug in Python itself. Calling zlib.decompress should never cause that.

Objects/stringobject.c:4271 of Python 2.6.6 is in _PyString_Resize, after this check is tripped:

(!PyString_Check(v) || Py_REFCNT(v) != 1 || newsize < 0 || PyString_CHECK_INTERNED(v))

zlibmodule.c does call _PyString_Resize directly in places, including in the decompress function (PyZlib_decompress). At a glance I'd suspect the newsize < 0 condition is the issue: if the string being decompressed expands to something sufficiently huge, I think that could happen. PyZlib_decompress calls _PyString_Resize to double the size of the buffer it is decompressing into (via << 1) as the zlib library produces more decompressed data, and newsize is a Py_ssize_t, a signed type, so a large enough doubling could wrap around to a negative value.
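To illustrate the suspicion above, here is a rough Python model of the doubling. It is only a sketch under assumptions: it assumes a build where Py_ssize_t is 32 bits wide, it models overflow as two's complement wrap-around (in C, signed overflow is formally undefined), and the 16 KiB starting size is an illustrative guess rather than something taken from the traceback. The helper names are made up for this example.

```python
def to_int32(n):
    """Wrap n the way a signed 32-bit integer typically wraps (two's complement)."""
    n &= 0xFFFFFFFF
    return n - 0x100000000 if n >= 0x80000000 else n


def doublings_until_negative(start_size):
    """Double start_size via << 1 until the 32-bit signed value stops being positive.

    Returns (number_of_doublings, final_value).
    """
    size = start_size
    steps = 0
    while size > 0:
        size = to_int32(size << 1)
        steps += 1
    return steps, size


# Hypothetical initial buffer of 16 KiB (an assumption for illustration).
steps, final = doublings_until_negative(16 * 1024)
print(steps, final)
```

On a 64-bit Py_ssize_t the same wrap-around would need an impossibly large buffer, so if this is the cause I'd expect it to matter mainly on 32-bit builds.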

If that's the case, then basically we need to fix bzr either to extract such large compressed strings in parts rather than all at once, or to avoid compressing such large strings in the first place. There's been some discussion about, and gradual progress towards, these solutions. John would know more about how far off we are.
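As a generic illustration of "in parts rather than all at once" (not bzr's actual code, which isn't shown here), the standard library's zlib.decompressobj can cap how much output each call produces. The function name and the chunk_size default below are hypothetical choices for this sketch.

```python
import zlib


def iter_decompressed(compressed, chunk_size=64 * 1024):
    """Yield decompressed data piece by piece instead of as one huge string.

    chunk_size is an illustrative cap on the output of each decompress call.
    """
    d = zlib.decompressobj()
    data = compressed
    while data:
        # max_length bounds the output; input not yet consumed is kept
        # in d.unconsumed_tail and fed back in on the next iteration.
        chunk = d.decompress(data, chunk_size)
        if chunk:
            yield chunk
        data = d.unconsumed_tail
    # Collect anything still buffered inside the decompressor.
    tail = d.flush()
    if tail:
        yield tail


payload = b"abc" * 1000000
pieces = list(iter_decompressed(zlib.compress(payload), chunk_size=4096))
print(len(pieces), b"".join(pieces) == payload)
```

A caller that writes each piece straight to a file never needs a single buffer the size of the whole decompressed text, which is the point of the first option.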

Out of interest, how big is your "big tree"? How many files, and how big is the biggest file, and what's the average file size?