Big tree causes 'SystemError: ../Objects/stringobject.c' in groupcompress

Bug #723234 reported by seyacat on 2011-02-22
This bug affects 1 person
Affects: Bazaar
Importance: Medium
Assigned to: Unassigned

Bug Description

I added a big tree and, before commit, I can't pull or check:

bzr check
Checking working tree at '/home/sandrade/workspace'.
Checking branch at 'file:///home/sandrade/workspace/'.
Checking repository at 'file:///home/sandrade/workspace/'.
bzr: failed to report crash using apport:
     OSError(13, 'Permiso denegado')
bzr: ERROR: exceptions.SystemError: ../Objects/stringobject.c:4271: bad argument to internal function

Traceback (most recent call last):
  File "/usr/lib/python2.6/dist-packages/bzrlib/commands.py", line 912, in exception_to_return_code
    return the_callable(*args, **kwargs)
  File "/usr/lib/python2.6/dist-packages/bzrlib/commands.py", line 1112, in run_bzr
    ret = run(*run_argv)
  File "/usr/lib/python2.6/dist-packages/bzrlib/commands.py", line 690, in run_argv_aliases
    return self.run(**all_cmd_args)
  File "/usr/lib/python2.6/dist-packages/bzrlib/commands.py", line 705, in run
    return self._operation.run_simple(*args, **kwargs)
  File "/usr/lib/python2.6/dist-packages/bzrlib/cleanup.py", line 135, in run_simple
    self.cleanups, self.func, *args, **kwargs)
  File "/usr/lib/python2.6/dist-packages/bzrlib/cleanup.py", line 165, in _do_with_cleanups
    result = func(*args, **kwargs)
  File "/usr/lib/python2.6/dist-packages/bzrlib/builtins.py", line 3286, in run
    check_dwim(path, verbose, do_branch=branch, do_repo=repo, do_tree=tree)
  File "/usr/lib/python2.6/dist-packages/bzrlib/check.py", line 452, in check_dwim
    check_repo=do_repo)
  File "/usr/lib/python2.6/dist-packages/bzrlib/decorators.py", line 140, in read_locked
    result = unbound(self, *args, **kwargs)
  File "/usr/lib/python2.6/dist-packages/bzrlib/repository.py", line 2776, in check
    check_repo=check_repo)
  File "/usr/lib/python2.6/dist-packages/bzrlib/repository.py", line 2780, in _check
    result.check(callback_refs)
  File "/usr/lib/python2.6/dist-packages/bzrlib/check.py", line 98, in check
    self.repository._check_inventories(self)
  File "/usr/lib/python2.6/dist-packages/bzrlib/repository.py", line 1206, in _check_inventories
    self._do_check_inventories(checker, bar)
  File "/usr/lib/python2.6/dist-packages/bzrlib/repository.py", line 1263, in _do_check_inventories
    checker, last_object, current_keys[(kind,) + record.key])
  File "/usr/lib/python2.6/dist-packages/bzrlib/repository.py", line 1289, in _check_record
    self._check_text(record, checker, item_data)
  File "/usr/lib/python2.6/dist-packages/bzrlib/repository.py", line 1303, in _check_text
    content = record.get_bytes_as('fulltext')
  File "/usr/lib/python2.6/dist-packages/bzrlib/groupcompress.py", line 438, in get_bytes_as
    self._manager._prepare_for_extract()
  File "/usr/lib/python2.6/dist-packages/bzrlib/groupcompress.py", line 538, in _prepare_for_extract
    self._block._ensure_content(self._last_byte)
  File "/usr/lib/python2.6/dist-packages/bzrlib/groupcompress.py", line 151, in _ensure_content
    self._content = zlib.decompress(self._z_content)
SystemError: ../Objects/stringobject.c:4271: bad argument to internal function

bzr 2.2.1 on python 2.6.6 (Linux-2.6.35-25-generic-pae-i686-with-Ubuntu-10.10-maverick)
arguments: ['/usr/bin/bzr', 'check']
encoding: 'UTF-8', fsenc: 'UTF-8', lang: 'es_EC.UTF-8'
plugins:
  bash_completion /usr/lib/python2.6/dist-packages/bzrlib/plugins/bash_completion [2.2.1]
  bzrtools /usr/lib/python2.6/dist-packages/bzrlib/plugins/bzrtools [2.2.0]
  explorer /usr/lib/python2.6/dist-packages/bzrlib/plugins/explorer [1.1.0]
  launchpad /usr/lib/python2.6/dist-packages/bzrlib/plugins/launchpad [2.2.1]
  netrc_credential_store /usr/lib/python2.6/dist-packages/bzrlib/plugins/netrc_credential_store [2.2.1]
  news_merge /usr/lib/python2.6/dist-packages/bzrlib/plugins/news_merge [2.2.1]
  qbzr /usr/lib/python2.6/dist-packages/bzrlib/plugins/qbzr [0.19.1]

*** Bazaar has encountered an internal error. This probably indicates a
    bug in Bazaar. You can help us fix it by filing a bug report at
        https://bugs.launchpad.net/bzr/+filebug
    including this traceback and a description of the problem.

seyacat (seyacat) wrote :

I tested the same big tree with bzr and svn, and I can see bzr is not for big projects.

Andrew Bennetts (spiv) wrote :

My guess is the root cause of this is an out-of-memory problem (so this is probably a duplicate of one of the existing reports tagged 'memory'). Does your big tree include any particularly big files?

Whatever the trigger, a SystemError raised from the zlib module would be a bug in Python itself. Calling zlib.decompress should never cause that.

Objects/stringobject.c:4271 of Python 2.6.6 is in _PyString_Resize, after this check is tripped:

(!PyString_Check(v) || Py_REFCNT(v) != 1 || newsize < 0 || PyString_CHECK_INTERNED(v))

zlibmodule.c does call _PyString_Resize directly in places, including in the decompress function (PyZlib_decompress)... at a glance I'd suspect the newsize < 0 is the issue: if the string being decompressed expands to something sufficiently huge I think that could happen. PyZlib_decompress calls _PyString_Resize to double the size of the buffer (via << 1) it is decompressing into as the zlib library produces more decompressed data, and newsize is a Py_ssize_t, a signed type...
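
To illustrate the suspected overflow, here is a minimal Python sketch. It is not the actual zlibmodule.c code: it assumes a 32-bit signed Py_ssize_t (as on this i686 system) and a hypothetical starting buffer of 16 KiB, and the helper names are made up for illustration. In C, overflowing a signed shift is undefined behaviour, so the wraparound shown here is only what typically happens in practice.

```python
import ctypes


def double_as_int32(size):
    """Double `size` the way a 32-bit signed C integer typically would (size << 1)."""
    # ctypes integer types do no overflow checking, so the value wraps.
    return ctypes.c_int32(size << 1).value


def first_overflow(start):
    """Double repeatedly from `start` until the result is no longer positive.

    Returns (successful_doublings, last_good_size, wrapped_value).
    """
    size = start
    steps = 0
    while True:
        doubled = double_as_int32(size)
        if doubled <= 0:
            return steps, size, doubled
        size = doubled
        steps += 1


if __name__ == "__main__":
    # 16 KiB is an assumed initial buffer size, chosen only for illustration.
    steps, last_good, wrapped = first_overflow(16 * 1024)
    print("doublings before wrap:", steps)
    print("last valid size:", last_good)
    print("next newsize:", wrapped)
```

Under these assumptions the step from a 1 GiB buffer to a 2 GiB buffer is the one that wraps to a negative newsize, which would trip the newsize < 0 part of the check quoted above.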

If that's the case, then basically we need to fix bzr either to extract such large compressed strings in parts rather than all at once, or to avoid compressing such large strings in the first place. There's been some discussion about and gradual progress towards these solutions. John would know more about how far off we are.
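
As a rough sketch of what "extract in parts" could look like, the standard zlib module offers a decompression object whose output per call can be capped. This is not how groupcompress is written; the function name iter_decompressed and the 64 KiB chunk size are assumptions for illustration only.

```python
import zlib


def iter_decompressed(z_content, chunk_size=64 * 1024):
    """Yield decompressed data piece by piece.

    Each decompress() call is capped at `chunk_size` output bytes, so the
    full text never has to sit in one huge buffer.
    """
    decomp = zlib.decompressobj()
    pending = z_content
    while pending:
        chunk = decomp.decompress(pending, chunk_size)
        if chunk:
            yield chunk
        # Input that was not consumed because of the output cap.
        pending = decomp.unconsumed_tail
    tail = decomp.flush()
    if tail:
        yield tail


if __name__ == "__main__":
    original = b"some repetitive content " * 100000
    compressed = zlib.compress(original)
    total = 0
    for piece in iter_decompressed(compressed):
        total += len(piece)
    print(total == len(original))
```

A caller that needs the whole text still has to join the pieces, so this only helps where the consumer can work on a stream, for example when writing a file out to disk.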

Out of interest, how big is your "big tree"? How many files, and how big is the biggest file, and what's the average file size?
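
One rough way to gather those numbers is a small script like the following. It is only a sketch: the function name tree_stats is made up, and skipping the .bzr control directory is an assumption about what should count as part of the tree.

```python
import os


def tree_stats(root):
    """Return file count, biggest file and average file size under `root`."""
    count = 0
    total = 0
    biggest_size = 0
    biggest_path = None
    for dirpath, dirnames, filenames in os.walk(root):
        if ".bzr" in dirnames:
            dirnames.remove(".bzr")  # do not descend into Bazaar's control directory
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.isfile(path):
                continue  # e.g. broken symlinks
            size = os.path.getsize(path)
            count += 1
            total += size
            if size > biggest_size:
                biggest_size = size
                biggest_path = path
    average = total / float(count) if count else 0.0
    return {
        "files": count,
        "biggest_size": biggest_size,
        "biggest_path": biggest_path,
        "average_size": average,
    }


if __name__ == "__main__":
    print(tree_stats("."))
```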

Changed in bzr:
importance: Undecided → Medium
status: New → Confirmed
summary: - cant pull or check, with big tree
+ Big tree causes 'SystemError: ../Objects/stringobject.c' in
+ groupcompress
tags: added: memory

seyacat (seyacat) wrote :

Yes, I put in a couple of huge files, around 200 MB each,
and the average file size is about 300 KB.

The commit went fine, and I had no problems until I tried to push over
sftp. Then I tried a push to a file location, and after that a check.
All of them fail.

Then I initialized a new repository without the huge files. The pull looks
fine, but I waited about an hour and it did not finish.

seyacat (seyacat) wrote :

The repository has about 10K files, most of them images.

Martin Packman (gz) wrote :

Should there be an upstream Python bug filed here as well?
