crash while committing

Bug #303206 reported by Eric
Affects: Bazaar
Status: Fix Released
Importance: Medium
Assigned to: Unassigned
Milestone: 1.10rc1

Bug Description

While committing a large repository I got the error message:

bzr: ERROR: exceptions.SystemError: NULL result without error in PyObject_Call

Traceback (most recent call last):
  File "/opt/freeware/lib/python2.5/site-packages/bzrlib/commands.py", line 893, in run_bzr_catch_errors
    return run_bzr(argv)
  File "/opt/freeware/lib/python2.5/site-packages/bzrlib/commands.py", line 839, in run_bzr
    ret = run(*run_argv)
  File "/opt/freeware/lib/python2.5/site-packages/bzrlib/commands.py", line 539, in run_argv_aliases
    return self.run(**all_cmd_args)
  File "/opt/freeware/lib/python2.5/site-packages/bzrlib/builtins.py", line 2431, in run
    exclude=safe_relpath_files(tree, exclude))
  File "/opt/freeware/lib/python2.5/site-packages/bzrlib/decorators.py", line 192, in write_locked
    result = unbound(self, *args, **kwargs)
  File "/opt/freeware/lib/python2.5/site-packages/bzrlib/workingtree_4.py", line 237, in commit
    result = WorkingTree3.commit(self, message, revprops, *args, **kwargs)
  File "/opt/freeware/lib/python2.5/site-packages/bzrlib/decorators.py", line 192, in write_locked
    result = unbound(self, *args, **kwargs)
  File "/opt/freeware/lib/python2.5/site-packages/bzrlib/mutabletree.py", line 201, in commit
    *args, **kwargs)
  File "/opt/freeware/lib/python2.5/site-packages/bzrlib/commit.py", line 368, in commit
    self._update_builder_with_changes()
  File "/opt/freeware/lib/python2.5/site-packages/bzrlib/commit.py", line 670, in _update_builder_with_changes
    self._populate_from_inventory()
  File "/opt/freeware/lib/python2.5/site-packages/bzrlib/commit.py", line 808, in _populate_from_inventory
    content_summary)
  File "/opt/freeware/lib/python2.5/site-packages/bzrlib/commit.py", line 852, in _record_entry
    ie, self.parent_invs, path, self.work_tree, content_summary)
  File "/opt/freeware/lib/python2.5/site-packages/bzrlib/repository.py", line 394, in record_entry_contents
    self._add_text_to_weave(ie.file_id, lines, heads, None)
  File "/opt/freeware/lib/python2.5/site-packages/bzrlib/repository.py", line 437, in _add_text_to_weave
    check_content=False)[0:2]
  File "/opt/freeware/lib/python2.5/site-packages/bzrlib/knit.py", line 755, in add_lines
    parent_texts, left_matching_blocks, nostore_sha, random_id)
  File "/opt/freeware/lib/python2.5/site-packages/bzrlib/knit.py", line 813, in _add
    left_matching_blocks)
  File "/opt/freeware/lib/python2.5/site-packages/bzrlib/knit.py", line 1515, in _merge_annotations
    None, old_texts, new_texts)
SystemError: NULL result without error in PyObject_Call

bzr 1.9rc1 on python 2.5.2 (aix5)
arguments: ['/opt/freeware/bin/bzr', 'commit', '-m', '3']
encoding: 'ISO8859-1', fsenc: 'ISO8859-1', lang: 'en_US'
plugins:
  bzrtools /opt/freeware/lib/python2.5/site-packages/bzrlib/plugins/bzrtools [1.9.1]
  launchpad /opt/freeware/lib/python2.5/site-packages/bzrlib/plugins/launchpad [unknown]
*** Bazaar has encountered an internal error.
    Please report a bug at https://bugs.launchpad.net/bzr/+filebug
    including this traceback, and a description of what you
    were doing when the error occurred.

Tags: memory
Andrew Bennetts (spiv) wrote:

This appears to be a bug in the _patiencediff_c module. If you delete _patiencediff_c.so it should work around the bug (at the cost of using the slower pure-python code).

Glancing at the _patiencediff_c.c source, I'm guessing the bug is around line 768:

        self->backpointers = (Py_ssize_t *)malloc(sizeof(Py_ssize_t) * self->bsize * 4);
        if (self->backpointers == NULL) {
            Py_DECREF(self);
            return NULL;
        }

If that malloc fails and returns NULL, this function returns NULL without setting a Python error. I think a PyErr_NoMemory(); call needs to be inserted before the return (or, equivalently, the return replaced with "return PyErr_NoMemory();").
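
For illustration, the corrected block around line 768 might look something like the following sketch (this shows the suggested shape of the change rather than the exact merged patch; PyErr_NoMemory() sets MemoryError and returns NULL, so it can be returned directly from a function that returns a PyObject pointer):

        self->backpointers = (Py_ssize_t *)malloc(sizeof(Py_ssize_t) * self->bsize * 4);
        if (self->backpointers == NULL) {
            Py_DECREF(self);
            /* set MemoryError before bailing out instead of returning a bare NULL */
            return PyErr_NoMemory();
        }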

Eric, if you could try making that change, recompiling, and testing if it fixes your problem, that would be greatly appreciated. (Well, it'll probably "fix" it by making it raise a proper MemoryError rather than a SystemError, but that's a step in the right direction.)

Changed in bzr:
importance: Undecided → Medium
status: New → Confirmed
Andrew Bennetts (spiv) wrote:

My patch to _patiencediff_c has been reviewed and merged to bzr.dev, and should be part of the 1.11 release.

So bzr should at least die with a proper MemoryError now, rather than a SystemError from the Python internals.

The broader issue of dealing with large files still needs to be addressed.

Andrew Bennetts (spiv) wrote:

The fundamental issue with large files may be a duplicate of bug 109114.

John A Meinel (jameinel) wrote:

I'm marking this as Fix Released because Andrew's patch to raise a proper exception has been merged. As he noted, if we are simply running out of memory, that is actually a different bug.

Looking at the code Andrew highlighted, we do:
backpointers = (Py_ssize_t *)malloc(sizeof(Py_ssize_t) * bsize * 4);

'bsize' is the number of lines in the second text, so at this point we expect to malloc 16 * num_lines bytes (4 backpointers of 4 bytes each per line on a 32-bit build). For that to fail, you'd need a very large number of lines shorter than 16 bytes each; otherwise we would probably have run out of memory just reading the file.
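
As a rough back-of-the-envelope check (purely illustrative numbers, assuming a 32-bit build where Py_ssize_t is 4 bytes):

    #include <stdio.h>

    int main(void)
    {
        /* hypothetical file: 50 million lines averaging 12 bytes each */
        size_t num_lines = 50 * 1000 * 1000;
        size_t text_size = num_lines * 12;   /* ~572 MB of text                */
        size_t bp_size   = num_lines * 16;   /* 4 Py_ssize_t backpointers/line */
        printf("text ~%zu MB, backpointers ~%zu MB\n",
               text_size >> 20, bp_size >> 20);
        return 0;
    }

Only when the lines average fewer than 16 bytes does the backpointers array end up bigger than the text itself, which is how this particular malloc can be the first allocation to fail.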

Though I'll also note that, looking through the code, I found a few other places where a malloc() failure doesn't set PyErr_NoMemory (mostly in functions used for testing rather than everyday code):
=== modified file 'bzrlib/_patiencediff_c.c'
--- bzrlib/_patiencediff_c.c 2008-11-29 00:31:13 +0000
+++ bzrlib/_patiencediff_c.c 2008-11-29 16:39:26 +0000
@@ -622,10 +622,14 @@
 
     matches = (struct matching_line *)malloc(sizeof(struct matching_line) * bsize);
-    if (matches == NULL)
+    if (matches == NULL) {
+        PyErr_NoMemory();
         goto error;
+    }
 
     backpointers = (Py_ssize_t *)malloc(sizeof(Py_ssize_t) * bsize * 4);
-    if (backpointers == NULL)
+    if (backpointers == NULL) {
+        PyErr_NoMemory();
         goto error;
+    }
 
     nmatches = unique_lcs(matches, &hashtable, backpointers, a, b, 0, 0, asize, bsize);
@@ -694,10 +698,14 @@
     matches.count = 0;
     matches.matches = (struct matching_block *)malloc(sizeof(struct matching_block) * bsize);
-    if (matches.matches == NULL)
+    if (matches.matches == NULL) {
+        PyErr_NoMemory();
         goto error;
+    }
 
     backpointers = (Py_ssize_t *)malloc(sizeof(Py_ssize_t) * bsize * 4);
-    if (backpointers == NULL)
+    if (backpointers == NULL) {
+        PyErr_NoMemory();
         goto error;
+    }
 
     res = recurse_matches(&matches, &hashtable, backpointers,

Changed in bzr:
milestone: none → 1.10rc1
status: Confirmed → Fix Released
Eric (eric-dangoor) wrote: RE: [Bug 303206] Re: crash while committing

I removed _patiencediff_c.so and that did get rid of the problem - thanks. I need to get our system administrator to try the recompilation with the fix but in the meantime the workaround is fine.

We had other problems with Bazaar due to lack of memory about a week ago, and I managed to fix them by raising the heap size. However, our heap size is now "unlimited" (set by ulimit -d), so I can't raise it any more. Is the problem being caused by Bazaar trying to grab unreasonably large amounts of memory? With your fix in place, will it manage to continue if the malloc fails, or will it simply die in a more organised manner?

Andrew Bennetts (spiv) wrote:

Eric wrote:
[...]
> Is the problem being caused by Bazaar trying to grab unreasonably large
> amounts of memory? With your fix in place, will it manage to continue if
> the malloc fails, or will it simply die in a more organised manner?

With my fix, it will simply die in a more organised manner.

If it fails for you with _patiencediff_c, and works without it, then that's an
interesting bug that we should try to fix. It seems surprising that the C
version would try to use more memory than the Python version, but possibly the C
version was just optimised for speed, not memory.

Eric (eric-dangoor) wrote:

I have discovered exactly what provoked this error. I was trying to do a commit in which a directory had replaced a file of the same name. This had apparently confused Bazaar, so it was not related to the size of our repository.
