bzrlib patch processor has unchecked iteration

Bug #670159 reported by Chris Lewis
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Bazaar
Confirmed
Medium
Unassigned
Breezy
Triaged
Medium
Unassigned

Bug Description

In bzrlib/patches.py, at line 281 is an iter_lines.next() call. If .next() has ended, Python will throw a StopIteration exception. I have a patch file (included below) which does cause this exception to be thrown, which means no hunk is returned.

The quick way to fix this is to wrap the call in a try block:

            try:
                hunk_line = parse_line(iter_lines.next())
                hunk.lines.append(hunk_line)
                if isinstance(hunk_line, (RemoveLine, ContextLine)):
                    orig_size += 1
                if isinstance(hunk_line, (InsertLine, ContextLine)):
                    mod_size += 1
            except StopIteration, e:
                break

Here's the diff that causes the failure:

Index: cvsanaly/trunk/ChangeLog
===================================================================
--- cvsanaly/trunk/ChangeLog (revision 6978)
+++ cvsanaly/trunk/ChangeLog (revision 6979)
@@ -2,6 +2,13 @@

  * pycvsanaly2/DBContentHandler.py:

+ Do not use the current commit_id for links of old paths, but the
+ 'from' commit_id.
+
+2009-02-26 Carlos Garcia Campos <email address hidden>
+
+ * pycvsanaly2/DBContentHandler.py:
+
  Remove the leading '/' when inserting into files cache if the
  original path doesn't start with '/' (cvs).

Revision history for this message
Martin Packman (gz) wrote :

Wrong diff, I take it?

Revision history for this message
Chris Lewis (chris-chris-to) wrote :

It is the diff that will cause the error (no hunk yielded from the iterator) to occur, it is not a diff on the source for the patch. I'm not sure how to use bzr effectively or contribute useful patches, hopefully the attached diff is of use.

Revision history for this message
Aaron Bentley (abentley) wrote :

If next has ended, your patch is malformed. The @@ line describes the number of lines in the original and modified versions of the hunk. In your example, the original version should have 6 lines and the modified version should have 13. Iteration should not end before the promised number of lines have been provided.

Your example appears to have 5 lines in the original version and 12 in the modified version (i.e. the last line is missing). This means it is malformed. This implementation is designed for exact patch parsing-- it transforms the exact original original text into the exact modified text. It does not handle cases where a text other than the original text is provided as an input, or where the patch data is inconsistent. In such cases, I recommend using GNU patch or similar tools.

Changed in bzr:
status: New → Invalid
Revision history for this message
Andrew Bennetts (spiv) wrote :

Aaron: While I agree bzr can't reasonably make use of a malformed patch like this one, it shouldn't crash out with StopIteration. A traceback from the CLi is always a bug.

Chris: What command did you run? Would you mind posting the traceback to this bug? It's not clear from your bug report how you encountered this situation, as bzr itself doesn't handle plain patch files directly. (I can speculate that it might involve corrupted shelves or merge directives, but rather than guess I'd rather be sure I know what your problem is.)

Changed in bzr:
status: Invalid → Incomplete
Revision history for this message
Chris Lewis (chris-chris-to) wrote :

Hi all,
So, to be fair, I've sort of done an end-run on this one. I was looking for a unified diff parser for an independent project, and it just so happened bzr had one (sidenote: it would be cool if some of the awesome stuff in bzrlib was distributed separately!), so I was directly accessing the methods inside. The "bug" is not something I found from running bzr, and may never actually expose itself during normal operation.

Having read Aaron's reply, I went back and looked more closely at what I was doing, and he is right that the patch I fed the library was malformed. It was missing trailing lines, which I had accidentally stripped in my own pre-processing. It's your call whether you want patches.py to be more tolerant or not! I am happy with the "Invalid" resolution, and I'm sorry to have wasted time from user error :(

FWIW, the diff I was using (with the extra not-stripped newlines...) can be called up by running `svn diff -c 6977 https://svn.forge.morfeo-project.org/svn/libresoft-tools`

Revision history for this message
Andrew Bennetts (spiv) wrote :

Thanks for the clarification! In an ideal world we'd define and document what we expect this API to do in the face of broken data (and perhaps for robustness against certain kinds of corrupted merge directives we eventually will), but it's probably not particularly valuable for us to do that without a motivating bug/feature. So I'll move this back to invalid.

Jelmer Vernooij (jelmer)
Changed in bzr:
status: Incomplete → Confirmed
importance: Undecided → Wishlist
Revision history for this message
Martin Pool (mbp) wrote :

I think someone may have hit this in the wild in bug 756228, so I'll bump this up. We could at least give a clean message.

tags: added: error-reporting patch traceback
Changed in bzr:
importance: Wishlist → Medium
Jelmer Vernooij (jelmer)
tags: added: check-for-breezy
Jelmer Vernooij (jelmer)
Changed in brz:
status: New → Triaged
importance: Undecided → Medium
Jelmer Vernooij (jelmer)
tags: removed: check-for-breezy
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.