Using tuned_gzip.GzipFile.readline corrupts data on Python 2.7

Bug #654731 reported by Martin Packman
This bug affects 2 people
Affects: Bazaar
Status: Fix Released
Importance: Medium
Assigned to: Unassigned
Milestone: 2.3b5

Bug Description

On Python 2.7, bzrlib.tests.test_multiparent.TestMultiVersionedFile.test_save_load fails in a mysterious manner.

<http://babune.ladeuil.net:24842/job/selftest-maverick/37/testReport/junit/bzrlib.tests.test_multiparent/TestMultiVersionedFile/test_save_load/>

Traceback (most recent call last):
  File "/home/babune/lib/python/testtools/runtest.py", line 144, in _run_user
    return fn(*args)
  File "/home/babune/lib/python/testtools/testcase.py", line 465, in _run_test_method
    testMethod()
  File "/home/babune/babune/slaves/maverick64.local/workspace/selftest-maverick/bzrlib/tests/test_multiparent.py", line 263, in test_save_load
    self.assertEqual('a\nb\nc\nd', ''.join(newvf.get_line_list(['a'])[0]))
  File "/home/babune/babune/slaves/maverick64.local/workspace/selftest-maverick/bzrlib/multiparent.py", line 512, in get_line_list
    return [self.cache_version(v) for v in version_ids]
  File "/home/babune/babune/slaves/maverick64.local/workspace/selftest-maverick/bzrlib/multiparent.py", line 519, in cache_version
    diff = self.get_diff(version_id)
  File "/home/babune/babune/slaves/maverick64.local/workspace/selftest-maverick/bzrlib/multiparent.py", line 567, in get_diff
    return MultiParent.from_patch(zip_file.read())
  File "/home/babune/babune/slaves/maverick64.local/workspace/selftest-maverick/bzrlib/multiparent.py", line 192, in from_patch
    return cls._from_patch(StringIO(text))
  File "/home/babune/babune/slaves/maverick64.local/workspace/selftest-maverick/bzrlib/multiparent.py", line 211, in _from_patch
    hunks[-1].lines[-1] += '\n'
IndexError: list index out of range

The root cause here is scary: tuned_gzip partially replaces gzip internals, and that replacement clashes with later upstream changes, similar to bug 614476. This can cause all sorts of mayhem:

Python 2.7.0+ (trunk, Oct 1 2010, 18:14:22) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from bzrlib.tuned_gzip import bytes_to_gzip, GzipFile
>>> from StringIO import StringIO
>>> GzipFile(fileobj=StringIO(bytes_to_gzip("a\nb\n"))).readlines()
['a\n', 'b\n']
>>> g = GzipFile(fileobj=StringIO(bytes_to_gzip("a\nb\n")))
>>> g.readline()
'a\n'
>>> g.read()
'\na\n'
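
For comparison, the final read() above should have returned 'b\n', not '\na\n'. The following is a minimal sketch of that invariant, written against the standard library's gzip.GzipFile rather than bzrlib.tuned_gzip, so it shows the expected behaviour and does not reproduce the bug. The helper name readline_then_read is made up for illustration, and the in-memory compression step is only a stand-in for bytes_to_gzip.

```python
import gzip
import io


def readline_then_read(payload):
    """Return (first_line, rest) read back from a gzip stream of payload."""
    # Compress in memory with the standard library. This is a stand-in for
    # bzrlib's bytes_to_gzip, which is deliberately not used here.
    compressed = io.BytesIO()
    with gzip.GzipFile(fileobj=compressed, mode="wb") as writer:
        writer.write(payload)
    compressed.seek(0)

    # Mix readline() and read() on one reader, as in the session above.
    with gzip.GzipFile(fileobj=compressed, mode="rb") as reader:
        first_line = reader.readline()
        rest = reader.read()
    return first_line, rest


first, rest = readline_then_read(b"a\nb\n")
# A correct reader never returns a byte twice, so the two pieces must
# rejoin to exactly the original payload.
assert first + rest == b"a\nb\n"
```

A check of this shape, pointed at tuned_gzip.GzipFile instead, would be one way to catch the duplicated data without going through the multiparent test.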

The revision that broke us is <http://svn.python.org/view?view=rev&revision=77288>, which attempts to improve performance (see <http://bugs.python.org/issue7471>). It introduces a new 'extrastart' member and changes the semantics of the internal buffering.

Tags: python27

Martin Packman (gz)
Changed in bzr:
importance: Undecided → Medium
status: New → Confirmed
Martin Packman (gz)
tags: added: python27
removed: python2.7
Vincent Ladeuil (vila)
Changed in bzr:
status: Confirmed → In Progress
milestone: none → 2.3b5
Vincent Ladeuil (vila)
Changed in bzr:
status: In Progress → Fix Released