bzr workingtree interprets inventory xml as ascii incorrectly.

Bug #49224 reported by Robert Collins
2
Affects Status Importance Assigned to Milestone
Bazaar
Fix Released
Medium
Unassigned
bzr (Ubuntu)
Fix Released
Medium
Unassigned

Bug Description

The following patch fixes a severe bug in 0.8 : it will fail when given well formed xml encoded as utf8 in the inventory. It fails because it implicitly upcasts the serialized inventory from an ascii string when writing it to the working tree as utf8 - in actual fact it is already encoded as a utf8 bytestream and no conversion should be performed. Our current xml writer happens to emit ascii-safe inventory xml, but third party xml generators could well emit utf8, and its likely we will want to do so ourselves in future. Accordingly I'd like to put this fix into 0.8, and an update into dapper.

--- bzrlib/workingtree.py 2006-06-07 15:18:15 +0000
+++ bzrlib/workingtree.py 2006-06-10 07:59:07 +0000
@@ -1218,16 +1218,17 @@
                     new_revision, xml)
                 inv.revision_id = new_revision
                 xml = bzrlib.xml5.serializer_v5.write_inventory_to_string(inv)
-
+ assert isinstance(xml, str), 'serialised xml must be bytestring.'
             path = self._basis_inventory_name()
- self._control_files.put_utf8(path, xml)
+ sio = StringIO(xml)
+ self._control_files.put(path, sio)
         except WeaveRevisionNotPresent:
             pass

     def read_basis_inventory(self):
         """Read the cached basis inventory."""
         path = self._basis_inventory_name()
- return self._control_files.get_utf8(path).read()
+ return self._control_files.get(path).read()

     @needs_read_lock
     def read_working_inventory(self):

Changed in bzr:
status: Unconfirmed → Confirmed
Revision history for this message
Martin Pool (mbp) wrote :

+1 on the patch in general, and feel free to merge to 0.8-fixes.

I'm not sure this justifies a new 0.8 release though. Instead we can just say that this format is defined to be in ascii (with entity escapes) rather than utf-8; this has the advantage of not requiring people to upgrade.

Revision history for this message
Robert Collins (lifeless) wrote : Re: [Bug 49224] Re: bzr workingtree interprets inventory xml as ascii incorrectly.

On Sun, 2006-06-11 at 02:32 +0000, Martin Pool wrote:
> +1 on the patch in general, and feel free to merge to 0.8-fixes.
>
> I'm not sure this justifies a new 0.8 release though. Instead we can
> just say that this format is defined to be in ascii (with entity
> escapes) rather than utf-8; this has the advantage of not requiring
> people to upgrade.

Jeff Bailey seemed happy to consider this something to backport into
Dapper. I don't think there is any particular harm in recognising we had
a bug in the stable series and fixing it there : this has the advantage
of allowing us to upgrade our code ;).

(And it really is just a bug in the previous releases - all the other
places we use the xml content of inventories we treat it as a
bytestring.)

Rob

--
GPG key available at: <http://www.robertcollins.net/keys.txt>.

Revision history for this message
Robert Collins (lifeless) wrote :

On Sun, 2006-06-11 at 02:32 +0000, Martin Pool wrote:
> +1 on the patch in general, and feel free to merge to 0.8-fixes.
>
> I'm not sure this justifies a new 0.8 release though. Instead we can
> just say that this format is defined to be in ascii (with entity
> escapes) rather than utf-8; this has the advantage of not requiring
> people to upgrade.

Oh, btw where is the 0.8-fixes branch ? Is it 0.8 ?

Rob

--
GPG key available at: <http://www.robertcollins.net/keys.txt>.

Revision history for this message
Martin Pool (mbp) wrote : Re: [Bug 49224] Re: [Bug 49224] Re: bzr workingtree interprets inventory xml as ascii incorrectly.

On 11 Jun 2006, Robert Collins <email address hidden> wrote:
> On Sun, 2006-06-11 at 02:32 +0000, Martin Pool wrote:
> > +1 on the patch in general, and feel free to merge to 0.8-fixes.
> >
> > I'm not sure this justifies a new 0.8 release though. Instead we can
> > just say that this format is defined to be in ascii (with entity
> > escapes) rather than utf-8; this has the advantage of not requiring
> > people to upgrade.
>
> Oh, btw where is the 0.8-fixes branch ? Is it 0.8 ?

Yes just

  http://bazaar-vcs.org/bzr/bzr.0.8/

We should possibly rename it...

--
Martin

Revision history for this message
John A Meinel (jameinel) wrote :

It doesn't seem like we are backporting this, but the fix exists on the 0.9 branch.

Changed in bzr:
importance: Untriaged → Medium
status: Confirmed → Fix Released
Simon Law (sfllaw)
Changed in bzr:
importance: Untriaged → Medium
status: Unconfirmed → Confirmed
status: Confirmed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.