Python3 version of launchpadlib doesn't properly upload binary attachments to Launchpad

Bug #1729754 reported by Pierre Equoy on 2017-11-03
This bug affects 2 people
Affects                   Status         Importance   Assigned to
wadllib                   Fix Released   High         Colin Watson
python-wadllib (Ubuntu)   Fix Released   High         Colin Watson
  Xenial                  Fix Released   High         Colin Watson
  Bionic                  Fix Released   High         Colin Watson

Bug Description

Tested on 16.04 and 17.10
python-launchpadlib (1.10.5-1 on 17.10)
python3-launchpadlib (1.10.5-1 on 17.10)

While working on porting a utility using launchpadlib from Python2 to Python3, I noticed the binary attachments were becoming unreadable.

I've tried uploading .jpg files, .tgz files, .tar.xz files, they all fail to open properly.

P.S.: When a fix is available, can it be ported to Xenial? We need this for our tools running on devices using Xenial.

[Test Case]
I wrote a little proof of concept that uploads a given binary attachment to a launchpad issue (see attachment). To use it on staging launchpad:

    APPORT_STAGING=1 ./poc.py <bug_number> </path/to/file.bin>

When using the exact same script with python2 (basically, replacing `python3` with `python` in the shebang line), the binary file is properly attached. That's why I think it's a bug with python3-launchpadlib.

[Regression Potential]
The only sensible way to fix this bug was to rewrite how wadllib does MIME-encoding of its form uploads, so it'll be important to test both text and binary uploads.

Launchpad only defines a few methods that use multipart/form-data, so the regression potential is confined to those: bug.addAttachment, distro_arch_series.setChroot, and project_release.add_file.

Pierre Equoy (pieq) wrote :
Pierre Equoy (pieq) wrote :

Attached using python2 launchpadlib

Pierre Equoy (pieq) wrote :

Attached using python3 launchpadlib

Pierre Equoy (pieq) on 2017-11-03
description: updated
Pierre Equoy (pieq) wrote :

Bits of IRC discussion with Will Grant about this:

<ePierre> wgrant, so yeah, my question is: I don't understand where to find the source code for the `addAttachment` method
<ePierre> wgrant, when I grep in launchpadlib source code, I don't see anything other than in testing/launchpad-wadl.xml
<wgrant> ePierre: sorry, was afk for a bit. launchpadlib uses lazr.restfulclient to parse Launchpad's WADL API description, so we can make changes to the API without having to roll out a new launchpadlib everywhere.
<wgrant> ePierre: lazr.restfulclient's NamedOperation.__call__ is probably interesting
<ePierre> wgrant, where can I find lazr.restfulclient's source code?
<wgrant> ePierre: lp:lazr.restfulclient

Pierre Equoy (pieq) wrote :

After some investigation following wgrant's suggestion, I found the following in `wadllib/application.py` RepresentationDefinition.bind():

            if hasattr(outer, "as_bytes"):
                doc = outer.as_bytes()
            else:
                doc = outer.as_string(unixfrom=False)

where `outer` is an instance of `email.mime.multipart.MIMEMultipart`.

Python 3's `email.message.Message` class includes an `as_bytes()` method [1], but Python 2's does not [2].

It looks like the resulting `doc` (which ends up being the `in_representation` variable in lazr.restfulclient's `NamedOperation.__call__`) contains an altered version of the attachment (similar to what can be seen in comment #3 above) when run under Python 3.

So the issue probably comes from `wadllib.application`'s RepresentationDefinition.bind.

[1] https://github.com/python/cpython/blob/3.6/Lib/email/message.py#L166
[2] https://github.com/python/cpython/blob/2.7/Lib/email/message.py#L92

Pierre Equoy (pieq) wrote :

I did some investigation on the attached images as well as on a compressed tarball (a 7 MB .tar.xz archive) uploaded to Launchpad using poc.py with python2 and python3.

First of all, the file sizes differ. The following is from a python3 REPL; kyo2 and kyo3 are the binary data of the images uploaded with the python2 and python3 libs, respectively:

    >>> kyo2 = open('/tmp/kyo2.jpg', 'rb').read()
    >>> kyo3 = open('/tmp/kyo3.jpg', 'rb').read()
    >>> len(kyo2)
    83703
    >>> len(kyo3)
    83702

The difference boils down to how CR bytes are represented; the python3 upload contains none at all:

    >>> kyo2.count(b'\r')
    355
    >>> kyo2.count(b'\n')
    334
    >>> kyo3.count(b'\r')
    0
    >>> kyo3.count(b'\n')
    688

→ 355+334 = 689, one byte more than the number of \n in the image uploaded using python3.
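
These counts are consistent with newline normalization having been applied to the payload somewhere on the Python 3 path: each lone CR becomes LF, and a single CRLF pair collapses into one LF (354 + 1 + 333 = 688). The sketch below reproduces that arithmetic on synthetic data. `normalize_newlines` is a hypothetical helper that only models the observed corruption; it is not code from wadllib or from the email module, and the thread does not confirm that this is the exact mechanism.

```python
def normalize_newlines(data: bytes) -> bytes:
    """Model of the observed corruption: CRLF -> LF, then lone CR -> LF."""
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


# Synthetic payload (not the real image): 354 lone CRs, one CRLF pair and
# 333 lone LFs, i.e. 355 b"\r" and 334 b"\n" like the python2 upload.
original = b"\xfc\r" * 354 + b"\x81\r\n" + b"\xc3\n" * 333
corrupted = normalize_newlines(original)

print(original.count(b"\r"), original.count(b"\n"))    # 355 334
print(corrupted.count(b"\r"), corrupted.count(b"\n"))  # 0 688
print(len(original) - len(corrupted))                  # 1
```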

The "missing byte" seems to appear in byte 41177:

    for i in range(41177-5, 41177+10):
        print('{}\t{}\t{}'.format(i, kyo2[i:i+1], kyo3[i:i+1]))

    41172   b'\xfc'   b'\xfc'
    41173   b'\xc2'   b'\xc2'
    41174   b'\xb5'   b'\xb5'
    41175   b'\x81'   b'\x81'
    41176   b'\r'     b'\n'
    41177   b'\n'     b'\xc3'   ← from this byte, everything is shifted by one byte
    41178   b'\xc3'   b'\xd1'
    41179   b'\xd1'   b'B'
    41180   b'B'      b'\x92'
    41181   b'\x92'   b'r'
    41182   b'r'      b','
    41183   b','      b'<'
    41184   b'<'      b'\xac'
    41185   b'\xac'   b'n'
    41186   b'n'      b'N'
    (...)

The same issue happens with the .tar.xz archive, except that it is shifted in more than one place (probably because it is much bigger than the 83 kB image file).
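
If the corruption is indeed newline normalization (an assumption that fits the byte counts above but is not confirmed in this thread), then every CRLF pair in the original file marks a place where one byte gets dropped. A minimal sketch for listing those places follows; `crlf_offsets` is a hypothetical helper name and the sample bytes are made up for illustration.

```python
import re


def crlf_offsets(data: bytes) -> list:
    """Return the offset of every CR byte that starts a CRLF pair."""
    return [m.start() for m in re.finditer(rb"\r\n", data)]


# Made-up sample with two CRLF pairs and one lone CR.
sample = b"\xb5\x81\r\n\xc3\xd1B\r\x92r\r\n,<"
print(crlf_offsets(sample))  # [2, 10]
```

Run against the original file (for example the bytes read from the .tar.xz before upload), the length of the returned list would be the number of shift points to expect under that assumption.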

Colin Watson (cjwatson) on 2018-07-03
affects: launchpadlib → wadllib
Changed in wadllib:
assignee: nobody → Colin Watson (cjwatson)
importance: Undecided → High
status: New → In Progress
Colin Watson (cjwatson) on 2018-07-20
Changed in wadllib:
status: In Progress → Fix Committed
Colin Watson (cjwatson) wrote :

1.3.3 (2018-07-20)
==================

- Drop support for Python < 2.6.
- Add tox testing support.
- Implement a subset of MIME multipart/form-data encoding locally rather
  than using the standard library's email module, which doesn't have good
  handling of binary parts and corrupts bytes in them that look like line
  endings in various ways depending on the Python version. [bug=1729754]
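
To illustrate the idea behind the last changelog entry, here is a minimal sketch of building a multipart/form-data body by hand, working in bytes from start to finish so that binary parts are never passed through text-oriented line handling. This is not wadllib's actual implementation; `encode_multipart`, its signature, and the choice of headers are assumptions made for the example.

```python
import uuid


def encode_multipart(fields: dict) -> tuple:
    """Encode fields as multipart/form-data, working purely in bytes.

    `fields` maps a field name to str (sent as UTF-8 text) or to bytes
    (sent as application/octet-stream, byte for byte).
    Returns (content_type_header_value, body).
    """
    # A random boundary makes a collision with the payload very unlikely;
    # this sketch does not check for one.
    boundary = uuid.uuid4().hex.encode("ascii")
    lines = []
    for name, value in fields.items():
        lines.append(b"--" + boundary)
        disposition = 'Content-Disposition: form-data; name="%s"' % name
        if isinstance(value, bytes):
            disposition += '; filename="%s"' % name
            lines.append(disposition.encode("utf-8"))
            lines.append(b"Content-Type: application/octet-stream")
        else:
            lines.append(disposition.encode("utf-8"))
            lines.append(b'Content-Type: text/plain; charset="utf-8"')
            value = value.encode("utf-8")
        lines.append(b"")      # blank line separates headers from content
        lines.append(value)    # content is inserted untouched
    lines.append(b"--" + boundary + b"--")
    lines.append(b"")
    body = b"\r\n".join(lines)
    content_type = "multipart/form-data; boundary=" + boundary.decode("ascii")
    return content_type, body


payload = b"\x81\r\n\xc3\r\xd1\n"
content_type, body = encode_multipart({"comment": "test", "data": payload})
print(payload in body)  # True
```

Because the body is assembled with `bytes.join`, CR and LF bytes inside a binary part survive exactly as they were supplied.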

Changed in wadllib:
status: Fix Committed → Fix Released
Colin Watson (cjwatson) wrote :
description: updated
Changed in python-wadllib (Ubuntu):
status: New → Fix Committed
importance: Undecided → High
assignee: nobody → Colin Watson (cjwatson)
Changed in python-wadllib (Ubuntu Xenial):
status: New → In Progress
importance: Undecided → High
assignee: nobody → Colin Watson (cjwatson)
Changed in python-wadllib (Ubuntu Bionic):
status: New → In Progress
importance: Undecided → High
assignee: nobody → Colin Watson (cjwatson)
Colin Watson (cjwatson) wrote :
Colin Watson (cjwatson) wrote :

This fix isn't quite in cosmic yet, but it's in unstable (https://tracker.debian.org/pkg/python-wadllib) and just waiting for auto-sync to do its thing. I'm going to go ahead and upload this now, since I don't expect the SRU team to get round to it last thing on a Friday anyway.

tags: added: patch
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package python-wadllib - 1.3.3-1

---------------
python-wadllib (1.3.3-1) unstable; urgency=medium

  * Team upload.

  [ Ondřej Nový ]
  * Fixed VCS URL (https)
  * d/control: Set Vcs-* to salsa.debian.org
  * d/copyright: Use https protocol in Format field
  * d/control: Remove ancient X-Python-Version field
  * d/control: Remove ancient X-Python3-Version field

  [ Piotr Ożarowski ]
  * Add dh-python to Build-Depends

  [ Colin Watson ]
  * New upstream release:
    - Fix MIME encoding of binary parts (LP: #1729754).

 -- Colin Watson <email address hidden> Fri, 20 Jul 2018 14:18:49 +0100

Changed in python-wadllib (Ubuntu):
status: Fix Committed → Fix Released
Colin Watson (cjwatson) wrote :

This is fixed in cosmic now, so I don't know of anything else blocking this SRU.

Hello Pierre, or anyone else affected,

Accepted python-wadllib into bionic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/python-wadllib/1.3.2-3ubuntu0.18.04.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-bionic to verification-done-bionic. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-bionic. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in python-wadllib (Ubuntu Bionic):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-bionic
Łukasz Zemczak (sil2100) wrote :

Hello Pierre, or anyone else affected,

Accepted python-wadllib into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/python-wadllib/1.3.2-3ubuntu0.16.04.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-xenial to verification-done-xenial. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-xenial. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in python-wadllib (Ubuntu Xenial):
status: In Progress → Fix Committed
tags: added: verification-needed-xenial
Pierre Equoy (pieq) wrote :

I followed the same procedure as in the description, using the same poc.py script.

Prereq:

- a xenial LXC container with -proposed packages enabled
- a bionic LXC container with -proposed packages enabled

Steps:

1. Connect to the Xenial container and install the following packages:

python-launchpadlib
python3-launchpadlib
python-wadllib/xenial-proposed
python3-wadllib/xenial-proposed

2. Make sure the poc.py is configured to use Python3, then:

  a. attach a binary file:
    $ APPORT_STAGING=1 ./poc.py <bug_number> <binary_file.tar.xz>
  b. attach a text file:
    $ APPORT_STAGING=1 ./poc.py <bug_number> <text_file.json>

→ In both cases, the file is attached properly. It can then be downloaded and opened without problem.

3. Repeat step 2 using the poc.py in Python2 configuration

4. Repeat steps 1~3 using a Bionic container

→ In every case, it worked out well for me. I attached a .tar.xz file, a .jpg file (the one attached to this issue) as well as a .json file, and in every case I could download and open the file properly.
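
A stricter check than opening the downloaded file is to compare its digest with that of the original. The sketch below is one possible way to do that and was not part of the verification described above; `sha256_of` and `same_content` are hypothetical helper names, and the paths passed to them would be the local original and the copy downloaded from Launchpad.

```python
import hashlib


def sha256_of(path: str) -> str:
    """Return the hex SHA-256 digest of a file, read in binary mode."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(original_path: str, downloaded_path: str) -> bool:
    """True if both files have identical contents."""
    return sha256_of(original_path) == sha256_of(downloaded_path)
```

A single changed or dropped byte, such as a CRLF pair collapsed to LF, makes `same_content` return False.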

tags: added: verification-done verification-done-bionic verification-done-xenial
removed: verification-needed verification-needed-bionic verification-needed-xenial
Pierre Equoy (pieq) wrote :

Thanks a lot @cjwatson for your patch! :)

The verification of the Stable Release Update for python-wadllib has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package python-wadllib - 1.3.2-3ubuntu0.16.04.1

---------------
python-wadllib (1.3.2-3ubuntu0.16.04.1) xenial; urgency=medium

  * Fix MIME encoding of binary parts (LP: #1729754).

 -- Colin Watson <email address hidden> Fri, 20 Jul 2018 18:09:05 +0100

Changed in python-wadllib (Ubuntu Xenial):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package python-wadllib - 1.3.2-3ubuntu0.18.04.1

---------------
python-wadllib (1.3.2-3ubuntu0.18.04.1) bionic; urgency=medium

  * Fix MIME encoding of binary parts (LP: #1729754).

 -- Colin Watson <email address hidden> Fri, 20 Jul 2018 18:09:05 +0100

Changed in python-wadllib (Ubuntu Bionic):
status: Fix Committed → Fix Released