Launchpad downloads fail with "incorrect size, got X expected Y"

Bug #2025748 reported by Robie Basak
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
ubuntu-dev-tools (Ubuntu)
Fix Released
Medium
Robie Basak
Focal
Fix Released
Critical
Robie Basak
Jammy
Fix Released
Undecided
Robie Basak
Kinetic
Fix Released
Undecided
Robie Basak
Lunar
Fix Released
Undecided
Robie Basak

Bug Description

[Impact]

src:ubuntu-dev-tools 0.193ubuntu4~20.04.2 regressed behaviour in how python3-ubuntutools downloads source packages from Launchpad. This is currently breaking the git-ubuntu importer, causing it to fail to update repositories for some packages, since the git-ubuntu snap is based on Focal and has been rebuilt since. The most recent version of the git-ubuntu snap for amd64 that does not exhibit the problem is r1212.

[Test Plan]

Scenario A:

python3
from ubuntutools.archive import UbuntuSourcePackage
pkg=UbuntuSourcePackage(package='ffc', version='0.7.0-1')
pkg.pull()

Actual results:

File ffc_0.7.0-1.diff.gz incorrect size, got 53464 expected 13021
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3/dist-packages/ubuntutools/archive.py", line 470, in pull
    self._download_file_from_urls(urls, name, int(entry["size"]), dscverify=True)
  File "/usr/lib/python3/dist-packages/ubuntutools/archive.py", line 447, in _download_file_from_urls
    raise DownloadError(f"Failed to download {filename}")
ubuntutools.misc.DownloadError: Failed to download ffc_0.7.0-1.diff.gz

Expected results: no output and no error and ffc_0.7.0-1.diff.gz should be 13021 bytes.

You may also want to add the lines:

import logging
logging.basicConfig()
logging.getLogger().setLevel(logging.DEBUG)

to help with debugging, but it is not necessary to reproduce the problem.

With python3-ubuntutools 0.176ubuntu20.04.1 this works
With python3-ubuntutools 0.193ubuntu4~20.04.2 this fails.
I think 0.193ubuntu4~20.04.1 was never published to focal-updates.

Scenario B:

$ pull-lp-source ffc 0.7.0-1
Found ffc 0.7.0-1 in lucid
Downloading ffc_0.7.0-1.dsc from launchpadlibrarian.net (0.001 MiB)
[=====================================================>]100%
Public key not found, could not verify signature
Downloading ffc_0.7.0.orig.tar.gz from launchpadlibrarian.net (1.551 MiB)
[=====================================================>]100%
Downloading ffc_0.7.0-1.diff.gz from launchpadlibrarian.net (0.012 MiB)
File ffc_0.7.0-1.diff.gz incorrect size, got 53464 expected 13021
Traceback (most recent call last):
  File "/usr/bin/pull-lp-source", line 14, in <module>
    PullPkg.main(distro="ubuntu", pull="source")
  File "/usr/lib/python3/dist-packages/ubuntutools/pullpkg.py", line 111, in main
    cls(*args, **kwargs).pull()
  File "/usr/lib/python3/dist-packages/ubuntutools/pullpkg.py", line 461, in pull
    srcpkg.pull()
  File "/usr/lib/python3/dist-packages/ubuntutools/archive.py", line 470, in pull
    self._download_file_from_urls(urls, name, int(entry["size"]), dscverify=True)
  File "/usr/lib/python3/dist-packages/ubuntutools/archive.py", line 447, in _download_file_from_urls
    raise DownloadError(f"Failed to download {filename}")
ubuntutools.misc.DownloadError: Failed to download ffc_0.7.0-1.diff.gz

Expected results: success and ffc_0.7.0-1.diff.gz should be 13021 bytes.

Scenario C:

Pick a package in an Unapproved queue. In this example I chose r8125 in the Jammy queue.

pull-pkg -D ubuntu --upload-queue --pull source r8125 jammy

This should result in (amongst others) an appropriately named changes file eg. r8125_9.007.01-3ubuntu0.2_source.changes and the file should be uncompressed. For example:

$ xxd r8125_9.007.01-3ubuntu0.2_source.changes|head -1
00000000: 466f 726d 6174 3a20 312e 380a 4461 7465 Format: 1.8.Date

[Where things could go wrong]

We're changing how we download files, so other classes of downloads could regress. To try and mitigate this, the Test Plan tries the opposing known edge cases of .diff.gz and .changes files.

Related branches

Revision history for this message
Robie Basak (racb) wrote :

Also reproduced on Mantic with 0.193ubuntu4.

Changed in ubuntu-dev-tools (Ubuntu):
importance: Critical → Medium
Changed in ubuntu-dev-tools (Ubuntu Focal):
status: New → Triaged
importance: Undecided → Critical
Revision history for this message
Robie Basak (racb) wrote :

It seems that python3-ubuntutools switched from using urllib.request.urlopen() to requests.get() when fetching sources using the pull() method. For the sake of providing a progress bar, both the old code and the new code "streamed" the data by asking for chunks at a time, incrementally saving those to disk, and then later checking the size and checksum. But the switch to requests now inadvertently _decompresses_ files, in the case for example of a .diff.gz. The uncompressed data gets written to a file named something.diff.gz, which of course is now larger than expected.

For validation purposes we require the original unmodified file to be written to disk.

I don't see a way of getting the requests library to handle transfer encoding but not content encoding without using undocumented API. As far as I can tell, the raw attribute would give me neither.

Perhaps "Accept-Encoding: identity" in the request would help? Would that be correct in all cases?

Revision history for this message
Robie Basak (racb) wrote :

Also relevant is https://git.launchpad.net/ubuntu-dev-tools/commit/ubuntutools/misc.py?id=1e2036399e535c939c6e581259e5aa9fdae99f27 which explicitly swapped raw access for the purposes of decoding.

Revision history for this message
Robie Basak (racb) wrote :
Download full text (3.3 KiB)

> I don't see a way of getting the requests library to handle transfer encoding but not content encoding without using undocumented API. As far as I can tell, the raw attribute would give me neither.

Digging into this much deeper, the handling of transfer encoding doesn't seem to be documented anywhere, but requests uses urllib3 which uses http.client on Python 3, and http.client's automatic handling of transfer encoding is passed all the way through. The requests library's response object's raw attribute is a urllib3 response object, and testing urllib3 against a deliberately chunked HTTP response does result in it decoding correctly. For example, the following code worked as expected:

import urllib3
http = urllib3.PoolManager()
resp = http.request("GET", "http://localhost", preload_content=False)
print(resp.read())

In that test I arranged a fake chunked response with the following:

while true; do nc -q1 -l -p 80 < result;done

With the file result having the following contents saved with ff=dos in vim:

--BEGIN--
HTTP/1.1 200
Transfer-Encoding: chunked
Content-Type: text/plain

4
Wiki
7
pedia i
B
n
chunks.
0

--END--

Similarly, with requests directly, the following also worked fine against the same fake chunked response:

import requests
with requests.get('http://localhost', stream=True) as resp:
    resp.raw.read(decode_content=False)

In conclusion I don't need to worry about transfer encoding not being handled correctly when using raw.read() with requests.

Going back to the original issue, it is fixed with a partial revert of 1e20363 to restore the use of raw.read().

The commit message against 1e20363 talks of the changes file ending up compressed on disk, and indeed I re-introduced this behaviour after my revert.

To fix that without regressing the fetch of a diff.gz by "accidentally" decompressing it, I think the correct fix would be to set "Accept-Encoding: identity". Indeed, on testing with wget (using -d), I see that wget sets this by default, so I'm confident it's correct. Setting this using headers={'accept-encoding': 'identity'} fixes that issue, so that for example "pull-pkg -D ubuntu --upload-queue --pull source r8125 jammy" downloads the changes file and writes it uncompressed again (r8125 is currently in the unapproved queue but won't work later as an example).

I think the final fix should therefore look something like:

--- a/ubuntutools/misc.py 2023-05-30 16:47:58.000000000 +0000
+++ b/ubuntutools/misc.py 2023-07-04 18:41:54.919108821 +0000
@@ -348,7 +348,7 @@
     with tempfile.TemporaryDirectory() as tmpdir:
         tmpdst = Path(tmpdir) / "dst"
         try:
- with requests.get(src, stream=True, timeout=60, auth=auth) as fsrc:
+ with requests.get(src, stream=True, timeout=60, auth=auth, headers={'accept-encoding': 'identity'}) as fsrc:
                 with tmpdst.open("wb") as fdst:
                     fsrc.raise_for_status()
                     _download(fsrc, fdst, size, blocksize=blocksize)
@@ -433,7 +433,10 @@

     downloaded = 0
     try:
- for block in fsrc.iter_content(blocksize):
+ while True:
+ block = fsrc.raw.read(blocksize)
+ if not b...

Read more...

Robie Basak (racb)
Changed in ubuntu-dev-tools (Ubuntu):
assignee: nobody → Robie Basak (racb)
Changed in ubuntu-dev-tools (Ubuntu Focal):
assignee: nobody → Robie Basak (racb)
Changed in ubuntu-dev-tools (Ubuntu):
status: Triaged → In Progress
Changed in ubuntu-dev-tools (Ubuntu Focal):
status: Triaged → In Progress
Robie Basak (racb)
description: updated
Robie Basak (racb)
description: updated
Revision history for this message
Robie Basak (racb) wrote :

Fixed in 0.193ubuntu5 in mantic-proposed; awaiting migration.

Changed in ubuntu-dev-tools (Ubuntu Jammy):
status: New → In Progress
Changed in ubuntu-dev-tools (Ubuntu Kinetic):
status: New → In Progress
Changed in ubuntu-dev-tools (Ubuntu Lunar):
status: New → In Progress
Changed in ubuntu-dev-tools (Ubuntu Jammy):
assignee: nobody → Robie Basak (racb)
Changed in ubuntu-dev-tools (Ubuntu Kinetic):
assignee: nobody → Robie Basak (racb)
Changed in ubuntu-dev-tools (Ubuntu Lunar):
assignee: nobody → Robie Basak (racb)
Changed in ubuntu-dev-tools (Ubuntu):
status: In Progress → Fix Committed
Revision history for this message
Andreas Hasenack (ahasenack) wrote : Please test proposed package

Hello Robie, or anyone else affected,

Accepted ubuntu-dev-tools into lunar-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/ubuntu-dev-tools/0.193ubuntu4~23.04.2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-lunar to verification-done-lunar. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-lunar. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in ubuntu-dev-tools (Ubuntu Lunar):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-lunar
Changed in ubuntu-dev-tools (Ubuntu Kinetic):
status: In Progress → Fix Committed
tags: added: verification-needed-kinetic
Revision history for this message
Andreas Hasenack (ahasenack) wrote :

Hello Robie, or anyone else affected,

Accepted ubuntu-dev-tools into kinetic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/ubuntu-dev-tools/0.193ubuntu4~22.10.2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-kinetic to verification-done-kinetic. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-kinetic. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in ubuntu-dev-tools (Ubuntu Jammy):
status: In Progress → Fix Committed
tags: added: verification-needed-jammy
Revision history for this message
Andreas Hasenack (ahasenack) wrote :

Hello Robie, or anyone else affected,

Accepted ubuntu-dev-tools into jammy-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/ubuntu-dev-tools/0.193ubuntu4~22.04.2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-jammy to verification-done-jammy. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-jammy. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in ubuntu-dev-tools (Ubuntu Focal):
status: In Progress → Fix Committed
tags: added: verification-needed-focal
Revision history for this message
Andreas Hasenack (ahasenack) wrote :

Hello Robie, or anyone else affected,

Accepted ubuntu-dev-tools into focal-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/ubuntu-dev-tools/0.193ubuntu4~20.04.3 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-focal to verification-done-focal. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-focal. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package ubuntu-dev-tools - 0.193ubuntu5

---------------
ubuntu-dev-tools (0.193ubuntu5) mantic; urgency=medium

  [ Steve Langasek ]
  * Remove references to deprecated
    http://people.canonical.com/~ubuntu-archive.
  * Remove references to architectures not supported in any active
    Ubuntu release.

  [ Robie Basak ]
  * ubuntutools/misc: swap iter_content for raw stream with "Accept-Encoding:
    identity" to fix .diff.gz downloads (LP: #2025748).

 -- Robie Basak <email address hidden> Wed, 05 Jul 2023 15:32:43 +0100

Changed in ubuntu-dev-tools (Ubuntu):
status: Fix Committed → Fix Released
Revision history for this message
Robie Basak (racb) wrote :

I performed the Test Plan on Lunar, Kinetic, Jammy and Focal. With the current versions the problem reproduced as expected in scenarios A and B, and scenario C worked as expected. Then I upgraded to proposed and scenarios A, B and C all worked as expected.

The current versions were:

Lunar: 0.193ubuntu4~23.04.1
Kinetic: 0.193ubuntu4~22.10.1
Jammy: 0.193ubuntu4~22.04.1
Focal: 0.193ubuntu4~20.04.2

The proposed versions were:

Lunar: 0.193ubuntu4~23.04.2
Kinetic: 0.193ubuntu4~22.10.2
Jammy: 0.193ubuntu4~22.04.2
Focal: 0.193ubuntu4~20.04.3

My notes from reproducing manually are below in case I need to do these steps again:

lxc launch ubuntu:focal rbasak-focal
lxc exec rbasak-focal bash
apt update && apt -y install --no-install-recommends ubuntu-dev-tools
dpkg-query -W ubuntu-dev-tools

0.193ubuntu4~23.04.1
0.193ubuntu4~22.10.1
0.193ubuntu4~22.04.1
0.193ubuntu4~20.04.2

python3
from ubuntutools.archive import UbuntuSourcePackage
pkg=UbuntuSourcePackage(package='ffc', version='0.7.0-1')
pkg.pull()
exit()
stat -c%s ffc_0.7.0-1.diff.gz

rm *

pull-lp-source ffc 0.7.0-1

stat -c%s ffc_0.7.0-1.diff.gz
rm -r ffc*

pull-pkg -D ubuntu --upload-queue --pull source elfutils jammy
xxd elfutils_0.186-1ubuntu0.1_source.changes|head -1
rm -r elf*

# manual on Focal
add-apt-repository -yp proposed
apt -y install -t focal-proposed ubuntu-dev-tools

dpkg-query -W ubuntu-dev-tools
0.193ubuntu4~23.04.2
0.193ubuntu4~22.10.2
0.193ubuntu4~22.04.2
0.193ubuntu4~20.04.3

tags: added: verification-done verification-done-focal verification-done-jammy verification-done-kinetic verification-done-lunar
removed: verification-needed verification-needed-focal verification-needed-jammy verification-needed-kinetic verification-needed-lunar
Revision history for this message
Łukasz Zemczak (sil2100) wrote :

As this is a developer-facing package (regular users unaffected) + changes are limited to package download with the test scope vast enough, I'll consider this package for an early release.

Revision history for this message
Łukasz Zemczak (sil2100) wrote : Update Released

The verification of the Stable Release Update for ubuntu-dev-tools has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package ubuntu-dev-tools - 0.193ubuntu4~23.04.2

---------------
ubuntu-dev-tools (0.193ubuntu4~23.04.2) lunar; urgency=medium

  * ubuntutools/misc.py: swap iter_content for raw stream with
    "Accept-Encoding: identity" to fix .diff.gz downloads (LP: #2025748). This
    fixes a regression caused by the backport of 0.193 that included this bug.

 -- Robie Basak <email address hidden> Thu, 06 Jul 2023 13:01:35 +0100

Changed in ubuntu-dev-tools (Ubuntu Lunar):
status: Fix Committed → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package ubuntu-dev-tools - 0.193ubuntu4~22.10.2

---------------
ubuntu-dev-tools (0.193ubuntu4~22.10.2) kinetic; urgency=medium

  * ubuntutools/misc.py: swap iter_content for raw stream with
    "Accept-Encoding: identity" to fix .diff.gz downloads (LP: #2025748). This
    fixes a regression caused by the backport of 0.193 that included this bug.

 -- Robie Basak <email address hidden> Thu, 06 Jul 2023 13:06:18 +0100

Changed in ubuntu-dev-tools (Ubuntu Kinetic):
status: Fix Committed → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package ubuntu-dev-tools - 0.193ubuntu4~22.04.2

---------------
ubuntu-dev-tools (0.193ubuntu4~22.04.2) jammy; urgency=medium

  * ubuntutools/misc.py: swap iter_content for raw stream with
    "Accept-Encoding: identity" to fix .diff.gz downloads (LP: #2025748). This
    fixes a regression caused by the backport of 0.193 that included this bug.

 -- Robie Basak <email address hidden> Thu, 06 Jul 2023 13:09:21 +0100

Changed in ubuntu-dev-tools (Ubuntu Jammy):
status: Fix Committed → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package ubuntu-dev-tools - 0.193ubuntu4~20.04.3

---------------
ubuntu-dev-tools (0.193ubuntu4~20.04.3) focal; urgency=medium

  * ubuntutools/misc.py: swap iter_content for raw stream with
    "Accept-Encoding: identity" to fix .diff.gz downloads (LP: #2025748). This
    fixes a regression caused by the backport of 0.193 that included this bug.

 -- Robie Basak <email address hidden> Thu, 06 Jul 2023 12:14:00 +0000

Changed in ubuntu-dev-tools (Ubuntu Focal):
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.