apt-get's HTTP pipeline desynchronizes, hilarity ensues

Bug #1413428 reported by Edward Z. Yang
This bug affects 1 person
Affects: apt (Ubuntu)
Status: New
Importance: Undecided
Assigned to: Unassigned

Bug Description

tl;dr: apt-get mishandles servers that answer a Range request with a 404 carrying a body, resulting in a desynchronized HTTP buffer and hilarious bugs.

OK, this is going to be a long one. Where to begin? I was updating my Aptitude packages and noticed that my Dropbox source was not updating correctly:

Err http://linux.dropbox.com utopic/main amd64 Packages
  Bad header line

Silly Dropbox, not checking their package list! I report it to them, and they report back that the URL being fetched seems to be giving back a well-formed HTTP response, and that they can't reproduce the problem. I verify that this is the case. We ponder the problem for a while, clearing caches and permuting the sources.list line, and finally someone suggests running apt-get with -o Debug::Acquire::http=true. I take the log and scroll to the error line:

Answer for: http://linux.dropbox.com/ubuntu/dists/utopic/main/i18n/Translation-en_US.lzma
Package: dropbox
Priority: optional
Section: gnome
Installed-Size: 404
Maintainer: Rian Hunter <email address hidden>
Architecture: amd64
Version: 2.10.0
Replaces: nautilus-dropbox
Provides: nautilus-dropbox
Depends: procps, python-gtk2 (>= 2.12), python (>= 2.5), libatk1.0-0 (>= 1.20.0), libc6 (>= 2.4), libcairo2 (>= 1.6.0), libglib2.0-0 (>= 2.16.0), libgtk2.0-0 (>= 2.12.0), libpango1.0-0 (>= 1.20.1)
Suggests: nautilus (>= 2.16.0), python-gpgme (>= 0.1)
Breaks: nautilus-dropbox
Filename: pool/main/dropbox_2.10.0_amd64.deb
Size: 94296
MD5sum: 39d2f6558a35defbb4e3346c66651da9
SHA1: f68b9e102b96a72f37e79f74ac7030cd881db284
SHA256: 5ddf820c1f2e2b12c7824f9691d09f204c33ec7073736891544b774f7e0a0812
Description: cloud synchronization engine - CLI and Nautilus extension
 Dropbox is a free service that lets you bring your photos, docs, and videos
 anywhere and share them easily.
 .
 This package provides a command-line tool and a Nautilus extension that
 integrates the Dropbox web service with your GNOME Desktop.
Homepage: https://www.dropbox.com/

Err http://linux.dropbox.com utopic/main amd64 Packages
  Bad header line

Well. That *sort* of looks reasonable. But I looked around at some of the other responses in the log, and I realized, "Oh shit, these should be HTTP headers!"

Answer for: http://debian.stanford.edu/ubuntu/dists/utopic/InRelease
HTTP/1.1 404 Not Found
Date: Wed, 21 Jan 2015 22:54:17 GMT
Server: Apache
Vary: Accept-Encoding
Content-Length: 227
Content-Type: text/html; charset=iso-8859-1

So, why, then, does Apt think that the content is the HTTP headers? I was reminded of an old bug I encountered in MediaWiki:

https://issues.apache.org/bugzilla/show_bug.cgi?id=40953
https://bugzilla.mozilla.org/show_bug.cgi?id=363109#c12
https://phabricator.wikimedia.org/T19537

Checking the source, it does seem apt pipelines requests by default, so if it desynchronized in its processing of the HTTP stream, that would be bad news. Seeking back in the log, we see this:

Answer for: http://linux.dropbox.com/ubuntu/dists/utopic/main/binary-amd64/Packages.bz2
HTTP/1.1 404 Not Found
Server: nginx
Date: Wed, 21 Jan 2015 22:54:17 GMT
Content-Type: text/html
Content-Length: 162
Connection: keep-alive
Content-Range: bytes */1142

GET /ubuntu/dists/utopic/main/binary-i386/Packages.bz2 HTTP/1.1
Host: linux.dropbox.com
Cache-Control: max-age=0
Range: bytes=2635-
If-Range: Mon, 29 Dec 2014 22:30:54 GMT
User-Agent: Debian APT-HTTP/1.3 (1.0.9.2ubuntu2)

Answer for: http://linux.dropbox.com/ubuntu/dists/utopic/main/binary-i386/Packages.bz2
<html>
<head><title>404 Not Found</title></head>
<body bgcolor="white">
<center><h1>404 Not Found</h1></center>
<hr><center>nginx</center>
</body>
</html>

Bingo.
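
To make the desynchronization concrete, here is a toy model in Python of a pipelined client reading two responses off one connection. It is not apt's code. The `skip_body_on_content_range` flag is a hypothetical stand-in for whatever makes the client decide that no body follows the 404; the log above only shows that the body was left in the buffer, not why. The response bytes are a shortened version of the nginx answer from the log.

```python
# Toy model of a pipelined HTTP/1.1 client. NOT apt's implementation:
# skip_body_on_content_range is a hypothetical flag used to provoke the desync.

def read_response(buf, skip_body_on_content_range):
    """Parse one response from buf; return (status_line, headers, remaining_bytes)."""
    head, _, rest = buf.partition(b"\r\n\r\n")
    lines = head.split(b"\r\n")
    status_line = lines[0].decode("latin-1")
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(b":")
        headers[name.strip().lower().decode("latin-1")] = value.strip().decode("latin-1")
    body_len = int(headers.get("content-length", "0"))
    # Assumed bug: "Content-Range: bytes */N" is taken to mean "no body follows".
    if skip_body_on_content_range and headers.get("content-range", "").startswith("bytes */"):
        body_len = 0
    # Whatever is not consumed here is what the next response gets parsed from.
    return status_line, headers, rest[body_len:]


BODY = b"<html>\r\n<head><title>404 Not Found</title></head>\r\n</html>\r\n"
RESPONSE = (
    b"HTTP/1.1 404 Not Found\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: " + str(len(BODY)).encode() + b"\r\n"
    b"Content-Range: bytes */1142\r\n"
    b"\r\n" + BODY
)
STREAM = RESPONSE + RESPONSE  # two pipelined answers on one connection


def status_lines(skip_body_on_content_range):
    """Return what the client believes is the status line of each response."""
    seen, buf = [], STREAM
    while buf:
        status, _, buf = read_response(buf, skip_body_on_content_range)
        seen.append(status)
    return seen


if __name__ == "__main__":
    print(status_lines(False))  # body consumed: both answers start with a real status line
    print(status_lines(True))   # body left behind: the second "status line" is HTML
```

With the flag off, each answer begins with `HTTP/1.1 404 Not Found`. With it on, the unread HTML body sits at the front of the buffer, so the next answer appears to start with `<html>`, which is the kind of input that would produce "Bad header line".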

By the way, you won't be able to reproduce the error unless you can induce apt-get to send the Range/If-Range headers to the server. apt-get only sends them if it has some cached partial lists (which, BY THE WAY, are not cleared when you clear your apt cache, WHY?!). I'll attach some files which you can put in /var/lib/apt/lists/partial which, along with adding

deb [arch=amd64,i386] http://linux.dropbox.com/ubuntu utopic main

to your sources list, should cause you to be able to reproduce the error.
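
As a rough illustration of why the partial files matter, the sketch below builds resume headers from a partial file the way a generic resuming client might: the file's size becomes the `Range` offset and its modification time becomes the `If-Range` date. That apt derives its headers exactly this way is an assumption; the request in the log (`Range: bytes=2635-`, `If-Range: Mon, 29 Dec 2014 22:30:54 GMT`) is merely consistent with it. `resume_headers` is a hypothetical helper name.

```python
import os
from email.utils import formatdate


def resume_headers(partial_path):
    """Build Range/If-Range headers for resuming a download into partial_path.

    Assumption: offset comes from the file size, validator from the file mtime.
    """
    st = os.stat(partial_path)
    if st.st_size == 0:
        # Nothing cached yet, so a plain GET with no Range header is enough.
        return {}
    return {
        "Range": f"bytes={st.st_size}-",
        # HTTP dates are always expressed in GMT.
        "If-Range": formatdate(st.st_mtime, usegmt=True),
    }
```

Under this model, an empty or missing partial file means no `Range` header is sent at all, which would explain why the error does not show up on a machine without leftover partial lists.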

For what it's worth, I think the server is also partially to blame; I'm not sure, but 404 doesn't seem like the right code to return here. I'll also attach full HTTP cache logs.

Can forward to upstream on request. (In fact, I'll probably do it anyway.)

ProblemType: Bug
DistroRelease: Ubuntu 14.10
Package: apt 1.0.9.2ubuntu2
ProcVersionSignature: Ubuntu 3.16.0-28.38-generic 3.16.7-ckt1
Uname: Linux 3.16.0-28-generic x86_64
NonfreeKernelModules: openafs
ApportVersion: 2.14.7-0ubuntu8.1
Architecture: amd64
Date: Wed Jan 21 15:27:02 2015
EcryptfsInUse: Yes
InstallationDate: Installed on 2013-11-21 (426 days ago)
InstallationMedia: Ubuntu 13.10 "Saucy Salamander" - Release amd64 (20131016.1)
SourcePackage: apt
UpgradeStatus: Upgraded to utopic on 2014-12-04 (48 days ago)

Edward Z. Yang (ezyang) wrote :

Oh, apparently you also have to set the timestamp on the files to sometime before 29 Dec 2014, because nginx doesn't misbehave unless the If-Range header is also sent.
