disable apt http pipelining in quantal

Bug #996151 reported by Ben Howard
This bug affects 3 people
Affects        Status         Importance   Assigned to   Milestone
apt (Debian)   Fix Released   Unknown
apt (Ubuntu)   Fix Released   Low          Unassigned

Bug Description

Per the UDS session on apt improvements, it has been proposed to disable apt HTTP pipelining

The reasons:
1. HTTP pipelining has issues with certain proxy implementations
2. Some new object stores, like S3, and Google's APT repositories have problems with HTTP pipelining

Running a test shows that disabling apt pipelining makes no perceptible difference; in fact, the runs with pipelining disabled performed slightly better on average, 31.899s versus 32.456s. I tested an "apt-get -y update" with and without apt HTTP pipelining turned on.

For more information on apt-pipelining, here are 2 threads to read:
 http://old.nabble.com/APT-do-not-work-with-Squid-as-a-proxy-because-of-pipelining-default-td28579596.html
 http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=565555

Pipelining on (apt-get -y upgrade):
33.92
31.37
31.64
31.63
33.29
33.08
32.92
32.88
31.73
31.98
32.01
32.96
31.51
32.68
33.25

Pipelining off (apt-get -o Acquire::http::Pipeline-Depth="0" -y upgrade):
31.66
31.59
31.24
31.30
31.29
32.85
32.75
31.50
31.18
32.26
31.43
33.28
31.67
32.45
32.04
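
For reference, the option used in the second set of runs can be made persistent instead of being passed with -o on every invocation, via an apt.conf.d drop-in. A minimal sketch (the file name 99-no-pipelining is an arbitrary choice, not part of any package):

  # Persistently disable apt HTTP pipelining (sketch; any file under
  # /etc/apt/apt.conf.d/ is read by apt).
  echo 'Acquire::http::Pipeline-Depth "0";' | \
    sudo tee /etc/apt/apt.conf.d/99-no-pipelining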

Aditya V (kroq-gar78)
summary: - disable apt http pipelinig in quantal
+ disable apt http pipelining in quantal
Craig Hrabal (mathor)
Changed in apt (Ubuntu):
status: New → Opinion
Scott Moser (smoser)
description: updated
Changed in apt (Debian):
status: Unknown → New
Revision history for this message
Scott Moser (smoser) wrote :

Per the links, at best pipelining is break-even; at worst it actually performs worse or is buggy.

Changed in apt (Ubuntu):
status: Opinion → Confirmed
Revision history for this message
David Kalnischkies (donkult) wrote :

From the links it can only be concluded that an ever-growing number of webservers and proxies prefer to violate a MUST requirement in the HTTP/1.1 spec (RFC 2616, section "8.1.2.2 Pipelining", second sentence: "A server MUST send its responses to those requests in the same order that the requests were received."), nothing else.

The link to the Debian BTS also includes the squid-proxy maintainer's agreement that squid should support a pipelining client, hence a still-open bug report against squid (which is not the same as supporting pipelining! squid doesn't need to pipeline its own requests, it just needs to ensure that its responses are in the correct order). And while I haven't tested it, the internet believes that newer squid versions support pipelining clients (and even pipelining for itself!).

As already said in the UDS session (and after it), it's rather meaningless to perform a test on a single machine with a low-latency network connection. Obviously pipelining only benefits the client if the connection is flaky/high-latency (like e.g. my phone). For the network in between and the server, it may mean handling fewer packets carrying requests, so the server can deal with the requests faster and therefore be done with serving sooner, ready to handle another client. (My preferred real-world example: consider a shopping list. Do you shop and pay for each item individually, or do you pack everything into your cart and pay for it all in one go? Congrats, you pipelined your shopping!)

For the record: the Opera web browser has had it enabled since forever, and Chromium enabled it a bit more than 2 weeks ago in their dev branch (http://codereview.chromium.org/10170044). Both might try harder to work around buggy servers, though (before someone asks: yes, if the server closes the connection, apt falls back to non-pipelining as the spec recommends). Other browsers have it disabled for "the web is a buggy hell" reasons or over concerns about head-of-line blocking (handling request A is slow, e.g. because it needs to be generated dynamically, so that B, e.g. a static file, could already have been sent and be done with if we didn't need to wait for A to finish first…), which doesn't apply to repositories though, so it isn't an argument here. The "solution" in most clients is just to open a few more connections instead of working with one efficiently (real-world analogy: oh, a traffic jam! Let's split our car into two cars to get two cars through the jam in the same amount of time…). Google's SPDY and HTTP/2.0 (will) try to fix all this "the web feels sooo slow!" with multiplexing and (drum roll) pipelining…

So we are back at square one: the web is a buggy mess. Let's just hope that Google will once again force the web (after they fix their own repository to work with their own browser [reductio ad absurdum]) to be more standards-conformant, and disable it by default until then, as I don't have the energy to defend it like previous contributors did (which is the only real conclusion to be taken from the previously mentioned threads) and will just enable it on all my machines.
(And now hands up, who imagined such an outcome after reading the previous four paragraphs? I just needed a reference to point people complaining about the new default to…)


Revision history for this message
Scott Moser (smoser) wrote : Re: [Bug 996151] Re: disable apt http pipelining in quantal

David,
   Thank you for your well written comment above.

On Mon, 14 May 2012, David Kalnischkies wrote:

> As already said in the uds session (and after it) it's rather
> meaningless to perform a test on a single machine with a low-latency
> network connection. Obviously pipelining has only a benefit for the
> client if the connection is flaky/high-latency (like e.g. my phone). For

For some reason this point was lost on me. You're right, the tests we've
done on EC2 against a LAN-connected HTTP server are not useful information.

I've tried a couple tests locally here (cable modem) with:
  sudo rm /var/lib/apt/lists/*;
  time sudo apt-get update -o Acquire::HTTP::Proxy=None \
     -o Acquire::http::Pipeline-Depth=5
and
  sudo rm /var/lib/apt/lists/*;
  time sudo apt-get update -o Acquire::HTTP::Proxy=None \
     -o Acquire::http::Pipeline-Depth=0

That didn't give me really good consistent data, though.
The most annoying issue was that apt would apparently hang on one
connection. I.e., output would hang somewhere like:
  Get:101 http://us.archive.ubuntu.com precise-backports/universe Translation-en [696 B]
  100% [16 Packages 0 B/769 B 0%]

That single hang (an apparent HTTP GET for a small-ish file) derails any
actual data. I saw that with both values of Pipeline-Depth.

> So we are back at square one: the web is a buggy mess. Lets just hope
> that Google will force the web once again (after they fixed there own
> repository to work with their own browser [reductio ad absurdum]) to be
> more standard conform and disable it until then by default as i don't
> have the energy to defend it like previous contributors did (which is
> the only real conclusion to be taken from the previously mentioned
> threads) and will just enable it on all my machines.

Do you have any thoughts on how we could collect enough data to show whether
it is useful for our use case? I would think that my cable modem would
qualify as the target case for pipelining (relatively high-bandwidth and
high-latency).

> (And now hands up, who imagined such an outcome after reading the
> previous four paragraphs? I just needed a reference to point people
> complaining about the new default to…)

I actually agree that it makes sense to have a safer default, and to allow
those interested to enable the more risky option.
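
A rough way to collect a larger sample than the hand-timed runs above would be a small loop around the same commands. This is only a sketch, assuming a stock Ubuntu system with GNU time installed; the run count and pipeline depths are arbitrary:

  # Time repeated 'apt-get update' runs with pipelining off (depth 0) and on
  # (depth 5), clearing the list cache before each run as in the tests above.
  for depth in 0 5; do
    for i in $(seq 1 10); do
      sudo rm -f /var/lib/apt/lists/* 2>/dev/null
      /usr/bin/time -f "depth=$depth run=$i: %e seconds" \
        sudo apt-get update -o Acquire::HTTP::Proxy=None \
          -o Acquire::http::Pipeline-Depth=$depth > /dev/null
    done
  done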

Changed in apt (Ubuntu):
importance: Undecided → Low
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package apt - 0.9.6ubuntu1

---------------
apt (0.9.6ubuntu1) quantal-proposed; urgency=low

  [ Michael Vogt ]
  * merged from Debian, remaining changes:
    - use ubuntu keyring and ubuntu archive keyring in apt-key
    - run update-apt-xapian-index in apt.cron
    - support apt-key net-update and verify keys against master-keyring
    - run apt-key net-update in cron.daily
    - different example sources.list
    - APT::pkgPackageManager::MaxLoopCount set to 5000
    - apport pkgfailure handling
    - ubuntu changelog download handling
    - patch for apt cross-building, see http://bugs.debian.org/666772

  [ Steve Langasek ]
  * Drop upgrade handling for obsolete conffile /etc/apt/apt.conf.d/01ubuntu,
    removed in previous LTS.
  * prepare-release: declare the packages needed as source build deps.

apt (0.9.6) unstable; urgency=low

  [ David Kalnischkies ]
  * apt-pkg/cdrom.cc:
    - fix regression from 0.9.3 which dumped the main configuration
      _config instead of the cdrom settings (Cnf) as identified and
      tested by Milan Kupcevic, thanks! (Closes: #674100)
  * cmdline/apt-get.cc:
    - do not show 'list of broken packages' header if no package
      is broken as it happens e.g. for external resolver errors
    - print URIs for all changelogs in case of --print-uris,
      thanks to Daniel Hartwig for the patch! (Closes: #674897)
    - show 'bzr branch' as 'bzr get' is deprecated (LP: #1011032)
    - check build-dep candidate if install is forbidden
  * debian/apt-utils.links:
    - the internal resolver 'apt' is now directly installed in
      /usr/lib/apt/solvers, so don't instruct dh to create a broken link
  * doc/apt-verbatim.ent:
    - APT doesn't belong to the product 'Linux', so use 'APT' instead
      as after all APT is a big suite of applications
  * doc/examples/sources.list:
    - use the codename instead of 'stable' in the examples sources.list
      as we do in the manpage and as the debian-installer does
  * doc/apt-get.8.xml:
    - use apt-utils as package example instead of libc6
  * apt-pkg/contrib/cmdline.cc:
    - apply patch from Daniel Hartwig to fix a segfault in case
      the LongOpt is empty (Closes: #676331)
    - fix segfault with empty LongOpt in --no-* branch
  * ftparchive/apt-ftparchive.cc:
    - default to putting the Contents-* files below $(SECTION) as apt-file
      expects them there - thanks Martin-Éric Racine! (Closes: #675827)
  * apt-pkg/deb/deblistparser.cc:
    - set pkgCacheGen::Essential to "all" again (Closes: #675449)
  * apt-pkg/algorithms.cc:
    - force install only for one essential package out of a group
  * apt-pkg/aptconfiguration.cc:
    - if APT::Languages=none save "none" in allCodes so that the detected
      configuration is cached as intended (Closes: #674690, LP: #1004947)
  * apt-pkg/cacheiterators.h:
    - add an IsMultiArchImplicit() method for Dep- and PrvIterator

  [ Justin B Rye ]
  * doc/apt-cdrom.8.xml:
    - replace CDROM with the proper CD-ROM in text
    - correct disc vs. disk issues
  * doc/apt-extracttemplates.1.xml:
    - debconf is not DebConf
  * doc/apt-get.8.xml:
    - move dselect-upgrade below dist-upgrade
    - rev...

Changed in apt (Ubuntu):
status: Confirmed → Fix Released
Revision history for this message
Sandip Bhattacharya (sandipb) wrote :

Since it is just a config change, can this fix be enabled in Precise, since it is an LTS after all? We are trying to install Precise in our org, and our squid proxies are preventing us from using the latest LTS because of this bug.

If, however, there is a way to disable pipelining in a preseed file, that would work as a workaround as well.
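
One untested way to get that from a preseed, for what it is worth, is a late_command that writes an apt.conf.d drop-in into the installed system. This is a sketch only; the file name is arbitrary:

  # Untested sketch: have the installer drop a config disabling pipelining.
  d-i preseed/late_command string \
    echo 'Acquire::http::Pipeline-Depth "0";' \
      > /target/etc/apt/apt.conf.d/99-no-pipelining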

Revision history for this message
Jason Gunthorpe (jgunthorpe) wrote :

Wow, I'm really surprised you guys have decided to turn pipelining off. That is crazy. Pipelining has been in APT since day one (i.e. since 1997, wow!), and I personally worked with a number of web server developers to make sure their servers worked properly, according to the RFC.

Squid has *always* had varying levels of breakage when working with pipelining, but I also extensively tested APT's HTTP method with squid and ensured it worked for many years. It looks to me like someone must have tried to 'improve' things in squid (probably tried to support HTTP/1.1 keep-alive) and broke it even more..

The thought that pipelining is inherently broken is ridiculous. The behaviour of requesters, proxies and completers is very well defined, and if you follow the damn spec you don't create any problems: security, correctness, or otherwise.

And yes, it makes a huge, obvious, night-and-day difference:

$ time sudo apt-get update -o Acquire::http::Pipeline-Depth=10
real 0m9.090s
$ time sudo apt-get update -o Acquire::http::Pipeline-Depth=0
real 0m19.700s

A much better suggestion would be to detect a proxy during install (most proxies add headers to their reply) and drop a pipeline depth config into /etc/apt/apt.conf.d/ ... Or perhaps not pipeline the first request and look for a proxy in the reply, then turn it on.

It is completely mind-blowing that squid has been broken since 1997, even to the extent that the breakage created a whole new class of proxy vulnerabilities (request smuggling), and nobody has fixed it, or even really cared to notice.
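
Purely as an illustration of the detection idea above (not anything apt or the installer actually does; the mirror URL and file name are placeholders), the check could look roughly like:

  # A conforming proxy adds a Via: header to responses it forwards, so probe
  # the archive once and only disable pipelining when a proxy is in the path.
  if curl -sI http://us.archive.ubuntu.com/ubuntu/ | grep -qi '^Via:'; then
    echo 'Acquire::http::Pipeline-Depth "0";' | \
      sudo tee /etc/apt/apt.conf.d/99-proxy-no-pipelining
  fi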

Revision history for this message
David Kalnischkies (donkult) wrote :

> A much better suggestion would be to detect a proxy during install (most proxies add headers to their reply) and drop a pipeline depth config into /etc/apt/apt.conf.d/ ... Or perhaps not pipeline the first request and look for a proxy in the reply, then turn it on.

It's a bit unfair to disable it for all proxies (and as said, I am not even sure squid is still fully buggy). After all, S3 and the other "dumb storage" entities which triggered this are not proxies, so we would gain nothing, and they seem to be in no way interested in following a standard because they are "the cloud" and "the cloud rulez": "Hey, there is a + in the URI, that must be a space!" …

Looking just at the first response from the server is not going to work as we are frequently lucky enough to get the expected response just by chance. I can find frequent reports that reducing the Pipeline-Depth to 2 makes it work "all the time" (which is not too surprising, but still fails once in a blue moon).

I guess our only option is to check the hashsum of the file in the methods already (we calculate them there, so we "just" have to tell them what to expect), so that on a mismatch we can see if another file we requested in the pipeline matches the hashsum and if so do a data shuffle and disable pipelining for this server in this run (in the hope that it will be fixed soon - or in fear that a different one will answer our request next time).

That is quite a bit of work though, and not only on the code but also on the test front, and it has very ugly corner cases (partial downloads, …), just to work around the problems in "the cloud" … (I should just have grown some balls and told the world "<censored>!! Fix Pipelining <censored>!" I guess, but I am so weak, I even censor myself …)
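
A toy rendering of that recovery idea, purely for illustration (apt's methods are C++; the file names and expected hashes here are placeholders taken to come from already-verified index metadata):

  # If the two payloads received over one pipeline each match the other's
  # expected hash, the server answered out of order: swap them and stop
  # pipelining to this host for the rest of the run.
  sum_a=$(sha256sum file_a | cut -d' ' -f1)
  sum_b=$(sha256sum file_b | cut -d' ' -f1)
  if [ "$sum_a" = "$expected_b" ] && [ "$sum_b" = "$expected_a" ]; then
    mv file_a swap.tmp && mv file_b file_a && mv swap.tmp file_b
    # ...and mark pipelining as disabled for this server in this run.
  fi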

Revision history for this message
Scott Moser (smoser) wrote :

@Jason,
  I'm curious as to what sort of link you're on that shows that greatly improved behavior.
  In my (admittedly small) testing, I tested against local apache mirrors, remote mirrors through a proxy...

  I could very well have been missing something obvious, or just being woefully ignorant.
  I admit to being not terribly well educated on this topic, and relied heavily on the advice of Robert Collins (author of squid) to give direction.

Changed in apt (Debian):
status: New → Fix Committed
Changed in apt (Debian):
status: Fix Committed → Fix Released