update-manager times out checking for updates

Bug #1171047 reported by Robin Sheat
14
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Launchpad itself
Fix Released
High
Chris Johnston
apt (Ubuntu)
Fix Released
High
Michael Vogt

Bug Description

I've bought a number of humble bundle games that were activated through the software centre, so I have quite a lot of sources in apt. The server that they come from is so terribly slow to fetch from that it causes the update manager to time out when checking for updates.

See the attached file for the timestamped output from apt-get update, and you'll see that while it's generally slow (multiple seconds between requests once the normal speed ones are dealt with) there is one that takes about 12 minutes to respond at all (13:21:48 to 13:34:12)

This means that I can't update through the ubuntu update manager, nor does it tell me when updates are available.

My internet is quite good, the first lot of sources complete in ~10 seconds, the problem seems to be at the private-ppa.launchpad.net server end. However, this wouldn't be such an issue if the update manager didn't time out (but really the solution is to fix the apt servers.)

ProblemType: Bug
DistroRelease: Ubuntu 13.04
Package: update-manager 1:0.185
ProcVersionSignature: Ubuntu 3.8.0-17.27-generic 3.8.6
Uname: Linux 3.8.0-17-generic x86_64
NonfreeKernelModules: nvidia
ApportVersion: 2.9.2-0ubuntu8
Aptdaemon:

Architecture: amd64
Date: Sun Apr 21 13:39:54 2013
GsettingsChanges:
 b'com.ubuntu.update-manager' b'first-run' b'false'
 b'com.ubuntu.update-manager' b'launch-time' b'1366195146'
 b'com.ubuntu.update-manager' b'show-details' b'true'
 b'com.ubuntu.update-manager' b'window-height' b'470'
 b'com.ubuntu.update-manager' b'window-width' b'583'
InstallationDate: Installed on 2011-04-26 (725 days ago)
InstallationMedia: Ubuntu 10.10 "Maverick Meerkat" - Release amd64 (20101007)
MarkForUpload: True
PackageArchitecture: all
SourcePackage: update-manager
UpgradeStatus: Upgraded to raring on 2013-04-13 (7 days ago)

Related branches

Revision history for this message
Robin Sheat (eythian) wrote :
Changed in update-manager (Ubuntu):
importance: Undecided → High
Revision history for this message
David Ames (thedac) wrote :

Robin,

Are you still having update times in the minutes? Or was that a single event?

I am unable to re-create the failure in testing.

If it is reoccurring can we get an 'mtr --report private-ppa.launchpad.net' from you?

Revision history for this message
Robin Sheat (eythian) wrote :

I haven't noticed a time in the past year-ish that it hasn't happened. When I get the chance I'll do an mtr report.

Revision history for this message
William Grant (wgrant) wrote :

I don't think there's any problem with private-ppa.launchpad.net. I'd say that the problem is rather with apt's HTTPS transport, which renegotiates an SSL connection for every single request, which is particularly a problem since apt won't make multiple concurrent requests to a single host (bug #275994). A tcpdump should confirm that most of the time is spent in SSL handshakes.

Revision history for this message
Robin Sheat (eythian) wrote :

$ mtr --report private-ppa.launchpad.net
HOST: syadasti Loss% Snt Last Avg Best Wrst StDev
  1.|-- 192.168.0.1 0.0% 10 1.0 2.7 0.9 18.6 5.6
  2.|-- ??? 100.0 10 0.0 0.0 0.0 0.0 0.0
  3.|-- lo0.internet.ivpn.pe25.te 0.0% 10 11.5 17.4 10.8 31.7 7.0
  4.|-- ie2-g-0-0-0.telstraclear. 0.0% 10 33.9 28.9 21.3 40.6 6.4
  5.|-- ge-0-2-0-1.xcore1.acld.te 0.0% 10 25.6 30.6 24.7 41.9 5.4
  6.|-- te7-3.ccr01.sjc05.atlas.c 0.0% 10 340.8 180.5 146.4 340.8 58.2
  7.|-- te0-3-0-7.ccr21.sjc01.atl 0.0% 10 144.5 158.3 144.5 213.0 20.0
  8.|-- be2000.ccr21.sjc03.atlas. 0.0% 10 147.5 154.7 147.5 165.9 5.6
  9.|-- level3.sjc03.atlas.cogent 10.0% 10 149.5 154.8 144.4 192.7 14.9
 10.|-- vlan60.csw1.SanJose1.Leve 0.0% 10 288.8 300.0 288.8 317.8 11.2
 11.|-- ae-61-61.ebr1.SanJose1.Le 0.0% 10 324.7 295.7 288.2 324.7 10.7
 12.|-- ae-2-2.ebr2.NewYork1.Leve 0.0% 10 297.1 300.7 286.7 314.9 9.4
 13.|-- ae-62-62.csw1.NewYork1.Le 0.0% 10 289.1 299.4 287.2 325.1 14.0
 14.|-- ae-61-61.ebr1.NewYork1.Le 0.0% 10 305.5 297.8 287.1 316.9 8.1
 15.|-- ae-42-42.ebr2.London1.Lev 0.0% 10 289.1 295.9 288.2 310.7 7.1
 16.|-- ae-56-221.csw2.London1.Le 10.0% 10 293.6 295.5 288.3 312.5 8.1
 17.|-- ??? 100.0 10 0.0 0.0 0.0 0.0 0.0
 18.|-- SOURCE-MANA.edge5.London1 10.0% 10 290.1 299.6 289.2 318.3 9.0
 19.|-- te2-2.tigermelon.canonica 10.0% 10 289.2 295.2 289.2 318.4 8.8
 20.|-- te2-2.honeymelon.canonica 10.0% 10 297.9 299.0 288.1 310.2 8.1
 21.|-- makara.canonical.com 10.0% 10 290.6 301.7 290.6 309.1 6.6
 22.|-- haetae.canonical.com 20.0% 10 316.5 305.0 298.8 316.5 7.3

Revision history for this message
Robin Sheat (eythian) wrote :

The SSL negotiation may be an issue, but I'm not sure it explains the 12 minute gap (unless there's something going on that doesn't cause output, which is feasible.)

It may or may not be related that plain transfers are also very slow. They rarely get above 60kB/s, often dropping below that. This especially sucks when a game is pushing a 2GB update. However I'd be surprised if a single operation was pushing enough data to take that long and notification wasn't occurring.

I'm trying it again, and seeing similar long times:

mei 09 22:48:13 Geraakt https://private-ppa.launchpad.net raring/main amd64 Packages
mei 09 22:48:13 Geraakt https://private-ppa.launchpad.net raring/main i386 Packages
mei 09 22:48:22 Geraakt https://private-ppa.launchpad.net raring/main amd64 Packages
mei 09 22:48:22 Geraakt https://private-ppa.launchpad.net raring/main i386 Packages
mei 09 22:48:31 Geraakt https://private-ppa.launchpad.net raring/main amd64 Packages
mei 09 22:48:31 Geraakt https://private-ppa.launchpad.net raring/main i386 Packages
mei 09 22:48:38 Geraakt https://private-ppa.launchpad.net raring/main amd64 Packages
mei 09 22:48:39 Geraakt https://private-ppa.launchpad.net raring/main i386 Packages

this isn't the really long one though, it hasn't got that far yet. The fact that the times group together surprises me too. I imagine it's one source, but given apt doesn't reuse connections and it's the same host, I'd expect a more even spread.

I could get a pcap of all the traffic to private-ppa if that'd be useful. Obviously it'd be a lot of SSL, but the timestamps and connection requests might be interesting when lined up against a log like above.

Revision history for this message
Robin Sheat (eythian) wrote :

Same gap:

mei 09 22:52:43 Geraakt https://private-ppa.launchpad.net raring/main amd64 Packages
mei 09 22:52:44 Geraakt https://private-ppa.launchpad.net raring/main i386 Packages
mei 09 23:05:03 Genegeerd https://private-ppa.launchpad.net raring/main Translation-nl_NL
mei 09 23:05:07 Genegeerd https://private-ppa.launchpad.net raring/main Translation-nl
mei 09 23:05:08 Genegeerd https://private-ppa.launchpad.net raring/main Translation-en_NZ
mei 09 23:05:10 Genegeerd https://private-ppa.launchpad.net raring/main Translation-en

I did see network traffic from my wifi adapter blinking during this, but that's hardly scientific.

Revision history for this message
William Grant (wgrant) wrote :

Using -o Debug::Acquire::https=true I was able to discover the cause of a big gap: near the end of the update it goes through all the archives and spends several minutes silently trying to grab compressed ddtp translations from all of them. Without debug logging it only logs the final, uncompressed set.

I see it first probing for Translation-en.xz and Translation-en_AU.xz for every archive, followed by the lzma versions, then gz, then eventually the uncompressed variants. All of these 404, causing the SSL connection to be renegotiated (my previous comment was incorrect; it only reconnects on error, but there are so many unlikely probes that it might as well reconnect every time). I have a few Humble Bundles installed, for something like 25 private PPAs. 25 PPAs * 4 compression algorithms * 2 languages == 200 requests just for Translation-*. Since none of those 200 files exist, there's an SSL negotiation for every single one, which is somewhere around a second from Australia. So there's about four minutes there.

Revision history for this message
William Grant (wgrant) wrote :

A normal apt-get update takes 8 minutes here, with 364 HTTPS requests, and 260 HTTPS connections. With Acquire::Languages=none it's down to 50 seconds, 103 HTTPS requests, and 2 HTTPS connections. In a pcap I see ~1250ms for the SSL handshake, then ~420ms for each SSL response, with an ICMP RTT of ~320ms. I'm fairly convinced this is an apt issue, not a server one, as those SSL times are pretty good considering the RTT.

William Grant (wgrant)
affects: update-manager (Ubuntu) → apt (Ubuntu)
Revision history for this message
Brian Murray (brian-murray) wrote :

Would you mind taking a long at this Michael? Thanks!

Changed in apt (Ubuntu):
assignee: nobody → Michael Vogt (mvo)
William Grant (wgrant)
Changed in launchpad:
assignee: nobody → Chris Johnston (cjohnston)
importance: Undecided → High
status: New → In Progress
Revision history for this message
Launchpad QA Bot (lpqabot) wrote :
tags: added: qa-needstesting
Changed in launchpad:
status: In Progress → Fix Committed
William Grant (wgrant)
tags: added: qa-ok
removed: qa-needstesting
Revision history for this message
William Grant (wgrant) wrote :

Launchpad now writes out Translation-en.bz2 for PPAs and includes it in Release, preventing apt from probing. This will gradually roll out to all existing PPAs as new packages are uploaded to them.

Changed in launchpad:
status: Fix Committed → Fix Released
Revision history for this message
Robin Sheat (eythian) wrote :

Will this change help for non-en locales?

Revision history for this message
William Grant (wgrant) wrote : Re: [Bug 1171047] Re: update-manager times out checking for updates

On 01/09/14 18:49, Robin Sheat wrote:
> Will this change help for non-en locales?

Yes.

If no translations are listed in Release, apt probes for en and all of
the enabled languages. As soon as even a single Translation-* file is
listed, the probing is disabled.

Revision history for this message
Robin Sheat (eythian) wrote :

Brilliant, thanks.

Revision history for this message
Julian Andres Klode (juliank) wrote :

This should be fixed in apt since some time as well, so closing.

Changed in apt (Ubuntu):
status: New → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.