Provide pdiffs for apt-get update

Bug #214612 reported by Johan Kiviniemi on 2008-04-09
This bug affects 11 people
Affects            Importance   Assigned to
Launchpad itself   Low          Unassigned
Raspbian           Undecided    Unassigned

Bug Description

Debian has been using pdiffs for apt-get update for a while now. Instead of downloading megabytes of package lists when a tiny part has changed, a number of diffs are downloaded and applied to the local package lists. That makes apt-get update faster for the end user and possibly much cheaper for the mirrors.

Ubuntu repositories do not provide pdiffs and apt in Ubuntu doesn’t try to download them by default.

While we’re waiting for https://blueprints.edge.launchpad.net/ubuntu/+spec/apt-sync, it would be nice to use pdiffs in the meantime, since that functionality has already been implemented and tested.
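To illustrate the idea in the description: Debian's pdiffs are ed-style scripts (the format produced by `diff --ed`), so patching a local package list amounts to replaying a short list of line edits. The sketch below is a minimal illustration under that assumption, not apt's actual implementation. The function name `apply_ed_script` and the sample package stanzas are made up for the example, and only the `a`, `c`, and `d` commands are handled.

```python
import re

# Matches ed commands such as "5a", "3,4c" or "7d".
_CMD = re.compile(r"^(\d+)(?:,(\d+))?([acd])$")

def apply_ed_script(lines, script):
    """Apply an ed-style script (a/c/d commands only) to a list of lines."""
    result = list(lines)
    it = iter(script)
    for command in it:
        m = _CMD.match(command)
        if m is None:
            raise ValueError(f"unsupported ed command: {command!r}")
        start = int(m.group(1))
        end = int(m.group(2) or start)
        op = m.group(3)
        new_text = []
        if op in "ac":
            # Text for append/change follows the command, ended by a lone ".".
            for text in it:
                if text == ".":
                    break
                new_text.append(text)
        if op == "a":
            result[start:start] = new_text      # insert after line `start`
        elif op == "c":
            result[start - 1:end] = new_text    # replace lines start..end
        else:
            result[start - 1:end] = []          # delete lines start..end
    return result

old = ["Package: a", "Version: 1.0", "", "Package: b", "Version: 2.0"]
# Commands run from the bottom of the file upwards, as `diff --ed` emits them,
# so earlier line numbers stay valid while later lines change.
script = ["5c", "Version: 2.1", ".", "2c", "Version: 1.1", "."]
new = apply_ed_script(old, script)
print(new)
```

The script here is a handful of lines regardless of how large the package list is, which is where the bandwidth saving comes from.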

Siegfried Gevatter (rainct) wrote :

I'm also interested in this. Currently, an "apt-get update" with just the "deb" lines for the current release and a "deb-src" line for the development release can take over 5 minutes here on a 3G connection.

Julian Edwards (julian-edwards) wrote :

This would be a nice change, let's see what we can do.

Changed in soyuz:
importance: Undecided → Medium
status: New → Triaged

Julian Edwards (julian-edwards) wrote :

Lars Wirzenius has recently tested updates using zsync; we'll look into that.

tags: added: soyuz-publish

Magnes (magnesus2) wrote :

So, it's 2011 now. Did you look into that? Because it's really slow to download Packages.gz even on fast connections.

Julian Edwards (julian-edwards) wrote :

We would love to implement this but we're really busy with more important fixes, like performance enhancements and fixing OOPSes. However, Launchpad is open source so if anyone wants to help fix this they'd get mentoring help from the developers.

Curtis Hovey (sinzui) on 2011-09-28
Changed in launchpad:
importance: Medium → Low

Clint Byrum (clint-fewbar) wrote :

So this may be more important than its "bug importance" would imply.

It would seem that the Ubuntu archive sets Expires: headers about 25 minutes after the modified time of the file it is serving:

[ ] Release 17-Apr-2012 06:53 48K

$ HEAD http://archive.ubuntu.com/ubuntu/dists/precise/Release
200 OK
Cache-Control: max-age=1762, s-maxage=3300, proxy-revalidate
Connection: close
Date: Tue, 17 Apr 2012 07:18:51 GMT
Accept-Ranges: bytes
ETag: "c19b-4bdda64880a80"
Server: Apache/2.2.14 (Ubuntu)
Content-Length: 49563
Content-Type: text/plain
Expires: Tue, 17 Apr 2012 07:48:14 GMT
Last-Modified: Tue, 17 Apr 2012 06:53:14 GMT
Client-Date: Tue, 17 Apr 2012 07:18:51 GMT
Client-Peer: 91.189.92.170:80
Client-Response-Num: 1
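
As a rough check of the timing, the dates in the response above can be subtracted directly. The sketch below uses only the header values quoted in the output; the helper name `cache_windows` is an assumption for illustration. By this arithmetic the file was about 25 minutes old when requested, the response stays fresh for about 29 more minutes, and Expires lands 55 minutes after Last-Modified, which matches s-maxage=3300.

```python
from email.utils import parsedate_to_datetime

# Header values copied from the HEAD response quoted above.
headers = {
    "Date": "Tue, 17 Apr 2012 07:18:51 GMT",
    "Expires": "Tue, 17 Apr 2012 07:48:14 GMT",
    "Last-Modified": "Tue, 17 Apr 2012 06:53:14 GMT",
}

def cache_windows(h):
    """Return (age at request, remaining freshness, Expires minus Last-Modified) in seconds."""
    date = parsedate_to_datetime(h["Date"])
    expires = parsedate_to_datetime(h["Expires"])
    modified = parsedate_to_datetime(h["Last-Modified"])
    return (
        int((date - modified).total_seconds()),
        int((expires - date).total_seconds()),
        int((expires - modified).total_seconds()),
    )

age, remaining, total = cache_windows(headers)
print(age, remaining, total)  # 1537 1763 3300
```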

Any skew between the Release, Release.gpg, and especially Packages.gz files means that apt reports a 'hash sum mismatch'. This is particularly frustrating while testing deployment/automation during the dev release, because the files keep on changing.
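
The mismatch can be sketched as follows, assuming the usual Release layout in which a `SHA256:` section lists a checksum, size, and path for each index file. This is a simplified illustration, not apt's verification code: the function names `expected_sha256` and `matches_release`, the path, and the sample payloads (plain bytes standing in for a real Packages.gz) are all made up for the example.

```python
import hashlib

def expected_sha256(release_text, path):
    """Return (digest, size) listed for `path` in the SHA256 section, or None."""
    in_section = False
    for line in release_text.splitlines():
        if not line.startswith(" "):
            # A non-indented line starts a new field; track whether it is SHA256.
            in_section = line.strip() == "SHA256:"
            continue
        if in_section:
            digest, size, name = line.split()
            if name == path:
                return digest, int(size)
    return None

def matches_release(release_text, path, payload):
    """Check a downloaded index file against the entry in Release."""
    entry = expected_sha256(release_text, path)
    if entry is None:
        return False
    digest, size = entry
    return len(payload) == size and hashlib.sha256(payload).hexdigest() == digest

# Hypothetical data: Release was written for the new index, but a cache
# still serves the index from the previous publisher run.
path = "main/binary-amd64/Packages.gz"
old_packages = b"Package: a\nVersion: 1.0\n"
new_packages = b"Package: a\nVersion: 1.1\n"
release = (
    "Suite: example\n"
    "SHA256:\n"
    f" {hashlib.sha256(new_packages).hexdigest()} {len(new_packages)} {path}\n"
)

print(matches_release(release, path, new_packages))  # True
print(matches_release(release, path, old_packages))  # False: the skew case
```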

With the influx of more apt sources (backports, multiarch, extras), the potential for running into this skew gets larger and larger. The Expires: header means that if you happen to cache the responses mid-update (pretty easy, with 5-8 minutes seeming to be the average gap between Packages.gz and Release being written), then you will have a broken cache for 25 minutes.

With the pdiff format, it would seem that the window for archive skew is, if nothing else, smaller and less painful to repeat. It's also conceivable that Expires: can be relaxed a bit, perhaps to just 10 minutes, if the pdiff format is used, since in theory requesting Release and the pdiff index twice is not as costly as re-requesting Packages.gz.

Nikolaus Waxweiler (madleser) wrote :

Anybody working on this? I'm on a very slow link at home and it's very annoying to have the connection blocked for 5-10 minutes or something while megabytes of package lists are downloaded :(

hackel (hackel) wrote :

Ubuntu really needs to implement this or another solution. apt-sync appears to be dead. I have a decent 20M connection, but most of the time I can't pull list or package updates at more than 100 KiB/s. It's not just users on a slow or limited bandwidth link that are affected by this!

Michael (michaelraspi) wrote :

It is mid-2018 now and nothing has changed. That's quite a shame, especially considering that they (the Raspberry Pi Foundation) have finally added better Ethernet, but the new Pi 3 isn't worth it at all.

Raspbian is popular (because it's there?), but not as much time and energy is put into it as into Debian.

It's even stranger considering that everything should be provided from upstream and pdiffs are not related to hardware at all, AFAIK.
