apt-get update size is too big

Bug #1001780 reported by Javier López
This bug affects 46 people
Affects           Status        Importance  Assigned to  Milestone
Launchpad itself  Invalid       Undecided   Unassigned
Ubuntu            Fix Released  Undecided   Unassigned

Bug Description

I did a clean install of Ubuntu 12.04 and so far everything has been working well. I especially commend the Ubuntu team for this release.

I only noticed that the size of the repository update is now about 13MB. Normally it is about this size the first time you run apt-get update after a clean install, and then roughly 23kB to 1300kB for subsequent updates. Now, however, it fetches 13MB or more every now and then.

Using the us.archive.ubuntu.com archive, I see that the Universe Package files are being recreated a couple of times an hour, but contain the same content. The file modification date, the expiration date and, in particular, the etag change each time, causing apt-get update to download the Package file again even though it hasn't changed.

Launchpad shouldn't be recreating the Package files when no changes have been made to the contained packages.

#http://askubuntu.com/questions/135818/the-apt-get-update-cache-size-is-too-big

Curtis Hovey (sinzui)
affects: launchpad → apt (Ubuntu)
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in apt (Ubuntu):
status: New → Confirmed
summary: - apt-get update cache size is too big
+ apt-get update size is too big
zpletan (zpletan) wrote :

A look at us.archive.ubuntu.com/ubuntu/dists/$DIST/[main|universe]/binary-i386 reveals that timestamps on the package info for hardy, lucid, maverick, natty, oneiric, and precise are being updated. Based on this, I would say that the problem is *not* specific to Precise. This concurs with my experience; in addition to real and virtual machines I have with Precise, this happened on my virtual machine of Oneiric before I blew it away a few weeks ago.

John S. Gruber (jsjgruber) wrote :

Attached are terminal logs of HTTP headers showing the changing modification times and the md5sums of the unchanging files. I included both archive.ubuntu.com and us.archive.ubuntu.com. A packet trace showed that apt sends an "If-Modified-Since" header, which the changing timestamps subvert. The problem applies to main and universe but not to, for example, the security repo.
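
A rough way to see why a changed timestamp defeats "If-Modified-Since" is the sketch below. It is only a model of the date comparison a typical HTTP server makes, assuming nothing about how the Ubuntu mirrors are actually configured; the helper name server_response and the dates are made up for illustration, and ETags are ignored.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def server_response(if_modified_since: str, file_mtime: datetime) -> int:
    """Return the HTTP status a typical server would send for a conditional GET.

    Hypothetical helper for illustration; it models only the
    If-Modified-Since comparison, not ETags or any real mirror.
    """
    client_time = parsedate_to_datetime(if_modified_since)
    # HTTP dates have one-second resolution, so drop microseconds.
    if file_mtime.replace(microsecond=0) <= client_time:
        return 304  # Not Modified: the client keeps its cached copy
    return 200      # the full body is sent again


# The client cached the index when the server's mtime was this value.
cached = datetime(2012, 5, 1, 12, 0, tzinfo=timezone.utc)
header = format_datetime(cached, usegmt=True)

# Same mtime on the mirror: nothing is re-sent.
unchanged_status = server_response(header, cached)

# The mirror rewrote the file with a newer mtime but identical bytes.
rewritten = datetime(2012, 5, 1, 12, 30, tzinfo=timezone.utc)
rewritten_status = server_response(header, rewritten)

print(unchanged_status, rewritten_status)
```

In this model the second request returns 200 even though the content is identical, which matches the repeated full downloads seen in the logs.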

John S. Gruber (jsjgruber) wrote :

Is this possibly related to the optimizations mentioned in https://lists.ubuntu.com/archives/ubuntu-devel/2011-December/034577.html ?

William Grant (wgrant) wrote :

It's not a Launchpad problem. From the master copy of the archive:

-rw-r--r-- 1 lp_publish lp_publish 1.5M Sep 20 2008 /srv/launchpad.net/ubuntu-archive/ubuntu/dists/hardy/main/binary-i386/Packages.gz

Since http://archive.ubuntu.com/ubuntu/dists/hardy/main/binary-i386/ shows the bad mtime, it's probably something in the internal mirroring pipeline.

But one could also argue that apt still shouldn't regrab the file, since Release still has the same hashes for it.
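
The hash-based check suggested here could look roughly like the following. This is a sketch of the idea only, not how apt decides what to fetch; the helper needs_download is hypothetical, SHA-256 is an assumed choice of hash, and the index content is stand-in data rather than a real Packages file.

```python
import hashlib


def needs_download(local_bytes: bytes, release_sha256: str) -> bool:
    """True when the cached index no longer matches the hash in Release.

    Hypothetical helper: it sketches the idea in the comment above and is
    not how apt itself decides what to fetch.
    """
    return hashlib.sha256(local_bytes).hexdigest() != release_sha256


# Stand-in data; a real check would read the cached Packages file and
# take the expected hash from the signed Release file.
cached_index = b"Package: example\nVersion: 1.0\n"
listed_hash = hashlib.sha256(cached_index).hexdigest()

print(needs_download(cached_index, listed_hash))          # same content
print(needs_download(cached_index + b"\n", listed_hash))  # content differs
```

With a check like this, a changed timestamp alone would not trigger a new download, because only the content hash is compared.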

Changed in launchpad:
status: New → Invalid
William Grant (wgrant) wrote :

https://rt.admin.canonical.com/Ticket/Display.html?id=53097 (Canonical-only link, sorry) filed to track the mirror script problem.

John S. Gruber (jsjgruber) wrote :

I've been able to circumvent this problem by touching the appropriate files in /var/lib/apt/lists/ right before updating the apt cache.

zpletan (zpletan) wrote :

@jsjgruber, if I touch the files, will they still be downloaded if info has been changed?

John S. Gruber (jsjgruber) wrote :

@zpletan, touching them stops them from being downloaded. You should only touch the files that haven't changed since you downloaded them. By appropriate files I mean the ones that were frozen when the release you are running was published (those are the ones giving people trouble until the bug is fixed). You shouldn't use the touch command on any others, or against Quantal, whose main and universe repos are still active.

In this case touching a file just says it was current at the time you run the touch command (it makes the last modification time the current time).
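
What touching does to the modification time can be shown with a small Python sketch. It works on a throwaway temporary file with a made-up name, so nothing under /var/lib/apt/lists/ is modified; it illustrates the timestamp change only, not the workaround itself.

```python
import os
import tempfile
import time

# Work on a throwaway file so nothing under /var/lib/apt/lists/ is modified.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example_Packages")  # hypothetical file name
    with open(path, "w") as f:
        f.write("Package: example\n")

    # Pretend the file was downloaded a day ago.
    day_ago = time.time() - 86400
    os.utime(path, (day_ago, day_ago))
    old_mtime = os.path.getmtime(path)

    # Equivalent of `touch`: passing no times sets atime and mtime to "now".
    os.utime(path)
    new_mtime = os.path.getmtime(path)

    print(new_mtime > old_mtime)
```

After the second os.utime call the file looks as if it had just been fetched, which is why a file that has since changed on the server should not be touched.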

There's detail in: http://askubuntu.com/questions/135818/the-apt-get-update-cache-size-is-too-big

David T (ubuntuwiki-datmail) wrote :

Touching 4 files on 50+ servers every 60 minutes is not practical. Sure hope a patch appears soon. The problem started 4/26/2012 (plus or minus a day); I saw the major spike in my bandwidth usage around then.

John S. Gruber (jsjgruber) wrote :

A fix would save a lot of bandwidth for users, mirrors, and Canonical. A fix would be great. Most users don't know about any circumventions, of course, and many don't even realize there is a problem.

Nevertheless I'm afraid I don't understand your comment. Your system should normally be downloading from the repositories on just one server and you should be touching just these four files on your system. Am I missing something? If the directions on askubuntu.com aren't clear I'd like to clarify them.

David T (ubuntuwiki-datmail) wrote :

I am feeling this bug so much because I am a specialized hosting provider and chose Ubuntu as my primary Linux distro. I have 50+ individual servers in 2 sets of racks, all of them checking for security updates every 60 minutes.

I feel the 120GB in the last 25-ish days :/

J Phani Mahesh (phanimahesh) wrote :

Suggestion:
----------------
How about using zsync for downloading the lists?
It is a practically feasible solution, with no additional server-side load.

@David: If you have 50 servers that poll for updates every 60 minutes, I suggest you set up a caching server. That saves a lot of bandwidth. I suggest squid-deb-proxy or something similar.

Ernst Kloppenburg (ernst-kloppenburg) wrote :

Could somebody please assign an appropriate importance to this bug?

David T (ubuntuwiki-datmail) wrote :

I'm amazed that every Ubuntu mirror hasn't gone up in arms about this bug. They must be feeling it. Going from 40k to 13000k is, what, a 325-fold increase in their outbound bandwidth levels. They must have massive pipes and not care. :/

Jane Atkinson (irihapeti) wrote :

Not everyone has high-bandwidth plans. I'll probably have to move to a low-bandwidth plan shortly and downloading 12 MB or so daily is going to hurt.

William Grant (wgrant) wrote :

The internal mirror script has been fixed to not clobber the timestamps for old files, so this should no longer be a big problem.

Greg A (etulfetulf)
affects: apt (Ubuntu) → ubuntu
Changed in ubuntu:
status: Confirmed → Fix Released
Martin Lee (hellnest) wrote :

The issue still exists on 13.10.

Javier López (javier-lopez) wrote :

I've seen this issue again in Ubuntu 14.04, currently in development. On every run of $ sudo apt-get update (even when the runs come one after the other), apt-get keeps downloading ~19.8MB.

apt:
  Installed: 0.9.13.1~ubuntu1
  Candidate: 0.9.13.1~ubuntu1
  Version table:
 *** 0.9.13.1~ubuntu1 0
        500 http://us.archive.ubuntu.com/ubuntu/ trusty/main i386 Packages
        100 /var/lib/dpkg/status

deb http://us.archive.ubuntu.com/ubuntu/ trusty main restricted
deb-src http://us.archive.ubuntu.com/ubuntu/ trusty main restricted
deb http://us.archive.ubuntu.com/ubuntu/ trusty-updates main restricted
deb-src http://us.archive.ubuntu.com/ubuntu/ trusty-updates main restricted
deb http://us.archive.ubuntu.com/ubuntu/ trusty universe
deb-src http://us.archive.ubuntu.com/ubuntu/ trusty universe
deb http://us.archive.ubuntu.com/ubuntu/ trusty-updates universe
deb-src http://us.archive.ubuntu.com/ubuntu/ trusty-updates universe
deb http://us.archive.ubuntu.com/ubuntu/ trusty multiverse
deb-src http://us.archive.ubuntu.com/ubuntu/ trusty multiverse
deb http://us.archive.ubuntu.com/ubuntu/ trusty-updates multiverse
deb-src http://us.archive.ubuntu.com/ubuntu/ trusty-updates multiverse
deb http://us.archive.ubuntu.com/ubuntu/ trusty-backports main restricted universe multiverse
deb-src http://us.archive.ubuntu.com/ubuntu/ trusty-backports main restricted universe multiverse
deb http://security.ubuntu.com/ubuntu trusty-security main restricted
deb-src http://security.ubuntu.com/ubuntu trusty-security main restricted
deb http://security.ubuntu.com/ubuntu trusty-security universe
deb-src http://security.ubuntu.com/ubuntu trusty-security universe
deb http://security.ubuntu.com/ubuntu trusty-security multiverse
deb-src http://security.ubuntu.com/ubuntu trusty-security multiverse

Alan (alanjas) wrote :

I have the same problem in Ubuntu 14.04 (trusty).
Every time I run apt-get update it downloads 15.5 MB!
Looking at the output, I see that it downloads the same files every time.
They must be the SAME files because they have exactly the same size. Maybe only the timestamp changes.

There are several; these are the first:

Des:1 http://security.ubuntu.com saucy-security Release.gpg [933 B]
Des:2 http://security.ubuntu.com saucy-security Release [49,6 kB]
Des:3 http://archive.ubuntu.com trusty Release.gpg [933 B]
Des:4 http://security.ubuntu.com saucy-security/main Sources [27,9 kB]
...
Des:8 http://security.ubuntu.com saucy-security/universe Sources [8.372 B]
Des:32 http://archive.ubuntu.com trusty/universe i386 Packages [5.878 kB]

Faheem Ahmed (faheem-webmaster) wrote :

I'm also having this bug on Ubuntu 14.04 LTS
Each time I run 'sudo apt-get update' it fetches 22.5MB.

Dave (dv1) wrote :

Likewise - Just upgraded to 14.04, and downloading 15.1MB per apt-get update (although not absolutely _every_ time, perhaps the first time each hour?)

(Need a new launchpad button for "This bug now affects me _again_")

Gunji (mgpatt-deactivatedaccount) wrote :

I'm having this error as well after upgrading from 12.04 LTS to 14.04 LTS. I'm having to download 37MB+ every time I run the software updater and/or use apt-get. I've tried using some of the tricks above, like using touch and disabling some repos, but it's still happening.

It's tolerable for the time being while I'm using an unmetered fast connection, but it would be insanely painful if I was using a metered, slow connection. (Say, off the top of my head, like some Australian users.)

John S. Gruber (jsjgruber) wrote :

This particular bug report was marked fixed. Please open another against Ubuntu itself (like this one), tag it a regression, and mention this bug report.

Thanks.
