contents-generation could be 2x faster by not regenerating Packages/Sources

Bug #1013583 reported by Colin Watson on 2012-06-15
This bug affects 1 person
Affects: Launchpad itself; Importance: Low; Assigned to: Unassigned
Affects: apt (Ubuntu); Importance: Medium; Assigned to: Unassigned

Bug Description

10:25 <cjwatson> Why does contents-generation generate its own dists/ rather than copying the most recent published version, anyway? I get why it has a separate dists/, but I don't really see why it has to go to the effort of generating its own.
10:28 <cjwatson> Oh, maybe it's just hard to get apt-ftparchive not to do so.
10:29 <wgrant> Because I hate apt-ftparchive.
10:30 <wgrant> But yeah, it's probably difficult to tell it otherwise.
10:31 <wgrant> Or was in 2006.
10:32 <cjwatson> It takes it 100+ minutes to generate all the Packages and Sources again, so avoiding that would be a nice improvement.
10:33 <cjwatson> Then Contents takes 90 minutes.
10:33 * cjwatson files a bug.
10:33 <wgrant> Oh
10:33 <wgrant> It doesn't preserve?
10:34 <cjwatson> Not so you'd notice.

From a brief foray into the code, I think it may indeed still be rather difficult to tell 'apt-ftparchive generate' not to update Packages/Sources every time. Worst case, perhaps we can fall back to using 'apt-ftparchive contents' manually.
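
A minimal sketch of that fallback, assuming the publisher would shell out to 'apt-ftparchive contents' directly. The helper names (contents_command, write_contents) and the directory and cache arguments are hypothetical and are not taken from the Launchpad code; the only apt-ftparchive features relied on are the 'contents' command, which writes the listing to stdout, and the '--db' caching option.

```python
import subprocess


def contents_command(pool_dir, cache_db=None):
    """Build the argv for a standalone Contents run (hypothetical helper)."""
    argv = ["apt-ftparchive"]
    if cache_db is not None:
        # --db points apt-ftparchive at a binary cache database.
        argv += ["--db", str(cache_db)]
    # 'contents' scans pool_dir recursively for .deb files.
    argv += ["contents", str(pool_dir)]
    return argv


def write_contents(pool_dir, output_path, cache_db=None):
    """Run apt-ftparchive and capture its stdout as the Contents file."""
    with open(output_path, "wb") as out:
        subprocess.run(contents_command(pool_dir, cache_db),
                       stdout=out, check=True)
```

This bypasses 'apt-ftparchive generate' entirely, so Packages and Sources are never touched; the cost is that the publisher has to work out the per-suite and per-architecture paths itself.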

Changed in launchpad:
status: New → Triaged
importance: Undecided → Low
tags: added: packages
William Grant (wgrant) on 2012-06-15
tags: added: soyuz-publish
Michael Vogt (mvo) wrote :

Something like http://paste.ubuntu.com/1050714/ may work; I have added an apt task.

Michael Vogt (mvo) wrote :

I linked a branch that adds an APT::FTPArchive::ContentsOnly option, which skips Packages/Sources file generation. I did some light testing and it seems to work as expected. Note that there are some timestamp checks in the code, so it expects Packages/Sources to be fresher than Contents when it generates the Contents.
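
A rough sketch of how a caller might use the option and guard against the timestamp expectation. This is an illustration only: the mtime comparison below is an assumption about the kind of check involved, not apt's actual logic, and both function names are hypothetical. The option name is the one from the proposed branch, passed with apt-ftparchive's standard '-o' flag.

```python
import os


def contents_only_command(conf_path):
    """Build the argv for a 'generate' run with the proposed ContentsOnly option."""
    return ["apt-ftparchive",
            "-o", "APT::FTPArchive::ContentsOnly=true",
            "generate", str(conf_path)]


def packages_fresher_than_contents(packages_path, contents_path):
    """Return True if Packages is newer than Contents, or Contents is missing.

    Assumed stand-in for the timestamp checks mentioned above.
    """
    if not os.path.exists(contents_path):
        return True
    return os.path.getmtime(packages_path) > os.path.getmtime(contents_path)
```

If the check returns False, copying the published Packages/Sources into the separate dists/ would need to update their timestamps before the Contents run.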

Michael Vogt (mvo) wrote :

Did the light testing with lp:~mvo/+junk/apt-ftparchive-testsuit.

Changed in apt (Ubuntu):
importance: Undecided → Medium
Adam Conrad (adconrad) wrote :

While ContentsOnly is a neat feature, is there a reason we don't just generate contents in the main apt-ftparchive run and actually cache it in the cache DB? It seems like our current contents-generation does no caching at all, which could explain why it's so dreadfully slow, even after it regenerates dists.

Dimitri John Ledkov (xnox) wrote :

The apt patch doesn't seem to have been merged; I re-submitted it as https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=756287

Re: why not in the main apt-ftparchive run? Good point; at the moment, however, we run Contents generation only for the Ubuntu distro archive, not for PPAs.

Re: no caching at all? I don't see that: ./cronscripts/publishing/gen-contents/apt_conf_header.template sets up a separate cache, which is used and reused.
