packages unnecessarily symlinked to other components in primary archive

Bug #117780 reported by James Troup on 2007-05-30
Affects: Launchpad itself
Importance: High
Assigned to: Unassigned

Bug Description

james@syowa:~$ bzgrep akode-dbg_2.0-0ubuntu3_amd64.deb /srv/ftp.root/ubuntu/dists/*/*/*/Packages.bz2
/srv/ftp.root/ubuntu/dists/dapper/universe/binary-amd64/Packages.bz2:Filename: pool/universe/a/akode/akode-dbg_2.0-0ubuntu3_amd64.deb
james@syowa:~$

i.e. the _only_ reference to that particular (version, architecture) of that package is in the dapper universe Packages file, and it refers to it as a file in universe. But...

james@syowa:~$ find /srv/ftp.root/ -name akode-dbg_2.0-0ubuntu3_amd64.deb -ls
18531077 4 -rw-r--r-- 1 archvsync archvsync 1926 Jan 31 2006 /srv/ftp.root/ubuntu/pool/main/a/akode/akode-dbg_2.0-0ubuntu3_amd64.deb
30375968 0 lrwxrwxrwx 1 archvsync archvsync 54 Aug 23 2006 /srv/ftp.root/ubuntu/pool/universe/a/akode/akode-dbg_2.0-0ubuntu3_amd64.deb -> ../../../main/a/akode/akode-dbg_2.0-0ubuntu3_amd64.deb
james@syowa:~$

Somehow, the file is in main and there is a symlink to it in universe. So why is this a problem?

Two reasons:

 (1) Our new mirror script works off the Packages/Sources files and basically ignores anything in pool that they don't reference.
      (We don't have any choice in this, we don't have any other data to work with - and we _have_ to run this script because we need to drop warty/hoary/breezy before our servers run out of space, never mind all the mirrors it has cost and is costing us.)
       The problem is of course that pool/universe/a/akode/akode-dbg_2.0-0ubuntu3_amd64.deb is referenced, but comes across as a dangling symlink because its target is not referenced by any Packages file. We fix this up after the fact, but because we have to do that, we leave a window open where a Packages-referenced file doesn't exist on the master mirror. Which is obviously bad.

 (2) It's a wasted/unnecessary extra inode, which matters given that the Ubuntu archive is > 400K files and that makes rsyncing it very painful for both the server and the client. (I don't know how many of these there are, of course; if there aren't thousands, this point is moot.)
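
To put a number on point (2), something like the following could be run against a pool directory. It is only a rough sketch, not part of the mirror script: the function name cross_component_symlinks is made up for illustration, and it assumes the first directory level under pool is the component (main, universe, ...), as in the paths shown above.

```python
import os


def cross_component_symlinks(pool_root):
    """Yield (link_path, target_path) for every symlink under pool_root
    whose target sits under a different top-level component directory.

    Assumption: the first path element below pool_root is the component.
    Targets outside pool_root are reported as well.
    """
    pool_root = os.path.realpath(pool_root)
    for dirpath, _dirnames, filenames in os.walk(pool_root):
        for name in filenames:
            link = os.path.join(dirpath, name)
            if not os.path.islink(link):
                continue
            # Resolve the link text relative to the directory holding it,
            # without requiring the target to exist.
            target = os.path.normpath(os.path.join(dirpath, os.readlink(link)))
            link_component = os.path.relpath(link, pool_root).split(os.sep)[0]
            target_component = os.path.relpath(target, pool_root).split(os.sep)[0]
            if target_component != link_component:
                yield link, target
```

Counting is then `sum(1 for _ in cross_component_symlinks("/srv/ftp.root/ubuntu/pool"))`. The same list is also what would need checking for point (1): each link that a Packages file references but whose target no Packages file references.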

Changed in soyuz:
status: New → Triaged
importance: Undecided → Medium
tags: added: soyuz-publish
William Grant (wgrant) wrote :

I think this is because death row processing doesn't bother to take components into account. If a package is published anywhere in the archive, its files won't be removed from any component.
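
To illustrate the difference, here is a toy model of the two checks. It is not the actual death row code; both function names and the (component, filename) representation are assumptions made up for this sketch.

```python
def removable_ignoring_component(on_disk, live):
    """Component-blind check: a file is kept everywhere as long as it is
    published in any component.

    on_disk: set of (component, filename) pairs present in pool.
    live:    set of (component, filename) pairs with a current publication.
    """
    live_names = {filename for _component, filename in live}
    return {(c, f) for (c, f) in on_disk if f not in live_names}


def removable_with_component(on_disk, live):
    """Component-aware check: a pool entry is removable unless it is
    published in that same component."""
    return on_disk - live
```

With the akode-dbg file on disk in both main and universe but published only in universe, the first check removes nothing, while the second flags the main copy. An actual fix would also have to turn the universe symlink into the real file before the main copy goes away, otherwise the link dangles.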

Changed in launchpad:
importance: Medium → High
summary: - packages unnecessarily symlinked to other component
+ packages unnecessarily symlinked to other components in primary archive