source packages are removed from Sources index while binary packages still depend on them - makes maintaining gpl compliance hard for mirrors

Bug #549041 reported by James Troup
This bug affects 3 people
Affects: Launchpad itself
Status: Triaged
Importance: High
Assigned to: Unassigned

Bug Description

The source <-> binary reference counting appears to be broken unless
I'm missing something really obvious.

| linux-image-2.6.24-26-386 | 2.6.24-26.64 | hardy-security | i386

Yet:

| /srv/ftp.root/ubuntu/pool$ ls */l/linux/linux_*.diff.gz | grep "2\.6\.24"
| -rw-r--r-- 1 archvsync archvsync 3803441 2008-04-10 14:03 linux_2.6.24-16.30.diff.gz
| -rw-r--r-- 1 archvsync archvsync 4805685 2010-03-16 23:13 linux_2.6.24-27.68.diff.gz
| -rw-r--r-- 1 archvsync archvsync 4805899 2010-03-24 10:04 linux_2.6.24-27.69.diff.gz

dak-style source <-> binary reference counting (which is critical for
GPL compliance) shouldn't have allowed the source for 2.6.24-26 to be
de-published. So, err, where did it go?
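
The check described above can be sketched roughly as follows. This is a minimal illustration, not the dak or Soyuz implementation: the helper names (`parse_stanzas`, `required_source`, `missing_sources`) are hypothetical, and the sample stanzas are made up from the versions quoted in this report. It relies only on the usual control-file convention that a binary stanza names its source in a `Source:` field (optionally with a version in parentheses) and otherwise shares its own name and version with the source.

```python
import re

def parse_stanzas(text):
    """Split a Debian-style control file into a list of field dicts."""
    stanzas = []
    for block in re.split(r"\n\s*\n", text.strip()):
        fields = {}
        key = None
        for line in block.splitlines():
            if line[:1] in (" ", "\t") and key:
                # Continuation line: append to the previous field.
                fields[key] += "\n" + line.strip()
            elif ":" in line:
                key, _, value = line.partition(":")
                fields[key] = value.strip()
        if fields:
            stanzas.append(fields)
    return stanzas

def required_source(binary):
    """Return the (source name, source version) a binary stanza points at."""
    source = binary.get("Source", binary["Package"])
    match = re.match(r"(\S+)\s*\(([^)]+)\)", source)
    if match:
        return match.group(1), match.group(2)
    return source, binary["Version"]

def missing_sources(packages_text, sources_text):
    """List source (name, version) pairs that binaries need but Sources lacks."""
    indexed = {(s["Package"], s["Version"]) for s in parse_stanzas(sources_text)}
    needed = {required_source(b) for b in parse_stanzas(packages_text)}
    return sorted(needed - indexed)

# Illustrative stanzas only, trimmed to the fields the sketch reads.
PACKAGES = """\
Package: linux-image-2.6.24-26-386
Source: linux
Version: 2.6.24-26.64
Architecture: i386
"""

SOURCES = """\
Package: linux
Version: 2.6.24-27.69
"""

print(missing_sources(PACKAGES, SOURCES))
```

With these sample stanzas the binary still references source version 2.6.24-26.64, which the Sources stanza no longer lists, so that pair is reported as missing.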

James Troup (elmo) wrote :

Lala. So it turns out this is in fact my fault and nothing to do with soyuz. The source is on ftp-master; it gets eaten by our magic mirror script, which splits the archive out into archive.u.c and ports.u.c, because that script syncs source based on what's in the Sources index files. We'll fix it to just blindly sync all source.

Changed in soyuz:
status: New → Invalid
James Troup (elmo) wrote :

Short version: for distro series >> lucid, please only remove source
packages from indices files when they're actually due to be removed
from disk.

Long version: So, on reflection, there is actually a soyuz bug here.
I can fix our magic mirror script for the archive / ports use case.
However a third use case for (the/a) magic mirror script is
old-releases. With source in the archive but not in the indices
files, it's extremely non-trivial to (in an automated fashion) get a
GPL compliant mirror of any given distro-series.

I think we should fix this by no longer automatically removing source
packages from indices files when there's a newer version, but instead
only removing them from indices files when the source packages are
about to be removed from disk.

This does mean that any given Sources file could have several versions
of any given source package, which may break some tools that are not
expecting this. However, the Debian folks have apparently been
publishing multiple versions of a given source package in Sources
files for a couple of months now, so presumably tools will have
started to get fixed.
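
To illustrate the kind of breakage meant here: a tool that keys its index on the package name alone silently keeps only the last stanza it reads. A rough sketch of the difference, assuming stanzas have already been parsed into dicts (the function name `versions_by_package` and the sample data are hypothetical):

```python
from collections import defaultdict

def versions_by_package(stanzas):
    """Collect every indexed version per source package, in file order."""
    index = defaultdict(list)
    for stanza in stanzas:
        index[stanza["Package"]].append(stanza["Version"])
    return dict(index)

# Two versions of the same source package in one Sources file.
stanzas = [
    {"Package": "linux", "Version": "2.6.24-26.64"},
    {"Package": "linux", "Version": "2.6.24-27.69"},
]

# A name-keyed dict keeps only the last stanza seen for each package.
naive = {s["Package"]: s["Version"] for s in stanzas}

print(naive)
print(versions_by_package(stanzas))
```

The sketch deliberately does not sort or compare versions; doing that correctly needs Debian version ordering, which is outside what this example tries to show.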

Changed in soyuz:
status: Invalid → New
summary: - source <-> binary reference counting appears to be broken
+ don't de-index source packages until they're due to be removed from disk
William Grant (wgrant) wrote : Re: don't de-index source packages until they're due to be removed from disk

I think we could reasonably easily produce a list of pool files for each subset of the archive that you're interested in. That would solve it for old releases too, without requiring a full pool sync. We'll need to teach Soyuz to do a similar filtering thing soon anyway if we are to integrate lmirror into it.
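
As a rough idea of what such a pool file list could look like when derived from a Sources index (this is only a sketch, not how Soyuz or lmirror does it): each source stanza carries a `Directory:` field and a `Files:` field whose continuation lines end in a file name. The function name `pool_files`, the `pool/main/l/linux` directory, and the zeroed checksums below are assumptions for illustration; the file names and sizes are taken from the listing in the bug description.

```python
def pool_files(sources_text):
    """Return the sorted pool paths referenced by a Sources index."""
    paths = []
    for block in sources_text.strip().split("\n\n"):
        directory = None
        names = []
        in_files = False
        for line in block.splitlines():
            if line.startswith("Directory:"):
                directory = line.split(":", 1)[1].strip()
                in_files = False
            elif line.startswith("Files:"):
                in_files = True
            elif in_files and line[:1] in (" ", "\t"):
                # Each entry is "<md5> <size> <filename>".
                parts = line.split()
                if parts:
                    names.append(parts[-1])
            else:
                in_files = False
        if directory:
            paths.extend(f"{directory}/{name}" for name in names)
    return sorted(paths)

# Illustrative stanzas; checksums are placeholders, directory is assumed.
SAMPLE = """\
Package: linux
Version: 2.6.24-27.68
Directory: pool/main/l/linux
Files:
 00000000000000000000000000000000 4805685 linux_2.6.24-27.68.diff.gz

Package: linux
Version: 2.6.24-27.69
Directory: pool/main/l/linux
Files:
 00000000000000000000000000000000 4805899 linux_2.6.24-27.69.diff.gz
"""

for path in pool_files(SAMPLE):
    print(path)
```

A list like this only covers what the index still mentions, so it helps with old-releases mirroring only if sources stay indexed for as long as binaries depend on them.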

Julian Edwards (julian-edwards) wrote :

James, yeah, we should cope with this; we just had to fix Gina to deal with the Debian imports with multiple releases in the index.

Another question: do we really need a period where the file is de-indexed but not yet removed, given that we're storing it in Launchpad anyway? Can we just blow it away from the pool at the same time it's de-indexed?

Changed in soyuz:
status: New → Incomplete
Julian Edwards (julian-edwards) wrote :

OK, wgrant convinced me that doing that would be annoying for people with out-of-date indices, for example ones that only update every 24 hours.

We'll go with the multi-version index solution.

Changed in soyuz:
status: Incomplete → Triaged
importance: Undecided → Medium
tags: added: soyuz-publish
summary: - don't de-index source packages until they're due to be removed from disk
+ source packages are removed from indices while binary packages still
+ depend on them - makes maintaining gpl compliance hard for mirrors
summary: - source packages are removed from indices while binary packages still
- depend on them - makes maintaining gpl compliance hard for mirrors
+ source packages are removed from Sources index while binary packages
+ still depend on them - makes maintaining gpl compliance hard for mirrors
Changed in launchpad:
importance: Medium → High