Natty alpha2 source jigdo's fail - 1500 files missing, don't match binaries

Bug #713876 reported by John Gilmore on 2011-02-06
This bug affects 2 people
Affects: Ubuntu CD Images
Status: Triaged
Importance: Low
Assigned to: Unassigned

Bug Description

I tried to download the three source DVDs from the Natty alpha2 release candidate from here:

  http://cdimage.ubuntu.com/releases/natty/alpha-2/source/

I was able to get the jigdo and template files, but when I run jigdo-lite on them, it is only able to
download some of the thousands of files required from the Ubuntu repositories.

In natty-src-1, 883 files are missing. Examples include:
  http://us.archive.ubuntu.com/ubuntu/pool/main/k/kde-l10n-hr/kde-l10n-hr_4.5.80.orig.tar.bz2
  libimobiledevice_1.0.4.orig.tar.gz
  php5_5.3.3-1ubuntu10.dsc
  kdegames_4.5.80-0ubuntu1.debian.tar.gz
  grantlee_0.1.7-0ubuntu1.debian.tar.gz
  meta-kde_63ubuntu9.tar.gz
  base-installer_1.114ubuntu1.dsc
  kdevelop_4.1.0-0ubuntu1.debian.tar.gz
  lapack_3.2.2-1.2.diff.gz

In natty-src-2, 500 files are missing. Examples include:
  http://us.archive.ubuntu.com/ubuntu/pool/main/p/python-numpy/python-numpy_1.4.1-5ubuntu2.debian.tar.gz
  packagekit_0.6.8-0ubuntu4.dsc
  partman-reiserfs_49.dsc
  python-apt_0.7.100ubuntu1.tar.gz
  telepathy-glib_0.13.6-1.dsc
  nvidia-graphics-drivers_260.19.21-0ubuntu1.dsc
  pkgsel_0.32ubuntu1.dsc
  libselinux_2.0.96-1.diff.gz
  vlc_1.1.5-3ubuntu1.dsc
  qt4-perl_4.5~~svn1145508-2.dsc

In natty-src-3, 161 files are missing. Examples include:
  http://us.archive.ubuntu.com/ubuntu/pool/main/t/totem/totem_2.32.0-0ubuntu2.dsc
  xserver-xorg-video-intel_2.13.901.orig.tar.gz
  xserver-xorg-video-ark_0.7.3-1.dsc
  ubufox_0.9~rc2-0ubuntu6.dsc
  xulrunner-1.9.2_1.9.2.13+build1+nobinonly.orig.tar.gz
  xserver-xorg-video-vmware_11.0.3-1.dsc
  xserver-xorg-input-vmmouse_12.6.10-1.dsc
  ispell.pt_20101104.orig.tar.gz
  xfwm4_4.7.1.orig.tar.bz2

I tried chasing down one of these packages (xserver-xorg-video-intel). The Ubuntu packages site:

  http://packages.ubuntu.com/natty/xserver-xorg-video-intel

says that the Natty version is 2.14, not 2.13.901, and lists these source file locations:

  http://archive.ubuntu.com/ubuntu/pool/main/x/xserver-xorg-video-intel/xserver-xorg-video-intel_2.14.0-1ubuntu6.dsc
  http://archive.ubuntu.com/ubuntu/pool/main/x/xserver-xorg-video-intel/xserver-xorg-video-intel_2.14.0.orig.tar.gz
  http://archive.ubuntu.com/ubuntu/pool/main/x/xserver-xorg-video-intel/xserver-xorg-video-intel_2.14.0-1ubuntu6.diff.gz

Those files exist and I was able to download each of them.

Now the question is: which *binary* was included in alpha-2? I downloaded (via its torrent) and mounted the Natty desktop ISO image:

  http://cdimage.ubuntu.com/releases/natty/alpha-2/natty-desktop-i386.iso

Inside the image, /casper/filesystem.manifest and /casper/filesystem.manifest-desktop both list:

  xserver-xorg-video-intel 2:2.14.0-1ubuntu5

Inside the squashfs in the desktop iso, /usr/doc/xserver-xorg-video-intel has an entry for version 2:2.14.0-1ubuntu5, and
the README file mentions "Release 2.14.0 (2011-01-07)".

The natty-src-3.jigdo file includes the line:

  BHe5tCtOFkP-_WYzLm0fOQ=Debian:pool/main/x/xserver-xorg-video-intel/xserver-xorg-video-intel_2.13.901.orig.tar.gz

It looks like the binary ISO shipped version 2:2.14.0-1ubuntu5, but the source jigdos tried to ship something else (2.13.901), thus causing the jigdo problem.
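
The comparison above can be sketched in Python. This is only an illustration, not part of jigdo or any Ubuntu tooling: the helper names (parse_jigdo_part, upstream_version) and the list of recognised file suffixes are assumptions, and the sketch assumes the "checksum=Label:path" layout seen in the jigdo line quoted above.

```python
# Hypothetical helpers for illustration; not part of jigdo or debian-cd.

# Source-file suffixes seen in this report, longest first so that
# ".orig.tar.gz" is tried before ".tar.gz".
SUFFIXES = (
    ".debian.tar.gz",
    ".orig.tar.bz2",
    ".orig.tar.gz",
    ".diff.gz",
    ".tar.gz",
    ".dsc",
)


def parse_jigdo_part(line):
    """Split 'checksum=Label:path' into (checksum, label, package, version)."""
    checksum, _, location = line.strip().partition("=")
    label, _, path = location.partition(":")
    filename = path.rsplit("/", 1)[-1]
    # Source file names follow the '<package>_<version><suffix>' convention.
    package, _, rest = filename.partition("_")
    for suffix in SUFFIXES:
        if rest.endswith(suffix):
            rest = rest[: -len(suffix)]
            break
    return checksum, label, package, rest


def upstream_version(debian_version):
    """Drop the epoch ('2:') and the packaging revision ('-1ubuntu5')."""
    without_epoch = debian_version.split(":", 1)[-1]
    return without_epoch.rsplit("-", 1)[0]


# The two lines quoted in this report.
jigdo_line = (
    "BHe5tCtOFkP-_WYzLm0fOQ=Debian:pool/main/x/xserver-xorg-video-intel/"
    "xserver-xorg-video-intel_2.13.901.orig.tar.gz"
)
manifest_line = "xserver-xorg-video-intel 2:2.14.0-1ubuntu5"

_, _, package, source_version = parse_jigdo_part(jigdo_line)
binary_name, binary_version = manifest_line.split()
mismatch = (
    package == binary_name
    and source_version != upstream_version(binary_version)
)
print(package, source_version, upstream_version(binary_version), mismatch)
```

Running the same check over every line of a jigdo file and every line of a manifest would list all packages whose source and binary versions disagree, with the caveat that manifests name binary packages, which do not always share a name with their source package.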

The missing files have always been a problem with milestone (i.e.
pre-release) builds. Unfortunately, this is extremely hard to fix. We
don't have a snapshot archive that corresponds to the point at which the
jigdo files were built, and files are expired from archive.ubuntu.com
once they've been superseded by newer versions in the same release for a
while (about a day, I think). This is not a problem for the final
release because everything in the final release is kept around.

My usual suggestion is to use jigdo to download everything it can, and
then use rsync to fix up any remaining differences, although I agree
that this is not well documented.
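
A rough sketch of the rsync step, written in Python so that the command is only built and printed, never run. The rsync URL below is an assumption (this thread does not give the rsync module path for cdimage.ubuntu.com), and the sketch also assumes the partially assembled image from jigdo has been saved under the final ISO name so that rsync can reuse the data already on disk.

```python
import shlex


def rsync_fixup_command(remote_url, local_image):
    """Build (but do not run) an rsync command that repairs local_image.

    rsync uses the existing local file as the basis for its delta
    transfer, so only the blocks that differ need to be downloaded.
    """
    return ["rsync", "--partial", "--progress", remote_url, local_image]


command = rsync_fixup_command(
    # Assumed URL: check the real rsync path on the server before use.
    "rsync://cdimage.ubuntu.com/cdimage/releases/natty/alpha-2/source/natty-src-3.iso",
    "natty-src-3.iso",
)
print(shlex.join(command))
```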

While bug 187864 requests that Launchpad provide snapshot archives, and
this would gain something in terms of avoiding confusing UI, it wouldn't
gain anything in terms of the primary purpose of jigdo, i.e.
transferring some of the ISO image download cost to a local mirror,
because any such snapshot archive would likely be extremely poorly
mirrored.

As for the source/binary desync, well, this is partly because I switched
to a single aggregated source ISO image as the simplest resolution to
your previous comments on source images, and that image is built from
whatever the current versions are at the time when it was built. If the
archive skewed between the binary image builds then the source image
only gets to contain one of them. (There are of course theoretical
workarounds for this - scan all binary images, pick out required
versions, calculate union of all those versions, construct archive of
any versions no longer in the archive by downloading directly from
Launchpad, build source image based on that archive - but unfortunately
debian-cd doesn't support this kind of thing very well, to put it
mildly.)
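
The theoretical workaround in the parenthesis above could be sketched roughly as follows. This is not how debian-cd works; the function names and the sample data are made up for illustration. It assumes manifests in the "package version" format of /casper/filesystem.manifest, and it glosses over the mapping from binary package names to source package names.

```python
def parse_manifest(text):
    """Return {(package, version)} from 'package version' manifest lines."""
    pairs = set()
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2:
            pairs.add((fields[0], fields[1]))
    return pairs


def versions_to_fetch(manifests, current_archive):
    """Union the versions used by all binary images, then subtract
    whatever the current archive still carries."""
    required = set()
    for text in manifests:
        required |= parse_manifest(text)
    return required, required - current_archive


# Illustrative data only: two images built at different times.
desktop = "xserver-xorg-video-intel 2:2.14.0-1ubuntu5\ntotem 2.32.0-0ubuntu2\n"
other_image = "xserver-xorg-video-intel 2:2.14.0-1ubuntu6\ntotem 2.32.0-0ubuntu2\n"
current_archive = {
    ("xserver-xorg-video-intel", "2:2.14.0-1ubuntu6"),
    ("totem", "2.32.0-0ubuntu2"),
}

required, from_launchpad = versions_to_fetch([desktop, other_image], current_archive)
print(sorted(required))
print(sorted(from_launchpad))
```

Everything in the second set would have to be downloaded from Launchpad into a temporary archive before the source image could be built from the full union.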

I suspect that this was compounded because our release managers have to
remember to rebuild the source image towards the end of the milestone
release process, and it looks like that may not have happened in this
case. We could probably use some process improvement here.

 status triaged
 importance low

Changed in ubuntu-cdimage:
importance: Undecided → Low
status: New → Triaged
John Gilmore (gnu-gilmore) wrote :

It's not just the jigdos that are wrong -- the "natty-src-3.iso" file that I downloaded from the web also includes the wrong source code (xserver-xorg-video-intel_2.13.901 rather than version 2.14).

It appears that Ubuntu is intentionally violating the GPL every time it cuts an alpha or beta release, by releasing binary CD/DVDs without matching sources. And marking this "importance: Low"? I don't think releasing matching sources for your binaries is optional, whether or not "Unfortunately, this is extremely hard to fix." If you can't fix it, you can't ship those binaries. Any GPL copyright holder could sue you over it, today, and shut down your release process.

It also seems bizarre that Ubuntu can manage to build complete binary releases but can't build matching source releases. In every other shop I've ever dealt with, the source release is built first, THEN the binaries are built from the sources. Why can't Ubuntu make releases in the normal way, which would guarantee the ability to produce matching sources and binaries?

All versions of our source packages are available on Launchpad,
distributed by us. True, they aren't on the same server, but we do
distribute them for at least the lifetime required by the GPL - in fact,
I'm not aware that we've ever deleted any library copies of source
packages since we started using Launchpad for our archive.

  https://launchpad.net/ubuntu/+source/PACKAGE-NAME/+publishinghistory

I marked this as Importance: low because it's a CD assembly issue, not a
failure to provide source code (which would certainly be of much higher
importance).

Individual Ubuntu packages are obviously built as you say: source is
uploaded first, then binaries are built from that. However, we do not
rebuild the binaries from source every time we build a CD set - it would
be prohibitively slow. The difference from your analogy with other
shops is that CD images are assembled, not compiled: they're built from
pre-existing binaries taken from the Ubuntu archive. Since different CD
images are built at different times, there's version skew. We thus have
the following choices:

  1) Build individual source images for each binary image.

     This is what we used to do. However, the space cost is prohibitive
     because there's so much duplication, and for the same reason it's
     not convenient for users trying to keep a source archive. We
     therefore switched to ...

  2) Build a single set of source images during the binary image set
     cycle.

     This mostly works, but there's unfortunate skew.

  3) Once we've built binary images, retroactively construct a set of
     source images with the union of all the versions found therein.

     This is probably what we need to do.

Colin Watson (cjwatson) wrote :

I've added a note now to the HTML index files for natty alpha-2 and all future source image builds, giving directions for getting source packages from Launchpad in the event of version skew. I realise this doesn't meet ideal archiving requirements for source images, but I think it should make it clear that we're meeting our legal obligations.

If you still believe that we're failing to meet our obligations under the GPL, of course you're welcome to contact Canonical's legal department - they will no doubt send a rocket my way if that's so.

Colin Watson (cjwatson) wrote :

(I mean, if you don't feel you're getting satisfaction from me.)

Dimitri John Ledkov (xnox) wrote :

https://launchpad.net/ubuntu/+archive/primary/+files/$FILENAME is an automatic redirector. If I get my snapshot.ubuntu.com going, it will be a generic "archive" that has all versions of .dsc files and .debs (well, anything that launchpadlibrarian still has). Alternatively, I guess we can teach http://pad.lv/ to become an "Ubuntu mirror", e.g. accept normal archive-like URLs and redirect them to the Launchpad librarian as per above.
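
For a file that jigdo reports as missing, the two Launchpad URL patterns mentioned in this thread can be filled in mechanically. A minimal sketch, with hypothetical function names; it assumes the usual source file naming of "<package>_<version>..." to recover the package name, and it only builds URLs without fetching anything.

```python
from urllib.parse import quote

# URL patterns quoted in this thread.
FILES_REDIRECTOR = "https://launchpad.net/ubuntu/+archive/primary/+files/"
HISTORY_TEMPLATE = "https://launchpad.net/ubuntu/+source/{package}/+publishinghistory"


def launchpad_file_url(filename):
    """URL of the redirector for one source file."""
    return FILES_REDIRECTOR + quote(filename)


def publishing_history_url(filename):
    """URL of the publishing history page for the file's source package."""
    package = filename.split("_", 1)[0]
    return HISTORY_TEMPLATE.format(package=quote(package))


missing = "xserver-xorg-video-intel_2.13.901.orig.tar.gz"
print(launchpad_file_url(missing))
print(publishing_history_url(missing))
```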
