Ubuntu is a Good Thing, and I would like to recommend it to my friends, giving them CDs. However, I cannot do this when I know that the installation CD or a routine kernel upgrade could leave their computer in an unbootable state, especially when it is now clear where the error lies. Please can we fix it?

Earlier, Steve spelt out the sequence

  1. kernel is unpacked
  2. update-grub is called
  3. initramfs is generated (in kernel package postinst)
  4. update-grub is *not* called
  5. kernel package postinst ends successfully, leaving the kernel configured but menu.lst broken

suggesting that the fault lies in step 4 or 5.

This is incorrect. The error is at step 2. As I said before, this fails to maintain the data invariant, namely that menu.lst should only ever contain valid kernel configurations.

If you don't like the theoretical description, let me spell it out with reference to the kernel upgrade that Ubuntu/Hardy has just done on my machine.

I ran "sudo apt-get update" and then "upgrade" from a terminal, and a number of other packages arrived at the same time:

  acpid libcamel1.2-11 libebook1.2-9 libecal1.2-7 libedataserver1.2-9 libldap-2.4-2 linux-headers-2.6.24-23 linux-headers-2.6.24-23-generic linux-image-2.6.24-23-generic linux-libc-dev python-apt python-gobject vim-common vim-runtime vim-tiny
...
Fetched 35.1MB in 1min57s (300kB/s)
(Reading database ... 153425 files and directories currently installed.)
Preparing to replace linux-image-2.6.24-23-generic 2.6.24-23.46 (using .../linux-image-2.6.24-23-generic_2.6.24-23.48_i386.deb) ...
Done.
Unpacking replacement linux-image-2.6.24-23-generic ...
Running postrm hook script /sbin/update-grub.
Searching for GRUB installation directory ... found: /boot/grub
Searching for default file ... found: /boot/grub/default
Testing for an existing GRUB menu.lst file ... found: /boot/grub/menu.lst
Searching for splash image ... none found, skipping ...
Found kernel: /boot/vmlinuz-2.6.24-23-generic
Found kernel: /boot/vmlinuz-2.6.24-22-generic
Found kernel: /boot/vmlinuz-2.6.24-21-generic
Found kernel: /boot/vmlinuz-2.6.24-19-generic
Found kernel: /boot/memtest86+.bin
Replacing config file /var/run/grub/menu.lst with new version
Updating /boot/grub/menu.lst ... done

Now we have in /boot (please notice the dates):

   422838 2009-01-26 04:30 abi-2.6.24-23-generic
    80051 2009-01-26 04:30 config-2.6.24-23-generic
  7504607 2009-01-20 17:19 initrd.img-2.6.24-23-generic
  7504424 2009-01-15 15:43 initrd.img-2.6.24-23-generic.bak
   905809 2009-01-26 04:30 System.map-2.6.24-23-generic
  1922904 2009-01-26 04:30 vmlinuz-2.6.24-23-generic

So if I had had a power failure at this point, a reboot would try to run the kernel (vmlinuz) of 26th Jan with the (presumably incompatible) image (initrd) of 20th Jan. Alternatively, if a kernel with a new version number had been installed, there would have been no initrd at all. Either way, menu.lst is now CORRUPT, because it contains an invalid kernel configuration.

There are a lot more installations, and therefore opportunities for crashes and power failures, before

Unpacking replacement linux-headers-2.6.24-23 ...
Preparing to replace linux-headers-2.6.24-23-generic 2.6.24-23.46 (using .../linux-headers-2.6.24-23-generic_2.6.24-23.48_i386.deb) ...
Unpacking replacement linux-headers-2.6.24-23-generic ...

Yet more installations take place before

Setting up linux-image-2.6.24-23-generic (2.6.24-23.48) ...
Running depmod.
update-initramfs: Generating /boot/initrd.img-2.6.24-23-generic
Not updating initrd symbolic links since we are being updated/reinstalled (2.6.24-23.46 was configured last, according to dpkg)
Not updating image symbolic links since we are being updated/reinstalled (2.6.24-23.46 was configured last, according to dpkg)
Running postinst hook script /sbin/update-grub.
Searching for GRUB installation directory ... found: /boot/grub
Searching for default file ...
found: /boot/grub/default
Testing for an existing GRUB menu.lst file ... found: /boot/grub/menu.lst
Searching for splash image ... none found, skipping ...
Found kernel: /boot/vmlinuz-2.6.24-23-generic
Found kernel: /boot/vmlinuz-2.6.24-22-generic
Found kernel: /boot/vmlinuz-2.6.24-21-generic
Found kernel: /boot/vmlinuz-2.6.24-19-generic
Found kernel: /boot/memtest86+.bin
Updating /boot/grub/menu.lst ... done

Finally /boot/grub/menu.lst is back in a safe state.

So this is what I propose:

update-grub should merely CONCATENATE (some list of grub options and) kernel configuration items that are in individual files, and not try to implement any of the "logic" of matching kernels with initrds.

I don't understand what the circumstances might be in which a kernel might not want an initrd, but let us imagine a very general situation in which there are Unix kernels from different Linux distros, Mac OS X, Solaris, BSD, etc. Each of them is built and installed in a different way, and needs different options in grub.

So, as the last stage of the installation of a particular kernel, some script (belonging to that kernel installation package) checks that all of its bits are in place, and generates a file that update-grub will subsequently add to menu.lst.

There is the opportunity here to use this file for other purposes, communicating between the installation steps, and telling the user where to find log files for debugging purposes. All of this information would be "commented out" with # so that grub doesn't treat it as an active kernel, EXCEPT where it actually provides a valid kernel configuration.

For the purposes of disaster recovery, grub itself should have facilities for showing menu.lst and any other files, in particular debugging log files, and should also give URLs of Linux help pages and of a Launchpad page where advice can be found and bugs reported.

By this mechanism, you will be able to get debugging information back about why mkinitramfs fails, for example.

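To make the proposal concrete, here is a rough sketch in Python of the two halves: a per-kernel step that writes a menu fragment, and an update-grub that only concatenates fragments. Everything named here is my own invention for illustration: the fragment directory, the ".cfg" suffix, the function names, the header text and the simplified menu entry are not taken from the real update-grub or kernel packages. It also assumes that every kernel needs an initrd, which may not hold in general.

```python
import os
import tempfile
from pathlib import Path


def write_fragment(boot: Path, frag_dir: Path, version: str, root: str) -> bool:
    """Write the menu fragment for one kernel; return True if the entry is active.

    Hypothetical helper: meant to run as the LAST step of a kernel's postinst.
    """
    kernel = boot / f"vmlinuz-{version}"
    initrd = boot / f"initrd.img-{version}"
    lines = [
        f"title  Ubuntu, kernel {version}",
        f"kernel /boot/{kernel.name} root={root} ro",
        f"initrd /boot/{initrd.name}",
    ]
    # Assumption: a kernel is "complete" when both files exist.
    complete = kernel.is_file() and initrd.is_file()
    if not complete:
        # Comment everything out so grub never sees a half-installed kernel.
        lines = ["# INCOMPLETE: kernel or initrd missing"] + ["# " + l for l in lines]
    frag_dir.mkdir(parents=True, exist_ok=True)
    (frag_dir / f"{version}.cfg").write_text("\n".join(lines) + "\n")
    return complete


def update_menu(frag_dir: Path, menu: Path,
                header: str = "default 0\ntimeout 3\n") -> None:
    """Concatenate the header and all fragments; no kernel/initrd matching here."""
    parts = [header]
    # Naive ordering: newest first by plain string comparison of file names.
    for frag in sorted(frag_dir.glob("*.cfg"), reverse=True):
        parts.append(frag.read_text())
    # Write to a temporary file and rename it over menu.lst, so that a crash
    # leaves either the old menu or the new one, never a partial file.
    fd, tmp = tempfile.mkstemp(dir=menu.parent)
    with os.fdopen(fd, "w") as f:
        f.write("\n".join(parts))
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, menu)
```

With this arrangement, calling update_menu between "unpack" and "generate initramfs" is harmless: the fragment of the kernel being replaced is either still the old valid one or is commented out, so menu.lst never lists a kernel whose initrd is missing.
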
update-grub may find NO valid kernel configurations. We avoid this by adopting the usual methods of preventing the user from deleting the last remaining valid kernel unless s/he really intends to do so.

This leaves the case of a fresh installation (Ubiquity). Of necessity, this must be running under SOME kernel, so Ubiquity should begin by installing this, and creating the initial menu.lst containing just it. Alternatively, it should copy a minimal kernel straight from disk.

I couldn't follow the source code for grub and ubiquity, but I did notice that ubiquity duplicates code that is like that in update-grub. Duplication of code is a bad idea for correctness reasons; it should CALL update-grub instead.

Besides this, I have said before that Ubiquity should not trash the disk, but try to leave any existing Unix system in a recoverable state. It should also have, in addition to live CD running and installation, a recovery mode for fixing a previous broken installation attempt.
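
Returning to the point about never deleting the last remaining valid kernel, the check could be as small as the following sketch. The function names and the "force" flag are hypothetical, and "valid" is again assumed to mean that both vmlinuz and initrd.img exist for a version; a real package removal script would have to decide what validity means for its own kernel.

```python
from pathlib import Path

PREFIX = "vmlinuz-"


def valid_kernels(boot: Path) -> set[str]:
    """Versions for which both the kernel and its initrd are present."""
    versions = {p.name[len(PREFIX):] for p in boot.glob(PREFIX + "*")}
    return {v for v in versions if (boot / f"initrd.img-{v}").is_file()}


def may_remove(boot: Path, version: str, force: bool = False) -> bool:
    """Refuse to remove a kernel if no other valid kernel would remain."""
    remaining = valid_kernels(boot) - {version}
    # 'force' stands for the user saying s/he really intends to do this.
    return force or bool(remaining)
```

A removal script would call may_remove before touching /boot and abort with an explanation when it returns False.
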