ucf and update-grub semantics are incompatible

Bug #239674 reported by Russ Allbery on 2008-06-13
This bug affects 3 people
Affects: grub (Ubuntu)
Importance: Undecided
Assigned to: Unassigned

Bug Description

Binary package hint: grub

Ubuntu hardy (amd64)
grub 0.97-29ubuntu21

The Ubuntu-specific modification to use ucf for managing menu.lst breaks the traditional update-grub semantics in fundamental ways. If I understand the architecture correctly, I'm not sure how using ucf could ever support the normal workflow that's possible using the Debian update-grub.

ucf's goal is to preserve local changes to a configuration file, but its notion of local changes is contrary to how update-grub is supposed to work. Suppose that you're a large organization with a lot of systems with a variety of kernel revisions, for one reason or another, but with a desire to have uniform kernel parameters like console speed and timeouts. With update-grub, you can just copy your menu.lst template to all of your systems and run update-grub afterwards, and update-grub will apply your new defaults and generate a new kernel list using them. However, with Ubuntu's update-grub, it generates the initial menu.lst and kernel list, and then when you replace that with a fresh template, ucf sees that you've removed the kernel list. It thinks that's a local modification, and when update-grub generates a new kernel list, it "preserves" your modification and leaves the kernel list out of the new menu.lst. Tada, instant unbootable system.
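
The failure mode described above can be reproduced with a toy model. The sketch below is not ucf's actual implementation. It assumes only the behavior described in this report: a recorded checksum of the last generated file, and a "keep the local version" decision whenever the installed file no longer matches that checksum. The function names (toy_sum, toy_install, toy_purge) and the state-file layout are hypothetical.

```shell
#!/bin/sh
# Toy model of "preserve local changes" (hypothetical; NOT ucf's real code).
# STATE holds the md5 of the last generated file that was actually installed.

toy_sum() { md5sum < "$1" | cut -d' ' -f1; }

# toy_install NEW DEST STATE
# Installs NEW over DEST only if DEST is missing, there is no recorded sum,
# or DEST still matches the recorded sum. Otherwise DEST is treated as
# locally modified and is left untouched.
toy_install() {
    new=$1 dest=$2 state=$3
    if [ ! -f "$dest" ] || [ ! -f "$state" ] ||
       [ "$(toy_sum "$dest")" = "$(cat "$state")" ]; then
        cp "$new" "$dest"
        toy_sum "$new" > "$state"
        echo "installed"
    else
        echo "kept local version"
    fi
}

# toy_purge STATE: forget the recorded sum (similar in spirit to ucf --purge).
toy_purge() { rm -f "$1"; }
```

In this model, overwriting DEST with a fresh template makes every later toy_install call report "kept local version", no matter what the newly generated file contains, until toy_purge is run. That matches the sticky behavior described in this report.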

Even worse, this bug is sticky: you can re-run update-grub all you want, it succeeds without errors, and it keeps generating an empty menu.lst. This is a completely mystifying error until you dig into the details of what's going on. To recover, you have to run ucf --purge /var/run/grub/menu.lst (and that path, or for that matter any of the ucf stuff, is completely undocumented in the update-grub man page) and then re-run update-grub, at which point ucf will display a debconf prompt (which doesn't support debconf preseeding, but that's another problem). If you then select three-way merge (not the default) from that prompt, you get a correct menu.lst.

The specific problem I had was that we were copying our template over and re-running update-grub in an FAI post-install script, resulting in a system that always had an empty kernel menu and wouldn't boot. Once I finally figured out what was going on, I worked around it by being very careful to put our template in place before the initial system bootstrap and then never changing menu.lst afterwards, but this is a bad limitation for a large site that scales system administrators by automating configuration file updates.

It's possible that the right solution here is to change to a completely different configuration that uses a separate file for the template and explicitly supports custom kernel menu entries rather than reverting to the Debian behavior, but the current solution in Ubuntu breaks capabilities of update-grub that we expected to be able to use. Reverting to the Debian behavior, which we've never had any trouble with, would be an improvement over the current Ubuntu version for us.
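
One way to picture the separate-template design suggested above: the generated menu.lst becomes a pure function of a site template plus the kernels found on disk, so there is nothing for a change-preserving tool to second-guess. The sketch below is purely illustrative. The render_menu function and the fixed "root (hd0,0)" and "root=/dev/sda1" values are assumptions, not anything the grub package provides.

```shell
# render_menu TEMPLATE BOOTDIR   (hypothetical helper, for illustration only)
# Prints the template verbatim, then one GRUB-legacy stanza per kernel image.
render_menu() {
    template=$1 bootdir=$2
    cat "$template"
    for k in "$bootdir"/vmlinuz-*; do
        [ -e "$k" ] || continue              # the glob matched nothing
        ver=${k##*/vmlinuz-}                 # e.g. 2.6.24-21-generic
        printf '\ntitle\t\tLinux, kernel %s\n' "$ver"
        printf 'root\t\t(hd0,0)\n'           # assumed boot partition
        # assumed root device and kernel path
        printf 'kernel\t\t/boot/vmlinuz-%s root=/dev/sda1 ro\n' "$ver"
    done
}
```

With something like this, replacing the template and regenerating always yields the new defaults plus a full kernel list, which is the workflow the Debian update-grub supports.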

Andy Wettstein (ajw-uiuc) wrote :

I just want to add a me too for this bug. I'm not using any custom kernels, but I just have a noninteractive script that does my software updates, so I set UCF_FORCE_CONFFOLD=YES in the script and this causes nothing but problems for grub menu.lst configuration.

Thanks to this bug report, it does seem that I can do this to work around the problem in my script:
   unset UCF_FORCE_CONFFOLD
   export UCF_FORCE_CONFFNEW=YES
   ucf --purge /var/run/grub/menu.lst
   update-grub

I have spent hours trying to figure out why update-grub would say it found new kernels, yet never update menu.lst. There should at least be an error message of some sort if menu.lst didn't actually get updated with the kernels it said it found. For example, without purging /var/run/grub/menu.lst, here is the output of update-grub with no UCF environment variables set:

Searching for GRUB installation directory ... found: /boot/grub
Searching for default file ... found: /boot/grub/default
Testing for an existing GRUB menu.lst file ... found: /boot/grub/menu.lst
Searching for splash image ... none found, skipping ...
Found kernel: /boot/vmlinuz-2.6.24-21-generic
Found kernel: /boot/vmlinuz-2.6.24-19-generic
Found kernel: /boot/memtest86+.bin
Updating /boot/grub/menu.lst ... done

At this point menu.lst does not have the 2.6.24-21-generic kernel. This kernel was installed with my noninteractive script, so UCF_FORCE_CONFFOLD=YES would have been set when it was initially installed.
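
Since update-grub reports success either way, a script can check the result itself. The following is a minimal sketch, assuming that every kernel image under /boot should be mentioned by file name somewhere in menu.lst. The missing_kernels helper is hypothetical and not part of grub or ucf.

```shell
# missing_kernels MENU BOOTDIR   (hypothetical helper)
# Prints each kernel image in BOOTDIR whose file name does not appear in MENU.
# Returns 0 if nothing is missing, 1 otherwise.
# Note: this is a plain substring match, so commented-out entries also count.
missing_kernels() {
    menu=$1 bootdir=$2 rc=0
    for k in "$bootdir"/vmlinuz-*; do
        [ -e "$k" ] || continue              # the glob matched nothing
        if ! grep -qF -- "${k##*/}" "$menu"; then
            echo "$k"
            rc=1
        fi
    done
    return $rc
}
```

A noninteractive update script could run missing_kernels /boot/grub/menu.lst /boot right after update-grub and abort on a non-zero exit status, rather than finding out at the next reboot.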

This problem affects me, too. We would like to do a mass rollout with FAI (Fully Automatic Installation) and use some default values in our menu.lst. Unfortunately this doesn't work together with Ubuntu's update-grub.
FAI-specific Link for this problem: http://faiwiki.informatik.uni-koeln.de/index.php/FAIUbuntuGrubProblems

Stephan Adig (sadig) wrote :

Marking it as "Won't fix".
As Patrick wrote, the fix is http://faiwiki.informatik.uni-koeln.de/index.php/FAIUbuntuGrubProblems :)

Regards,

\sh

Changed in grub (Ubuntu):
status: New → Won't Fix

Russ Allbery (rra-debian) wrote :

I don't agree with that rationale for marking it won't fix.

Regardless of whether FAI has a documented workaround for this bug, this is a bug in Ubuntu's grub package, where a normal user interaction with the grub configuration file leads to extremely strange results and an unbootable system. There are other ways to encounter this situation besides FAI, such as any configuration management system that wants to update the general initial settings in the menu.lst file.

Changed in grub (Ubuntu):
status: Won't Fix → New

Steve Langasek (vorlon) wrote :

I agree that's not a good rationale for a wontfix here.

FWIW, the long-term solution expected here is that we will convert systems from grub1 to grub2 on upgrade to lucid, at which point all the ucf handling goes away because we have a sensible config file format; which means that unfortunately the bug in grub1 (which I agree is a bug, and I regret not noticing when this was being developed) isn't likely to receive further attention before the grub package becomes obsolete entirely.

Stephan Adig (sadig) wrote :

Well,

guys, if you install an Ubuntu Server or Desktop with FAI it's totally different from a kickstart+D-I preseeding auto install.
The ucf + grub/menu.lst handling is working when using the kickstart + D-I preseeding installation method, but it's not usable (without the workaround) when using FAI.

Grub2 installation is working fine in the latest FAI version and Ubuntu auto install the FAI way... That's the reason why I do think the bugreport, (which was filed as report to the ubuntu fai team) is a "won't fix".

As for someone changing menu.lst manually: that person has to know how to "disable" ucf and how to deal with kernel upgrades manually.

But that's my reasoning for setting this bug to "won't fix"

Regards,

\sh

Steve Langasek (vorlon) wrote :

> the bugreport, (which was filed as report to the ubuntu fai team)

No, it wasn't.

Stephan Adig (sadig) wrote :

@steve:

sorry...I found this bug via ubuntu fai team....just checked the activity...you are right.

\sh

Sergey Svishchev (svs) wrote :

There is another way to install Ubuntu without using D-I or FAI, and that is VMBuilder.

Stephan Adig (sadig) wrote :

@Sergey:

but not in a real world datacenter ;)
