friendly-recovery generates a bad grub.cfg in a narrow set of conditions
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
friendly-recovery (Ubuntu) | New | Undecided | Unassigned |
Bug Description
friendly-recovery (on 16.04 at least) runs update-grub in its postinst. If friendly-recovery and a linux-image get updated within the same apt command (particularly, linux-image-
Kernel panic - not syncing: VFS: unable to mount root fs on unknown-block(0,0)
instead of a more graceful/less frightening grub error.
Under normal conditions this is hard to discover - as long as the rest of the "dist-upgrade" runs to completion and the initrd actually gets built, the kernel package tools make sure update-grub gets run again, and this time it finds the initrd and produces a valid config. This is the "narrow" window - any crash after the inaccurate grub.cfg gets written and before it gets fixed leads to a machine that panics on boot, though you can extend the window arbitrarily simply by rebooting at the right time.
I caught it because I was also upgrading another package of my own which had a buggy /etc/grub.d plugin, so when the kernel postinst.d ran, it generated a *syntactically* invalid grub.cfg, which was discarded by update-grub but did *not* fail the dist-upgrade, just produced more text among a bunch of other text (fortunately for debugging, this was visible in /var/log/
Is this obscure and hard to trigger? Yes.
Can the end user recover from it? As long as they're actually in front of the machine to select an alternate grub menu item, and have an older kernel (which is likely, since this needs a linux-image package upgrade to trigger - if the linux-image upgrade happened in an earlier apt command, friendly-
I don't actually have advice on fixing this. The obvious choices would be the kernel-postinst mechanism, which isn't really available for other packages to trigger, and the direct update-grub dpkg-trigger, which went away after grub-legacy was replaced with grub2. Still, I think "never write an invalid grub.cfg" is a reasonable rule...
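To make the "never write an invalid grub.cfg" idea a bit more concrete, here is a minimal sketch of a pre-write sanity check. It is not part of friendly-recovery, grub, or the kernel packaging tools; the function name `entries_missing_initrd` and the `exists` callback are hypothetical, and the parser is deliberately simplistic (it only understands flat `menuentry` blocks with `linux` and `initrd` lines, as update-grub typically generates them). It also assumes every kernel entry is expected to have an initrd, which is not true of every setup.

```python
import re
from typing import Callable, List, Tuple

# Matches the quoted title on a "menuentry '...'" line (hypothetical helper).
TITLE_RE = re.compile(r"""^menuentry\s+(['"])(.*?)\1""")

LINUX_CMDS = ("linux", "linux16", "linuxefi")
INITRD_CMDS = ("initrd", "initrd16", "initrdefi")


def entries_missing_initrd(
    cfg_text: str, exists: Callable[[str], bool]
) -> List[Tuple[str, str, str]]:
    """Return (title, kernel, reason) for each menuentry that loads a
    kernel but has no initrd line, or names an initrd that `exists`
    reports as absent. Paths are passed to `exists` exactly as written
    in the config; mapping them to real files is left to the caller."""
    problems = []
    title = kernel = initrd = None
    for raw in cfg_text.splitlines():
        line = raw.strip()
        match = TITLE_RE.match(line)
        if match:
            # Start of a new menuentry block.
            title, kernel, initrd = match.group(2), None, None
            continue
        if title is None:
            continue  # outside any menuentry (e.g. submenu braces)
        words = line.split()
        if not words:
            continue
        if words[0] in LINUX_CMDS and len(words) > 1:
            kernel = words[1]
        elif words[0] in INITRD_CMDS and len(words) > 1:
            initrd = words[1]
        elif line == "}":
            # End of the menuentry: judge what we collected.
            if kernel is not None:
                if initrd is None:
                    problems.append((title, kernel, "no initrd line"))
                elif not exists(initrd):
                    problems.append(
                        (title, kernel, "initrd file missing: " + initrd)
                    )
            title = None
    return problems
```

A caller could run this on a freshly generated config before moving it into place, passing something like `os.path.exists` wrapped in whatever path mapping the system needs (the paths in grub.cfg are relative to the boot filesystem, which may or may not be mounted at `/boot`). A non-empty result would then be a reason to keep the previous grub.cfg rather than overwrite it.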