Comment 8 for bug 1103187

Péter Prőhle (prohlep) wrote : Re: automatic updates tend to reboot and die into grub rescue

Thanks for your time and suggestions. I think we are getting closer to narrowing down the source of the problem. The phenomenon appears on another box as well, and I also did a fresh install with an XFS root partition containing the /boot/ tree. The deterministically repeatable sequence is:

        take a 12.10 with XFS root partition containing the /boot/ tree

        dpkg-reconfigure grub-pc ; sync

        reboot (either by command, or by menu)

        various error messages + grub rescue + /boot/ is only partially accessible

        boot UBUNTU 12.10 pendrive + choose "try Ubuntu" + open a terminal

        sudo -s; mkdir foo; mount [the XFS root partition] foo; umount foo

        reboot into the "original" Ubuntu is NOW successful
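
The recovery step above (mount the XFS root partition from a live session, then unmount it) can be sketched as a small shell function. This is only a sketch under assumptions: replay_xfs_log, run_cmd and the DRY_RUN switch are hypothetical names introduced here for illustration, and the device path must be replaced with your own XFS root partition.

```shell
#!/bin/sh
# Sketch: mount and then unmount a partition so that the kernel replays the
# pending XFS log. Meant to be run as root from a live session.
# replay_xfs_log, run_cmd and DRY_RUN are illustrative names only.

# Run a command, or just print it when DRY_RUN is set to a non-empty value.
run_cmd() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: replay_xfs_log DEVICE [MOUNTPOINT]
replay_xfs_log() {
    dev="${1:-}"
    mnt="${2:-foo}"   # "foo" mirrors the directory name used in the steps above
    if [ -z "$dev" ]; then
        echo "usage: replay_xfs_log DEVICE [MOUNTPOINT]" >&2
        return 2
    fi
    run_cmd mkdir -p "$mnt" &&
    run_cmd mount "$dev" "$mnt" &&
    run_cmd umount "$mnt"
}
```

With DRY_RUN=1 set, a call such as replay_xfs_log followed by the device name only prints the mkdir, mount and umount commands instead of running them, which is a cheap way to check the device name first.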

The key information is that, in the "try Ubuntu" session, xfs_check on my XFS root partition reported: "error: the filesystem has valuable metadata changes in a log which needs to be replayed. Mount the filesystem to replay the log".

This gave me the idea that what matters is not booting into my Ubuntu via SYSRESCCD as such, but its side effect: the root partition gets mounted, so the pending changes in the log are replayed. After that, what "dpkg-reconfigure grub-pc" intended and what is physically in the blocks on the drive agree with each other, and hence all later reboots succeed.

That is why I stopped booting my Ubuntu with the help of SYSRESCCD. Instead I simply get into any running Linux (say with "try Ubuntu") and mount the partition in question. This works on all 3 of my boxes, and even on the fresh install you suggested. Hence the problem lies somewhere around the temporary inconsistency caused by the delayed writes of the journaled filesystem in question.

Yet another error message: ELF sections outside core.

It is not clear how much "dpkg-reconfigure grub-pc" is responsible, since it should perhaps be prepared for journaled filesystems.

Putting the /boot/ tree on a NON-root partition appears to be a reliable workaround, probably because the shutdown process can bring down a NON-root partition coherently, even if there are delayed transactions in its journal.
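
To see whether a given box is in the affected layout (the /boot/ tree living on the root partition), the devices backing the two paths can be compared. The following is a minimal sketch; fs_device and same_filesystem are hypothetical helper names introduced here, and it relies only on the POSIX output format of df.

```shell
#!/bin/sh
# Sketch: check whether two paths live on the same filesystem.
# fs_device and same_filesystem are illustrative names only.

# Print the device (first column of "df -P") that backs the given path.
fs_device() {
    df -P "$1" | awk 'NR == 2 { print $1 }'
}

# Succeed when both paths are backed by the same, non-empty device name.
same_filesystem() {
    a=$(fs_device "$1")
    b=$(fs_device "$2")
    [ -n "$a" ] && [ "$a" = "$b" ]
}
```

For example, "same_filesystem /boot / && echo shared" prints "shared" only when /boot/ is on the root partition, which is the setup where the problem was reproduced.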

In the case of the root partition, perhaps the shutdown process should, as a final stage, migrate to a RAM-rooted system, so that it can bring down the original root partition the same way as all the other NON-root partitions.