Xenial images won't reboot if disk size is > 2TB when using GPT
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
cloud-init | Won't Fix | Undecided | Unassigned |
grub2 (Ubuntu) | Fix Released | Undecided | Unassigned |
Xenial | Fix Released | High | Matthew Ruffell |
grub2-signed (Ubuntu) | Fix Released | Undecided | Unassigned |
Xenial | Fix Released | High | Eric Desrochers |
Bug Description
[Impact]
On Xenial images that use GPT instead of MBR to enable EFI-based booting, an instance with a disk size of 2049 GB or larger boots once and then hangs on the next boot (logs show it stopping at "Booting Hard Disk 0").
This is a problem in grub2 where the system would become unbootable after ext* online resize if no resize_inode was created at ext* format time.
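One way to see whether a given filesystem is in the affected state is to look at its feature flags. The sketch below is a hedged illustration, not part of the original report: it assumes the affected filesystems end up with `meta_bg` set after the online resize (which the subject line of the fix suggests), and the helper name `ext_feature_report` is made up for this example. It only parses the "Filesystem features:" line that `dumpe2fs -h` or `tune2fs -l` prints.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether an ext* superblock dump lists the
# meta_bg and resize_inode features. Reads `dumpe2fs -h` output on stdin.
ext_feature_report() {
  local features f
  # The header dump has one line of the form "Filesystem features: a b c".
  features=$(sed -n 's/^Filesystem features:[[:space:]]*//p')
  for f in meta_bg resize_inode; do
    case " $features " in
      *" $f "*) echo "$f: present" ;;
      *)        echo "$f: absent" ;;
    esac
  done
}

# Example (the device name is an assumption; use the root filesystem's device):
#   sudo dumpe2fs -h /dev/sda1 2>/dev/null | ext_feature_report
```

A filesystem that reports `meta_bg: present` would be the case the unpatched grub cannot read, under the assumption stated above.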
[Test Case]
To reproduce:
1) Create an instance with a disk size of 3072 GB from an image serial that uses GPT:
gcloud compute instances create test-3072-xenial --image daily-ubuntu-
2) Reboot the instance
The instance will hang on reboot and you cannot connect. If you go to GCP console and select Logs > Serial port 1 (console), you will see the boot process has stopped at "Booting Hard Disk 0".
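The same check can be done from a terminal instead of the GCP console. The following is a minimal sketch: `gcloud compute instances get-serial-port-output` fetches the serial console log, while the helper name `stalled_at_hard_disk` and the zone placeholder are assumptions for illustration. The helper treats the instance as hung when the last non-blank line of the log is the "Booting Hard Disk 0" message, which is a heuristic rather than a definitive test.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed (exit 0) when the last non-blank line of the
# console log on stdin contains "Booting Hard Disk 0".
stalled_at_hard_disk() {
  awk 'NF { last = $0 } END { exit (last ~ /Booting Hard Disk 0/) ? 0 : 1 }'
}

# Example (the zone is a placeholder, not taken from the report):
#   gcloud compute instances get-serial-port-output test-3072-xenial \
#       --zone <your-zone> 2>/dev/null | stalled_at_hard_disk \
#     && echo "boot appears to have stopped at Hard Disk 0"
```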
I have built a test package, which is available here:
https:/
To verify the fix, repeat step 1) but do not reboot; instead, add the PPA and install the new grub:
1) gcloud compute instances create test-3072-xenial --image daily-ubuntu-
2) sudo add-apt-repository ppa:mruffell/
3) sudo apt-get update
4) sudo apt remove grub-common grub-efi-amd64 grub-efi-amd64-bin grub-efi-
5) sudo apt install grub-common grub-efi-amd64 grub-efi-amd64-bin grub-pc-bin grub2-common
6) sudo grub-install /dev/sda
7) sudo reboot
The instance will boot successfully and you will be able to connect.
Note, we must use "daily-
[Regression Potential]
Grub is a core package, and every care must be taken not to introduce any regressions.
The commit is present in B, D, E and F, and is considered well tested and widely adopted by the community.
The commit comes with its own testcase, to test the ext4_metabg fix.
The changes are localised to ext* based filesystems, although since they are the most popular family of filesystems used by the community, this does not reduce risk of breakage by much.
A regression would have a large impact and, in the worst case, could lead to unbootable systems and data loss for users who are not technical enough to reinstall grub from a working package inside a chroot of the broken system.
[Other Info]
In comment #4, Sultan identifies the fix as:
commit e20aa39ea429801
Author: Vladimir Serbinenko <email address hidden>
Date: Mon Feb 16 20:53:26 2015 +0100
Subject: ext2: Support META_BG.
This commit is from upstream grub2, and can be found here:
https:/
Looking at when this was merged:
$ git describe --contains e20aa39ea429801
2.02-beta3~429
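The `2.02-beta3~429` output means the commit is reachable from the `2.02-beta3` tag, 429 commits back along its history. When checking several release branches or tags, a yes/no test can be easier to script than reading `git describe` output. The sketch below uses `git merge-base --is-ancestor` for that; the helper name `commit_in_ref` and the repository path in the example are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed when <commit> is an ancestor of (or equal to)
# <ref>. Usage: commit_in_ref <commit> <ref> [repo-dir]
commit_in_ref() {
  git -C "${3:-.}" merge-base --is-ancestor "$1" "$2"
}

# Example (the checkout path is a placeholder):
#   commit_in_ref e20aa39ea429801 2.02-beta3 /path/to/grub \
#     && echo "fix is included in 2.02-beta3"
```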
This commit is present in B, D, E and F, leaving X as the only version needing an SRU.
The commit cleanly cherry picks to X, because the delta from 2.02~beta2-
tags: | added: id-5d484a6466c79944a30e4644 |
description: | updated |
tags: | added: sts-sponsor-slashd |
Changed in grub2-signed (Ubuntu): | |
status: | New → In Progress |
importance: | Undecided → High |
assignee: | nobody → Eric Desrochers (slashd) |
affects: | grub (Ubuntu) → grub2 (Ubuntu) |
Changed in grub2-signed (Ubuntu Xenial): | |
assignee: | nobody → Eric Desrochers (slashd) |
importance: | Undecided → High |
Seems related or at least "close to" bug 1762748.
If nothing else, that bug has nice local recreate information.