grub stuck on loading kernel, fails to ls zfs and swap partitions
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
grub2 (Ubuntu) | | Undecided | Unassigned | |
Bug Description
I did a fresh install on a test laptop of 20.04 a while back, and after today's update, it no longer boots. Today's update included 2.31-0ubuntu6, but other 20.04 machines of mine also applied that and didn't fail, so I can't really be sure it's the cause.
I also have zsys installed, and I noticed there are many snapshots of my datasets, and grub has a new menu entry about history.
All that being said, I'm still troubleshooting and so far these are the facts:
- partition layout:
Device Start End Sectors Size Type
/dev/sda1 2048 1050623 1048576 512M EFI System
/dev/sda2 1050624 5244927 4194304 2G Linux swap
/dev/sda3 5244928 9439231 4194304 2G Solaris boot
/dev/sda4 9439232 937703054 928263823 442.6G Solaris root
- grub hangs when doing an ls on (hd0,gpt2), which is swap
- grub hangs when doing an ls on (hd0,gpt4).
- zfs-info (from grub's command line) on gpt3 is happy. It identifies it as bpool
- zfs-info on gpt4 complains it has unsupported features, but does not hang
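To make the mapping between the fdisk rows and the GRUB device names used above easier to follow, here is a minimal Python sketch. It only re-reads the listing from this report; the 512-byte sector size is an assumption, and the `(hd0,gptN)` naming follows the description above, where `(hd0,gpt2)` is the swap partition.

```python
import re

# Partition rows copied from the fdisk listing in this report.
FDISK_ROWS = """\
/dev/sda1 2048 1050623 1048576 512M EFI System
/dev/sda2 1050624 5244927 4194304 2G Linux swap
/dev/sda3 5244928 9439231 4194304 2G Solaris boot
/dev/sda4 9439232 937703054 928263823 442.6G Solaris root
"""

SECTOR_BYTES = 512  # assumption: 512-byte logical sectors


def parse_rows(text):
    """Return one dict per partition row of the listing."""
    parts = []
    for line in text.splitlines():
        device, start, end, sectors, size, ptype = line.split(maxsplit=5)
        index = int(re.search(r"\d+$", device).group())
        parts.append({
            "device": device,
            "grub": f"(hd0,gpt{index})",  # naming as used in this report
            "start": int(start),
            "end": int(end),
            "sectors": int(sectors),
            "size": size,
            "type": ptype,
        })
    return parts


partitions = parse_rows(FDISK_ROWS)
for p in partitions:
    # Start and end are both inclusive, so sectors = end - start + 1.
    assert p["sectors"] == p["end"] - p["start"] + 1
    gib = p["sectors"] * SECTOR_BYTES / 2**30
    print(f'{p["grub"]:<11} {p["device"]}  {gib:8.1f} GiB  {p["type"]}')
```

Read this way, the two partitions where `ls` hangs are gpt2 (Linux swap) and gpt4 (Solaris root, the rpool side), while gpt3 (Solaris boot) is the one zfs-info identifies as bpool.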
When booting the system, it hangs right after I select a menu entry in grub, regardless of which one. Even the history ones from zfs hang, although I didn't try all. It looks just like the hang on the simple ls command.
This is grub2 2.04-1ubuntu22, and zsys 0.4.1. I'll see if I can run collect from inside the mounted system in a chroot from a rescue image, and attach info to this log.
---
ProblemType: Bug
ApportVersion: 2.20.11-0ubuntu20
Architecture: amd64
DistroRelease: Ubuntu 20.04
InstallationDate: Installed on 2020-02-29 (15 days ago)
InstallationMedia: Ubuntu 20.04 LTS "Focal Fossa" - Alpha amd64 (20200228)
NonfreeKernelMo
Package: grub2-common 2.04-1ubuntu22
PackageArchitec
ProcVersionSign
Tags: focal
Uname: Linux 5.4.0-14-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups:
_MarkForUpload: True
Andreas Hasenack (ahasenack) wrote : | #1 |
apport information
description: | updated |
tags: | added: apport-collected focal |
summary: |
- grub stuck on loading kernel, fails to ls zfs partition
+ grub stuck on loading kernel, fails to ls zfs and swap partitions |
Andreas Hasenack (ahasenack) wrote : | #5 |
If I select advanced, then safe mode, all I see is
Loading Linux 5.4.0-14-generic ...
<cursor>
I've left it at that for about 30min, nothing changes. ctrl-alt-del also doesn't work, nor does sysrq boot.
Andreas Hasenack (ahasenack) wrote : | #6 |
Another important bit of information is that rpool (NOT bpool) has zfs encryption enabled. This was working just fine: I got a prompt for the password during boot (graphical even). But now the kernel doesn't even load.
Andreas Hasenack (ahasenack) wrote : | #7 |
I just ran dist-upgrade, fetched the latest focal (not proposed) updates, ran update-grub and grub-install: same result.
tags: | added: champagne |
Andreas Hasenack (ahasenack) wrote : | #8 |
I removed all the zsys snapshots (I had about 17), and then it booted fine. grub must have some trouble when there are many zfs snapshots. I tried removing only the last one and rebooting, and it still wouldn't boot, but in that attempt I didn't run update-grub, update the initramfs, or reinstall grub.
Note that update-grub was listing the kernels from all those snapshots.
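For anyone wanting to check how many snapshots have piled up per dataset, a rough Python sketch follows. It assumes the input is the text printed by `zfs list -H -t snapshot -o name` (one `dataset@snapshot` name per line). The dataset and snapshot names in `SAMPLE` are hypothetical, since the real ones from the affected laptop are not in this report.

```python
from collections import Counter


def count_snapshots(listing):
    """Count snapshots per dataset from `zfs list -H -t snapshot -o name` text."""
    counts = Counter()
    for line in listing.splitlines():
        name = line.strip()
        if "@" not in name:
            continue  # skip blank lines or anything that is not a snapshot
        dataset, _snapshot = name.split("@", 1)
        counts[dataset] += 1
    return counts


# Hypothetical sample input, for illustration only.
SAMPLE = """\
bpool/BOOT/ubuntu_abc123@autozsys_one
bpool/BOOT/ubuntu_abc123@autozsys_two
rpool/ROOT/ubuntu_abc123@autozsys_one
"""

counts = count_snapshots(SAMPLE)
for dataset, n in sorted(counts.items()):
    print(f"{n:3d}  {dataset}")
```

On a real system the listing would be captured from the command and passed to `count_snapshots`. This only counts snapshots; it says nothing about what number grub can handle.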
Steve Langasek (vorlon) wrote : | #9 |
There is definitely a bug in grub here if it's hanging, but it seems like zsys should be doing a better job of garbage collecting old snapshots instead of letting this list grow to the point that it breaks grub.
Didier Roche (didrocks) wrote : | #10 |
The ZSys part of the issue (garbage collection not being aggressive enough when reaching the bpool or rpool size limit) is handled in bug #1876334.
I’ll just add a reference there and remove the ZSys task instead of dupping so that the foundation team can handle the grub side of it.
no longer affects: | zsys (Ubuntu) |
tags: |
added: rls-gg-notfixing removed: champagne |
photo showing the hang, and where it works