grub stuck on loading kernel, fails to ls zfs and swap partitions

Bug #1867542 reported by Andreas Hasenack
This bug affects 1 person
Affects         Status  Importance  Assigned to  Milestone
grub2 (Ubuntu)  New     Undecided   Unassigned

Bug Description

I did a fresh install on a test laptop of 20.04 a while back, and after today's update, it no longer boots. Today's update included 2.31-0ubuntu6, but other 20.04 machines of mine also applied that and didn't fail, so I can't really be sure it's the cause.

I also have zsys installed, and I noticed there are many snapshots of my datasets, and grub has a new menu entry about history.

All that being said, I'm still troubleshooting and so far these are the facts:

- partition layout:
Device       Start       End   Sectors   Size Type
/dev/sda1     2048   1050623   1048576   512M EFI System
/dev/sda2  1050624   5244927   4194304     2G Linux swap
/dev/sda3  5244928   9439231   4194304     2G Solaris boot
/dev/sda4  9439232 937703054 928263823 442.6G Solaris root

- grub hangs when doing an ls on (hd0,gpt2), which is swap
- grub hangs when doing an ls on (hd0,gpt4).
- zfs-info (from grub's command line) on gpt3 is happy. It identifies it as bpool
- zfs-info on gpt4 complains it has unsupported features, but does not hang
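
To make the mapping between the partition table and the grub device names above easier to follow, here is a minimal Python sketch. It copies the rows from the layout in this report, checks that each sector count agrees with the inclusive Start/End range, and prints the corresponding grub name. It assumes 512-byte logical sectors and that /dev/sda is grub's hd0 (consistent with (hd0,gpt2) being the swap partition); the helper names `grub_name` and `describe` are made up for illustration.

```python
# Partition rows copied from the report: (device, start, end, sectors, type).
PARTITIONS = [
    ("/dev/sda1", 2048, 1050623, 1048576, "EFI System"),
    ("/dev/sda2", 1050624, 5244927, 4194304, "Linux swap"),
    ("/dev/sda3", 5244928, 9439231, 4194304, "Solaris boot"),
    ("/dev/sda4", 9439232, 937703054, 928263823, "Solaris root"),
]

SECTOR_BYTES = 512  # assumption: 512-byte logical sectors


def grub_name(device, disk_index=0):
    """Map /dev/sdaN to grub's (hdX,gptN) form; disk_index=0 assumes sda is hd0."""
    number = int(device[len("/dev/sda"):])
    return f"(hd{disk_index},gpt{number})"


def describe(partitions):
    """Return (grub name, type, size in MiB) per row, validating sector counts."""
    rows = []
    for device, start, end, sectors, kind in partitions:
        # Start and End are inclusive, so the count is end - start + 1.
        if end - start + 1 != sectors:
            raise ValueError(f"{device}: sector count does not match Start/End")
        size_mib = sectors * SECTOR_BYTES / 2**20
        rows.append((grub_name(device), kind, size_mib))
    return rows


if __name__ == "__main__":
    for name, kind, size_mib in describe(PARTITIONS):
        print(f"{name:<12} {kind:<13} {size_mib:>10.1f} MiB")
```

With that mapping, the two partitions where ls hangs are the swap partition (gpt2) and the Solaris root partition (gpt4), while gpt3 is the Solaris boot partition that zfs-info identifies as bpool.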

When booting the system, it hangs right after I select a menu entry in grub, regardless of which one. Even the history ones from zfs hang, although I didn't try all. It looks just like the hang on the simple ls command.

This is grub2 2.04-1ubuntu22, and zsys 0.4.1. I'll see if I can run apport-collect from inside the mounted system in a chroot from a rescue image, and attach the info to this bug.
---
ProblemType: Bug
ApportVersion: 2.20.11-0ubuntu20
Architecture: amd64
DistroRelease: Ubuntu 20.04
InstallationDate: Installed on 2020-02-29 (15 days ago)
InstallationMedia: Ubuntu 20.04 LTS "Focal Fossa" - Alpha amd64 (20200228)
NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
Package: grub2-common 2.04-1ubuntu22
PackageArchitecture: amd64
ProcVersionSignature: Ubuntu 5.4.0-14.17-generic 5.4.18
Tags: focal
Uname: Linux 5.4.0-14-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups:

_MarkForUpload: True

Andreas Hasenack (ahasenack) wrote :

photo showing the hang, and where it works

Andreas Hasenack (ahasenack) wrote : Dependencies.txt

apport information

description: updated
tags: added: apport-collected focal
description: updated
Andreas Hasenack (ahasenack) wrote : ProcCpuinfoMinimal.txt

apport information

Andreas Hasenack (ahasenack) wrote : ProcEnviron.txt

apport information

summary: - grub stuck on loading kernel, fails to ls zfs partition
+ grub stuck on loading kernel, fails to ls zfs and swap partitions
Andreas Hasenack (ahasenack) wrote :

If I select advanced, then safe mode, all I see is

Loading Linux 5.4.0-14-generic ...
<cursor>

I left it at that for about 30 minutes and nothing changed. ctrl-alt-del doesn't work either, nor does sysrq boot.

Andreas Hasenack (ahasenack) wrote :

Another important bit of information is that rpool (NOT bpool) has zfs encryption enabled. This was working just fine: I got a prompt for the password during boot (graphical even). But now the kernel doesn't even load.

Andreas Hasenack (ahasenack) wrote :

I just ran dist-upgrade, fetched the latest focal (not proposed) updates, ran update-grub and grub-install: same thing.

tags: added: champagne
Andreas Hasenack (ahasenack) wrote :

I removed all the zsys snapshots (there were about 17), and then it booted fine. grub must have some trouble when there are many zfs snapshots. I tried removing only the last one and rebooted, and it still wouldn't boot, but in that attempt I didn't run update-grub, update the initramfs, or reinstall grub.

Note that update-grub was listing the kernels from all those snapshots.
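
For anyone who wants to see how many snapshots have accumulated before deleting anything, here is a minimal Python sketch that counts snapshots per dataset. It assumes the `zfs` command is on the PATH and that the caller may list snapshots; `zfs list -H -t snapshot -o name` prints one `dataset@snapshot` name per line. The function names and the sample names in the usage note are hypothetical, not taken from the affected machine.

```python
import subprocess
from collections import Counter


def list_snapshots():
    """Return snapshot names as printed by `zfs list` (one per line)."""
    out = subprocess.run(
        ["zfs", "list", "-H", "-t", "snapshot", "-o", "name"],
        check=True, capture_output=True, text=True,
    ).stdout
    return out.splitlines()


def count_by_dataset(snapshot_names):
    """Count snapshots per dataset from names of the form dataset@snapshot."""
    counts = Counter()
    for name in snapshot_names:
        name = name.strip()
        if "@" not in name:
            continue  # skip blank lines or anything that is not a snapshot
        dataset, _snapshot = name.split("@", 1)
        counts[dataset] += 1
    return counts


if __name__ == "__main__":
    for dataset, total in sorted(count_by_dataset(list_snapshots()).items()):
        print(f"{total:4d}  {dataset}")
```

The counting step can be tried without a pool by passing a hand-written list, for example `count_by_dataset(["bpool/BOOT/example@a", "bpool/BOOT/example@b"])`. This only reports counts; it does not show which snapshots are safe to remove.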

Steve Langasek (vorlon) wrote :

There is definitely a bug in grub here if it's hanging, but it seems like zsys should be doing a better job of garbage collecting old snapshots instead of letting this list grow to the point that it breaks grub.

Didier Roche-Tolomelli (didrocks) wrote :

The ZSys part of the issue (garbage collection not being aggressive enough when reaching the bpool or rpool size limit) is handled in bug #1876334.

I’ll just add a reference there and remove the ZSys task instead of marking this as a duplicate, so that the Foundations team can handle the grub side of it.

no longer affects: zsys (Ubuntu)
tags: added: rls-gg-notfixing
removed: champagne