Unable to boot with F-gkeop kernel

Bug #1909316 reported by Po-Hsu Lin
This bug affects 1 person
Affects                 Status          Importance   Assigned to   Milestone
ubuntu-kernel-tests     Fix Released    Critical     Sean Feole
linux-gkeop (Ubuntu)    Invalid         Undecided    Unassigned

Bug Description

Issue found with 5.4.0-1008.9 Focal GKEOP

The procedure to deploy a F-gkeop system for the test:
1. Deploy a Focal gcp instance
2. Enable proposed, install the linux-gkeop and linux-modules-extra-gkeop package (bug 1909232)
3. Modify the grub to make it boot with the 5.4.0-1008.9 gkeop kernel instead of the default 5.4.0-1029-gcp kernel

When trying to reboot to this kernel with GRUB default being modified as:
GRUB_DEFAULT="gnulinux-advanced-dc7ce673-d33f-4a3b-bce1-fff7064b9ab4>gnulinux-5.4.0-1008-gkeop-advanced-dc7ce673-d33f-4a3b-bce1-fff7064b9ab4"

The instance then appears to hang and can no longer be reached over ssh.

Command "gcloud compute instances list" shows it's still RUNNING

I can't connect to the console to see what's going on there:
  $ gcloud compute connect-to-serial-port f-lgkeop-gkeop-5-4-0-e2std2-aio-dio-bugs --zone=us-west1-a
  channel 0: open failed: connect failed: The serial-port-enable metadata attribute is not set on this VM. It should be set to 'yes' to enable serial port access.
  Connection to ssh-serialport.googleapis.com closed.

This is blocking the Focal GKEOP SRU tests.

Po-Hsu Lin (cypressyew)
description: updated
tags: added: 5.4 focal gkeop kqa-blocker sru-20201130
Sean Feole (sfeole) wrote :

See attached for full boot logs

Sean Feole (sfeole) wrote :

I also attempted to boot the kernel using the kernel index in grub, which failed as well. The kernel does appear to boot as expected in a qemu VM. Grub appears to be forcing the root partition to be specified by PARTUUID.

Panic:
[ 0.969467] tty tty0: hash matches
[ 0.970273] tty console: hash matches
[ 0.971276] rtc_cmos 00:00: setting system clock to 2021-01-05T16:55:01 UTC (1609865701)
[ 1.077316] input: AT Translated Set 2 keyboard as /devices/platform/i8042/serio0/input/input2
[ 1.079288] md: Waiting for all devices to be available before autodetect
[ 1.080346] md: If you don't use raid, use raid=noautodetect
[ 1.081347] md: Autodetecting RAID arrays.
[ 1.081962] md: autorun ...
[ 1.082359] md: ... autorun DONE.
[ 1.082952] VFS: Cannot open root device "PARTUUID=737695f9-0fcb-48ea-acc5-48cda21d60f6" or unknown-block(0,0): error -6
[ 1.084735] Please append a correct "root=" boot option; here are the available partitions:
[ 1.086158] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
[ 1.087428] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.4.0-1008-gkeop #9-Ubuntu
[ 1.088680] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
[ 1.090070] Call Trace:
[ 1.090429] dump_stack+0x6d/0x9a
[ 1.090913] panic+0x101/0x2e3
[ 1.091354] mount_block_root+0x23f/0x2e8
[ 1.091968] mount_root+0x38/0x3a
[ 1.092435] prepare_namespace+0x13f/0x194
[ 1.093003] kernel_init_freeable+0x231/0x255
[ 1.093679] ? rest_init+0xb0/0xb0
[ 1.094294] kernel_init+0xe/0x110
[ 1.094845] ret_from_fork+0x35/0x40
[ 1.095410] Kernel Offset: 0x25600000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[ 1.097034] ACPI MEMORY or I/O RESET_REG.

In /etc/default/grub.d/40-force-partuuid.cfg the following entry exists:

GRUB_FORCE_PARTUUID=737695f9-0fcb-48ea-acc5-48cda21d60f6

Removing this file allows grub to specify the block device path for the kernel root (root=/dev/sda1 rather than root=PARTUUID=...):

[ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-5.4.0-1008-gkeop root=/dev/sda1 ro console=ttyS0

This appears to resolve the boot issue in the cloud. Booting by index now works: configuring default="1>2" in /etc/default/grub boots as expected.
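The workaround described above can be sketched as a pair of small shell helpers. This is a minimal sketch under assumptions, not the commits pushed to the internal tools: the function names `disable_force_partuuid` and `root_param` and the `.disabled` suffix are hypothetical, and the drop-in directory is passed as a parameter so the helper can be tried on a scratch directory first. On a real system, grub.cfg still has to be regenerated (for example with `sudo update-grub`) after the drop-in is moved aside and before rebooting.

```shell
#!/bin/sh
# Hypothetical helpers illustrating the workaround from this report.

# Move the drop-in that forces root=PARTUUID=... out of grub's way.
# $1: grub drop-in directory (on a real system: /etc/default/grub.d)
disable_force_partuuid() {
    dir=$1
    cfg="$dir/40-force-partuuid.cfg"
    if [ -f "$cfg" ]; then
        # Only files ending in .cfg are read from this directory,
        # so renaming the file is enough to disable it.
        mv "$cfg" "$cfg.disabled"
        echo "disabled $cfg"
    else
        echo "nothing to do: $cfg not found"
    fi
}

# Print the value of the root= parameter from a kernel command line.
# $1: command line string (after a reboot: "$(cat /proc/cmdline)")
root_param() {
    for word in $1; do
        case $word in
            root=*) echo "${word#root=}"; return 0 ;;
        esac
    done
    return 1
}
```

After a reboot, `root_param "$(cat /proc/cmdline)"` should report a device path such as /dev/sda1 instead of a PARTUUID= value if the workaround took effect.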

Using the $menu_id strings also works now:
GRUB_DEFAULT="gnulinux-advanced-dc7ce673-d33f-4a3b-bce1-fff7064b9ab4>gnulinux-5.4.0-1008-gkeop-advanced-dc7ce673-d33f-4a3b-bce1-fff7064b9ab4"
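A GRUB_DEFAULT value of this shape can be composed from the kernel release and the UUID embedded in the menu entry IDs. A minimal sketch, assuming the IDs follow the pattern visible in the value above (gnulinux-advanced-<uuid> for the submenu, gnulinux-<kernel>-advanced-<uuid> for the entry); the function name `grub_default_for` is hypothetical, and the actual IDs should be confirmed against the generated grub.cfg on the instance.

```shell
#!/bin/sh
# Hypothetical helper: compose a GRUB_DEFAULT line of the form
# "<submenu id>><entry id>", following the pattern seen in this report.
# $1: kernel release, e.g. 5.4.0-1008-gkeop
# $2: UUID as it appears in the menu entry IDs
grub_default_for() {
    kernel=$1
    fs_uuid=$2
    printf 'GRUB_DEFAULT="gnulinux-advanced-%s>gnulinux-%s-advanced-%s"\n' \
        "$fs_uuid" "$kernel" "$fs_uuid"
}

# Reproduces the value used in this report.
grub_default_for 5.4.0-1008-gkeop dc7ce673-d33f-4a3b-bce1-fff7064b9ab4
```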

Changed in ubuntu-kernel-tests:
assignee: nobody → Sean Feole (sfeole)
importance: Undecided → Critical
status: New → In Progress
Sean Feole (sfeole) wrote :

Pushed 2 commits to internal tools to handle this problem. I will update further if the problem still exists.

Changed in linux-gkeop (Ubuntu):
status: New → Invalid
Changed in ubuntu-kernel-tests:
status: In Progress → Fix Committed
Sean Feole (sfeole)
Changed in ubuntu-kernel-tests:
status: Fix Committed → Fix Released