automatic reboot fails with zero size kernel, no watchdog in grub

Bug #1467553 reported by Federico Gimenez on 2015-06-22
This bug affects 2 people
Affects   Status    Importance   Assigned to   Milestone
snapd     Triaged   Low          Unassigned
Ubuntu    New       Undecided    Unassigned

Bug Description

Steps to reproduce:

Begin with an upgradable image

$ sudo ubuntu-device-flash --revision=-1 core rolling --channel edge -o snappy.img --developer-mode

Launch

$ kvm -m 768 -redir :8022::22 ./snappy.img

Update

$ ssh -p 8022 ubuntu@localhost
$ sudo snappy update

Remove the kernel in the other partition and create a zero size file:

$ sudo mount -o remount,rw /writable/cache/system
$ sudo rm /writable/cache/system/vmlinuz-3.19.0-22-generic
$ sudo touch /writable/cache/system/vmlinuz-3.19.0-22-generic

Reboot
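
Before the final reboot step, it can help to confirm that the kernel image really is a zero-size file and not simply missing, since the two cases may fail differently in the bootloader. The following is a minimal sketch, not part of the original report; the helper name kernel_state is hypothetical, and it relies only on the standard shell tests -e and -s.

```shell
#!/bin/sh
# Hypothetical helper (not from the report): classify a kernel image
# as missing, empty (zero size), or non-empty.
kernel_state() {
    image="$1"
    if [ ! -e "$image" ]; then
        # The path does not exist at all.
        echo "missing"
    elif [ ! -s "$image" ]; then
        # The path exists but has size zero, as after rm + touch.
        echo "empty"
    else
        echo "non-empty"
    fi
}
```

After the rm and touch steps above, running kernel_state /writable/cache/system/vmlinuz-3.19.0-22-generic should print "empty". Note that a non-empty result says nothing about whether the kernel is actually bootable.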

Steve Langasek (vorlon) wrote :

I am unsure why you are reporting this as a bug. If you corrupt the filesystem, it will fail to boot. After it fails to boot, powering the system off and powering it on again should fall back to the other partition. How is this different from what you are expecting to happen?

Federico Gimenez (fgimenez) wrote :

In the steps above, if you remove the initrd file in the new partition instead of the vmlinuz file, then on the reboot that applies the update the system detects the kernel panic and automatically reboots into the good partition by itself.

It would be very useful if the system could reboot automatically in the case of a bad or empty kernel, especially for unattended devices. As it stands, the system gets stuck and you need to power cycle the device to boot into the previous partition again, as you mentioned. This matters even more now that updates are downloaded and applied automatically by default (managed by snappy-autopilot).

Michael Vogt (mvo) wrote :

Adding Paolo to get an expert opinion on what the bootloader can do for us here (if anything).

Changed in snappy:
status: New → Triaged
importance: Undecided → High
Paolo Pisati (p-pisati) wrote :

Sorry, I'm not a grub expert, but what you are describing here is a watchdog. I have no idea whether grub supports one, or whether the qemu platform you are emulating has one.

Michael Vogt (mvo) on 2015-08-26
tags: added: snappy-robustness
Zygmunt Krynicki (zyga) on 2018-05-07
affects: snappy → snapd
summary: - automatic reboot fails with zero size kernel
+ automatic reboot fails with zero size kernel, no watchdog in grub
faisal (alfaesal18) on 2019-06-11
Changed in snapd:
assignee: nobody → faisal (alfaesal18)
Changed in ubuntu:
assignee: nobody → faisal (alfaesal18)
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in ubuntu:
status: New → Confirmed
faisal (alfaesal18) on 2019-06-11
Changed in snapd:
status: Triaged → Confirmed
faisal (alfaesal18) on 2019-06-11
information type: Public → Public Security
faisal (alfaesal18) on 2019-06-30
Changed in snapd:
status: Confirmed → Fix Committed
Changed in ubuntu:
status: Confirmed → Fix Committed
information type: Public Security → Private
faisal (alfaesal18) on 2019-11-26
Changed in snapd:
status: Fix Committed → Fix Released
Haw Loeung (hloeung) on 2020-02-29
Changed in snapd:
assignee: faisal (alfaesal18) → nobody
Changed in ubuntu:
assignee: faisal (alfaesal18) → nobody
Changed in snapd:
status: Fix Released → Triaged
information type: Private → Public
Changed in snapd:
status: Triaged → New
Changed in ubuntu:
status: Fix Committed → New
Changed in snapd:
status: New → Confirmed
status: Confirmed → Triaged
Ian Johnson (anonymouse67) wrote :

It would be nice to have this if only for our spread tests in snapd around failed reboots.

Ian Johnson (anonymouse67) wrote :

I think something like this can be implemented with grub fallback menus, but this is not a high-priority bug.
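
As a rough illustration of the fallback-menu idea, GRUB 2 documents a "fallback" environment variable that names the menu entry to try when the default entry fails to boot. The sketch below is a small generator in the style of the scripts under /etc/grub.d, which print configuration to stdout. The function name, menu entry titles, and kernel and initrd paths are all hypothetical, and whether this covers the zero-size kernel case on these images has not been verified here.

```shell
#!/bin/sh
# Sketch only: print a grub.cfg fragment that boots entry 0 by default
# and falls back to entry 1 if entry 0 fails to boot.
# Entry titles and all paths are hypothetical placeholders.
emit_fallback_cfg() {
    cat <<'EOF'
set default=0
set fallback=1
set timeout=5

menuentry "updated system (hypothetical)" {
    linux /b/vmlinuz ro
    initrd /b/initrd.img
}

menuentry "previous system (hypothetical)" {
    linux /a/vmlinuz ro
    initrd /a/initrd.img
}
EOF
}
```

Calling emit_fallback_cfg prints the fragment. Fallback handling applies when the default entry is booted automatically after the timeout, which matches the unattended-device case. It would not help with a kernel that loads but then hangs, which is where a watchdog would still be needed.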

Changed in snapd:
importance: High → Low