Recovery mode won't allow recovery after manually installing the OS incorrectly

Bug #1609475 reported by Brian Candler
This bug affects 5 people
Affects: linux (Ubuntu)
Status: Opinion
Importance: Low
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

If one manually installs Ubuntu in UEFI mode but doesn't create a UEFI boot partition (ESP), recovery mode doesn't work.

The expectation (perhaps naively) is that the recovery mode is bootable and allows one to fix the situation.

---
AlsaVersion: Advanced Linux Sound Architecture Driver Version k4.4.0-31-generic.
AplayDevices: Error: [Errno 2] No such file or directory
ApportVersion: 2.20.1-0ubuntu2.1
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/by-path', '/dev/snd/hwC0D2', '/dev/snd/hwC0D0', '/dev/snd/pcmC0D3p', '/dev/snd/pcmC0D1p', '/dev/snd/pcmC0D0c', '/dev/snd/pcmC0D0p', '/dev/snd/controlC0', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
Card0.Amixer.info: Error: [Errno 2] No such file or directory
Card0.Amixer.values: Error: [Errno 2] No such file or directory
DistroRelease: Ubuntu 16.04
HibernationDevice: RESUME=UUID=8c695f64-12a0-4748-a431-7ab97a1e9042
InstallationDate: Installed on 2016-08-04 (33 days ago)
InstallationMedia: Ubuntu-Server 16.04.1 LTS "Xenial Xerus" - Release amd64 (20160719)
IwConfig: Error: [Errno 2] No such file or directory
Lsusb:
 Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
 Bus 001 Device 003: ID 8087:0a2a Intel Corp.
 Bus 001 Device 002: ID 05e3:0610 Genesys Logic, Inc. 4-port hub
 Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
NonfreeKernelModules: zfs zunicode zcommon znvpair zavl
Package: linux (not installed)
ProcEnviron:
 LANGUAGE=en_GB:en
 TERM=xterm-256color
 PATH=(custom, no user)
 LANG=en_GB.UTF-8
 SHELL=/bin/bash
ProcFB: 0 inteldrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.4.0-31-generic.efi.signed root=UUID=a91f753b-69af-4125-a03d-0dcb63d55d38 ro net.ifnames=0
ProcVersionSignature: Ubuntu 4.4.0-31.50-generic 4.4.13
RelatedPackageVersions:
 linux-restricted-modules-4.4.0-31-generic N/A
 linux-backports-modules-4.4.0-31-generic N/A
 linux-firmware 1.157.2
RfKill: Error: [Errno 2] No such file or directory
Tags: xenial
Uname: Linux 4.4.0-31-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups:

_MarkForUpload: True
dmi.bios.date: 05/03/2016
dmi.bios.vendor: Intel Corp.
dmi.bios.version: PYBSWCEL.86A.0054.2016.0503.1546
dmi.board.name: NUC5CPYB
dmi.board.vendor: Intel Corporation
dmi.board.version: H61145-407
dmi.chassis.type: 3
dmi.modalias: dmi:bvnIntelCorp.:bvrPYBSWCEL.86A.0054.2016.0503.1546:bd05/03/2016:svn:pn:pvr:rvnIntelCorporation:rnNUC5CPYB:rvrH61145-407:cvn:ct3:cvr:

Paul White (paulw2u) wrote :

According to https://wiki.ubuntu.com/Bugs/FindRightPackage#During_boot this bug should be filed against the Linux kernel.

affects: ubuntu → linux (Ubuntu)
Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1609475

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Joseph Salisbury (jsalisbury) wrote : Re: recovery mode completely broken by systemd

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v4.7 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.7

tags: added: kernel-da-key xenial
Changed in linux (Ubuntu):
importance: Undecided → High
Brian Candler (b-candler) wrote :

I will happily file a separate bug for NUC5CPYH not booting properly with 16.04.1, and I will test it with a mainline kernel.

However this bug is to report that "recovery mode" is completely broken in this situation, which makes it especially hard to debug the problem.

If I cannot open a console for more than two minutes, because systemd continues launching programs and then takes over the console, this is broken behaviour.

The workaround is to boot from USB and do the repair from there (although this is unfortunately a more long-winded process, prompting you for many of the installation questions).

If there is no intention to fix recovery mode, then I think it should be removed.

Brian Candler (b-candler) wrote :

The specific problem with NUC seems to have been my fault: I didn't realise it was installing in UEFI mode, so did not create an ESP (EFI System Partition).

It was strange that it managed to even start booting the kernel at all; I can only guess that the hardware was not correctly initialized.

Brian Candler (b-candler) wrote :

Separate issue #1609715 raised about installer continuing with UEFI installation even if there is no ESP.

This specific hardware is now working fine.

I would still like recovery mode to be more predictable in the event of startup problems: after all, the whole point of recovery mode is for when there are problems during bootup which need investigation/fixing.

penalvch (penalvch) wrote :

Brian Candler, could you please boot into the OS via a working method and then follow the instructions in https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1609475/comments/2 ?

Brian Candler (b-candler) wrote : AlsaDevices.txt

apport information

tags: added: apport-collected
description: updated

Brian Candler (b-candler) wrote : CRDA.txt

apport information

Brian Candler (b-candler) wrote : Card0.Codecs.codec.0.txt

apport information

Brian Candler (b-candler) wrote : Card0.Codecs.codec.2.txt

apport information

Brian Candler (b-candler) wrote : CurrentDmesg.txt

apport information

Brian Candler (b-candler) wrote : JournalErrors.txt

apport information

Brian Candler (b-candler) wrote : Lspci.txt

apport information

Brian Candler (b-candler) wrote : PciMultimedia.txt

apport information

Brian Candler (b-candler) wrote : ProcCpuinfo.txt

apport information

Brian Candler (b-candler) wrote : ProcInterrupts.txt

apport information

Brian Candler (b-candler) wrote : ProcModules.txt

apport information

Brian Candler (b-candler) wrote : UdevDb.txt

apport information

Brian Candler (b-candler) wrote : WifiSyslog.txt

apport information

penalvch (penalvch) wrote : Re: recovery mode completely broken by systemd

Brian Candler, to see if this is already resolved, could you please test http://cdimage.ubuntu.com/daily-live/current/ and advise on the results?

tags: added: bios-outdated-0055
Brian Candler (b-candler) wrote :

Not sure about tag "bios-outdated-0055". The latest BIOS for this machine is 0055: see
https://downloadcenter.intel.com/product/85254/Intel-NUC-Kit-NUC5CPYH

As for live CD: no, it can't be reproduced that way. The specific sequence is:

* Boot from USB in UEFI mode
* Repartition the disk, but forget to include a UEFI boot partition
* Continue with installation
* Reboot, things go horribly wrong

Problems are:
(1a) The installer lets you do a UEFI-mode install without a UEFI boot partition
(1b) The installer doesn't make it clear whether you are making a UEFI-mode install or a BIOS-mode install

(These have been raised as separate issues)

(2) The broken system boots but then goes mental; and systemd makes it *much* harder to diagnose than without systemd.

penalvch (penalvch) wrote :

Brian Candler:
>"* Boot from USB in UEFI mode
* Repartition the disk, but forget to include a UEFI boot partition"

The user manually repartitioning without setting up a UEFI partition would seem to be a user error rather than a software bug. Could you please advise?

Brian Candler (b-candler) wrote :

Let me try one last time to separate the issues.

** The UEFI issue (a side issue)

The installer works in two completely different ways, depending on whether the system booted via UEFI or BIOS. But it does not show whether it is installing in UEFI or BIOS mode. Hence the user has little way, short of guesswork, to know how to partition the system correctly.

Many systems can boot from a USB stick in either mode. If you don't tell it, you get whatever the system chose. So:

(1) The installer *could* tell you which mode it's running in, but it doesn't. If you don't realise you've booted via UEFI mode and that the system is going to configure UEFI booting, and decide to partition manually, then you don't realise that you need a UEFI boot partition.

(2) The system *could* warn you that you have a missing UEFI boot partition when installing in UEFI mode, but it doesn't.
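
A minimal sketch of the kind of check described in (1) and (2), written for an already-installed system rather than for the installer itself. It assumes that the kernel exposes `/sys/firmware/efi` only after a UEFI boot and that the ESP is mounted at `/boot/efi`, as is conventional on Ubuntu. The function names `has_esp_mount` and `warn_if_esp_missing` are hypothetical, and the path arguments exist only so the check can be pointed at test files.

```shell
#!/bin/sh
# Succeed if the given fstab has a non-comment entry mounted at /boot/efi.
has_esp_mount() {
    fstab="${1:-/etc/fstab}"
    # Field 2 of an fstab line is the mount point.
    awk '$1 !~ /^#/ && $2 == "/boot/efi" { found = 1 } END { exit !found }' "$fstab"
}

# Print a warning and return 1 when the machine booted via UEFI
# but no ESP mount is configured; return 0 otherwise.
warn_if_esp_missing() {
    efi_dir="${1:-/sys/firmware/efi}"
    fstab="${2:-/etc/fstab}"
    if [ -d "$efi_dir" ] && ! has_esp_mount "$fstab"; then
        echo "WARNING: booted in UEFI mode but no /boot/efi entry in $fstab"
        return 1
    fi
    return 0
}
```

Calling `warn_if_esp_missing` with no arguments checks the running system. An installer would more plausibly inspect its partitioning plan than an fstab, so this only illustrates the logic.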

Those points have now been raised separately in issue #1609715.

However the only relevance here is it gives a way to reproduce the main problem.

** Broken recovery mode (the main issue)

The point I tried to raise in this issue is the brokenness of recovery mode when you have a system with some sort of corruption. The UEFI missing-boot-partition problem is just one specific way to reproduce the brokenness in recovery mode. Reproducible cases are good; they allow things to be fixed. There are however many other different ways the system could be broken and recovery mode would not work.

With an older version of Ubuntu, I could simply log in, poke around, look at logs, find the problem and fix it.

With Ubuntu 16.04, I have now experienced a situation where recovery mode is broken. I described what happens at the top of this issue. Basically you can start a recovery shell, and 50% of your keystrokes are thrown away; and then a few minutes later the recovery shell quits and recovery mode locks up. I suspect this is something to do with systemd sitting in the background launching stuff when it thinks dependencies have been met, and terminating stuff when it thinks it would be a good idea to do so.

For recovery mode, I just want a shell. Let me do my job. Please spawn me a shell connected to the console, reliably. That's it. No shells vanishing and reappearing. No timeouts because filesystems haven't yet been mounted or because networking is not up. That's the whole point of recovery mode - to have sufficient access to be able to fix those things.

For now, the best workaround seems to be to boot from an Ubuntu 14.04 USB, and then mount the system disk. But it makes me sad that 16.04 has become less good in this respect than it was before. It seems to be a regression in how easy it is to recover a broken system.

Of course, this only affects systems which require some sort of maintenance - but it's a fact of life that systems *do* get into states which require fixing.

That's it. If you have never had to use recovery mode, and hence don't care about it, then you are lucky.

penalvch (penalvch) wrote :

Brian Candler:

>"Let me try one last time to separate the issues. ** The UEFI issue (a side issue) The installer works in two completely different ways, depending on whether the system booted via UEFI or BIOS. But it does not show whether it is installing in UEFI or BIOS mode. Hence the user has little way, short of guesswork, to know how to partition the system correctly."

The user would already have set up in the BIOS menu to either be in UEFI or BIOS mode prior to installation. This would also be user error.

>"Many systems can boot from a USB stick in either mode. If you don't tell it, you get whatever the system chose. So: (1) The installer *could* tell you which mode it's running in, but it doesn't."

That is true (although one may be able to run a command to query what mode one is in). However, I'm not sure this is something Ubuntu should be modified to do. Is there another operating system that does what you are suggesting?
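
For reference, a minimal sketch of such a query, assuming a Linux kernel that exposes the directory `/sys/firmware/efi` only when it was booted via UEFI. The function name `boot_mode` is illustrative, not an existing tool, and its optional argument exists only so the check can be pointed at another directory for testing.

```shell
#!/bin/sh
# Report the firmware mode the running kernel was booted in.
boot_mode() {
    efi_dir="${1:-/sys/firmware/efi}"
    if [ -d "$efi_dir" ]; then
        echo "UEFI"
    else
        echo "BIOS"
    fi
}

# Print the mode of the running system.
boot_mode
```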

>"If you don't realise you've booted via UEFI mode and that the system is going to configure UEFI booting, and decide to partition manually, then you don't realise that you need a UEFI boot partition. (2) The system *could* warn you that you have a missing UEFI boot partition when installing in UEFI mode, but it doesn't."

Same question for me as above.

>"Those points have now been raised separately in issue #1609715."

Given the scope of this report hasn't been determined, making new reports isn't helpful here (and is considered a duplicate).

>"However the only relevance here is it gives a way to reproduce the main problem. ** Broken recovery mode (the main issue) The point I tried to raise in this issue is the brokenness of recovery mode when you have a system with some sort of corruption. The UEFI missing-boot-partition problem is just one specific way to reproduce the brokenness in recovery mode. Reproducible cases are good; they allow things to be fixed. There are however many other different ways the system could be broken and recovery mode would not work. With an older version of Ubuntu, I could simply log in, poke around, look at logs, find the problem and fix it. With ubuntu 16.04, I have now experienced a situation where recovery mode is broken. I described what happens at the top of this issue. Basically you can start a recovery shell, and 50% of your keystrokes are thrown away; and then a few minutes later the recovery shell quits and recovery mode locks up. I suspect this is something to do with systemd sitting in the background launching stuff when it thinks dependencies have been met, and terminating stuff when it thinks it would be a good idea to do so. For recovery mode, I just want a shell. Let me do my job. Please spawn me a shell connected to the console, reliably. That's it. No shells vanishing and reappearing. No timeouts because filesystems haven't yet been mounted or because networking is not up. That's the whole point of recovery mode - to have sufficient access to be able to fix those things. For now, the best workaround seems to be to boot from an Ubuntu 14.04 USB, and then mount the system disk. But it makes me sad that 16.04 has become less good i...


Changed in linux (Ubuntu):
importance: High → Low
status: Incomplete → Opinion
summary: - recovery mode completely broken by systemd
+ Recovery mode won't allow recovery after manually installing the OS
+ incorrectly
description: updated
Brian Candler (b-candler) wrote :

> The user would already have set up in the BIOS menu to either be in UEFI or BIOS mode prior to installation. This would also be user error.

Really? What's wrong with:

- buy computer
- plug in USB stick
- boot it up

RobertL (robert-loehning) wrote :

I have the same problem that Brian described as "Broken recovery mode (the main issue)". I'm not aware of anything being wrong with my UEFI-boot or my partitions, so I don't think this is directly related.

I'd really appreciate a working recovery-mode, too. If there's any information I could provide to help you fix it, just let me know how I can gather that.

So far I don't understand why a broken recovery mode has a "Low" priority...

Guy Paddock (guy-paddock) wrote :

I just ran into this issue with Bento Box's 64-bit image for 16.04. We were trying to boot into recovery mode in order to run `zerofree` prior to packaging up a custom box for internal use (per steps from https://snippets.khromov.se/shrinking-a-virtualbox-linux-image-with-zerofree/).

I can confirm this is a systemd-specific issue. We booted normally, followed instructions under "Permanent switch back to upstart" in https://wiki.ubuntu.com/SystemdForUpstartUsers, and rebooted, which successfully removed systemd. We then booted into Recovery Mode and had no interruptions or strange boot behavior (though there was an error about not being able to start "User Service" on the boot after installing Upstart).

Prior to removing systemd, the behavior @b-candler described is exactly what we were seeing -- after about 1 to 2 mins, it seems like systemd gets impatient and starts booting the system normally, bringing up networking, etc, AND THEN starts the recovery menu and recovery shell on top of an existing session. After another few minutes, you actually get a third one on top of that. If you can actually brave the storm and get to a prompt, running `init 1` will cause a system hang.

Looks like Ubuntu is not alone here:
- https://www.reddit.com/r/linuxquestions/comments/5dug3e/linux_mint_recovery_mode_crashingtiming_out/
- https://forums.linuxmint.com/viewtopic.php?t=232329

Brian Candler (b-candler) wrote :

BTW, I reproduced the same problem in a different (and arguably more realistic) scenario:

- install Ubuntu 16.04
- configure networking with a bridge interface but a port member that doesn't exist when you next boot up (e.g. make br0 with a member which is a USB ethernet adapter, and then remove the USB adapter)
- reboot - you find it hangs for about 6 minutes
- so you decide to reboot again, and go into recovery mode to fix the networking config

Then you get the same as described before: you get a recovery shell which works for a few minutes, but then systemd continues with the bootup and blats over the recovery shell with a new session, making the system unusable.

Detlev Zundel (laodzu) wrote :

I also see this flaky behaviour of rescue mode. The new bug report includes some more information: https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1662137 Apparently on my machine systemd runs into a timeout waiting for swap or non-root partitions.

Unfortunately, entering rescue mode from the running system is also not possible as reported in this other bug: https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1661851

This renders rescue mode practically useless and should certainly be fixed.
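
The timeout on swap or non-root partitions mentioned above can at least be looked for in `/etc/fstab`. The sketch below assumes the documented systemd behaviour that an fstab entry without the `nofail` option is treated as required at boot, so a missing device makes boot wait for it. The function name `entries_without_nofail` is hypothetical, and this is only a diagnostic aid, not a fix for the recovery-mode behaviour reported here.

```shell
#!/bin/sh
# Print "<device> <mount point>" for every fstab entry other than /
# whose option list does not contain "nofail".
entries_without_nofail() {
    fstab="${1:-/etc/fstab}"
    awk '
        $1 ~ /^#/ || NF < 4 { next }   # skip comments, blank and short lines
        $2 == "/"           { next }   # the root filesystem is not the concern here
        {
            n = split($4, opts, ",")
            has = 0
            for (i = 1; i <= n; i++) if (opts[i] == "nofail") has = 1
            if (!has) print $1 " " $2
        }
    ' "$fstab"
}
```

Running `entries_without_nofail` with no arguments inspects the live `/etc/fstab`; swap entries show up with their usual `none` mount point.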

Bill Pechter (pechter) wrote :

This rescue mode is a permanent PITA for me. I've been trying to do low-level disk recovery only to have something signal systemd to go multi-user. Alternate keystrokes are sent to
1. the recovery root level shell
2. the systemd menu.

I've worked around this when needed by booting an Ubuntu CD or live USB and trying
to work from there -- but this shouldn't be necessary. I like the recovery menu and it was
easy to walk an end user through recovery with it when it works. It's worse than unhelpful in its current state.

I could live with being in a standalone shell without the menu if that would let me fix
the system. We need a way to block systemd from changing run levels without a direct
command to do so... or a complete exit of the recovery shell. It definitely gets some signal
that wakes it up and at times screws up recovery to the point of needing a reload.

Detlev Zundel (laodzu) wrote :

Hi Bill,

as far as I can see, this is not a systemd but an Ubuntu problem. My Debian machines enter rescue mode just fine with systemd. You can circumvent this by talking to systemd directly as described in my blog post http://blog.lazy-evaluation.net/posts/linux/ubuntu-16-04-entering-recovery.html

Hope that helps until the real problem is fixed.
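
The linked post is not reproduced here. As a general sketch of addressing systemd directly, systemd accepts a `systemd.unit=` argument on the kernel command line, and `rescue.target` and `emergency.target` are standard units. The helper below only builds such a command line, for example to paste into the GRUB editor; its name `with_systemd_unit` is hypothetical, and whether this avoids the behaviour reported in this bug is an assumption, not something confirmed in the thread.

```shell
#!/bin/sh
# Return the given kernel command line with any existing systemd.unit=
# argument removed and systemd.unit=<target> appended.
# Splitting is on whitespace only; quoted arguments with spaces are not handled.
with_systemd_unit() {
    cmdline="$1"
    target="${2:-rescue.target}"
    out=""
    set -f                                  # no filename globbing while splitting
    for arg in $cmdline; do
        case "$arg" in
            systemd.unit=*) ;;              # drop an earlier override
            *) out="${out:+$out }$arg" ;;
        esac
    done
    set +f
    echo "${out:+$out }systemd.unit=$target"
}
```

Given the command line from the apport data in this report, `with_systemd_unit "BOOT_IMAGE=/boot/vmlinuz-4.4.0-31-generic.efi.signed root=UUID=a91f753b-69af-4125-a03d-0dcb63d55d38 ro net.ifnames=0"` returns the same line with `systemd.unit=rescue.target` appended.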
