'Enable Network' in recovery mode not working properly.

Bug #1766872 reported by Eric Desrochers on 2018-04-25
This bug affects 7 people
Affects                      Importance  Assigned to
friendly-recovery (Ubuntu)   Medium      Dimitri John Ledkov
  Xenial                     Medium      Eric Desrochers
  Bionic                     Medium      Eric Desrochers
  Cosmic                     Medium      Dimitri John Ledkov
  Disco                      Medium      Unassigned

Bug Description

[Impact]

 * The network menu in recovery mode doesn't work correctly: it blocks while starting the systemd services that enabling networking depends on.

[Test Case]

 * Boot w/ Xenial or Bionic in recovery mode via grub
 * Choose "network" in friendly-recovery menu

 The network won't be activated and it'll be stuck at systemd-tty-ask-password :

# pstree
systemd-+-bash---pstree
        |-recovery-menu---network---systemctl---systemd-tty-ask
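
The hang signature in the process tree above can be checked for mechanically. This is a minimal sketch, assuming the process names match the ones shown in the pstree output; the function name `stuck_on_password_agent` is hypothetical and not part of friendly-recovery.

```shell
# Succeeds when the process tree text on stdin shows the recovery menu
# blocked behind systemd-tty-ask-password-agent (shown truncated by pstree
# as "systemd-tty-ask").
stuck_on_password_agent() {
    grep -q 'recovery-menu.*systemd-tty-ask'
}
```

From a debug shell this could be used as `pstree | stuck_on_password_agent && echo "recovery menu is blocked"`.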

[Regression Potential]

* Low.
* All options work fine.
* Cosmic has the same changes already in place.

* According to xnox, the resume option now fails to boot.
After verification, 'resume' behaves the same before and after the change: it boots up but still seems to be stuck in the 'recovery' option according to /proc/cmdline, so I don't see any obvious behaviour change.

[Other Info]

 * Upstream :
https://bazaar.launchpad.net/~ubuntu-core-dev/friendly-recovery/ubuntu/changes/161?start_revid=161
Revision 154 to 161

[Original Description]

This bug was noticed after the introduction of the fix for (LP: #1682637) in Bionic.

I have noticed a hang in Bionic when choosing the 'Enable Network' option in recovery mode on several vanilla Bionic systems, and I can reproduce it every time.

I also asked colleagues to give it a try (for a second pair of eyes on this) and they got the same result as me.

Basically, when choosing 'Enable Network', it blocks or locks up.
If we hit 'ctrl-c', a shell appears and the system has network connectivity.

Here's what I found from vtty9 after enabling "systemd.debug-shell=1":

# pstree
systemd-+-bash---pstree
        |-recovery-menu---network---systemctl---systemd-tty-ask
        |-systemd-journal
        ....

# ps
root 486 473 0 08:29 tty1 00:00:00 /bin/systemd-tty-ask-password-agent

root 473 486 0 08:29 tty1 00:00:00 systemctl start dbus.socket

root 486 283 0 08:29 tty1 00:00:00 /bin/sh /lib/recovery-mode/options/network

Additionally,

systemd-analyze blame:
"Bootup is not yet finished. Please try again later"

"systemctl list-jobs" shows 100 jobs in the 'waiting' state

The only 'running' unit is friendly-recovery.service :
52 friendly-recovery.service start running

The rest are all "waiting". My understanding is that "waiting" units will be executed only after those which are "running" have completed, which explains why "ctrl-c" allows the boot to continue.

All the special systemd units that matter at boot-up are waiting:
7 sysinit.target start waiting
3 basic.target start waiting
.....
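
To quantify this, the job table can be tallied by state. The following is a small sketch, assuming the `systemctl list-jobs` layout shown above (job ID in the first column, state in the last); `count_jobs_by_state` is a hypothetical helper name.

```shell
# Count jobs in a given state from "systemctl list-jobs" output on stdin.
# $1: state to count, e.g. "waiting" or "running".
# Only rows whose first field is a numeric job ID and whose last field
# equals the requested state are counted, so the header and the
# "N jobs listed." footer are ignored.
count_jobs_by_state() {
    awk -v state="$1" '$1 ~ /^[0-9]+$/ && $NF == state { n++ } END { print n + 0 }'
}
```

For example: `systemctl list-jobs | count_jobs_by_state waiting`.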

It seems that systemd is not fully initialised in 'Recovery Mode' and doesn't allow any 'systemctl start' operation without a password/passphrase request, which I suspect is hidden by the recovery-mode menu.


Eric Desrochers (slashd) on 2018-04-25
Changed in friendly-recovery (Ubuntu):
importance: Undecided → High
Eric Desrochers (slashd) on 2018-04-25
tags: added: sts
Eric Desrochers (slashd) on 2018-04-25
description: updated
Eric Desrochers (slashd) wrote :

I made some progress...

I was able to get a few 'enabled|static' services to start, such as:
dbus.socket
networking.service
systemd-resolved.service

NetworkManager doesn't seem to start properly.

By making the following changes :

# /lib/recovery-mode/options/network
--------------------------------------
 if [ -d /run/systemd/system ]; then
-    for i in dbus.socket systemd-resolved.service networking.service systemd-networkd.service NetworkManager.service; do
+    for i in dbus.socket networking.service systemd-networkd.service systemd-resolved.service NetworkManager.service; do
-        systemctl is-enabled -q $i && systemctl start $i
+        systemctl is-enabled -q $i && systemctl --job-mode=ignore-dependencies --no-ask-password start $i
     done
     /lib/systemd/systemd-networkd-wait-online && exit 0
 fi
--------------------------------------

* Changed the order so that systemd-resolved.service starts after systemd-networkd.service, as specified in:

# /lib/systemd/system/systemd-resolved.service
After=systemd-networkd.service network.target

* Added "--no-ask-password" to keep systemd-tty-ask-password-agent from blocking.
* Added "--job-mode=ignore-dependencies", as most of the services are not running, only waiting.
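
For reference, the patched loop can be written as a standalone function. This is only a sketch of the change described above, not the script shipped in the package; the function name, the `NETWORK_UNITS` variable, and the warning message are assumptions added for illustration.

```shell
# Units in the order used by the modified loop: systemd-resolved.service
# comes after systemd-networkd.service.
NETWORK_UNITS="dbus.socket networking.service systemd-networkd.service systemd-resolved.service NetworkManager.service"

start_network_units() {
    for unit in $NETWORK_UNITS; do
        # Skip units that are not enabled on this system.
        systemctl is-enabled -q "$unit" || continue
        # --no-ask-password: never spawn systemd-tty-ask-password-agent.
        # --job-mode=ignore-dependencies: run the start job immediately
        # instead of queueing it behind the waiting boot jobs.
        systemctl --job-mode=ignore-dependencies --no-ask-password start "$unit" \
            || echo "warning: could not start $unit" >&2
    done
}
```

Unlike the original one-liner, a unit that fails to start only produces a warning here, so the remaining units are still attempted.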

Eric Desrochers (slashd) wrote :

After reading more about systemd, I'm starting to think that what blocks the desired services from starting is the "Type=oneshot" setting in friendly-recovery.service.

"
Behavior of oneshot is similar to simple; however, it is expected that the process has to exit before systemd starts follow-up units.
"

I need to explore that route further.

Eric Desrochers (slashd) wrote :

So all the remaining jobs are basically waiting for friendly-recovery to exit before they start, due to the "oneshot" setting, which is why xnox's patch (revno 152) doesn't work.

Some of the waiting jobs are units that the network services in revno 152 depend on.
I totally understand why "oneshot" is set, but in this case it is the blocker.

If we remove "oneshot", then everything will start, making recovery mode useless.
So I think "oneshot" needs to stay.

The only thing I can think of at the moment is to add --job-mode=ignore-dependencies --no-ask-password, and possibly extra settings in friendly-recovery.service, so that the necessary network services start in recovery mode when 'Enable Network' is requested.

Changed in systemd (Ubuntu):
status: New → Won't Fix
Eric Desrochers (slashd) wrote :

or possibly add After=network.target or something along those lines.

tags: added: rls-bb-incoming
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in friendly-recovery (Ubuntu):
status: New → Confirmed
Eric Desrochers (slashd) wrote :

https://www.freedesktop.org/software/systemd/man/systemctl.html

If "ignore-dependencies" is specified, then all unit dependencies are ignored for this new job and the operation is executed immediately. If passed, no required units of the unit passed will be pulled in, and no ordering dependencies will be honored. This is mostly a debugging and rescue tool for the administrator and should not be used by applications

Based on the above description, --job-mode=ignore-dependencies may be appropriate in friendly-recovery, as it serves a debugging/rescue purpose IMHO.

Eric Desrochers (slashd) wrote :

The only remaining problem is the dbus aspect.

dbus.service is necessary for NetworkManager but cannot be started manually[1], which prevents NetworkManager from starting successfully.

[1] dbus.service
# we don't properly stop D-Bus (see ExecStop=), thus disallow restart
RefuseManualStart=yes

As a debugging exercise, I commented out that line (#RefuseManualStart=yes), started dbus.service by hand, and then picked the 'Enable Networking' option, and it works.

We need to find a way to make dbus.service start with friendly-recovery or when 'Enable Network' is chosen.
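
To see which units would be affected by the same restriction, the setting can be read from unit file text. This is a rough sketch with hypothetical helper names; it only matches the literal `yes` spelling used in the dbus.service excerpt above, while systemd itself also accepts other boolean spellings.

```shell
# Print the value of a KEY=value setting from unit-file text on stdin.
# If the key appears more than once, the last assignment is printed.
# Commented-out lines such as "#RefuseManualStart=yes" are ignored.
unit_file_value() {
    sed -n "s/^$1=//p" | tail -n 1
}

# Succeeds when the unit file at path $1 sets RefuseManualStart=yes.
refuses_manual_start() {
    [ "$(unit_file_value RefuseManualStart < "$1")" = "yes" ]
}
```

The first helper also works on a pipe, for example `systemctl cat dbus.service | unit_file_value RefuseManualStart`.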

Eric Desrochers (slashd) wrote :

So the --job-mode=ignore-dependencies approach covers the scenario that uses systemd-networkd, but NetworkManager still fails for the reason above (dbus.service not starting).

tags: added: id-5afda46ded21519fcf26f22b
Bougron (francis-bougron) wrote :

Hello

My ubuntu version is 18.04.1

cat /etc/lsb*
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=18.04
DISTRIB_CODENAME=bionic
DISTRIB_DESCRIPTION="Ubuntu 18.04.1 LTS"

It works very well. But thinking that someday this might not be the case, I decided to check whether recovery mode works.

function "fsck" works
function "dpkg" works
but function "enable network support" hangs.
dmesg | grep vmlinuz
[ 0.000000] Command line: \\boot\vmlinuz-4.15.0-32-generic recovery root=UUID=cb50473a-5d9c-441a-9502-690c1c8684d6 ro initrd=\boot\initrd.img-4.15.0-32-generic
[ 0.000000] Kernel command line: \\boot\vmlinuz-4.15.0-32-generic recovery root=UUID=cb50473a-

It is necessary to press ctrl-c to unblock the situation.

(Checked with two different computers.)

Attachment: contents of the start trace

Bougron (francis-bougron) wrote :

Another computer

Eric Desrochers (slashd) wrote :

We made some progress on this bug last week. We should be able to push the changes into bzr soon and then start testing. Once the testing feedback is positive, we will start SRU'ing the fix.

- Eric

Changed in systemd (Ubuntu Bionic):
status: New → Won't Fix
Changed in friendly-recovery (Ubuntu Bionic):
status: New → Confirmed
Changed in friendly-recovery (Ubuntu Cosmic):
importance: High → Medium
Changed in friendly-recovery (Ubuntu Bionic):
importance: Undecided → Medium
importance: Medium → Low
Changed in friendly-recovery (Ubuntu Cosmic):
importance: Medium → Low
Eric Desrochers (slashd) wrote :

Test packages[1] are available for Xenial/16.04 & Bionic/18.04 on my PPA[2]. I did some preliminary testing and they seem to work as expected.

Anyone affected, please test the packages and keep me posted.

[1] - Test package
Xenial - friendly-recovery - 0.2.31ubuntu1+testpkgb2
Bionic - friendly-recovery - 0.2.38ubuntu1+testpkgb1

[2] PPA
sudo add-apt-repository ppa:slashd/test
sudo apt-get update
sudo apt-get install friendly-recovery

Regards,
Eric

Changed in friendly-recovery (Ubuntu Cosmic):
assignee: nobody → Eric Desrochers (slashd)
Changed in friendly-recovery (Ubuntu Bionic):
assignee: nobody → Eric Desrochers (slashd)
Changed in friendly-recovery (Ubuntu Xenial):
assignee: nobody → Eric Desrochers (slashd)
status: New → Confirmed
importance: Undecided → Low
Changed in systemd (Ubuntu Xenial):
status: New → Won't Fix
Eric Desrochers (slashd) on 2018-10-02
Changed in friendly-recovery (Ubuntu Cosmic):
status: Confirmed → In Progress
Eric Desrochers (slashd) on 2018-10-02
summary: - 'Enable Network' in recovery mode not working in Bionic
+ 'Enable Network' in recovery mode not working properly.
Saroumane (saroumane) wrote :

Hello Eric, I just tested your "friendly-recovery" PPA package on Bionic/18.04, and it works perfectly! Network access is finally restored when I use the friendly-recovery menu. (NetworkManager is no longer prevented from starting by the dbus dependency.)
Well done!

I still don't understand why this bug's importance is "Low". When you are stuck in recovery mode without network and you can't (re)install/upgrade critical packages (like the graphics driver package)... Good luck!

Eric Desrochers (slashd) wrote :

Thanks Saroumane,

I changed the priority to Medium. I'm waiting for more test feedback, and I'll upload the new packages to the upload queue very soon.

- Eric

Changed in friendly-recovery (Ubuntu Cosmic):
importance: Low → Medium
Changed in friendly-recovery (Ubuntu Bionic):
importance: Low → Medium
Changed in friendly-recovery (Ubuntu Xenial):
importance: Low → Medium
Eric Desrochers (slashd) wrote :

debdiff for Xenial

Eric Desrochers (slashd) wrote :

debdiff for bionic

Eric Desrochers (slashd) wrote :

debdiff for cosmic

Eric Desrochers (slashd) on 2018-10-03
Changed in friendly-recovery (Ubuntu Bionic):
status: Confirmed → In Progress
Changed in friendly-recovery (Ubuntu Xenial):
status: Confirmed → In Progress
tags: added: patch
Eric Desrochers (slashd) wrote :

I've been told that the "resume" option doesn't resume the boot.

Can someone validate?

Eric Desrochers (slashd) on 2018-10-15
description: updated
Changed in friendly-recovery (Ubuntu Cosmic):
status: In Progress → Fix Released
Łukasz Zemczak (sil2100) wrote :

This is good. After accepting please double-check if the resume functionality really didn't regress. Thanks!

Changed in friendly-recovery (Ubuntu Bionic):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-bionic

Hello Eric, or anyone else affected,

Accepted friendly-recovery into bionic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/friendly-recovery/0.2.38ubuntu1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-bionic to verification-done-bionic. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-bionic. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Eric Desrochers (slashd) on 2018-10-15
Changed in friendly-recovery (Ubuntu Cosmic):
assignee: Eric Desrochers (slashd) → Dimitri John Ledkov (xnox)
Eric Desrochers (slashd) wrote :

Uploaded to Xenial as well. It is now waiting for SRU approval to start building in xenial-proposed for the testing phase.

tags: added: verification-done-bionic
removed: verification-needed-bionic
Eric Desrochers (slashd) wrote :

[Verification Done Bionic]

I have tested "0.2.38ubuntu1" on Bionic/18.04.1 LTS. Recovery mode now properly enables 'network' and 'dns' (/etc/resolv.conf) as expected with systemd, without hitting unit dependency issues, prompting via tty-ask-password, or anything else that previously blocked the networking process when systemd-networkd or NetworkManager was involved.

The other friendly-recovery options still work and seem to behave as they did prior to this SRU.

- Eric

The verification of the Stable Release Update for friendly-recovery has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package friendly-recovery - 0.2.38ubuntu1

---------------
friendly-recovery (0.2.38ubuntu1) bionic; urgency=medium

  [ Dimitri John Ledkov ]
  * Actually use a friendly-recovery.target which systemd boots to
    using a generator for default.target symlink. This ensures that
    one is not in the middle of a boot transaction during recovery
    and can start/stop/change systemd units without interference.
  * Cleanup lintian warnings. (LP: #1766872)

 -- Eric Desrochers <email address hidden> Mon, 15 Oct 2018 09:02:45 -0400
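
The generator approach named in the changelog can be illustrated roughly as follows. This is a sketch under stated assumptions, not the generator shipped in the package: the function name, the default target path, and the matching on a bare `recovery` kernel argument are all illustrative. systemd passes a generator three output directories (normal, early, late), and units placed in the early directory take precedence over regular configuration.

```shell
# Point default.target at the recovery target when "recovery" is on the
# kernel command line.
# $1: kernel command line text (e.g. the contents of /proc/cmdline)
# $2: the generator's "early" output directory
# $3: target unit to boot into (assumed default path below)
select_recovery_target() {
    cmdline="$1"
    early_dir="$2"
    target="${3:-/lib/systemd/system/friendly-recovery.target}"
    # Unquoted on purpose: split the command line into words.
    for arg in $cmdline; do
        if [ "$arg" = "recovery" ]; then
            ln -sf "$target" "$early_dir/default.target"
            return 0
        fi
    done
    return 1
}
```

A generator script built on this would end with something like `select_recovery_target "$(cat /proc/cmdline)" "$2" || true`, so that a normal boot leaves default.target untouched.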

Changed in friendly-recovery (Ubuntu Bionic):
status: Fix Committed → Fix Released

Hello Eric, or anyone else affected,

Accepted friendly-recovery into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/friendly-recovery/0.2.31ubuntu2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-xenial to verification-done-xenial. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-xenial. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in friendly-recovery (Ubuntu Xenial):
status: In Progress → Fix Committed
tags: added: verification-needed-xenial
David Coronel (davecore) wrote :

I tested friendly-recovery 0.2.31ubuntu2 in Xenial/16.04.3 LTS.

Selecting "network Enable networking" from the Recovery Menu now enables the network and DNS successfully. I can "cat /etc/resolv.conf", whereas before I couldn't. I don't see any issues with dependencies, tty-ask-password prompts, or anything else that was reported previously.

Selecting "resume Resume normal boot" also works well and returns to the normal OS.
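
The DNS part of this verification can be scripted. The following is a minimal sketch, assuming that a configured resolver shows up as at least one `nameserver` line in /etc/resolv.conf; `has_nameserver` is a hypothetical helper name.

```shell
# Succeeds when the given resolv.conf-style file contains at least one
# "nameserver <address>" line. Defaults to /etc/resolv.conf.
# A missing or unreadable file counts as "no nameserver".
has_nameserver() {
    grep -Eq '^[[:space:]]*nameserver[[:space:]]+[^[:space:]]+' "${1:-/etc/resolv.conf}" 2>/dev/null
}
```

After choosing the network option, `has_nameserver && echo "DNS configured"` gives a quick pass/fail check from the recovery shell.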

tags: added: verification-done-xenial
removed: verification-needed-xenial
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package friendly-recovery - 0.2.31ubuntu2

---------------
friendly-recovery (0.2.31ubuntu2) xenial; urgency=medium

  [ Dimitri John Ledkov ]
  * Actually use a friendly-recovery.target which systemd boots to
    using a generator for default.target symlink. This ensures that
    one is not in the middle of a boot transaction during recovery
    and can start/stop/change systemd units without interference.
  * Cleanup lintian warnings. (LP: #1766872)

 -- Eric Desrochers <email address hidden> Tue, 02 Oct 2018 14:43:12 -0400

Changed in friendly-recovery (Ubuntu Xenial):
status: Fix Committed → Fix Released
Cory F. Cohen (cfcohen) wrote :

With friendly-recovery version 0.2.38ubuntu1, selecting the check filesystems option results in an error message like:

/etc/default/rcS: file or directory not found

This file does not exist on my system. I don't know if the problem I'm reporting is related to this issue, but it does seem to be the same as another bug that was marked as a duplicate of this one (1767685).

Eric Desrochers (slashd) wrote :

Cory, the problem is not related, but I agree that it must eventually be fixed in Ubuntu.
I have to check whether it was fixed upstream.

Can you please report a new bug about it?

Changed in friendly-recovery (Ubuntu Disco):
status: Invalid → Fix Released
tags: removed: rls-bb-incoming verification-needed
no longer affects: systemd (Ubuntu)
no longer affects: systemd (Ubuntu Xenial)
no longer affects: systemd (Ubuntu Bionic)
no longer affects: systemd (Ubuntu Cosmic)
no longer affects: systemd (Ubuntu Disco)