netplan should allow NICs to be disconnected and not stall the boot

Bug #1619258 reported by Oliver Grawert on 2016-09-01
This bug affects 3 people
Affects            Importance   Assigned to
Snappy             Undecided    Unassigned
nplan (Ubuntu)     Undecided    Unassigned
systemd (Ubuntu)   Undecided    Unassigned

Bug Description

in older snappy images we used to set "allow-hotplug" in /etc/network/interfaces.d/ when configuring a NIC, to keep the boot from stalling completely or being stuck for 5 minutes while trying to find an internet connection.
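For readers unfamiliar with the old setup, here is a minimal sketch of what such an ifupdown stanza looks like, generated from Python. The helper name `render_hotplug_stanza` and the interface name `eth0` are assumptions for illustration; the actual file contents shipped in the old snappy images are not shown in this report.

```python
def render_hotplug_stanza(iface: str, method: str = "dhcp") -> str:
    """Return an ifupdown stanza that marks `iface` as hotplug-managed.

    `allow-hotplug` tells ifupdown to bring the interface up when a
    hotplug event reports it, instead of at boot the way an `auto`
    interface is.
    """
    # Only methods that need no further options are accepted in this sketch.
    if method not in ("dhcp", "manual"):
        raise ValueError(f"unsupported method: {method}")
    return f"allow-hotplug {iface}\niface {iface} inet {method}\n"


# Example: text that could be written to a file under
# /etc/network/interfaces.d/ (the file name is up to the image builder).
stanza = render_hotplug_stanza("eth0")
```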

with recent versions of netplan the configuration seems to once again require the network to be up before moving on with the boot, so booting takes forever until the network connection times out.

netplan should make the system check the physical link status when trying to bring up a network device instead of stalling the boot.

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in nplan (Ubuntu):
status: New → Confirmed
Martin Pitt (pitti) wrote :

Which services in snappy actually stall/wait on network-online.target? On my yakkety desktop system these are just rc-local.service, kerneloops.service, and lxd.service -- none of which is really required to log into the device or start services (even ssh.service doesn't block on network-online.target).

So not implementing network-online.target would be a workaround, but I believe eventually this would not help -- as services which *legitimately* need to wait until the network is up (such as network mounts, backup software, etc.) would then unnecessarily fail.

I believe it is a more robust design to always let network-online.target do its job but avoid depending on it in services which don't really require network.

Can you please copy&paste the output of "systemctl list-jobs" when this happens? This should show the bits that wait on network-online.target.
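If it helps when collecting that output, here is a rough sketch that filters the job list down to the units still in the "waiting" state. It assumes the usual four-column layout (JOB, UNIT, TYPE, STATE) of "systemctl list-jobs"; the function names are made up for this example.

```python
import subprocess


def parse_list_jobs(text):
    """Parse `systemctl list-jobs` output into a list of dicts.

    Header, blank and summary lines are skipped: only rows with four
    columns and a numeric job id are treated as jobs.
    """
    jobs = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4 or not fields[0].isdigit():
            continue
        job_id, unit, job_type, state = fields
        jobs.append(
            {"id": int(job_id), "unit": unit, "type": job_type, "state": state}
        )
    return jobs


def waiting_units(jobs):
    """Return the unit names of jobs that are queued but not yet running."""
    return [job["unit"] for job in jobs if job["state"] == "waiting"]


def current_jobs():
    """Run systemctl on the local machine and parse its job list."""
    result = subprocess.run(
        ["systemctl", "list-jobs", "--no-pager", "--no-legend"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_list_jobs(result.stdout)
```

Calling waiting_units(current_jobs()) while the boot is stalled should list whatever is queued behind the running job. It does not by itself prove that network-online.target is the cause.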

> netplan should make the system check the physical link status when trying to bring up a network device instead of stalling the boot.

Both networkd and NM already do this -- they bring up an interface once it appears and gets a carrier. However, s-n-wait-online waits until all configured links appear (with a timeout of 2 minutes), which is usually what you want with services that depend on network-online.

So I believe completely disabling wait-online is not the right way. We might want to add syntax and functionality to mark devices like "allow-hotplug" in ifupdown, though.

Changed in nplan (Ubuntu):
status: Confirmed → Incomplete
Martin Pitt (pitti) wrote :

Ping? Is this still an issue?

Michael Hudson-Doyle (mwhudson) wrote :

I don't think so.

Martin Pitt (pitti) wrote :

OK, thanks. Please reopen with answers to comment #2 if it turns out to still be an issue after all.

Changed in nplan (Ubuntu):
status: Incomplete → Invalid
Changed in snappy:
status: New → Invalid
Oliver Grawert (ogra) on 2017-09-09
Changed in nplan (Ubuntu):
status: Invalid → Confirmed
Changed in snappy:
status: Invalid → Confirmed
Oliver Grawert (ogra) wrote :

setting this back to confirmed

this is definitely still an issue (and always has been), and very easy to reproduce on a raspberry pi3 with Ubuntu Core that you only configure for wifi...

if you go with the defaults in the network config of subiquity, the eth0 device stays enabled and defaults to dhcp, but nothing in systemd takes into account whether a physical wire is connected at boot, so it blindly waits for the interface to come up until the timeout hits.

adding something like "allow-hotplug" to nplan would solve this. alternatively simply having systemd-networkd check for physical status and skipping the unwired device would help too.
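As a rough illustration of the second option, the kernel exposes the link state of each interface in sysfs, so a check for "is a wire plugged in" could look like the sketch below. This is not how systemd-networkd is implemented; the function names and the idea of filtering the configured list are assumptions made for the example.

```python
from pathlib import Path


def has_carrier(iface, sysfs_root="/sys/class/net"):
    """Return True if the kernel reports a physical link on `iface`."""
    carrier_file = Path(sysfs_root) / iface / "carrier"
    try:
        return carrier_file.read_text().strip() == "1"
    except OSError:
        # The interface does not exist, or the read failed (the kernel
        # can refuse the read while the interface is administratively
        # down). Either way, treat it as "no link".
        return False


def links_worth_waiting_for(configured, sysfs_root="/sys/class/net"):
    """Keep only the configured interfaces that currently have a link."""
    return [iface for iface in configured if has_carrier(iface, sysfs_root)]
```

One caveat with this approach: a link may only report carrier after the interface has been brought up and negotiation has finished, so a single check very early in boot could skip a device that is in fact wired.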

Oliver Grawert (ogra) wrote :

there is also an example on the snapcraft forum now:

https://forum.snapcraft.io/t/ubuntu-core-16-slow-boot-with-netplan/2057

Might this also be why the boot is so slow on the server live-install image?

Mark

Michael Hudson-Doyle (mwhudson) wrote :

Yes, I think it probably is. I want to change the server live-install image
to dhcp on wired interfaces by default which will mostly work around the
problem but a real fix/understanding is clearly also required.

On 12 September 2017 at 04:42, Mark Shuttleworth <email address hidden> wrote:

> Might this also be why the boot is so slow on the server live-install
> image?
>
> Mark

tags: added: rls-aa-incoming
Dimitri John Ledkov (xnox) wrote :

possibly a bug in systemd-networkd-wait-online

Oliver Grawert (ogra) wrote :

Note that this bug is about Ubuntu Core (xenial), I don't think we use systemd-networkd in the rest of xenial ( https://lists.ubuntu.com/archives/ubuntu-devel/2016-January/039066.html )

Yeah, we're kinda talking about two things: the problem with ubuntu core and the problem with the artful daily-live server installer. Both of those use networkd :) I think xnox has a plan for artful, I'm not sure if it will apply to ubuntu core though, maybe he can confirm?
