networkd waits 10 seconds for ipv6 network discovery by default

Bug #1765173 reported by Ryan Harper on 2018-04-18
20
This bug affects 3 people
Affects Status Importance Assigned to Milestone
cloud-init (Ubuntu)
Undecided
Unassigned
netplan.io (Ubuntu)
Undecided
Unassigned
systemd (Ubuntu)
Undecided
Unassigned

Bug Description

[Impact]

 * On ipv6 less hosts, boot stalls for 10s
 * This is due to implicit RA being on, and wait-online awaiting for RA to timeout
 * Expectation is, that since explicit request for RA was not done, it should not block network-online.target, whilst all other aspects of network-online.target should be honored.

[Test Case]

 * lxc network set lxdbr0 ipv6.address none
 * lxc launch ubuntu-daily:bionic lp1765173
 * sleep 20 && lxc exec lp1765173 systemd-analyze blame | head

Bad result is:
         12.157s systemd-networkd-wait-online.service
Good result is:
           739ms systemd-networkd-wait-online.service

[Regression Potential]

 * If explicit IPv6 dhcp/ra was enabled in .network file, the boot will be blocked awaiting RA response

 * Only kernel implicit RA configuration will now become "async"

 * This is inline with behaviour of xenial systems

[Other Info]

 * Original bug report

1.
$ lsb_release -rd
Description: Ubuntu Bionic Beaver (development branch)
Release: 18.04

2.
$ apt-cache policy systemd
systemd:
  Installed: 237-3ubuntu8
  Candidate: 237-3ubuntu8
  Version table:
 *** 237-3ubuntu8 500
        500 http://archive.ubuntu.com/ubuntu bionic/main amd64 Packages
        100 /var/lib/dpkg/status

3. systemd-networkd-wait-online doesn't block for 10 seconds waiting on an IPV6 Router Advertisement

4. systemd-networkd sends at least two RA solicitation packets waiting for a response before setting the link to configured. This blocks the boot for every Ubuntu system where it does not have a IPV6 router responding to solicitations. THis includes bionic containers, cloud instances, vms and physical machines.

--

We can see that we're spending 13 seconds waiting for networkd to say the network is up.

% systemd-analyze blame
         13.326s systemd-networkd-wait-online.service
           999ms cloud-init-local.service
           954ms cloud-init.service
           887ms cloud-config.service
           749ms dev-vda1.device
           666ms cloud-final.service
           248ms keyboard-setup.service
           175ms systemd-udev-trigger.service
           171ms lxd-containers.service
           165ms snapd.service
           154ms apparmor.service
           147ms ssh.service
           144ms systemd-timesyncd.service
           133ms grub-common.service
           130ms systemd-journald.service
           130ms accounts-daemon.service
            99ms systemd-modules-load.service
            98ms systemd-resolved.service
            94ms apport.service
            92ms rsyslog.service
            88ms systemd-logind.service
            80ms lvm2-monitor.service
            75ms iscsid.service
            64ms ebtables.service
            62ms user@1000.service
            54ms dev-mqueue.mount
            53ms ufw.service
            52ms systemd-remount-fs.service
            52ms kmod-static-nodes.service
            34ms systemd-journal-flush.service
            34ms polkit.service
            32ms systemd-tmpfiles-setup-dev.service
            27ms systemd-udevd.service
            26ms systemd-networkd.service
            26ms systemd-sysctl.service
            25ms systemd-tmpfiles-setup.service
            21ms dev-hugepages.mount
            17ms sys-kernel-debug.mount
            16ms console-setup.service
            15ms plymouth-read-write.service
            15ms snapd.socket
            15ms plymouth-quit.service
            14ms systemd-update-utmp.service
            10ms systemd-user-sessions.service
             9ms boot-efi.mount
             8ms sys-fs-fuse-connections.mount
             8ms systemd-update-utmp-runlevel.service
             8ms lxd.socket
             7ms blk-availability.service
             5ms systemd-random-seed.service
             3ms setvtrgb.service
             3ms plymouth-quit-wait.service
             3ms sys-kernel-config.mount

Here we can see that we start networking at 18:30:51.69, and then IPv6 NDISC runs for 10 seconds
and then the link is configured at 18:31:05. *AND* we see NDISC stays around and continues to see if it gets a RA packet.

$ journalctl -o short-precise -u systemd-networkd.service | egrep "(Starting Network|Discovering IPv6 routers|NDISC|Configured)"
Apr 18 18:30:51.691532 rharper-b1 systemd[1]: Starting Network Service...
Apr 18 18:30:53.031041 rharper-b1 systemd-networkd[603]: ens3: Discovering IPv6 routers
Apr 18 18:30:53.031306 rharper-b1 systemd-networkd[603]: NDISC: Started IPv6 Router Solicitation client
Apr 18 18:30:53.031612 rharper-b1 systemd-networkd[603]: NDISC: Sent Router Solicitation, next solicitation in 4s
Apr 18 18:30:57.092282 rharper-b1 systemd-networkd[603]: NDISC: Sent Router Solicitation, next solicitation in 8s
Apr 18 18:31:05.040895 rharper-b1 systemd-networkd[603]: NDISC: No RA received before link confirmation timeout
Apr 18 18:31:05.040932 rharper-b1 systemd-networkd[603]: NDISC: Invoking callback for 't'.
Apr 18 18:31:05.041099 rharper-b1 systemd-networkd[603]: ens3: Configured
Apr 18 18:31:05.484261 rharper-b1 systemd-networkd[603]: NDISC: Sent Router Solicitation, next solicitation in 16s
Apr 18 18:31:21.550612 rharper-b1 systemd-networkd[603]: NDISC: Sent Router Solicitation, next solicitation in 33s
Apr 18 18:31:54.706283 rharper-b1 systemd-networkd[603]: NDISC: Sent Router Solicitation, next solicitation in 1min 5s
Apr 18 18:33:00.232685 rharper-b1 systemd-networkd[603]: NDISC: Sent Router Solicitation, next solicitation in 2min 15s
Apr 18 18:35:15.436376 rharper-b1 systemd-networkd[603]: NDISC: Sent Router Solicitation, next solicitation in 4min 32s
Apr 18 18:39:48.111413 rharper-b1 systemd-networkd[603]: NDISC: Sent Router Solicitation, next solicitation in 9min 16s

networkd runs ndisc if your host is configured with ipv6 forwarding disabled and accept_ra enabled.

% sysctl net.ipv6.conf.all.forwarding
net.ipv6.conf.all.forwarding = 0

% sysctl net.ipv6.conf.all.accept_ra
net.ipv6.conf.all.accept_ra = 1

So ndisc path is expected per systemd code. We should be able to disable the NDISC link timeout; especially since NDISC stays around to process RA packets later on anyhow. Nothing is gained by blocking boot waiting for an RA packet. This is also a change in behavior on previous LTS, Xenial where the RA handling is done in the kernel rather than netowkrd.

Full debug output of systemd-networkd during this boot.

% journalctl -o short-precise -u systemd-networkd.service
-- Logs begin at Wed 2018-04-18 18:30:19 UTC, end at Wed 2018-04-18 18:35:15 UTC. --
Apr 18 18:30:34.371090 rharper-b1 systemd[1]: Stopping Network Service...
Apr 18 18:30:34.399603 rharper-b1 systemd[1]: Stopped Network Service.
-- Reboot --
Apr 18 18:30:51.691532 rharper-b1 systemd[1]: Starting Network Service...
Apr 18 18:30:51.714593 rharper-b1 systemd-networkd[603]: Bus n/a: changing state UNSET → OPENING
Apr 18 18:30:51.714736 rharper-b1 systemd-networkd[603]: Added inotify watch for /run on bus n/a: 2
Apr 18 18:30:51.714754 rharper-b1 systemd-networkd[603]: Added inotify watch for /run/dbus on bus n/a: -1
Apr 18 18:30:51.714795 rharper-b1 systemd-networkd[603]: Bus n/a: changing state OPENING → WATCH_BIND
Apr 18 18:30:51.715122 rharper-b1 systemd-networkd[603]: Failed to open configuration file '/etc/systemd/networkd.conf': No such file or directory
Apr 18 18:30:51.715134 rharper-b1 systemd-networkd[603]: timestamp of '/etc/systemd/network' changed
Apr 18 18:30:51.715144 rharper-b1 systemd-networkd[603]: timestamp of '/run/systemd/network' changed
Apr 18 18:30:51.715334 rharper-b1 systemd-networkd[603]: Ignoring /run/systemd/network/10-netplan-ens3.network, because it's not a regular file with suffix .netdev.
Apr 18 18:30:51.715342 rharper-b1 systemd-networkd[603]: Ignoring /run/systemd/network/10-netplan-ens3.link, because it's not a regular file with suffix .netdev.
Apr 18 18:30:51.715375 rharper-b1 systemd-networkd[603]: Ignoring /lib/systemd/network/80-container-vz.network, because it's not a regular file with suffix .netdev.
Apr 18 18:30:51.715382 rharper-b1 systemd-networkd[603]: Ignoring /lib/systemd/network/99-default.link, because it's not a regular file with suffix .netdev.
Apr 18 18:30:51.715389 rharper-b1 systemd-networkd[603]: Ignoring /lib/systemd/network/80-container-host0.network, because it's not a regular file with suffix .netdev.
Apr 18 18:30:51.715397 rharper-b1 systemd-networkd[603]: Ignoring /lib/systemd/network/80-container-ve.network, because it's not a regular file with suffix .netdev.
Apr 18 18:30:51.715573 rharper-b1 systemd-networkd[603]: Ignoring /run/systemd/network/10-netplan-ens3.link, because it's not a regular file with suffix .network.
Apr 18 18:30:51.715603 rharper-b1 systemd-networkd[603]: Ignoring /lib/systemd/network/99-default.link, because it's not a regular file with suffix .network.
Apr 18 18:30:51.717227 rharper-b1 systemd-networkd[603]: ens3: Flags change: +MULTICAST +BROADCAST
Apr 18 18:30:51.717240 rharper-b1 systemd-networkd[603]: ens3: Link 2 added
Apr 18 18:30:51.717552 rharper-b1 systemd-networkd[603]: ens3: udev initialized link
Apr 18 18:30:51.717585 rharper-b1 systemd-networkd[603]: ens3: Saved original MTU: 1500
Apr 18 18:30:51.717597 rharper-b1 systemd-networkd[603]: lo: Flags change: +LOOPBACK +UP +LOWER_UP +RUNNING
Apr 18 18:30:51.717628 rharper-b1 systemd-networkd[603]: lo: Link 1 added
Apr 18 18:30:51.717869 rharper-b1 systemd-networkd[603]: lo: udev initialized link
Apr 18 18:30:51.717900 rharper-b1 systemd-networkd[603]: lo: Saved original MTU: 0
Apr 18 18:30:51.717987 rharper-b1 systemd-networkd[603]: lo: Adding address: ::1/128 (valid forever)
Apr 18 18:30:51.717998 rharper-b1 systemd-networkd[603]: lo: Adding address: 127.0.0.1/8 (valid forever)
Apr 18 18:30:51.718115 rharper-b1 systemd-networkd[603]: rtnl: received address with invalid family 129, ignoring
Apr 18 18:30:51.718293 rharper-b1 systemd-networkd[603]: Enumeration completed
Apr 18 18:30:51.718444 rharper-b1 systemd-networkd[603]: ens3: Link state is up-to-date
Apr 18 18:30:51.718493 rharper-b1 systemd[1]: Started Network Service.
Apr 18 18:30:51.720311 rharper-b1 systemd-networkd[603]: ens3: found matching network '/run/systemd/network/10-netplan-ens3.network'
Apr 18 18:30:51.720850 rharper-b1 systemd-networkd[603]: lo: Link is not managed by us
Apr 18 18:30:51.720915 rharper-b1 systemd-networkd[603]: ens3: Bringing link up
Apr 18 18:30:51.720974 rharper-b1 systemd-networkd[603]: ens3: IPv6 successfully enabled
Apr 18 18:30:51.721574 rharper-b1 systemd-networkd[603]: lo: Link state is up-to-date
Apr 18 18:30:51.721778 rharper-b1 systemd-networkd[603]: Virtualization QEMU found in DMI (/sys/class/dmi/id/sys_vendor)
Apr 18 18:30:51.721792 rharper-b1 systemd-networkd[603]: Virtualization found, CPUID=KVMKVMKVM
Apr 18 18:30:51.721799 rharper-b1 systemd-networkd[603]: Found VM virtualization kvm
Apr 18 18:30:51.721808 rharper-b1 systemd-networkd[603]: lo: Unmanaged
Apr 18 18:30:51.722017 rharper-b1 systemd-networkd[603]: ens3: Flags change: +UP +LOWER_UP +RUNNING
Apr 18 18:30:51.727823 rharper-b1 systemd-networkd[603]: LLDP: Started LLDP client
Apr 18 18:30:51.727831 rharper-b1 systemd-networkd[603]: ens3: Started LLDP.
Apr 18 18:30:51.727838 rharper-b1 systemd-networkd[603]: ens3: Gained carrier
Apr 18 18:30:51.727852 rharper-b1 systemd-networkd[603]: ens3: Acquiring DHCPv4 lease
Apr 18 18:30:51.727965 rharper-b1 systemd-networkd[603]: DHCP CLIENT (0x9bbb16d3): STARTED on ifindex 2
Apr 18 18:30:51.728263 rharper-b1 systemd-networkd[603]: DHCP CLIENT (0x9bbb16d3): DISCOVER
Apr 18 18:30:51.729279 rharper-b1 systemd-networkd[603]: DHCP CLIENT (0x9bbb16d3): OFFER
Apr 18 18:30:51.729361 rharper-b1 systemd-networkd[603]: DHCP CLIENT (0x9bbb16d3): REQUEST (requesting)
Apr 18 18:30:51.730084 rharper-b1 systemd-networkd[603]: DHCP CLIENT (0x9bbb16d3): ACK
Apr 18 18:30:51.730212 rharper-b1 systemd-networkd[603]: DHCP CLIENT (0x9bbb16d3): lease expires in 59min 58s
Apr 18 18:30:51.730221 rharper-b1 systemd-networkd[603]: DHCP CLIENT (0x9bbb16d3): T2 expires in 52min 27s
Apr 18 18:30:51.730229 rharper-b1 systemd-networkd[603]: DHCP CLIENT (0x9bbb16d3): T1 expires in 29min 57s
Apr 18 18:30:51.730314 rharper-b1 systemd-networkd[603]: ens3: DHCPv4 address 192.168.122.140/24 via 192.168.122.1
Apr 18 18:30:51.730375 rharper-b1 systemd-networkd[603]: Setting transient hostname: 'rharper-b1'
Apr 18 18:30:51.730385 rharper-b1 systemd-networkd[603]: Not connected to system bus, not setting hostname.
Apr 18 18:30:51.730676 rharper-b1 systemd-networkd[603]: ens3: Updating address: 192.168.122.140/24 (valid for 1h)
Apr 18 18:30:51.731002 rharper-b1 systemd-networkd[603]: ens3: DHCP error: could not get routes: No data available
Apr 18 18:30:53.030808 rharper-b1 systemd-networkd[603]: ens3: Adding address: fe80::5054:ff:feba:2b98/64 (valid forever)
Apr 18 18:30:53.030877 rharper-b1 systemd-networkd[603]: ens3: Gained IPv6LL
Apr 18 18:30:53.031041 rharper-b1 systemd-networkd[603]: ens3: Discovering IPv6 routers
Apr 18 18:30:53.031306 rharper-b1 systemd-networkd[603]: NDISC: Started IPv6 Router Solicitation client
Apr 18 18:30:53.031612 rharper-b1 systemd-networkd[603]: NDISC: Sent Router Solicitation, next solicitation in 4s
Apr 18 18:30:57.092282 rharper-b1 systemd-networkd[603]: NDISC: Sent Router Solicitation, next solicitation in 8s
Apr 18 18:31:05.040895 rharper-b1 systemd-networkd[603]: NDISC: No RA received before link confirmation timeout
Apr 18 18:31:05.040932 rharper-b1 systemd-networkd[603]: NDISC: Invoking callback for 't'.
Apr 18 18:31:05.041099 rharper-b1 systemd-networkd[603]: ens3: Configured
Apr 18 18:31:05.484261 rharper-b1 systemd-networkd[603]: NDISC: Sent Router Solicitation, next solicitation in 16s
Apr 18 18:31:05.495751 rharper-b1 systemd-networkd[603]: Got inotify event on bus n/a.
Apr 18 18:31:05.495842 rharper-b1 systemd-networkd[603]: Added inotify watch for /run on bus n/a: 2
Apr 18 18:31:05.495870 rharper-b1 systemd-networkd[603]: Added inotify watch for /run/dbus on bus n/a: -1
Apr 18 18:31:05.928832 rharper-b1 systemd-networkd[603]: Got inotify event on bus n/a.
Apr 18 18:31:05.928914 rharper-b1 systemd-networkd[603]: Added inotify watch for /run on bus n/a: 2
Apr 18 18:31:05.928935 rharper-b1 systemd-networkd[603]: Added inotify watch for /run/dbus on bus n/a: -1
Apr 18 18:31:06.014785 rharper-b1 systemd-networkd[603]: Got inotify event on bus n/a.
Apr 18 18:31:06.015036 rharper-b1 systemd-networkd[603]: Added inotify watch for /run on bus n/a: 2
Apr 18 18:31:06.017290 rharper-b1 systemd-networkd[603]: Added inotify watch for /run/dbus on bus n/a: -1
Apr 18 18:31:06.017532 rharper-b1 systemd-networkd[603]: Got inotify event on bus n/a.
Apr 18 18:31:06.017581 rharper-b1 systemd-networkd[603]: Added inotify watch for /run on bus n/a: 2
Apr 18 18:31:06.017598 rharper-b1 systemd-networkd[603]: Added inotify watch for /run/dbus on bus n/a: -1
Apr 18 18:31:06.027908 rharper-b1 systemd-networkd[603]: Got inotify event on bus n/a.
Apr 18 18:31:06.028491 rharper-b1 systemd-networkd[603]: Bus n/a: changing state WATCH_BIND → AUTHENTICATING
Apr 18 18:31:06.051721 rharper-b1 systemd-networkd[603]: Bus n/a: changing state AUTHENTICATING → HELLO
Apr 18 18:31:06.051767 rharper-b1 systemd-networkd[603]: Sent message type=method_call sender=n/a destination=org.freedesktop.DBus path=/org/freedesktop/DBus interface=org.freedesktop.DBus member=Hello cookie=1 reply_cookie=0 signature=n/a error-name=n/a error-message=n/a
Apr 18 18:31:06.051782 rharper-b1 systemd-networkd[603]: Sent message type=method_call sender=n/a destination=org.freedesktop.DBus path=/org/freedesktop/DBus interface=org.freedesktop.DBus member=RequestName cookie=2 reply_cookie=0 signature=su error-name=n/a error-message=n/a
Apr 18 18:31:06.051794 rharper-b1 systemd-networkd[603]: Sent message type=method_call sender=n/a destination=org.freedesktop.DBus path=/org/freedesktop/DBus interface=org.freedesktop.DBus member=AddMatch cookie=3 reply_cookie=0 signature=s error-name=n/a error-message=n/a
Apr 18 18:31:06.051807 rharper-b1 systemd-networkd[603]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/network1/link/_31 interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=4 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a
Apr 18 18:31:06.051819 rharper-b1 systemd-networkd[603]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/network1/link/_32 interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=5 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a
Apr 18 18:31:06.051831 rharper-b1 systemd-networkd[603]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/network1/link/_31 interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=6 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a
Apr 18 18:31:06.051842 rharper-b1 systemd-networkd[603]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/network1/link/_32 interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=7 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a
Apr 18 18:31:06.051854 rharper-b1 systemd-networkd[603]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/network1/link/_32 interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=8 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a
Apr 18 18:31:06.051868 rharper-b1 systemd-networkd[603]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/network1/link/_32 interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=9 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a
Apr 18 18:31:06.051881 rharper-b1 systemd-networkd[603]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/network1 interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=10 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a
Apr 18 18:31:06.051893 rharper-b1 systemd-networkd[603]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/network1/link/_32 interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=11 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a
Apr 18 18:31:06.051905 rharper-b1 systemd-networkd[603]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/network1 interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=12 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a
Apr 18 18:31:06.051916 rharper-b1 systemd-networkd[603]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/network1/link/_32 interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=13 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a
Apr 18 18:31:06.053073 rharper-b1 systemd-networkd[603]: Got message type=method_return sender=org.freedesktop.DBus destination=:1.2 path=n/a interface=n/a member=n/a cookie=1 reply_cookie=1 signature=s error-name=n/a error-message=n/a
Apr 18 18:31:06.053092 rharper-b1 systemd-networkd[603]: Bus n/a: changing state HELLO → RUNNING
Apr 18 18:31:06.053151 rharper-b1 systemd-networkd[603]: Got message type=signal sender=org.freedesktop.DBus.Local destination=n/a path=/org/freedesktop/DBus/Local interface=org.freedesktop.DBus.Local member=Connected cookie=4294967295 reply_cookie=0 signature=n/a error-name=n/a error-message=n/a
Apr 18 18:31:06.053163 rharper-b1 systemd-networkd[603]: Setting transient hostname: 'rharper-b1'
Apr 18 18:31:06.053186 rharper-b1 systemd-networkd[603]: Sent message type=method_call sender=n/a destination=org.freedesktop.hostname1 path=/org/freedesktop/hostname1 interface=org.freedesktop.hostname1 member=SetHostname cookie=14 reply_cookie=0 signature=sb error-name=n/a error-message=n/a
Apr 18 18:31:06.053246 rharper-b1 systemd-networkd[603]: Got message type=signal sender=org.freedesktop.DBus destination=:1.2 path=/org/freedesktop/DBus interface=org.freedesktop.DBus member=NameAcquired cookie=2 reply_cookie=0 signature=s error-name=n/a error-message=n/a
Apr 18 18:31:06.053364 rharper-b1 systemd-networkd[603]: Got message type=signal sender=org.freedesktop.DBus destination=:1.2 path=/org/freedesktop/DBus interface=org.freedesktop.DBus member=NameAcquired cookie=3 reply_cookie=0 signature=s error-name=n/a error-message=n/a
Apr 18 18:31:06.053409 rharper-b1 systemd-networkd[603]: Got message type=method_return sender=org.freedesktop.DBus destination=:1.2 path=n/a interface=n/a member=n/a cookie=4 reply_cookie=2 signature=u error-name=n/a error-message=n/a
Apr 18 18:31:06.053419 rharper-b1 systemd-networkd[603]: Successfully acquired requested service name.
Apr 18 18:31:06.053502 rharper-b1 systemd-networkd[603]: Got message type=method_return sender=org.freedesktop.DBus destination=:1.2 path=n/a interface=n/a member=n/a cookie=5 reply_cookie=3 signature=n/a error-name=n/a error-message=n/a
Apr 18 18:31:06.053511 rharper-b1 systemd-networkd[603]: Match type='signal',sender='org.freedesktop.login1',path='/org/freedesktop/login1',interface='org.freedesktop.login1.Manager',member='PrepareForSleep' successfully installed.
Apr 18 18:31:06.229740 rharper-b1 systemd-networkd[603]: Got message type=method_return sender=:1.6 destination=:1.2 path=n/a interface=n/a member=n/a cookie=3 reply_cookie=14 signature=n/a error-name=n/a error-message=n/a
Apr 18 18:31:21.550612 rharper-b1 systemd-networkd[603]: NDISC: Sent Router Solicitation, next solicitation in 33s
Apr 18 18:31:54.706283 rharper-b1 systemd-networkd[603]: NDISC: Sent Router Solicitation, next solicitation in 1min 5s
Apr 18 18:33:00.232685 rharper-b1 systemd-networkd[603]: NDISC: Sent Router Solicitation, next solicitation in 2min 15s
Apr 18 18:35:15.436376 rharper-b1 systemd-networkd[603]: NDISC: Sent Router Solicitation, next solicitation in 4min 32s

ProblemType: Bug
DistroRelease: Ubuntu 18.04
Package: systemd 237-3ubuntu8
ProcVersionSignature: User Name 4.15.0-15.16-generic 4.15.15
Uname: Linux 4.15.0-15-generic x86_64
ApportVersion: 2.20.9-0ubuntu4
Architecture: amd64
Date: Wed Apr 18 18:25:53 2018
Lsusb: Bus 001 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
MachineType: QEMU Standard PC (i440FX + PIIX, 1996)
ProcEnviron:
 TERM=xterm
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=C.UTF-8
 SHELL=/bin/bash
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.15.0-15-generic root=LABEL=cloudimg-rootfs ro console=tty1 console=ttyS0
SourcePackage: systemd
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 04/01/2014
dmi.bios.vendor: SeaBIOS
dmi.bios.version: 1.10.2-1ubuntu1
dmi.chassis.type: 1
dmi.chassis.vendor: QEMU
dmi.chassis.version: pc-i440fx-bionic
dmi.modalias: dmi:bvnSeaBIOS:bvr1.10.2-1ubuntu1:bd04/01/2014:svnQEMU:pnStandardPC(i440FX+PIIX,1996):pvrpc-i440fx-bionic:cvnQEMU:ct1:cvrpc-i440fx-bionic:
dmi.product.name: Standard PC (i440FX + PIIX, 1996)
dmi.product.version: pc-i440fx-bionic
dmi.sys.vendor: QEMU

Ryan Harper (raharper) wrote :
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in systemd (Ubuntu):
status: New → Confirmed
Steve Langasek (vorlon) wrote :

The semantics of systemd-networkd-wait-online.service are that the service is ready when "the Internet" is reachable. In the past, the implementation of network-online.target has been less correct on Ubuntu, *because* ipv6 RA was handled asynchronously, and this target did not block waiting to find out if ipv6 had been configured. That was mostly ok up until now in an IPv4-dominated world, but over the course of 18.04 LTS's lifetime, I expect we will increasingly see IPv6-enabled deployments where the default IPv6 route is as important as, or more important than, the default IPv4 route, and to not block waiting for SLAAC results in an RFC-compliant manner would make the boot racy. We cannot regard IPv6 as an afterthought. I agree with systemd upstream's decision to block on IPv6 results by default.

However, in most environments where cloud-init is running, we shouldn't need a configurable timeout for SLAAC at all because we know by design whether or not we can expect ipv6 and if we know that the substrate does not provide ipv6, the netplan yaml for this provider should specify 'accept-ra: false'. Then the timeout is the correct 0, and there is no adverse effect on boot speed.

For public clouds that don't provide netplan yaml today and instead have default netplan yaml generated by cloud-init, this should be included as part of the default policy in cloud-init.

For any public clouds that are ipv6 opt-in, it is probably still appropriate to emit accept-ra: false by default in order to ensure a good default experience; and deal with opt-in ipv6 support via user-provided (or, preferably: cloud-provided) override cloud config.

For the lxd case, the default should be 'accept-ra: false', with logic added later in lxd to set 'accept-ra: true' when lxd knows that the bridge supports ipv6.

I think we will want to extend networkd and netplan to allow expression of additional network policies beyond the options of 'accept-ra: true' and 'accept-ra: false'. It should be possible to express that the kernel should try SLAAC, but that networkd should not block on SLAAC results for purposes of systemd-networkd-wait-online, while still blocking on *ipv4* connectivity for this interface. However, this will take some time to design properly, and I do not think that this should be the default for networkd+netplan once implemented: the forward-looking default is still that we should wait for a firm answer from SLAAC before considering the network up, and that if the user wants a different policy from this, it should be an override of the default netplan configuration.

So I believe the quick path for addressing this issue in 18.04 is for cloud-init to declare 'accept-ra: false' everywhere that this is reasonable, with a possible relaxation to a future opportunistic accept-ra policy once this is available in netplan+networkd.

Dimitri John Ledkov (xnox) wrote :

NetworkManager I believe already supports settings if ipv6/ipv4 should be attempted, and whether or not ipv4/ipv6 needs to be considered for a networkmanager-network to be considered up.

networkd wise, I am currently inclined to extend following options passed to RequiredForOnline=no|ipv4|ipv6|yes.
This work should be implementable and SRUable between GA and .1 releases.

Dimitri John Ledkov (xnox) wrote :

The default will remain, implicit, RequiredForOnline=yes. And it will be up to netplan / cloud metdata / cloud-init, to pass via netplan whatever the right syntax there needs to be to tell networkd that RequiredForOnline needs to be relaxed for ipv4, or ipv6, or for the whole interface altogether.

Ryan Harper (raharper) wrote :

I disagree that SLAAC is something that we should by default block reaching network-online.target.

In general we cannot know in advance whether or not another system in the same subnet is providing Router Advertisements to other instances.

If one instance in my VPC is running radvd, ec2 has no knowledge of this meaning that if cloud-init sets accept-ra: false; that disables accepting RAs and that breaks this use-case; this is similar for LXD as well.

Additionally, accept-ra:false doesn't mean networkd doesn't handle RA, it also disables accept-ra in the kernel as well;

I believe that *IF* you are running an instance where you expect to use SLAAC and know that you won't use other means of network configuration then that instance should set some sort of flag to indicate that you want network-online.target to wait for a SLAAC response.

No matter how forward looking we want to be; it doesn't make sense to punish the majority of boots today with a 10 second timeout.

Steve Langasek (vorlon) wrote :

I am wary of any design decision in netplan that fundamentally prioritizes ipv4 over ipv6. I'd like to suggest the following working definition of network-online.target:

 - all "non-optional" interfaces have an address configured for at least one address family or have timed out, and
 - there is at least one default route for at least one address family, and
 - there is at least one DNS server known.

Does this sound like a correct, and complete, definition of what network-online.target should provide, by default, with further knobs in netplan to enforce specific address families?

Steve Langasek (vorlon) wrote :

Addendum: we also need to be sure that the DNS server address is one that we have a route for. It does no good to start trying to talk to the DNS server if the DNS address is IPv4 but we only have routes for IPv6.

Ryan Harper (raharper) wrote :

I don't think not blocking for RAs during boot as a decision that prefers ipv4 over ipv6. networkd is just doing something else by default that the kernel doesn't do today, or historically; *and* there is no toggle for the blocking; only do you want RAs or not *and* if you say you don't want RAs it disables them in the kernel too. That's just not very flexible and breaks at least one known use-case w.r.t radvd use.

W.r.t the the online-target; what you describe sounds complete. I think it's worth mentioning that this state can be reached by combining different interfaces; one may have a default route, a second has DNS, and a third has an address.

Dimitri John Ledkov (xnox) wrote :

in summary the consensus is as follows:

- cloud-init should not provide accept-ra key to netplan.io
- netplan.io should not emit any accept-ra keys to networkd
- networkd, when not receiving any accept-ra keys should use kernel implicit default
- networkd, when in implicit RA mode, should not block on getting RA as a pre-condition for a link to become ready

I believe cloud-init & netplan.io portions are done already.

Changed in cloud-init (Ubuntu):
status: New → Fix Released
Changed in netplan.io (Ubuntu):
status: New → Fix Released
Changed in systemd (Ubuntu):
status: Confirmed → In Progress
description: updated
description: updated
Dimitri John Ledkov (xnox) wrote :

Test PPA is available at $ sudo add-apt-repository ppa:ci-train-ppa-service/3243

Changed in systemd (Ubuntu):
status: In Progress → Fix Committed
Ryan Harper (raharper) wrote :

I've given this a test in LXD on my system which triggers the RA stall; after upgrading it does not wait for an RA before marking the interface configured and also confirmed that it does not regress the case where there are existing RA's on the network; those are still received. Note that we see eth0: Configured before the Discovering IPv6 routers, and Received Router Advertisement.

Apr 20 20:55:21.062761 b1 systemd[1]: Started Network Service.
Apr 20 20:55:21.904103 b1 systemd-networkd[163]: eth0: Configured
Apr 20 20:55:21.904190 b1 systemd-networkd[163]: eth0: Discovering IPv6 routers
Apr 20 20:55:21.904252 b1 systemd-networkd[163]: NDISC: Started IPv6 Router Solicitation client
Apr 20 20:55:21.904632 b1 systemd-networkd[163]: NDISC: Sent Router Solicitation, next solicitation in 3s
Apr 20 20:55:21.906538 b1 systemd-networkd[163]: NDISC: Received Router Advertisement: flags none preference medium lifetime 30 sec

root@b1:~# networkctl status
● State: routable
       Address: 10.8.107.88 on eth0
                2001:db8:100:f101:216:3eff:fe5f:d776 on eth0
                fe80::216:3eff:fe5f:d776 on eth0
       Gateway: 10.8.107.1 on eth0
                fe80::216:3eff:fe31:e211 (Xensource, Inc.) on eth0
           DNS: 10.8.107.1
Search Domains: lxd

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package systemd - 237-3ubuntu10

---------------
systemd (237-3ubuntu10) bionic; urgency=medium

  * Create tmpfiles for persistent journal in postinst only when running
    systemd (LP: #1748659)

 -- Balint Reczey <email address hidden> Fri, 20 Apr 2018 18:55:56 +0200

Changed in systemd (Ubuntu):
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers