Wifi configuration

Bug #1870346 reported by Dave Jones on 2020-04-02
This bug affects 3 people
Affects: cloud-init (Ubuntu)

Bug Description

As noted in the cloud-init documentation under https://cloudinit.readthedocs.io/en/latest/topics/network-config-format-v2.html#networking-config-version-2 "Cloud-init does not current support wifis type that is present in native netplan."

This is entirely understandable given the primary focus of cloud-init on commercial clouds (which are likely entirely devoid of wifi on their instances). However, as the Ubuntu Pi images are also cloud derived, they use cloud-init for initial configuration, and on the Pi it's fairly common to want a wifi-only set up.

The current state of affairs is that cloud-init is happy to copy a "wifis" section found in network-config to the relevant netplan config but won't apply it on the first boot. It's not a huge inconvenience to have to reboot to get the wifi configuration applied, and this can be worked around with hacks like running "netplan apply" in a runcmd section. However, there is a larger problem in that several other cloud-init modules (in particular ssh_import_id) rely on a network connection to operate. Hence, if a Pi user happens to have a wifi-only setup, this effectively means they cannot use ssh_import_id sections in cloud-init.
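
The "netplan apply" workaround could be sketched as the user-data fragment below. This is a minimal illustration rather than a tested recipe: it assumes the image ships netplan and that cloud-init has already copied the wifi definition into /etc/netplan. Since runcmd commands run late in the boot, this would not help modules such as ssh_import_id that need the network earlier.

```yaml
#cloud-config
# Workaround sketch: re-apply the netplan configuration late in the first
# boot so that the wifi interface comes up without a reboot.
runcmd:
  - [netplan, apply]
```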

Is there any possibility that wifi support might be considered for a future version?

Ryan Harper (raharper) wrote :

The docs need updating; we do copy the netplan config through to the target, and emit a warning.
Ubuntu Server images have a default policy of not configuring wifi with networkd, but
letting network-manager do that. That said, you can override this policy with a change
to the netplan config you provide.

I think you can just put 'renderer: networkd' under the wifis: key in your config.

  version: 2
  wifis:
    renderer: networkd

Note that we have an open bug, which I'm fixing, to allow the 'renderer' key to
not throw an error.

Paride Legovini (paride) on 2020-04-03
Changed in cloud-init (Ubuntu):
status: New → Triaged
importance: Undecided → Wishlist
Scott Moser (smoser) wrote :

In my opinion, the provider of network configuration to an instance (image) should not have to "know" that the image uses networkd, NetworkManager, ifupdown, wicked... They just declare "make the networking like this".

The operating system inside implements that API. Anything else just cannot work. For example, a cloud provider that implements "bring your own image" can't possibly know what an image uses to configure its networking.

cloud-init should either do the right thing, or error saying "Cannot implement network configuration."

Scott Moser (smoser) wrote :

In case it wasn't clear above... I don't personally believe that specifying "renderer" should be required or even allowed. It's a hack.

Dave Jones (waveform) wrote :

Many thanks for the quick answer. Unfortunately I haven't had much luck with the "renderer: networkd" workaround; I attempted the following configuration in "network-config" with a Pi4 with no ethernet connected:

version: 2
    dhcp4: true
    optional: true
  renderer: networkd
    dhcp4: true
    optional: true
        password: redacted
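
For reference, a complete network-config of this shape might look like the sketch below. The interface names eth0 and wlan0 match those in the journal excerpts elsewhere in this report; the SSID "example-ssid" is a placeholder and is not taken from the original configuration.

```yaml
# Sketch of a Pi network-config with ethernet and wifi (netplan v2 syntax).
version: 2
ethernets:
  eth0:
    dhcp4: true
    optional: true
wifis:
  # Per-type renderer override, as suggested earlier in this thread.
  renderer: networkd
  wlan0:
    dhcp4: true
    optional: true
    access-points:
      # "example-ssid" is a hypothetical network name.
      "example-ssid":
        password: "redacted"
```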

During boot, cloud-init printed the following warning:

Cloud-init v. 20.1-10-g71af48df-0ubuntu3 running 'init-local' at Wed, 18 Mar 2020 19:32:50 +0000. Up 12.25 seconds.
2020-03-18 19:32:50,728 - network_state.py[WARNING]: Wifi configuration is only available to distros withnetplan rendering support.
Cloud-init v. 20.1-10-g71af48df-0ubuntu3 running 'init' at Wed, 18 Mar 2020 19:32:52 +0000. Up 14.89 seconds.

(I can provide full logs if required)

After boot had completed, the wifi interface was down, and a simple "sudo netplan apply" brought it up successfully. This rather suggests that, although the config was copied in, it wasn't applied in spite of the "renderer:" setting.

In general, I'm in agreement with Scott about the design of netplan, though I may well be missing some piece of information about why the current behaviour is as it is (and whether it's practical to change that behaviour).

Ryan Harper (raharper) wrote :

If you can, please run 'cloud-init collect-logs' and attach the tarball it creates. You'll need to unpack and repack the tarball, as the cloud-init.log will contain the unredacted network-config and I see you redacted your wifi password. Sorry for the trouble there.

Ryan Harper (raharper) wrote :

Please move this back to New once you've attached the requested information.

Changed in cloud-init (Ubuntu):
status: Triaged → Incomplete
Dave Jones (waveform) wrote :

No problem; ran this attempt on a Pi3 (just in case you're wondering about any of the timings!) as the Pi4 is currently busy with Core20 duties, but the result is the same so I assume it shouldn't make any difference to the output.

The event at 2020-03-18 19:33:02,502 in cloud-init.log (line 246) looks suspicious to me?

"No network config applied. Neither a new instance nor datasource network update on 'System boot' event"

Changed in cloud-init (Ubuntu):
status: Incomplete → New
Dan Watkins (oddbloke) wrote :

Looking at the journal, I see that systemd-networkd does know about wlan0, and reports Link UP for it when cloud-init would expect system networking to be configured:

  Mar 18 19:32:59.099186 ubuntu systemd-networkd[1121]: wlan0: Link UP

is before:

  Mar 18 19:32:59.107823 ubuntu systemd[1]: Finished Wait for Network to be Configured.

but we also see:

  Mar 18 19:32:59.129316 ubuntu systemd-networkd[1121]: eth0: Link UP

when I don't believe we're expecting any link on eth0 (though maybe I'm mistaken about the configuration of this particular device), so the wlan0 line may be a red herring.

I do see later in the journal:

  Mar 18 19:33:08.385332 ubuntu systemd[1]: Starting WPA supplicant...

which is well after cloud-init would expect networking to be up. Is it possible that systemd-networkd-wait-online.service doesn't know how to handle waiting for wifi correctly? (Or is this just a symptom of not having the renderer: set? I'm a little out of my depth here.)

Dan Watkins (oddbloke) wrote :

Dave, Balint suggests that the latest systemd upload may have fixed this: https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1870410/comments/16

Would you be able to perform another test and let us know if this is still an issue?

Dave Jones (waveform) wrote :

Sorry Dan - no joy with the patched systemd either. I've attached another chunk of output from collect-logs in case it shows anything interesting (I don't see anything hugely different from the last run, but I'm not wholly sure what I should be looking for).

On your earlier query, I'm not sure networkd wait-online does know how to deal with wlan0 ... by default it doesn't appear in the config, but I'm not sure what that implies as far as netplan's configuration is concerned. If it defaults to "optional: false" then wait-online definitely isn't waiting for it (I remember the two minute wait we had when eth0 wasn't marked optional in Disco!). However, if it defaults to "optional: true" for wireless interfaces, then things could be operating as intended. I'll try setting up an image in which the wireless is explicitly marked "optional: false" and see what happens!
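
The explicit test described here would amount to changing one key in the wifi stanza. A minimal fragment, assuming the wireless interface is named wlan0 as in the journal excerpts and using a placeholder SSID:

```yaml
wifis:
  wlan0:
    dhcp4: true
    # State explicitly that the interface is not optional, so that
    # wait-online would be expected to wait for it.
    optional: false
    access-points:
      # "example-ssid" is a hypothetical network name.
      "example-ssid":
        password: "redacted"
```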

Paride Legovini (paride) wrote :

Thanks Dave, let us know what you find out. As netplan.io 0.99-0ubuntu2 introduced some fixes on WiFi handling, it may be worth redoing some testing with this version and see if it behaves differently.

Dan Watkins (oddbloke) wrote :

Moving this to Incomplete until we hear back about whether the netplan fixes affect this. (Please move back to New if it's still an issue!)

Changed in cloud-init (Ubuntu):
status: New → Incomplete
Dave Jones (waveform) wrote :

Just booted the focal release image (with netplan 0.99-0ubuntu2) on a pi3 with a valid wifi config in network-config, and I'm afraid it's the same as ever: the wifi config is present in /etc/netplan but hasn't been activated. Running "sudo netplan apply" after logging in brings everything up successfully.

Re: comment 10, I haven't checked the wait-online service with a non-optional configuration yet, though; that's next on the list. Will update with info in a bit.

Ryan Harper (raharper) wrote :

Looking at the journal, I can see that networkd attempts to bring wlan0 online; at this point I do not believe this is a cloud-init bug.

I did see this netplan issue yesterday, which may be where you are at:


Changed in cloud-init (Ubuntu):
importance: Wishlist → Medium
status: Incomplete → Triaged
Lukas Märdian (slyon) wrote :

Hi! I see a similar issue with the latest (development) netplan, where some OVS services are triggered via systemd service units (same as wpa_supplicant service units in the case of WiFi).

On first boot (with cloud-init user.network-config provided [0]) I can see that the services are not being started at all. While on subsequent boots (or after executing 'netplan apply' manually), the services are started and everything works as expected:

==== FIRST BOOT ====

root@ovs-y:~# systemctl status netplan-ovs-cleanup
● netplan-ovs-cleanup.service - OpenVSwitch configuration for cleanup
     Loaded: loaded (/run/systemd/system/netplan-ovs-cleanup.service; enabled-runtime; vendor preset: enabled)
     Active: inactive (dead)

root@ovs-y:~# systemctl status netplan-ovs-ovs0
● netplan-ovs-ovs0.service - OpenVSwitch configuration for ovs0
     Loaded: loaded (/run/systemd/system/netplan-ovs-ovs0.service; enabled-runtime; vendor preset: enabled)
     Active: inactive (dead)

==== REBOOT ====

root@ovs-y:~# systemctl status netplan-ovs-cleanup
● netplan-ovs-cleanup.service - OpenVSwitch configuration for cleanup
     Loaded: loaded (/run/systemd/system/netplan-ovs-cleanup.service; enabled-runtime; vendor preset: enabled)
     Active: inactive (dead) since Fri 2020-07-31 11:02:57 UTC; 2s ago
    Process: 325 ExecStart=/usr/sbin/netplan apply --only-ovs-cleanup (code=exited, status=0/SUCCESS)
   Main PID: 325 (code=exited, status=0/SUCCESS)

root@ovs-y:~# systemctl status netplan-ovs-ovs0
● netplan-ovs-ovs0.service - OpenVSwitch configuration for ovs0
     Loaded: loaded (/run/systemd/system/netplan-ovs-ovs0.service; enabled-runtime; vendor preset: enabled)
     Active: inactive (dead) since Fri 2020-07-31 11:02:57 UTC; 5min ago
    Process: 326 ExecStart=/usr/bin/ovs-vsctl --may-exist add-br ovs0 (code=exited, status=0/SUCCESS)
    Process: 343 ExecStart=/usr/bin/ovs-vsctl --may-exist add-port ovs0 eth0.21 (code=exited, status=0/SUCCESS)
    Process: 344 ExecStart=/usr/bin/ovs-vsctl set Bridge ovs0 external-ids:netplan=true (code=exited, status=0/SUCCESS)
    Process: 345 ExecStart=/usr/bin/ovs-vsctl set-fail-mode ovs0 standalone (code=exited, status=0/SUCCESS)
    Process: 346 ExecStart=/usr/bin/ovs-vsctl set Bridge ovs0 mcast_snooping_enable=false (code=exited, status=0/SUCCE>
    Process: 347 ExecStart=/usr/bin/ovs-vsctl set Bridge ovs0 rstp_enable=false (code=exited, status=0/SUCCESS)
   Main PID: 347 (code=exited, status=0/SUCCESS)

Jul 31 11:02:57 ovs-y systemd[1]: Starting OpenVSwitch configuration for ovs0...
Jul 31 11:02:57 ovs-y ovs-vsctl[326]: ovs|00001|vsctl|INFO|Called as /usr/bin/ovs-vsctl --may-exist add-br ovs0
Jul 31 11:02:57 ovs-y ovs-vsctl[343]: ovs|00001|vsctl|INFO|Called as /usr/bin/ovs-vsctl --may-exist add-port ovs0 eth0>
Jul 31 11:02:57 ovs-y ovs-vsctl[344]: ovs|00001|vsctl|INFO|Called as /usr/bin/ovs-vsctl set Bridge ovs0 external-ids:n>
Jul 31 11:02:57 ovs-y ovs-vsctl[345]: ovs|00001|vsctl|INFO|Called as /usr/bin/ovs-vsctl set-fail-mode ovs0 standalone
Jul 31 11:02:57 ovs-y ovs-vsctl[346]: ovs|00001|vsctl|INFO|Called as /usr/bin/ovs-vsctl set Bridge ovs0 mcast_snooping>
Jul 31 11:02:57 ovs-y ovs-vsctl[347]: ovs|00001|vsctl|INFO|...


Dimitri John Ledkov (xnox) wrote :

Had a quick chat about it. At the moment cloud-init uses just one transaction, so it's not really possible to add more units to it unless they were already there. Hence we bake networkd as enabled into our cloud-images.

It would be nice if cloud-init subverted boot a little bit, by diverting the default transaction to isolate to the cloud-init-local stage, then doing a daemon-reload, and then isolating to multi-user.target. This would allow networkd to be disabled in the image and enabled by the netplan config, which is slotted in just in time by cloud-init. But alas, we are not there.

Dave Jones (waveform) wrote :

Looks like netplan 0.100 has fixed this

Changed in cloud-init (Ubuntu):
status: Triaged → Fix Released