Wifi "device not ready" after booting into OS for the 1st time

Bug #1576024 reported by Kristin Chuang on 2016-04-28
This bug affects 9 people
Affects                     Importance   Assigned to
network-manager (Ubuntu)    High         Unassigned
  Xenial                    Undecided    Unassigned
ubiquity (Ubuntu)           High         Unassigned
  Xenial                    Undecided    Unassigned
wpa (Ubuntu)                High         Unassigned
  Xenial                    Undecided    Unassigned

Bug Description

[Impact]
Users booting from OEM setup may find that their wireless device is "not ready" as per NetworkManager, because wpasupplicant is not running. This is because the steps taken in OEM prepare, before moving to the real system, run systemctl isolate, and wpasupplicant gets stopped along with everything else that is not part of the target.

[Test case]
* Steps to reproduce:
1. Install in OEM mode
2. Boot into OS
3. Check the wifi status in network-manager applet

* Expected result:
Available APs listed in network-manager applet, wifi connection can be established

* Actual result:
AP list replaced by a greyed-out "device not ready" wording
Workaround: reboot the system or run "$ sudo service network-manager restart"; wifi will then start working correctly.

[Regression potential]
The following are examples of possible regression scenarios from this stable update:
- Failure to get the wireless device ready at session start
- Driver loading issues for the wireless devices
- Failure to complete OEM preparation steps, due to the oem user remaining connected while it's being removed by the last steps of the OEM preparation process.

[Background information]
* OS: Xenial
* Network-manager: 1.1.93-0ubuntu4
* Wireless module: Marvell Technology Group Ltd. 88W8897 [AVASTAR] 802.11ac Wireless [11ab:2b38]

Kristin Chuang (hyac109) on 2016-04-28
description: updated
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in network-manager (Ubuntu):
status: New → Confirmed
System76 (salmon76) wrote :

This affects the entire System76 product line, including:

Lemur 6, Lemur 7, Gazelle 10, Gazelle 11, Kudu 2, Kudu 3, Serval 10, Bonobo 10, Bonobo 11, Meerkat 2, Sable 6, Ratel 5, and other desktop hardware with additional wifi cards added.

To reproduce:

- Install Ubuntu in OEM mode
- Prepare for shipping to end user
- Power off system
- Power on system.
- Go through user setup
- Either join or don't join a Wifi during user setup, does not change outcome
- Log into user account
- Try to join a WiFi network

No wifi networks will be shown.

This affects Ubuntu 16.04.1 and 16.10, including the following kernels:

4.8.0-26-generic
4.8.0-22-generic
4.4.0-35-generic
4.4.0-47-generic

Looking at the log messages, there may be a race condition on first boot that does not occur on later boots. On first boot:

[8.824974] iwlwifi 0000:03:00.0 wlp3s0: renamed from wlan0
[9.133313] IPv6: ADDRCONF(NETDEV_UP): wlp3s0: link is not ready
[9.206671] IPv6: ADDRCONF(NETDEV_UP): wlp3s0: link is not ready
[9.324861] IPv6: ADDRCONF(NETDEV_UP): wlp3s0: link is not ready

On reboot:

[3.515363] iwlwifi 0000:03:00.0 wlp3s0: renamed from wlan0
[4.386126] IPv6: ADDRCONF(NETDEV_UP): wlp3s0: link is not ready
[4.610799] IPv6: ADDRCONF(NETDEV_UP): wlp3s0: link is not ready
[4.712271] IPv6: ADDRCONF(NETDEV_UP): wlp3s0: link is not ready
[7.614299] IPv6: ADDRCONF(NETDEV_UP): wlp3s0: link becomes ready

It appears that the system gives up/times out trying to make the device ready when being booted the first time, but the process happens much faster on later reboots.

This is solved by either rebooting, or restarting network-manager, as indicated above.
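The difference between the two logs above can be checked mechanically. The following is only a minimal sketch: the function name link_became_ready is hypothetical, the interface name wlp3s0 is taken from the logs above, and the check does nothing more than look for the "link becomes ready" message in kernel log text supplied on standard input.

```shell
# Hypothetical helper: succeed if the kernel log on stdin reports that
# the link of the given interface became ready.
# The interface name defaults to wlp3s0, as seen in the logs above.
link_became_ready() {
    iface="${1:-wlp3s0}"
    # -F: match the text literally, -q: no output, only an exit status
    grep -q -F "${iface}: link becomes ready"
}
```

It could be used as `dmesg | link_became_ready wlp3s0 || echo "link never became ready"`.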

System76 (salmon76) wrote :

Our systems contain an Intel 8260, 7260, or 3165 Wifi card, and this bug affects all 3 cards.

C K (chri-klocker) wrote :

Experience the same on a new Macbook Air running Kubuntu 16.04

Jeremy Soller (jackpot51) wrote :

This is caused by wpa_supplicant being killed by systemctl isolate in oem-config-firstboot

A potential fix is to add IgnoreOnIsolate=true to [Unit] in wpa_supplicant.service
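One way to try this locally without editing the packaged unit file is a systemd drop-in. This is a sketch under assumptions: the function name and the drop-in file name ignore-on-isolate.conf are arbitrary choices, and the optional directory argument exists only so the function can be tried against a scratch directory first.

```shell
# Sketch: set IgnoreOnIsolate=true for wpa_supplicant.service via a drop-in.
# The file name ignore-on-isolate.conf is an arbitrary, hypothetical choice.
# With no argument, the standard drop-in directory for the unit is used.
write_ignore_on_isolate_dropin() {
    dir="${1:-/etc/systemd/system/wpa_supplicant.service.d}"
    mkdir -p "$dir" || return 1
    printf '[Unit]\nIgnoreOnIsolate=true\n' > "$dir/ignore-on-isolate.conf"
}
```

Writing to the default directory needs root, and systemd only picks up the change after `sudo systemctl daemon-reload`.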

Jeremy Soller (jackpot51) wrote :

By the way, it is possible to trigger the bug running on your own system. All you have to do is run `sudo systemctl isolate default.target` while connected to WiFi

Bluetooth may also go down.
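Before running the isolate command, it may help to check whether the unit is already marked to survive it. A small sketch, assuming the unit is named wpa_supplicant.service; the helper name survives_isolate is hypothetical, and it only parses the output of `systemctl show -p IgnoreOnIsolate`, which prints the property as `IgnoreOnIsolate=yes` or `IgnoreOnIsolate=no`.

```shell
# Hypothetical helper: read "systemctl show -p IgnoreOnIsolate <unit>" output
# on stdin and succeed only if the unit is marked to be left alone on isolate.
survives_isolate() {
    # -x: the whole line must match, -q: no output, only an exit status
    grep -q -x 'IgnoreOnIsolate=yes'
}
```

For example: `systemctl show -p IgnoreOnIsolate wpa_supplicant.service | survives_isolate || echo "wpa_supplicant may be stopped by isolate"`.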

Robie Basak (racb) wrote :

<pitti> rbasak: hmm, jackpot51 isn't here, but "isolate" is pretty much defined as "stop everything which is not part of the given target"

<pitti> so this sounds like a case of YAFIYGI to me -- isolate isn't something which users should ever want to actually use...

Robie Basak (racb) wrote :

I suggest talking to the person who wrote the "systemctl isolate ..." line in the first place. Did that line come from Debian?

Robie Basak (racb) wrote :

<pitti> rbasak: I guess what you said -- find out who added that isolate call there

Let's set this to Triaged, as Jeremy appears to have pinned down the cause so it's solely in the realm of "developers" now. The affected package may need to change as we figure out where it should go.

Jeremy: are you continuing to drive this bug?

Changed in network-manager (Ubuntu):
status: Confirmed → Triaged
importance: Undecided → High
Sebastien Bacher (seb128) wrote :

The call was added by Mathieu in bin/oem-config-firstboot when he added systemd support, subscribing him to the bug

Jeremy Soller (jackpot51) wrote :

Yes, I am. I just need to know what is permissible - we can add IgnoreOnIsolate in an Ubuntu patch, but it seems like the source of the issue is in network-manager - it should detect that wpa_supplicant is not running and restart it dynamically.

Robie Basak (racb) wrote :

Let's wait for Mathieu's opinion.

This seems like a good candidate for 16.04.2 if we can get it fixed in time.

Changed in network-manager (Ubuntu):
milestone: none → ubuntu-16.04.2

It's exactly as Jeremy points out -- NM should be detecting that wpasupplicant is not running and start it -- this should already have been working by way of wpasupplicant being dbus-activated.

The isolate command is there for a good reason. We want a dramatically different environment while oem-config is running -- users should not be logged in, daemons should not be running -- as anything running may block the steps taken in oem-config, such as the removal of the oem user at the end. Once all that is done, we isolate back to default mode -- this makes it so we don't have to do a full reboot just to get a working system after oem-config has been run.

It seems to me like IgnoreOnIsolate for wpasupplicant would be the right thing to do, or to figure out why it isn't being properly started when NM tries to use it.

'systemctl isolate' is documented as being equivalent to changing runlevels in a system without systemd: that's pretty much exactly what we want after changing default.target back to what it really should be.

If this doesn't work, we'll need to come up with a vastly different way of having oem-config work.

Adding an oem-config/ubiquity task in case it turns out it needs to be fixed there.

Changed in ubiquity (Ubuntu):
status: New → Incomplete
importance: Undecided → High
Changed in wpa (Ubuntu):
importance: Undecided → High
status: New → Incomplete
Jeremy Soller (jackpot51) wrote :

Would it be acceptable to have a patch to `wpa` to add `IgnoreOnIsolate` for the time being? If we hold it in the Ubuntu debian/patches then we can wait for a better solution to be done upstream.

Jeremy Soller (jackpot51) wrote :

Here is how I have been working around this bug for System76's machines: https://launchpadlibrarian.net/303137969/wpa_2.4-0ubuntu6_2.4-0ubuntu7~system76~1.diff.gz

Changed in network-manager (Ubuntu):
status: Triaged → Invalid
milestone: ubuntu-16.04.2 → none
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package wpa - 2.4-0ubuntu9

---------------
wpa (2.4-0ubuntu9) zesty; urgency=medium

  * debian/patches/wpa_service_ignore-on-isolate.patch: add IgnoreOnIsolate=yes
    so that when switching "runlevels" in oem-config will not kill off wpa
    and cause wireless to be unavailable on first boot. (LP: #1576024)

 -- Mathieu Trudel-Lapierre <email address hidden> Mon, 13 Mar 2017 13:46:12 -0400

Changed in wpa (Ubuntu):
status: Incomplete → Fix Released
Deepankar Agrawal (grimydevil) wrote :

Hi, is the fix for this issue released with Ubuntu 16.04.2 LTS? I am still experiencing this one. If not, what exactly should be done to fix it?

Jeremy Soller (jackpot51) wrote :

Yes, I believe this still affects 16.04. I applied the patch on machines that System76 ships, but it is not upstream. It is the same patch as was applied to 17.04.

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in network-manager (Ubuntu Xenial):
status: New → Confirmed
Changed in ubiquity (Ubuntu Xenial):
status: New → Confirmed
Changed in wpa (Ubuntu Xenial):
status: New → Confirmed
Michel-Ekimia (michel.ekimia) wrote :

This affects the entire Ekimia product line also ( intel 8285 M.2 )

We will apply the patch for WPA

Michel-Ekimia (michel.ekimia) wrote :

Shall the WPA package be updated in 16.04 ? I guess the impact is so limited that it's possible.

Changed in network-manager (Ubuntu Xenial):
status: Confirmed → Invalid
description: updated

Hello Kristin, or anyone else affected,

Accepted wpa into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/wpa/2.4-0ubuntu6.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-xenial to verification-done-xenial. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-xenial. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in wpa (Ubuntu Xenial):
status: Confirmed → Fix Committed
tags: added: verification-needed verification-needed-xenial
Aki Ketolainen (akik) wrote :

wpasupplicant 2.4-0ubuntu6.1 amd64 fixed it for me.

My case was going from graphical.target to multi-user.target and back to graphical.target.
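The round trip described above could be scripted roughly as follows. This is a sketch, not the exact procedure used: the function name is hypothetical, the unit name wpa_supplicant.service is assumed, and the SYSTEMCTL override is an assumption added here so the logic can be dry-run with a stand-in command.

```shell
# Sketch of the graphical -> multi-user -> graphical round trip.
# SYSTEMCTL can be overridden for dry runs (an assumption of this sketch);
# by default the real systemctl is used, which needs root and will stop
# the graphical session.
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

wpa_survives_roundtrip() {
    "$SYSTEMCTL" isolate multi-user.target || return 1
    "$SYSTEMCTL" isolate graphical.target || return 1
    # is-active --quiet exits 0 only if the unit is active
    "$SYSTEMCTL" is-active --quiet wpa_supplicant.service
}
```

Since isolating to multi-user.target ends the graphical session, this is best run as root from a text console or an SSH session rather than from a terminal inside the desktop.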

Aki Ketolainen (akik) on 2017-09-02
tags: added: verification-done-xenial
removed: verification-needed-xenial
Łukasz Zemczak (sil2100) wrote :

I see there are some nplan regressions for this upload. I looked at them briefly and I see that the nplan testsuite is in terrible shape and tends to just fail a lot, so they're probably unrelated, but could the uploader please take a look and double-confirm? I can then release this fix to -updates.

Timo Jyrinki (timo-jyrinki) wrote :

Note that this should be rebased on 6.2 now that a security release was published without the 6.1 changes.

Nafallo Bjälevik (nafallo) wrote :

Any progress in seeing this fixed in Xenial?
