Cannot connect to WiFi with Nvidia GPU using nvidia-331, SSD

Bug #1388130 reported by Jason Gerard DeRose on 2014-10-31
This bug affects 6 people
Affects | Status | Importance | Assigned to | Milestone
System76 | | Critical | Jason Gerard DeRose |
network-manager (Ubuntu) | | High | Mathieu Trudel-Lapierre |

Bug Description

For some strange reason, we cannot connect to WiFi on hardware with a discrete Nvidia GPU (using the nvidia-331 driver) when the system is running off a fast SSD.

Swap the SSD for a platter drive, and things work fine. Likewise, on Intel GPU systems, with either an SSD or a platter drive, WiFi works fine.

The failure message is:

"""
Connection activation failed.
(1) Creation of object for path '/org/freedesktop/NetworkManager/ActiveConnection/2' failed in libnm-glib
"""

See the attached screenshot.

ProblemType: Bug
DistroRelease: Ubuntu 14.10
Package: network-manager 0.9.8.8-0ubuntu28
ProcVersionSignature: Ubuntu 3.16.0-24.32-generic 3.16.4
Uname: Linux 3.16.0-24-generic x86_64
NonfreeKernelModules: nvidia
ApportVersion: 2.14.7-0ubuntu8
Architecture: amd64
CurrentDesktop: Unity
Date: Fri Oct 31 08:58:25 2014
IfupdownConfig:
 # interfaces(5) file used by ifup(8) and ifdown(8)
 auto lo
 iface lo inet loopback
IpRoute:
 default via 10.17.76.1 dev eth0 proto static
 10.17.76.0/24 dev eth0 proto kernel scope link src 10.17.76.193 metric 1
NetworkManager.state:
 [main]
 NetworkingEnabled=true
 WirelessEnabled=true
 WWANEnabled=true
 WimaxEnabled=true
SourcePackage: network-manager
UpgradeStatus: No upgrade log present (probably fresh install)
nmcli-con:
 NAME UUID TYPE TIMESTAMP TIMESTAMP-REAL AUTOCONNECT READONLY DBUS-PATH
 system76_5g d7cafbd5-f1ef-422d-9ed4-4b3a9095b234 802-11-wireless 0 never yes no /org/freedesktop/NetworkManager/Settings/1
 Wired connection 1 c52af28c-07c5-4140-bf2c-3f0d236a05fc 802-3-ethernet 1414767492 Fri 31 Oct 2014 08:58:12 AM MDT yes no /org/freedesktop/NetworkManager/Settings/0
nmcli-dev:
 DEVICE TYPE STATE DBUS-PATH
 wlan0 802-11-wireless disconnected /org/freedesktop/NetworkManager/Devices/1
 eth0 802-3-ethernet connected /org/freedesktop/NetworkManager/Devices/0
nmcli-nm:
 RUNNING VERSION STATE NET-ENABLED WIFI-HARDWARE WIFI WWAN-HARDWARE WWAN
 running 0.9.8.8 connected enabled enabled enabled enabled disabled

Jason Gerard DeRose (jderose) wrote :
Changed in system76:
status: New → Triaged
importance: Undecided → Critical
assignee: nobody → Jason Gerard DeRose (jderose)
Jason Gerard DeRose (jderose) wrote :

Note I confirmed that WiFi works fine when using nouveau on the same hardware.

Also, I tried a minimal nvidia-331 install with --no-install-recommends, just in case the problem is related to any of the optimus stuff, which isn't needed for System76 hardware... and still no dice. Installing nvidia-331 breaks WiFi, for whatever reason.

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in network-manager (Ubuntu):
status: New → Confirmed

This wouldn't be related in any way to the GPU or disks; I'm not sure why it's happening that way -- it might just be a matter of luck in this particular case.

Marking as Confirmed/High; there are other bug reports of similar issues. I'll merge them and try to figure out how to fix the underlying cause, which appears to be the secrets agent detection.

Changed in network-manager (Ubuntu):
importance: Undecided → High
assignee: nobody → Mathieu Trudel-Lapierre (mathieu-tl)
Jason Gerard DeRose (jderose) wrote :

You'd think it wouldn't be related to the nvidia driver... but it definitely is.

At System76, we've frequently encountered scattered problems like this. The nvidia proprietary driver affects the boot sequence enough (for example, no kms) that it frequently exposes subtle problems in the overall structure of the upstart jobs, usually related to timing issues.

My hunch is some upstart job actually should depend on an event that it doesn't, but without the nvidia proprietary driver installed, this job works correctly by chance because this event will usually have already happened.

This might not be a network-manager problem, but that's where the symptom is occurring.

My current hunch is this might be related to nvidia-persistenced being started via udev in Utopic, vs Upstart in Trusty and older.

Jason Gerard DeRose (jderose) wrote :

Oh, and a little more detail on why I'm certain this is related to the nvidia proprietary driver...

Part of our QA process after we image a system (before it's shipped to the customer) is to test WiFi. Since we started shipping Utopic, we've had 0% failure on systems with Intel graphics.

On hardware with an nvidia GPU (for which we always pre-install the proprietary driver), we've had 100% failure when the system has an SSD for the OS drive. When the OS is on an HDD, it often works, but sometimes there are failures there too.

You can work around this by going into "Edit connections" and entering your WiFi password there. But the proper password dialog fails to pop up when you just click on the WiFi name in the indicator.

Jason Gerard DeRose (jderose) wrote :

Okay, think I just found a lead in /var/log/upstart/lightdm.log:

/etc/modprobe.d is not a file
/etc/modprobe.d is not a file
/etc/modprobe.d is not a file
/etc/modprobe.d is not a file
update-alternatives: error: no alternatives for x86_64-linux-gnu_gfxcore_conf
Failed to get D-Bus connection
/etc/modprobe.d is not a file
/etc/modprobe.d is not a file
/etc/modprobe.d is not a file
/etc/modprobe.d is not a file
update-alternatives: error: no alternatives for x86_64-linux-gnu_gfxcore_conf
/etc/modprobe.d is not a file
/etc/modprobe.d is not a file
/etc/modprobe.d is not a file
/etc/modprobe.d is not a file
update-alternatives: error: no alternatives for x86_64-linux-gnu_gfxcore_conf
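
In a log this repetitive, the one-off line is easy to miss. A minimal Python sketch of one way to surface lines that occur rarely (the function name `rare_lines` and the `max_count` threshold are illustrative assumptions, not part of any tool mentioned in this report):

```python
from collections import Counter

def rare_lines(log_text, max_count=1):
    """Return the distinct non-empty lines that occur at most max_count times,
    in the order they first appear in the log."""
    counts = Counter(
        line.strip() for line in log_text.splitlines() if line.strip()
    )
    return [line for line, n in counts.items() if n <= max_count]
```

Fed the contents of a log such as the excerpt above, this would return only the lines that are not part of the repeating pattern.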

Jason Gerard DeRose (jderose) wrote :

BTW, it was the "Failed to get D-Bus connection" bit above that seems problematic.

Also, looking in syslog, there are some interesting tidbits:

Nov 21 13:17:45 system76-pc NetworkManager[977]: <info> (wlan0): device state change: config -> need-auth (reason 'none') [50 60 0]
Nov 21 13:17:45 system76-pc NetworkManager[977]: <info> Activation (wlan0) Stage 2 of 5 (Device Configure) complete.
Nov 21 13:17:45 system76-pc NetworkManager[977]: <warn> No agents were available for this request.
Nov 21 13:17:45 system76-pc NetworkManager[977]: <info> (wlan0): device state change: need-auth -> failed (reason 'no-secrets') [60 120 7]

No agents were available for this request... is the "agent" in a separate process, something you'd connect to through dbus?
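
For anyone trying to confirm the same symptom on their own machine, here is a rough Python sketch that scans syslog lines for the pattern shown above: a "No agents were available" warning followed by a device state change to `failed`. The regular expression is written against the log format in this excerpt only; the names `STATE_CHANGE`, `SAMPLE`, and `find_agent_failures` are hypothetical, and other NetworkManager versions may log differently.

```python
import re

# Matches state transitions as they appear in the excerpt above, e.g.
# "(wlan0): device state change: need-auth -> failed (reason 'no-secrets')"
STATE_CHANGE = re.compile(
    r"\((?P<dev>[^)]+)\): device state change: "
    r"(?P<old>[\w-]+) -> (?P<new>[\w-]+) \(reason '(?P<reason>[^']+)'\)"
)

# The four syslog lines quoted in this comment.
SAMPLE = [
    "Nov 21 13:17:45 system76-pc NetworkManager[977]: <info> (wlan0): device state change: config -> need-auth (reason 'none') [50 60 0]",
    "Nov 21 13:17:45 system76-pc NetworkManager[977]: <info> Activation (wlan0) Stage 2 of 5 (Device Configure) complete.",
    "Nov 21 13:17:45 system76-pc NetworkManager[977]: <warn> No agents were available for this request.",
    "Nov 21 13:17:45 system76-pc NetworkManager[977]: <info> (wlan0): device state change: need-auth -> failed (reason 'no-secrets') [60 120 7]",
]

def find_agent_failures(lines):
    """Return (device, reason) pairs for each transition to 'failed' that
    follows a 'No agents were available' warning from NetworkManager."""
    failures = []
    no_agent_seen = False
    for line in lines:
        if "NetworkManager" not in line:
            continue
        if "No agents were available for this request" in line:
            no_agent_seen = True
            continue
        match = STATE_CHANGE.search(line)
        if match and match.group("new") == "failed" and no_agent_seen:
            failures.append((match.group("dev"), match.group("reason")))
            no_agent_seen = False
    return failures

if __name__ == "__main__":
    print(find_agent_failures(SAMPLE))
```

To check a real system, the same function could be passed the lines of /var/log/syslog instead of `SAMPLE`.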

Jason Gerard DeRose (jderose) wrote :

Another update: I think I've ruled out nvidia-persistenced being started via udev as the possible culprit.

I tried Trusty with the nvidia-343 driver from the System76 PPA (which starts nvidia-persistenced with udev)... and I can connect to WiFi just fine.

I also tried an up-to-date Vivid install with the same nvidia-343 driver... and again, I can connect to WiFi just fine.

So something specific to Utopic is causing the problem (whether or not that problem is actually in NetworkManager itself, I still don't know).

Jason Gerard DeRose (jderose) wrote :

Still no solution, but I've at least (hopefully) eliminated some more variables.

As I know this problem doesn't currently exist on Vivid, I tried back-porting `network-manager` and `network-manager-applet` from Vivid, but no luck... same problem still exists.

And on the off chance that this is kernel-related, I also tried the 3.16 kernel on Trusty... but connecting to WiFi still works fine, so it doesn't seem to have anything to do with the kernel version.

However, I have noticed some differences in the D-Bus processes running on a system with Nvidia hardware vs a system with an Intel GPU, so I'm further investigating that today...

Also, I'm not that familiar with network-manager, so if you have any advice for how I could get you better debugging information, please let me know!

Jason Gerard DeRose (jderose) wrote :

Hmm, after more careful investigation, I think my hunch about differences in the D-Bus related processes was a dead end.

Jason Gerard DeRose (jderose) wrote :

Also tried back-porting `policykit-1` (which needed a back-port of `glib2.0` and `gobject-introspection`)... still no luck.

But at this point, the delta between Utopic and Vivid is still pretty small, so I feel this is a promising avenue, at least as far as shot-gunning goes.

Jason Gerard DeRose (jderose) wrote :

To eliminate more variables, I just tried xubuntu 14.10 (with nvidia-343 from ppa:system76-dev/stable)... and I can connect to password-protected WiFi just fine.

As Xubuntu and Ubuntu are using most of the same lower-level stack, this kinda suggests the problem is fairly high-level, potentially even Unity specific.

Jason Gerard DeRose (jderose) wrote :

More variables eliminated: this bug does *not* occur on:

- Ubuntu GNOME 14.10
- Ubuntu MATE 14.10
- Kubuntu 14.10
- Xubuntu 14.10 (as mentioned above)

Jason Gerard DeRose (jderose) wrote :

Another possible hint: something in this set of packages currently in vivid-proposed breaks the WiFi password dialog:

Calculating upgrade... Done
The following NEW packages will be installed:
  libisl13
The following packages will be upgraded:
  apport apport-gtk btrfs-tools fontconfig fontconfig-config
  gir1.2-timezonemap-1.0 gnome-system-monitor libc-bin libc-dev-bin libc6
  libc6-dbg libc6-dev libc6-i386 libcloog-isl4 libdb5.3 libfontconfig1
  libjasper1 libqt5qml5 libqt5quick5 libtbb2 libtimezonemap-data
  libtimezonemap1 multiarch-support python3-apport python3-problem-report
  qml-module-qtquick-localstorage qml-module-qtquick-window2
  qml-module-qtquick2 qtdeclarative5-localstorage-plugin
  qtdeclarative5-qtquick2-plugin syslinux syslinux-common
32 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
Need to get 21.4 MB of archives.
After this operation, 2,157 kB of additional disk space will be used.

As none of these packages seem directly related, but libc is being updated, I'm wondering if the issue is an ABI mismatch in a codepath only being triggered on Utopic Unity systems (as I also noticed that network-manager seemingly wasn't rebuilt at any time during Utopic).

So my next test is a no-change rebuild of network-manager for Utopic...

Jason Gerard DeRose (jderose) wrote :

Er, I mean an ABI mismatch in policykit-1, not network-manager....
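
One way to narrow down which package in an upgrade set like the one listed above is responsible is to split the set in half and test each half separately. The Python sketch below only parses the "will be upgraded" section of apt-get output and splits the resulting list; it installs nothing. The names `parse_upgraded` and `bisect_halves` are hypothetical, and the parsing assumes the output layout shown in this comment.

```python
def parse_upgraded(apt_output):
    """Collect package names listed under
    'The following packages will be upgraded:' in apt-get output."""
    packages = []
    in_section = False
    for line in apt_output.splitlines():
        if line.startswith("The following packages will be upgraded:"):
            in_section = True
        elif in_section and line.startswith(" "):
            # Indented lines hold whitespace-separated package names.
            packages.extend(line.split())
        elif in_section:
            # The first non-indented line ends the section.
            break
    return packages

def bisect_halves(packages):
    """Split the package list into two halves for a bisection-style test."""
    mid = len(packages) // 2
    return packages[:mid], packages[mid:]
```

Note that the two halves may not be installable independently: packages built from the same source (the libc6 family, for example) usually have to be upgraded together, so a split may need manual adjustment.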

Jason Gerard DeRose (jderose) wrote :

Okay, on vivid, unity-greeter 15.04.2-0ubuntu1 re-introduces this bug.

I have a vivid image that was up-to-date as of Fri 5 Dec. In this snapshot, without applying any updates, I can connect to protected WiFi just fine.

However, today something landing from proposed re-introduced this bug, and I noticed that unity-greeter was among the updates.

So I re-imaged and then upgraded only unity-greeter:

sudo apt-get update
sudo apt-get install unity-greeter

After a reboot, the password dialog won't show when I try to connect to protected WiFi (same symptom as Utopic).

Interestingly enough, unity-greeter was one of the many things I tried back-porting and it didn't fix this on Utopic, although now I can't recall whether I did that back-port in isolation or whether there were other backports I was testing at the same time.

Ah, also note that unity-greeter 15.04.2-0ubuntu1 does *not* break WiFi on Intel GPU systems. This is still only a problem on systems with Nvidia GPUs running the proprietary nvidia driver (might affect the nouveau driver too, not sure either way on that one).

It would really be quite surprising for unity-greeter to have any effect on nm-applet without breaking a whole lot of other things. The things pointed out that looked the most promising were the D-Bus errors and the "no agent available" message in syslog -- that's really where we should be looking.

Please, see that your system is fully up-to-date, and if the bug can be reproduced there, then let's try to remove the nvidia proprietary drivers and see if this still has any effect on wifi.

The agent NetworkManager is looking for is almost always nm-applet (as from the network-manager-applet source package), except in conditions like in Gnome Shell. Perhaps there are logs in .xsession-errors (or now .cache/upstart/gnome-session-Unity.log?) that could explain issues on the nm-applet side if NetworkManager can't connect to it?
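
A quick way to check whether the agent process exists at all in the affected session is to look for it in the process list. The following Python sketch assumes a Linux system where /proc/<pid>/comm is readable; the helper names `list_process_names` and `agent_running` are hypothetical, and a running nm-applet does not by itself prove that it registered as a secrets agent with NetworkManager.

```python
import os

def list_process_names(proc_root="/proc"):
    """Return the command names of running processes, read from
    /proc/<pid>/comm (Linux only). Returns [] if proc_root is missing."""
    names = []
    if not os.path.isdir(proc_root):
        return names
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "comm")) as handle:
                names.append(handle.read().strip())
        except OSError:
            # The process exited or is not readable; skip it.
            continue
    return names

def agent_running(process_names, agent="nm-applet"):
    """True if a process with exactly the given command name is present."""
    return agent in process_names
```

Running `agent_running(list_process_names())` inside the Unity session right after a failed connection attempt would show whether nm-applet was present at that moment.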

hexinn12 (980358871-b) on 2015-04-13
description: updated
Jason Gerard DeRose (jderose) wrote :

FYI, this bug wasn't initially present on the final 15.04 release, but has since reared its ugly head again on 15.04.

Not sure what has changed, but from my previous experience, my bet is still on `unity-greeter`, even though I'm not sure how exactly it's causing it.

Jason Gerard DeRose (jderose) wrote :

As with 14.10, this problem isn't happening with Ubuntu MATE 15.04. Haven't tried the other flavors yet, but I'm guessing that again this is only happening with Ubuntu/Unity.

I just want to confirm a possible interaction between unity-greeter and nvidia. I have an SSD too, though I'm not sure if this is related.

My previous Ubuntu Vivid install with Nvidia-346 was bricked, with boot ending at unity-greeter.
I am not sure what caused it; I attributed it to having played with compiz settings and installed some extra gnome packages.

Now (after reinstalling Ubuntu), I am able to connect to my home wifi AP, but not to my phone AP (with error message as described in Bug: 1438003)
