MASTER network-manager crashes when wpasupplicant ctrl socket is not available

Bug #141233 reported by Alexander Sack on 2007-09-20
60
Affects Status Importance Assigned to Milestone
NetworkManager
Fix Released
Critical
network-manager (Ubuntu)
High
Alexander Sack
wpasupplicant (Ubuntu)
Medium
Unassigned

Bug Description

Binary package hint: network-manager

network-manager crashes when it cannot connect to wpasupplicant ctrl socket.

A typical log for this issue looks like: from https://bugs.edge.launchpad.net/ubuntu/+source/network-manager/+bug/43379/comments/15 in bug 43379

ep 17 05:34:48 uber NetworkManager: <info> Activation (eth1) New wireless user key for network 'name' received.
Sep 17 05:34:48 uber NetworkManager: <info> Activation (eth1) Stage 1 of 5 (Device Prepare) scheduled...
Sep 17 05:34:48 uber NetworkManager: <info> Activation (eth1) Stage 1 of 5 (Device Prepare) started...
Sep 17 05:34:48 uber NetworkManager: <info> Activation (eth1) Stage 2 of 5 (Device Configure) scheduled...
Sep 17 05:34:48 uber NetworkManager: <info> Activation (eth1) Stage 1 of 5 (Device Prepare) complete.
Sep 17 05:34:48 uber NetworkManager: <info> Activation (eth1) Stage 2 of 5 (Device Configure) starting...
Sep 17 05:34:48 uber NetworkManager: <info> Activation (eth1/wireless): access point 'name' is encrypted, and a key exists. No new key needed.
Sep 17 05:34:49 uber NetworkManager: <info> supplicant_interface_init() - connect to global ctrl socket (0/10).
Sep 17 05:34:49 uber NetworkManager: <info> supplicant_interface_init() - connect to global ctrl socket (1/10).
Sep 17 05:34:49 uber NetworkManager: <info> supplicant_interface_init() - connect to global ctrl socket (2/10).
Sep 17 05:34:49 uber NetworkManager: <info> supplicant_interface_init() - connect to global ctrl socket (3/10).
Sep 17 05:34:49 uber NetworkManager: <info> supplicant_interface_init() - connect to global ctrl socket (4/10).
Sep 17 05:34:49 uber NetworkManager: <info> supplicant_interface_init() - connect to global ctrl socket (5/10).
Sep 17 05:34:50 uber NetworkManager: <info> supplicant_interface_init() - connect to global ctrl socket (6/10).
Sep 17 05:34:50 uber NetworkManager: <info> supplicant_interface_init() - connect to global ctrl socket (7/10).
Sep 17 05:34:50 uber NetworkManager: <info> supplicant_interface_init() - connect to global ctrl socket (8/10).
Sep 17 05:34:50 uber NetworkManager: <info> supplicant_interface_init() - connect to global ctrl socket (9/10).
Sep 17 05:34:50 uber NetworkManager: <info> supplicant_interface_init() - connect to global ctrl socket (10/10).
Sep 17 05:34:50 uber NetworkManager: <info> supplicant_interface_init() - connect to global ctrl socket (11/10).
Sep 17 05:34:50 uber NetworkManager: <info> supplicant_interface_init() - connect to global ctrl socket (12/10).
Sep 17 05:34:50 uber NetworkManager: <info> supplicant_interface_init() - connect to global ctrl socket (13/10).
Sep 17 05:34:50 uber NetworkManager: <info> supplicant_interface_init() - connect to global ctrl socket (14/10).
Sep 17 05:34:50 uber NetworkManager: <info> supplicant_interface_init() - connect to global ctrl socket (15/10).
Sep 17 05:34:51 uber NetworkManager: <info> supplicant_interface_init() - connect to global ctrl socket (16/10).
Sep 17 05:34:51 uber NetworkManager: <info> supplicant_interface_init() - connect to global ctrl socket (17/10).
Sep 17 05:34:51 uber NetworkManager: <info> supplicant_interface_init() - connect to global ctrl socket (18/10).
Sep 17 05:34:51 uber NetworkManager: <info> SUP: sending command 'INTERFACE_ADD eth1^I^Iwext^I/var/run/wpa_supplicant3^I'
Sep 17 05:34:51 uber NetworkManager: <info> SUP: response was 'OK'
Sep 17 05:34:51 uber NetworkManager: <info> Error opening control interface to supplicant.
Sep 17 05:34:51 uber NetworkManager: <WARN> real_act_stage2_config(): Activation (eth1/wireless): couldn't connect to the supplicant.
Sep 17 05:34:51 uber NetworkManager: <info> Activation (eth1) failure scheduled...
Sep 17 05:34:51 uber NetworkManager: <info> Activation (eth1) Stage 2 of 5 (Device Configure) complete.
Sep 17 05:34:51 uber NetworkManager: <info> Activation (eth1) failed for access point (name)
Sep 17 05:34:51 uber NetworkManager: <info> Activation (eth1) failed.
Sep 17 05:34:51 uber NetworkManager: <info> Deactivating device eth1.
Sep 17 05:34:51 uber NetworkManager: <WARN> nm_signal_handler(): Caught signal 11. Generating backtrace...
Sep 17 05:34:51 uber NetworkManager: ******************* START **********************************
Sep 17 05:34:53 uber NetworkManager: (no debugging symbols found)
Sep 17 05:34:53 uber NetworkManager: Using host libthread_db library "/lib/tls/i686/cmov/libthread_db.so.1".
Sep 17 05:34:53 uber NetworkManager: (no debugging symbols found)
Sep 17 05:34:54 uber last message repeated 13 times
Sep 17 05:34:54 uber NetworkManager: [Thread debugging using libthread_db enabled]
Sep 17 05:34:54 uber NetworkManager: [New Thread -1212589632 (LWP 4650)]
Sep 17 05:34:54 uber NetworkManager: [New Thread -1229378672 (LWP 5612)]
Sep 17 05:34:54 uber NetworkManager: [New Thread -1220985968 (LWP 4765)]
Sep 17 05:34:54 uber NetworkManager: [New Thread -1212593264 (LWP 4759)]
Sep 17 05:34:54 uber NetworkManager: (no debugging symbols found)
Sep 17 05:34:54 uber last message repeated 3 times
Sep 17 05:34:54 uber NetworkManager: 0xffffe410 in __kernel_vsyscall ()
Sep 17 05:34:54 uber NetworkManager: ******************* END **********************************

Alexander Sack (asac) on 2007-09-20
Changed in network-manager:
assignee: nobody → asac
importance: Undecided → High
status: New → Confirmed
unggnu (unggnu) wrote :

Thanks for help but isn't this a duplicate of https://bugs.launchpad.net/ubuntu/+source/network-manager/+bug/139966 ?

Alexander Sack (asac) wrote :

nggnu, please don't merge any master bugs, especially if I tell you 10 minutes ago that this issue is tracked in _this_ masterbug.

Alexander Sack (asac) wrote :

unggnu ... my pardon ... i failed to see it wasn't you who marked this as duplicate.

Thanks for your contributions,

 - Alexander

TJ (tj) wrote :

Apologies Alexander, it was me that duped this one. As the older bug #139966 seemed to be getting all the attention and this one appeared to be duplicating the NetworkManager log output I originally filed it seemed sensible to link.

TJ (tj) wrote :

Alexander, it appears to be a side-effect of patch

41t_nm_device_wireless_index_ctrl_sockets_by_run_count.patch

Building without that has (so-far) given me good starts from cold every time.

I'll dig deeper into the source to see if I can pin-point the reason, but from a cursory glance it looks reasonably easy to solve.

Alexander Sack (asac) wrote :

for network-manager i have a fix committed. After some cleanup I will upload it tomorrow.

Changed in network-manager:
status: Confirmed → Fix Committed
Alexander Sack (asac) wrote :

For wpasupplicant I keep this bug open, because i am not sure if the behaviour we see here in the log is already tracked somewhere else. As you can see the global ctrl socket takes two seconds to come up ... though it finally succeeds, this is a pretty long time for just opening a unix socket.

Changed in wpasupplicant:
importance: Undecided → Medium
status: New → Confirmed
Alexander Sack (asac) wrote :

network-manager (0.6.5-0ubuntu13) gutsy; urgency=low

  * debian/patches/20_do_not_take_over_dhcpv4iface_when_v6_is_configured.patch:
    fix race/null-deref crash that happens if /e/n/i is not completely flushed out
    by editor/network-admin (LP: #141106).
  * debian/patches/41v_lp141233-fix-supplicant-cleanup-crashes.patch: Add patch
    for LP: #141233 - fix crash by not sending ctrl commands to supplicant if no ctrl
    socket is open.
  * debian/patches/series: add new patch to quilt series
  * debian/patches/41t_nm_device_wireless_index_ctrl_sockets_by_run_count.patch:
    use two distinct variables to track and check connect attempt count to
    supplicant and global supplicant socket. Fix bug that prevented network
    manager from attempting to connect to supplicant ctrl for cases where
    global control already took a long time to succeed. (LP: #141233).
  * debian/patches/41t_nm_device_wireless_index_ctrl_sockets_by_run_count.patch:
    close global ctrl socket connection when nm_utils_supplicant_request_with_check
    failed.
  * debian/patches/41u_custom_timeout_for_some_wpa_ctrl_operations.patch: update patch
    because of overlapping change with 41t_* patch.

 -- Alexander Sack <email address hidden> Fri, 21 Sep 2007 11:26:01 +0200

Changed in network-manager:
status: Fix Committed → Fix Released
Gavin McCullagh (gmccullagh) wrote :

I've just the ubuntu13 version out having seen this issue up to now.

network-manager no longer seems to crash -- which is good -- though I still can't seem to switch reliably onto my WPA network, or off it again once that fails. I guess that's probably a wpa-supplicant issue?

Gavin McCullagh (gmccullagh) wrote :

I seem to hae spoken too soon.

Before going further I'll point out that I'm using a bcm4318 card and the wireless-dev kernel tree in order to get a working driver for it. However, my friend sees the exact same issue using an Atheros card so I don't think that's the problem.

I have two ESSIDs, one WPA, one unencrypted. If I just boot the laptop with the card plugged in, I end up on the unencrypted network. If I try to switch to the encrypted one, I can't connect. Commonly network-manager would then die. Initially I presumed this issue was with the driver, so I reported it here with syslog, etc:

https://lists.berlios.de/pipermail/bcm43xx-dev/2007-September/thread.html#5900

I have now tried the ubuntu13-i386 version of network-manager, downloading it by hand from the pool on archive.ubuntu.com. I got the same result as before, except that network-manager seemed somewhat more robust and didn't crash, the applet stayed working, showing it attempting to join the network. Some fiddling later (sorry I can't exactly describe what I did, but mainly trying to switch networks with the applet and iwconfig), network-manager crashed again with a similar error (I'll attach it in a moment).

Gavin McCullagh (gmccullagh) wrote :

this is a log where I booted up first, then plugged in my wifi card once network-manager was working. It defaults onto the WPA network (GAV). I then try to switch it (via the nm-applet) to the unencrypted network and after a few seconds network-manager hangs.

Gavin McCullagh (gmccullagh) wrote :

similar log of crash, this time booting up with the card present, defaults to unencrypted network (FON_GAV) and attempts to switch networks eventually cause crash of network-manager.

Alexander Sack (asac) wrote :

please test the wpasupplicant provided at:

http://ppa.launchpad.net/asac/ubuntu/pool/main/w/wpasupplicant/

Changed in network-manager:
status: Unknown → New
Gavin McCullagh (gmccullagh) wrote :

tried it out. It doesn't seem to correct my problems (using wireless-dev kernel and bcm43xx).

attached is a log where I booted up, then inserted the bcm4318 pcmcia card. This gets it onto the wpa network, but once I try to switch to the unecnrypted one, it fails. iwconfig showed the ESSID did change after a few seconds, but after a little more time the essid went to off/any and attempts to go back on the WPA network also fail.

Erik Vetters (evetters) wrote :

@asac
tried it out too. I have the same problems as in bug

https://bugs.launchpad.net/bugs/141467

Can I help with a little more debugging ... ?

Greetings
Erik

Reinhard Tartler (siretart) wrote :

wpasupplicant (0.6.0+0.5.8-0ubuntu1) gutsy; urgency=low

  * New upstream release
  * remove patch 11_erroneous_manpage_ref, applied upstream
  * remove patch 25_wpas_dbus_unregister_iface_fix, applied upstream

  [ Alexander Sack ]
  * bumping upstream version to replace development version 0.6.0 with
    this package from stable release branch.
  * attempt to fix wierd timeout and high latency issues by going
    back to stable upstream version (0.5.9) (LP: #140763,
    LP: #141233).

 -- Reinhard Tartler <email address hidden> Sun, 26 Aug 2007 16:06:57 +0200

Changed in wpasupplicant:
status: Confirmed → Fix Released
Luís Silva (luis) wrote :

With wpasupplicant (0.6.0+0.5.8-0ubuntu1) and network-manager (0.6.5-0ubuntu14) this problem still happens to me. It happens in a less frequent time base, but definately still happens. I'm using a "hand made" kernel, version 2.6.22.6 compiled with mac80211-10.0 and iwlwifi-1.1.0 and the network card is an intel 3945.

Gavin McCullagh (gmccullagh) wrote :

[ posting by hand as email seemed to disappear ]

Ditto, moreorless.

gavinmc@tilly:~$ apt-cache policy wpasupplicant network-manager
wpasupplicant:
  Installed: 0.6.0+0.5.8-0ubuntu1
  Candidate: 0.6.0+0.5.8-0ubuntu1
  Version table:
 *** 0.6.0+0.5.8-0ubuntu1 0
        500 http://ie.archive.ubuntu.com gutsy/main Packages
        100 /var/lib/dpkg/status
network-manager:
  Installed: 0.6.5-0ubuntu14
  Candidate: 0.6.5-0ubuntu14
  Version table:
 *** 0.6.5-0ubuntu14 0
        500 http://ie.archive.ubuntu.com gutsy/main Packages
        100 /var/lib/dpkg/status

To be specific, the situation has improved substantially, but issues still
exist. In the current session I have successfully switched from
unencrypted->encrypted, which I couldn't do before. In the previous boot
I switched from encrypted->unencrypted but not back to encrypted (I needed
to reboot then) and prior to that I managed to crash network-manager. I'm
attaching logs from both sessions.

It's not totally finished yet, but it's definitely improving

Gavin McCullagh (gmccullagh) wrote :
Gavin McCullagh (gmccullagh) wrote :

Is/should this bug be regarded as release critical?

It's certainly not releasable quality in my experience.

unggnu (unggnu) wrote :

Never had this issue again after fixed NetworkManager released.

Luís Silva (luis) wrote :

This problem shows itself especially on unstable networks (lots of ap's within the same range).

I this it should be considered release critical.

On 04/10/2007, unggnu <email address hidden> wrote:
>
> Never had this issue again after fixed NetworkManager released.
>

Unless there's a recent fix I didn't notice, it definitely wasn't fixed for
me at time of my last post to this bug.

unggnu (unggnu) wrote :

I only had the startup issue and this has been fixed.

It's not a frequent bug, but I've seen it at least 2 times. I'm attaching daemon.log. Hardware is ipw3945.

snifer@snifer-laptop:~$ dpkg -l | grep network-manager
ii network-manager 0.6.5-0ubuntu16 network management framework daemon
ii network-manager-gnome 0.6.5-0ubuntu10 network management framework (GNOME frontend
snifer@snifer-laptop:~$ dpkg -l | grep wpasupplicant
ii wpasupplicant 0.6.0+0.5.8-0ubuntu1 Client support for WPA and WPA2 (IEEE 802.11

Changed in wpasupplicant:
status: Fix Released → Confirmed
Alex Mauer (hawke) wrote :

I no longer have this problem.

Changed in network-manager:
status: New → Fix Released
Reinhard Tartler (siretart) wrote :

the wpasupplicant part of this bug is not really a problem for wpa_supplicant when used standalone. It is however a problem when used with some frontend like network manager. For network-manager, it has been worked around by alexander, thanks for that.

I'm closing the bugtask for wpasupplicant for now, because from this bugtrail, it is not clear to me how to reproduce this nor how to fix this in wpasupplicant. If you are suffering from this bug in one or the other way, please open a new bug for clarity. thanks.

Changed in wpasupplicant:
status: Confirmed → Invalid
Changed in network-manager:
importance: Unknown → Critical
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.