"*** stack smashing detected ***: /sbin/wpa_supplicant terminated" with iwl4965

Bug #138873 reported by Darren Albers on 2007-09-11
Affects                    Status         Importance   Assigned to   Milestone
network-manager (Ubuntu)   Invalid        Undecided    Unassigned
wpasupplicant (Ubuntu)     Fix Released   Medium       Kees Cook

Bug Description

Binary package hint: network-manager

I recently purchased a new laptop with an Intel IWL4965 card. The
iwlwifi drivers work well, except that I seem to lose my connection
after around an hour of use. For the heck of it I stopped NM and
connected using wpa_supplicant alone, and my connection remained
stable. I then ran NetworkManager --no-daemon and got this error after
about 30-40 minutes; my connection dropped about 5 minutes later:

*** stack smashing detected ***: /sbin/wpa_supplicant terminated
NetworkManager: <WARN> nm_utils_supplicant_request_with_check():
nm_device_802_11_wireless_scan: supplicant error for 'SCAN'.
Response: '��ӷ��ӷ

NetworkManager: <WARN> nm_device_802_11_wireless_scan(): could not
trigger wireless scan on device wlan0: Connection refused
NetworkManager: <WARN> nm_utils_supplicant_request_with_check():
nm_device_802_11_wireless_scan: supplicant error for 'SCAN'.
Response: '��ӷ��ӷ

NetworkManager: <WARN> nm_device_802_11_wireless_scan(): could not
trigger wireless scan on device wlan0: Transport endpoint is not
connected
NetworkManager: <WARN> nm_utils_supplicant_request_with_check():
nm_device_802_11_wireless_scan: supplicant error for 'SCAN'.
Response: '��ӷ��ӷ

NetworkManager: <WARN> nm_device_802_11_wireless_scan(): could not
trigger wireless scan on device wlan0: Transport endpoint is not
connected

The strange thing is that NM thinks I am still associated...

When I click on nm-applet and tell it to connect back to my AP it
connects fine and this is the output:
NetworkManager: <debug> [1189480089.333758]
nm_device_802_11_wireless_get_activation_ap(): Forcing AP 'junkyard'
NetworkManager: <info> User Switch:
/org/freedesktop/NetworkManager/Devices/wlan0 / junkyard
NetworkManager: <info> Deactivating device wlan0.
NetworkManager: <WARN> nm_utils_supplicant_request_with_check():
nm_device_802_11_wireless_scan: supplicant error for 'SCAN'.
Response: '��ӷ��ӷ

NetworkManager: <WARN> nm_device_802_11_wireless_scan(): could not
trigger wireless scan on device wlan0: Transport endpoint is not
connected
NetworkManager: <info> SUP: sending command 'DISABLE_NETWORK 0'
NetworkManager: <info> SUP: response was '��ӷ��ӷ

NetworkManager: <WARN> nm_utils_supplicant_request_with_check():
supplicant_cleanup: supplicant error for 'DISABLE_NETWORK 0'.
Response: '��ӷ��ӷ
NetworkManager: <WARN> supplicant_cleanup(): supplicant_cleanup -
couldn't disable network in supplicant_cleanup
NetworkManager: <info> SUP: sending command 'AP_SCAN 0'
NetworkManager: <info> SUP: response was '��ӷ��ӷ

NetworkManager: <WARN> nm_utils_supplicant_request_with_check():
supplicant_cleanup: supplicant error for 'AP_SCAN 0'. Response:
'��ӷ��ӷ
NetworkManager: <WARN> supplicant_cleanup(): supplicant_cleanup -
couldn't set AP_SCAN 0
NetworkManager: <info> SUP: sending command 'TERMINATE'
NetworkManager: <info> SUP: response was '��ӷ��ӷ

NetworkManager: <WARN> nm_utils_supplicant_request_with_check():
supplicant_cleanup: supplicant error for 'TERMINATE'. Response:
'��ӷ��ӷ
NetworkManager: <WARN> supplicant_cleanup(): supplicant_cleanup -
couldn't terminate wpasupplicant cleanly.
NetworkManager: <info> Device wlan0 activation scheduled...
NetworkManager: <info> Activation (wlan0) started...
NetworkManager: <info> Activation (wlan0) Stage 1 of 5 (Device
Prepare) scheduled...
NetworkManager: <info> Activation (wlan0) Stage 1 of 5 (Device
Prepare) started...
NetworkManager: <info> Activation (wlan0) Stage 2 of 5 (Device
Configure) scheduled...

There is an open bug report for this issue on Intel's bughost at
http://bughost.org/bugzilla/show_bug.cgi?id=1360 but I am beginning to
wonder if this might be an NM issue, or possibly an issue specific to the Gutsy package?

I am running Ubuntu 7.10 (Gutsy) with NetworkManager 0.6.5 and an
IWL4965 wireless NIC on a Lenovo T61.

Darren Albers (dalbers) wrote :

Set this to be a security vulnerability since this might indicate a possible buffer overflow...

Alexander Sack (asac) wrote :

do you still see the disconnects with latest gutsy network-manager?

Changed in network-manager:
status: New → Confirmed
Alexander Sack (asac) wrote :

siretart ... ever seen something like this before?

Changed in wpasupplicant:
status: New → Incomplete
Darren Albers (dalbers) wrote :

Alexander, I did as of last night with a system that was completely up to date with network-manager 0.6.5-0ubuntu11.

As a note, I did not see the problem with NetworkManager stopped and the card associated manually through wpa_supplicant.

Kees Cook (kees) wrote :

Hi, if you can reproduce this issue, can you start up everything, but before the "stack smashing" would normally happen, try this:

$ sudo gdb /sbin/wpa_supplicant $(pidof wpa_supplicant)
...
(gdb) br __stack_chk_fail
(gdb) continue

Then once gdb pops at the stack check failure, do:

(gdb) bt

and attach the output? Thanks!
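
For context, __stack_chk_fail is the function that stack-protector builds call when the guard value stored next to a function's local buffers no longer matches at return time. The C sketch below only simulates that check. The struct fake_frame, the GUARD_VALUE constant, and copy_and_check are hypothetical names made up for illustration; a real frame layout is chosen by the compiler, and the real guard value is set up by libc at startup.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a protected stack frame: a local buffer
 * followed by the guard value that sits between it and the saved
 * registers / return address. */
struct fake_frame {
    uint8_t  buf[8];
    uint32_t guard;
};

/* Illustrative constant only; not read from any real libc. */
#define GUARD_VALUE 0xff0a0000u

/* Copy len bytes into the simulated frame, then perform the same kind of
 * comparison a function epilogue would. Returns 1 if the guard is intact,
 * 0 if it was overwritten ("stack smashing detected"). */
int copy_and_check(const uint8_t *src, size_t len)
{
    struct fake_frame frame;

    assert(src != NULL);
    memset(&frame, 0, sizeof(frame));
    frame.guard = GUARD_VALUE;

    /* Clamp to the struct size so the demo itself stays well-defined. */
    if (len > sizeof(frame))
        len = sizeof(frame);
    memcpy(&frame, src, len);

    return frame.guard == GUARD_VALUE;
}
```

Copying 8 bytes fills buf exactly and leaves the guard alone; copying 12 bytes runs past buf and changes the guard, which is the condition the real check reports before aborting the process.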

Darren Albers (dalbers) wrote :

Sure, I will do it tonight.

Alexander Sack <email address hidden> writes:

> siretart ... ever seen something like this before?

Err, sorry. nope :(

--
Gruesse/greetings,
Reinhard Tartler, KeyID 945348A4

Darren Albers (dalbers) wrote :

I am not sure how useful this is going to be but here you go.

The first breakpoint came with this message from NetworkManager:
NetworkManager: <WARN> nm_utils_supplicant_request_with_check(): nm_device_802_11_wireless_scan: supplicant error for 'SCAN'. Response: 'TIMEOUT[CLI]'
NetworkManager: <WARN> nm_device_802_11_wireless_scan(): could not trigger wireless scan on device wlan0: Operation not supported

Here is the bt:
darren@dpa2:~$ ps -A | grep wpa
 8347 pts/0 00:00:00 wpa_supplicant
darren@dpa2:~$ sudo gdb /sbin/wpa_supplicant 8347
GNU gdb 6.6-debian
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i486-linux-gnu"...
(no debugging symbols found)
Using host libthread_db library "/lib/tls/i686/cmov/libthread_db.so.1".
Attaching to program: /sbin/wpa_supplicant, process 8347
Reading symbols from /usr/lib/i686/cmov/libssl.so.0.9.8...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/i686/cmov/libssl.so.0.9.8
Reading symbols from /usr/lib/i686/cmov/libcrypto.so.0.9.8...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/i686/cmov/libcrypto.so.0.9.8
Reading symbols from /lib/tls/i686/cmov/libdl.so.2...
(no debugging symbols found)...done.
Loaded symbols for /lib/tls/i686/cmov/libdl.so.2
Reading symbols from /usr/lib/libdbus-1.so.3...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libdbus-1.so.3
Reading symbols from /lib/tls/i686/cmov/libc.so.6...
(no debugging symbols found)...done.
Loaded symbols for /lib/tls/i686/cmov/libc.so.6
Reading symbols from /usr/lib/libz.so.1...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libz.so.1
Reading symbols from /lib/ld-linux.so.2...
(no debugging symbols found)...done.
Loaded symbols for /lib/ld-linux.so.2
(no debugging symbols found)
0xffffe410 in __kernel_vsyscall ()
(gdb) br __stack_chk_fail
Breakpoint 1 at 0xb7d2a486
(gdb) continue
Continuing.

Breakpoint 1, 0xb7d2a486 in __stack_chk_fail ()
   from /lib/tls/i686/cmov/libc.so.6
(gdb) bt
#0 0xb7d2a486 in __stack_chk_fail () from /lib/tls/i686/cmov/libc.so.6
#1 0x08081eb3 in ?? ()
#2 0x00000001 in ?? ()
#3 0x08095050 in ?? ()
#4 0x00000470 in ?? ()
#5 0x00000005 in ?? ()
#6 0xb7d8816c in ?? () from /lib/tls/i686/cmov/libc.so.6
#7 0xb7d8816c in ?? () from /lib/tls/i686/cmov/libc.so.6
#8 0x00000000 in ?? ()
(gdb) continue
Continuing.

Then when I received this message:
*** stack smashing detected ***: /sbin/wpa_supplicant terminated
NetworkManager: <WARN> nm_utils_supplicant_request_with_check(): nm_device_802_11_wireless_scan: supplicant error for 'SCAN'. Response: 'TIMEOUT[CLI]'

Here is the bt from that:

Program received signal SIGABRT, Aborted.
0xffffe410 in __kernel_vsyscall ()
(gdb) bt
#0 0xffffe410 in __kernel_vsyscall ()
#1 0xb7c6b875 in raise () from /lib/tls/i686/cmov/libc.so.6
#2 0xb7c6d201 in abort () from /lib/tls/i686/cmov/libc.so.6
#3 0xb7ca2e5c in ?? () from /...


On Tue, Sep 11, 2007 at 11:20:10PM -0000, Darren Albers wrote:
> I am not sure how useful this is going to be but here you go.

Thanks for doing this. Yeah, it's going to be tricky (since the stack
is, well, smashed). Hopefully it will help track it down.

Kees Cook (kees) wrote :

Well, I spent some time digging around in wpa_driver_wext_get_scan_results, and while it's scary to read, the overflow isn't obvious yet. Can you try another gdb recipe? This one is quite a bit more exciting -- it tries to break out the moment the stack gets trashed. Here are the commands, after doing the "sudo gdb /sbin/wpa_supplicant $(pidof wpa_supplicant)":

br *0x08081964
br *0x08081a3a

set variable $count = 2
commands 1
silent
set variable $cow = (unsigned long*)($ebp - 0x14)
watch *$cow
cont
end

commands 2
silent
set variable $count = $count + 1
delete $count
cont
end

cont

bt
info reg
x/10i $eip

If I got this aligned correctly, this should set up a hardware watchpoint when wpa_driver_wext_get_scan_results is called (and tear it down just before it exits). When the watchpoint triggers, the bt/info reg/etc. should give us the details about the instruction immediately after the offending action. Each time wpa_driver_wext_get_scan_results is called, you'll see something like:
  Hardware watchpoint 15: *$cow
You can ignore those. We're looking for:
  Old value = 75012294
  New value = 75012241
  0x......
  (gdb)

I wonder if the issue is actually with the wireless driver itself, and that it might be clobbering the userspace buffer. We'll see. :)

Alexander Sack (asac) wrote :

for now i assume that this is not a network-manager bug. If it turns out to be, feel free to reopen the network-manager task.

 - Alexander

Changed in network-manager:
status: Confirmed → Invalid
Darren Albers (dalbers) wrote :

Kees, I appreciate all your help and I will get you that backtrace when I get back home tomorrow. It very well could be the driver itself; an IPW3945 card does not exhibit this problem with the IPW driver.

Darren Albers (dalbers) wrote :

Here you go! Let me know if this isn't what you are looking for. Thanks!

(gdb) continue
Continuing.
Hardware watchpoint 3: *$cow
Hardware watchpoint 4: *$cow
Hardware watchpoint 5: *$cow
Hardware watchpoint 6: *$cow
Hardware watchpoint 7: *$cow
Hardware watchpoint 8: *$cow
Hardware watchpoint 3: *$cow

Old value = 4278845440
New value = 3084202304
Hardware watchpoint 4: *$cow

Old value = 4278845440
New value = 3084202304
Hardware watchpoint 5: *$cow

Old value = 4278845440
New value = 3084202304
Hardware watchpoint 6: *$cow

Old value = 4278845440
New value = 3084202304
---Type <return> to continue, or q <return> to quit---
Hardware watchpoint 7: *$cow

Old value = 4278845440
New value = 3084202304
Hardware watchpoint 8: *$cow

Old value = 4278845440
New value = 3084202304
0xb7c77fb7 in malloc () from /lib/tls/i686/cmov/libc.so.6
(gdb) bt
#0 0xb7c77fb7 in malloc () from /lib/tls/i686/cmov/libc.so.6
#1 0x080752db in ?? ()
#2 0x00005800 in ?? ()
#3 0x00000001 in ?? ()
#4 0x0000001a in ?? ()
#5 0x0804eb8b in ?? ()
#6 0x08096a60 in ?? ()
#7 0x080a2d5e in ?? ()
#8 0x00000008 in ?? ()
#9 0x00000000 in ?? ()
(gdb) contunue
Undefined command: "contunue". Try "help".
(gdb) cont
Continuing.
Hardware watchpoint 3: *$cow

Old value = 3084202304
New value = 4278845440
Hardware watchpoint 4: *$cow

Old value = 3084202304
New value = 4278845440
Hardware watchpoint 5: *$cow

Old value = 3084202304
New value = 4278845440
Hardware watchpoint 6: *$cow

Old value = 3084202304
New value = 4278845440
Hardware watchpoint 7: *$cow

Old value = 3084202304
New value = 4278845440
Hardware watchpoint 8: *$cow

---Type <return> to continue, or q <return> to quit---
Old value = 3084202304
New value = 4278845440
0x08081964 in ?? ()
Hardware watchpoint 9: *$cow
Hardware watchpoint 10: *$cow
Hardware watchpoint 3: *$cow

Old value = 4278845440
New value = 3084202304
Hardware watchpoint 4: *$cow

Old value = 4278845440
New value = 3084202304
Hardware watchpoint 5: *$cow

Old value = 4278845440
New value = 3084202304
Hardware watchpoint 6: *$cow

Old value = 4278845440
New value = 3084202304
Hardware watchpoint 7: *$cow

---Type <return> to continue, or q <return> to quit---
Old value = 4278845440
New value = 3084202304
Hardware watchpoint 8: *$cow

Old value = 4278845440
New value = 3084202304
Hardware watchpoint 9: *$cow

Old value = 4278845440
New value = 3084202304
Hardware watchpoint 10: *$cow

Old value = 4278845440
New value = 3084202304
0xb7c77fb7 in malloc () from /lib/tls/i686/cmov/libc.so.6
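
The two decimal values that keep alternating in the watchpoint output are easier to interpret in hex: 4278845440 is 0xff0a0000 and 3084202304 is 0xb7d53140. The first has the byte pattern 00 00 0a ff (lowest address first on little-endian x86), which is the shape of a terminator-style stack guard. The second lies between libc addresses that appear in the earlier backtraces (0xb7d2a486 and 0xb7d8816c), so it appears to be a pointer into libc rather than corrupted data. Both readings are interpretations, not something the log states. The helper names in this C sketch are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Render a 32-bit value as fixed-width hex so it can be compared with
 * the addresses gdb prints. out must hold at least 11 bytes. */
void to_hex32(uint32_t value, char out[11])
{
    assert(out != NULL);
    snprintf(out, 11, "0x%08x", (unsigned int)value);
}

/* Report whether value has the byte pattern 00 00 0a ff, lowest byte
 * first. Treating that pattern as a stack guard is an assumption made
 * for illustration. Returns 1 on match, 0 otherwise. */
int looks_like_terminator_guard(uint32_t value)
{
    static const uint8_t pattern[4] = { 0x00, 0x00, 0x0a, 0xff };
    uint8_t bytes[4];

    /* Extract bytes with shifts so the result does not depend on the
     * endianness of the machine running this helper. */
    for (int i = 0; i < 4; i++)
        bytes[i] = (uint8_t)((value >> (8 * i)) & 0xff);

    return memcmp(bytes, pattern, sizeof(pattern)) == 0;
}
```

Under that reading, the watched slot flips between a guard value and an ordinary libc pointer. That is what one would expect if the watchpoints remained armed after the watched function returned and its stack slot was reused.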

Kees Cook (kees) wrote :

On Fri, Sep 14, 2007 at 11:19:56PM -0000, Darren Albers wrote:
> Here you go! Let me know if this isn't what you are looking for.
> Thanks!

Hm, I don't think it was deleting the watch-point correctly. Can you
try it again, and attach (as a file rather than a comment) the entire
gdb session? Maybe using something like this:

sudo gdb /sbin/wpa_supplicant $(pidof wpa_supplicant) 2>&1 | tee /tmp/gdb.log

Then I can double-check the commands I gave you; in case I missed
something it should be more obvious.

Thanks!

-Kees

--
Kees Cook @outflux.net

Darren Albers (dalbers) wrote :

Well I'm an idiot, I forgot to do:
info reg
x/10i $eip

I'll rerun it now and pipe it to a file for you.

Sorry about that!

Darren Albers (dalbers) wrote :

Here you go, is this better?

Kees Cook (kees) wrote :

Yup, that turned out great. I'll continue digging. Thanks!

Kees Cook (kees) wrote :

When it finishes building, can you test the wpasupplicant from my PPA?
http://ppa.launchpad.net/keescook/ubuntu/pool/main/w/

I think I found the crash and got it fixed. If you can confirm this fixes it for you, I'll get it uploaded to the main archive. Thanks for doing all the gdb debugging!

Changed in wpasupplicant:
assignee: nobody → keescook
status: Incomplete → Fix Committed
importance: Undecided → Medium
Kees Cook (kees) wrote :

wpasupplicant (0.6.0-3ubuntu1~ppa1) gutsy; urgency=low

  * Add debian/patches/90_fix_wext_tsf_stack_overflow.dpatch: correct
    buffer size limit on hexstr2bin call from wext_get_scan_custom
    (LP: #138873).

 -- Kees Cook <email address hidden> Fri, 14 Sep 2007 23:08:25 -0700
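
The class of bug named in the changelog entry can be sketched in C as follows. This is a simplified illustration, not the wpa_supplicant source. It assumes the converter's length argument counts output bytes (so it consumes twice that many hex characters) and that the destination buffer holds 8 bytes. The names hex_digit, hex_to_bin, and parse_tsf are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert one hex digit to its value, or return -1 if it is not one. */
int hex_digit(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Hypothetical converter: out_len counts OUTPUT bytes, so it reads
 * 2 * out_len hex characters. Returns 0 on success, -1 on a bad digit. */
int hex_to_bin(const char *hex, uint8_t *out, size_t out_len)
{
    assert(hex != NULL && out != NULL);
    for (size_t i = 0; i < out_len; i++) {
        int hi = hex_digit(hex[2 * i]);
        if (hi < 0)
            return -1;            /* stop before reading past a NUL */
        int lo = hex_digit(hex[2 * i + 1]);
        if (lo < 0)
            return -1;
        out[i] = (uint8_t)((hi << 4) | lo);
    }
    return 0;
}

/* Decode a 16-character hex timestamp into a 64-bit big-endian value.
 * Returns 0 on success, -1 on bad length or bad digit. */
int parse_tsf(const char *hex, size_t hex_len, uint64_t *tsf)
{
    uint8_t bin[8];

    assert(tsf != NULL);
    if (hex_len != 16)
        return -1;
    /* 16 hex characters decode to 8 bytes. Passing hex_len unhalved
     * would ask the converter to write 16 bytes into this 8-byte array. */
    if (hex_to_bin(hex, bin, hex_len / 2) != 0)
        return -1;

    *tsf = 0;
    for (size_t i = 0; i < sizeof(bin); i++)
        *tsf = (*tsf << 8) | bin[i];
    return 0;
}
```

The point is the division by two: the length check validates the number of hex characters, while the converter needs the number of bytes to produce.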

Changed in wpasupplicant:
status: Fix Committed → Fix Released
Reinhard Tartler (siretart) wrote :

this bug is not fixed in gutsy, but test packages are available in kees' PPA

Changed in wpasupplicant:
status: Fix Released → In Progress

Kees Cook <email address hidden> writes:

> wpasupplicant (0.6.0-3ubuntu1~ppa1) gutsy; urgency=low
>
> * Add debian/patches/90_fix_wext_tsf_stack_overflow.dpatch: correct
> buffer size limit on hexstr2bin call from wext_get_scan_custom
> (LP: #138873).
>
> -- Kees Cook <email address hidden> Fri, 14 Sep 2007 23:08:25 -0700

The file debian/patches/90_fix_wext_tsf_stack_overflow.dpatch has the
following contents:

diff -urNad wpasupplicant-0.6.0~/src/drivers/driver_wext.c wpasupplicant-0.6.0/src/drivers/driver_wext.c
--- wpasupplicant-0.6.0~/src/drivers/driver_wext.c 2007-05-28 10:26:55.000000000 -0700
+++ wpasupplicant-0.6.0/src/drivers/driver_wext.c 2007-09-14 23:07:24.217713592 -0700
@@ -1380,6 +1380,7 @@
                        wpa_printf(MSG_INFO, "Invalid TSF length (%d)", bytes);
                        return;
                }
+               bytes /= 2;
                hexstr2bin(spos, bin, bytes);
                res->tsf += WPA_GET_BE64(bin);

Can you please comment on it?
The complete history of this bug can be found here:

https://launchpad.net/bugs/138873

--
Gruesse/greetings,
Reinhard Tartler, KeyID 945348A4

Darren Albers (dalbers) wrote :

So far so good! I can normally trigger it within 20 minutes and today it seemed 100% stable for over 3 hours.

I opened an upstream bug report a couple of days ago when I opened this one, should I close it and link back here?

There are also a number of T61 users who are experiencing this issue. Should I direct them here to test, or is this a possible security issue that shouldn't be advertised until the fix is in the repos?

Thank you so much for fixing this!

Kees Cook (kees) wrote :

I thought the PPA-upload-closing bug was fixed?

siretart: what part did you want me to comment on?

dalbers: sure, link their tracker back here. I sent the patch to Jouni Malinen last night, too.

Kees Cook (kees) wrote :

Attaching diff to this report too, just for completeness.

Reinhard Tartler (siretart) wrote :

kees: that email was directed to upstream. That bug was 'only' CCed

Alexander Sack (asac) wrote :

any news from upstream so far?

Alexander Sack (asac) wrote :

kees, i milestoned this for beta. Feel free to remove it if you think it's inappropriate or if it's clear that we won't be able to provide a fixed package in time for beta.

Kees Cook (kees) wrote :

No word from upstream. I have upload a fixed package ASAP if you want.

Kees Cook (kees) wrote :

Sorry, that should read "I _can_ upload a fixed package ..."

Alexander Sack (asac) wrote :

wpasupplicant 0.6.0+0.5.8-0ubuntu1 (latest package in gutsy) is not affected.

Changed in wpasupplicant:
status: In Progress → Fix Released