[jaunty][iwl3945] WPA fully associate only after many retries

Bug #413874 reported by Christhonie
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Linux
Fix Released
Medium
linux (Ubuntu)
Invalid
Undecided
Unassigned

Bug Description

Connecting to a WPA-PSK wireless access point (AP), a Billion 5100W with non-hidden SSID, using a Dell Latitude D830, with an Intel PRO/Wireless 3945ABG device using the iwl3945 driver, will only associate after several retries.

Initially, after installing 8.10 on the said laptop, WPA worked fine with this setup. After upgrading to 9.04 a month has elapsed before I used the wireless link again and at that time I have noticed degraded association behaviour. Network manager was unusable as it timed out so often it was easier to use wpa_supplicant, which retries indefinitely, from the terminal. Association now takes anywhere from 2mins to over 6 hours (leaving it running over night, trying to connect), most often taking over 10mins.

Connection to the same AP using a secondary D-Link AirPlusG (DWL-G630) as wlan1, using the same /etc/wpa_supplicant.conf file, results in an immediate connection. For me it narrows it down to the iwl3945 driver.

Before wpa_supplicant finally connects it continuously retries, each retry having one of 3 possible failure cases I have detected thus far (I have been logging this for over a month now). I am not sure if this is all part of the same issue, but they all contribute to a non-connection. I will attach the annotated output of wpa_supplicant for each case separately. I have found this output far more detailed than that in syslog, but will include a syslog file with a corresponding wpa_supplicant file as an additional example.

Failure case 1: No authentication for 10secs
A successful AP scan, but no association for 10secs causes a retry.

Failure case 2: A possible disconnect by the AP?
A successful AP scan, then association event (no WPA handshake yet), but 6 seconds later a "Wireless event: cmd=0x8b15 len=20" causes a disconnect and retry.

Failure case 3: Timeout during 4way handshake
A successful AP scan, association, but timeout occurs during the 4-way handshake

Device information:
>lspci (only relevant lines listed; first device working, second device problematic)
04:00.0 Network controller: RaLink RT2561/RT61 rev B 802.11g
0c:00.0 Network controller: Intel Corporation PRO/Wireless 3945ABG [Golan] Network Connection (rev 02)

>lsmod | grep iwl3945
iwl3945 97912 0
mac80211 217592 3 rt2x00pci,rt2x00lib,iwl3945
led_class 12036 2 rt2x00lib,iwl3945
cfg80211 38288 3 rt2x00lib,iwl3945,mac80211

>modinfo iwl3945
filename: /lib/modules/2.6.28-14-generic/kernel/drivers/net/wireless/iwlwifi/iwl3945.ko
firmware: iwlwifi-3945-1.ucode
license: GPL
author: Copyright(c) 2003-2008 Intel Corporation
version: 1.2.26ks
description: Intel(R) PRO/Wireless 3945ABG/BG Network Connection driver for Linux
srcversion: 2B0F90BC6D81899594B7ACC
alias: pci:v00008086d00004227sv*sd*bc*sc*i*
alias: pci:v00008086d00004222sv*sd*bc*sc*i*
alias: pci:v00008086d00004227sv*sd00001014bc*sc*i*
alias: pci:v00008086d00004222sv*sd00001044bc*sc*i*
alias: pci:v00008086d00004222sv*sd00001034bc*sc*i*
alias: pci:v00008086d00004222sv*sd00001005bc*sc*i*
depends: mac80211,led-class,cfg80211
vermagic: 2.6.28-14-generic SMP mod_unload modversions 586

>modinfo mac80211
filename: /lib/modules/2.6.28-14-generic/kernel/net/mac80211/mac80211.ko
license: GPL
description: IEEE 802.11 subsystem
srcversion: 1F1657BC4ED0F54E1A6BBF4
depends: cfg80211
vermagic: 2.6.28-14-generic SMP mod_unload modversions 586
parm: ieee80211_default_rc_algo:Default rate control algorithm for mac80211 to use (charp)

>cat /etc/wpa_supplicant.conf
network={
        proto=WPA
        scan_ssid=1
        key_mgmt=WPA-PSK
        ssid="study"
        #PSK unencrypted: "xxxxxxxx"
        psk=2c64497a1331323dcac5be5f662934e010d27003ae64c1674846ea3db5b48fce

ProblemType: Bug
Architecture: i386
DistroRelease: Ubuntu 9.04
HibernationDevice: RESUME=UUID=a8659960-146a-4a43-a7a6-e95e23c3257e
MachineType: Dell Inc. Latitude D830
NonfreeKernelModules: nvidia
Package: linux-image-2.6.28-14-generic 2.6.28-14.47
ProcCmdLine: root=UUID=91f238b8-6e84-4720-ab36-a252cf3818e7 ro quiet splash
ProcEnviron:
 LANGUAGE=en_GB:en
 PATH=(custom, no user)
 LANG=en_GB.UTF-8
 SHELL=/bin/bash
ProcVersionSignature: Ubuntu 2.6.28-14.47-generic
SourcePackage: linux

Revision history for this message
In , Nigel (nigel-redhat-bugs) wrote :

Created attachment 333065
syslogd log

Description of problem:
NetworkManager doesn't always connect to Network

Version-Release number of selected component (if applicable):
NetworkManager-0.7.0.97-4.git20090219.fc11.i586
wpa_supplicant-0.6.7-3.fc11.i386
kernel-PAE-2.6.29-0.137.rc5.git4.fc11.i686

How reproducible:
Intermittent

Steps to Reproduce:

Either

1. Resume suspended laptop
2. wait for NM to connect to previously selected SSID
3. observe that NM fails & pops up request for secrets
or

1. Start with NM already connected to an AP
2. Select an alternate (but correctly configured, and available) AP
3. observe that NM fails &^ pops up secrets request

Actual results:
NM fails to connect

Expected results:
NM connects

Additional info:

Basically sometimes this works, sometimes it doesn't. Association appears to time out. Other non-linux clients (S60/Nokia N95 8Gb/Nokia N96/Windows 7/Windows Vista) don't *appear* to have a problem

In the log file attached note the "jonesn: " messages in the log. These were created with the "logger" command to aid in documenting the scenario

The Ap that I had been connected to ok was "planetf1c". The one that failed was "planetf1f". Ignore "planetf1e" -- this one is not active and I clicked on it in the GUI by mistake.

Error seems to hinge around
Activation (wlan0/wireless): association took too long.

Security in use is WPA2-PSK AES
wifi driver is iwl3945

Both AP are "fonera" routers running openwrt with atheros wireless. In this config the router broadcasts 2 SSIDs.

Same problem has been observed with ddwrt (single SSID)

Log file is attached (kernel debug log, -dddt )

Not entirely clear if the issue is supplicant, NM or driver.....

Final note:
I am also seeing issues with WPA Enterprise (LEAP and EAP-TLS) but figure it makes sense to address the simpler PSK case first (less variables, more control home vs enterprise...)

Revision history for this message
In , Nigel (nigel-redhat-bugs) wrote :

I tried restarting NM, and the problem persisted... ie once we get into this state the "bad" AP cannot be connected to.

I then stopped NM and unloaded the iwl3945 module, followed by reloading/restarting

NM started and connected to the "good" AP just fine (c). I then clicked on the "bad" AP in the GUI (f) and again it connected perfectly.

So is this some form of driver blacklisting/bad state info in iwl3945 driver?

From syslog during driver load:
iwl3945 0000:03:00.0: PCI INT A disabled
iwl3945: Intel(R) PRO/Wireless 3945ABG/BG Network Connection driver for Linux, 1.2.26kds
iwl3945: Copyright(c) 2003-2008 Intel Corporation
iwl3945 0000:03:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
iwl3945 0000:03:00.0: setting latency timer to 64
iwl3945: Tunable channels: 13 802.11bg, 23 802.11a channels
iwl3945: Detected Intel Wireless WiFi Link 3945ABG
wmaster0 (iwl3945): not using net_device_ops yet
phy1: Selected rate control algorithm 'iwl-3945-rs'
wlan0 (iwl3945): not using net_device_ops yet
iwl3945 0000:03:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
iwl3945 0000:03:00.0: irq 31 for MSI/MSI-X
iwl3945 0000:03:00.0: firmware: requesting iwlwifi-3945-2.ucode
iwl3945 loaded firmware version 15.28.2.8
Registered led device: iwl-phy1:radio
Registered led device: iwl-phy1:assoc
Registered led device: iwl-phy1:RX
Registered led device: iwl-phy1:TX

Revision history for this message
In , Nigel (nigel-redhat-bugs) wrote :

Created attachment 333072
wpa supplicant log from same time period

Added wpa supplicant log -- hopefully from correct times!

Also adding
options iwl3945 debug=0x0000bc88
for next time (43fff was too verbose)

Revision history for this message
In , Nigel (nigel-redhat-bugs) wrote :

From full wpa_supplicant log it appears that Association timeout is the issue -- then causing blacklisting.

1235497219.167758: Selecting BSS from priority group 0
1235497219.167805: Try to find WPA-enabled AP
1235497219.167852: 0: 00:18:84:2a:74:67 ssid='planetf1f' wpa_ie_len=0 rsn_ie_len=20 caps=0x11
1235497219.167921: skip - disabled
1235497219.167973: selected based on RSN IE
1235497219.168020: selected WPA AP 00:18:84:2a:74:67 ssid='planetf1f'
1235497219.168079: Trying to associate with 00:18:84:2a:74:67 (SSID='planetf1f' freq=2412 MHz)
1235497219.168131: Cancelling scan request
1235497219.168184: WPA: clearing own WPA/RSN IE
1235497219.168244: Automatic auth_alg selection: 0x1
1235497219.168304: RSN: using IEEE 802.11i/D9.0
1235497219.168352: WPA: Selected cipher suites: group 16 pairwise 16 key_mgmt 2 proto 2
1235497219.168403: WPA: clearing AP WPA IE
1235497219.168449: WPA: set AP RSN IE - hexdump(len=22): 30 14 01 00 00 0f ac 04 01 00 00 0f ac 04 01 00 00 0f ac 02 00 00
1235497219.168509: WPA: using GTK CCMP
1235497219.168557: WPA: using PTK CCMP
1235497219.168605: WPA: using KEY_MGMT WPA-PSK
1235497219.168652: WPA: Set own WPA IE default - hexdump(len=22): 30 14 01 00 00 0f ac 04 01 00 00 0f ac 04 01 00 00 0f ac 02 00 00
1235497219.168727: No keys have been configured - skip key clearing
1235497219.168772: wpa_driver_wext_set_drop_unencrypted
1235497219.168822: State: SCANNING -> ASSOCIATING
1235497219.168998: wpa_driver_wext_set_operstate: operstate 0->0 (DORMANT)
1235497219.169088: WEXT: Operstate: linkmode=-1, operstate=5
1235497219.169164: wpa_driver_wext_associate
1235497219.169249: wpa_driver_wext_set_psk
1235497219.184949: Setting authentication timeout: 20 sec 0 usec
1235497219.186846: EAPOL: External notification - EAP success=0
1235497219.186871: EAPOL: External notification - EAP fail=0
1235497219.186894: EAPOL: External notification - portControl=Auto
1235497219.186933: RSN: Ignored PMKID candidate without preauth flag
1235497219.186962: RTM_NEWLINK: operstate=0 ifi_flags=0x1003 ([UP])
1235497219.186986: RTM_NEWLINK, IFLA_IFNAME: Interface 'wlan0' added
1235497219.187009: Wireless event: cmd=0x8b04 len=12
1235497219.785878: RTM_NEWLINK: operstate=0 ifi_flags=0x1003 ([UP])
1235497219.786026: RTM_NEWLINK, IFLA_IFNAME: Interface 'wlan0' added
1235497219.786082: Wireless event: cmd=0x8b15 len=20
1235497219.786137: Wireless event: new AP: 00:00:00:00:00:00
1235497219.786196: Added BSSID 00:18:84:2a:74:67 into blacklist
1235497219.786256: CTRL-EVENT-DISCONNECTED - Disconnect event - remove keys
1235497219.786309: wpa_driver_wext_set_key: alg=0 key_idx=0 set_tx=0 seq_len=0 key_len=0
1235497219.786382: wpa_driver_wext_set_key: alg=0 key_idx=1 set_tx=0 seq_len=0 key_len=0
1235497219.786448: wpa_driver_wext_set_key: alg=0 key_idx=2 set_tx=0 seq_len=0 key_len=0
1235497219.786512: wpa_driver_wext_set_key: alg=0 key_idx=3 set_tx=0 seq_len=0 key_len=0
1235497219.786576: wpa_driver_wext_set_key: alg=0 key_idx=0 set_tx=0 seq_len=0 key_len=0
1235497219.786721: Driver did not support SIOCSIWENCODEEXT
1235497219.786791: State: ASSOCIATING -> DISCONNECTED
1235497219.787000: wpa_driver_wext_set_operstate: operstate 0->0 (DORMANT)

Revision history for this message
In , Nigel (nigel-redhat-bugs) wrote :

Tried with disable_hw_scan=1, but didn't appear to prevent the problem. Occurred again this am. Other clients ok. Recycle (of iwl3945+NM) fixed. Kernel at 2.6.29-0.145.rc6.fc11.i686.PAE

Interesting section in debug log
Feb 25 06:12:59 snowdon NetworkManager: <info> (wlan0): supplicant connection state: scanning -> associating
Feb 25 06:12:59 snowdon kernel: iwl3945: U iwl3945_add_station Add STA ID 24: ff:ff:ff:ff:ff:ff
Feb 25 06:12:59 snowdon kernel: wlan0: direct probe to AP 00:18:84:2a:74:67 try 1
Feb 25 06:13:00 snowdon kernel: wlan0: direct probe to AP 00:18:84:2a:74:67 try 2
Feb 25 06:13:00 snowdon kernel: wlan0: direct probe to AP 00:18:84:2a:74:67 try 3
Feb 25 06:13:00 snowdon NetworkManager: <info> (wlan0): supplicant connection state: associating -> disconnected
Feb 25 06:13:00 snowdon kernel: wlan0: direct probe to AP 00:18:84:2a:74:67 timed out
Feb 25 06:13:08 snowdon NetworkManager: <info> wlan0: link timed out.
Feb 25 06:13:08 snowdon NetworkManager: <info> (wlan0): supplicant connection state: disconnected -> scanning

So "direct probe to AP XXX timed out"?

Revision history for this message
In , Nigel (nigel-redhat-bugs) wrote :

Moving to iwl3945 -- don't believe this to be a NM issue?

Revision history for this message
In , Nigel (nigel-redhat-bugs) wrote :

With

kernel-PAE-2.6.29-0.176.rc6.git5.fc11.i686
& a change to /etc/modprobe.d/iwl3945
options iwl3945 disable_hw_scan=1

don't appear to be getting issues.

Will remove the option to disable hw scan at next reboot to see which change fixed it.

Revision history for this message
In , Nigel (nigel-redhat-bugs) wrote :

My error. HW scan is already reenabled - currently have
options iwl3945 debug=0x0000bc88

so kernel fix may have addressed this. will confirm in a few more days.

Revision history for this message
In , Nigel (nigel-redhat-bugs) wrote :

closing. no reoccurances

Revision history for this message
Christhonie (christhonie) wrote :
Revision history for this message
Christhonie (christhonie) wrote :
Revision history for this message
Christhonie (christhonie) wrote :
Revision history for this message
Christhonie (christhonie) wrote :
Revision history for this message
Christhonie (christhonie) wrote :
Download full text (4.4 KiB)

Output from syslog and matched with output from wpa_supplicant over the same period;

Aug 15 02:22:43 cg-dell-laptop kernel: [41851.362704] Registered led device: iwl-phy0:radio
Aug 15 02:22:43 cg-dell-laptop kernel: [41851.362725] Registered led device: iwl-phy0:assoc
Aug 15 02:22:43 cg-dell-laptop kernel: [41851.362778] Registered led device: iwl-phy0:RX
Aug 15 02:22:43 cg-dell-laptop kernel: [41851.362793] Registered led device: iwl-phy0:TX
Aug 15 02:22:43 cg-dell-laptop kernel: [41851.382400] ADDRCONF(NETDEV_UP): wlan0: link is not ready
Aug 15 02:22:46 cg-dell-laptop kernel: [41854.265714] wlan0: direct probe to AP 00:04:ed:5b:b9:c9 try 1
Aug 15 02:22:46 cg-dell-laptop kernel: [41854.464123] wlan0: direct probe to AP 00:04:ed:5b:b9:c9 try 2
Aug 15 02:22:46 cg-dell-laptop kernel: [41854.664450] wlan0: direct probe to AP 00:04:ed:5b:b9:c9 try 3
Aug 15 02:22:47 cg-dell-laptop kernel: [41854.864504] wlan0: direct probe to AP 00:04:ed:5b:b9:c9 timed out
Aug 15 02:22:57 cg-dell-laptop kernel: [41865.670080] wlan0: direct probe to AP 00:04:ed:5b:b9:c9 try 1
Aug 15 02:22:57 cg-dell-laptop kernel: [41865.671651] wlan0: direct probe to AP 00:04:ed:5b:b9:c9 try 1
Aug 15 02:22:58 cg-dell-laptop kernel: [41865.868239] wlan0: direct probe to AP 00:04:ed:5b:b9:c9 try 2
Aug 15 02:22:58 cg-dell-laptop kernel: [41866.068242] wlan0: direct probe to AP 00:04:ed:5b:b9:c9 try 3
Aug 15 02:22:58 cg-dell-laptop kernel: [41866.268055] wlan0: direct probe to AP 00:04:ed:5b:b9:c9 timed out
Aug 15 02:23:09 cg-dell-laptop kernel: [41877.097659] wlan0: direct probe to AP 00:04:ed:5b:b9:c9 try 1
Aug 15 02:23:09 cg-dell-laptop kernel: [41877.304481] wlan0: direct probe to AP 00:04:ed:5b:b9:c9 try 2
Aug 15 02:23:09 cg-dell-laptop kernel: [41877.504246] wlan0: direct probe to AP 00:04:ed:5b:b9:c9 try 3
Aug 15 02:23:09 cg-dell-laptop kernel: [41877.705238] wlan0: direct probe to AP 00:04:ed:5b:b9:c9 timed out
Aug 15 02:23:20 cg-dell-laptop kernel: [41888.590756] wlan0: direct probe to AP 00:04:ed:5b:b9:c9 try 1
Aug 15 02:23:20 cg-dell-laptop kernel: [41888.591526] wlan0: direct probe to AP 00:04:ed:5b:b9:c9 try 1
Aug 15 02:23:21 cg-dell-laptop kernel: [41888.788243] wlan0: direct probe to AP 00:04:ed:5b:b9:c9 try 2
Aug 15 02:23:21 cg-dell-laptop kernel: [41888.988259] wlan0: direct probe to AP 00:04:ed:5b:b9:c9 try 3
Aug 15 02:23:21 cg-dell-laptop kernel: [41889.188243] wlan0: direct probe to AP 00:04:ed:5b:b9:c9 timed out
Aug 15 02:23:32 cg-dell-laptop kernel: [41899.971467] wlan0: authenticate with AP 00:04:ed:5b:b9:c9
Aug 15 02:23:32 cg-dell-laptop kernel: [41899.972371] wlan0: authenticate with AP 00:04:ed:5b:b9:c9
Aug 15 02:23:32 cg-dell-laptop kernel: [41900.172239] wlan0: authenticate with AP 00:04:ed:5b:b9:c9
Aug 15 02:23:32 cg-dell-laptop kernel: [41900.372240] wlan0: authenticate with AP 00:04:ed:5b:b9:c9
Aug 15 02:23:32 cg-dell-laptop kernel: [41900.572239] wlan0: authentication with AP 00:04:ed:5b:b9:c9 timed out
Aug 15 02:23:43 cg-dell-laptop kernel: [41911.293628] wlan0: authenticate with AP 00:04:ed:5b:b9:c9
Aug 15 02:23:43 cg-dell-laptop kernel: [41911.294394] wlan0: authenticate with AP 00:04:ed:5b:b9:c9
Aug 15 02:23:43 cg-dell-laptop kernel: [41911.492236...

Read more...

Revision history for this message
Christhonie (christhonie) wrote :

Redhat Bugzilla #487183 (https://bugzilla.redhat.com/show_bug.cgi?id=487183) reports a similar issue as failure case 2. Reported fixed in "kernel-PAE-2.6.29-0.176.rc6.git5.fc11.i686", so there is hope that (part of) this bug might be fixed in a later kernel.

Changed in linux:
status: Unknown → Fix Released
Revision history for this message
^_Pepe_^ (jose-angel-fernandez-freire) wrote :

Hi,

Please, can you tell us whether this is still an issue for you? It seem to be solved.

tags: added: jaunty
Changed in linux (Ubuntu):
status: New → Incomplete
Revision history for this message
Christhonie (christhonie) wrote :

Hi,

I no longer own the hardware mentioned in this post and are unable to confirm the current status. Sorry. Thanks for getting back to me in any way.

Revision history for this message
^_Pepe_^ (jose-angel-fernandez-freire) wrote :

Hi,
Thanks for your comments.

I'm updating as "Invalid" since you have no more access to that hardware.

Changed in linux (Ubuntu):
status: Incomplete → Invalid
Changed in linux:
importance: Unknown → Medium
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.