NetworkManager fails to finish association to AP

Bug #133966 reported by Tom Wood
This bug report is a duplicate of: Bug #50214: can't connect to hidden network.
Affects: network-manager (Ubuntu)
Status: Incomplete
Importance: Medium
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

Binary package hint: network-manager

With a completely updated Feisty install, NetworkManager fails to associate to my AP running WPA2. My network card is an Atheros-based AR5212 PCI NIC from Netgear (model WG311T).

This seems to be a timing/race condition problem whereby NetworkManager doesn't seem to understand that it can proceed with the association and the rest of its startup process. The reason I state this is that if I run "ifdown ath0" and immediately run "ifup ath0" at the "right time" after NetworkManager has started, NetworkManager will finish and ath0 comes up associated with the AP, dhcp'ed address and all. I can't clearly define "right time" at present. There seems to be a narrow window of opportunity to pull off this workaround, right after the "CTRL_IFACE monitor attached" log statement from NetworkManager appears. If you miss the window, restarting NetworkManager (via /etc/init.d/dbus restart and/or directly restarting NetworkManager when running it manually) and getting the timing just right with the ifdown/ifup pair will bring the network up.

I will be happy to provide any log files that you wish to see. Please indicate what I can send to help you understand this problem.

Items of note: I've already checked /etc/network/interfaces. I've commented out everything except ath0 and wifi0 entries. Those two are auto dhcp. I have no manually defined wpasupplicant.conf.
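
For anyone who wants to reproduce the workaround without racing the log by hand, here is a minimal sketch of the idea: watch NetworkManager's output for the "CTRL_IFACE monitor attached" line and fire the ifdown/ifup pair as soon as it appears. The marker text and the ath0 interface come from the report above; the function names are hypothetical, and whether reacting immediately actually lands inside the window is an assumption, not something verified here.

```python
import subprocess
from typing import Iterable, List

# Marker text quoted in the report; everything else here is illustrative.
MARKER = "CTRL_IFACE monitor attached"


def saw_marker(lines: Iterable[str], marker: str = MARKER) -> bool:
    """Consume log lines until one contains the marker; True if it was found."""
    for line in lines:
        if marker in line:
            return True
    return False


def workaround_commands(iface: str = "ath0") -> List[List[str]]:
    """The ifdown/ifup pair from the report, as argument lists."""
    return [["ifdown", iface], ["ifup", iface]]


def apply_workaround(lines: Iterable[str], iface: str = "ath0",
                     run=subprocess.run) -> bool:
    """Run ifdown/ifup right after the marker shows up in the log stream.

    `run` is injectable so the logic can be tried without touching a real
    interface. Returns False if the stream ended without the marker.
    """
    if not saw_marker(lines):
        return False
    for cmd in workaround_commands(iface):
        run(cmd, check=True)
    return True
```

One way to use it would be to pipe NetworkManager's output into a small script that calls apply_workaround(sys.stdin) as root. This only automates the manual trick; it does not explain why the window exists.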

Revision history for this message
Alexander Sack (asac) wrote :

Can you please test whether you see the same issue with Gutsy Tribe 5? If you don't want to upgrade, please try the live CD.

Thanks,

 - Alexander

Changed in network-manager:
importance: Undecided → Medium
status: New → Incomplete

Tom Wood (woodts) wrote :

I'll give it a shot with gutsy.

Thanks,

Tom

On 8/31/07, Alexander Sack <email address hidden> wrote:
>
> can you please test if you see the same issue with gutsy tribe 5? if you
> don't want to upgrade, please try the live-cd.
>
> Thanks,
>
> - Alexander

Tom Wood (woodts) wrote :

Same problem with gutsy tribe 5.

I'll attach the relevant portion of the daemon.log that shows the entire process stalling after stage 2 (device configure) in NetworkManager. However, dhclient fires right after this and before stage 3 (IP configure start) even starts. Unless I'm mistaken, the interface is not up at this point, at least not the IP stack, but dhclient is still started. The ifdown/ifup trick works to bring the interface truly up, so stages 3-5 finish in short order thereafter.

As always, I'll be glad to provide any more details you need.

Thanks!

Alexander Sack (asac) wrote :

On Fri, Aug 31, 2007 at 10:39:10PM -0000, Tom Wood wrote:
> Same problem with gutsy tribe 5.
>
> I'll attach the relevant portion of the daemon.log that shows the entire
> process stalling after stage 2 (device configure) in NetworkManager.
> However, dhclient fires right after this and before stage 3 (IP
> configure start) even starts. Unless I'm mistaken, the interface is not
> up at this point, at least not the IP stack, but dhclient is still
> started. The ifdown/ifup trick works to bring the interface truly up,
> so stages 3-5 finish in short order thereafter.
>
> As always, I'll be glad to provide any more details you need.
>

When you are at

Aug 31 22:12:48 ubuntu NetworkManager: <info> Activation (ath0) Stage 2 of 5 (Device Configure) complete.
Aug 31 22:12:50 ubuntu avahi-daemon[8042]: Registering new address record for fe80::218:4dff:fe7a:59c0 on ath0.*.

you should have about 10 seconds to capture output of iwconfig ath0

Can you do that please? Is it associated at that point? Do you see
more of relevance in /var/log/syslog at that point (or before/after)?

 - Alexander
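
To make the "is it associated?" check less dependent on eyeballing, the captured iwconfig output can be tested mechanically. The sketch below assumes the usual wireless-tools layout, where an unassociated card reports "Access Point: Not-Associated" and an associated one reports the AP's MAC address in the same field. The function name is hypothetical, and some drivers may format this field differently.

```python
import re
from typing import Optional

# Matches the "Access Point:" field printed by iwconfig (wireless-tools).
_AP_RE = re.compile(r"Access Point:\s*(\S+)")


def access_point(iwconfig_output: str) -> Optional[str]:
    """Return the AP address from iwconfig output, or None if not associated.

    None is also returned when the field is missing altogether, for
    example when the output belongs to a non-wireless interface.
    """
    match = _AP_RE.search(iwconfig_output)
    if match is None:
        return None
    value = match.group(1)
    if value == "Not-Associated":
        return None
    return value


# Possible use (needs wireless-tools installed; not executed here):
#   import subprocess
#   out = subprocess.run(["iwconfig", "ath0"],
#                        capture_output=True, text=True).stdout
#   print(access_point(out))
```

Running this once right after the "Stage 2 of 5 ... complete" line and once after the ifdown/ifup pair should show whether the association state changes between the two points.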

Tom Wood (woodts) wrote :

I'm not at that machine at the moment, but I can tell you that it's not
associated. The two "lights" on the NetworkManager applet are both gray.
As I gather, the bottom one indicates whether the NIC is associated to the
AP. And IIRC iwconfig ath0 doesn't indicate that it's associated either -
at least the error is consistent.

I'll double-check this and answer the rest of your questions when I get back
to that machine either later today or tomorrow.

Thanks for the help!

Tom

Tom Wood (woodts) wrote :

It's absolutely not associated. It hunts around for the correct channel and
does find the correct frequency, but it just doesn't go onward to stage 3
and finish the association, unless I time the ifdown/ifup just right.

There are no other clues in /var/log/syslog worth mentioning. It's the same
stuff that's in the daemon.log.

Thanks!

Tom

Alexander Sack (asac) wrote :

On Sun, Sep 02, 2007 at 03:30:58AM -0000, Tom Wood wrote:
> It's absolutely not associated. It hunts around for the correct channel and
> does find the correct frequency, but it just doesn't go onward to stage 3
> and finish the association, unless I time the ifdown/ifup just right.
>
> There are no other clues in /var/log/syslog worth mentioning. It's the same
> stuff that's in the daemon.log.
>

Sorry to ask again, but somehow the info I am looking for is not in
the daemon.log. Can you please run

 sudo killall NetworkManager

then start NetworkManager like this:

 NetworkManager --no-daemon 2>&1 | tee /tmp/nm.log

then try one association through nm-applet and wait until it gives
up. Then please attach /tmp/nm.log.

Thanks,

 - Alexander
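
Once /tmp/nm.log exists, a quick way to see where activation stalls is to pull out the last "Stage N of 5 ... complete" line for the interface. The sketch below is built only on the log line format quoted earlier in this thread; the function name is hypothetical, and other NetworkManager versions may word these messages differently.

```python
import re
from typing import Iterable, Optional, Tuple

# Based on the line quoted in this report:
#   "... <info> Activation (ath0) Stage 2 of 5 (Device Configure) complete."
_STAGE_RE = re.compile(
    r"Activation \((?P<iface>[^)]+)\) "
    r"Stage (?P<stage>\d+) of (?P<total>\d+) "
    r"\((?P<name>[^)]+)\) complete"
)


def last_completed_stage(lines: Iterable[str],
                         iface: str = "ath0") -> Optional[Tuple[int, str]]:
    """Return (stage number, stage name) of the last completed stage.

    Only lines for the given interface are considered. None means no
    stage-complete line was seen at all.
    """
    last = None
    for line in lines:
        match = _STAGE_RE.search(line)
        if match and match.group("iface") == iface:
            last = (int(match.group("stage")), match.group("name"))
    return last


# Possible use on the captured log (not executed here):
#   with open("/tmp/nm.log") as log:
#       print(last_completed_stage(log))
```

Comparing the result for a failing run against a run where the ifdown/ifup trick was applied would show how far each one gets.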

Tom Wood (woodts) wrote :

I will attach two NetworkManager log files. The first, nm-nowork.log, is the case where I wait out the initial failure. The second, nm.log, is where I issue the "ifdown ath0; ifup ath0" and get the network to come up properly.

Tom Wood (woodts) wrote :

Alexander Sack (asac) wrote :

do you have a chance to test this in gutsy?

Tom Wood (woodts) wrote :

Since you asked me to test this in gutsy, that's all I've been testing
on. All of my input since your request for gutsy has been from a gutsy
machine. I installed gutsy tribe 5 on the affected machine and have been
running updates against it every day. The problem still exists.

Thanks!

Tom

On Fri, 2007-09-07 at 10:37 +0000, Alexander Sack wrote:
> do you have a chance to test this in gutsy?
>

Alexander Sack (asac) wrote :

For the log ... is that a hidden AP you try to connect to?

Tom Wood (woodts) wrote :

Yeah, it's cloaked. I explicitly declare the SSID to nm-applet when
connecting. Search for "woodintlinc" - that's the SSID.

On 9/9/07, Alexander Sack <email address hidden> wrote:
>
> For the log ... is that a hidden AP you try to connect to?

Tom Wood (woodts) wrote :

Color me confused. This has now been marked as a dupe of bug # 50214. What I find most interesting is that with any other wireless NIC that I have, this is *not* a problem, just with this Atheros PCI-based NIC that has the same chipset in it as a PCMCIA-based NIC from the same manufacturer (Netgear WG311T is PCI and the PCMCIA is a WG511T). One Broadcom NIC that I have (Linksys WMP54GS) works without a hitch on this network. So do my Orinoco Classic Gold 802.11b-only PCMCIA NICs. And last, but not least, my Broadcom-based Compaq NX9010 (Broadcom BCM4306) works like a charm against this cloaked network under Feisty. Only the WG311T does not work without the ifup/ifdown.

Under Feisty and Gutsy, the only two versions I've tried, the WG311T requires the ifup/ifdown trick to work. Maybe that's a good clue to start figuring out what's wrong with NetworkManager, if indeed that is the culprit. To repeat, all my other wireless NICs work fine against my cloaked network under either Edgy or Feisty.

Thanks for looking into this, Alexander. It will be fantastic when we finally get to put this one to rest.

Matthew Carpenter (matt-eisgr) wrote :

On Monday 10 September 2007, Tom Wood wrote:
> *** This bug is a duplicate of bug 50214 ***
> https://bugs.launchpad.net/bugs/50214
>
> Color me confused. This has now been marked as a dupe of bug # 50214.
> What I find most interesting is that with any other wireless NIC that I
> have, this is *not* a problem, just with this Atheros PCI-based NIC that
> has the same chipset in it as a PCMCIA-based NIC from the same
> manufacturer (Netgear WG311T is PCI and the PCMCIA is a WG511T). One
> Broadcom NIC that I have (Linksys WMP54GS) works without a hitch on this
> network. So do my Orinoco Classic Gold 802.11b-only PCMCIA NICs. And
> last, but not least, my Broadcom-based Compaq NX9010 (Broadcom BCM4306)
> works like a charm against this cloaked network under Feisty. Only the
> WG311T does not work without the ifup/ifdown.
>
> Under Feisty and Gutsy, the only two versions I've tried, the WG311T
> requires the ifup/ifdown trick to work. Maybe that's a good clue to
> start figuring out what's wrong with NetworkManager, if indeed that is
> the culprit. To repeat, all my other wireless NICs work fine against my
> cloaked network under either Edgy or Feisty.
>
> Thanks for looking into this, Alexander. It will be fantastic when we
> finally get to put this one to rest.

I'm having this issue with an IPW3945 using WPA on Feisty, so this is not an
Atheros-only bug.
What's disturbing is that the bug has been around for a year and it is still
undecided how important it is!

Tom Wood (woodts) wrote :

Matthew,

I'm astounded by this whole experience. This doesn't seem to be an easy bug
to track down. I've spent a fair amount of time on this myself and don't
completely grok the problem, other than I suspect it is some sort of race
condition/timing error/insufficient necessary conditions checking between
iwconfig actually completing the AP association and ifup bringing up the
interface (told you I was confused). This would explain the differences in
the behavior of NetworkManager among the network cards - it's the drivers
for the cards themselves that differ in this situation. Therein lies the
big problem with wireless on Linux in general - too many cards with too many
different ways of bringing them to life. You've got everything from
completely open source drivers like the Ralink stuff to the Atheros closed
source HAL to the fwcutter requirement for the firmware for the Broadcom
cards to ndiswrapper for the mongrels. There are a lot of paths here that
may need to be explored and eventually need to be consolidated. Frankly,
I'm shocked that NetworkManager works as well as it does. That being said,
it's just not yet good enough - it doesn't "Just Work" (tm).

I'm probably going to uncloak my AP to get around this right now. With
tools like Kismet that's a pretty weak defense anyway. But I have clients
that still require cloaked APs as part of their approved security measures.

This issue really needs a lot more attention than it's been given over the
last year or so. I'm afraid that it's working well enough for most folks
that it's dropped off the priority list for the Powers That Be (tm). It
really needs to be pushed through to completion.

Tom

Matthew Carpenter (matt-eisgr) wrote :

Tom,

I completely agree. I've uncloaked my own AP, but that doesn't save me from
headaches as I visit client sites or various campuses at work.
