bcmwl-kernel-source fixes [Patch]

Bug #1478592 reported by Patric
This bug affects 3 people
Affects: bcmwl (Ubuntu)
Status: Confirmed
Importance: Undecided
Assigned to: Unassigned

Bug Description

The broadcom-wl driver has a few issues that have not been corrected in the Ubuntu package.
Supplied with this bug report is a monolithic patch that addresses some of them.

Summary/testing of patch:
- 2 days of local testing with associate/disassociate/scan between 2 access points (same model, running the same OpenWrt version, so this might not be a good test).
- Suspend/resume seems to work (no extensive testing there yet).
- Testing has been done on an MBP 11,3 with a BCM4360 card running Ubuntu 15.04 with kernel 4.1.3 (vanilla).
- Only the 4.x changes are new; the rest have been in use by several projects (Ubuntu/Gentoo/Arch Linux etc.) for some time.

Feel free to split it up into several patches. This has been a locally maintained repo that has drawn on several different sources, and that is the reason for the monolithic patch.

One issue that is left is the warning triggered by:
        if (WARN_ON(!bss))
                return;
in net/wireless/sme.c, but this does not seem to cause issues.

Tags: patch
Ubuntu Foundations Team Bug Bot (crichton) wrote :

The attachment "merged-patches.patch" seems to be a patch. If it isn't, please remove the "patch" flag from the attachment, remove the "patch" tag, and if you are a member of the ~ubuntu-reviewers, unsubscribe the team.

[This is an automated message performed by a Launchpad user owned by ~brian-murray, for any issues please contact him.]

tags: added: patch
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in bcmwl (Ubuntu):
status: New → Confirmed
Patric (pakar) wrote :

Update: There still seems to be a suspend/resume issue.

After each resume there is a chance that the wifi card stops working. I have not dug into this yet, but at least the system as a whole seems stable.

The behavior is that reception seems to stop working. It can transmit "EAPOL key" packets during the association requests (I can see the packets arriving on the router), but no replies are received.

It does manage to do scans via "wl dev wlan0 scan", so the radio part does seem to be working.

Patric (pakar) wrote : Re: [Bug 1478592] Re: bcmwl-kernel-source fixes [Patch]

Update:
Since manually doing suspend/resume is a bit tedious, I started to do
some load/unload cycles of wl.ko instead. After ~200 load/unload cycles
the wifi still works without problems.
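
The load/unload test described above could be scripted roughly as follows. This is only a sketch: the function name cycle_wl, the pause length and the overridable RMMOD/MODPROBE variables are my own assumptions, not something taken from the patch or the package.

```shell
#!/bin/sh
# Hypothetical stress test: repeatedly unload and reload wl.ko.
# RMMOD and MODPROBE can be overridden (e.g. with "echo") for a dry run.
RMMOD="${RMMOD:-/sbin/rmmod}"
MODPROBE="${MODPROBE:-/sbin/modprobe}"

# cycle_wl COUNT [PAUSE_SECONDS]
cycle_wl() {
    count="$1"
    pause="${2:-2}"
    i=1
    while [ "$i" -le "$count" ]; do
        echo "cycle $i/$count"
        # Stop at the first failure so the failing cycle is easy to spot.
        $RMMOD wl || return 1
        sleep "$pause"
        $MODPROBE wl || return 1
        sleep "$pause"
        i=$((i + 1))
    done
}

# Run only when a cycle count is given on the command line.
if [ "$#" -ge 1 ]; then
    cycle_wl "$@"
fi
```

Saved as, say, cycle_wl.sh (the name is arbitrary), it would be run as root with "sh cycle_wl.sh 200".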

I have not been able to get a failure with the script below for about 15
suspend/resume cycles, and it would be good if more people than me could
test it as a workaround.

A possible workaround, until the real issue has been resolved, could be
to create "/etc/pm/sleep.d/09_brcm_wl" with the following content
BYMMW:
#!/bin/sh

case "$1" in
    resume|thaw)
        date >>/var/log/brcm_wl.log
        echo "Loading wl module" >>/var/log/brcm_wl.log
        /sbin/modprobe wl
        ;;
    suspend|hibernate)
        date >>/var/log/brcm_wl.log
        echo "Removing wl module" >>/var/log/brcm_wl.log
        /sbin/rmmod wl
        # Allow network-manager and other things to finish before the
        # system is suspended. Got a few hiccups without it.
        sleep 5
        exit 0
        ;;
esac

On Tue, Jul 28, 2015 at 1:43 PM, ldc <email address hidden> wrote:
> Update: There seems to still be a suspend/resume issue.
>
> After each resume there is a chance that the wifi-card may stop working.
> Have not dug into this yet, but at least the system as a whole seems
> stable.
>
> Behavior is that reception seems to stop working. It can transmit (and I
> can see packets arriving on the router) "EAPOL key" packets during the
> association-requests but no replies are received.
>
> It does manage to do scans via "wl dev wlan0 scan" so the radio-part
> does seem to be working.

Andreas John (derjohn) wrote :

Hello,
I built the module with your patch. Thanks for that patch!

I work in an environment with a Cisco-based corporate Wi-Fi solution, which means that I roam often. My observation is that when I suspend, move to another building and resume, the "no traffic" effect occurs. Unlike in your test setup, I can be sure that the MAC (BSSID) is different, as the access point I was associated with is no longer in reach.

I just added the pm-script and will have a look at the effects. Are you sure that removing wl is enough? Could the cfg80211 layer have 'wrong MACs' stored?

rgds,
j

Patric (pakar) wrote :

Hi Andreas,

Thanks for the assistance. Which kernel version and wifi card model are you using? Does it behave the same on both 2.4 and 5 GHz?

For suspend/resume, the pm-script seems to make it more stable at least. When/if you hit the "no traffic" bug, it is enough to unload/load the module. It *may* take a few tries, but the link comes back. Since you now unload the driver on suspend, it should perform better than before.

During a full day of normal usage I lost my link once, so it's an improvement. Unload / wait a few seconds / load worked for me straight away today. Yesterday I had to reload it a few times, but then I did not wait at all between unload and load; it still managed to come back.

The main part of the patch is for 4.x support, but there are a couple of things to prevent complete system crashes. (I think it was in 3.18 or 3.19 that this behavior started.)

I'm currently trying to figure out a way to trigger the issue here in a reproducible way, to allow for easier debugging.

Andreas John (derjohn) wrote :

Hello,
yes, it supports 5 GHz and 2.4 GHz, and it's a hidden SSID. With your two access point setup, did you try to associate with one AP and then turn it off, or force the AP to disassociate the client?

About my chipset: I run a MacBook Pro ("rMBP") with the following Broadcom chipset (which is NOT supported by b43):

03:00.0 Network controller: Broadcom Corporation BCM4360 802.11ac Wireless Network Adapter (rev 03)
03:00.0 0280: 14e4:43a0 (rev 03)

I installed the suspend/resume script and haven't seen a "no traffic" issue yet, but I have to choose the wifi again in the KDE panel after resume, even though I was connected while suspending.

I'll keep you updated!

rgds,
derjohn

Patric (pakar) wrote :

Hi.

OK. So you are using Ubuntu 15.04? Remember that systemd does not run the /etc/pm/sleep.d scripts. I use the pm-suspend command to suspend, which does execute them. *If* you are using systemd, see the following:
http://askubuntu.com/questions/620494/ubuntu-15-04-suspend-doesnt-run-pm-suspend
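
For systemd-based suspend, the same unload/load idea could be expressed as a system-sleep hook. systemd-sleep runs executables found in its system-sleep directory (e.g. /lib/systemd/system-sleep/) with two arguments: "pre" or "post", followed by the sleep type. The sketch below is untested on this hardware; the function name, the log path handling and the overridable RMMOD/MODPROBE/WL_LOG variables are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical systemd system-sleep hook for the wl module.
# Called by systemd-sleep as: <script> pre|post suspend|hibernate|hybrid-sleep
RMMOD="${RMMOD:-/sbin/rmmod}"
MODPROBE="${MODPROBE:-/sbin/modprobe}"
WL_LOG="${WL_LOG:-/var/log/brcm_wl.log}"

wl_sleep_hook() {
    case "$1" in
        pre)
            # About to sleep: unload the driver.
            echo "$(date) Removing wl module ($2)" >>"$WL_LOG"
            $RMMOD wl
            ;;
        post)
            # Just woke up: load the driver again.
            echo "$(date) Loading wl module ($2)" >>"$WL_LOG"
            $MODPROBE wl
            ;;
    esac
    # Never report a failure back to systemd-sleep.
    return 0
}

wl_sleep_hook "$@"
```

The file needs to be executable. Unlike the pm-utils script, this sketch does not sleep before suspending; whether the 5 second delay is still needed here is something that would have to be tested.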

I'm also running an MBP Retina with the same card, so it should be easy to make sure we have the same issues.

I will upload a cleaner patch now. Feel free to try it out, but it should not change much compared to what you already have.

Patric (pakar) wrote :

Adding a cleaner version of the patch.

Issues remaining:
- Testing of the "no traffic" issue is ongoing, but it seems fairly stable.
- The "wl" module should be unloaded before suspend and loaded on resume. This makes it less prone to the "no traffic" bug. I do not know how or by whom that should be handled (with consideration for both systemd and the old /etc/pm way of doing things on suspend/resume).

Patric (pakar) wrote :

After a few days of testing the results are:
- During low network utilization I have seen the "no traffic" issue 2 times in 5 days. The issue can be "fixed" by an unload / wait 2 seconds / load of the wl module.
- Suspend/resume: with unloading/loading the module around every suspend/resume I have not seen any issues.
- No crashes or hangs have been seen during usage or at resume, unlike what can be experienced with the standard Ubuntu patches.
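
The "unload / wait / load" recovery, including the occasional need for a few tries, could be wrapped in a small helper like the one below. It is a sketch under assumptions: reload_wl and LINK_CHECK are made-up names, and LINK_CHECK defaults to "true" (no real check), so a suitable probe has to be supplied by whoever uses it.

```shell
#!/bin/sh
# Hypothetical recovery helper for the "no traffic" state.
RMMOD="${RMMOD:-/sbin/rmmod}"
MODPROBE="${MODPROBE:-/sbin/modprobe}"
# LINK_CHECK is a placeholder for a connectivity probe chosen by the user.
# The default "true" means "assume the link is back".
LINK_CHECK="${LINK_CHECK:-true}"

# reload_wl [MAX_TRIES] [WAIT_SECONDS]
reload_wl() {
    tries="${1:-3}"
    wait_s="${2:-2}"
    n=1
    while [ "$n" -le "$tries" ]; do
        $RMMOD wl
        sleep "$wait_s"
        $MODPROBE wl || return 1
        sleep "$wait_s"
        if $LINK_CHECK; then
            echo "link back after $n attempt(s)"
            return 0
        fi
        n=$((n + 1))
    done
    echo "link still down after $tries attempt(s)" >&2
    return 1
}
```

A real probe would probably need a longer wait than 2 seconds, since the card has to re-associate after the module is loaded before any traffic can pass.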

Andreas John (derjohn) wrote :

Hi,
Yesterday I ran into a "no traffic" situation which was not resolvable by unloading the wl module.

tcpdump on the wlan0 interface showed me only:

00:21:21.575404 EAPOL key (3) v2, len 117
00:21:21.865239 EAPOL key (3) v2, len 117
00:21:22.385278 EAPOL key (3) v2, len 11
...
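
For anyone wanting to reproduce a capture like the one above, something along these lines should limit tcpdump to EAPOL frames (EtherType 0x888e) and print the MAC addresses, which helps to see which BSSID is involved. The wrapper function and the TCPDUMP override are only illustrative assumptions.

```shell
#!/bin/sh
# Hypothetical wrapper: capture only EAPOL frames on a wireless interface.
TCPDUMP="${TCPDUMP:-tcpdump}"

# capture_eapol [INTERFACE]
capture_eapol() {
    iface="${1:-wlan0}"
    # -n: no name resolution, -e: print link-level (MAC) headers.
    $TCPDUMP -i "$iface" -n -e ether proto 0x888e
}
```

It has to be run as root, e.g. by sourcing the file and calling "capture_eapol wlan0".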

At the same time I was able to use the same (hotel) access point with my Android phone without trouble. After a reboot (and an update to 4.1.4), the wifi worked.

One observation: I remember the KDE network manager telling me the wifi was "WPA" while it was not able to connect. After the reboot, the working variant is WPA2-PSK. Maybe wpa_supplicant got wrong info from the driver (via cfg80211?).

rgds,
j

Andreas John (derjohn) wrote :

@ldc:
- I have now observed multiple times that killing wpa_supplicant and reconnecting to the wifi solved the "no traffic" issue.
- I can reproduce it quite reliably at the company by changing floors (and thus getting a different access point with the same SSID).

- Does that workaround work for your "no traffic" situations, too? Having looked into the code, I bet the WPA crypto is done with the wrong (old, pre-roaming) BSSID in the keys or the challenge response or whatever.
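
The wpa_supplicant workaround could look roughly like this on a NetworkManager-based desktop. This is a guess at one way to do it, not the exact steps used above: it assumes NetworkManager brings wpa_supplicant back by itself after it is killed, and the function name, pause and PKILL/NMCLI overrides are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: kill wpa_supplicant and ask NetworkManager to reconnect.
PKILL="${PKILL:-pkill}"
NMCLI="${NMCLI:-nmcli}"

# restart_supplicant [INTERFACE] [PAUSE_SECONDS]
restart_supplicant() {
    iface="${1:-wlan0}"
    pause="${2:-3}"
    # -x: match the process name exactly.
    $PKILL -x wpa_supplicant
    # Give NetworkManager a moment to notice and restart the supplicant.
    sleep "$pause"
    $NMCLI device connect "$iface"
}
```

If this clears the "no traffic" state without reloading wl, that would support the idea that stale key or BSSID state sits above the driver rather than in it.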
