rt2x00 random stalls in AP Mode - 12.04 Beta 1

Bug #965043 reported by Ted Strong
This bug affects 1 person
Affects: linux (Ubuntu)
Status: Expired
Importance: Medium
Assigned to: Unassigned
Milestone: (none)

Bug Description

After upgrading from 11.10 to 12.04 Beta 1, the rt2x00 wireless driver stalls quite often in AP mode.

I am running hostapd 0.7.3. Clients are able to connect and browse small web pages; however, if a client attempts to download a large file over wifi, the connection stalls. Other clients are unaffected.

There is nothing displayed in dmesg or syslog regarding the issue. After a stall, the client must wait or re-associate with the AP to re-activate the connection.
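
Since nothing is logged, one way to make such a stall visible is to record the client's cumulative received byte count about once per second during the download and look for long flat stretches. A minimal sketch, assuming the samples are collected separately; `find_stalls` and the 5-second threshold are hypothetical and not part of hostapd or rt2x00:

```python
def find_stalls(samples, min_stall_seconds=5.0):
    """Return (start, end) time pairs during which no new bytes arrived.

    samples: list of (timestamp_seconds, cumulative_bytes), sorted by time.
    """
    stalls = []
    stall_start = None  # time of the first sample in the current flat stretch
    for (t_prev, b_prev), (t_cur, b_cur) in zip(samples, samples[1:]):
        if b_cur == b_prev:
            # No progress between these two samples.
            if stall_start is None:
                stall_start = t_prev
        else:
            # Progress resumed; keep the flat stretch only if it was long enough.
            if stall_start is not None and t_prev - stall_start >= min_stall_seconds:
                stalls.append((stall_start, t_prev))
            stall_start = None
    # A stall that is still ongoing when sampling stops.
    if stall_start is not None and samples[-1][0] - stall_start >= min_stall_seconds:
        stalls.append((stall_start, samples[-1][0]))
    return stalls
```

Short pauses below the threshold are ignored, so ordinary TCP hiccups are not reported as stalls.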

AP mode with the rt2x00 driver was working fine in 11.10.

There are some recent posts regarding the issue, and a patch was released (http://www.spinics.net/lists/linux-wireless/msg86199.html). However, this patch seems to already be included in the Ubuntu kernel source, yet the problem still persists.
---
AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.24.
ApportVersion: 1.95-0ubuntu1
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: wrostek 5064 F.... pulseaudio
DistroRelease: Ubuntu 12.04
HibernationDevice: RESUME=UUID=c2cba4ed-595d-4cec-970b-2f5c26cf3b75
InstallationMedia: Ubuntu 10.04 LTS "Lucid Lynx" - Release amd64 (20100429)
MachineType: System manufacturer System Product Name
NonfreeKernelModules: nvidia
Package: linux (not installed)
ProcEnviron:
 TERM=xterm
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB: 0 VESA VGA
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-3.2.0-20-generic root=UUID=3472dad6-5116-46ce-bec2-774b0478eff0 ro nomodeset
ProcVersionSignature: Ubuntu 3.2.0-20.32-generic 3.2.12
PulseList: Error: command ['pacmd', 'list'] failed with exit code 1: No PulseAudio daemon running, or not running as session daemon.
RfKill:
 0: phy0: Wireless LAN
  Soft blocked: no
  Hard blocked: no
Tags: precise
Uname: Linux 3.2.0-20-generic x86_64
UpgradeStatus: Upgraded to precise on 2012-03-25 (1 days ago)
UserGroups: audio mythtv pulse pulse-access
dmi.bios.date: 03/25/2010
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 0806
dmi.board.asset.tag: To Be Filled By O.E.M.
dmi.board.name: P7P55D-E
dmi.board.vendor: ASUSTeK Computer INC.
dmi.board.version: Rev 1.xx
dmi.chassis.asset.tag: Asset-1234567890
dmi.chassis.type: 3
dmi.chassis.vendor: Chassis Manufacture
dmi.chassis.version: Chassis Version
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr0806:bd03/25/2010:svnSystemmanufacturer:pnSystemProductName:pvrSystemVersion:rvnASUSTeKComputerINC.:rnP7P55D-E:rvrRev1.xx:cvnChassisManufacture:ct3:cvrChassisVersion:
dmi.product.name: System Product Name
dmi.product.version: System Version
dmi.sys.vendor: System manufacturer

Revision history for this message
Wolfgang Kufner (wolfgangkufner) wrote :

Changing target package to linux (the rt2x00 drivers are part of the linux kernel). This separate old rt2x00 package is just a red herring misdirecting the unsuspecting bug reporter. It does not seem to do anything useful anymore (see https://launchpad.net/ubuntu/+source/rt2x00/+publishinghistory ).

affects: rt2x00 (Ubuntu) → linux (Ubuntu)
Revision history for this message
Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 965043

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: precise
Revision history for this message
Ted Strong (tstrong) wrote : apport information

Attached in separate comments: .etc.asound.conf.txt, AcpiTables.txt, AlsaDevices.txt, AplayDevices.txt, ArecordDevices.txt, BootDmesg.txt, CRDA.txt, Card0.Amixer.info.txt, Card0.Amixer.values.txt, Card0.Codecs.codec.0.txt, Card1.Amixer.info.txt, Card1.Amixer.values.txt, CurrentDmesg.txt, IwConfig.txt, Lspci.txt, Lsusb.txt, PciMultimedia.txt, ProcCpuinfo.txt, ProcInterrupts.txt, ProcModules.txt, RelatedPackageVersions.txt, UdevDb.txt, UdevLog.txt, WifiSyslog.txt

tags: added: apport-collected
description: updated

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
penalvch (penalvch)
tags: added: regression-release
tags: added: amd64
Revision history for this message
penalvch (penalvch) wrote :

Ted Strong, thank you for reporting this and helping make Ubuntu better. The next step is to perform a kernel bisection to find out which commit caused this regression. Could you please do so following https://wiki.ubuntu.com/Kernel/KernelBisection ?
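
For readers unfamiliar with the procedure, bisection is a binary search over the commit history between a known-good and a known-bad kernel. A toy model, assuming a linear history in which a single commit introduced the regression; `first_bad_commit` and `is_bad` are hypothetical names, and in practice `is_bad` means building and booting a kernel at that commit and trying to reproduce the stall:

```python
def first_bad_commit(commits, is_bad):
    """Binary-search an ordered commit list for the first bad commit.

    commits: oldest to newest; commits[0] is known good, commits[-1] known bad.
    is_bad:  callable that tests one commit and returns True if the bug shows.
    Returns (first_bad, number_of_tests_run).
    """
    good, bad = 0, len(commits) - 1
    tests = 0
    # Invariant: commits[good] is good and commits[bad] is bad.
    while bad - good > 1:
        mid = (good + bad) // 2
        tests += 1
        if is_bad(commits[mid]):
            bad = mid
        else:
            good = mid
    return commits[bad], tests
```

Each round halves the remaining range, so even a thousand candidate commits (an illustrative figure) need only about ten build-and-test cycles.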

Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Revision history for this message
Ted Strong (tstrong) wrote :

After testing out the various commits in drivers/net/wireless/rt2x00, I can tell you it is this one causing the problem:

commit 8a3a3c85e44d58f5af0adac74a0b866ba89a1978
Author: Eliad Peller <email address hidden>
Date: Sun Oct 2 10:15:52 2011 +0200

    mac80211: pass vif param to conf_tx() callback

    tx params should be configured per interface.
    add ieee80211_vif param to the conf_tx callback,
    and change all the drivers that use this callback.

    The following spatch was used:
    @rule1@
    struct ieee80211_ops ops;
    identifier conf_tx_op;
    @@
        ops.conf_tx = conf_tx_op;

    @rule2@
    identifier rule1.conf_tx_op;
    identifier hw, queue, params;
    @@
        conf_tx_op (
    - struct ieee80211_hw *hw,
    + struct ieee80211_hw *hw, struct ieee80211_vif *vif,
                u16 queue,
                const struct ieee80211_tx_queue_params *params) {...}

    Signed-off-by: Eliad Peller <email address hidden>
    Signed-off-by: John W. Linville <email address hidden>

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Revision history for this message
Brad Figg (brad-figg) wrote : Test with newer development kernel (3.2.0-20.33)

Thank you for taking the time to file a bug report on this issue.

However, given the number of bugs that the Kernel Team receives during any development cycle it is impossible for us to review them all. Therefore, we occasionally resort to using automated bots to request further testing. This is such a request.

We have noted that there is a newer version of the development kernel than the one you last tested when this issue was found. Please test again with the newer kernel and indicate in the bug if this issue still exists or not.

You can update to the latest development kernel by simply running the following commands in a terminal window:

    sudo apt-get update
    sudo apt-get dist-upgrade

If the bug still exists, change the bug status from Incomplete to Confirmed. If the bug no longer exists, change the bug status from Incomplete to Fix Released.

If you want this bot to quit automatically requesting kernel tests, add a tag named: bot-stop-nagging.

 Thank you for your help, we really do appreciate it.

Changed in linux (Ubuntu):
status: Confirmed → Incomplete
tags: added: kernel-request-3.2.0-20.33
Revision history for this message
Ted Strong (tstrong) wrote :

Sorry, it looks like the problem is occurring again.

I don't think the patch I listed above was the problem; it must have been introduced earlier than this.

I will take some additional time to do more extensive testing.

penalvch (penalvch)
tags: added: bot-stop-nagging
Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

@Ted, did you perform a bisect to identify commit 8a3a3c85e44d58f5af0adac74a0b866ba89a1978 as the root cause? If so, did you build a test kernel with commit 8a3a3c85e44d58f5af0adac74a0b866ba89a1978 reverted and still hit the bug?

Changed in linux (Ubuntu):
importance: Undecided → Medium
status: Incomplete → Confirmed
Revision history for this message
Ted Strong (tstrong) wrote :

Sorry, no. I was only compiling the rt2x00 module, not the entire kernel.

I am starting a proper full kernel bisect now, and will report the results here when finished...

Thanks

Revision history for this message
Ted Strong (tstrong) wrote :

The kernel bisect is finally complete.

I found the problem was caused by this commit:

commit f0425beda4d404a6e751439b562100b902ba9c98
Author: Felix Fietkau <email address hidden>
Date: Sun Aug 28 21:11:01 2011 +0200

    mac80211: retry sending failed BAR frames later instead of tearing down aggr

    Unfortunately failed BAR tx attempts happen more frequently than I
    expected, and the resulting aggregation teardowns cause performance
    issues, as the aggregation session does not always get re-established
    properly.
    Instead of tearing down the entire aggr session, we can simply store the
    SSN of the last failed BAR tx attempt, wait for the first successful
    tx status event, and then send another BAR with the same SSN.

    Signed-off-by: Felix Fietkau <email address hidden>
    Cc: Helmut Schaa <email address hidden>
    Signed-off-by: John W. Linville <email address hidden>
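
The behaviour described in that commit message can be pictured with a toy model. This is only an illustration of the idea as worded above, not the mac80211 code; the class, its method names, and the `send_bar` callback are all hypothetical:

```python
class BarRetryModel:
    """Toy model: remember a failed BAR and resend it after the next good tx."""

    def __init__(self, send_bar):
        self.send_bar = send_bar    # hypothetical callback that transmits a BAR for an SSN
        self.failed_bar_ssn = None  # SSN of the last failed BAR, if any

    def on_bar_tx_status(self, ssn, acked):
        # A failed BAR does not tear the aggregation session down;
        # only its SSN is stored for a later retry.
        if not acked:
            self.failed_bar_ssn = ssn

    def on_data_tx_status(self, acked):
        # The first successful tx status triggers a new BAR with the same SSN.
        if acked and self.failed_bar_ssn is not None:
            ssn, self.failed_bar_ssn = self.failed_bar_ssn, None
            self.send_bar(ssn)
```

In this model the retry depends entirely on a successful tx status arriving for the station; if none is reported, the stored BAR is never resent.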

Upon further investigation, I have found some discussion on the same issue:

http://marc.info/?l=linux-wireless&m=132534063601847&w=2

And a possible patch to the rt2x00 driver to fix the issue:

http://permalink.gmane.org/gmane.linux.drivers.rt2x00.user/569

I will apply the patch to my rt2x00 drivers later today and report back my findings...

penalvch (penalvch)
tags: removed: kernel-request-3.2.0-20.33
tags: added: kernel-da-key
Revision history for this message
Ted Strong (tstrong) wrote :

I have tested the patch from http://permalink.gmane.org/gmane.linux.drivers.rt2x00.user/569. However, it causes Ubuntu to freeze/crash.

Now looking for other possible patches...

Revision history for this message
penalvch (penalvch) wrote :

Ted Strong, if you could also please test the latest upstream kernel available that would be great. It will allow additional upstream developers to examine the issue. Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Once you've tested the upstream kernel, please remove the 'needs-upstream-testing' tag. This can be done by clicking on the yellow pencil icon next to the tag located at the bottom of the bug description and deleting the 'needs-upstream-testing' text. Please let us know your results.

Thanks in advance.

tags: added: bisect-done
tags: added: needs-upstream-testing
Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for linux (Ubuntu) because there has been no activity for 60 days.]

Changed in linux (Ubuntu):
status: Incomplete → Expired