iwlwifi causes total system freeze on certain WiFi speeds

Bug #1035889 reported by Hossein Atashi on 2012-08-12
This bug affects 2 people
Affects          Importance   Assigned to
linux (Ubuntu)   Medium       Unassigned
Precise          Undecided    Luis Henriques
Quantal          Medium       Unassigned

Bug Description

I've had an intermittent hard system freeze problem for 2 years. Today I was finally able to reproduce it.

I have a Dell Latitude E4300 laptop with an Intel WiFi Link 5100 wireless adaptor. Whenever I switch to the 11Mbps rate with

$ sudo iwconfig wlan0 rate 11M

the system freezes completely, and I cannot recover even with the Magic SysRq+REISUB key sequence. This is especially a problem because I suspect (but have no evidence) that automatic switching to this rate also triggers the freeze, which would explain why the problem is intermittent.

It is noteworthy that I can reproduce the same freeze on a Lenovo ThinkPad X230 with an Intel Ultimate-N 6300 wireless adaptor. Switching to 11Mbps results in a total system freeze on this laptop as well. I think the only relevant module common to both laptops is the iwlwifi driver.

I don't know whether this depends on my connection too or not, but here is my connection information just in case:

$ sudo iwconfig
lo no wireless extensions.

wlan0 IEEE 802.11abgn ESSID:"XXXXXXXXXXX"
          Mode:Managed Frequency:2.437 GHz Access Point: XX:XX:XX:XX:XX:XX
          Bit Rate=21.7 Mb/s Tx-Power=15 dBm
          Retry long limit:7 RTS thr:off Fragment thr:off
          Encryption key:off
          Power Management:off
          Link Quality=51/70 Signal level=-59 dBm
          Rx invalid nwid:0 Rx invalid crypt:0 Rx invalid frag:0
          Tx excessive retries:597 Invalid misc:2117 Missed beacon:0

eth0 no wireless extensions.
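
For comparing link state before and after a rate change, the counters in output like the above can be pulled out mechanically. The following is a minimal sketch, not part of the original report: the function name `parse_iwconfig` and the dictionary keys are made up for illustration, and the patterns assume the field labels appear exactly as in the output quoted above.

```python
import re

def parse_iwconfig(text):
    """Extract a few link statistics from iwconfig output for one interface.

    The key names below are hypothetical; only the field labels come from
    the iwconfig output quoted in this report.
    """
    patterns = {
        "bit_rate_mbps": (r"Bit Rate[=:]\s*([\d.]+)\s*Mb/s", float),
        "signal_dbm": (r"Signal level[=:]\s*(-?\d+)\s*dBm", int),
        "tx_excessive_retries": (r"Tx excessive retries[=:]\s*(\d+)", int),
        "invalid_misc": (r"Invalid misc[=:]\s*(\d+)", int),
    }
    fields = {}
    for name, (pattern, convert) in patterns.items():
        match = re.search(pattern, text)
        if match:  # fields that are absent are simply left out
            fields[name] = convert(match.group(1))
    return fields

# Lines copied from the wlan0 section of the report.
sample = """\
wlan0 IEEE 802.11abgn ESSID:"XXXXXXXXXXX"
          Mode:Managed Frequency:2.437 GHz Access Point: XX:XX:XX:XX:XX:XX
          Bit Rate=21.7 Mb/s Tx-Power=15 dBm
          Link Quality=51/70 Signal level=-59 dBm
          Tx excessive retries:597 Invalid misc:2117 Missed beacon:0
"""

stats = parse_iwconfig(sample)
print(stats)
```

Capturing these values on a working kernel and again just before switching rates would make it easier to tell whether the retry and "Invalid misc" counters are already climbing before the freeze.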

In case there is any more information I can provide to help triangulate the problem further, let me know.
---
AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.24.
ApportVersion: 2.0.1-0ubuntu12
Architecture: amd64
ArecordDevices:
 **** List of CAPTURE Hardware Devices ****
 card 0: Intel [HDA Intel], device 0: STAC92xx Analog [STAC92xx Analog]
   Subdevices: 1/1
   Subdevice #0: subdevice #0
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: hossein 1933 F.... pulseaudio
Card0.Amixer.info:
 Card hw:0 'Intel'/'HDA Intel at 0xf6adc000 irq 47'
   Mixer name : 'Intel Cantiga HDMI'
   Components : 'HDA:111d76b2,1028024d,00100302 HDA:80862802,80860101,00100000'
   Controls : 26
   Simple ctrls : 12
DistroRelease: Ubuntu 12.04
HibernationDevice: RESUME=UUID=279e54ac-43a3-4a55-a62e-7bb43680e038
InstallationMedia: Ubuntu 12.04 LTS "Precise Pangolin" - Release amd64 (20120425)
MachineType: Dell Inc. Latitude E4300
Package: linux (not installed)
ProcFB: 0 inteldrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.2.0-29-generic root=UUID=d2a40456-31e0-4564-b9a1-633089047264 ro quiet splash vt.handoff=7
ProcVersionSignature: Ubuntu 3.2.0-29.46-generic 3.2.24
RelatedPackageVersions:
 linux-restricted-modules-3.2.0-29-generic N/A
 linux-backports-modules-3.2.0-29-generic N/A
 linux-firmware 1.79
Tags: precise running-unity
Uname: Linux 3.2.0-29-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: adm cdrom debian-tor dialout dip lpadmin plugdev sambashare sudo
dmi.bios.date: 12/06/2011
dmi.bios.vendor: Dell Inc.
dmi.bios.version: A24
dmi.board.name: 0T296D
dmi.board.vendor: Dell Inc.
dmi.chassis.type: 8
dmi.chassis.vendor: Dell Inc.
dmi.modalias: dmi:bvnDellInc.:bvrA24:bd12/06/2011:svnDellInc.:pnLatitudeE4300:pvr:rvnDellInc.:rn0T296D:rvr:cvnDellInc.:ct8:cvr:
dmi.product.name: Latitude E4300
dmi.sys.vendor: Dell Inc.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1035889

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete

apport information

tags: added: apport-collected precise running-unity
description: updated

Changed in linux (Ubuntu):
status: Incomplete → Confirmed

Hossein Atashi (atashi.h) wrote :

I opened a terminal running "tail -f /var/log/kern.log" while switching to the 11Mbps speed. The following message appeared before the system froze (a few more messages followed, but this was the first one and I could not record them all):

[5969.661000] iwlwifi 0000:0c:00.0: Invalid tbl->lq_type 0
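
Since the later messages could not be recorded by hand, a saved copy of the log can be scanned for the first iwlwifi message of interest. This is a rough sketch under assumptions: the helper names `iwlwifi_messages` and `first_occurrence` are hypothetical, and the regular expression only assumes the line shape quoted above (bracketed timestamp, driver name, PCI address, message).

```python
import re

# Line shape as quoted above: "[<seconds>] iwlwifi <pci-address>: <message>".
# search() is used so a syslog prefix in front of the bracket is tolerated.
LINE_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s+iwlwifi\s+(\S+):\s+(.*)")

def iwlwifi_messages(lines):
    """Yield (timestamp, device, message) for every iwlwifi kernel log line."""
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            yield float(match.group(1)), match.group(2), match.group(3).strip()

def first_occurrence(lines, needle):
    """Return the first iwlwifi entry whose message contains needle, or None."""
    for entry in iwlwifi_messages(lines):
        if needle in entry[2]:
            return entry
    return None

# The only log line available in this report; a real run would read kern.log.
log = ["[5969.661000] iwlwifi 0000:0c:00.0: Invalid tbl->lq_type 0"]

hit = first_occurrence(log, "Invalid tbl->lq_type")
print(hit)
```

In practice the list would come from something like `open("/var/log/kern.log")`, read after rebooting from the freeze, assuming the messages reached the disk before the machine locked up.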

Joseph Salisbury (jsalisbury) wrote :

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v3.5 kernel[0] (not a kernel in the daily directory) and install both the linux-image and linux-image-extra .deb packages.

Once you've tested the upstream kernel, please remove the 'needs-upstream-testing' tag. Please only remove that one tag and leave the other tags. This can be done by clicking on the yellow pencil icon next to the tag located at the bottom of the bug description and deleting the 'needs-upstream-testing' text.

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

If you are unable to test the mainline kernel, for example it will not boot, please add the tag: 'kernel-unable-to-test-upstream'.
Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.6-rc1-quantal/

Changed in linux (Ubuntu):
importance: Undecided → Medium
tags: added: needs-upstream-testing
Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Hossein Atashi (atashi.h) wrote :

Thanks for the instructions. I tried kernel 3.6-rc1-quantal and it DOES fix the problem. At least for the 4-5 times that I tried, it did not cause any system freezes, while the normal kernel 3.2.0-29 freezes within a few seconds. If there is anything I can do, let me know.

tags: added: kernel-fixed-upstream
removed: needs-upstream-testing
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Luis Henriques (henrix) wrote :

I took a look at recent modifications to the iwlwifi driver and there was one commit that caught my attention:

50e2a30cf6fcaeb2d27360ba614dd169a10041c5 "iwlwifi: disable greenfield transmissions as a workaround"

This commit seems to be a workaround for a bug in this driver that occurs in some situations when performing rate scaling.

I've built a test kernel containing a backport of this commit to the Precise kernel and uploaded it here:

http://people.canonical.com/~henrix/lp1035889/v1/amd64/

Hossein, can you please try this kernel and see if it makes any difference? Thanks.

Hossein Atashi (atashi.h) wrote :

Thank you Luis. I just tried your kernel. The problem still exists, and changing to 11Mbps causes a system freeze within seconds. If there is anything else I can do, just let me know.

Thanks for your support.

Luis Henriques (henrix) wrote :

Hi Hossein, so the next step is to try to find out the last "bad" kernel and the first "good" kernel. To do this, please go to:

http://kernel.ubuntu.com/~kernel-ppa/mainline/

(don't forget to uninstall tested kernels, otherwise you'll fill up your disk :) )

You can start with v3.5-quantal, v3.5-rc7-quantal, v3.5-rc6-quantal, etc. Let me know the results, so that we can try to figure out where this problem was fixed. Thanks.
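
Walking backwards one build at a time works, but the same search can be done in fewer reboots by halving the range each time. The sketch below is only an illustration of that idea: the version list is a short, hand-picked subset of kernel tags rather than the real contents of the mainline directory, and `pretend_test` with its `PRETEND_FIXED_FROM` placeholder stands in for the manual step of booting a build and running the iwconfig test.

```python
def first_good(versions, is_good):
    """Binary-search for the first build where is_good() returns True.

    Assumes versions are ordered oldest to newest, the oldest build is bad,
    the newest build is good, and the fix stays in once it has landed.
    """
    lo, hi = 0, len(versions) - 1  # invariant: versions[lo] bad, versions[hi] good
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(versions[mid]):
            hi = mid
        else:
            lo = mid
    return hi

# Illustrative subset of kernel tags, oldest first (not a full listing).
versions = ["v3.2", "v3.3", "v3.4-rc1", "v3.4",
            "v3.5-rc6", "v3.5-rc7", "v3.5", "v3.6-rc1"]

tested = []
PRETEND_FIXED_FROM = "v3.4-rc1"  # placeholder answer so the example can run

def pretend_test(version):
    """Stand-in for: install the build, reboot, switch to 11M, see if it freezes."""
    tested.append(version)
    return versions.index(version) >= versions.index(PRETEND_FIXED_FROM)

idx = first_good(versions, pretend_test)
print("last bad:", versions[idx - 1], "first good:", versions[idx])
print("builds tested:", tested)
```

With eight candidate builds this needs three tests instead of up to seven, which matters when every test involves installing a kernel and rebooting.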

Hossein Atashi (atashi.h) wrote :

OK, I think I have narrowed it down. The problem exists in "v3.3.8-quantal", while it is fixed in "v3.4-rc1-precise" and "v3.4-precise". So I think the commit that fixed it must lie between these two versions.
Even in the fixed kernels there are some messages (warnings/errors) in "/var/log/kern.log" after switching to 11Mbps, but then a hardware reset is requested and the device recovers. So it doesn't result in a freeze (at least based on what I understood from the messages).

If there is anything else I can do, just let me know.

Luis Henriques (henrix) wrote :

Thanks for testing. Could you please upload the (complete) kernel logs containing the messages you referred to in the previous comment?

Hossein Atashi (atashi.h) wrote :

I have attached full syslog and kern.log files as well as the errors/warnings generated after switching wifi speed to 11 Mbps. Please let me know if there is anything I can do to help.

Luis Henriques (henrix) wrote :

Hi Hossein,

Sorry for taking so long to get back to you. I was able to reproduce this issue (I got a laptop with a similar wireless card), and I believe I found the commit that fixes it:

2655e314c4b204966008689eaf3e87ba1f38d55c iwlwifi: trace debug messages

Apparently, the issue is caused by an overflow of debug messages that leaves the system in an unusable state.

I've uploaded a new kernel that I would like you to test, to see if it fixes the issue:

http://people.canonical.com/~henrix/lp1035889/v2/amd64/

You will still see a few debug messages, but as far as I could understand they should be harmless. Please let me know if this works for you.

Hossein Atashi (atashi.h) wrote :

Hi Luis,

Thank you very much. It does indeed seem to fix the problem. I have switched to 11Mbps and have been working for a few minutes now without any freeze.

Thank you very much again.

Luis Henriques (henrix) on 2012-09-03
Changed in linux (Ubuntu):
status: Confirmed → Triaged
Tim Gardner (timg-tpi) on 2012-09-04
Changed in linux (Ubuntu Quantal):
status: Triaged → Fix Released
Changed in linux (Ubuntu Precise):
status: New → Fix Committed
assignee: nobody → Luis Henriques (henrix)
Luis Henriques (henrix) wrote :

This bug is awaiting verification that the kernel for Precise in -proposed solves the problem (3.2.0-31.50). Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-precise' to 'verification-done-precise'.

If verification is not done by one week from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-precise
Hossein Atashi (atashi.h) wrote :

Thank you very much, I tested the package in -proposed and it works flawlessly. I updated the tags accordingly.

tags: added: verification-done-precise
removed: verification-needed-precise
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 3.2.0-31.50

---------------
linux (3.2.0-31.50) precise-proposed; urgency=low

  [Luis Henriques]

  * Release Tracking Bug
    - LP: #1047242

  [ Dave Airlie ]

  * SAUCE: drm/vmwgfx: add MODULE_DEVICE_TABLE so vmwgfx loads at boot
    - LP: #1039157

  [ Kamal Mostafa ]

  * SAUCE: input: Cypress PS/2 Trackpad move PSMOUSE_CYPRESS enum
    - LP: #1041594

linux (3.2.0-31.49) precise-proposed; urgency=low

  [Luis Henriques]

  * Release Tracking Bug
    - LP: #1046216

  [ Cypress Semiconductor Corporation ]

  * SAUCE: input: Cypress PS/2 Trackpad mouse driver
    - LP: #978807
  * SAUCE: input: Cypress PS/2 Trackpad link driver into psmouse-base
    - LP: #978807

  [ Ike Panhc ]

  * [Config] Enable CONFIG_DEVPTS_MULTIPLE_INSTANCES for highbank
    - LP: #1038259

  [ Kamal Mostafa ]

  * SAUCE: input: Cypress PS/2 Trackpad code style cleanup
    - LP: #978807
  * SAUCE: input: Cypress PS/2 Trackpad eliminate dead code
    - LP: #978807
  * SAUCE: input: Cypress PS/2 Trackpad fix no-config stubs
    - LP: #978807
  * SAUCE: input: Cypress PS/2 Trackpad set default debug_level=0
    - LP: #978807

  [ Stefan Bader ]

  * Revert "SAUCE: fix pv-ops for legacy Xen"
    - LP: #1044550
  * SAUCE: Force xsave off on older Xen hypervisors
    - LP: #1044550

  [ Tim Gardner ]

  * [Config] Add smsc{79}5xx to nic-usb-modules
    - LP: #1041397

  [ Upstream Kernel Changes ]

  * Revert "samsung-laptop: make the dmi check less strict"
    - LP: #1028151
  * rds: set correct msg_namelen
    - LP: #1031112
    - CVE-2012-3430
  * bnx2: Fix bug in bnx2_free_tx_skbs().
    - LP: #1039087
  * sch_sfb: Fix missing NULL check
    - LP: #1039087
  * sctp: Fix list corruption resulting from freeing an association on a
    list
    - LP: #1039087
  * caif: Fix access to freed pernet memory
    - LP: #1039087
  * cipso: don't follow a NULL pointer when setsockopt() is called
    - LP: #1039087
  * caif: fix NULL pointer check
    - LP: #1039087
  * wanmain: comparing array with NULL
    - LP: #1039087
  * tcp: Add TCP_USER_TIMEOUT negative value check
    - LP: #1039087
  * USB: kaweth.c: use GFP_ATOMIC under spin_lock
    - LP: #1039087
  * net: fix rtnetlink IFF_PROMISC and IFF_ALLMULTI handling
    - LP: #1039087
  * tcp: perform DMA to userspace only if there is a task waiting for it
    - LP: #1039087
  * net/tun: fix ioctl() based info leaks
    - LP: #1039087
  * e1000: add dropped DMA receive enable back in for WoL
    - LP: #1039087
  * rtlwifi: rtl8192cu: Change buffer allocation for synchronous reads
    - LP: #1039087
  * hfsplus: fix overflow in sector calculations in hfsplus_submit_bio
    - LP: #1039087
  * drm/i915: fixup seqno allocation logic for lazy_request
    - LP: #1039087
  * mac80211: cancel mesh path timer
    - LP: #1039087
  * ath9k: Add PID/VID support for AR1111
    - LP: #1039087
  * ARM: mxs: Remove MMAP_MIN_ADDR setting from mxs_defconfig
    - LP: #1039087
  * ALSA: hda - add dock support for Thinkpad T430s
    - LP: #1039087
  * cfg80211: process pending events when unregistering net device
    - LP: #1039087
  * rt61pci: fix NULL pointer dereference in config_lna_gain
    - LP: #...

Changed in linux (Ubuntu Precise):
status: Fix Committed → Fix Released

The verification of this Stable Release Update has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates, please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.
