8086:008a iwlagn performance with Wireless-N 1030

Bug #919579 reported by markdv77 on 2012-01-21
This bug affects 4 people
Affects: linux (Ubuntu)
Importance: Medium
Assigned to: Unassigned

Bug Description

I have a Dell XPS17 laptop with an "Intel Corporation Centrino Wireless-N 1030 (rev 34)"
controller and have had problems with the wireless connection since day 1 with 11.10.

After a clean 11.10 install the performance of the wireless connection would degrade to
unusable levels fairly quickly. Symptoms were slow "b/g" transfer rates,
even though iwconfig would show "N" connection speeds, and high latencies causing
problems with DNS lookups.

"Google told me" to try to disable "N" by loading the module with "11n_disable=1"
and/or try the latest firmware. I tried both. The new firmware didn't help at all.
Disabling "N" speeds helped, but I was still plagued by latencies sometimes going
up into the 200 ms+ range, making the network sluggish. Reloading the iwlagn module
would restore the connection temporarily.
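
A minimal sketch of the reload workaround described above, assuming the driver is unloaded with `modprobe -r` and reloaded with the `11n_disable=1` option mentioned in the report. The helper names `reload_commands` and `reload_driver` are made up for illustration; actually running the commands needs root and briefly drops the wireless link.

```python
import subprocess

MODULE = "iwlagn"  # driver name taken from the report


def reload_commands(disable_11n):
    """Return the modprobe invocations that unload and reload the driver."""
    load = ["modprobe", MODULE]
    if disable_11n:
        load.append("11n_disable=1")  # module option quoted in the report
    return [["modprobe", "-r", MODULE], load]


def reload_driver(disable_11n=True):
    """Run the commands in order; requires root privileges."""
    for cmd in reload_commands(disable_11n):
        subprocess.run(cmd, check=True)


# Only print the commands here; call reload_driver() to execute them.
for cmd in reload_commands(disable_11n=True):
    print(" ".join(cmd))
```

Building the command list separately from running it makes it possible to inspect what would be executed before touching the live connection.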

[ The firmware version I currently have installed is from
[ http://www.intellinuxwireless.org/iwlwifi/downloads/iwlwifi-6000g2b-ucode-17.168.5.2.tgz
[ Which contains version 17.168.5.2 build 35905.
[ The one in the current linux-firmware is 17.168.5.1 build 33993.

Today I decided to try the latest proposed kernel hoping the iwlagn changes it
contains would fix the problem. Unfortunately not.

The transfer rate still drops to "b/g" speeds quickly during the first large network
transfer after reboot or iwlagn reload.
The only improvement is that even though the transfer rate drops, the latency remains
good.

I can reproduce the problem - i.e. trigger the degradation to "b/g" speeds - reliably just
by performing an scp of a file from my home server/router to the laptop. I've ruled out
problems on the server-side or other network elements by performing the same transfer
and other tests with other wireless devices on the same network.

I can clearly see it happening by running a ping to the server while running the scp.
Below is what I observe.

I start pinging my home router/server/etc (linux box).

Sat Jan 21 10:10:14 2012: 64 bytes from ... icmp_req=59 ttl=64 time=2.35 ms
Sat Jan 21 10:10:15 2012: 64 bytes from ... icmp_req=60 ttl=64 time=2.42 ms
Sat Jan 21 10:10:16 2012: 64 bytes from ... icmp_req=61 ttl=64 time=2.41 ms
Sat Jan 21 10:10:17 2012: 64 bytes from ... icmp_req=62 ttl=64 time=2.41 ms
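
Timestamped ping output of the kind shown above could be produced roughly as follows. This is only a sketch, not necessarily how the log in this report was generated; the function names and the `host` parameter are assumptions for illustration.

```python
import subprocess
from datetime import datetime


def stamp(line, now):
    """Prefix one line of ping output with a ctime-style timestamp."""
    return f"{now.strftime('%a %b %d %H:%M:%S %Y')}: {line.rstrip()}"


def timestamped_ping(host):
    """Yield stamped ping lines for `host` until ping exits."""
    with subprocess.Popen(["ping", host], stdout=subprocess.PIPE, text=True) as proc:
        for line in proc.stdout:
            yield stamp(line, datetime.now())


# Offline demonstration with a fixed clock; no network access needed.
print(stamp("64 bytes from ... icmp_req=59 ttl=64 time=2.35 ms\n",
            datetime(2012, 1, 21, 10, 10, 14)))
```

Note that the timestamp records when the reply was printed, not when the request was sent, which matters when reading the stalled replies later in the log.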

I scp a file from router to laptop. Initially the transfer rate is around 7 MB/s.
Ping times go up a little, but this is to be expected:

Sat Jan 21 10:10:18 2012: 64 bytes from ... icmp_req=63 ttl=64 time=3.77 ms
Sat Jan 21 10:10:19 2012: 64 bytes from ... icmp_req=64 ttl=64 time=7.75 ms
Sat Jan 21 10:10:20 2012: 64 bytes from ... icmp_req=65 ttl=64 time=8.43 ms
Sat Jan 21 10:10:21 2012: 64 bytes from ... icmp_req=66 ttl=64 time=12.0 ms
Sat Jan 21 10:10:22 2012: 64 bytes from ... icmp_req=67 ttl=64 time=13.6 ms
Sat Jan 21 10:10:23 2012: 64 bytes from ... icmp_req=68 ttl=64 time=14.7 ms
Sat Jan 21 10:10:24 2012: 64 bytes from ... icmp_req=69 ttl=64 time=20.7 ms
Sat Jan 21 10:10:25 2012: 64 bytes from ... icmp_req=70 ttl=64 time=24.7 ms
Sat Jan 21 10:10:26 2012: 64 bytes from ... icmp_req=71 ttl=64 time=39.3 ms
Sat Jan 21 10:10:27 2012: 64 bytes from ... icmp_req=72 ttl=64 time=54.9 ms
Sat Jan 21 10:10:28 2012: 64 bytes from ... icmp_req=73 ttl=64 time=83.5 ms
Sat Jan 21 10:10:29 2012: 64 bytes from ... icmp_req=74 ttl=64 time=105 ms

Ping times already started to increase near the end there. And then suddenly
scp tells me the transfer is "-stalled-". The ping also stops. And 16 seconds
later everything resumes. Note that apparently no sent packets were lost,
just delayed for 16 seconds:

Sat Jan 21 10:10:46 2012: 64 bytes from ... icmp_req=75 ttl=64 time=16404 ms
Sat Jan 21 10:10:46 2012: 64 bytes from ... icmp_req=76 ttl=64 time=15397 ms
Sat Jan 21 10:10:46 2012: 64 bytes from ... icmp_req=77 ttl=64 time=14390 ms
Sat Jan 21 10:10:46 2012: 64 bytes from ... icmp_req=78 ttl=64 time=13383 ms
Sat Jan 21 10:10:46 2012: 64 bytes from ... icmp_req=79 ttl=64 time=12375 ms
Sat Jan 21 10:10:46 2012: 64 bytes from ... icmp_req=80 ttl=64 time=11367 ms
Sat Jan 21 10:10:47 2012: 64 bytes from ... icmp_req=81 ttl=64 time=10360 ms
Sat Jan 21 10:10:47 2012: 64 bytes from ... icmp_req=82 ttl=64 time=9353 ms
Sat Jan 21 10:10:47 2012: 64 bytes from ... icmp_req=83 ttl=64 time=8345 ms
Sat Jan 21 10:10:47 2012: 64 bytes from ... icmp_req=84 ttl=64 time=7338 ms
Sat Jan 21 10:10:47 2012: 64 bytes from ... icmp_req=85 ttl=64 time=6330 ms
Sat Jan 21 10:10:47 2012: 64 bytes from ... icmp_req=86 ttl=64 time=5323 ms
Sat Jan 21 10:10:47 2012: 64 bytes from ... icmp_req=87 ttl=64 time=4317 ms
Sat Jan 21 10:10:47 2012: 64 bytes from ... icmp_req=88 ttl=64 time=3312 ms
Sat Jan 21 10:10:47 2012: 64 bytes from ... icmp_req=89 ttl=64 time=2305 ms
Sat Jan 21 10:10:47 2012: 64 bytes from ... icmp_req=90 ttl=64 time=1298 ms
Sat Jan 21 10:10:47 2012: 64 bytes from ... icmp_req=91 ttl=64 time=291 ms
Sat Jan 21 10:10:47 2012: 64 bytes from ... icmp_req=92 ttl=64 time=91.6 ms
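
The "delayed, not lost" reading of the log above can be checked with a little arithmetic. The sketch below parses the first few stalled replies (copied from the log, with the timestamp prefix dropped) and prints how much the round-trip time falls from one reply to the next. The helper names are made up for illustration.

```python
import re

SAMPLE = """\
icmp_req=75 ttl=64 time=16404 ms
icmp_req=76 ttl=64 time=15397 ms
icmp_req=77 ttl=64 time=14390 ms
icmp_req=78 ttl=64 time=13383 ms
icmp_req=79 ttl=64 time=12375 ms
"""

PATTERN = re.compile(r"icmp_req=(\d+) .*time=([\d.]+) ms")


def parse_rtts(text):
    """Return (sequence number, round-trip time in ms) for each ping reply."""
    return [(int(m.group(1)), float(m.group(2))) for m in PATTERN.finditer(text)]


def rtt_drops(replies):
    """Return how much the RTT falls from each reply to the next, in ms."""
    return [a[1] - b[1] for a, b in zip(replies, replies[1:])]


replies = parse_rtts(SAMPLE)
print(rtt_drops(replies))  # [1007.0, 1007.0, 1007.0, 1008.0]
```

Each reply's RTT is about one second shorter than the previous one, which matches ping's default one-second send interval. That is what one would expect if the requests kept going out on schedule and all the replies came back in one burst once the stall ended.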

At this point the transfer is resumed but at a much lower rate. Even though
iwconfig still shows the connection at "N" speeds the actual transfer rate
is more "b/g"-ish. The ping times have increased compared to the first part
of the transfer:

Sat Jan 21 10:10:48 2012: 64 bytes from ... icmp_req=93 ttl=64 time=68.3 ms
Sat Jan 21 10:10:49 2012: 64 bytes from ... icmp_req=94 ttl=64 time=64.0 ms
Sat Jan 21 10:10:50 2012: 64 bytes from ... icmp_req=95 ttl=64 time=85.0 ms
Sat Jan 21 10:10:51 2012: 64 bytes from ... icmp_req=96 ttl=64 time=115 ms
Sat Jan 21 10:10:52 2012: 64 bytes from ... icmp_req=97 ttl=64 time=133 ms
Sat Jan 21 10:10:53 2012: 64 bytes from ... icmp_req=98 ttl=64 time=156 ms
Sat Jan 21 10:10:55 2012: 64 bytes from ... icmp_req=99 ttl=64 time=97.4 ms
Sat Jan 21 10:10:55 2012: 64 bytes from ... icmp_req=100 ttl=64 time=205 ms

I abort the transfer. Notice how the ping times are now _lower_ than before.
Interestingly they are the same as when I load the module with "11n_disable=1":

Sat Jan 21 10:10:56 2012: 64 bytes from ... icmp_req=101 ttl=64 time=1.15 ms
Sat Jan 21 10:10:57 2012: 64 bytes from ... icmp_req=102 ttl=64 time=1.13 ms
Sat Jan 21 10:10:58 2012: 64 bytes from ... icmp_req=103 ttl=64 time=1.15 ms
Sat Jan 21 10:10:59 2012: 64 bytes from ... icmp_req=104 ttl=64 time=1.36 ms
Sat Jan 21 10:11:00 2012: 64 bytes from ... icmp_req=105 ttl=64 time=1.13 ms
Sat Jan 21 10:11:01 2012: 64 bytes from ... icmp_req=106 ttl=64 time=2.92 ms

If I restart the transfer at this point it completes without incident at around
2 MB/s. So despite what iwconfig tells me the device seems to operate in "b/g"
mode since the stall.
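
As a rough sanity check on the "N" versus "b/g" reading, the two transfer rates quoted in this report can be compared against the nominal maximum PHY rates of 802.11b (11 Mbit/s) and 802.11g (54 Mbit/s). This is a sketch under the assumption that 1 MB means 10**6 bytes; real-world throughput is always well below the nominal PHY rate, so the check is only an upper bound.

```python
# Nominal maximum PHY rates in Mbit/s for the legacy modes.
MAX_PHY_MBIT = {"b": 11.0, "g": 54.0}


def to_mbit(mbyte_per_s):
    """Convert MB/s to Mbit/s, assuming 1 MB = 10**6 bytes."""
    return mbyte_per_s * 8.0


def legacy_modes_able_to_carry(mbyte_per_s):
    """List the legacy modes whose nominal PHY rate is at least the observed rate."""
    rate = to_mbit(mbyte_per_s)
    return [mode for mode, cap in MAX_PHY_MBIT.items() if cap >= rate]


for observed in (7.0, 2.0):  # MB/s figures quoted in the report
    print(observed, to_mbit(observed), legacy_modes_able_to_carry(observed))
# 7.0 56.0 []
# 2.0 16.0 ['g']
```

The initial 7 MB/s exceeds even the nominal 802.11g rate, so it is only achievable at "N" speeds, while 2 MB/s fits within what a "g" link could carry.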

Other than the decreased performance I haven't noticed any other problems.
Before the upgrade to the latest (proposed) kernel the connection would
be nearly unusable after something like this. Latencies on all packets would
be in the 300-400 ms range, making browsing a terrible experience.

So the latest kernel did fix some issues, but unfortunately not the lapse from N to b/g
performance.

ProblemType: Bug
DistroRelease: Ubuntu 11.10
Package: linux-image-3.0.0-15-server 3.0.0-15.25
ProcVersionSignature: Ubuntu 3.0.0-15.25-server 3.0.13
Uname: Linux 3.0.0-15-server x86_64
NonfreeKernelModules: nvidia
AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.24.
ApportVersion: 1.23-0ubuntu4
Architecture: amd64
ArecordDevices:
 **** List of CAPTURE Hardware Devices ****
 card 0: PCH [HDA Intel PCH], device 0: ALC665 Analog [ALC665 Analog]
   Subdevices: 1/1
   Subdevice #0: subdevice #0
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC1: markdv 6696 F.... pulseaudio
 /dev/snd/controlC0: markdv 6696 F.... pulseaudio
CRDA: Error: [Errno 2] No such file or directory
Card0.Amixer.info:
 Card hw:0 'PCH'/'HDA Intel PCH at 0xd7500000 irq 52'
   Mixer name : 'Realtek ALC665'
   Components : 'HDA:10ec0665,10280570,00100003'
   Controls : 23
   Simple ctrls : 12
Card1.Amixer.info:
 Card hw:1 'NVidia'/'HDA NVidia at 0xd6000000 irq 17'
   Mixer name : 'Nvidia GPU 15 HDMI/DP'
   Components : 'HDA:10de0015,10de0101,00100100'
   Controls : 16
   Simple ctrls : 4
Date: Sat Jan 21 11:32:05 2012
InstallationMedia: Ubuntu-Server 11.10 "Oneiric Ocelot" - Release amd64 (20111011)
MachineType: Dell Inc. Dell System XPS L702X
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.0.0-15-server root=UUID=36901702-186b-490a-91e0-5d35fa3e6f32 ro nodmraid
RelatedPackageVersions:
 linux-restricted-modules-3.0.0-15-server N/A
 linux-backports-modules-3.0.0-15-server N/A
 linux-firmware 1.60
SourcePackage: linux
StagingDrivers: mei
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 11/11/2011
dmi.bios.vendor: Dell Inc.
dmi.bios.version: A14
dmi.board.asset.tag: Base Board Asset Tag
dmi.board.name: 03RG89
dmi.board.vendor: Dell Inc.
dmi.board.version: FAB1
dmi.chassis.type: 8
dmi.chassis.vendor: Dell Inc.
dmi.chassis.version: 0.1
dmi.modalias: dmi:bvnDellInc.:bvrA14:bd11/11/2011:svnDellInc.:pnDellSystemXPSL702X:pvr:rvnDellInc.:rn03RG89:rvrFAB1:cvnDellInc.:ct8:cvr0.1:
dmi.product.name: Dell System XPS L702X
dmi.sys.vendor: Dell Inc.

markdv77 (markdv77) wrote :
Brad Figg (brad-figg) on 2012-01-21
Changed in linux (Ubuntu):
status: New → Confirmed
Joseph Salisbury (jsalisbury) wrote :

Would it be possible for you to test the latest upstream kernel? It will allow additional upstream developers to examine the issue. Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . If possible, please test the latest v3.2 kernel[1] (not a kernel in the daily directory). Once you've tested the upstream kernel, please remove the 'needs-upstream-testing' tag (only that one tag; please leave the other tags). This can be done by clicking on the yellow pencil icon next to the tag located at the bottom of the bug description and deleting the 'needs-upstream-testing' text.

If this bug is fixed by the mainline kernel, please add the following tag 'kernel-fixed-upstream-KERNEL-VERSION'. For example, if kernel version 3.2-rc1 fixed the issue, the tag would be: 'kernel-fixed-upstream-v3.2-rc1'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

If you are unable to test the mainline kernel, for example it will not boot, please add the tag: 'kernel-unable-to-test-upstream'. If you believe this bug does not require upstream testing, please add the tag: 'kernel-upstream-testing-not-needed'.

Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[1] http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.2-precise/
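
The tagging convention described in the comment above can be summarised as a small lookup. This is only an illustrative sketch: the function name and the outcome keys ("fixed", "exists", "unable", "not-needed") are made up here, while the tag strings themselves are the ones quoted in the comment.

```python
def upstream_test_tag(outcome, kernel_version=None):
    """Return the Launchpad tag for an upstream-kernel test outcome.

    `outcome` is one of the hypothetical keys "fixed", "exists",
    "unable" or "not-needed"; `kernel_version` is needed only for "fixed".
    """
    if outcome == "fixed":
        if not kernel_version:
            raise ValueError("a kernel version such as '3.2-rc1' is required")
        return f"kernel-fixed-upstream-v{kernel_version}"
    fixed_tags = {
        "exists": "kernel-bug-exists-upstream",
        "unable": "kernel-unable-to-test-upstream",
        "not-needed": "kernel-upstream-testing-not-needed",
    }
    return fixed_tags[outcome]


print(upstream_test_tag("fixed", "3.2-rc1"))  # kernel-fixed-upstream-v3.2-rc1
print(upstream_test_tag("exists"))            # kernel-bug-exists-upstream
```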

Changed in linux (Ubuntu):
importance: Undecided → Medium
tags: added: kernel-da-key needs-upstream-testing
Changed in linux (Ubuntu):
status: Confirmed → Incomplete
markdv77 (markdv77) on 2012-01-25
tags: added: kernel-bug-exists-upstream
removed: needs-upstream-testing
markdv77 (markdv77) wrote :

Tested with http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.2-precise/linux-image-3.2.0-030200-generic_3.2.0-030200.201201042035_amd64.deb

The transfer still stalls for ~16 seconds at times. One difference is that the link speed is not limited to b/g speeds when it recovers but resumes at N speed. (Which allows it to stall multiple times during the same 600M iso I used for testing.)

So, problem is the same, just the manner in which it 'recovers' is slightly different.

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
markdv77 (markdv77) wrote :

Thanks, that's great... #836250 is such a mess I'm sure it'll never be figured out.

summary: - iwlagn performance with Wireless-N 1030
+ 8086:008a iwlagn performance with Wireless-N 1030