
10ec:8171 r8192se_pci + powertop (iwpriv -a) = kernel panic

Reported by Alex Wauck on 2010-05-26
This bug affects 17 people
Affects: linux (Ubuntu) | Importance: Medium | Assigned to: Unassigned
Affects: Lucid | Importance: Medium | Assigned to: Tim Gardner

Bug Description

Steps to reproduce:
1. Make sure r8192se_pci is loaded (I don't know if RTL8192SE hardware is necessary).
2. Run powertop as root.
3. BOOM!

ProblemType: Bug
DistroRelease: Ubuntu 10.04
Package: linux-image-2.6.32-22-generic 2.6.32-22.33
Regression: No
Reproducible: Yes
ProcVersionSignature: Ubuntu 2.6.32-22.33-generic 2.6.32.11+drm33.2
Uname: Linux 2.6.32-22-generic x86_64
AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.21.
AplayDevices:
 **** List of PLAYBACK Hardware Devices ****
 card 0: SB [HDA ATI SB], device 0: ALC269 Analog [ALC269 Analog]
   Subdevices: 1/1
   Subdevice #0: subdevice #0
Architecture: amd64
ArecordDevices:
 **** List of CAPTURE Hardware Devices ****
 card 0: SB [HDA ATI SB], device 0: ALC269 Analog [ALC269 Analog]
   Subdevices: 1/1
   Subdevice #0: subdevice #0
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: alex 1716 F.... kmix
CRDA: Error: [Errno 2] No such file or directory
Card0.Amixer.info:
 Card hw:0 'SB'/'HDA ATI SB at 0xfbbf8000 irq 16'
   Mixer name : 'Realtek ALC269'
   Components : 'HDA:10ec0269,104383ce,00100004'
   Controls : 12
   Simple ctrls : 7
Date: Wed May 26 11:53:18 2010
InstallationMedia: Kubuntu 10.04 LTS "Lucid Lynx" - Release amd64 (20100427)
MachineType: ASUSTeK Computer INC. 1201T
ProcCmdLine: BOOT_IMAGE=/vmlinuz-2.6.32-22-generic root=UUID=e550d359-f2fd-47d9-a0f8-737b8ee2ffd6 ro quiet splash
ProcEnviron:
 LANGUAGE=
 LANG=en_US.UTF-8
 SHELL=/bin/bash
RelatedPackageVersions: linux-firmware 1.34
RfKill:
 0: hci0: Bluetooth
  Soft blocked: no
  Hard blocked: no
SourcePackage: linux
StagingDrivers: r8192_pci
Title: [STAGING]
dmi.bios.date: 02/02/2010
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 0317
dmi.board.asset.tag: To Be Filled By O.E.M.
dmi.board.name: 1201T
dmi.board.vendor: ASUSTeK Computer INC.
dmi.board.version: x.xx
dmi.chassis.asset.tag: 0x00000000
dmi.chassis.type: 10
dmi.chassis.vendor: ASUSTeK Computer INC.
dmi.chassis.version: x.x
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr0317:bd02/02/2010:svnASUSTeKComputerINC.:pn1201T:pvrx.x:rvnASUSTeKComputerINC.:rn1201T:rvrx.xx:cvnASUSTeKComputerINC.:ct10:cvrx.x:
dmi.product.name: 1201T
dmi.product.version: x.x
dmi.sys.vendor: ASUSTeK Computer INC.

Alex Wauck (awauck) wrote :

This bug also occurs when I use the mainline 2.6.34 kernel with the latest r8192se_pci driver from Realtek's website. Therefore, I believe the r8192se_pci module is ultimately responsible.

Alex Wauck (awauck) wrote :

The r8192se_pci driver seems a bit flaky in general. I would appreciate it if someone familiar with wireless driver code would take a quick look at the r8192se_pci code and tell me if it is, in fact, a disgusting collection of kludges, as I am beginning to suspect that it is.

Jeremy Foshee (jeremyfoshee) wrote :

Hi Alex,

If you could also please test the latest upstream kernel available that would be great. It will allow additional upstream developers to examine the issue. Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Once you've tested the upstream kernel, please remove the 'needs-upstream-testing' tag. This can be done by clicking on the yellow pencil icon next to the tag located at the bottom of the bug description and deleting the 'needs-upstream-testing' text. Please let us know your results.

Thanks in advance.

    [This is an automated message. Apologies if it has reached you inappropriately; please just reply to this message indicating so.]

tags: added: kj-triage
Changed in linux (Ubuntu):
status: New → Incomplete
Alex Wauck (awauck) wrote :

As I said, this problem still occurs with the 2.6.34 mainline kernel and the latest r8192se_pci from Realtek. I have a sneaking suspicion that the root cause is that the code is crap.

tags: removed: needs-upstream-testing
Alex Wauck (awauck) on 2010-06-02
Changed in linux (Ubuntu):
status: Incomplete → New
Alex Wauck (awauck) wrote :

I have attached the backtrace when I ran powertop on the mainline 2.6.35-rc1 kernel with latest r8192se_pci from Realtek.

Matt Price (matt-price) wrote :

Alex, any movement on this? e.g. is there a new upstream rtl8192se driver? looks like lp:592745 might be a duplicate, so it's affecting others as well. thanks, matt

Alex Wauck (awauck) wrote :

There is indeed a new version. I am trying it now.

Alex Wauck (awauck) wrote :

Same problem with the newer driver (0017.0507.2010) on 2.6.32. Corruption is now at ffffffff8145ffbd. fglrx 10.6 is in use.

ZachG (zgold550) wrote :

(Thanks matt for forwarding me this bug link)

I am also affected by this. I am using stock open source ati driver. Is there any other information I can provide or anything I can do to help debug this? A quick google doesn't reveal any good sources for other versions of the driver for this card.

Alex Wauck (awauck) wrote :

This bug is also present with 2.6.35-rc3 and r8192se_pci 0017.0507.2010 (one firmware file changed in this version; I made sure that I am using that one).

Matt Price (matt-price) wrote :

I have also checked with the recent update, 0017.0525.2010. The error persists there as well. Currently corresponding with Roger at Realtek who has been open so far.

Alex Wauck (awauck) wrote :

Is there any effort to get this code upstream? I strongly suspect the code could use a thorough review by the kernel devs.

When you say upstream, do you mean moving the module to open source &
kernel inclusion? i really doubt they have any interest. Though perhaps
if someone who actually knows something spoke with them, the Realtek
people would make some progress.

As far as I can tell, the module is open source. It's just not part of the mainline kernel. I think we should push all open-source drivers that aren't part of the mainline kernel to aim for inclusion. Even if it doesn't get accepted, the kernel devs should at least provide some direction for improvement, so it can be improved and included in the future.

ZachG (zgold550) wrote :

Matt, what has your conversation with Realtek been so far? Can you get them to bring the discussion into the open (perhaps even here) so others can pick their brains and maybe help come up with patches? (I myself would be interested in taking a peek if somebody could provide a good starting direction.)

Matt Price (matt-price) wrote :

Zach, sorry for the delay. so far no real help from realtek though they are at least responding, and I think they want to help. I'm attaching the tarball they sent me with a new version -- it makes no difference to me but might make some difference to others. They say they can't reproduce the bug with powertop, though that seems unlikely to me. Anyway I have asked them to subscribe to this bug and am also writing to the powertop list asking for pointers... this is pretty frustrating but maybe it will eventually get resolved.

Here are the instructions they sent me:

Please find the latest RTL8191SE Linux driver source in the attachment and
try again.
You should clear previous drive or inbox driver first after you install this
driver source.
The previous driver will be stored within
/lib/modules/2.6.XXX/kernel/driver/staging.
Please remove "r8192se_pci.ko" files by following command.
1. sudo su (you should input you root password after it)
2. find /lib/modules/ -name "r8192se_*.ko" -exec ls -l {} \;
3. find /lib/modules/ -name "r8192se_*.ko" -exec rm {} \;

After, You could execute "find /lib/modules/ -name "r8192se_*.ko" -exec ls -l {} \;" to confirm it's clear properly.
And then, install this driver source as below steps. Don't forget to extract
this package before you install it.
1. sudo su
2. make
3. make install
4. reboot
5. ./wlan0up or ./wlan1up

---------------------

they also suggested installing the new firmware before building the module -- this is easy, just move the old /lib/firmware/RTL8192SE folder somewhere safe, and replace it with the firmware/RTL8192SE folder you find in the tarball after extraction.

hope that helps..

Matt Price (matt-price) wrote :

i was able to get an strace from powertop (logging in by ssh from a screen session on a remote computer) and attach the log here. it looks like the issue is triggered by "iwpriv -a" -- here's what looks like the relevant part of the log (at the end):

[pid 2077] execve("/bin/sh", ["sh", "-c", "/sbin/iwpriv -a 2> /dev/null"], [/* 19 vars */] <unfinished ...>
[pid 2064] mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f884e1ea000
[pid 2064] read(4, <unfinished ...>
[pid 2077] <... execve resumed> ) = 0
[pid 2077] brk(0) = 0x2590000
[pid 2077] access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
[pid 2077] mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5d2ef07000
[pid 2077] access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
[pid 2077] open("/etc/ld.so.cache", O_RDONLY) = 4
[pid 2077] fstat(4, {st_mode=S_IFREG|0644, st_size=131558, ...}) = 0
[pid 2077] mmap(NULL, 131558, PROT_READ, MAP_PRIVATE, 4, 0) = 0x7f5d2eee6000
[pid 2077] close(4) = 0
[pid 2077] access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
[pid 2077] open("/lib/libc.so.6", O_RDONLY) = 4
[pid 2077] read(4, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0`\355\1\0\0\0\0\0"..., 832) = 832
[pid 2077] fstat(4, {st_mode=S_IFREG|0755, st_size=1572232, ...}) = 0
[pid 2077] mmap(NULL, 3680296, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 4, 0) = 0x7f5d2e966000
[pid 2077] mprotect(0x7f5d2eae0000, 2093056, PROT_NONE) = 0
[pid 2077] mmap(0x7f5d2ecdf000, 20480, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 4, 0x179000) = 0x7f5d2ecdf000
[pid 2077] mmap(0x7f5d2ece4000, 18472, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f5d2ece4000
[pid 2077] close(4) = 0
[pid 2077] mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5d2eee5000
[pid 2077] mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5d2eee4000
[pid 2077] mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5d2eee3000
[pid 2077] arch_prctl(ARCH_SET_FS, 0x7f5d2eee4700) = 0
[pid 2077] mprotect(0x7f5d2ecdf000, 16384, PROT_READ) = 0
[pid 2077] mprotect(0x617000, 4096, PROT_READ) = 0
[pid 2077] mprotect(0x7f5d2ef09000, 4096, PROT_READ) = 0
[pid 2077] munmap(0x7f5d2eee6000, 131558) = 0
[pid 2077] getpid() = 2077
[pid 2077] rt_sigaction(SIGCHLD, {SIG_DFL, [CHLD], SA_RESTORER|SA_RESTART, 0x7f5d2e999af0}, {SIG_DFL, [], 0}, 8) = 0
[pid 2077] geteuid() = 0
[pid 2077] brk(0) = 0x2590000
[pid 2077] brk(0x25b1000) = 0x25b1000
[pid 2077] getppid() = 2064
[pid 2077] stat("/root", {st_mode=S_IFDIR|0700, st_size=4096, ...}) = 0
[pid 2077] stat(".", {st_mode=S_IFDIR|0700, st_size=4096, ...}) = 0
[pid 2077] rt_sigaction(SIGINT, NULL, {SIG_DFL, [], 0}, 8) = 0
[pid 2077] rt_sigaction(SIGINT, {0x408189, ~[RTMIN RT_1], SA_RESTORER, 0x7f5d2e999af0}, NULL, 8) = 0
[pid 2077] rt_sigaction(SIGQUIT, NULL, {SIG_DFL, [], 0}, 8) = 0
...


Kees Cook (kees) on 2010-07-03
Changed in linux (Ubuntu):
status: New → Confirmed
importance: Undecided → Medium
assignee: nobody → Canonical Kernel Team (canonical-kernel-team)
Matt Price (matt-price) wrote :

and finally, the strace of iwpriv -a itself:

root@roke:~# strace -f iwpriv -a
execve("/sbin/iwpriv", ["iwpriv", "-a"], [/* 19 vars */]) = 0
brk(0) = 0xd2b000
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f61867ac000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=131558, ...}) = 0
mmap(NULL, 131558, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f618678b000
close(3) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/lib/libiw.so.30", O_RDONLY) = 3
read(3,

that's all i get when running iwpriv remotely. is that at all informative?

Alex Wauck (awauck) wrote :

Matt, I wonder if you're not getting part of the strace output because the kernel panic happens before the data gets sent over the wire? Can you try running strace in a non-X vt (i.e. ctrl-alt-f1, then run strace)?

ZachG (zgold550) wrote :

1) I can confirm everything Matt has said so far on my machine. iwpriv -a is definitely the cause of the problem.
2) Obviously I can't copy and paste from a non-X vt on a machine that has had a kernel panic, so I took a photo instead (attached).

3) It looks like it crashes on an ioctl (surprise, surprise). Sadly the exact ioctl isn't named. It shows me ioctl(0x8bec), which looks as if memory somehow got corrupted and it's passing a bad ioctl. There is another suspicious ioctl before the crash without a macro mapping, 0x8be5. I'm going to try playing with some things, such as:

making iwpriv a noop by converting to a bash script and seeing if that makes the system stable.

Snooping around with iwpriv a bit more, maybe try some gdb break points, see if I can narrow this down further and come up with a patch (to either iwpriv or the driver)
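
For readers trying to place the two unnamed ioctl numbers: the Wireless Extensions header (linux/wireless.h) reserves 0x8BE0 through 0x8BFF for driver-private ioctls, which would explain why strace prints no symbolic name for them. A minimal Python sketch that decodes a number into its private slot; `private_slot` is a made-up helper for illustration, not a kernel or wireless-tools API:

```python
# Range reserved for driver-private ioctls in linux/wireless.h.
SIOCIWFIRSTPRIV = 0x8BE0
SIOCIWLASTPRIV = 0x8BFF


def private_slot(cmd):
    """Return the driver-private slot index for cmd, or None if cmd
    lies outside the private ioctl range.

    Hypothetical helper for illustration only.
    """
    if SIOCIWFIRSTPRIV <= cmd <= SIOCIWLASTPRIV:
        return cmd - SIOCIWFIRSTPRIV
    return None


if __name__ == "__main__":
    # The two numbers seen in the strace photo.
    for cmd in (0x8BE5, 0x8BEC):
        print(hex(cmd), "-> private slot", hex(private_slot(cmd)))
```

Under that reading, 0x8bec is a valid driver-private command (slot 0xc) rather than a corrupted value, though which handler the driver registers in that slot has to be read from the driver source.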

ZachG (zgold550) wrote :

(For all the people who found this thread by googling)
### REALLY HACKY SOLUTION WHICH WILL (hopefully) ENABLE YOU TO USE r8192 WITHOUT CRASHING IN LINUX ###

The usual disclaimer about these kinds of hacks applies -- I am not responsible yada yada, do this at your own risk yada yada, worked for me, not guaranteed to work for you yada yada.

Open a terminal (On Ubuntu 10.04 aka Lucid Lynx: Applications Menu -> system tools -> terminal)
1) sudo mv /sbin/iwpriv /sbin/iwpriv.old
2) echo '#!/bin/bash' | sudo tee /sbin/iwpriv > /dev/null
3) sudo chmod +x /sbin/iwpriv

NOTES:
1) I did this on my machine, and was still able to connect to and use wifi normally.
2) Obviously this will likely break any application which depends on the iwpriv command, and also makes iwpriv unusable for you on the command line (unless you use iwpriv.old)
3) Powertop now runs fine without crashing with wifi enabled and running
4) Other applications may still do the same thing as iwpriv which can cause the crash. This fix does *NOT* address the root cause of the problem. I'm still going to look for a real fix for the underlying problem with the driver.

At Sun, 04 Jul 2010 02:08:57 -0000,
Alex Wauck wrote:
>
> Matt, I wonder if you're not getting part of the strace output because
> the kernel panic happens before the data gets sent over the wire? Can
> you try running strace in a non-X vt (i.e. ctrl-alt-f1, then run
> strace)?

I'm assuming zach's results are good enough for this? They seem more helpful.

Ok this driver only seems to exist in the Ubuntu kernel in Lucid and later, creating appropriate tasks.

Changed in linux (Ubuntu Lucid):
importance: Undecided → Medium
status: New → Confirmed
summary: - [STAGING] r8192se_pci + powertop = kernel panic
+ r8192se_pci + powertop (iwpriv -a) = kernel panic
Changed in linux (Ubuntu Lucid):
assignee: nobody → Canonical Kernel Team (canonical-kernel-team)
tags: added: kernel-candidate kernel-wireless

You do need the hardware to trigger this issue, simply loading the driver is not sufficient.


this today from realtek. Haven't had a chance to test it yet. I don't
quite understand the email message & don't have time to crash my laptop
today trying to fix this, maybe someone else can give it a go?

-------- Forwarded Message --------
From: roger_liang <email address hidden>
To: <email address hidden>
Subject: RE: rtl8192se_pci linux driver bug
Date: Mon, 5 Jul 2010 16:46:52 +0800

Dear Sir,

We have fixed powertop crash issue. Please find the latest RTL8191SE driver
source in the attachment.
But we think this isn't our driver issue. Powertop tool can't be used for
all vendor device.
Its private IO control just can use for IPW cards like ipw3946 and so on.
RTL8191SE Linux driver power save mode is default open under DC mode.
There is no need to open power save mode by powertop. Thanks.

Best Regards,
Roger_Liang
Realtek Semiconductor Corp.

-----Original Message-----
From: <email address hidden> [mailto:<email address hidden>]
Sent: Sunday, July 04, 2010 12:48 AM
To: Roger Liang
Subject: RE: rtl8192se_pci linux driver bug

Dear Roger,

I have refined my error a little bit using strace. I now find that
the crash is actually triggered by the command:
iwpriv -a
when run as root. Here is the output of strace -f powertop
(attached), and also the crash message when running the command. I
have copied my latest post from the ubuntu bug report
(https://bugs.launchpad.net/ubuntu/+source/linux/+bug/585938)

i was able to get an strace from powertop (logging in by ssh from a
screen session on a remote computer) and attach the log here. it
looks like the issue is triggered by "iwpriv -a" -- here's what looks
like the relevant part of the log (at the end):

[pid 2077] execve("/bin/sh", ["sh", "-c", "/sbin/iwpriv -a 2>
/dev/null"], [/* 19 vars */] <unfinished ...>
[pid 2064] mmap(NULL, 4096, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f884e1ea000
[pid 2064] read(4, <unfinished ...>
[pid 2077] <... execve resumed> ) = 0
[pid 2077] brk(0) = 0x2590000
[pid 2077] access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such
file or directory)
[pid 2077] mmap(NULL, 8192, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5d2ef07000
[pid 2077] access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such
file or directory)
[pid 2077] open("/etc/ld.so.cache", O_RDONLY) = 4
[pid 2077] fstat(4, {st_mode=S_IFREG|0644, st_size=131558, ...}) = 0
[pid 2077] mmap(NULL, 131558, PROT_READ, MAP_PRIVATE, 4, 0) = 0x7f5d2eee6000
[pid 2077] close(4) = 0
[pid 2077] access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such
file or directory)
[pid 2077] open("/lib/libc.so.6", O_RDONLY) = 4
[pid 2077] read(4,
"\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0`\355\1\0\0\0\0\0"...,
832) = 832
[pid 2077] fstat(4, {st_mode=S_IFREG|0755, st_size=1572232, ...}) = 0
[pid 2077] mmap(NULL, 3680296, PROT_READ|PROT_EXEC,
MAP_PRIVATE|MAP_DENYWRITE, 4, 0) = 0x7f5d2e966000
[pid 2077] mprotect(0x7f5d2eae0000, 2093056, PROT_NONE) = 0
[pid 2077] mmap(0x7f5d2ecdf000, 20480, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 4, 0x179000) = 0x7f5d2ecdf000
[pid 2077] mmap(0x7f5d2ece4000, 18472, PROT_READ|PROT_WRITE,
MAP_PRIVA...


Andy Whitcroft (apw) on 2010-07-05
Changed in linux (Ubuntu Lucid):
assignee: Canonical Kernel Team (canonical-kernel-team) → Andy Whitcroft (apw)

Looking at the Realtek change, it looks like they have reined in the size of the return buffer from the adhoc peers ioctl. I have made the same change to the version in lucid and produced some test kernels. If those of you who are affected by this problem could install and test these kernels that would be great. Please test the kernels at the URL below and report back here:

    http://people.canonical.com/~apw/lp585938-lucid/

Thanks.

Changed in linux (Ubuntu Lucid):
status: Confirmed → Incomplete
ZachG (zgold550) wrote :

I tried using the new kernel:

zgoldberg@netglider:~/Downloads$ uname -a
Linux netglider 2.6.32-24-generic #38lp585938v201007051739 SMP Mon Jul 5 16:41:46 UTC 2010 i686 GNU/Linux
zgoldberg@netglider:~/Downloads$ /sbin/iwpriv.old -a
<crash>

;(.

I'm a bit confused as to how this was supposed to work. I was under the impression that this driver was not upstream? I've been using r8192se as a module this whole time? I'm going to take a quick look at the tarball that matt linked above and see if I can get that to build.

ZachG (zgold550) wrote :

I can, however, confirm that by using the tarball that Matt attached the driver does not crash with iwpriv -a

Alex Wauck (awauck) wrote :

Matt's 0017.0705.2010 driver compiles and runs with 2.6.35-rc4, and the panic is definitely gone on that kernel, too. This version should be added to Maverick as soon as possible. Also, I really think Canonical should put some pressure on Realtek to get this driver upstream. Everyone wins if we get a driver for this chipset in the mainline kernel.

finally had a chance to test the driver -- not your kernel package
yet, andy, just a quick dkms install of theirs - everything now seems
to work perfectly. I would definitely put this change into the next
update.

thanks much everyone!
matt


Changed in linux (Ubuntu):
assignee: Canonical Kernel Team (canonical-kernel-team) → nobody
tags: added: kernel-net
removed: kernel-candidate kernel-wireless

the driver posted by Matt Price works fine on a Toshiba T135 with lucid 64bit,
though... no bluetooth yet =(

thanks matt!

Andy Whitcroft (apw) wrote :

@Zach -- the kernel contained what appeared to be the minor fix that realtek had made to the driver. We are not trivially able to replace the driver as that is against the SRU process but if we can identify the changes that fixed iwpriv -a we could backport just those. The change I found was a single buffer resize that they had made in the ioctl apparently causing the panics. There may be more than one fix we require from the updated driver.

Andy Whitcroft (apw) wrote :

@Zach it would be helpful to confirm whether this is the same stack trace as before the change if you are able.

ZachG (zgold550) wrote :

@Andy -- I'm confused how this should work. This whole time we've been talking about an out-of-tree module. Does the kernel you posted contain that module built in somehow? When I boot into that kernel I don't actually get wifi access at all unless I modprobe one of the versions of the r8192 module I have lying around.

Andy Whitcroft (apw) wrote :

Ok, there may be more to this fix than I had previously identified. I have respun the patch for Lucid and rebuilt some test kernels. If those affected could test the latest kernels at the URL below that would be great. Please report back here:

    http://people.canonical.com/~apw/lp585938-lucid/

On Thu, 2010-07-15 at 02:01 +0000, ZachG wrote:
> @Andy -- I'm confused how this should work. This whole time we've been
> talking about an out-of-tree module. Does the kernel you posted contain
> that module built in somehow? When I boot into that kernel I don't
> actually get wifi access at all unless I modprobe one of the version of
> the r8192 module I have lying around.
>
zach, the module is included in the ubuntu kernels; when lucid was
released, it was released with an old version, which has been updated
once since then, to the 0015 version we both started with. andy's
talking about backporting the new patches to that version.

matt

@Matt
Do you have any idea then why when I boot into the new kernel there's no r8192 module in use and wifi doesn't work (i.e. it's not built in)?

On Thu, 2010-07-15 at 19:31 +0000, ZachG wrote:
> @Matt
> Do you have any idea then why when I boot into the new kernel there's no r8192 module in use and wifi doesn't work (i.e. it's not built in)?
>
not sure zach -- i'm mostly internetless this month but what kernel are
you using? i seem to have it in the latest:

$ find /lib/modules -name "*8192*" | grep pci
/lib/modules/2.6.32-17-generic/kernel/drivers/staging/rtl8192e/r8192_pci.ko
/lib/modules/2.6.32-23-generic-tuxonice/updates/dkms/r8192se_pci.ko
/lib/modules/2.6.32-23-generic-tuxonice/kernel/drivers/net/wireless/r8192se_pci.ko
/lib/modules/2.6.32-22-generic/kernel/ubuntu/rtl8192se/r8192se_pci.ko
/lib/modules/2.6.32-22-generic/kernel/drivers/staging/rtl8192e/r8192_pci.ko
/lib/modules/2.6.32-23-generic/kernel/ubuntu/rtl8192se/r8192se_pci.ko

I am having the same problem, and can reproduce it in the same way with powertop.

Did the proposed fixes make it into the 2.6.32-24-generic release that just came out?

Jeremy Foshee (jeremyfoshee) wrote :

Alex Wauck,

Fixes to the rtl8192se were included in the most recently released development kernel 2.6.35-16.22. We believe that these fixes may have an impact on your particular issue. As such, we would like for you to test this kernel and let us know if it resolves the problems you were having.

Thanks in advance for your testing of this kernel!

~JFo

ethan_fr0me (ethan-fr0me) wrote :

This is still an issue for me with maverick amd64. Running from the live CD with the provided r8192se_pci module, the "iwpriv -a" command is enough to cause a kernel panic for me.

ethan_fr0me (ethan-fr0me) wrote :

....but going to the Realtek driver released earlier this week (rtl8192se_linux_2.6.0018.1013.2010) fixes that issue! I can't make it panic any more! Now let me wait and see if I still get disconnects every few hours.

racecar56 (racecar56) wrote :

I actually had this in Debian Squeeze. No telling what the error was there, but it did freeze the computer.
I also don't know if Ubuntu even has this problem...looks like it's fixed.

Jawaid Bazyar (jb-forethought) wrote :

powertop and iwpriv now work without causing a crash for me as well, with the slightly newer driver:

rtl8192se_linux_2.6.0019.1207.2010

from www.realtek.com

I have had numerous odd behaviors with the wireless that I attribute to this problem:
1) random blinking caps-lock kernel panics
2) wireless driver sometimes does not shut down properly on suspend, causing suspend to hang up and fail to power off
3) sometimes the wireless driver will simply stop passing data but does not cause kernel panic

I will report back if I continue to see these behaviors, or if my problems are resolved.

Jawaid Bazyar (jb-forethought) wrote :

Even though the new driver did cure the powertop and iwpriv issues, I am still experiencing panics.

/var/log/messages:
Dec 31 11:01:18 bazyar-laptop kernel: [68052.331396] rtl8192se_update_ratr_table: ratr_index=0 ratr_table=0x00000ff5
Dec 31 11:03:18 bazyar-laptop kernel: [68172.126195] rtl8192se_update_ratr_table: ratr_index=0 ratr_table=0x00000ff5
Dec 31 11:05:18 bazyar-laptop kernel: [68291.907083] rtl8192se_update_ratr_table: ratr_index=0 ratr_table=0x00000ff5
Dec 31 11:07:18 bazyar-laptop kernel: [68411.714734] rtl8192se_update_ratr_table: ratr_index=0 ratr_table=0x00000ff5

I seem to get these every 2 minutes. They're all the same except the timestamp.

I'm definitely using the new driver:

bazyar@bazyar-laptop:/var/log$ modinfo r8192se_pci
filename: /lib/modules/2.6.32-26-generic/kernel/drivers/net/wireless/r8192se_pci.ko
license: GPL
version: 0019.1207.2010
author: Copyright(c) 2008 - 2010 Realsil Semiconductor Corporation <email address hidden>
description: Linux driver for Realtek RTL819x WiFi cards
srcversion: B35243106478C16B758F56E
alias: pci:v000010ECd00008174sv*sd*bc*sc*i*
alias: pci:v000010ECd00008173sv*sd*bc*sc*i*
alias: pci:v000010ECd00008172sv*sd*bc*sc*i*
alias: pci:v000010ECd00008171sv*sd*bc*sc*i*
alias: pci:v000010ECd00008192sv*sd*bc*sc*i*
depends:
vermagic: 2.6.32-26-generic SMP mod_unload modversions
parm: ifname: Net interface name, wlan%d=default (charp)
parm: hwwep: Try to use hardware WEP support(default use hw. set 0 to use software security) (int)
parm: channels: Channel bitmask for specific locales. NYI (int)

Kernel:

bazyar@bazyar-laptop:/var/log$ uname -a
Linux bazyar-laptop 2.6.32-26-generic #48-Ubuntu SMP Wed Nov 24 10:14:11 UTC 2010 x86_64 GNU/Linux

I guess I need to figure out how to capture the kernel panic info next.

racecar56 (racecar56) wrote :

Confirmed in Ubuntu 10.10 amd64, same hardware as the OP.

gv (giordano-valloggia) wrote :

Still present in Natty alpha.

Linux zulu 2.6.38-3-generic #30-Ubuntu SMP Thu Feb 10 00:33:26 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux

powertop runs well if I rmmod r8192se_pci

Jörn Horstmann (jhorstmann) wrote :

I can confirm this is still an issue in the final version of natty.

Linux intellibook2011 2.6.38-8-generic #42-Ubuntu SMP Mon Apr 11 03:31:50 UTC 2011 i686 i686 i386 GNU/Linux

A picture of the call trace can be seen here: http://twitpic.com/4t7l6a

Jean-Philippe Orsini (jfi) wrote :

Reproduced with natty on a Clevo S3101 (which seems to be close to the System76 Lemur laptop).

Ilan (ilan) wrote :

As of the latest Natty kernel update I'm no longer experiencing this issue when running iwpriv -a and/or powertop.
Hardware is:
- System: Zareason Strata 13
- Wireless: Realtek RTL8191SEvB rev10

Alex Wauck (awauck) wrote :

For what it's worth, I haven't had any problems on the newer kernels (e.g. the ones included with 11.04 and 11.10).

summary: - r8192se_pci + powertop (iwpriv -a) = kernel panic
+ 10ec:8171 r8192se_pci + powertop (iwpriv -a) = kernel panic
tags: added: needs-upstream-testing
removed: networking
gcc (chris+ubuntu-qwirx) wrote :

@apw, I can reproduce this on an Intel Classmate PC running Lucid. What makes it incomplete for Lucid?

Also I'm afraid the patch, although promising, does not resolve the issue for me. I can still panic the kernel every time with "iwpriv -a" or "iwpriv wlan0 adhoc_peer_list". iwpriv shows that the return buffer size has been reduced, so the patch is applied in this kernel. Symptoms and busted canary address are exactly the same as without the patch.

I'm trying to work out why this patch hasn't fixed the problem, but have limited time to devote to it.

gcc (chris+ubuntu-qwirx) wrote :

OK, I think I understand the problem. The table of iw_priv args and functions should alternate SET and GET functions, but Realtek just added them all in order.

net/wireless/wext.c uses IW_IS_SET on the command number to determine whether it's a GET or SET command, and this just checks the last bit. So it's essential that GET commands have an odd command number. Otherwise ioctl_private_iw_point won't allocate a memory buffer to pass into the handler.

r8192_wx_get_adhoc_peers is an iwpriv ioctl that behaves as a GET ioctl: it expects a writable buffer from the wireless extensions system (wext.c) and writes all over it. However, it lives in a SET slot (SIOCIWFIRSTPRIV + 0xc is even), so the kernel hasn't allocated any memory for it (extra_size = 0), although kzalloc did apparently return a valid pointer, or this would have been a null pointer dereference. In any case, at this point the driver scribbles over random kernel memory and causes a panic soon after.

It's not clear to me why reducing the buffer size works for some people. Perhaps the pointer returned by kzalloc happens to be a memory location big enough to write 1024 bytes to, for them but not for me?

The fix appears to be to correct the allocation of iwpriv ioctl numbers in the driver. I'm going to try my hand at writing a patch for this. Newer kernels (3.2.0) appear to have a completely different driver model based on nl80211 instead of wext, so I can't just backport a patch.

It would also probably be helpful in debugging other drivers if wext.c (1) checked that (apparent) SET ioctls don't expect a GET buffer, and GET ioctls don't expect a SET buffer, or return an error if they do, and (2) set the data pointer to 0 instead of calling kzalloc for 0 bytes.
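
The parity rule described above can be sketched in a few lines. This is a simplified Python model, not kernel code: `iw_is_set` and `iw_is_get` mirror the `IW_IS_SET` and `IW_IS_GET` macros from linux/wireless.h, while `extra_size`, `next_slot` and the 0/1024 byte counts are illustrative assumptions based on the description in this comment.

```python
SIOCIWFIRSTPRIV = 0x8BE0  # first driver-private ioctl (linux/wireless.h)


def iw_is_set(cmd):
    """Mirror of IW_IS_SET: even command numbers are treated as SET."""
    return (cmd & 0x1) == 0


def iw_is_get(cmd):
    """Mirror of IW_IS_GET: odd command numbers are treated as GET."""
    return (cmd & 0x1) == 1


def extra_size(cmd, set_bytes, get_bytes):
    """Assumed simplification of the wext buffer sizing: the size comes
    from the SET arguments for even commands and from the GET arguments
    for odd commands."""
    return set_bytes if iw_is_set(cmd) else get_bytes


def next_slot(cmd, returns_data):
    """Hypothetical helper: smallest command number >= cmd whose parity
    matches the handler's direction (odd if it returns data)."""
    if iw_is_get(cmd) == returns_data:
        return cmd
    return cmd + 1


if __name__ == "__main__":
    adhoc_peers = SIOCIWFIRSTPRIV + 0xC   # even, so handled as SET
    print("buffer for", hex(adhoc_peers), "=", extra_size(adhoc_peers, 0, 1024))
    moved = next_slot(adhoc_peers, returns_data=True)
    print("buffer for", hex(moved), "=", extra_size(moved, 0, 1024))
```

In this model a handler that returns data but sits on an even number gets a zero-sized buffer, and moving it to the next odd number is enough for the GET size to be used. A real fix would also have to keep the driver's iw_priv_args table and handler array in step with the new numbering.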

can't believe this might actually be fixed. As a mere user, what can I
do to help with this? And what versions of ubuntu are currently
affected?


gcc (chris+ubuntu-qwirx) wrote :

This patch fixes the panic for me.

iwpriv wlan0 getpromisc and setpromisc don't work, and I don't know why, but at least it doesn't panic any more.

@matt-price, you could build yourself an Ubuntu kernel with the attached patch applied and see if it fixes the problem for you. To build such a kernel, for Lucid:

git clone git://kernel.ubuntu.com/ubuntu/ubuntu-lucid.git
cd ubuntu-lucid
patch -p1 < linux-ischool-classmate/rtl8192se_wx.panic.lp585938.patch
CONCURRENCY_LEVEL=3 fakeroot make-kpkg --initrd \
    --append-to-version=-cw-lp585938-1 \
    kernel-image kernel-headers kernel-debug
cd ..
sudo dpkg -i linux-image*.deb

You can find more information here: https://help.ubuntu.com/community/Kernel/Compile/

Regarding affected versions, a quick check of the Maverick and Natty kernel git trees shows that they appear to be vulnerable. Oneiric and Precise are not.

The attachment "Patch for the rtl8192se wireless driver to fix kernel panics on "iwpriv -a"" of this bug report has been identified as being a patch. The ubuntu-reviewers team has been subscribed to the bug report so that they can review the patch. In the event that this is in fact not a patch, you can resolve this situation by removing the tag 'patch' from the bug report and editing the attachment so that it is not flagged as a patch. Additionally, if you are a member of the ubuntu-reviewers team, please also unsubscribe the team from this bug report.

[This is an automated message performed by a Launchpad user owned by Brian Murray. Please contact him regarding any issues with the action taken in this bug report.]

tags: added: patch
Tim Gardner (timg-tpi) on 2012-10-01
Changed in linux (Ubuntu Lucid):
assignee: Andy Whitcroft (apw) → Tim Gardner (timg-tpi)
status: Incomplete → In Progress
Changed in linux (Ubuntu):
status: Confirmed → Invalid
Tim Gardner (timg-tpi) wrote :

I've been looking at the patch to fix this out-of-tree OEM driver from Realtek and have decided not to apply it, mostly 'cause I think there are better solutions. We offer Compat Wireless backport packages from kernel versions v2.6.33 through v3.3. The rtl8192se driver went mainline as of v3.0, so I would try installing one of the more recent compat wireless packages. You can see which packages are available thusly:

apt-cache search linux-backports-modules-wireless

I would start out by installing CW from 3.3:

sudo apt-get install linux-backports-modules-wireless-3.3-lucid-generic

Changed in linux (Ubuntu Lucid):
status: In Progress → Won't Fix
gcc (chris+ubuntu-qwirx) wrote :

So Ubuntu is going to continue to distribute a kernel which panics when you run iwpriv, and the recommended fix is to manually install linux-backports-modules-wireless.

I don't know why I bother posting anything on Launchpad.
