[madwifi] Semi-random system lockups in Dapper

Bug #37773 reported by Soren Hansen
Affects                                    Status         Importance  Assigned to  Milestone
linux-restricted-modules-2.6.15 (Ubuntu)   Won't Fix      Critical    Unassigned
  Dapper                                   Won't Fix      Critical    Unassigned
linux-restricted-modules-2.6.17 (Ubuntu)   Fix Released   Undecided   Unassigned
  Dapper                                   Fix Released   Undecided   Unassigned

Bug Description

Every once in a while (on average once a day), my new Thinkpad X40 hangs. The system is totally unresponsive, except it responds to Sysrq-combos. I have seen it happen twice while in the console, and this showed up:
"BUG: soft lockup detected on CPU#0"

The SysRq combo for dumping a backtrace showed some frames in the madwifi driver. The backtrace was more than 25 lines long, so I couldn't see the topmost entries, but the ath_* symbols were consistent both times I've seen it happen.

I can't really do any real work without my wifi card, so disabling it for a longer period of time to determine if the errors disappear is not really an option.

A friend of mine has an identical laptop running Breezy, which apparently works flawlessly. He upgraded to Dapper yesterday, so he'll probably be able to tell me within a few days if he has similar problems.

I'll try removing the restricted_modules package and installing the madwifi-ng driver from CVS to see if that makes a difference, but it's kind of hard to tell, since sometimes there are days between crashes, while today it has crashed on me 4 or 5 times (all crashes happened while I was fiddling with NetworkManager and wpa_supplicant, which also leads me to think that it's related to the wifi card).

Tags: dapper freeze
Ankur Kotwal (ankur.kotwal) wrote :

How did your testing go with madwifi-ng?

I had similar problems with madwifi myself. It turns out that the packaged version of madwifi in dapper has a known memory leak. I could trigger it when my machine was acting as an access point - just start a file copy and bam!

Since then I tried using madwifi-ng and haven't had the problem since.

Soren Hansen (soren) wrote :

Did you also get the dreaded "BUG: soft lockup detected on CPU#0"?

My testing with madwifi-ng didn't go too well. :-) I couldn't make it connect to anything at all (most probably my fault). I'll give it another try in a few minutes.

Soren Hansen (soren) wrote :

The madwifi-ng works like a charm now. I'll wait a few days and see if it locks up and let you know how it goes.

Soren Hansen (soren) wrote :

No lockups as of yet. The errors were nondeterministic, and I HAVE had lockup-free days, but hardly this many in a row, so I'd say the bug is fixed. I've had to rebuild wpasupplicant to work with madwifi-ng, and I have yet to teach NetworkManager to make sure that the ath0 device is in the UP state before asking it to do a scan (this is apparently behaviour by design in madwifi-ng), but that shouldn't be too hard.

I realise that substituting madwifi-ng for madwifi is quite intrusive at this point in the release process, so the optimal solution would of course be to locate the error in madwifi (the "old" version) and fix it. If it can't be found in a short period of time, I really do think we should consider madwifi-ng in place of madwifi. The IBM Thinkpad X40 is by no means an uncommon laptop, and the wifi card in it is by no means an uncommon one either, so I think this particular bug will affect quite a few people. For me, it has caused data loss too: xfs in combination with laptop-mode holds off disk syncing for quite a while, and when this crash happens, anything not synced is lost.

Huygens (huygens-25) wrote :

After installing Dapper Flight 5 on my laptop (a Dell Latitude D600), I ran into the trouble you mention a few times, although at that time I was not using Wi-Fi. Then, after the kernel upgrade (2.6.15-19), I did not encounter the problem again until I was using Wi-Fi while on battery (with the latest updates as of the 14th of April).

In this configuration, I just have to wait about 10 minutes for the bug to occur. I thought this could be a bug in the proprietary ATI driver (which gave me lots of trouble), so I set up my X server to use the open source driver. I still had the bug.

However, if I am not connected to a wireless network, I do not have the problem.

Huygens

Huygens (huygens-25) wrote :

In fact, I am not sure that this is only Wi-Fi related, as it also occurred in an area where I had no Wi-Fi access! But I was on battery, or I had been on battery and it was recharging.

Huygens

PS: Could anyone help me debug this? I do not know where to start, as the log did not tell me much...

Soren Hansen (soren) wrote :

https://wiki.ubuntu.com/DebuggingSystemCrash is a really good place to start.

Niels Kristian Bech Jensen (nkbjensen) wrote :

Could these warnings I get while installing linux-restricted-modules have something to do with it? (I know it is on powerpc.)

Setting up linux-restricted-modules-2.6.15-23-powerpc (2.6.15.10-2) ...
WARNING: Module /lib/modules/2.6.15-23-powerpc/madwifi-ng/new_ath_pci.ko ignored, due to loop
WARNING: Module /lib/modules/2.6.15-23-powerpc/madwifi-ng/new_ath_rate_sample.ko ignored, due to loop
WARNING: Loop detected: /lib/modules/2.6.15-23-powerpc/volatile/new_ath_hal.ko which needs new_ath_hal.ko again!
WARNING: Module /lib/modules/2.6.15-23-powerpc/volatile/new_ath_hal.ko ignored, due to loop

Regards,
Niels Kristian

Soren Hansen (soren) wrote :

Niels: No, I doubt that's related.

Einar (esr496)
Changed in linux-restricted-modules-2.6.15:
status: Unconfirmed → Confirmed
Soren Hansen (soren) wrote :

Raising severity to critical. This makes Dapper unusable on many laptops and even causes data loss in some cases.

Luca Lorenzetto (lorenzetto-luca) wrote :

I have the same problem on a Thinkpad T20. I rebuilt madwifi-ng, but the problem remained. So I rebuilt hal because of this message:
[ 4088.228269] wifi%d: unable to attach hardware: 'Hardware self-test failed' (HAL status 14)

but nothing changed.

Luca Lorenzetto (lorenzetto-luca) wrote :

Today I reinstalled the original hal and the original modules from Ubuntu, and while plugging in the card and assigning an IP to it, dmesg says:

[ 298.132141] pccard: CardBus card inserted into slot 1
[ 298.340316] wlan: 0.8.4.2 (svn 2006-05-18)
[ 298.435768] ath_hal: module license 'Proprietary' taints kernel.
[ 298.438934] ath_hal: 0.9.16.16 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413, RF5413)
[ 298.533350] wlan: 0.8.4.2 (svn 1560)
[ 298.546620] ath_rate_sample: 1.2 (svn 1560)
[ 298.583257] ath_pci: 0.9.4.5 (svn 1560)
[ 298.585146] PCI: Enabling device 0000:06:00.0 (0000 -> 0002)
[ 299.048643] wifi0: 11a rates: 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps
[ 299.050281] wifi0: 11b rates: 1Mbps 2Mbps 5.5Mbps 11Mbps
[ 299.051332] wifi0: 11g rates: 1Mbps 2Mbps 5.5Mbps 11Mbps 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps
[ 299.053420] wifi0: turboA rates: 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps
[ 299.055022] wifi0: turboG rates: 6Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps
[ 299.056523] wifi0: H/W encryption support: WEP AES AES_CCM TKIP
[ 299.057665] wifi0: mac 5.9 phy 4.3 radio 3.6
[ 299.058455] wifi0: Use hw queue 1 for WME_AC_BE traffic
[ 299.059317] wifi0: Use hw queue 0 for WME_AC_BK traffic
[ 299.060164] wifi0: Use hw queue 2 for WME_AC_VI traffic
[ 299.061027] wifi0: Use hw queue 3 for WME_AC_VO traffic
[ 299.061889] wifi0: Use hw queue 8 for CAB traffic
[ 299.062668] wifi0: Use hw queue 9 for beacons
[ 299.079759] wifi0: Atheros 5212: mem=0x16000000, irq=11
[ 323.330525] wifi0_ifrename: hardware error; reseting
[ 326.075852] wifi0_ifrename: hardware error; reseting
[ 327.765004] wifi0_ifrename: hardware error; reseting
[ 329.453814] wifi0_ifrename: hardware error; reseting
[ 333.207294] ath0: no IPv6 routers present
[ 348.736037] wifi0_ifrename: hardware error; reseting
[ 348.865601] wifi0_ifrename: hardware error; reseting
[ 348.894201] wifi0_ifrename: hardware error; reseting
[ 348.922460] wifi0_ifrename: hardware error; reseting
[ 348.957327] wifi0_ifrename: hardware error; reseting
[ 349.086460] wifi0_ifrename: hardware error; reseting
[ 349.113827] wifi0_ifrename: hardware error; reseting
[ 349.140874] wifi0_ifrename: hardware error; reseting
[ 349.174669] wifi0_ifrename: hardware error; reseting
[ 349.302587] wifi0_ifrename: hardware error; reseting
[ 349.328922] wifi0_ifrename: hardware error; reseting
[ 349.354802] wifi0_ifrename: hardware error; reseting
[ 349.387641] wifi0_ifrename: hardware error; reseting
[ 349.514399] wifi0_ifrename: hardware error; reseting
[ 349.539477] wifi0_ifrename: hardware error; reseting
[ 349.564277] wifi0_ifrename: hardware error; reseting
[ 349.595320] wifi0_ifrename: hardware error; reseting
[ 349.719887] wifi0_ifrename: hardware error; reseting
[ 349.742666] wifi0_ifrename: hardware error; reseting
[ 349.765381] wifi0_ifrename: hardware error; reseting
[ 349.787668] wifi0_ifrename: hardware error; reseting
[ 349.822689] wifi0_ifrename: hardware error; reseting
[ 352.531438] wifi0_ifrename: hardware error; reseting
[ 358.971494] wifi0_ifrename: hardware error; res...
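
With this many reset messages, counting them may be easier than reading them. A minimal sketch, assuming the kernel log is piped in on stdin and that the message text matches the log above exactly (including the driver's own spelling, "reseting"); count_hw_resets is a made-up helper name, not something shipped with madwifi:

```shell
# Count "hardware error; reseting" lines in kernel log text read from stdin.
# The pattern is copied verbatim from the log above; count_hw_resets is a
# hypothetical helper name used only for this sketch.
count_hw_resets() {
    grep -c 'hardware error; reseting'
}

# Example (on the affected machine):
#   dmesg | count_hw_resets
```

Comparing the count before and after inserting the card, or across driver versions, might show whether a change made any difference.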


Ankur Kotwal (ankur.kotwal) wrote :

To get madwifi-ng to work on my system, I had to comment out the contents of the file /etc/udev/rules.d/25-iftab.rules

This way wifi0 showed up.
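
For anyone who wants to try the same thing, here is a rough sketch of commenting out a whole file while keeping a backup. It assumes a sed that accepts `-i` with a suffix (GNU and BSD sed both do); comment_out_file is a made-up helper name, and whether disabling these rules is safe on a given system is untested:

```shell
# Comment out every non-empty, not-yet-commented line of a file, keeping a
# backup copy next to it with a .bak suffix. comment_out_file is a
# hypothetical helper name used only for this sketch.
comment_out_file() {
    sed -i.bak 's/^[^#]/#&/' "$1"
}

# Example (run as root; path as given in the comment above):
#   comment_out_file /etc/udev/rules.d/25-iftab.rules
```

To undo the change, copy the .bak file back over the original.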

Luca Lorenzetto (lorenzetto-luca) wrote :

I think my notebook's hangs are not related to madwifi, because I compiled madwifi-ng and commented out the contents of /etc/udev/rules.d/25-iftab.rules, but the freeze reappeared. I think this is a hal problem or something similar. Before the hang, dmesg shows:

[ 69.878357] pnp: Device 00:0d cannot be configured because it is in use.
[ 69.878603] pnp: Device 00:0d cannot be configured because it is in use.

When the card was plugged dmesg showed:

[ 61.994316] wlan: 0.8.4.2 (svn 2006-05-16)
[ 62.195863] ath_hal: module license 'Proprietary' taints kernel.
[ 62.199956] ath_hal: 0.9.16.16 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413, RF5413)
[ 62.349803] wlan: 0.8.4.2 (svn 1560)
[ 62.366708] ath_rate_sample: 1.2 (svn 1560)
[ 62.455328] ath_pci: 0.9.4.5 (svn 1560)
[ 62.456609] PCI: Enabling device 0000:06:00.0 (0000 -> 0002)
[ 62.917167] wifi0: 11a rates: 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps
[ 62.917194] wifi0: 11b rates: 1Mbps 2Mbps 5.5Mbps 11Mbps
[ 62.917207] wifi0: 11g rates: 1Mbps 2Mbps 5.5Mbps 11Mbps 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps
[ 62.917235] wifi0: turboA rates: 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps
[ 62.917255] wifi0: turboG rates: 6Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps
[ 62.917274] wifi0: H/W encryption support: WEP AES AES_CCM TKIP
[ 62.917309] wifi0: mac 5.9 phy 4.3 radio 3.6
[ 62.917327] wifi0: Use hw queue 1 for WME_AC_BE traffic
[ 62.917333] wifi0: Use hw queue 0 for WME_AC_BK traffic
[ 62.917339] wifi0: Use hw queue 2 for WME_AC_VI traffic
[ 62.917346] wifi0: Use hw queue 3 for WME_AC_VO traffic
[ 62.917352] wifi0: Use hw queue 8 for CAB traffic
[ 62.917358] wifi0: Use hw queue 9 for beacons
[ 62.963747] wifi0: Atheros 5212: mem=0x16000000, irq=11

(I can't live without wifi :-( working wired is like a prison...)

tuxo (beat-fasel) wrote :

I have also experienced lockups of my up-to-date Dapper system over the last week. It contains an atheros card (and it is thus using the madwifi driver). Furthermore, I use the network-manager / knetworkmanager. Both under GNOME and KDE the system freezes randomly after some time when browsing the internet. However, when I switch to the wired network and disable the wireless network, my system seems to be stable so far. Maybe the combination of network-manager and the atheros driver is not healthy?

System Information:

$ lspci
0000:00:00.0 Host bridge: VIA Technologies, Inc. VT82C693A/694x [Apollo PRO133x] (rev c4)
0000:00:01.0 PCI bridge: VIA Technologies, Inc. VT82C598/694x [Apollo MVP3/Pro133x AGP]
0000:00:04.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super South] (rev 40)
0000:00:04.1 IDE interface: VIA Technologies, Inc. VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 06)
0000:00:04.2 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 16)
0000:00:04.3 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 16)
0000:00:04.4 Host bridge: VIA Technologies, Inc. VT82C686 [Apollo Super ACPI] (rev 40)
0000:00:09.0 Ethernet controller: Atheros Communications, Inc. AR5212 802.11abg NIC (rev 01)
0000:00:0a.0 USB Controller: NEC Corporation USB (rev 43)
0000:00:0a.1 USB Controller: NEC Corporation USB (rev 43)
0000:00:0a.2 USB Controller: NEC Corporation USB 2.0 (rev 04)
0000:00:0b.0 Multimedia audio controller: Creative Labs SB Live! EMU10k1 (rev 07)
0000:00:0b.1 Input device controller: Creative Labs SB Live! MIDI/Game Port (rev 07)
0000:00:0d.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10)
0000:01:00.0 VGA compatible controller: ATI Technologies Inc Radeon R200 QL [Radeon 8500 LE]

$ uname -a
Linux lego 2.6.15-23-686 #1 SMP PREEMPT Tue May 23 14:03:07 UTC 2006 i686 GNU/Linux

Soren Hansen (soren) wrote :

It COULD be the combination, but if anything at all in userspace can tickle something in kernelspace so badly that it hangs the system, the bug is definitely in the kernel module :-)

I'm not quite as busy right now as I used to be, so maybe I can switch back from madwifi-ng to the madwifi in Ubuntu and work entirely from the console, so I can maybe get some debugging info out of it.

Michael S. (jellicle) wrote :

FWIW: I'm using Dapper with a desktop machine, with an Atheros-chipset PCI wireless card, using WEP.

# lsmod | grep ath
ath_pci 80540 0
ath_rate_sample 17160 1 ath_pci
wlan 144924 4 wlan_wep,ath_pci,ath_rate_sample
ath_hal 148816 3 ath_pci,ath_rate_sample

Using the most recent kernel and restricted modules packages (linux-image-2.6.15-23-686 and linux-restricted-modules-2.6.15-23-686) I get lockups as described above. Usually it happens within an hour or so of booting, and always during network usage.

However, using a slightly older installed kernel image (linux-image-2.6.15-19-386 and linux-restricted-modules-2.6.15-19-386) I DO NOT get the lockups. The only change between when I do and do not get lockups is selecting a different kernel at the GRUB boot menu.

So, I believe that whatever is causing the lockups is a recent change (between 2.6.15-19 and 2.6.15-23). Hopefully this allows you to locate the problem. I agree with the critical bug severity; if Dapper goes live with the current code, anyone using an Atheros chipset WiFi card will be subjected to random lockups.

Soren Hansen (soren) wrote :

Could you please find the PCI id of your atheros card and post it here?

The output of this command should do just fine:
lspci -n | grep $(lspci | grep -i atheros | cut -f1 -d' ') | cut -f3 -d' '
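
That pipeline depends on lspci naming the card "Atheros", which some rebranded cards may not do. A hedged alternative sketch: print the vendor:device ID of every device whose PCI class starts with 02 (network controllers). It assumes `lspci -n` prints the slot, the class followed by a colon, and the vendor:device ID as its first three whitespace-separated fields; network_pci_ids is a made-up helper name:

```shell
# Read `lspci -n` output on stdin and print the vendor:device ID of every
# device whose class code starts with 02 (network controllers).
# network_pci_ids is a hypothetical helper name used only for this sketch.
network_pci_ids() {
    awk '$2 ~ /^02[0-9a-f][0-9a-f]:$/ { print $3 }'
}

# Example:
#   lspci -n | network_pci_ids
```

This also lists wired Ethernet controllers, so on most machines it prints more than one ID and the wifi card still has to be picked out by hand.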

Todd Deshane (deshantm) wrote :

I am having the same lockups, but I am not using the madwifi module and don't have an Atheros card; instead I have a Prism WaveLAN, and I am using the orinoco drivers. What is the best way to properly disable the modules? I rmmod'ed them, but that is only temporary. I will see if that fixes the problem, though (I am currently happy using only wired for now; I don't need wireless). I think that the problem must be in a module somewhere and needs to be fixed so Dapper is stable for everybody. Maybe someone could post the code changes or some possible testing methods so we could all try to get rid of these random hard lockups to make Dapper better for everybody.

Sitsofe Wheeler (sitsofe) wrote :

Todd:

This is the wrong place to be asking support questions (especially ones unrelated to madwifi). Try over here:
https://launchpad.net/distros/ubuntu/+addticket

Soren Hansen (soren) wrote :

If:
 * My current experiments with adding madwifi-ng support to wpasupplicant (without removing madwifi-old support) are successful
 * I can somehow make network-manager discover whether it's dealing with a madwifi-ng controlled wifi card or not and pass the appropriate options to wpasupplicant
 * We can identify exactly which cards are causing this issue

...we just might be able to persuade someone into making madwifi-ng the default for those particular cards.

Soren Hansen (soren) wrote : Network-manager patch

I had a quick chat with mjg59, and this won't happen for the Dapper release. It's too intrusive, but here are the patches nonetheless.

This first one is the patch for network-manager. It detects the use of new_ath_pci (the name for madwifi-ng in Ubuntu) and asks wpasupplicant to use the madwifi-ng driver which I'll upload in a few seconds.

Soren Hansen (soren) wrote : Adds madwifi-ng support to wpasupplicant

Note: This does NOT remove the madwifi driver, but rather lets the two coexist. Have fun!

Michael S. (jellicle) wrote :

As requested above by Soren Hansen:

# lspci -n | grep $(lspci | grep -i atheros | cut -f1 -d' ') | cut -f3 -d' '
168c:0013

The full ID of the card is:

0000:01:01.0 Ethernet controller: Atheros Communications, Inc. AR5212 802.11abg NIC (rev 01)

It's a D-Link DWL-G520.

tuxo (beat-fasel) wrote :

Thanks a lot, Soren, for your patches! I have not tried them yet, though.

What I can give feedback on is about my Atheros wifi card (taken from kinfocenter):

Atheros Communications, Inc. AR5212 802.11abg NIC (rev01)
Subsystem: D-Link System Inc: Unkown device 3ab0
Flags: bus master, medium devsel, latency 168, IRQ 177
Memory at de800000 (32-bit, non-prefetchable) [size=64K]
Capabilities: [44] Power Management version 2

$ lspci -n | grep $(lspci | grep -i atheros | cut -f1 -d' ') | cut -f3 -d' '
168c:0013

I hope the patches from Soren are helpful and can be included in the updates repositories after the release of Dapper Drake.

Soren Hansen (soren) wrote :

I have the same PCI id on my Atheros card. So, let's imagine that we establish that everybody experiencing this problem has that same PCI id. In order for us to REALLY justify making this quite intrusive change in dapper-updates, I think it'd be a good idea to somehow show that everyone with that pci id has this problem, but that means that we need to find out if there's anyone out there with this same pci id but without this problem.. Any ideas on how we would do that?

tuxo (beat-fasel) wrote :

Soren wrote:
> I think it'd be a good idea to somehow show that everyone with that pci id
> has this problem, but that means that we need to find out if there's anyone
> out there with this same pci id but without this problem.. Any ideas on how
> we would do that?

Maybe it would be a good idea to ask people in the ubuntuforums (http://www.ubuntuforums.org) if they also experience system lockups with Atheros-based wifi cards with the PCI Id 168c:0013. Another place would be the ubuntu user mailing lists (https://lists.ubuntu.com/mailman/listinfo/ubuntu-users).

Todd Deshane (deshantm) wrote :

I didn't mean to ask a support question... I just think this bug is not limited to only the madwifi module, I think it may also affect other wireless modules too.

I will post more (either in a new bug or in an appropriate channel) once I figure out more. I haven't found any other related bugs or anything on the forums or ubuntu-users list, so this bug seemed like the best match at the time I posted.

Soren Hansen (soren) wrote :

Todd, I don't think we're dealing with a general networking or even wifi problem, but rather two specific and separate issues: "ours" in madwifi and "yours" in orinoco, so I think it'd be easier to track if you created a new bug for your issue.

Christian Posch (christian-posch) wrote :

It seems that this problem also occurs with other Atheros cards.
Mine has the PCI ID 168c:001a (it's an SMC EZ Connect g SMCWCB-G). For me the bug is even worse, because I don't get any debug information on the terminal, nor can I reboot with the Alt-SysRq key combinations.
On my system the lockups mostly occur when I use applications with high upload data rates, like BitTorrent.

tuxo (beat-fasel) wrote :

For two days now I used only wired networking (via ethernet) and I experienced not a single crash.

In order to find out more about the madwifi bug, I activated the wireless connection again. In contrast to my earlier reports, I deactivated network-manager this time. However, the computer (equipped with a wireless Atheros-based card) froze anyway after some time when there was network activity (browsing the internet).

Therefore, the freezes are not caused by the combination of network-manager with the madwifi driver, but also happen when using the madwifi driver on its own.

pbeeson (pbeeson) wrote :

I was having lockups using nm-applet after suspending my laptop and then resuming. After I added ath_pci to MODULES_WHITELIST in /etc/default/acpi-support, my network problems went away (without nm-applet it wasn't hanging, but the network would never come back up). Just a thought.

tuxo (beat-fasel) wrote :

Thanks for the input pbeeson. However, I have a desktop machine (with an atheros PCI card) and I never suspend this machine (actually, suspend is not working at all), but shut down the machine after working and booting freshly the next time I use the computer. Therefore, the freezes I experience are not due to an incorrect resume after suspension.

Glenn Steen (glennsteen) wrote :

Hi,

I can report similar random lockups related to the madwifi driver. My card is a 3Com 3CRPAG175 which (although lspci doesn't mention Atheros) actually uses the ath*/madwifi modules. Here's my lspci (both -v and -n):

# lspci -v
....
0000:03:00.0 Ethernet controller: 3Com Corporation 3CRPAG175 Wireless PC Card (rev 01)
        Subsystem: 3Com Corporation: Unknown device 1026
        Flags: bus master, medium devsel, latency 168, IRQ 10
        Memory at 34000000 (32-bit, non-prefetchable) [size=64K]
        Capabilities: <available only to root>

# lspci -n
...
0000:03:00.0 0200: a727:0013 (rev 01)

This started (as with most/all the above reports) rather immediately after upgrading from the rock-solid Breezy to the rather less than solid Dapper. Sigh.

I'm quite convinced this is due to something very specific to the madwifi driver (similar sysrq stuff as reported above). I will try the patches posted above next and will, of course, report back with my progress.

If madwifi-ng doesn't help, I suppose I'll have to get a new card (it's pretty poor, as wlan cards go, anyway).

-- Glenn

Steen Brisson (zden-launchpad-net) wrote :

My brother in law and I have both had exactly the same symptoms on Fedora Core 4 systems.

For some reason that seemed odd at the time, we got my brother-in-law's system up and running flawlessly relatively easily once we decided to do something about the problem.

Just today I got my own system fixed. In hindsight it makes sense if our problem has been with a madwifi-ng-0.9.4.5 package. With that installed, the system more or less instantly hung at some level of wireless traffic, and copying a CD image over the net was a bullet-proof way to provoke the hang.

Today I installed the newly released replacement version for madwifi-ng, which for some odd reason is numbered 0.9.0 (and called madwifi without -ng). On the other system, 0.9.6.0 (without -ng, but still in the old version number system) runs flawlessly, as mentioned.

So my conclusion has been, on a Fedora Core 4 system, that the package madwifi-ng-0.9.4.5 has contained something very bad that is already fixed in the latest versions. At least I can now hammer the connection with a file copy, ftp copy and VoIP call all at the same time, without the slightest problem. I've transported over 2 million packets in 15 minutes. That was not possible this morning, on my system.

Soren Hansen (soren) wrote : Re: [Bug 37773] Re: [madwifi] Semi-random system lockups in Dapper

On Wed, Jun 07, 2006 at 09:17:48PM -0000, Steen Brisson wrote:
> Just today I got my own system fixed. In hindsight it makes sense if
> our problem has been with a madwifi-ng-0.9.4.5 package.

This bug is actually in madwifi-old, which is the default for most
Atheros cards in Ubuntu. The madwifi-ng included in Ubuntu is from the
madwifi svn as of 2006-05-29 and it works just fine (on Thinkpad X40 at
least).

Besides, with the madwifi-old driver I can transfer huge amounts of data
with no lockups while sometimes it hangs just a few minutes after
booting, so it's not quite the same issue.

Cheers, Søren.

Glenn Steen (glennsteen) wrote :

I never found the time to play with the nice patches, but.... the latest kernel update/restricted modules update seems to have made a difference. A bit early to cry success, but it's been on more or less the last 24 hours without a hitch.

-- Glenn

Soren Hansen (soren) wrote :

On Fri, Jun 16, 2006 at 07:24:21PM -0000, Glenn Steen wrote:
> I never found the time to play with the nice patches, but.... the latest
> kernel update/restricted modules update seems to have made a difference.
> A bit early to cry success, but it's been on more or less the last 24
> hours without a hitch.

Hmm... As far as I can see there was no change with regard to the
madwifi drivers, so I think it's just luck. Sorry. ;-) If it actually
turns out to have made a difference please let us know.

Cheers, Søren.

Glenn Steen (glennsteen) wrote : Re: [Bug 37773] [Bug 37773] Re: [madwifi] Semi-random system lockups in Dapper

> Hmm... As far as I can see there was no change with regard to the
> madwifi drivers, so I think it's just luck. Sorry. ;-) If it actually
> turns out to have made a difference please let us know.

I know, I thought it passing strange too.
Anyway it was just an observation:-).
Since the error seems to be highly intermittent, I'll just keep it on
for a couple of days, doing the odd download or so... But so far
so good;-).

Claudia Hardman (claudiaj) wrote :

I've had a terrible lockup problem since SMP was enabled by default in the kernel. Maybe try adding nosmp to the kernel line in grub's menu.lst (you may need noapic too, not sure though; I needed it, but all cases are different). I haven't tried removing nosmp since I found out it worked, and I don't plan to until I know my computer won't commit suicide whenever I download something.

Gabriele Vivinetto (gabriele.vivinetto) wrote :

I'm experiencing these freezes too.
In addition, I have an ATI video card in my Compaq Evo N800v, and there seems to be a relation between madwifi and ati: https://launchpad.net/distros/ubuntu/+source/xserver-xorg-driver-ati/+bug/38181

Soren Hansen (soren) wrote : Re: [Bug 37773] Re: [madwifi] Semi-random system lockups in Dapper

On Tue, Jul 11, 2006 at 11:23:42AM -0000, Gabriele Vivinetto wrote:
> I'm experiencing these freezes too. In addition, I have an ATI video
> card in my Compaq Evo N800v, and there seems to be a relation between
> madwifi and ati:
> https://launchpad.net/distros/ubuntu/+source/xserver-xorg-driver-ati/+bug/38181

I have an Intel graphics chipset, so that's not the case on my system.

Michel Lesoinne (michel-colorado) wrote :

I have been experiencing the same problem. I have an Apple MacBook Pro 17" and installed dapper on it. I also noticed it was network related and perhaps triggered by wpa_supplicant/knetworkmanager combo. This led me to find this thread.
Unlike others, my lspci gives me slightly different numbers:
168c:001c

Soren Hansen (soren) wrote :

On Thu, Jul 13, 2006 at 04:41:33PM -0000, Michel Lesoinne wrote:
> I have been experiencing the same problem. I have an Apple MacBook Pro
> 17" and installed dapper on it. I also noticed it was network related
> and perhaps triggered by wpa_supplicant/knetworkmanager combo. This
> led me to find this thread. Unlike others, my lspci gives me slightly
> different numbers: 168c:001c

That's very useful info. I've hacked together a .deb that makes
madwifi-ng be loaded instead of madwifi. That fixes it for me, but
apparently some people have other problems with -ng. I'll just finish
polishing it (probably tomorrow) and upload it somewhere.

Cheers, Søren.

Bigchris (gchris) wrote :

Just a quick thanks for the heads up on this bug. I was just about to upgrade a Suse 10.0 machine with an Atheros Wifi adapter to Dapper but I really don't need this kind of trouble, and so will wait until this problem is flattened. Good luck with it and thanks again for keeping me out of trouble!
Cheers, Chris

Michel Lesoinne (michel-colorado) wrote :

Soren,

I do not understand the changes you did.... when I installed a new version of madwifi, it removed all the new_ath_* drivers and instead installed the ath_* in their place. So why did you modify NetworkManager and wpa_supplicant?

I took the subversion madwifi and it did not work, as it would freeze after a little bit of time and it did not work with NetworkManager. Now I am trying a snapshot madwifi-ng and still there are no "new_ath_*" modules being loaded.
What am I missing?

Soren Hansen (soren) wrote :

On Mon, Jul 17, 2006 at 10:33:26PM -0000, Michel Lesoinne wrote:
> I do not understand the changes you did.... when I installed a new
> version of madwifi, it removed all the new_ath_* drivers and instead
> installed the ath_* in their place. So why did you modify
> NetworkManager and wpa_supplicant?

The change to networkmanager enables it to detect if you're using the
madwifi-ng (new_ath_* actually). madwifi and madwifi-ng have different
APIs, and the change I've made to wpasupplicant lets it use both
depending on a commandline option (which networkmanager can give it
because I've taught it to know which driver is used).

> I took the subversion madwifi and it did not work, as it would freeze
> after a little bit of time and it did not work with NetworkManager.

It will work with the patched networkmanager and wpasupplicant.

> Now I am trying a snapshot madwifi-ng and still there are no
> "new_ath_*" modules being loaded. What am I missing?

new_ath_* are the names Ubuntu uses for madwifi-ng in order to let
madwifi and madwifi-ng coexist on the system. If you compile them from
SVN, they'll be named ath_*.
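
A rough sketch of checking which of the two is loaded, going only by the module names described above. It assumes the Ubuntu-packaged modules; a madwifi-ng built from SVN also loads as ath_pci, so this cannot tell it apart from the old driver. madwifi_flavour is a made-up helper name:

```shell
# Read `lsmod` output on stdin and report which madwifi flavour is loaded,
# going by the Ubuntu module names mentioned in this thread:
# new_ath_pci = madwifi-ng, ath_pci = madwifi (or a self-built madwifi-ng).
# madwifi_flavour is a hypothetical helper name used only for this sketch.
madwifi_flavour() {
    awk '
        $1 == "new_ath_pci" { ng = 1 }
        $1 == "ath_pci"     { old = 1 }
        END {
            if (ng)       print "madwifi-ng"
            else if (old) print "madwifi"
            else          print "none"
        }'
}

# Example:
#   lsmod | madwifi_flavour
```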

Cheers.

Michel Lesoinne (michel-colorado) wrote :

I downloaded and installed madwifi-ng-r1686-20060715, and it created ath_* modules, so do I need the patch or not? At least now, without NetworkManager and without WPA, the connection does not die like it did before.

Revision history for this message
Michel Lesoinne (michel-colorado) wrote :

Still, as soon as NetworkManager starts, it puts the wifi into a weird state and never connects. I have to kill NetworkManager, unload the modules and reload them to be able to connect manually. Soren, do you have a package of what you did that I can install?

Revision history for this message
Soren Hansen (soren) wrote :

On Tue, Jul 18, 2006 at 02:28:44AM -0000, Michel Lesoinne wrote:
> Still as soon as NetworkManager starts it gets the wifi in a weird state
> and never connects. I have to kill NetworkManager, unload the modules
> and reload them to be able to connect manually. Soren, do you have a
> package of what you did that I can install?

Yes. It can be found here:
http://www.linux2go.dk/ubuntu/pool/main/m/madwifi-ng-default/madwifi-ng-default_1.0_all.deb

also, the patched wpasupplicant and NetworkManager:
http://www.linux2go.dk/ubuntu/pool/main/n/network-manager/network-manager_0.6.2-0ubuntu7linux2go2_i386.deb
http://www.linux2go.dk/ubuntu/pool/main/w/wpasupplicant/wpasupplicant_0.4.8-3ubuntu1.1linux2go1_i386.deb

the really easy way is to add:

deb http://www.linux2go.dk/ubuntu dapper main

to your sources.list and run:

$ sudo apt-get install madwifi-ng-default wpasupplicant network-manager

IMPORTANT: For these packages to work, you need to have the
linux-restricted-modules installed and NOT be using the madwifi-ng
drivers from SVN as there are naming differences.

Cheers!

Revision history for this message
Chris Jones (cmsj) wrote :

I have a desktop system (AMD64 on an ABit AV8) which has an atheros chipset (pci id 168c:0013) and it works completely reliably, the machine simply never crashes.
However, I have also just acquired a Thinkpad X40 with the Atheros chipset, and I have seen a few crashes where the whole system hung. I'm running network-manager on it and I haven't noticed any particular link between the crashes and anything I was doing, but SysRq does still work. Next time I will try to get a dump out of it, but usually it crashes while I'm in X and all I can do is alt-sysrq-b.

Revision history for this message
Michel Lesoinne (michel-colorado) wrote :

Soren,

Thanks, it now works. But it is subject to an annoying behavior that others have observed: NetworkManager continuously disconnects and reconnects. Apparently the way to fix it is to stop NetworkManager from continuously scanning...
How does one do that?

Revision history for this message
Soren Hansen (soren) wrote :

On Wed, Jul 19, 2006 at 02:22:36PM -0000, Michel Lesoinne wrote:
> Apparently the way to get it fixed is stop NetworkManager from
> continuously scanning... How does one do that?

I don't know. I have not experienced this behaviour.

Revision history for this message
Denes Kiss (kiss-denes) wrote :

Very similar problem.
My Compaq EVO 110 laptop was rock solid with Breezy. With Dapper it locks up randomly when I use Firefox. No mouse, no keyboard, no message of any kind. I use an ATI Radeon Mobility with the proprietary ATI driver. I have the latest updates installed, but this bug has persisted from the beginning. I have tried both the 386 and 686 kernels, and both the open and proprietary drivers. No change.
I have no idea what to try yet.
Denes

Revision history for this message
Chris Jones (cmsj) wrote :

Just to add some more notes, I've had the crash happen a few more times and investigated more closely what's going on.
I enabled the wireless LED a few days ago and I've noticed that when the machine hangs, the LED is on and stays on, so I am tempted to guess that it's hanging inside a wireless send/receive type function.

In terms of SysRq commands, there's not much that seems to work. I tried to kill X, dump some stats and do a "safe" sync, unmount, reboot, but the only one that works is the reboot. I would hazard a guess that this is because the kernel is busy waiting for the wireless driver to finish doing whatever it's doing.

However, all of this makes it extraordinarily difficult to track down what's really going on, especially with no serial port. Can anyone think of some other way(s) we can collect information about what failed?

Revision history for this message
Soren Hansen (soren) wrote :

On Tue, Jul 25, 2006 at 11:10:11AM -0000, Chris Jones wrote:
> However, all of this makes it extraordinarily difficult to track down
> what's really going on, especially with no serial port. Can anyone
> think of some other way(s) we can collect information about what
> failed?

There's always netconsole.. Via the wired interface, of course. :-)

Cheers.
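
For anyone who wants to try netconsole, here is a rough sketch of how the module parameter is put together. The helper name netconsole_param and all addresses below are placeholders for illustration; 6665 and 6666 are netconsole's default source and target ports. The script only prints the modprobe command rather than running it.

```shell
#!/bin/sh
# Build the netconsole module parameter, which has the form:
#   <src-port>@<src-ip>/<dev>,<tgt-port>@<tgt-ip>/<tgt-mac>
# netconsole_param is a hypothetical helper for illustration only.
netconsole_param() {
    # $1 = source IP of the crashing machine, $2 = its wired interface,
    # $3 = IP of the machine collecting the log, $4 = that machine's MAC
    printf '%s@%s/%s,%s@%s/%s\n' 6665 "$1" "$2" 6666 "$3" "$4"
}

# Placeholder addresses; substitute the ones from your own network.
param=$(netconsole_param 192.168.1.10 eth0 192.168.1.2 00:11:22:33:44:55)

# Print the command to run on the crashing machine.
echo "sudo modprobe netconsole netconsole=$param"
```

On the receiving machine, something like `nc -u -l -p 6666` (traditional netcat; the exact flags differ between netcat variants) should show the kernel messages as they arrive, provided the console log level on the sender is high enough.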

Revision history for this message
Matt Zimmerman (mdz) wrote :

Has anyone experiencing this bug tried 2.6.17 in edgy?

Revision history for this message
Soren Hansen (soren) wrote :

On Wed, Aug 23, 2006 at 10:40:30AM -0000, Matt Zimmerman wrote:
> Has anyone experiencing this bug tried 2.6.17 in edgy?

We've switched to madwifi-ng in Edgy, so the problem is gone there. It
only happens with madwifi-old.

Revision history for this message
Allison Karlitskaya (desrt) wrote :

so I get a very similar bug on edgy with madwifi-ng (but i've always been a madwifi-ng user because even in dapper my card wasn't supported by the old driver).

for me the lockup happens when joining or leaving networks. the keyboard is pwned. the mouse still works (moves around) but i can't get too far with that. clicking on stuff generally doesn't work.

i know no way to get a dmesg of the problem.

Revision history for this message
Chris Jones (cmsj) wrote :

I've just started using my X40 fulltime for work and I've had 4 lockups so far today. I have just installed Soren's madwifi-ng (and related) packages as described above because I have to have this working, but it still would be really nice if the bug could be fixed in the version that dapper uses. I'm not terribly keen on upgrading the laptop to Edgy, even when released because I want this to be rock solid.

Revision history for this message
tuxo (beat-fasel) wrote :

Dapper has been out for almost 4 months now, and I still cannot use my wireless, as it randomly freezes my computer. This bug is labeled critical, but no official fix has been released yet.

As I got tired of the Ethernet cable crossing our whole apartment and, like Chris Jones, am not keen on installing Edgy on my main production machine (stability is also my number one criterion), I tried to install the packages generously provided by the original bug reporter, Soren Hansen. These packages allow you to use the madwifi-ng kernel drivers combined with a patched network-manager.

I added the following line to my apt sources list:
deb http://www.linux2go.dk/ubuntu dapper main

I did an apt-get update and apt-get upgrade, which installed the following packages:
madwifi-ng-default wpasupplicant network-manager libnm-util0 network-manager-gnome

So far so good.

However, after rebooting, the machine hung at the hardware detection point. After a long time it went past this point and booted into X. However, I was not able to get the wireless network running.

The new atheros kernel modules were loaded:
[~]>lsmod | grep ath
new_ath_pci 98980 0
new_ath_rate_sample 14048 1 new_ath_pci
new_wlan 205276 4 new_wlan_scan_sta,new_ath_pci,new_ath_rate_sample
new_ath_hal 191344 3 new_ath_pci,new_ath_rate_sample

and the ath0 device showed up with ifconfig.

As mentioned by Ankur Kotwal earlier in this thread, I tried commenting out the contents of the file /etc/udev/rules.d/25-iftab.rules. The machine no longer hung at the hardware detection point, but the wireless network still didn't work.

Grepping the kern.log file for ath leads to following entries during booting:

--------------------- snip --------------------
Sep 26 17:41:57 localhost kernel: [17179594.548000] ath_hal: module license 'Proprietary' taints kernel.
Sep 26 17:41:57 localhost kernel: [17179594.736000] ath_hal: 0.9.14.9 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413)
Sep 26 17:41:57 localhost kernel: [17179594.832000] ath_rate_sample: 1.2
Sep 26 17:41:57 localhost kernel: [17179594.848000] ath_pci: 0.9.6.0 (EXPERIMENTAL)
Sep 26 17:41:57 localhost kernel: [17179595.348000] ath0: 11b rates: 1Mbps 2Mbps 5.5Mbps 11Mbps
Sep 26 17:41:57 localhost kernel: [17179595.348000] ath0: 11g rates: 1Mbps 2Mbps 5.5Mbps 11Mbps 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps
Sep 26 17:41:57 localhost kernel: [17179595.348000] ath0: turboG rates: 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps
Sep 26 17:41:57 localhost kernel: [17179595.348000] ath0: H/W encryption support: WEP AES AES_CCM TKIP
Sep 26 17:41:57 localhost kernel: [17179595.348000] ath0: mac 5.9 phy 4.3 radio 4.6
Sep 26 17:41:57 localhost kernel: [17179595.348000] ath0: Use hw queue 1 for WME_AC_BE traffic
Sep 26 17:41:57 localhost kernel: [17179595.348000] ath0: Use hw queue 0 for WME_AC_BK traffic
Sep 26 17:41:57 localhost kernel: [17179595.348000] ath0: Use hw queue 2 for WME_AC_VI traffic
Sep 26 17:41:57 localhost kernel: [17179595.348000] ath0: Use hw queue 3 for WME_AC_VO traffic
Sep 26 17:41:57 localhost kernel: [17179595.348000] ath0: Use hw queue 8 for CAB traffic
Sep 26 17:41:57 localhost k...


Revision history for this message
Denes Kiss (kiss-denes) wrote :

My laptop hangs randomly with Ubuntu 6.06. I tried everything for months, with no result.
I installed Kubuntu 6.06 and, interestingly, it has been working fine. No more hangs. I do not see any differences in the configuration.
It was rock solid with Ubuntu 5.10 too.
Compaq Evo160, Ati Mobility card. I use two wired ethernet interfaces for routing.
KD

Revision history for this message
Shaun Crampton (fasaxc) wrote :

I solved some random wifi-related lockups by uninstalling network-manager and directly configuring wpa_supplicant. I added a bit to the ubuntu wiki https://help.ubuntu.com/community/WifiDocs/WPAHowTo. Basically I just start wpa_supplicant with a pre-up stanza in /etc/network/interfaces rather than using network-manager. This would make sense if KDE doesn't suffer lockups if nm-applet and/or gnome-keyring-daemon etc are at fault.
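
A rough idea of what such a stanza could look like is sketched below. This is not copied from the wiki page: the interface name ath0, the madwifi driver option and the config path are assumptions, and print_wpa_stanza is a made-up helper that only prints the text so it can be reviewed before being added to /etc/network/interfaces.

```shell
#!/bin/sh
# Print an /etc/network/interfaces stanza that starts wpa_supplicant
# before the interface is brought up. The interface name, driver name
# and config path are illustrative assumptions; adjust for your system.
print_wpa_stanza() {
    iface=$1   # wireless interface, e.g. ath0
    conf=$2    # path to the wpa_supplicant configuration file
    cat <<EOF
auto $iface
iface $iface inet dhcp
    pre-up wpa_supplicant -B -D madwifi -i $iface -c $conf
    post-down killall -q wpa_supplicant
EOF
}

print_wpa_stanza ath0 /etc/wpa_supplicant.conf
```

Depending on how quickly the card associates, DHCP may start before the WPA handshake has finished, so a short delay after the pre-up line might be needed.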

Revision history for this message
Chris Jones (cmsj) wrote :

I see someone just added this as also affecting l-r-m-2.6.17, ie edgy. I would like to refute this as my atheros based thinkpad has been solidly stable since I switched to madwifi-ng (first in dapper, then by upgrading to edgy).

Juanje - what leads you to add 2.6.17?

Revision history for this message
CNBorn (cnborn) wrote :

Also have this problem on my PentiumIII 866 + GA-60XE(815EP)
But i don't have any wireless card on this machine.
just a 3com ethernet adapter.

just simply Dapper Server 6.06.1 + Kernel 2.6.15 server
running amule+azureus.

It randomly shows "BUG: soft lockup detected on CPU#0" and then hangs.
The sysrq functions aren't much use, except for reboot...

Memtest+ reports lots of errors on this machine, but I don't think that is the problem: before these soft lockups started, the machine had been tested for a few days and turned out all right.

The soft lockups only started in recent days. I don't know why...

Revision history for this message
Soren Hansen (soren) wrote :

On Tue, Nov 14, 2006 at 02:58:46PM -0000, CNBorn wrote:
> Also have this problem on my PentiumIII 866 + GA-60XE(815EP)
> But i don't have any wireless card on this machine.
> just a 3com ethernet adapter.

Then you're experiencing a different bug.

> This machine running Memtest+ and have lots of errors, but i don't
> think that is the problem, before this softlockup state comes , this
> machine had been tested for few days , and turned all right.

If Memtest gives you errors, you most certainly have a problem and it
might very well be the cause of this. You should identify your faulty
RAM and throw it far away. :-)

Revision history for this message
GlebV (gvolodin) wrote :

I've experienced the same bug, but only since the last software update. About three times in a row the computer worked for 10-15 minutes and then hung. The first time I hadn't even done anything; I wasn't even logged in, because I had left the computer for a while. The next day this bug occurred only once, and it hasn't occurred during the last two days. I was connected to the Net, but my connection is not wireless.

Revision history for this message
Soren Hansen (soren) wrote :

This bug does not exist in 2.6.17 (Edgy kernel) since it's a madwifi-old issue and Edgy uses madwifi-ng.

Changed in linux-restricted-modules-2.6.17:
status: Unconfirmed → Fix Released
Revision history for this message
Chris Jones (cmsj) wrote :

GlebV: This is not the bug you are looking for. This one is specific to the madwifi wireless card drivers, so if you are having system lockups, it is a separate bug.

Revision history for this message
tuxo (beat-fasel) wrote :

> ** Changed in: linux-restricted-modules-2.6.17 (Ubuntu)
> Status: Unconfirmed => Fix Released
As this bug was never fixed in Dapper (LTS anyone?), I freshly installed Edgy. However, with the madwifi-ng driver present there, my Wifi card with following ID did not even work:

$ lspci -n | grep $(lspci | grep -i atheros | cut -f1 -d' ') | cut -f3 -d' '
168c:0013

I have three computers using the same Wifi card. Not expecting this bug ever getting fixed, I bought three new wireless cards with a Ralink chip that feature an open source driver. My wireless woes are gone now, however, I am very unhappy about the state of how critical bugs are (not) handled in Ubuntu.

Revision history for this message
Soren Hansen (soren) wrote :

On Mon, Nov 27, 2006 at 09:41:36AM -0000, tuxo wrote:
> As this bug was never fixed in Dapper (LTS anyone?), I freshly
> installed Edgy. However, with the madwifi-ng driver present there, my
> Wifi card with following ID did not even work:
>
> $ lspci -n | grep $(lspci | grep -i atheros | cut -f1 -d' ') | cut -f3 -d' '
> 168c:0013

Odd, that's the exact same card I have. Oh, did you still have my
madwifi-ng-default package installed? That might break stuff because the
modules were renamed.

Cheers.

Revision history for this message
tuxo (beat-fasel) wrote :

--- Soren Hansen <email address hidden> wrote:

> Odd, that's the exact same card I have. Oh, did you
> still have my
> madwifi-ng-default package installed? That might
> break stuff because the
> modules were renamed.

Thanks for this info, Soren! Indeed that's very odd,
as I tested the same wifi card (I have two of them,
PCI versions) on another computer, doing a fresh
install of Edgy. Neither with the Live CD, nor the
installed version of Edgy could I get this Wifi card
to work at all. The restricted package was installed
and the kernel modules were loaded fine, I could even
see my router when using network-manager, but I could
not connect to it. By the way, on this computer was
previously installed Warty and there the Wifi card
worked fine.

I did the same test also on the other computer, where
I originally installed your madwifi-ng packages, using
the Edgy Live CD (so your madwifi package cannot be
the culprit), but I could neither get a connection to
my router. After upgrading this machine from dapper to
edgy, it still did not work (I made sure to deinstall
your packages and reinstall the original packages
beforehand).

Thanks,
Beat


Revision history for this message
Ramin Nakisa (ramin-nakisa) wrote :

This is still broken in 2.6.17. I get random lockups in Edgy. I've got an atheros card, so lspci -n gives 168c:0013. When I boot I get an error:

ERROR
Resource Conflict - PCI Network Controller in slot 01
   Bus:02, Device:02, Function:00

Before it hangs /var/log/messages contains this error:

Mar 25 17:51:37 localhost kernel: [17184040.904000] wifi0: hardware error; reseting
Mar 25 17:51:38 localhost kernel: [17184041.564000] wifi0: ath_chan_set: unable to reset channel 7 (2442Mhz) flags 0xc0 'Hardware didn't respond as expected' (HAL status 3)

The kernel version is 2.6.17-11-386 (version #2).

My laptop is almost unusable. It works for a while then randomly hangs. Perhaps you can fix this before we get Feisty? This has been dragging on for so long without a fix.

Revision history for this message
Soren Hansen (soren) wrote :

On Sun, Mar 25, 2007 at 08:48:20PM -0000, Ramin Nakisa wrote:
> This is still broken in 2.6.17. I get random lockups in Edgy. I've got
> an atheros card, so lspci -n gives 168c:0013. When I boot I get an
> error:

I'm afraid you're experiencing a different issue. This lockup is fixed
in madwifi-ng which is the default in Edgy. The bug is still open in
Dapper, though.

> My laptop is almost unusable. It works for a while then randomly
> hangs. Perhaps you can fix this before we get Feisty? This has been
> dragging on for so long without a fix.

Could you please test if it's fixed in Feisty?

Revision history for this message
Ramin Nakisa (ramin-nakisa) wrote :

On Sunday 25 March 2007 20:58, Soren Hansen wrote:
> On Sun, Mar 25, 2007 at 08:48:20PM -0000, Ramin Nakisa wrote:
> > This is still broken in 2.6.17. I get random lockups in Edgy. I've got
> > an atheros card, so lspci -n gives 168c:0013. When I boot I get an
> > error:
>
> I'm afraid you're experiencing a different issue. This lockup is fixed
> in madwifi-ng which is the default in Edgy. The bug is still open in
> Dapper, though.
>
> > My laptop is almost unusable. It works for a while then randomly
> > hangs. Perhaps you can fix this before we get Feisty? This has been
> > dragging on for so long without a fix.
>
> Could you please test if it's fixed in Feisty?

Hi Soren,

How do I upgrade to Feisty before the official release? I don't suppose it
could be more unstable than my current Ubuntu install!

This sounds very similar to the previous problem (X40, exactly the same
network card) so how do you know it's a different issue?

Thanks,

Ramin.

Revision history for this message
Soren Hansen (soren) wrote :

On Sun, Mar 25, 2007 at 09:43:04PM -0000, Ramin Nakisa wrote:
> > Could you please test if it's fixed in Feisty?
> How do I upgrade to Feisty before the official release? I don't
> suppose it could be more unstable than my current Ubuntu install!

https://help.ubuntu.com/community/FeistyUpgrades

> This sounds very similar to the previous problem (X40, exactly the
> same network card) so how do you know it's a different issue?

Because the issue in this bug report has to do with broken locking in
the madwifi-old driver. Since Edgy (and Feisty) use madwifi-ng, where
this particular bug is fixed, and you're running edgy, it must be a
different issue.

Revision history for this message
Ramin Nakisa (ramin-nakisa) wrote :

It is still broken in Feisty. I did a clean install from the latest ISO, did
all the updates. Then it crashed with this error in /var/log/messages:

wifi0: hardware error; resetting
wifi0: ath_reset: unable to reset hardware: 'Hardware didn't respond as
expected' (HAL status 3)

Revision history for this message
Soren Hansen (soren) wrote :

On Tue, Mar 27, 2007 at 05:47:22PM -0000, Ramin Nakisa wrote:
> It is still broken in Feisty. I did a clean install from the latest ISO, did
> all the updates. Then it crashed with this error in /var/log/messages:

> wifi0: hardware error; resetting
> wifi0: ath_reset: unable to reset hardware: 'Hardware didn't respond as
> expected' (HAL status 3)

Again: This is not the same issue. Could you please open another bug, so
we can track them separately?

https://launchpad.net/ubuntu/+source/linux-source-2.6.20/+filebug

For a start, please attach a complete dmesg and an "lspci -vv" to the
new bug report.

Thanks!

Revision history for this message
Erik Andrén (erik-andren) wrote :

Is this still an issue in gutsy?
Please try with a live cd of the latest (tribe 5) release:
http://cdimage.ubuntu.com/releases/gutsy/tribe-5/

Revision history for this message
Chris Jones (cmsj) wrote :

Erik: this is purely an issue in dapper. It was fixed in edgy because it was fixed in madwifi-ng and not in madwifi.

Revision history for this message
tuxo (beat-fasel) wrote :

As of Gutsy Gibbon, this bug seems to be fixed for me too.

Revision history for this message
Luca Lorenzetto (lorenzetto-luca) wrote :

me too, no problems related to madwifi.

--
"E' assurdo impiegare gli uomini di intelligenza eccellente per fare
calcoli che potrebbero essere affidati a chiunque se si usassero delle
macchine"
Gottfried Wilhelm von Leibnitz, Filosofo e Matematico (1646-1716)

Luca Lorenzetto, http://www.dancetj.net , <email address hidden>

Revision history for this message
Mircea Deaconu (mirceade) wrote :

Hello! First of all let me say I truly regret doing this. This is a spam message sent to all critical bug message lists. Its purpose: making this (https://bugs.launchpad.net/ubuntu/+source/acpi-support/+bug/59695) bug critical too. This is a long-standing bug and has a very serious impact on laptop-type hardware. Its priority is set to "wishlist" and I just cannot take this anymore. I DO NOT CARE if my account gets suspended. I am doing what's right for all my friends using Ubuntu on their laptops.

Revision history for this message
Chris Jones (cmsj) wrote :

Hi

Mircea Deaconu wrote:
> making this (https://bugs.launchpad.net/ubuntu/+source/acpi-
> support/+bug/59695) bug critical too. This is a long standing bug and
> has a very serious impact on laptop type of hardware. It's priority is

This is simply incorrect. I have just commented on the bug to explain why.

Please do not spam legitimate bugs. There are proper processes for
raising technical issues within Ubuntu.

Cheers,
--
Chris Jones
  <email address hidden>
   www.tenshu.net

Changed in linux-restricted-modules-2.6.15:
importance: Undecided → Critical
status: New → Confirmed
Changed in linux-restricted-modules-2.6.17:
status: New → Fix Released
Revision history for this message
matteroso (ttammar) wrote :

I'm having a similar problem in Feisty with the Madwifi-ng driver on a TrendNet TEW-443PI
Already ran memtest - no problems there. Here's what the kern.log outputs before I have to reboot:

Kern.log:
Apr 13 22:09:40 Sparta kernel: [ 8413.883195] BUG: soft lockup detected on CPU#1!
Apr 13 22:09:40 Sparta kernel: [ 8413.883223] [softlockup_tick+156/240] softlockup_tick+0x9c/0xf0
Apr 13 22:09:40 Sparta kernel: [ 8413.883241] [update_process_times+51/128] update_process_times+0x33/0x80
Apr 13 22:09:40 Sparta kernel: [ 8413.883250] [smp_apic_timer_interrupt+112/128] smp_apic_timer_interrupt+0x70/0x80
Apr 13 22:09:40 Sparta kernel: [ 8413.883258] [apic_timer_interrupt+40/48] apic_timer_interrupt+0x28/0x30
Apr 13 22:09:40 Sparta kernel: [ 8413.883271] [_spin_lock_irqsave+47/80] _spin_lock_irqsave+0x2f/0x50
Apr 13 22:09:40 Sparta kernel: [ 8413.883280] [<f89e28f4>] ieee80211_free_node+0x24/0x80 [wlan]
Apr 13 22:09:40 Sparta kernel: [ 8413.883308] [<f89dd6d7>] ieee80211_input_all+0x57/0x90 [wlan]
Apr 13 22:09:40 Sparta kernel: [ 8413.883333] [<f8b21fe1>] ath_rx_tasklet+0x751/0x7e0 [ath_pci]
Apr 13 22:09:40 Sparta kernel: [ 8413.883353] [tasklet_action+99/224] tasklet_action+0x63/0xe0
Apr 13 22:09:40 Sparta kernel: [ 8413.883362] [__do_softirq+130/256] __do_softirq+0x82/0x100
Apr 13 22:09:40 Sparta kernel: [ 8413.883372] [do_softirq+85/96] do_softirq+0x55/0x60
Apr 13 22:09:40 Sparta kernel: [ 8413.883379] [do_IRQ+69/128] do_IRQ+0x45/0x80
Apr 13 22:09:40 Sparta kernel: [ 8413.883387] [common_interrupt+35/48] common_interrupt+0x23/0x30
Apr 13 22:09:40 Sparta kernel: [ 8413.883399] [<f8a2bff0>] zz033ebfbf+0xc0/0x148 [ath_hal]
Apr 13 22:09:40 Sparta kernel: [ 8413.883414] [<f8a3f3b8>] zz0b709eff+0x38/0x48 [ath_hal]
Apr 13 22:09:40 Sparta kernel: [ 8413.883430] [<f8b16b77>] ath_txq_update+0x87/0xe0 [ath_pci]
Apr 13 22:09:40 Sparta kernel: [ 8413.883449] [<f8a2ba46>] zz0067d221+0x1a/0x34 [ath_hal]
Apr 13 22:09:40 Sparta kernel: [ 8413.883463] [<f8b16bef>] ath_wme_update+0x1f/0x90 [ath_pci]
Apr 13 22:09:40 Sparta kernel: [ 8413.883474] [<f89e9ac3>] ieee80211_wme_updateparams_locked+0x123/0x210 [wlan]
Apr 13 22:09:40 Sparta kernel: [ 8413.883499] [<f89ea141>] ieee80211_wme_initparams_locked+0x41/0x170 [wlan]
Apr 13 22:09:40 Sparta kernel: [ 8413.883524] [<f89eb521>] ieee80211_wme_initparams+0x21/0x40 [wlan]
Apr 13 22:09:40 Sparta kernel: [ 8413.883546] [<f89e3a00>] ieee80211_sta_join1+0xa0/0x200 [wlan]
Apr 13 22:09:40 Sparta kernel: [ 8413.883567] [<f89ee1d0>] mlmelookup+0x0/0x60 [wlan]
Apr 13 22:09:40 Sparta kernel: [ 8413.883589] [<f89ee1d0>] mlmelookup+0x0/0x60 [wlan]
Apr 13 22:09:40 Sparta kernel: [ 8413.883614] [<f89eee2f>] ieee80211_ioctl_setmlme+0xbf/0x250 [wlan]
Apr 13 22:09:40 Sparta kernel: [ 8413.883641] [wireless_process_ioctl+701/992] wireless_process_ioctl+0x2bd/0x3e0
Apr 13 22:09:40 Sparta kernel: [ 8413.883650] [<f89eed70>] ieee80211_ioctl_setmlme+0x0/0x250 [wlan]
Apr 13 22:09:40 Sparta kernel: [ 8413.883673] [sock_ioctl+0/528] sock_ioctl+0x0/0x210
Apr 13 22:09:40 Sparta kernel: [ 8413.883681] [dev_ioctl+531/896] dev_ioctl+0x213/0x380
Apr 13 22:09:40 Sparta kernel: [ 8413.883697] [sock_ioctl+0/528] sock_ioctl+0x0/0...


Revision history for this message
Tim Gardner (timg-tpi) wrote :

In order to get wireless stack bug fixes you are going to have to upgrade to a more recent release.

Changed in linux-restricted-modules-2.6.15:
status: Confirmed → Won't Fix
status: Confirmed → Won't Fix