Random Kernel Panics on Dell Latitude 2100 on Karmic

Bug #412704 reported by nh2
This bug affects 2 people
Affects: linux (Ubuntu)
Status: Won't Fix
Importance: Medium
Assigned to: Stefan Bader
Nominated for Jaunty by qbanin

Bug Description

While the preinstalled Ubuntu Jaunty runs fine (well done, Dell!) on my Latitude 2100, I experience random hard kernel panics with Ubuntu 9.10.

It seems the problem has something to do with network activity. I experience it on:
- Firefox browsing
- copying files from/to a Samba share with Nautilus
- even on file transfers through sshfs and scp.

Unfortunately, the bug is only semi-reproducible: an scp, for example, fails in about one out of five cases.
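
A failure rate like this can be estimated with a small loop. The following is only a sketch and not part of the original report: the helper name `count_failures` is made up for illustration, and it can only count runs that return an exit status, so a hard freeze will stop the loop as well.

```shell
#!/bin/sh
# count_failures N CMD [ARGS...]
# Hypothetical helper: run CMD N times and print how many runs
# exited with a non-zero status.
count_failures() {
    runs=$1
    shift
    failures=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Discard the command's own output; only the exit status matters.
        if ! "$@" >/dev/null 2>&1; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures"
}
```

It could be called as `count_failures 20 scp ./testfile user@host:/tmp/`, where the file, user, and host are placeholders.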

I neither know which part of the system is responsible for these panics nor how to identify the error, since the system just freezes and no information is printed to the console or /var/log/messages. As I have KMS enabled, the last thing displayed is my mouse cursor inside an empty TTY console (which is very interesting, too). Disabling KMS did not bring up any further information.

Does anyone have a suggestion on how to identify the error?

affects: ubuntu → linux (Ubuntu)
Leann Ogasawara (leannogasawara) wrote :

Please run the following command which will automatically gather and attach general kernel debug information:

apport-collect -p linux 412704

Unfortunately, without seeing the output of the kernel panic, it'll be hard to nail down what's going wrong here. You may want to try the latest mainline kernel build as well, just to verify whether this issue exists with the upstream kernel. See https://wiki.ubuntu.com/KernelTeam/MainlineBuilds . Let us know your results. Thanks.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: kj-triage needs-kernel-logs needs-upstream-testing
nh2 (nh2) wrote : apport-collect data

AplayDevices:
 **** List of PLAYBACK Hardware Devices ****
 card 0: Intel [HDA Intel], device 0: ALC272 Analog [ALC272 Analog]
   Subdevices: 1/1
   Subdevice #0: subdevice #0
Architecture: i386
ArecordDevices:
 **** List of CAPTURE Hardware Devices ****
 card 0: Intel [HDA Intel], device 0: ALC272 Analog [ALC272 Analog]
   Subdevices: 1/1
   Subdevice #0: subdevice #0
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: niklas 3074 F.... pulseaudio
CRDA: Error: [Errno 2] No such file or directory
Card0.Amixer.info:
 Card hw:0 'Intel'/'HDA Intel at 0xf6ebc000 irq 21'
   Mixer name : 'Realtek ALC272'
   Components : 'HDA:10ec0272,102802d6,00100001'
   Controls : 9
   Simple ctrls : 5
DistroRelease: Ubuntu 9.10
HibernationDevice: RESUME=UUID=ba53b338-1b61-42d4-88dc-90466beb1620
MachineType: Dell Inc. Latitude 2100
Package: linux (not installed)
ProcCmdLine: BOOT_IMAGE=/boot/vmlinuz-2.6.31-5-generic root=UUID=86e8116d-5d76-4a71-a5fd-a1f3ddf83af9 ro splash
ProcEnviron:
 LANG=de_DE.UTF-8
 SHELL=/usr/bin/zsh
ProcVersionSignature: Ubuntu 2.6.31-5.24-generic
RelatedPackageVersions:
 linux-backports-modules-2.6.31-5-generic N/A
 linux-firmware 1.15
Uname: Linux 2.6.31-5-generic i686
UserGroups: adm admin cdrom dialout lpadmin plugdev sambashare
dmi.bios.date: 06/02/2009
dmi.bios.vendor: Dell Inc.
dmi.bios.version: A01
dmi.board.name: 0W786N
dmi.board.vendor: Dell Inc.
dmi.chassis.type: 8
dmi.chassis.vendor: Dell Inc.
dmi.modalias: dmi:bvnDellInc.:bvrA01:bd06/02/2009:svnDellInc.:pnLatitude2100:pvr:rvnDellInc.:rn0W786N:rvr:cvnDellInc.:ct8:cvr:
dmi.product.name: Latitude 2100
dmi.sys.vendor: Dell Inc.

nh2 (nh2) wrote :

The panic appeared while apport was uploading, so I'll try it again.

nh2 (nh2) wrote :

Since that abort, apport refuses to work, so I opened Bug 414055 in the hope of getting a hint on how to report with apport again.

Furthermore, I tried the upstream kernel 2.6.31-rc6 without luck: the panic keeps appearing.

nh2 (nh2) wrote :

OK, this seems to be the WiFi driver, since the panics stop happening on a wired network.
As lspci reports, it is an Intel Corporation Wireless WiFi Link 5100.
I also got a printout (photo attached) using scp on TTY 1.

Is this enough information to narrow it down (I still have no idea how to record the full panic output)? Shall I try out some other kernels/drivers/whatever?

Thanks.

nh2 (nh2) wrote :

I've tried the whole thing on 2.6.30.5:

I scp'd something to another computer. The first two times, it worked flawlessly, but the third time the following appeared:
[ 314.757531] iwlagn 000:0c:00.0: iwl_tx_agg_start on ra = 00:1f:3f:d4:e8:c9 tid = 0
[ 314.757595] iwlagn 000:0c:00.0: HW queue is empty

I scp'd again, this time:
[ 395.230500] iwlagn 000:0c:00.0: Microcode SW error detected. Restarting 0x92000000.

Then, wifi was down. Pinging a computer in my network resulted in:
[ 541.806238] iwlagn 000:0c:00.0: Invalid station for AGG tid 0
[ 541.807424] iwlagn 000:0c:00.0: Invalid station for AGG tid 0
[ 541.828119] iwlagn 000:0c:00.0: mac90211-phy0: failed to remove key (0, 00:1f:3f:d4:e8:c9) from hardware (-22)

After a minute, wifi was back.

Note that on 2.6.30.5, the computer printed these messages but did not freeze.

In 2.6.29, there is no error at all, so I suspect this is a regression.
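
When comparing kernels like this, it can help to pull only the relevant driver messages out of the kernel log. The following is a rough sketch, not something from the original comment: the function name `iwl_errors` is invented here, and the patterns are simply the two error messages quoted above.

```shell
#!/bin/sh
# iwl_errors: read kernel log text on stdin and print only the iwlagn
# lines that contain one of the error messages quoted in this report.
# (Hypothetical helper; extend the pattern list as needed.)
iwl_errors() {
    grep 'iwlagn' | grep -E 'Microcode SW error|Invalid station for AGG'
}
```

A typical call would be `dmesg | iwl_errors`, run once per kernel version under test.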

tags: removed: needs-kernel-logs
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Changed in linux (Ubuntu):
importance: Undecided → Medium
tags: added: regression-release
tags: added: regression-potential
removed: regression-release
Stefan Bader (smb) wrote :

The linux-backports-modules package has newer versions of the wireless drivers (following compat-wireless). Could you give those a try to see whether this works better for you?

Changed in linux (Ubuntu):
assignee: nobody → Stefan Bader (stefan-bader-canonical)
nh2 (nh2) wrote :

Yes, linux-backports-modules works better for me: I get a microcode error instead of a panic.

I have opened a bug on the Intel Wireless Linux Bugzilla: http://bugzilla.intellinuxwireless.org/show_bug.cgi?id=2077

The panic bug is in Linux 2.6.31, but it is fixed (replaced by a microcode error) in newer versions of the WiFi driver (see the upstream bug report for details). This is why the panic appears in neither linux-backports-modules nor Linux 2.6.32-rc1.

The Intel devs have suggested that the fixing commit may eventually be ported to Linux 2.6.31. In that case, no Ubuntu intervention will be needed; otherwise, we will probably have to patch the driver ourselves (which should be easy, as we know the commit).

I will keep this bug up to date.

qbanin (qbanin) wrote :

I think this issue is somehow related to interrupt sharing. I have an Nvidia 8600M graphics card, and my problems with microcode errors and networking restarts started when I added the line "options nvidia NVreg_EnableMSI=1" to /etc/modprobe.d/options.conf.

After disabling this option, I no longer get "iwlagn 000:0c:00.0: Microcode SW error detected. Restarting 0x92000000" messages.
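
To check whether a system has this option active, something like the following could be used. It is only a sketch: the function name `msi_option_set` is made up, and it looks for an uncommented line of exactly the form quoted above in whichever files are passed to it.

```shell
#!/bin/sh
# msi_option_set FILE...
# Hypothetical helper: succeeds (exit status 0) if any of the given files
# contains an uncommented "options nvidia ... NVreg_EnableMSI=1" line.
msi_option_set() {
    grep -Eq '^[[:space:]]*options[[:space:]]+nvidia[[:space:]].*NVreg_EnableMSI=1' "$@"
}
```

For example: `msi_option_set /etc/modprobe.d/*.conf && echo "MSI is enabled for nvidia"`. Commenting the line out with a leading `#` is enough for the check to fail.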

Stefan Bader (smb) wrote :

As the panic is fixed with the l-b-m version of the wireless driver and the remaining firmware errors seem to be avoidable by not using MSI, I will close this bug.

Changed in linux (Ubuntu):
status: Confirmed → Won't Fix
nh2 (nh2) wrote :

This is fixed in Ubuntu 10.04 and 10.10.
