Kernel oops at ahci_deinit_port

Bug #663182 reported by Christophe Dumez
This bug affects 7 people
Affects: linux (Ubuntu)
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

I upgraded my Dell T3500 desktop from Lucid to Maverick today. Unfortunately, I get a kernel oops on startup with the new kernel:

[ 18.357380] Stack:
[ 18.357436] 00000001 fbadad12 f6f4dd78 f8427d6d f844e000 f6f4dd78 f6df630c 0000001e
[ 18.357768] <0> f844e000 f6f4ddc0 f842856b f842a2ef f8444d0c f75c2920 f842a205 fffffffb
[ 18.358189] <0> fffffffb 0000001e f842a205 f75c2920 f844e000 f75f4060 f844e008 00000000
[ 18.358663] Call Trace:
[ 18.358722] [<f8427d6d>] ? ahci_deinit_port+0x1d/0x90 [libahci]
[ 18.358787] [<f842856b>] ? ahci_init_controller+0x5b/0x110 [libahci]
[ 18.358854] [<f844111e>] ? ahci_pci_init_controller+0x3e/0x50 [ahci]
[ 18.358921] [<f8441cb5>] ? ahci_init_one+0x3f5/0x7d0 [ahci]
[ 18.358988] [<c05ee769>] ? mutex_unlock+0x19/0x20
[ 18.359052] [<c027ac77>] ? sysfs_new_dirent+0x67/0x100
[ 18.359115] [<c027b07a>] ? sysfs_addrm_finish+0x1a/0xb0
[ 18.359181] [<c0375fe3>] ? local_pci_probe+0x13/0x20
[ 18.359244] [<c0376f98>] ? pci_device_probe+0x68/0x90
[ 18.359308] [<c0414d00>] ? really_probe+0x50/0x150
[ 18.359372] [<c041c387>] ? pm_runtime_barrier+0x57/0xb0
[ 18.359435] [<c0414e3c>] ? driver_probe_device+0x3c/0x60
[ 18.359498] [<c0414ee1>] ? __driver_attach+0x81/0x90
[ 18.359562] [<c0414303>] ? bus_for_each_dev+0x53/0x80
[ 18.359625] [<c0414bce>] ? driver_attach+0x1e/0x20
[ 18.359687] [<c0414e60>] ? __driver_attach+0x0/0x90
[ 18.359750] [<c0414595>] ? bus_add_driver+0xd5/0x280
[ 18.359814] [<c0376ed0>] ? pci_device_remove+0x0/0x40
[ 18.359877] [<c04151da>] ? driver_register+0x6a/0x130
[ 18.359940] [<c03771d5>] ? __pci_register_driver+0x45/0xb0
[ 18.360005] [<f8448017>] ? ahci_init+0x17/0x19 [ahci]
[ 18.360068] [<c0103042>] ? do_one_initcall+0x32/0x1a0
[ 18.360132] [<f8448000>] ? ahci_init+0x0/0x19 [ahci]
[ 18.360197] [<c0188f1b>] ? sys_init_module+0x9b/0x1e0
[ 18.360261] [<c0222ec2>] ? sys_write+0x42/0x70
[ 18.360325] [<c010939f>] ? sysenter_do_call+0x12/0x28

Kernel version is: linux-image-2.6.35-22-generic-pae

Also note that I get the following messages just before the crash:
[ 3.267124] ahci 0000:00:1f.2: controller reset failed (0xffffffff)
[ 3.770018] ahci 0000:00:1f.2: failed to stop engine (-5)
[ 4.273087] ahci 0000:00:1f.2: failed to stop engine (-5)
[ 4.776214] ahci 0000:00:1f.2: failed to stop engine (-5)
...
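
For anyone checking whether their own boot log shows the same symptom, here is a minimal sketch that scans log text for the two AHCI messages quoted above. The function name `find_ahci_failures` and the regular expression are illustrative assumptions built only from the lines in this report; other kernels may word these messages differently.

```python
import re

# Pattern built from the two messages quoted in this report (an assumption:
# other kernel versions may format them differently).
AHCI_FAILURE = re.compile(
    r"ahci (?P<dev>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"(?P<msg>controller reset failed|failed to stop engine)"
)


def find_ahci_failures(log_text):
    """Return (pci_device, message) pairs for each matching log line."""
    return [(m.group("dev"), m.group("msg")) for m in AHCI_FAILURE.finditer(log_text)]


sample_log = """\
[ 3.267124] ahci 0000:00:1f.2: controller reset failed (0xffffffff)
[ 3.770018] ahci 0000:00:1f.2: failed to stop engine (-5)
"""

for dev, msg in find_ahci_failures(sample_log):
    print(dev, msg)
```

In practice the text would come from saved `dmesg` output rather than an inline string.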

Christophe Dumez (hydr0g3n) wrote :
description: updated
Christophe Dumez (hydr0g3n) wrote :

I have tried the mainline kernel from http://kernel.ubuntu.com/~kernel-ppa/mainline/v2.6.36-rc7-maverick/
and I get the exact same issue.

lowey71 (milo-loweys) wrote :

I had exactly the same problem after upgrading a similar Dell T3500 system. 2.6.32-25-generic works fine, though.

Christophe Dumez (hydr0g3n) wrote :

I confirm that v2.6.32-25 is working.
I tested v2.6.36 final (vanilla) and I get the same crash.

Christophe Dumez (hydr0g3n) wrote :

After more testing, I can say:
- 2.6.34.7 vanilla: NOT WORKING
- 2.6.35-23 from maverick-proposed: NOT WORKING
- 2.6.36 (final) vanilla: NOT WORKING

trbs (trbs) wrote :

We have the same problem on our developer workstations (Dell T-3500) running Ubuntu Maverick AMD64.

Tried the following kernels:

- 2.6.32-25: WORKING
- 2.6.33-6: WORKING
- 2.6.33-7: WORKING
- 2.6.34-rc1: NOT WORKING
- 2.6.34-1: NOT WORKING

So the problem seems to have been introduced in the 2.6.34 kernel.

Our machines report exactly the same error message as given above.
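
The reasoning behind that conclusion can be sketched as a small search for the first good-to-bad transition in the test results. This is only an illustration: the helper name `regression_window` is hypothetical, and it assumes the kernels are listed oldest to newest, as they are in the list above.

```python
# Results as reported in the list above: True means the kernel booted.
# Assumption: the entries are ordered oldest to newest.
results = [
    ("2.6.32-25", True),
    ("2.6.33-6", True),
    ("2.6.33-7", True),
    ("2.6.34-rc1", False),
    ("2.6.34-1", False),
]


def regression_window(ordered_results):
    """Return (last_good, first_bad), or None if no good-to-bad transition exists."""
    pairs = zip(ordered_results, ordered_results[1:])
    for (prev_version, prev_ok), (version, ok) in pairs:
        if prev_ok and not ok:
            return prev_version, version
    return None


print(regression_window(results))  # ('2.6.33-7', '2.6.34-rc1')
```

The window between 2.6.33-7 and 2.6.34-rc1 is where a finer-grained search (for example `git bisect` on the kernel tree) would start.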

at (arjan-tijms) wrote :

I just upgraded from 64-bit Ubuntu 10.04 to 10.10 and I too have this problem. I have a Dell T3500 with a W3520, 6GB memory, 2 Quadro FX 570 cards and an X25-M SSD.

When booting Ubuntu 10.10 I get the following message after a while:

ALERT! /dev/disk/by-uuid/[some UUID] does not exist. Dropping to a shell!

When I enable verbose messages in grub.cfg I also see the "failed to stop engine (-5)" messages. Switching back to the 2.6.32-25 kernel lets me boot up again, but needless to say I would prefer the 2.6.35 kernel.

he536 (henk53602) wrote :

This seems to be a duplicate of:

 - https://bugs.launchpad.net/ubuntu/+bug/653238
 - https://bugs.launchpad.net/ubuntu/+bug/658560
 - https://bugs.launchpad.net/ubuntu/+bug/659149

There was already a bugzilla entry on kernel.org:

 - https://bugzilla.kernel.org/show_bug.cgi?id=16228

Other distros were affected as well, most of them as early as June:

 - https://bugzilla.redhat.com/show_bug.cgi?id=620313
 - http://fedoraproject.org/wiki/KernelCommonProblems (at the bottom of the page)
 - http://forums.gentoo.org/viewtopic-p-6455079.html
 - https://bbs.archlinux.org/viewtopic.php?pid=779136

Inserting pci=nocrs in the kernel line in grub seems to solve it. Since this bug is so well known, I wonder why Ubuntu didn't include it by default for the Dell T3500.
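
As a rough sketch of the workaround above: on a GRUB 2 setup the kernel parameters are commonly set through the `GRUB_CMDLINE_LINUX_DEFAULT` line in `/etc/default/grub`, followed by running `update-grub`. Both the file location and that workflow are assumptions here, not something stated in this thread. The hypothetical helper below only transforms the text of such a file and does not write anything to disk.

```python
import re


def add_kernel_param(grub_defaults, param="pci=nocrs"):
    """Append `param` to GRUB_CMDLINE_LINUX_DEFAULT if it is not already present."""

    def patch(match):
        args = match.group(2).split()
        if param not in args:
            args.append(param)
        return match.group(1) + '"' + " ".join(args) + '"'

    return re.sub(
        r'^(GRUB_CMDLINE_LINUX_DEFAULT=)"([^"]*)"',
        patch,
        grub_defaults,
        flags=re.MULTILINE,
    )


# Example contents, assumed for illustration only.
sample_defaults = 'GRUB_DEFAULT=0\nGRUB_CMDLINE_LINUX_DEFAULT="quiet splash"\n'
print(add_kernel_param(sample_defaults))
```

The same parameter can also be tried for a single boot by editing the kernel line from the GRUB menu before making the change permanent.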

Bjorn Helgaas (bjorn-helgaas) wrote :

This is a duplicate of bug #653238, not of bug #647043.

Bug #653238 was marked a duplicate of bug #647043, but that is incorrect.
