Kernel hangs on boot (SATA, AMD64/i386)

Bug #190492 reported by Gian-Luca Dei Rossi on 2008-02-09
This bug affects 3 people
Affects: linux (Ubuntu)
Importance: High
Assigned to: Unassigned
Declined for Hardy by Tim Gardner

Bug Description

After a dist-upgrade from gutsy to hardy (amd64 architecture and distribution), the new default kernel (2.6.24-7-generic) no longer boots. Kernel output is normal until these messages:

ata1: SATA link up 3.0 Gbps (SStatus 123 Scontrol 300)
ata1.00: qc timeout (cmd 0xec)
ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
ata1: failed to recover some devices, retrying in 5 secs

(repeated also for ata2)
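A minimal sketch for pulling the affected ports out of a saved kernel log, assuming the message format shown above. The function name failed_ata_ports is made up for illustration and is not part of any tool mentioned in this report.

```shell
# Hypothetical helper: read kernel log text on stdin and print each ATA port
# (ata1, ata2, ...) that logged "failed to IDENTIFY", once per port.
failed_ata_ports() {
    sed -n 's/.*\(ata[0-9][0-9]*\)\(\.[0-9][0-9]*\)\{0,1\}: failed to IDENTIFY.*/\1/p' |
        sort -u
}

# Example (on a system that did boot, e.g. with the old kernel):
# dmesg | failed_ata_ports
```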

The old kernel (2.6.22-14-generic from gutsy) still boots normally.

The SATA controller is (from lspci)
00:0f.0 SATA controller: VIA Technologies, Inc. VT8251 AHCI/SATA 4-Port Controller

I'll attach a more verbose lspci output in a followup to this report.

Thanks

Attached is the output of lspci -vvxxx on that system

Hi,

Care to attach the following:

* sudo lspci -vvnn > lspci-vvnn.log

Also, would you be able to take a digital photo of the messages you are seeing and attach it to this bug report? For more information regarding the kernel team bug policy, please refer to https://wiki.ubuntu.com/KernelTeamBugPolicies . Thanks again and we appreciate your help and feedback.

Changed in linux:
status: New → Incomplete

Photo of the messages. Sorry for the poor quality, I haven't a true digital camera at hand.

lspci -vvnn

Solved appending pci=nomsi to kernel parameters

Changed in linux:
assignee: nobody → ubuntu-kernel-team
importance: Undecided → Medium
status: Incomplete → Triaged
Nat Tuck (nat-ferrus) wrote :

I'm getting this same result, fixable with "pci=nomsi". Not booting without adding kernel parameters is moderately annoying.

Kevin_b_er (ktbvz2) wrote :

I can confirm this on a similar system with an AMD64 and a VIA VT8251 southbridge on i386 hardy beta. So this affects both i386 and amd64 builds.

I also note the output of dmesg contains an "APIC error on CPU0: 0(0)" as the above mentioned error repeats itself before each qc timeout, but this does NOT show up in the startup lines.

Kevin_b_er (ktbvz2) wrote :

This is still a problem in i386 8.04 final.

Same problem for me. Hardy final release on AMD64.

Motherboard is an Asus A8V-VM with a VT8251 AHCI/SATA controller.

Adding the "pci=nomsi" line indeed fixed the problem.

ikesterhaney (ikesterhaney) wrote :

How do you add this line to the install DVD? I tried adding it to no avail.
I was able to get debian to boot with this switch, but my hard drives were not detected.
All I get is the USB thumb drive.

I have the SiS chipset, which conforms to the AHCI specification.

bford16 (bford16) wrote :

Same problem, also on Asus A8V-VM with VT8251. Booting with pci=nomsi gets the DVD-RW drive recognized, but breaks ability to play DVDs with Kaffeine: "cannot read encrypted disc" errors appear. Libdvdcss2 is installed, and DVDs play when booting without pci=nomsi.

bford16 (bford16) wrote :

UPDATE: in Gnome, booting with pci=nomsi does indeed solve the problem with DVD writing and playback. No further problems with the drive at this time.

ikesterhaney (ikesterhaney) wrote :

Is there any information about the cause of this bug? All these bugs get named as duplicates of this one. I have not seen any progress towards resolution of this problem. To clarify, if I use pci=nomsi I get no hard disks. Ubuntu will not install. Something must be causing this bug. If the AHCI driver does not work it should be fixed, or at least mitigated so that we know if Ubuntu can be installed or not.

I only have SATA drives right now and the SATA drives are not accessible with the nomsi option.

Could someone please look into the root cause of this issue?

Thank you.

Just a note.
I had the same problem with Gentoo 2008 beta1 live CD, but now I can boot fine from Gentoo 2008 beta2 live CD.

They have fixed this problem somehow.

impert (cwallace-free) wrote :

      >Solved appending pci=nomsi to kernel parameters

How do you do this please?

>How do you do this please

By adding the option to the kernel line in the GRUB configuration file /boot/grub/menu.lst

Or you can edit the Grub menu entries directly from the boot menu by pressing the <e> key.
See http://www.gnu.org/software/grub/manual/html_node/Menu-interface.html for more info.
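As a rough sketch of the menu.lst edit described above: the helper below prints a copy of a GRUB legacy menu.lst with pci=nomsi appended to every kernel line that does not already have it. The function name add_nomsi and the /tmp output path are assumptions for illustration. It writes to a new file rather than editing in place, so the result can be reviewed first.

```shell
# Hypothetical helper: print the given menu.lst with " pci=nomsi" appended to
# each line that starts with "kernel" (after optional indentation) and does
# not already contain pci=nomsi. Commented-out lines are left untouched.
add_nomsi() {
    awk '/^[ \t]*kernel[ \t]/ && !/pci=nomsi/ { $0 = $0 " pci=nomsi" }
         { print }' "$1"
}

# Example: write the result to a scratch file and compare before copying back.
# add_nomsi /boot/grub/menu.lst > /tmp/menu.lst.new
# diff /boot/grub/menu.lst /tmp/menu.lst.new
```

Note that on Ubuntu the kernel entries in menu.lst may be regenerated by update-grub, so a manual edit of this kind may not survive a later kernel update.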

impert (cwallace-free) wrote :

OK, and thanks very much.

machrider (machrider) wrote :

Same issue here. Unbelievable that the Ubuntu team would push out Hardy before fixing this kind of issue. What a nightmare for an average user!

The old kernel (2.6.22) worked fine. After turning off the "quiet" and "splash" kernel options, the 2.6.24 kernel produces error messages like:
    ata3: failed to IDENTIFY
    ata3: failed to recover some devices
After maybe 5 minutes, it drops me to a busybox shell. The "pci=nomsi" fix works for me. I have no idea what this means. What are the side effects of using this option?

* lspci-vvnn output attached.
* I'm running RAID 5 with three SATA WD 500GB hard drives.

If I were on the Ubuntu team, I'd be telling them to stop pushing the distribution upgrades until this issue is fixed. This is amazingly bad.

pehaimi (perttu-haimi) wrote :

I just tried to boot without the "pci=nomsi" option. The regression is
still present in 2.6.24-18 kernel.

$ uname -r
2.6.24-18-generic

$ lsb_release -rd
Description: Ubuntu 8.04
Release: 8.04

$ apt-cache policy linux-image-2.6.24-18-generic
linux-image-2.6.24-18-generic:
  Installed: 2.6.24-18.32
  Candidate: 2.6.24-18.32
  Version table:
 *** 2.6.24-18.32 0
        500 http://security.ubuntu.com hardy-security/main Packages
        500 http://archive.ubuntu.com hardy-updates/main Packages
        100 /var/lib/dpkg/status

FWIW, this got me too. I upgraded my AMD64/SATA installation of Gutsy to Hardy and it took me 3 hours to find "pci=nomsi". Very frustrating! I have three stock kernels installed (2.6.24-19-generic, 2.6.22-14-generic, 2.6.20-16-generic), and only 2.6.20-16-generic will boot now without "pci=nomsi".

Thanks for the info, folks.

bartoruiz (bartoruiz) wrote :

3 days here!

to find this "workaround"... can't believe this bug still exists

Jason Marshall (jasinner) wrote :

Just wanted to confirm this remains a problem in v8.04.1 on both i386 and x64. Wanted to add that the kernel parameter all_generic_ide helps the hard drive get detected. There are still problems detecting the CD drive inside the OS.

Larry Garfield (lgarfield) wrote :

I just got bitten by this as well upgrading from Edgy to Hardy. Problem still exists in 2.6.24-19. It also didn't matter if my SATA drive was set to emulate IDE or to use AHCI. Same problem either way. Once I added pci=nomsi (what does that mean, anyway?) to the boot command in menu.lst, it booted properly.

The Ubuntu Kernel Team is planning to move to the 2.6.27 kernel for the upcoming Intrepid Ibex 8.10 release. As a result, the kernel team would appreciate it if you could please test this newer 2.6.27 Ubuntu kernel. There are two ways you should be able to test:

1) If you are comfortable installing packages on your own, the linux-image-2.6.27-* package is currently available for you to install and test.

--or--

2) The upcoming Alpha5 for Intrepid Ibex 8.10 will contain this newer 2.6.27 Ubuntu kernel. Alpha5 is set to be released Thursday Sept 4. Please watch http://www.ubuntu.com/testing for Alpha5 to be announced. You should then be able to test via a LiveCD.

Please let us know immediately if this newer 2.6.27 kernel resolves the bug reported here or if the issue remains. More importantly, please open a new bug report for each new bug/regression introduced by the 2.6.27 kernel and tag the bug report with 'linux-2.6.27'. Also, please specifically note if the issue does or does not appear in the 2.6.26 kernel. Thanks again, we really appreciate your help and feedback.

hernan (maldacenah) wrote :

The problem continues with the 8.10 beta version. I have an Asus A8V-MX (AMD64 3000+) with a VT8251 and only one SATA HDD, and the following error appears:

ata1: SATA link up 3.0 Gbps (SStatus 123 Scontrol 300)
ata1.00: qc timeout (cmd 0xec)
ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
ata1: failed to recover some devices, retrying in 5 secs

thanks

Jorge Morais (jorgemorais) wrote :

I also confirm that the problem persists with 8.10 beta - I don't know if there has been more than one beta, so be informed that I used the iso CD image named ubuntu-8.10-beta-alternate-i386.iso, with md5sum 108696aafe01d4e90ee145c31ad05b82. I burned it to a new blank CD-R at low speed (8x), checked the CD for defects, and none was found.

With a normal boot, the HD is not detected.
With the pci=nomsi kernel parameter, the HD is detected (I didn't proceed to install the system though).

The motherboard is an ASUS a8v-x. The hard disk is SATA. In the BIOS setup, the item 'Serial ATA IDE controller' is configured as 'SATA'; the options are 'Disabled', 'SATA', 'RAID' and 'AHCI'. Changing to 'AHCI' didn't seem to help. I have not tried 'RAID', nor (obviously) 'Disabled'.

I have attached, in .tar.gz format, the output of dmesg, lsmod, lspci -vv, /proc/cmdline and /proc/cpuinfo for two situations: with a normal boot (except that I removed the 'quiet' kernel parameter) and a boot with the 'pci=nomsi' kernel parameter (and 'quiet' removed too). In both cases, I booted with the alternate CD, chose English language, Brazilian keymap, edited the boot parameters, then allowed the installer to reach its first prompt. Then I switched into a virtual console and gathered the previously mentioned information, saving it into a pen drive.
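For anyone wanting to gather the same information, here is a minimal sketch of the collection step described above. The function name collect_diag and the output file names are assumptions for illustration; the commands themselves are the ones listed in this comment, with lspci -vvnn used as requested earlier in the thread.

```shell
# Hypothetical helper: save dmesg, lsmod, lspci, /proc/cmdline and
# /proc/cpuinfo into a directory, then pack that directory as <dir>.tar.gz.
# If a command fails, its error message is saved in place of its output.
collect_diag() {
    out=$1
    mkdir -p "$out" || return 1
    dmesg             > "$out/dmesg.txt"      2>&1 || true
    lsmod             > "$out/lsmod.txt"      2>&1 || true
    lspci -vvnn       > "$out/lspci-vvnn.txt" 2>&1 || true
    cat /proc/cmdline > "$out/cmdline.txt"    2>&1 || true
    cat /proc/cpuinfo > "$out/cpuinfo.txt"    2>&1 || true
    tar -czf "$out.tar.gz" -C "$(dirname "$out")" "$(basename "$out")"
}

# Example: run once per boot configuration, e.g.
# collect_diag /tmp/diag-normal     # normal boot, 'quiet' removed
# collect_diag /tmp/diag-nomsi      # boot with pci=nomsi
```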

My attempt to install Linux in that computer involved Ubuntu 8.04 (failed to install), then Ubuntu 7.10 (failed to install), then Debian Etch. Debian Etch installed successfully, but was unstable. KDE would sometimes abruptly disappear and the system would go back to KDM. Then, since NTFS-3g complained that Debian Etch's 2.6.18 kernel was too old, I tried 2.6.24. Didn't work, with error messages similar to Ubuntu. I tried 2.6.26 from Lenny, with the same problem. I then read http://en.wikipedia.org/wiki/Ahci#Common_problems_switching_to_AHCI_under_Linux, saw that VT8251 is faulty, confirmed that this chip is present on the computer, and tried the pci=nomsi workaround. Now the Debian system seems to work stably (and NTFS-3g works without complaints).

I beg you to mention this bug in the Intrepid Release Notes, along with the pci=nomsi workaround. And it should be explained whether or not this workaround has bad side effects (I don't know. I have read the MSI-HOWTO.txt from the kernel documentation, but I still don't know if pci=nomsi is safe).

Of course, making the installer automatically detect the faulty chip and work around the problem would be excellent, if viable. Even better would be to change the kernel itself.

Of course, if you make any additional information request, I will respond as quickly as possible.
Thank you for working in this excellent OS,
     Jorge Peixoto from Brasil

Jorge Morais (jorgemorais) wrote :

I have retested Ubuntu 7.10 and it seems to be able to detect the disk. I don't remember what went wrong the last time I tried 7.10 and prompted me to install Debian Etch, but it seems that Ubuntu 7.10 is able to at least detect the disk.

Also, my previous attachment had the output from lspci -vv. This new one has output from lspci -vvnn (again from Intrepid Beta, with and without the pci=nomsi kernel parameter).

Jorge Morais (jorgemorais) wrote :

This problem has reached Intrepid final. Why wasn't it at least mentioned in the release notes?
Anyway, I have tested the problematic computer with Mandriva One 2009 (GNOME CD), and the problem does *not* seem to appear. I have attached some information about the computer when booted through the Mandriva CD.

Jorge Morais (jorgemorais) wrote :

Note: the description of the attachment above says lspci -vvvv, but the actual command I used was lspci -vvnn. Sorry.

Jorge Morais (jorgemorais) wrote :

I have installed Kubuntu 8.10 yesterday (2008-11-20) in a friend's computer through the final release alternate CD, and updated it shortly afterwards. The problem persists.
I am using the workaround pci=nomsi and it appears to work.
I am very disappointed. Still no mention in the release notes (which in my opinion is absurd), still no official explanation of what exactly the pci=nomsi workaround does or at least a general statement regarding its safety.
Please, at least mention this bug in the release notes. It would save people many hours of debugging.
The first time the bug appeared and I didn't know of the workaround, I spent days trying to change SATA configuration in the BIOS setup, examine the hard disk for defects, exchange hard disk, try other Linux versions, and of course a lot of Google searching (and looking in the release notes). Then I finally remembered that months ago I had read in a Wikipedia article that certain motherboards have problems with SATA under Linux; I went to the article and found the treasured workaround.
My colleague told me "How do you claim Linux is easy to use, you spent days trying to install it in Wanderson's computer!".
This bug is serious. Please mention it in the release notes.

On Fri, 2008-11-21 at 11:19 +0000, Jorge Peixoto de Morais Neto wrote:

> I have installed Kubuntu 8.10 yesterday (2008-11-20) in a friend's computer through the final release alternate CD, and updated it shortly afterwards. The problem persists.
> I am using the workaround pci=nomsi and it appears to work.
> I am very disappointed. Still no mention in the release notes (which in my opinion is absurd), still no official explanation of what exactly the pci=nomsi workaround does or at least a general statement regarding its safety.
> Please, at least mention this bug in the release notes. It would save people many hours of debugging.
> The first time the bug appeared and I didn't know of the workaround, I spent days trying to change SATA configuration in the BIOS setup, examine the hard disk for defects, exchange hard disk, try other Linux versions, and of course a lot of Google searching (and looking in the release notes). Then I finally remembered that months ago I had read in a Wikipedia article that certain motherboards have problems with SATA under Linux; I went to the article and found the treasured workaround.
> My colleague told me "How do you claim Linux is easy to use, you spent days trying to install it in Wanderson's computer!".
> This bug is serious. Please mention it in the release notes.

Jorge,

I quite agree with you, but I am not part of the Ubuntu team. I know how
frustrating this is from my own personal experience. What helped me was
having another computer on which I could get to the forums and the
Launchpad site. Heaven help anyone who doesn't have access to another
box.
I'm disappointed that I haven't had a reply to the email that I sent to
the documentation team. Perhaps if you sent one too? We may have to join
the team to get them to pay any attention to this.
Incidentally, with my new setup (different motherboard) Ubuntu works
fine without the pci=nomsi, so it is definitely hardware-related. On the
other hand, I had a whole lot of other problems which made me regret
the change. They seem to be fixed now though.

Regards,
Campbell Wallace

impert (cwallace-free) wrote :

Jorge,

Please excuse me; I thought your email was addressed only to me - I
didn't realise you had posted it on the Launchpad bug, which is the
right thing to do, of course. Perhaps another word to the documentation
people would not do any harm.

Cheers,
CW

Changed in linux:
importance: Medium → High
bford16 (bford16) wrote :

I was one of the original posters of this bug. When I installed Intrepid (by dist-upgrade) on my AMD64 machine, I was delighted to discover that the problem went away. My SATA DVD-RW drive is now completely and smoothly supported.

The pci=nomsi option turns off Message Signaled Interrupts (MSI), an interrupt delivery mechanism used primarily by PCI Express devices. For some reason I do not completely understand, MSI conflicts with kernel recognition of SATA drives, for kernels between 2.6.16 (I think) and 2.6.27. Using the pci=nomsi switch is completely safe and harmless.
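One rough way to see whether MSI (Message Signaled Interrupts, the feature that pci=nomsi disables) is in use on a running system is to count the PCI-MSI entries in /proc/interrupts. This is only a sketch: the function name count_msi_irqs is made up, and it assumes MSI interrupts are labelled with the string "PCI-MSI" in that file.

```shell
# Hypothetical helper: read /proc/interrupts-style text on stdin and print
# how many lines mention PCI-MSI. Prints 0 when there are none.
count_msi_irqs() {
    grep -c 'PCI-MSI' || true
}

# Example: compare the count with and without pci=nomsi on the kernel line.
# count_msi_irqs < /proc/interrupts
```

With pci=nomsi in effect the count would be expected to be 0, since devices fall back to legacy interrupts.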

However, this bug seems to be fixed in the 2.6.27 kernel. Since you upgraded your installation in-place, you may not have this kernel active on your computer. Check the results of 'uname -r' to see which kernel you are using. I had the 2.6.27 kernel "kept back" when I did my upgrade, because I had customized /boot/grub/menu.lst, and I wanted to keep the customization. If you find that you don't have the 2.6.27 kernel active, try running "sudo update-grub," and make sure to choose the "package maintainer's version" when prompted. This will overwrite your /boot/grub/menu.lst, so you might want to back it up first (you will lose your customizations). If this doesn't get your 2.6.27 kernel installed, or if you already have it, I do not have further suggestions (I'm still pretty new at this...). But you should be able to find help in the Ubuntu Forums.
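The 'uname -r' check suggested above can be scripted along these lines. This is a sketch under assumptions: the function name kernel_at_least is invented, and it relies on GNU sort's -V (version sort) option. It only reports which kernel is running; whether 2.6.27 actually resolves the bug is disputed elsewhere in this thread.

```shell
# Hypothetical helper: succeed if the kernel release in $1 (for example
# "2.6.24-18-generic") is at least the version in $2 (for example "2.6.27").
kernel_at_least() {
    have=${1%%-*}    # drop the "-18-generic" style suffix
    want=$2
    lowest=$(printf '%s\n%s\n' "$have" "$want" | sort -V | head -n 1)
    [ "$lowest" = "$want" ]
}

if kernel_at_least "$(uname -r)" 2.6.27; then
    echo "running kernel is 2.6.27 or newer"
else
    echo "running kernel is older than 2.6.27"
fi
```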

Jorge Morais (jorgemorais) wrote :

(In response to immediately previous post, https://bugs.launchpad.net/ubuntu/+source/linux/+bug/190492/comments/33)
>Using the pci=nomsi switch is completely safe and harmless.
Where did you get that information?

And more importantly
>However, this bug seems to be fixed in the 2.6.27 kernel.
No. I tried to install from the final release alternate Kubuntu 8.10 CD and got hit by the problem. To install, I had to manually use pci=nomsi (using F6 to edit the kernel parameters).
Also, I recently (on 2008-11-21) updated the system and tried removing the pci=nomsi workaround from menu.lst. Didn't work, so the workaround is still needed.

I don't know why your Sata DVD drive now works, but for me the problem has remained the same until now.
I have no idea whether this but affects Sata DVD drivers differently from Sata HD drives, or if your bug was a different one.

Jorge Morais (jorgemorais) wrote :

Well, let me make two corrections to my post:
1) "tried removing the pci=nomsi workaround from menu.lst. Didn't work". To be precise, I didn't actually edit menu.lst; I used the handy Grub editing capability to remove pci=nomsi from the kernel command line.
2) "I have no idea whether this but affects (...)" I obviously meant "this bug", not "this but".

And more importantly I have not only tested the 8.10 alternate Kubuntu CD, but also 8.10 final CD (don't remember if it was liveCD or alternate), and this bug also appears there (as expected - Kubuntu and Ubuntu share the same kernel and much else)

Per a decision made by the Ubuntu Kernel Team, bugs will no longer be assigned to the ubuntu-kernel-team in Launchpad as part of the bug triage process. The ubuntu-kernel-team is being unassigned from this bug report. Refer to https://wiki.ubuntu.com/KernelTeamBugPolicies for more information. Thanks.

Bryan Wu (cooloney) wrote :

Can you confirm whether this issue exists with the most recent Jaunty Jackalope 9.04 release (http://www.ubuntu.com/news/ubuntu-9.04-desktop)? Please let us know your results. Thanks.

And for Hardy and Intrepid, I will take a look at backport patches.

Thanks
-Bryan

Changed in linux (Ubuntu):
status: Triaged → Incomplete

Yes, I confirm, it still hangs on boot without pci=nomsi, same hardware as my original bug report, ubuntu jaunty amd64 and kernel 2.6.28-11-generic.

machrider (machrider) wrote :

Does NOT hang for me anymore. I'm running Hardy 8.04 with kernel
2.6.24-24-generic. One of the recent kernel updates must've changed
something (I only just rebooted this week, previous to that it has been
~200 days, so I'm not sure exactly which kernel fixed it for me).

I got ATA errors on reboot, and had to *remove* pci=nomsi to get my
computer to start up normally. My hardware details are posted up
thread.

Jason Marshall (jasinner) wrote :

It does still hang without pci=nomsi with jaunty; however, it gives more verbose feedback about the problem than previous versions. It says something like "your chipset is not fully supported causing some hardware not to detect". pci=nomsi still works around the problem, though.

Bryan, as of the 9.04 upgrade, boot does NOT hang for me any longer. I am now running kernel 2.6.28-11-generic #42-Ubuntu SMP Fri Apr 17 01:58:03 UTC 2009. Let me know if there is more specific info I can provide.

Daniel Milde (daniel-milde) wrote :

MB Asus A8V-XE/SATA2/R/PCIe, 939/2K, VIA K8T890

Boot doesn't hang now, but no disks are found.

Never had this problem before, upgraded from Intrepid to Jaunty, problem showed up. Confirmed on 2.6.28-12-generic. pci=nomsi works fine as a workaround, but this is a critical bug.

That is, my boot hangs for about a minute, then proceeds to boot fine. Disks are also detected without a problem.

Paul Dufresne (paulduf) wrote :

I hope I am not mixing in too much unrelated material...
I have been searching for similar reports, but I do not understand all of the information well.
Before closing all these tabs, let me record them here.

There seems to be a similar bug under Xen:
http://lists.xensource.com/archives/html/xen-users/2006-10/msg00289.html

Also under Xen, a known Debian developer has reported a similar bug:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=411202

http://ubuntuforums.org/archive/index.php/t-289018.html claims to fix it with 2.6.19, and makes an interesting
comment about why we should avoid modules for fixing this.

http://kerneltrap.org/mailarchive/linux-kernel/2008/4/15/1432604 suggests a clear link with sleep mode.
For that reporter, going into the BIOS and changing "SATA Power Management" from "Enable" to "Disable" apparently helped.

Paul Dufresne (paulduf) on 2009-07-28
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Jeremy Foshee (jeremyfoshee) wrote :

This bug report was marked as Confirmed a while ago but has not had any updated comments for quite some time. Please let us know if this issue remains in the current Ubuntu release, http://www.ubuntu.com/getubuntu/download . If the issue remains, click on the current status under the Status column and change the status back to "New". Thanks.

[This is an automated message. Apologies if it has reached you inappropriately; please just reply to this message indicating so.]

tags: added: kj-triage
Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Fabio Lanzi (fabio-lanzi) wrote :

This bug still exists in Karmic Koala 9.10 32 bit. I found it while installing on an Asus A8V-MX/S (VIA 8251 SB).
The workaround pci=nomsi still works.

Changed in linux (Ubuntu):
status: Incomplete → New
Changed in linux (Ubuntu):
status: New → Confirmed
Changed in linux (Ubuntu):
status: Confirmed → Triaged
Tim Gardner (timg-tpi) wrote :

Please try the 10.04 LTS.

Changed in linux (Ubuntu):
status: Triaged → Won't Fix