Slow boot caused by SATA controller reset

Bug #595448 reported by Jarrett Miller
This bug affects 16 people
Affects: linux (Ubuntu)
Status: Confirmed
Importance: Medium
Assigned to: Unassigned

Bug Description

Ubuntu 10.04 suffers from a very slow boot due to a pause of 10 seconds or so while the kernel sets up the SATA controller. The problem occurs on every boot at this point in the kernel initialization sequence:

ata1: link is slow to respond, please be patient (ready=0)
Jun 15 13:51:02 jmvpro kernel: [ 14.780019] ata1: SRST failed (errno=-16)
Jun 15 13:51:02 jmvpro kernel: [ 20.330017] ata1: link is slow to respond, please be patient (ready=0)
Jun 15 13:51:02 jmvpro kernel: [ 24.830018] ata1: SRST failed (errno=-16)
Jun 15 13:51:02 jmvpro kernel: [ 30.380018] ata1: link is slow to respond, please be patient (ready=0)
Jun 15 13:51:02 jmvpro kernel: [ 59.840018] ata1: SRST failed (errno=-16)
Jun 15 13:51:02 jmvpro kernel: [ 59.840022] ata1: limiting SATA link speed to 1.5 Gbps
Jun 15 13:51:02 jmvpro kernel: [ 64.850019] ata1: SRST failed (errno=-16)
Jun 15 13:51:02 jmvpro kernel: [ 64.850021] ata1: reset failed, giving up
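
A small filter can pull the reset-related lines out of a full boot log. This is only a sketch: the function name ata_reset_lines is made up for illustration, and the patterns are taken from the messages quoted in this report, so other libata messages are not matched.

```shell
# Print only the ATA reset-related lines from a kernel log read on stdin.
# The alternatives below mirror the messages quoted in this bug report.
ata_reset_lines() {
  grep -E 'ata[0-9]+: (link is slow to respond|(SRST|COMRESET) failed|reset failed|limiting SATA link speed)'
}
```

It could be used as "dmesg | ata_reset_lines" on an affected machine.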

This is an Intel DQ45CB motherboard with the latest BIOS. I believe ata1 is the eSATA port. I think my hard disks are connected to ata2 and ata3.

This problem is also present in the 2.6.35 maverick backport kernel from the kernel-ppa.
This problem did not occur with ubuntu 9.10.

ProblemType: Bug
DistroRelease: Ubuntu 10.04
Package: linux-image-2.6.32-22-generic 2.6.32-22.36
Regression: Yes
Reproducible: Yes
ProcVersionSignature: Ubuntu 2.6.32-22.36-generic 2.6.32.11+drm33.2
Uname: Linux 2.6.32-22-generic x86_64
AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.21.
Architecture: amd64
ArecordDevices:
 **** List of CAPTURE Hardware Devices ****
 card 0: Intel [HDA Intel], device 0: AD198x Analog [AD198x Analog]
   Subdevices: 2/2
   Subdevice #0: subdevice #0
   Subdevice #1: subdevice #1
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: jarrett 1584 F.... pulseaudio
CRDA: Error: [Errno 2] No such file or directory
Card0.Amixer.info:
 Card hw:0 'Intel'/'HDA Intel at 0xe0620000 irq 22'
   Mixer name : 'Analog Devices AD1882'
   Components : 'HDA:11d41882,80861003,00100300'
   Controls : 41
   Simple ctrls : 24
CurrentDmesg:
 [ 77.952035] e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
 [ 77.952836] ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
 [ 88.240013] eth0: no IPv6 routers present
 [ 422.836625] Too many connections
Date: Thu Jun 17 06:44:31 2010
HibernationDevice: RESUME=UUID=f8a7aaf0-a741-457b-ab95-2b06f7bc8814
InstallationMedia: Ubuntu 10.04 LTS "Lucid Lynx" - Release amd64 (20100427.1)
IwConfig:
 lo no wireless extensions.

 eth0 no wireless extensions.
ProcCmdLine: BOOT_IMAGE=/boot/vmlinuz-2.6.32-22-generic root=/dev/mapper/vg0-root ro
ProcEnviron:
 LANG=en_US.utf8
 SHELL=/bin/bash
RelatedPackageVersions: linux-firmware 1.34
RfKill:

SourcePackage: linux
dmi.bios.date: 04/12/2010
dmi.bios.vendor: Intel Corp.
dmi.bios.version: CBQ4510H.86A.0121.2010.0412.0911
dmi.board.asset.tag: To be filled by O.E.M.
dmi.board.name: DQ45CB
dmi.board.vendor: Intel Corporation
dmi.board.version: AAE30148-205
dmi.chassis.type: 3
dmi.modalias: dmi:bvnIntelCorp.:bvrCBQ4510H.86A.0121.2010.0412.0911:bd04/12/2010:svn:pn:pvr:rvnIntelCorporation:rnDQ45CB:rvrAAE30148-205:cvn:ct3:cvr:

Jarrett Miller (spook) wrote :
Jeremy Foshee (jeremyfoshee) wrote :

Hi Jarrett,

If you could also please test the latest upstream kernel available that would be great. It will allow additional upstream developers to examine the issue. Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Once you've tested the upstream kernel, please remove the 'needs-upstream-testing' tag. This can be done by clicking on the yellow pencil icon next to the tag located at the bottom of the bug description and deleting the 'needs-upstream-testing' text. Please let us know your results.

Thanks in advance.

    [This is an automated message. Apologies if it has reached you inappropriately; please just reply to this message indicating so.]

tags: added: kj-triage
Changed in linux (Ubuntu):
status: New → Incomplete
Jarrett Miller (spook) wrote :

Just tested with Mainline Kernel 2.6.34-lucid (aka 2.6.34-020634-generic) and the problem still persists. Removing "needs-upstream-testing" tag per automated instructions.

tags: removed: needs-upstream-testing
Jarrett Miller (spook) wrote :

Just for completeness I also just tested with the 2.6.35 "current" build which as of today is 2.6.35-999.201006151505_amd64.deb and the problem is still present there as well.

Changed in linux (Ubuntu):
status: Incomplete → New
Jarrett Miller (spook) wrote :

I just tested 2.6.31-0206113-generic and the problem is not present in that build. I am now gearing up to run a git bisect. It will be my first time doing a bisect, so hopefully it will go well. Any advice would be appreciated.

Jarrett Miller (spook) wrote :

Updating progress so far. Since I had tested the 2.6.31-0206113-generic deb file, I assumed v2.6.31 of the mainline repo was also good, so I ran a git bisect between 2.6.31 and 2.6.32. Unfortunately that looks like it was a big waste of time, because on a hunch I tested v2.6.31 via mainline git and it also has the problem. So there is some difference between 2.6.31 in the Linus git repo and whatever was done to build the 2.6.31-0206113-generic that is provided by the kernel team. I am now testing with v2.6.30 from the Linus git repo to see if the problem is present in that build.

If v2.6.30 from the git repo works ok I will bisect between 2.6.30 and 2.6.31. Will report more later.

Jarrett Miller (spook) wrote :

Ok, at this point I have done all the testing I know how to do and I will have to wait for guidance. As far as I can tell this bug affects mainline as far back as 2.6.30. I tested 2.6.30 up to the current rc for 2.6.35, all built from Linus' git tree. They all have the problem. However, Ubuntu 9.10 did not have this issue, nor does the 2.6.31-0206113-generic kernel that is available from the kernel team's PPA archive.

So it looks like there is some patch or config difference between Linus 2.6.31 and the 2.6.31-0206113-generic builds. I do not know how to track down such a change. I would be happy to further test the system if someone would give me some guidance in to how to narrow down what is causing the problem.

Thanks

Changed in linux (Ubuntu):
status: New → Triaged
tags: added: kernel-core kernel-needs-review
israel vainsencher (israel-mat) wrote :

boot was reasonably fast up to 9.10
upgrading to 10.04 was a headache.
now the boot is terribly slow.

vaio vgn-B100B
uname -a
Linux vaiozao 2.6.34-020634-generic #020634 SMP Mon May 17 20:34:55 UTC 2010 i686 GNU/Linux
lspci
00:00.0 Host bridge: Intel Corporation 82852/82855 GM/GME/PM/GMV Processor to I/O Controller (rev 02)
00:00.1 System peripheral: Intel Corporation 82852/82855 GM/GME/PM/GMV Processor to I/O Controller (rev 02)
00:00.3 System peripheral: Intel Corporation 82852/82855 GM/GME/PM/GMV Processor to I/O Controller (rev 02)
00:02.0 VGA compatible controller: Intel Corporation 82852/855GM Integrated Graphics Device (rev 02)
00:02.1 Display controller: Intel Corporation 82852/855GM Integrated Graphics Device (rev 02)
00:1d.0 USB Controller: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) USB UHCI Controller #1 (rev 03)
00:1d.1 USB Controller: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) USB UHCI Controller #2 (rev 03)
00:1d.7 USB Controller: Intel Corporation 82801DB/DBM (ICH4/ICH4-M) USB2 EHCI Controller (rev 03)
00:1e.0 PCI bridge: Intel Corporation 82801 Mobile PCI Bridge (rev 83)
00:1f.0 ISA bridge: Intel Corporation 82801DBM (ICH4-M) LPC Interface Bridge (rev 03)
00:1f.1 IDE interface: Intel Corporation 82801DBM (ICH4-M) IDE Controller (rev 03)
00:1f.3 SMBus: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) SMBus Controller (rev 03)
00:1f.5 Multimedia audio controller: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) AC'97 Audio Controller (rev 03)
00:1f.6 Modem: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) AC'97 Modem Controller (rev 03)
02:04.0 CardBus bridge: Texas Instruments PCI7420 CardBus Controller
02:04.2 FireWire (IEEE 1394): Texas Instruments PCI7x20 1394a-2000 OHCI Two-Port PHY/Link-Layer Controller
02:04.3 Mass storage controller: Texas Instruments PCI7420/7620 Combo CardBus, 1394a-2000 OHCI and SD/MS-Pro Controller
02:08.0 Ethernet controller: Intel Corporation 82801DB PRO/100 VE (MOB) Ethernet Controller (rev 83)
02:0b.0 Network controller: Intel Corporation PRO/Wireless 2200BG [Calexico2] Network Connection (rev 05)

Jarrett Miller (spook) wrote :

I now have more info. It is most certainly a kernel config file issue. If I build from linux git tree using the config-2.6.31-0206113-generic as the base config file it produces a good kernel. I tested this with both v2.6.31 and v2.6.35-rc3.

If I do the exact same thing but instead use config-2.6.32-22-generic that ships with lucid then I get the big pause during boot.

just to be clear the following lists exactly what I am doing:
$ git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
$ cd linux-2.6
$ git checkout v2.6.35-rc3
$ cp /boot/config-2.6.31-0206113-generic .config
$ yes '' | make oldconfig
$ make-kpkg clean
$ CONCURRENCY_LEVEL=`getconf _NPROCESSORS_ONLN` fakeroot make-kpkg --initrd --append-to-version=-bughunt kernel_image kernel_headers
$ cd ..
$ sudo dpkg -i *.deb
$ sudo update-initramfs -c -k 2.6.35-rc3-bughunt
$ sudo update-grub
$ sudo reboot

This procedure produces a good kernel
If instead I replace line #4 above with
cp /boot/config-2.6.32-22-generic .config

then it produces a bad kernel.
I will attach a diff of the two kernel config files.
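
A plain diff of two kernel configs is noisy, so it can help to list only the options whose value changed. The following is a minimal sketch; the function name config_changes is hypothetical, and it only compares lines of the form CONFIG_X=value, so options that are "# CONFIG_X is not set" in either file are ignored.

```shell
# List CONFIG_ options that are present in both files with different values.
# Output format: OPTION old_value -> new_value
config_changes() {
  awk -F= '
    /^CONFIG_/ {
      name = $1
      value = substr($0, index($0, "=") + 1)   # keep values that contain "="
      if (FNR == NR) { old[name] = value; next }   # first file: remember value
      if ((name in old) && old[name] != value)
        print name, old[name], "->", value
    }
  ' "$1" "$2"
}
```

For the two files discussed here, it would be called as "config_changes /boot/config-2.6.31-0206113-generic /boot/config-2.6.32-22-generic".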

Jarrett Miller (spook) wrote :

This is a diff of the lucid vs 2.6.31-0206113-generic config files

Jarrett Miller (spook) wrote :

YEAH! I have a fix. As soon as I changed a few of the AHCI drivers to be compiled in (like they were in karmic) instead of being loadable modules (like they are in lucid) the problem went away. For the record, here are the changes I made to the config file.

diff lucidconfig linux-2.6/.config
3,4c3,4
< # Linux kernel version: 2.6.35-rc3
< # Thu Jun 24 11:15:51 2010
---
> # Linux kernel version: 2.6.35-rc3-ahci
> # Fri Jun 25 13:21:38 2010
1646c1646
< CONFIG_SATA_AHCI=m
---
> CONFIG_SATA_AHCI=y
1648c1648
< CONFIG_SATA_INIC162X=m
---
> CONFIG_SATA_INIC162X=y
1693c1693
< CONFIG_PATA_IT821X=m
---
> CONFIG_PATA_IT821X=y
1719c1719
< CONFIG_PATA_MPIIX=m
---
> CONFIG_PATA_MPIIX=y

So is it possible this change can get integrated in to official Ubuntu kernels or am I going to be stuck compiling kernels from source on this machine?

Jarrett Miller (spook)
tags: added: bitesize
Fabián Rodríguez (magicfab) wrote :

Bug #220706 which is now marked invalid had a few workaround suggestions:
- Inverting the IDE cables
- Setting IDE drives to cable select
- Making sure BIOS and CD/DVD drivers firmware was upgraded
- Increasing the boot waiting time with a kernel parameter (rootdelay=)
- Waiting 3-4 minutes, then hitting CTRL-D would resume the boot sequence normally if busybox presented

Can you confirm any of the above or is there another known workaround for this issue ?

Jarrett Miller (spook) wrote :

It's a SATA DVD-RW drive, so suggestions 1 and 2 don't make any sense.

As for number 3, I stated in my first post the motherboard BIOS is the latest available. The DVD drive firmware is also the most recent available.

As for 4 and 5, they are not applicable. The boot delay I submitted this bug for occurs well before the initramfs is expanded, let alone the root file system mounted. So they just don't apply at all. Let me be very clear here. This pause in boot up occurs very early during kernel initialization, before the initramfs is utilized. For example, if I boot with a GRUB configuration line that has no initramfs specified, the boot delay is still present. Of course the system fails to boot if I do this, but I did try it out just to be sure it was not an initramfs issue, which is why I submitted the bug against the kernel and not some component of the initramfs bootup system.

I am not sure what in general you want me to check for a workaround. As I previously posted I have already provided a solution.

The karmic kernel .config file had CONFIG_SATA_AHCI=y specified while lucid specifies CONFIG_SATA_AHCI=m. This is the problem. It is a simple regression from karmic, which is why I added the bitesize tag per the instructions on the wiki.

Andy Whitcroft (apw)
tags: added: kernel-candidate kernel-reviewed
removed: kernel-needs-review
Jarrett Miller (spook) wrote :

Just wanted to add a few more details. It appears the problem with the Lucid kernel config is that three different drivers are fighting it out for control of my disk controller. The system is configured in the BIOS for AHCI mode. By default in Lucid, ata_piix loads first and initializes the controller. Next pata_acpi loads and tries to initialize the controller. Then finally ahci.ko loads and tries to initialize the controller.

ata_piix and pata_acpi are built in to the kernel while ahci.ko is a loadable module.
Getting rid of the boot delay is as simple as rebuilding the kernel with AHCI built in to the kernel. Doing this fixes the bug.

However, I just wanted to note that the system still appears "funny" to me. For example "disk utility" aka palimpsest will show my "SATA Host Adapter" with my hard drives and DVD drive connected to it. But it also shows a secondary IDE controller with nothing connected to it that is being driven by ata_piix. This board lacks IDE, it's pure SATA, and with the BIOS configured for AHCI I do not expect to see an IDE controller present.

To completely remedy this and make the system appear as expected (at least what I expect) I have to also change ata_piix and pata_acpi to be loadable modules and then blacklist them along with ata_generic in /etc/modprobe.d/blacklist.conf.

Doing that makes the system function as expected. That is to say only ahci is loaded and controlling the SATA controller.
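
As a rough sketch of the blacklisting step, the entries can be generated from a list of module names. The helper name blacklist_lines is hypothetical. As described above, this only has an effect once those drivers are built as loadable modules rather than compiled into the kernel.

```shell
# Emit one modprobe "blacklist" line per module name given as an argument.
blacklist_lines() {
  for module in "$@"; do
    printf 'blacklist %s\n' "$module"
  done
}
```

The output could then be appended to the blacklist file, for example with "blacklist_lines ata_piix pata_acpi ata_generic | sudo tee -a /etc/modprobe.d/blacklist.conf".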

Cheers,
Jarrett

P.S. I just re-verified all of this today using the lucid git tree and building the "Ubuntu-2.6.32-24.38" named tag from git.

Jarrett Miller (spook) wrote :

One last tidbit. I finally figured out what the IDE controller was and why it was loading and being controlled by ata_piix. This board is an Intel vPro board and it supports IDE redirection over LAN. Basically, at the firmware level it lets me remotely mount an ISO image as if it was a locally connected CD-ROM drive.

This virtual IDE CD-ROM drive is what is connected to the controller that ata_piix is driving. So the system was behaving as expected outside of the very slow boot.

So in summary, all that is needed to fix this bug is to change CONFIG_SATA_AHCI= to 'y' from 'm'.

I verified it by making the change in debian.master/config/amd64/config/config.flavour.*

I then rebuilt the lucid kernel, installed it and I am a happy ubuntu user again.
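
To check how a given kernel was configured before or after such a rebuild, the installed config file can be inspected. This is a minimal sketch; option_mode is a made-up helper name, and it assumes the config is available as a plain text file such as the /boot/config-* files mentioned earlier in this report.

```shell
# Report how a kernel option is set in a config file:
# "built-in" for =y, "module" for =m, "not set" for anything else.
option_mode() {
  config_file=$1
  option=$2
  case $(grep -E "^${option}=" "$config_file" | cut -d= -f2) in
    y) echo "built-in" ;;
    m) echo "module" ;;
    *) echo "not set" ;;
  esac
}
```

For the running kernel this would be something like: option_mode "/boot/config-$(uname -r)" CONFIG_SATA_AHCI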

Andy Whitcroft (apw)
Changed in linux (Ubuntu):
importance: Undecided → Medium
tags: removed: kernel-candidate
icewater (a-ubuntu) wrote :

This has bitten me too, using Ubuntu Server 10.04. I was able to install, but shortly after boot I get two "ata9: SRST failed (errno=-16)" errors and my console session stops responding.

I can provide hardware info if it would add value to this bug report.

Jarrett Miller (spook) wrote :

Ummm.... So what's the status of this?
The problem persists in Maverick Beta 2.
Is this fix not going to get incorporated?

If not can you at least mark this bug as "won't fix" so I can move on with my life and setup an internal system for deploying a custom compiled kernel to all of our machines?

Jarrett Miller (spook)
tags: added: kconfig
Jarrett Miller (spook) wrote :

This bug still exists in Maverick. Just retested with 2.6.35-22-server.
Is anyone out there even tracking this bug or should I just give up?

Jarrett Miller (spook) wrote :

Just tested with the new 2.6.35-22.34 kernel that was pushed out today. The bug is still present.

Jarrett Miller (spook) wrote :

just tested with 2.6.35-23.40 in ubuntu 10.10 and the bug is still present.

Should I just give up at this point?

Dylan Justice (dsjstc) wrote :

Sir, your work is appreciated. I too have this bug. I might suggest that you spend some time on Ubuntuforums finding people who have this bug and telling them to come here and subscribe to the bug.

Dylan Justice (dsjstc) wrote :

Just to be clear: can I work around this by rebuilding my initrd image, or do I actually need to build a custom kernel?

Jarrett Miller (spook) wrote :

Yes, you need to rebuild the kernel. The problem is the pata_acpi driver that is built in to the kernel. That driver is meant as a driver of last resort. The kernel developers only load it when all other drivers fail to load. BUT, the way the Ubuntu kernel devs build the Lucid kernel, they changed the ahci driver to be a loadable module. Prior to lucid, ahci.ko was present in the kernel image itself and thus would load just fine, and the kernel would never even try to load the dreaded pata_acpi driver.

I think they made this change to fix some broken laptops. I searched around Launchpad and found some folks complaining about kernels prior to lucid: because the ahci driver was built in, these people could not disable it (via blacklisting). On their systems AHCI caused disk corruption but their BIOS did not allow them to switch back to ATA mode. So I guess the Ubuntu devs made this change to accommodate these people and their broken laptops.

Anyway you need to build your own kernel.
https://help.ubuntu.com/community/Kernel/Compile

Just be sure to run:
debian/rules editconfigs
and then (for either x86 or x86-64) edit your config and change ahci to be built in to the kernel instead of as a loadable module.

Chris Declama (declama) wrote :

Jarrett Miller,

Thanks for your diligence. I just built a system with a 1090T and discovered this bug two nights ago.

I will attempt to rebuild the kernel tonight though I have never attempted anything like it before.

Chris Declama (declama) wrote :

Just out of curiosity, does everyone with the problem have a dual screen?

I re-installed and carefully installed updates until I got to setting up my second monitor. I made a backup of xorg.conf in my /etc/X11 directory. Then, I configured my xorg.conf file by running 'sudo nvidia-settings' in the terminal. Upon reboot, I got the "too many connections" error. I was able to sign in, revert to the backed up xorg.conf, reboot and all is well.

Jarrett Miller (spook) wrote :

Nope. This problem has nothing to do with X.

TommyBoy (thomaslloyd) wrote :

I have found that this problem is intermittent on my machine. However I have managed to fix the problem by disconnecting the CD-ROM drive from the IDE interface. This solves the hang on boot problem. When the CD is connected, 8 out of 10 times my system just hangs. This is a bad bug for those affected.

Changed in linux (Ubuntu):
status: Triaged → Opinion
status: Opinion → Confirmed
Bryan Bonvallet (btbonval) wrote :

Jarrett,

The good news is that the deafening silence you referred to in a forum post is a problem the Ubuntu Team is trying to solve:
https://wiki.ubuntu.com/Specs/ImproveSponsorshipQueue

The bad news is that the proposed solutions are waiting on 11.04 release.

My advice would be to hop on IRC and mention the issue there, referencing this ticket.

Pablo (itu-pablo) wrote :

I confirm this bug is happening on Maverick (2.6.35-28-generic), exactly as described: a 10 sec. wait on boot time because ata1 (intel 310 ssd) does not respond. After 10 seconds boot resumes.

I would like to try rebuilding the kernel, but I'm kind of a newbie and am not sure how to do that. Can anyone confirm there will be a fix for this in 11.04?

Jarrett Miller (spook) wrote :

Sorry the problem still exists in 11.04.
You need to build your own kernel to remedy it.

Pablo (itu-pablo) wrote :

Thanks for the response. I tried rebuilding the kernel but got to an error (compiling) where it says I'm missing the ahci, ahci_platform and libahci modules. How do I solve these?

As a comment, I changed both "CONFIG_SATA_AHCI=y " and "CONFIG_SATA_AHCI_PLATFORM=y" options , which are located in two different config files: config.common.ports and config.common.ubuntu (did not know if I should change both of those files or just the ubuntu one, but did just in case...).

Also both files are located on debian.master/config/ (not on the config.flavour.* file in the given ARCH folder, as is suggested on a previous post).

Jarrett Miller (spook) wrote :

You edited it in the wrong place. Sorry I wish I could make this easy so that you could just download a kernel from my PPA but I have not figured out how to upload a kernel to the PPA build service. But here are some instructions that should produce a fixed kernel for you. These instructions are for 10.10. If you want 11.04 you need to modify this procedure slightly. Now before you start I would recommend that you install one of the kernels supplied by Ubuntu that you do not want to use. IOW if you want to use generic install linux-server and linux-headers-server. Or if you want to use server do the opposite. This way you can uninstall the Ubuntu supplied "generic" kernel so that yours does not have version conflicts with the Ubuntu supplied packages. Once you have installed the server kernel and rebooted then uninstall the Ubuntu supplied generic packages.

First install some required software.
sudo apt-get install fakeroot build-essential crash kexec-tools makedumpfile kernel-wedge git-core libncurses5 libncurses5-dev libelf-dev asciidoc binutils-dev libqt3-mt-dev

Next create a new directory to work from. Let's call it ~/ubuntu, but name it whatever you like:
mkdir ~/ubuntu
cd ~/ubuntu

now clone the Ubuntu kernel git tree:
git clone git://kernel.ubuntu.com/ubuntu/ubuntu-maverick.git

This will take a while and when done you should have a folder called ubuntu-maverick. it would be a good idea right now to create a tar.gz backup of that folder. That way if you want to start over you don't need to redownload the git tree. You can just delete the ubuntu-maverick folder and then uncompress your clean backup. either way change in to that folder
cd ubuntu-maverick

Now you need to check out a released kernel from the git repository. The git repository contains all of the kernels, and by default it is checked out in the state of the kernel that the devs are currently working on but have not yet released. So it may have bugs, and you want to go back in time a bit to the most recent public release, which should give you a known good kernel source tree to start with. As of right now (4-20-2011) that is 2.6.35-28.50. Typing in the following command will list all of the kernels:
git tag -l

In that list you should see Ubuntu-2.6.35-28.50 which is what we will use for this. So now we need to check that tag out. we can do that with the following command
git checkout Ubuntu-2.6.35-28.50 -b fixed

That command will check out that tagged release and create a new named branch called "fixed".
You now have that revision checked out so we can start the build process. First you need to prep the build tree by running a couple of scripts. so run:
fakeroot debian/rules clean
debian/rules updateconfigs

now you are ready to edit your configs. you should not edit the config files directly. instead you should use the ubuntu kernel configure script. Now you need to know what flavor and platform you want. in these instructions I will use 64 bit generic as that is what I want. To begin the edit process run the following command:
debian/rules editconfigs

This command will start an interactive process that allows you to properly edit the configurations of all of the various ...


Pablo (itu-pablo) wrote :

Jarrett,
  thank you very much for your detailed step-by-step guide. Most of the steps were exactly the ones I had gone through (I had also discovered the editconfigs script to change configs through the GUI there, but only after my last post, though it did not make any difference).
  what I was not doing was running the build-indep and build-prearch . However, this is when I run into my first issue: prearch is not a valid target. Because of that, I went for build-arch, which seemed to do a lot of stuff. I realized then that build-arch actually builds all of generic, server and virtual for the given architecture.
  So, to sum up: did not find prearch, built anyway, got the packages in ../ and am installing as we speak. I am not quite sure which packages I should install but decided to try everything except those that said server and virtual. I'll tell you how that went when I reboot.

Pablo (itu-pablo) wrote :

Ok, the thing did not resolve the issue. I rebooted (like 5 times now) and am still getting the 10 sec delay with an ata1 SRST failed message.

I installed the custom packages, got rid of the ubuntu ones, rebooted and nothing. Then I uninstalled the custom ones, made sure I only had an old ubuntu kernel version (2.6.35-22) and installed them again. Still nothing.

"cat /boot/config-2.6.35-28-generic | grep AHCI" says:
CONFIG_SATA_AHCI=y
CONFIG_SATA_AHCI_PLATFORM=m

So everything seemed to work as intended but issue is not resolved. Either this is not a proper workaround or my bug is a different one... I listen to any other suggestions.

Jarrett: thanks very much anyway, you did a tremendous job by guiding me step-by-step through the thing.

Jarrett Miller (spook) wrote :

Sorry it did not work. It sounds like you have a different issue. Could you post the following files, as they might offer a clue?

sudo dmidecode -q > dmi.txt
sudo lspci -vvv > pci.txt
sudo lsmod > mods.txt

P.S. the prearch issue was a typo on my part. the correct target is "perarch" not "prearch". Again sorry about the typo.

Pablo (itu-pablo) wrote : Re: [Bug 595448] Re: Slow boot caused by SATA controller reset

Right now I don't have my pc, but i will try those commands as soon as I get
back home. Do you think not running "perarch" could have caused the fix not
to work?

Thanks!


Jarrett Miller (spook) wrote :

No, I don't think perarch had anything to do with it. I think you have a different problem than the one I created this bug for. Anyway, I forgot one more command. When you do get the chance also run
sudo dmesg > dmesg.txt

but make sure you are booted from your custom compiled kernel when you run that command.

Pablo (itu-pablo) wrote :

Ok, here are the results:

Pablo (itu-pablo) wrote :

On dmesg are the messages that made me think I had this bug:

[ 4.110339] usb 2-1.6: new full speed USB device using ehci_hcd and address 5
[ 7.399508] ata1: link is slow to respond, please be patient (ready=0)
[ 12.018550] ata1: COMRESET failed (errno=-16)
[ 12.368494] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
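
One way to locate a stall like this in a longer log is to look for the largest jump between consecutive kernel timestamps. The sketch below assumes the usual "[ seconds.micros]" dmesg prefix; the function name largest_gap is invented for illustration, and lines without a timestamp are skipped.

```shell
# Read a kernel log on stdin and report the largest gap between consecutive
# "[ seconds.micros]" timestamps, together with the line that followed it.
largest_gap() {
  awk '
    match($0, /\[ *[0-9]+\.[0-9]+\]/) {
      t = substr($0, RSTART + 1, RLENGTH - 2) + 0   # timestamp without brackets
      if (seen && t - prev > max) { max = t - prev; line = $0 }
      prev = t
      seen = 1
    }
    END { if (line != "") printf "%.3f s before: %s\n", max, line }
  '
}
```

Run as "dmesg | largest_gap". On the four lines above it points at the COMRESET failure, which follows a gap of about 4.6 seconds.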

Jarrett Miller (spook) wrote :

You most certainly have a different issue. Your kernel is not attempting to load the ata_piix or pata_acpi modules. Only the SATA link to your SSD gives this error. In the original bug this happened for every port in the machine, even if no drive was connected to it. Since you have a brand new Cougar Point chipset, I am guessing the older AHCI driver in the 2.6.35 kernel series is too old. You may just need to use a more recent kernel.

You could try forcing the link speed to 3 Gbps, as both of your drives are SATA II but it looks like they are connected to the 6 Gbps SATA3 ports on the chipset.

If you edit your grub boot line you can add libata.force=3.0
Not sure if that still works. It used to work on older kernels but I vaguely recall reading that ahci module now uses libahci and no longer relies on libata.

or you could try adding libahci.skip_host_reset=1 but I am not sure if setting that parameter has any negative consequences.
Anyway, you should file a new bug report about your issue as it is different from the one in this bug report. On the plus side, I see no reason why you need to rebuild your kernel, so you can revert back to the Ubuntu supplied packages. Your kernel appears to be correctly discovering all of your devices and loading the proper modules.
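
For trying the boot parameters mentioned above on every boot rather than editing the GRUB line by hand, the parameter can be appended to the default command line. This is a sketch only. It assumes a GRUB 2 setup where the options live in a GRUB_CMDLINE_LINUX_DEFAULT="..." line in /etc/default/grub. The helper name add_kernel_param is hypothetical, and the parameter must not contain the characters | or &, since it is pasted into a sed expression.

```shell
# Read grub defaults on stdin and append one kernel parameter inside the
# quotes of the GRUB_CMDLINE_LINUX_DEFAULT line; other lines pass through.
add_kernel_param() {
  param=$1
  sed "s|^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"|\1 ${param}\"|"
}
```

For example, "add_kernel_param libata.force=3.0 < /etc/default/grub" prints the modified file for review. After saving the result back, running "sudo update-grub" regenerates the boot menu.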

Cheers

Jarrett Miller (spook) wrote :

one little post script. you might also want to submit a bug report about the kernel not recognizing AES-NI. Your new Sandy Bridge chip most certainly supports AES-NI but the tail end of your dmesg log says that AES-NI was not detected. This will slow down any AES encryption on your system.

Pablo (itu-pablo) wrote :

My bad then, I was thrown off by my lack of knowledge and the bug title, which seemed pretty suited for my situation. As you say the causes seem to be different so I might be better off upgrading to 11.04 (beta) and see what happens there, I was going to update upon official release next week anyway. Maybe my architecture will become fully supported and I will no longer see this.

For the last time I want to thank you, Jarrett, for the immense patience and technical support.

asphixmx (ubuntomanos) wrote :

I have a Dell laptop, m1210, using Ubuntu 11.04. When booting with the livecd I found that my delay problem was because of the DVD drive. Maybe the problem is like Jarrett's. I don't know how to compile a kernel, so I'll use Jarrett's guide (thanks for that). But I'm afraid when a kernel update comes out, I'll have to do the same thing again.... Hope this kernel bug is fixed soon, it's been more than a year since this bug appeared (with Ubuntu 8.04 I had no problem at all).

Tomasz Sterna (smoku) wrote :