Fails to find boot device in Intel D945Gnt

Bug #290153 reported by Scott Kitterman
This bug affects 19 people
Affects                     Status     Importance  Assigned to  Milestone
Release Notes for Ubuntu    Invalid    Undecided   Unassigned
busybox (Ubuntu)            Invalid    High        Unassigned
  Intrepid                  Invalid    High        Unassigned
  Jaunty                    Invalid    High        Unassigned
  Karmic                    Invalid    High        Unassigned
initramfs-tools (Ubuntu)    Invalid    Undecided   Unassigned
  Intrepid                  Invalid    Undecided   Unassigned
  Jaunty                    Invalid    Undecided   Unassigned
  Karmic                    Invalid    Undecided   Unassigned
linux (Ubuntu)              Won't Fix  High        Unassigned
  Intrepid                  Won't Fix  High        Unassigned
  Jaunty                    Won't Fix  High        Unassigned
  Karmic                    Won't Fix  High        Unassigned

Bug Description

The machine in question is built around an Intel D945Gnt motherboard. It worked fine in Hardy with Kubuntu KDE3. I dist-upgraded today following the kubuntu.org recommendations. After the upgrade, the boot dropped to an initramfs shell because the boot device was not found: "Gave up waiting for root device." The root device is a SATA hard drive.

After waiting a minute or two, I exited the initramfs shell and the boot process continued normally.

Adding rootdelay=90 to the current kernel line in menu.lst gets a good boot. It's at least a workaround.
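
A minimal sketch of that edit, for anyone who prefers to script it. It assumes a GRUB legacy menu.lst in which boot entries use lines starting with "kernel"; the function name add_rootdelay and the sample stanza (title, root device, paths) are made up for illustration and are not taken from the reporter's machine.

```python
def add_rootdelay(menu_text, seconds=90):
    """Append rootdelay=<seconds> to each 'kernel' line that lacks one."""
    result = []
    for line in menu_text.splitlines():
        fields = line.split()
        # Only touch active entries; commented lines start with '#'.
        is_kernel_line = bool(fields) and fields[0] == "kernel"
        has_delay = any(f.startswith("rootdelay=") for f in fields)
        if is_kernel_line and not has_delay:
            line = line.rstrip() + " rootdelay=%d" % seconds
        result.append(line)
    return "\n".join(result)


# Hypothetical stanza, for illustration only.
stanza = """\
title Ubuntu 8.10, kernel 2.6.27-7-generic
kernel /boot/vmlinuz-2.6.27-7-generic root=/dev/sda1 ro quiet splash
initrd /boot/initrd.img-2.6.27-7-generic"""

patched = add_rootdelay(stanza)
print(patched)
```

Commented-out lines are left alone, and running the function twice does not add the option twice. On a real system the result would still have to be written back to /boot/grub/menu.lst as root.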

Changed in linux-meta:
importance: Undecided → High
description: updated
KimOlsen (kimfolsen80) wrote :

Have a similar problem. I have Windows on two SATA drives (in an nvraid) and Intrepid on a third drive (non-RAID). When I boot, it drops to the busybox initramfs shell, but if I exit, it boots. However, when I log in, my RAID is not present and my /dev/mapper/ only shows control.

KimOlsen (kimfolsen80) wrote :

Forgot to mention that I do not have the Intel D945, but an abit nforce4 ultra. And since I have the raid issue, mine might not be related.

Scott Kitterman (kitterman) wrote :

Specific error message after initramfs appears:

ata4: SRST failed (errno=-16)
ata4: SRST failed (errno=-16)
ata4: reset failed, giving up

description: updated
Changed in busybox:
importance: Undecided → High
KimOlsen (kimfolsen80) wrote :

Adding my dmesg file after a boot (That I have to force to continue by exit in initramfs). Have not tried the rootdelay yet, and still not sure if mine is related. But some things are similar.

Scott Kitterman (kitterman) wrote :

Proposed release note:

Boot failures on systems with Intel D945 motherboards have been reported. If this failure occurs, the system will drop to a busybox initramfs shell with a "Gave up waiting for root device." error. Wait a minute or two and then exit the initramfs shell by typing 'exit'. Booting should proceed normally. If it doesn't, wait a bit longer and try again. Once the system boots, edit /boot/grub/menu.lst and add rootdelay=90 to the kernel stanza for your current kernel.

KimOlsen (kimfolsen80) wrote :

Found that my problem might be related to bug #33269 instead. So will try a few suggestions from there

KimOlsen (kimfolsen80) wrote :

Could get it to boot without any interaction in the initramfs with a rootdelay=130, but still no raid... Proposed workarounds in bug #33269 did not seem to work.

mirhciulica (mirhciulica) wrote :

I have D945GNT, one sata and one external hard drive, and I don't have any problem. So this problem is related to raid.

Scott Kitterman (kitterman) wrote : Re: [Bug 290153] Re: Fails to find boot device in Intel D945Gnt

Mine is two SATA HD and ATAPI CD. No RAID. So it's not just RAID. You're
running Intrepid?

mirhciulica (mirhciulica) wrote :

Yes, I'm running Intrepid. I didn't mention the SATA optical drive in my first post because I think it has no relevance. I don't have any problems. Maybe this is related to the distro upgrade.

Robbie Williamson (robbiew) wrote :

Updated release notes with text from Comment #5.

Changed in ubuntu-release-notes:
status: New → Fix Released
Fabri Velas (fabrivelas) wrote :

I have the same problem with an Intel SATA controller 82801GBM/GHM (ICH7 Family) on a Hitachi ATA Disk (HTS541010G9SA00), see bug #288388.

Scott Kitterman (kitterman) wrote :

Did the workaround work for you?

brazzmonkey (brazzmonkey) wrote :

i've got similar issue since kernel 2.6.27-6 (intrepid beta), 2.6.27-5 works fine (this is the one i currently use).
my system is an acer 9814 featuring an intel mobo, kubuntu is 64-bit edition. i have a fakeraid 2-disk setup with striped / and mirrored /home.

waiting longer or setting longer boot delay doesn't work for me. traditional "kernel alive, kernel really alive" message is not displayed either. i just end up with the initramfs shell with an error message saying that my /dev/mapper/my_root_raid_partition_here is not found ; the only workaround i found is booting kernel 2.6.27-5.

this computer has been using ubuntu since edgy (6.10) mostly with 32-bit editions, and no raid. i never came across this issue before.
i gave a try to opensuse 11.0 (which actually is how i set up my raid configuration) and didn't have this issue either.

i'm not sure if my issue is actually related to this bug, but if it is, something must have changed between kernels 2.6.27-5 and 2.6.27-7 and this "something" could be the cause of my issue.

Fabri Velas (fabrivelas) wrote :

@Scott: Yes, the workaround worked. I put a delay of 60 seconds into my menu.lst.

Tom Bott (tombott) wrote :

Same issue here, but workaround worked.

Board:

ASUS PSGV-MX
HD's

1 x Sata (Boot)
1x IDE (Fat32 Data)

Ubuntu 8.10 - Clean install.

qwerty (escalantea) wrote :

I have a D865PERL (Intel), and the message in the /var/log/dmesg is :

[ 4.543880] sd 2:0:0:0: [sdb] Attached SCSI disk
[ 34.820043] ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[ 34.820101] ata3.00: cmd c8/00:07:de:32:56/00:00:00:00:00/e7 tag 0 dma 3584 in
[ 34.820104] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 34.820201] ata3.00: status: { DRDY }
[ 34.820257] ata3: soft resetting link
[ 34.992358] ata3.00: configured for UDMA/133
[ 34.992371] ata3: EH complete

The SATA HD is frozen for 30 seconds, and it freezes once more (another 30 seconds) after I enter the user ID and password at the login screen.

[ 102.493418] NVRM: loading NVIDIA UNIX x86 Kernel Module 173.14.12 Thu Jul 17 18:11:36 PDT 2008
[ 132.392040] ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[ 132.392120] ata3.00: cmd c8/00:08:3d:8b:46/00:00:00:00:00/e2 tag 0 dma 4096 in
[ 132.392123] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 132.392257] ata3.00: status: { DRDY }
[ 132.392322] ata3: soft resetting link
[ 132.564355] ata3.00: configured for UDMA/133
[ 132.564368] ata3: EH complete

I have tried rootdelay=90, it solves the first freezing, but the second freezing remains.
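
The roughly 30-second freeze can be read straight off the dmesg timestamps quoted above. A minimal Python sketch, assuming the usual "[ seconds] message" dmesg prefix; find_stalls and its 10-second threshold are hypothetical choices made for illustration, and the sample lines are copied from the first log excerpt.

```python
import re

# Matches lines such as "[ 34.820043] ata3.00: exception ..."
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s*(.*)$")


def find_stalls(dmesg_text, threshold=10.0):
    """Return (gap_seconds, message) for lines logged after a long silence."""
    stalls = []
    previous = None
    for line in dmesg_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip continuation lines without a timestamp
        stamp = float(match.group(1))
        if previous is not None and stamp - previous > threshold:
            stalls.append((round(stamp - previous, 3), match.group(2)))
        previous = stamp
    return stalls


sample = """\
[ 4.543880] sd 2:0:0:0: [sdb] Attached SCSI disk
[ 34.820043] ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[ 34.820257] ata3: soft resetting link
"""

stalls = find_stalls(sample)
for gap, message in stalls:
    print("%.1f s of silence before: %s" % (gap, message))
```

A gap only means nothing was logged in that interval, so an idle period (for example, waiting at the login screen) would be flagged too; the message after the gap is what identifies an ATA timeout.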

qwerty (escalantea) wrote :

Just in case ... my lspci :

qwerty (escalantea) wrote :

I solved the problem with my SATA HD by disabling the BIOS option "PCI IDE Bus Master".

I guess the problem could be a flag (Bios?? or Linux module??), but there is another "interesting effect" that I got by disabling "PCI IDE Bus Master" in my Intel D865PERL Bios:

I also have an IDE HD which was working at UDMA2 (33.3 MB/s) and is now working at UDMA5 (100 MB/s).

Sandro Mani (sandromani) wrote :

I am also experiencing the same problem with an Asus p4p800-se (i865pe chipset). I get the "gave up waiting for root device" as well as constant "SRST failed (errno=-16)" messages (the hard drives are SCSI, the only ATA devices are two CD-ROMs). Bug #278176 is possibly related.

Tried disabling PCI IDE Bus Master, as well as changing IDE Modes (Compatibility / Enhanced), without any luck.

Tanzbaer (tanzbaer) wrote :

I got the same problem with my MSI MS-6577 v4.0 GL6E.

Tanzbaer (tanzbaer) wrote :

The bootdelay option does NOT work for me.

Tanzbaer (tanzbaer) wrote :

And I get a similar problem when trying to boot the liveCD...

Scott Kitterman (kitterman) wrote :

>The bootdelay option does NOT work for me.

90 seconds worked for me. It may need to be bigger in some cases. If you haven't, try a
substantially larger value and then if it works, experiment with how far you can reduce it.

Changed in linux:
assignee: nobody → ubuntu-kernel-team
status: New → Triaged
ivanmara (aesthete2005) wrote :

I got the same problem with my Supermicro X5DE8-GG (Serverworks GC-SL Chipset) motherboard with SCSI drives on Ubuntu server 8.10 ... The rootdelay=90 workaround works for me ... on 8.04 everything was OK

FernanAguero (fernan-ciudad) wrote :

I have the same problem, and adding rootdelay=90 didn't solve the problem.

My system:
Dell Optiplex 740n
CPU: AMD 64 Athlon X2 Dual 3800+ 2.00 GHz
HD: WDC WD800JD-75MSA3 (80 Gb, SATA)
I believe the system uses an nVidia nForce chipset

The problem started happening after upgrading to 8.10. I have now intrepid installed but have to boot the system with the 2.6.24-21-generic kernel (the 2.6.27-7-generic produces the error).

If I type 'exit' at the initramfs shell, booting continues normally, and I can even get a graphical display login. However both before typing 'exit' and after that (i.e. at all times) the system emits error/warning messages at 5-7 second intervals (see attached dmesg/kernel.log). During these intervals the system seem to do some checks on the drives (the light on the CD/DVD drive turns on and off, and I can hear a noise, similar to the one during a normal boot when the drive is checked for media).

POIRIER David (poirier-david2) wrote :

I have reported this problem in Bug #294377

rootdelay=90 is ok for my computer:

fujitsu-siemens celsius 610 serverboard
motherboard is TYAN Serverboard Thunder i7505 (S2665)
fsb chipset (northbridge) intel i7505
scsi adaptec 7902 dual channel

thanks

SpinningAround (spinningaround-deactivatedaccount) wrote :

Have the same problem. Tried adding rootdelay=200 but no success; it stopped at SD 6.0.0.0 (comment unsure) SCSI

SpinningAround (spinningaround-deactivatedaccount) wrote :

Forgot to mention, above files are from LiveCD

POIRIER David (poirier-david2) wrote :

Hi,

On my system, adding rootdelay=90 solves the problem.

After adding
rootdelay=90
to the kernel line in menu.lst, did you update grub?
You must do that to apply the changed boot option.

In a terminal:
sudo update-grub

I have another, somewhat older computer:

Motherboard Asus A7N8X-X
chipset fsb (northbridge) nVidia nForce2

It has a problem like yours (see attached dmesg/kernel.log) when booting
Ubuntu 8.10 Intrepid Ibex, kernel 2.6.27-7.

Booting without the quiet splash options solves the problem there.

On Saturday 08 November 2008 at 17:32 +0000, FernanAguero wrote:
> I have the same problem, and adding rootdelay=90 didn't solve the
> problem.
>
> My system:
> Dell Optiplex 740n
> CPU: AMD 64 Athlon X2 Dual 3800+ 2.00 GHz
> HD: WDC WD800JD-75MSA3 (80 Gb, SATA)
> I believe the system uses an nVidia nForce chipset
>
> The problem started happening after upgrading to 8.10. I have now
> intrepid installed but have to boot the system with the
> 2.6.24-21-generic kernel (the 2.6.27-7-generic produces the error).
>
> If I type 'exit' at the initramfs shell, booting continues normally, and
> I can even get a graphical display login. However both before typing
> 'exit' and after that (i.e. at all times) the system emits error/warning
> messages at 5-7 second intervals (see attached dmesg/kernel.log). During
> these intervals the system seem to do some checks on the drives (the
> light on the CD/DVD drive turns on and off, and I can hear a noise,
> similar to the one during a normal boot when the drive is checked for
> media).
>
>
> ** Attachment added: "dmesg (Dell Optiplex 740, 2.6.24-21-generic)"
> http://launchpadlibrarian.net/19493286/dmesg-2.6.24-21-generic.out
>

Marcel Ibes (mibes-avaya) wrote :

I believe this issue also relates to bug: #256637

Scott Kitterman (kitterman) wrote :

That one was fixed before this problem came up, so it's certainly not the same issue.

Marcel Ibes (mibes-avaya) wrote :

What makes you say that that issue was fixed?

It still occurs even in the latest kernel release: 2.6.27-8

This was acknowledged and the issue is still being worked on. See also the linked bug in the kernel bug tracker: http://bugzilla.kernel.org/show_bug.cgi?id=11445

Scott Kitterman (kitterman) wrote :

Because it was marked Fix Released. If it's not that should be corrected.

FernanAguero (fernan-ciudad) wrote :

> Because it was marked Fix Released. If it's not that should be
> corrected.
>
> --
> Fails to find boot device in Intel D945Gnt
> https://bugs.launchpad.net/bugs/290153

Scott,

the bug was marked as affecting both the 8.10 Release Notes,
and probably other components (busy box, initramfs, linux
kernel).

Only the part of the bug that affects the '8.10 Release
Notes' was marked as 'Fixed'.

The part that affects the linux kernel was triaged and
assigned to the kernel team.

Hope that makes it clearer,

fernan

PS: I'm still waiting for a fix, 2.6.27-8 didn't help

Scott Kitterman (kitterman) wrote :

Ah. That's what I get for reading too fast. Thanks.

Pawel Tecza (ptecza) wrote :

I have a very similar problem with Intrepid on a Dell PowerEdge 1550 with 2 SCSI disks
with RAID1 and LVM. /boot partition is also mirrored, but without LVM. After booting
kernel 2.6.27-7-generic I can see the following error message:

ALERT! /dev/mapper/horus-root does not exist, Dropping to the shell!

Fortunately workaround with rootdelay=90 works for me. Thanks for that!

Below is lspci output of my server:

00:00.0 Host bridge: Broadcom CNB20HE Host Bridge (rev 23)
00:00.1 Host bridge: Broadcom CNB20HE Host Bridge (rev 01)
00:00.2 Host bridge: Broadcom CNB20HE Host Bridge (rev 01)
00:00.3 Host bridge: Broadcom CNB20HE Host Bridge (rev 01)
00:01.0 Ethernet controller: Intel Corporation 82557/8/9/0/1 Ethernet Pro 100 (rev 08)
00:02.0 Ethernet controller: Intel Corporation 82557/8/9/0/1 Ethernet Pro 100 (rev 08)
00:03.0 VGA compatible controller: ATI Technologies Inc Rage XL (rev 27)
00:0f.0 ISA bridge: Broadcom OSB4 South Bridge (rev 50)
00:0f.1 IDE interface: Broadcom OSB4 IDE Controller
00:0f.2 USB Controller: Broadcom OSB4/CSB5 OHCI USB Controller (rev 04)
02:05.0 SCSI storage controller: Adaptec AIC-7899P U160/m (rev 01)
02:05.1 SCSI storage controller: Adaptec AIC-7899P U160/m (rev 01)

And here is more details about my system setup:

ptecza@horus:~$ sudo pvscan
  PV /dev/sda2 VG horus lvm2 [9,31 GB / 0 free]
  Total: 1 [9,31 GB] / in use: 1 [9,31 GB] / in no VG: 0 [0 ]

ptecza@horus:~$ sudo vgscan
  Reading all physical volumes. This may take a while...
  Found volume group "horus" using metadata type lvm2

ptecza@horus:~$ sudo lvscan
  ACTIVE '/dev/horus/swap' [1,00 GB] inherit
  ACTIVE '/dev/horus/root' [8,31 GB] inherit

ptecza@horus:~$ cat /proc/partitions
major minor #blocks name

   8 0 17921843 sda
   8 1 248976 sda1
   8 2 17671500 sda2
 254 0 1048576 dm-0
 254 1 8716288 dm-1

Magnes (magnesus2) wrote :

Is there a need for more information (dmesg output etc.)? Is anyone working on this bug? It's quite an important one. Has anyone tried using self-compiled kernels (I can't do that, sorry, no time)? Maybe it got fixed in some newer version of the Linux kernel?

Scott Kitterman (kitterman) wrote :

Reopening against Ubuntu Release notes. Since this problem is clearly wider than Intel D945Gnt, the release note should be revised.

Changed in ubuntu-release-notes:
status: Fix Released → Confirmed
SpinningAround (spinningaround-deactivatedaccount) wrote :

Finally got Ubuntu working by adding all_generic_ide to the boot line. I updated Ubuntu and tried running without all_generic_ide, but it failed with the same problem as before. It looks like it stops somewhere around sd 6:0:0:0: .

Uploading dmesg from when I used all_generic_ide

AD (ad-lfc) wrote :

I get this problem too. I waited 10 minutes and then typed exit, but it didn't work. Then I waited an HOUR and tried again, and it still doesn't work.

Marcel Ibes (mibes-avaya) wrote :

@SpinningAround

I tried your work-around by adding the "all_generic_ide" to the kernel options and it works!!

So far the good news. Although I can now boot with the latest kernel, the solution has a negative impact on disk performance:

-------------

With "all_generic_ide" (kernel: 2.6.27-8-generic):

>> hdparm -tT /dev/md1

/dev/md1:
 Timing cached reads: 1768 MB in 2.00 seconds = 884.18 MB/sec
 Timing buffered disk reads: 54 MB in 3.06 seconds = 17.66 MB/sec

Without "all_generic_ide" (kernel: 2.6.27-4-generic):

>> hdparm -tT /dev/md1

/dev/md1:
 Timing cached reads: 1852 MB in 2.00 seconds = 926.06 MB/sec
 Timing buffered disk reads: 318 MB in 3.00 seconds = 105.84 MB/sec

--------------

As you can see, buffered disk reads drop from 105.84 MB/sec to 17.66 MB/sec on my RAID-0 disk array.

So thanks for supplying us with a work-around, but for the longer term we need an actual fix.

Regards,
Marcel

PS.
My motherboard has an NVIDIA chipset.
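
To put a number on that slowdown, here is a small Python sketch that pulls the buffered-read rate out of the two hdparm lines quoted above and compares them. The helper name buffered_rate is hypothetical, and the regular expression assumes the "Timing buffered disk reads: ... = N MB/sec" layout shown in this comment.

```python
import re

RATE_RE = re.compile(r"Timing buffered disk reads:.*=\s*([\d.]+)\s*MB/sec")


def buffered_rate(hdparm_output):
    """Return the buffered disk read rate in MB/sec from hdparm -t output."""
    match = RATE_RE.search(hdparm_output)
    if match is None:
        raise ValueError("no buffered disk read line found")
    return float(match.group(1))


# Lines copied from the two hdparm runs above.
with_flag = " Timing buffered disk reads: 54 MB in 3.06 seconds = 17.66 MB/sec"
without_flag = " Timing buffered disk reads: 318 MB in 3.00 seconds = 105.84 MB/sec"

slow = buffered_rate(with_flag)
fast = buffered_rate(without_flag)
drop_percent = (fast - slow) / fast * 100
slowdown_factor = fast / slow

print("throughput drop: %.1f%%, slowdown factor: %.1fx" % (drop_percent, slowdown_factor))
```

Note that the two runs were taken on different kernels (2.6.27-8 and 2.6.27-4), so the comparison is indicative rather than a controlled measurement of all_generic_ide alone.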

Magnes (magnesus2) wrote :

Many people here have nForce, I suppose. I have an nForce 570. I didn't add rootdelay because I want to test real solutions when they appear, but I suppose it would work (right now I wait 30 seconds, type exit and hit enter, and that works).

Andy Whitcroft (apw)
Changed in linux:
assignee: ubuntu-kernel-team → apw
status: Triaged → In Progress
Changed in busybox:
status: New → Invalid
status: New → Invalid
Changed in initramfs-tools:
status: New → In Progress
status: In Progress → Invalid
status: New → Invalid
Changed in linux:
assignee: nobody → apw
status: New → In Progress
Andy Whitcroft (apw) wrote :

I believe that a fix for this has just hit the upstream tree in the commit below:

    3c324283e6cdb79210cf7975c3e40d3ba3e672b2

This commit was pushed to the 2.6.27.y stable tree and has been pulled into the next intrepid tree.

Andy Whitcroft (apw) wrote :

The kernel containing this fix has now been accepted into intrepid-proposed, please test and give feedback here. Please see https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you in advance!
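
Before testing, it may help to confirm that -proposed is really enabled. A minimal Python sketch, assuming the classic one-line "deb URI suite components" format of /etc/apt/sources.list; the function name proposed_enabled is hypothetical, and the sample line is the one a later commenter posts from their own system.

```python
def proposed_enabled(sources_text, release="intrepid"):
    """Return True if an active 'deb' line uses the <release>-proposed suite."""
    wanted = release + "-proposed"
    for line in sources_text.splitlines():
        # Drop trailing comments, then split into: type, URI, suite, components...
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 3 and fields[0] == "deb" and fields[2] == wanted:
            return True
    return False


sample = (
    "deb http://pl.archive.ubuntu.com/ubuntu/ intrepid-proposed "
    "main restricted universe multiverse"
)

print(proposed_enabled(sample))
```

On a real system the text would come from reading /etc/apt/sources.list; entries kept in separate files under /etc/apt/sources.list.d/ would need to be checked as well. Enabling the pocket does not guarantee the new kernel has been published yet.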

Changed in linux:
status: In Progress → Fix Released
Andy Whitcroft (apw) wrote :

This commit is part of the v2.6.28-rc4 release. This should be in Jaunty alpha2 release which should be based on v2.6.28

Pawel Tecza (ptecza) wrote :

Andy Whitcroft wrote:
> The kernel containing this fix has now been accepted into intrepid-
> proposed, please test and give feedback here. Please see
> https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to
> enable and use -proposed. Thank you in advance!
>
> ** Changed in: linux (Ubuntu Intrepid)
> Status: In Progress => Fix Released

Hi Andy,

I still can't boot my server using Linux kernel 2.6.27-8-generic without
manual help :) It stops on BusyBox and I need to type `exit` to
continue. I've noticed that I can do it immediately, without any delay.

My best regards,

Pawel

Scott Kitterman (kitterman) wrote :

Unmarking fix released as the proposed kernel does not fix this issue.

I do not see any change in the behavior with this update. I dropped the rootdelay work around and ended up in busybox, just like before. I typed exit immediately and it failed. After waiting some time, exit would again proceed to a normal boot. Adding rootdelay in again works. The only difference I note is that the internal fan now occasionally runs VERY hard for several seconds when it shouldn't need to during the boot process. Not sure if that qualifies as a regression or an improvement.

On a related note, I also installed the proposed kernel in a Dell Latitude D430 for testing and have not seen any regressions after some light testing.

Changed in linux:
status: Fix Released → Confirmed
Scott Kitterman (kitterman) wrote :

After switching back to the -release kernel, never mind on the fan issue. It does it too (I don't reboot this machine very often).

Andy Whitcroft (apw) wrote :

On Fri, Nov 21, 2008 at 01:31:08PM -0000, Paweł Tęcza wrote:

> I still can't boot my server using Linux kernel 2.6.27-8-generic without
> manual help :) It stops on BusyBox and I need to type `exit` to
> continue. I've noticed that I can do it immediately, without any delay.

Right the fix mentioned is in the kernel which was uploaded to -proposed;
version 2.6.26-10.20. To get this kernel you would need to enable the
proposed updates as detailed below:

 https://wiki.ubuntu.com/Testing/EnableProposed

-apw

Andy Whitcroft (apw)
Changed in linux:
status: Confirmed → In Progress
Andy Whitcroft (apw) wrote :

@Scott Kitterman

Thanks for testing, can you confirm that the version you tested was the 2.6.27-10.20? If so then could you then attach the dmesg output from a good boot with the boot delay, as well as an lsmod output.
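
A quick way to check which kernel was actually booted is to compare the output of uname -r against the build under test. The Python sketch below is only an illustration: kernel_abi and has_candidate_fix are hypothetical helpers, and the "2.6.27-10" default is taken from the 2.6.27-10.20 upload named in this comment.

```python
import re


def kernel_abi(release):
    """Split '2.6.27-8-generic' into ((2, 6, 27), 8)."""
    match = re.match(r"^(\d+)\.(\d+)\.(\d+)-(\d+)", release)
    if match is None:
        raise ValueError("unrecognised release string: %r" % release)
    major, minor, patch, abi = (int(group) for group in match.groups())
    return (major, minor, patch), abi


def has_candidate_fix(release, fixed_in="2.6.27-10"):
    """True if the release is at least the build said to carry the patch."""
    return kernel_abi(release) >= kernel_abi(fixed_in)


for release in ("2.6.24-21-generic", "2.6.27-8-generic", "2.6.27-10-generic"):
    print(release, has_candidate_fix(release))
```

On a running system, platform.release() from the standard library returns the same string as uname -r. The release string does not include the upload revision (the ".20" part), so this only confirms the ABI number.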

Scott Kitterman (kitterman) wrote :

On Friday 21 November 2008 11:14, Andy Whitcroft wrote:

> version 2.6.26-10.20.

2.6.27-8-generic is what I installed from proposed.

Andy Whitcroft (apw) wrote :

@Scott Kitterman -- that is very strange as the current kernel in the proposed pocket should be 2.6.26-10.20 as shown below:

     https://launchpad.net/ubuntu/+source/linux

If -proposed was already enabled in your update-manager you may have to hit Check to make sure the package lists are up to date.

Andy Whitcroft (apw) wrote :

bah, in all cases that is 2.6.27-10.20.

Scott Kitterman (kitterman) wrote :

It's still in New:

https://launchpad.net/ubuntu/intrepid/+source/linux/2.6.27-10.20

So I guess the previous comments aren't relevant to this bug.

Andy Whitcroft (apw) wrote :

On Fri, Nov 21, 2008 at 05:27:54PM -0000, Scott Kitterman wrote:
> It's still in New:
>
> https://launchpad.net/ubuntu/intrepid/+source/linux/2.6.27-10.20
>
> So I guess the previous comments aren't relevant to this bug.

My bad, I thought it had hit the archives already. Sorry to have
wasted your time testing the wrong one. If you could test once that -10.20
release hits the archives, that would be great.

-apw

Scott Kitterman (kitterman) wrote :

On Friday 21 November 2008 06:18, Andy Whitcroft wrote:
>        Status: In Progress => Fix Released

One other small point is we don't usually mark Fix Released until it's
in -updates. For -proposed we normally use Fix Committed.

Haydn1 (haydn-miller) wrote :

Hey guys,

I just wanted to thank you for the work you are doing. I was hoping you could help decrypt your conversation.

Is the proposed kernel available?

If so how can I upgrade to get my computer back up and running?

If not is there another workaround that will allow me to get up and running? (I've tried all of the above)

Thanks,

Haydn

Scott Kitterman (kitterman) wrote :

It's not available yet.

Haydn1 (haydn-miller) wrote :

Is there a way to install an earlier kernel?

I am currently only able to boot up into my Hardy Alt install disc.

Any help would be greatly appreciated.

Andy Whitcroft (apw) wrote :

On Fri, Nov 21, 2008 at 06:57:12PM -0000, Scott Kitterman wrote:

> One other small point is we don't usually mark Fix Released until it's
> in -updates. For -proposed we normally use Fix Committed.

Thanks for the heads up, I misunderstood the way the janitor handled
the states and thought I was tracking those in step with it. Process is
always the hardest part to learn!

-apw

charly4711 (karl-h-beckers) wrote :

Or is there a way to dl the binaries resulting from the following build for testing?
https://launchpad.net/ubuntu/+source/linux/2.6.27-10.20/+build/793011

Pawel Tecza (ptecza) wrote :

Andy Whitcroft wrote:
> On Fri, Nov 21, 2008 at 01:31:08PM -0000, Paweł Tęcza wrote:
>
>> I still can't boot my server using Linux kernel 2.6.27-8-generic without
>> manual help :) It stops on BusyBox and I need to type `exit` to
>> continue. I've noticed that I can do it immediately, without any delay.
>
> Right the fix mentioned is in the kernel which was uploaded to -proposed;
> version 2.6.26-10.20. To get this kernel you would need to enable the
> proposed updates as detailed below:
>
> https://wiki.ubuntu.com/Testing/EnableProposed

I have enabled -proposed, of course:

ptecza@horus:~$ grep proposed /etc/apt/sources.list
deb http://pl.archive.ubuntu.com/ubuntu/ intrepid-proposed main restricted universe multiverse

Because I can't see binary 2.6.27-10.20 packages to download, I'm
building my own now. I'll let you know when I finish building and
testing the new kernel.

BTW, I've noticed problem with permissions to /dev/null file under
2.6.27-8-generic:

ptecza@horus:~$ ls -l /dev/null
crw------- 1 root root 1, 3 2008-11-14 10:25 /dev/null

So when I try to ssh to horus server, then I can see a lot of
"-bash: /dev/null: Permission denied" lines:

ptecza@anahaim:~$ ssh horus
ptecza@horus's password:
Linux horus 2.6.27-8-generic #1 SMP Thu Nov 6 17:33:54 UTC 2008 i686

The programs included with the Ubuntu system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Ubuntu comes with ABSOLUTELY NO WARRANTY, to the extent permitted by
applicable law.

To access official Ubuntu documentation, please visit:
http://help.ubuntu.com/
Last login: Mon Nov 24 12:27:50 2008 from XXX.XXX.XXX.XXX.XX
-bash: /dev/null: Permission denied
-bash: /dev/null: Permission denied
-bash: /dev/null: Permission denied
-bash: /dev/null: Permission denied
[...]
-bash: /dev/null: Permission denied

I think it can be useful information for you.

Pawel Tecza (ptecza) wrote :

Paweł Tęcza wrote:

> Because I can't see binary 2.6.27-10.20 packages to download, I'm
> building my own now. I'll let you know when I finish building and
> testing the new kernel.

Unfortunately I don't have good news for you. The issue still exists
under 2.6.27-10.20 and I need to "kick" my server to boot.

ptecza@horus:~$ uname -a
Linux horus 2.6.27-10-generic #1 SMP Mon Nov 24 12:48:09 UTC 2008 i686 GNU/Linux

But it's comforting that the permissions of /dev/null file are OK now :)

ptecza@horus:~$ ls -l /dev/null
crw-rw-rw- 1 root root 1, 3 2008-11-14 10:25 /dev/null

Andy Whitcroft (apw) wrote :

@Pawel Tecza -- it is not 100% clear that your symptoms are the same, as you are able to exit immediately. Could we get the boot log from the -10.20 kernel and your lspci output attached to this bug so we can try to confirm?

Scott Kitterman (kitterman) wrote :

Verification fails for me. System still drops to busybox and after a wait
typing exit gets to a normal boot.

I also tried the updated kernel on my laptop and no obvious regressions.

Marcel Ibes (mibes-avaya) wrote :

Just tested the latest kernel from proposed (2.6.27-10) and I am still unable to boot.

During boot it starts giving these messages:

ata2: COMRESET failed (errno=-16)
ata2: hard resetting link
ata2: link is slow to respond, please be patient (ready=0)

This goes on for a while until eventually it returns:

ata2: reset failed giving up
ata2: EH complete

It then drops into the busybox and the only thing I can do is "reboot" and launch the last working kernel: "2.6.27-4-generic"

Attached to the ata2 controller are two of the four SATA disks in my system, and I need the disks to boot my system.

I have attached the results of "lspci -vv" to this report, in case it might be helpful.

FernanAguero (fernan-ciudad) wrote :

Hi,

It works!

I just installed 2.6.27-10 from intrepid-proposed (many other changes were included in the update, BTW, all from standard ubuntu channels), rebooted and voilá! No busybox, no exit into initramfs shell ... everything works as expected.

This is all on a Dell Optiplex 740n (AMD64 Athlon X2) running Ubuntu x86.

Attached are some diagnostics:
 lspci -vv (lspci.vv.2.6.27-10.txt)
 dmesg (dmesg.2.6.27-10.txt)
 uname -a (uname.2.6.27-10.txt)
 hdparm -tT (hdparm.2.6.27-10.txt, hdparm.2.6.24-21.txt)

The output of hdparm shows that cached reads seem a little bit faster on the 2.6.24-21 kernel. But otherwise disk performance seems to be equal (buffered reads).

Thanks!

Pawel Tecza (ptecza) wrote :

Andy Whitcroft wrote:
> @Pawel Tecza -- it is not 100% clear your symptoms are the same as you
> are able to exit immediately. Could we get the boot log from the -10.20
> kernel and your lspci output attached to this bug so we can try and
> confirm.

Sure. Please see the attachments.

Melvin Garcia (virtualspectre8) wrote :

I reported the same bug on an Acer Aspire 5050-5410.

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/293161

Can someone do something about it?

Magnes (magnesus2) wrote :

I no longer have this bug after updating the system two days ago. I have proposed repository turned on, I have kernel 2.6.27-10-generic now (2.6.27.10.13), I have a mainboard with Nvidia nForce 570 SLI.

Richard Kleeman (kleeman) wrote :

I am the reporter of duplicate bug 244608.

With the updated kernel 2.6.27.10.13 my system still does not boot

lspci gives

00:00.0 Host bridge: Intel Corporation E7525 Memory Controller Hub (rev 0c)
00:00.1 Class ff00: Intel Corporation E7525/E7520 Error Reporting Registers (rev 0c)
00:02.0 PCI bridge: Intel Corporation E7525/E7520/E7320 PCI Express Port A (rev 0c)
00:03.0 PCI bridge: Intel Corporation E7525/E7520/E7320 PCI Express Port A1 (rev 0c)
00:04.0 PCI bridge: Intel Corporation E7525/E7520 PCI Express Port B (rev 0c)
00:1d.0 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #1 (rev 02)
00:1d.1 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #2 (rev 02)
00:1d.2 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #3 (rev 02)
00:1d.3 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #4 (rev 02)
00:1d.7 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB2 EHCI Controller (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev c2)
00:1f.0 ISA bridge: Intel Corporation 82801EB/ER (ICH5/ICH5R) LPC Interface Bridge (rev 02)
00:1f.2 IDE interface: Intel Corporation 82801EB (ICH5) SATA Controller (rev 02)
00:1f.3 SMBus: Intel Corporation 82801EB/ER (ICH5/ICH5R) SMBus Controller (rev 02)
00:1f.5 Multimedia audio controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) AC'97 Audio Controller (rev 02)
02:00.0 PCI bridge: Intel Corporation 6700PXH PCI Express-to-PCI Bridge A (rev 09)
02:00.1 PIC: Intel Corporation 6700/6702PXH I/OxAPIC Interrupt Controller A (rev 09)
02:00.2 PCI bridge: Intel Corporation 6700PXH PCI Express-to-PCI Bridge B (rev 09)
02:00.3 PIC: Intel Corporation 6700PXH I/OxAPIC Interrupt Controller B (rev 09)
03:02.0 SCSI storage controller: Adaptec AIC-7902B U320 (rev 10)
03:02.1 SCSI storage controller: Adaptec AIC-7902B U320 (rev 10)
04:01.0 Multimedia audio controller: Creative Labs SB Audigy (rev 04)
04:01.1 Input device controller: Creative Labs SB Audigy Game Port (rev 04)
04:01.2 FireWire (IEEE 1394): Creative Labs SB Audigy FireWire Port (rev 04)
04:02.0 Ethernet controller: Intel Corporation 82546GB Gigabit Ethernet Controller (rev 03)
04:02.1 Ethernet controller: Intel Corporation 82546GB Gigabit Ethernet Controller (rev 03)
05:00.0 PCI bridge: nVidia Corporation Device 01b3 (rev a3)
06:00.0 PCI bridge: nVidia Corporation Device 01b3 (rev a3)
06:01.0 PCI bridge: nVidia Corporation Device 01b3 (rev a3)
07:00.0 3D controller: nVidia Corporation G71 [GeForce 7950 GX2] (rev a1)
08:00.0 VGA compatible controller: nVidia Corporation G71 [GeForce 7950 GX2] (rev a1)

I assume the scsi disks are the issue for me.

Revision history for this message
Richard Kleeman (kleeman) wrote :

Sorry I should have said I was using

kernel 2.6.27.10.20

Revision history for this message
Sandro Mani (sandromani) wrote :

The issue still persists on my computer, using the latest proposed kernel to date. I also guess the issue is caused by the SCSI controller, see bug #278176 (the issue seems to exist also in other distributions, i.e. https://bugzilla.redhat.com/show_bug.cgi?id=473305).

Specifically my controller is an Adaptec 29160N, but I also noticed the same issue on an Adaptec 29320-R.
In both cases the mainboards had i865 or i875 series Intel chipsets.

Revision history for this message
charly4711 (karl-h-beckers) wrote :

The issue persists for me, too. Please find my lspci output attached. For me, the boot drives are hooked up to a VT8233 based SATA controller. The two drives sda and sdb are used in an LVM volume group. dmesg has this to say:

[ 26.769215] VP_IDE: IDE controller at PCI slot 0000:00:11.1
[ 26.769243] ACPI: Unable to derive IRQ for device 0000:00:11.1
[ 26.769245] ACPI: PCI Interrupt 0000:00:11.1[A]: no GSI
[ 26.769258] VP_IDE: chipset revision 6
[ 26.769260] VP_IDE: not 100% native mode: will probe irqs later
[ 26.769272] VP_IDE: VIA vt8233a (rev 00) IDE UDMA133 controller on pci0000:00:11.1
[ 26.769282] ide0: BM-DMA at 0x9400-0x9407, BIOS settings: hda:DMA, hdb:DMA
[ 26.769295] ide1: BM-DMA at 0x9408-0x940f, BIOS settings: hdc:DMA, hdd:DMA
[ 26.769304] Probing IDE interface ide0...
[ 26.793413] sd 0:0:0:0: [sda] 312581808 512-byte hardware sectors (160042 MB)
[ 26.793458] sd 0:0:0:0: [sda] Write Protect is off
[ 26.793461] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[ 26.793478] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 26.793555] sd 0:0:0:0: [sda] 312581808 512-byte hardware sectors (160042 MB)
[ 26.793564] sd 0:0:0:0: [sda] Write Protect is off
[ 26.793567] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[ 26.793579] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 26.793585] sda: sda1
[ 26.811244] sd 0:0:0:0: [sda] Attached SCSI disk
[ 26.811433] sd 1:0:0:0: [sdb] 312581808 512-byte hardware sectors (160042 MB)
[ 26.811448] sd 1:0:0:0: [sdb] Write Protect is off
[ 26.811450] sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
[ 26.811465] sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 26.811507] sd 1:0:0:0: [sdb] 312581808 512-byte hardware sectors (160042 MB)
[ 26.811515] sd 1:0:0:0: [sdb] Write Protect is off
[ 26.811518] sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
[ 26.811531] sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 26.811534] sdb: sdb1
[ 26.831648] sd 1:0:0:0: [sdb] Attached SCSI disk
[ 26.836660] sd 0:0:0:0: Attached scsi generic sg0 type 0
[ 26.836693] sd 1:0:0:0: Attached scsi generic sg1 type 0

Revision history for this message
charly4711 (karl-h-beckers) wrote :

Btw., since I cannot boot AT ALL with the new kernel, the output above was taken from my old 2.6.22-14-generic

Revision history for this message
Bryan McLellan (btm) wrote :

Upgraded a Dell SC1425 from hardy to intrepid. Controller: Adaptec ASC-39320(B), module: aic79xx

linux-image-2.6.27-7-generic drops into the initramfs shell; waiting a bit and running exit allows the startup to continue.

linux-image-2.6.24-19-generic booted fine.

[ 3.370368] e1000: eth1: e1000_probe: Intel(R) PRO/1000 Network Connection
[ 18.052543] scsi0 : Adaptec AIC79XX PCI-X SCSI HBA DRIVER, Rev 3.0
[ 18.052545] <Adaptec (Dell OEM) 39320 Ultra320 SCSI adapter>
[ 18.052547] aic7902: Ultra320 Wide Channel A, SCSI Id=7, PCI-X 101-133Mhz, 512 SCBs

You can see from the timestamps it takes a little while for the disk to show up.
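
The pause in a log like this can also be located mechanically. What follows is only a minimal sketch, not part of the original report: it assumes dmesg lines start with a bracketed seconds timestamp, and the function name `largest_gap` is made up for illustration. The sample lines are the ones quoted above.

```python
import re

# Matches the leading "[ seconds.micros]" timestamp that dmesg prints.
TIMESTAMP = re.compile(r"^\[\s*(\d+\.\d+)\]")

def largest_gap(lines):
    """Return (gap_seconds, line_before, line_after) for the longest pause."""
    stamped = []
    for line in lines:
        match = TIMESTAMP.match(line)
        if match:
            stamped.append((float(match.group(1)), line))
    best = (0.0, None, None)
    # Compare each timestamped line with the one that follows it.
    for (t0, before), (t1, after) in zip(stamped, stamped[1:]):
        if t1 - t0 > best[0]:
            best = (t1 - t0, before, after)
    return best

sample = [
    "[ 3.370368] e1000: eth1: e1000_probe: Intel(R) PRO/1000 Network Connection",
    "[ 18.052543] scsi0 : Adaptec AIC79XX PCI-X SCSI HBA DRIVER, Rev 3.0",
    "[ 18.052545] <Adaptec (Dell OEM) 39320 Ultra320 SCSI adapter>",
]
gap, before, after = largest_gap(sample)
print(f"{gap:.1f}s pause before: {after}")
```

On the three sample lines the longest pause is the roughly 14.7 seconds between the e1000 probe and the aic79xx driver announcing itself, which is the delay described here.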

Installed linux-image-2.6.27-10-generic from proposed ( https://wiki.ubuntu.com/Testing/EnableProposed )

Same issue.

Added rootdelay=90 to the kopt line in /boot/grub/menu.lst, ran update-grub, and rebooted; the system boots fine.
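
For anyone scripting this workaround across several machines, here is a rough sketch of the kopt edit. It is an illustration only: it assumes the GRUB legacy layout where the default kernel options live on a commented line starting with `# kopt=`, the helper name `add_kopt_option` is hypothetical, and the UUID in the sample is a placeholder. It works on text in memory and never touches /boot; update-grub still has to be run after the real file is changed.

```python
KOPT_PREFIX = "# kopt="

def add_kopt_option(menu_lst_text, option="rootdelay=90"):
    """Append option to the '# kopt=' line unless it is already present."""
    out = []
    for line in menu_lst_text.splitlines():
        if line.startswith(KOPT_PREFIX):
            current = line[len(KOPT_PREFIX):].split()
            if option not in current:
                line = f"{line} {option}"
        out.append(line)
    return "\n".join(out) + "\n"

# Placeholder content; a real menu.lst has many more lines.
sample = "# kopt=root=UUID=0000 ro\n"
print(add_kopt_option(sample), end="")
```

Running the function twice leaves the line unchanged the second time, so it is safe to apply repeatedly.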

Revision history for this message
SpinningAround (spinningaround-deactivatedaccount) wrote :

I tried out 2.6.27-10-generic but it didn't solve the problem; it got stuck at the same spot as before, similar to the post charly4711 wrote on 2008-11-30.

I have the following hardware:
MSI K8N Neo4-F
Samsung SpinPoint F1 HD502IJ

Revision history for this message
Richard Kleeman (kleeman) wrote :

The 2.6.28-2 kernel in intrepid does not solve this for me.

Revision history for this message
Anthony DeChiaro (adechiaro) wrote :

On my end rootdelay=200 doesn't even work. As you can see from dmesg, it initializes USB and some of the ata busses, then hangs for awhile and drops to initramfs. Then it waits awhile longer and finally detects the rest of the hardware and continues booting. Gigabyte GA-G31M-S2L MB here (Intel G31/ICH7 chipsets). Two internal SATA drives + 4 more SATA drives in external enclosure connected via eSATA.

Revision history for this message
Dave Stadnick (dstadnick) wrote :

rootdelay=90 has remedied my "Gave up waiting for root device" error. This is on a different motherboard than the D945Gnt and with a SCSI interface, not SATA.

The 8.10 server install is on a Rackable Systems Phantom4 4.0 system that uses an Intel SE7501WV2 motherboard with dual Xeons and 4 SCSI drives configured in a 2xRAID 1 mirror. A floppy is not installed though I have a CD reader on an AT interface that I used for loading the distro. The motherboard has an Adaptec AIC-7902W SCSI controller.

I threw "all_generic_ide floppy_off irqpoll" at grub without success.

Thanks so much to everyone who has reported the issue and possible work-arounds.

Revision history for this message
Alessio Gaeta (meden) wrote :

Hello.
I experienced the same problem with an ASUS A6Rp, but I can add some hopefully useful information: the system is unable to boot only after the sequence suspend -> failed resume -> forced poweroff (I did not try with hibernate).
I ran into the problem while testing suspend/resume to find the correct quirks. With the option rootdelay=90 added I was able to boot normally; then I tried to reboot and everything worked fine without any added options.
Considering that /dev/disk/by-uuid/<uuid> exists and points to the correct device, maybe the problem is some sort of inconsistent state of the controller/driver caused by the failed resume (or something like that).

Revision history for this message
Alessio Gaeta (meden) wrote :

Forget my previous comment: the described behavior is not repeatable. Sometimes the system boots, sometimes it doesn't, with or without rootdelay. Sorry for the noise.

Revision history for this message
beyond (cwietelman) wrote :

I'm exhausted. I wanted to try out Ubuntu after going back to Windows a few months ago. I DL and burned Ubuntu Ultimate 2.0. I am aware of problems with bad burns (been there), so I burned the disk at 2X.

Popped the DVD in yesterday and came up with this error also. I have spent the last 24 hours, almost straight, troubleshooting everything I could imagine. I am a computer tech, and I do have a little experience with Ubuntu and other Debian based distros. For what I do know about Debian and Ubuntu, NOTHING worked for me.

First I thought it was a HDD issue, so I swapped drives...4 of them and all IDE. Nothing, busy-bastard error every time. I read through some forums and tried all the posted workarounds...nothing again.

Tried then different DVD drives...same on all. (There is some speculation that bad DVD drives are to blame...BS)

Something interesting though; On one occasion I pressed F6 to edit the boot command, I took out the quiet splash command and pressed enter. It showed the modules initiating and giving a general "OK" to everything, until it got to my USB 2.0 hub. There it froze and immediately gave the bastard box error. Conflict/error while loading the USB driver or hub?

Anyone resolve this by removing a USB mouse and/or keyboard and using PS2? Just a thought...probably nothing.

When I am at the initramfs error and I type exit I get a listing of root directories that are not "found", but this IS a live CD, as I am not trying to install the OS, yet. Seems like maybe* the live cd is not able to create those virtual directories...for whatever reason.

My burn is good, burned at 2X. Memory test comes back fine. I run TinyMe, Puppy and Backtrack 2&3 on this hardware on a regular basis. Also installed Ubuntu 8.10 just a couple of months ago...NEVER any problems like this. I LOVE Ubuntu and am very fond of the Ultimate builds, but this is a little more than discouraging.

Another reason I know this DVD is "Good": I installed this on VMware with both the .iso image and the DVD itself as the source. No problems, and it runs fine on VMware. Oh yeah... I did FIRST get the initramfs error, but by default VMware creates every virtual disk as SCSI. I went back and switched that to IDE, fired up VMware and no problems.

If I had a SATA or SCSI drive, I would try that... but hell. I'm tired. Frustrated.

Here is a rundown of my hardware. Maybe it will help...someone. If you have similar hardware and you find a workaround... email or message me.

3.0GHz Pentium 4HT - Prescott
Foxconn 661FX
2GB DDR RAM
Maxtor 260GB IDE HDD
Western Digital 180GB IDE HDD
512MB ATI Sapphire X1650 Pro
Sony DVD-RW AW-Q170A

If you need more details on hardware, see attachment.

Revision history for this message
Andy Whitcroft (apw) wrote :

There do not seem to be any further updates to the NV driver specifically in mainline at this time. So our next step is to confirm whether the problem is seen in the Jaunty kernels. Could those who see this problem test either the Jaunty Alpha-3 live CD or the Jaunty 2.6.28-4.10 kernels from the URLs below and report back:

    http://archive.ubuntu.com/ubuntu/pool/main/l/linux/linux-image-2.6.28-4-generic_2.6.28-4.10_amd64.deb
    http://archive.ubuntu.com/ubuntu/pool/main/l/linux/linux-image-2.6.28-4-generic_2.6.28-4.10_i386.deb

Revision history for this message
Scott Kitterman (kitterman) wrote : Re: [Bug 290153] Re: Fails to find boot device in Intel D945Gnt

On Thu, 15 Jan 2009 17:21:44 -0000 Andy Whitcroft <email address hidden> wrote:
>There do not seem to be any further updates to the NV driver
>specifically in mainline at this time.

I'm a bit confused how this relates to the bug. The original report (mine)
is on an all Intel box.

Revision history for this message
Dave Stadnick (dstadnick) wrote :

Additional information regarding my report of 2008-12-21 - same system, except install to an AT drive continues to have the problem; however, the SCSI controller was still enabled. With the SCSI controller disabled in the BIOS, clean boot off the AT drive. With reference to the original report, this is on an Intel SE7501WV2 motherboard with Adaptec AIC-7902W SCSI controller.

Revision history for this message
charly4711 (karl-h-beckers) wrote :

Since the links to the jaunty kernels above are dead, I dl'ed the kernel 2.6.28-5 from packages.ubuntu.com. I installed the following packages:
linux-backports-modules-2.6.28-5-generic_2.6.28-5.2_i386.deb
linux-headers-2.6.28-5_2.6.28-5.15_all.deb
linux-headers-2.6.28-5-generic_2.6.28-5.15_i386.deb
linux-image-2.6.28-5-generic_2.6.28-5.15_i386.deb
linux-restricted-modules_2.6.28.5.5_i386.deb
linux-restricted-modules-2.6.28-5-generic_2.6.28-5.5_i386.deb
linux-restricted-modules-generic_2.6.28.5.5_i386.deb

For me, the problem persists, only that I now get the COMRESET stuff before the bootsplash. So where previously the order was 1. bootsplash, 2. message, 3. timeout, 4. busybox, it is now 1. message, 2. bootsplash, 3. timeout, 4. busybox.

Revision history for this message
Jonathan Musther (musther-deactivatedaccount) wrote :

I had this issue a few months ago on 8.04, but was about to upgrade to 8.10 anyway, so did that. Haven't had a problem since until today. Interestingly, when I removed 'splash quiet' from the kernel line, the boot was normal. I'm thinking maybe a short rootdelay will do the trick for me; will try that anyway.

Revision history for this message
Scott Kitterman (kitterman) wrote :

I can confirm that the latest kernel just in intrepid-updates does not fix
this problem.

Revision history for this message
d2globalinc (shane-2710studios) wrote :

I can also confirm this - had reported this issue here: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/291791

But it looks like this is also where I belong. I have tested Intrepid 64bit w/ Kernel 2.6.27-11 and the liveCD I made with that would not boot; the same goes for any standard Intrepid LiveCDs. (Have not tested 32bit).

Jaunty Alpha3 Did not work, I'm downloading latest Alpha 4 to try (this time both 32 and 64bit).

Will post results and try the delay workaround mentioned in here.

Revision history for this message
Magnes (magnesus2) wrote :

I reported that the problem was gone for me but I discovered recently that it's not entirely gone. I don't get busybox anymore as long as I don't attach an additional SATA drive. When I attach the SATA drive (third HDD in my system) then it's busybox again, and sometimes typing exit and hitting enter causes a crash. I'll probably test it more in the future; maybe it's a different bug.

Revision history for this message
d2globalinc (shane-2710studios) wrote :

Update to my previous post - I have tested Alpha4 Jaunty with 32 and 64bit liveCDs - and no luck with either one. I'm really excited about some of the things in Jaunty (ext4, etc) - so hopefully this can be resolved before it's out of beta. I'm looking forward to even faster boot up times - so I'm not really wanting to put in any delay work-around here.

I'll keep testing newer builds and let everyone know what I find out.. I'm running Hardy 2.6.24-23-generic x86_64 without a problem!

If anyone needs me to test anything else - just ask! - Thanks! - Shane Menshik - D2 GLOBAL INC.

Revision history for this message
charly4711 (karl-h-beckers) wrote :

"I'm really excited about some of the things in Jaunty..."

I for my part am really excited about not being able to run nvidia drivers on this box for four months. I'm quite loth to use nvidia packaged drivers. Does anybody affected here know whether machines affected by this work with any other Debian-based distro?

Revision history for this message
Scott Kitterman (kitterman) wrote :

This bug has nothing to do with Nvidia drivers.

Revision history for this message
d2globalinc (shane-2710studios) wrote :

Nvidia Drivers? Are you talking about a display issue? This bug report is for issues with Intrepid and also Jaunty not being able to boot to or with a sata dvd-rom drive connected.

Revision history for this message
d2globalinc (shane-2710studios) wrote :

Back on topic - I just got Ubuntu 8.10 to boot on the liveCD - I had to add the option all_generic_ide to the kernel line. This of course allowed me to boot the liveCD but I don't see this as being an option for a regular install since it's going to really degrade performance. But perhaps this can help get to the bottom of the issue? Let me know if I need to test anything else! Thanks!

Revision history for this message
charly4711 (karl-h-beckers) wrote :

OK, have calmed down a bit since yesterday, when I had multiple X crashes with the vesa driver.
The connection with nvidia drivers was that the only way I can boot this machine atm. is using a way old kernel image I only still have out of pure coincidence and because I've been upgrading this machine through various releases. Restricted drivers have long since been upgraded, dkms wouldn't work, etc.
Have now manually dl'ed corresponding kernel sources, configured them and can now use dkms. So, I can work for now.

Revision history for this message
Coeus (steve-overlee) wrote :

Somewhat of a workaround.

For those of you who are able to boot using a USB flash drive, you can use UNetbootin which will make a bootable flash drive for you. So, when you restart your computer, boot using the flash drive and install from there. I had no issues with this method and it was a lot faster than CD.

Revision history for this message
Scott Kitterman (kitterman) wrote :

I'm not sure what bug you're working around, but I'm pretty sure it isn't
this one.

Revision history for this message
Magnes (magnesus2) wrote :

After last kernel update on Intrepid the bug is back. :|
I have to wait about a minute and then write exit and hit enter.

Revision history for this message
Andy Whitcroft (apw) wrote :

Now that we have mainline kernel builds, could those who are affected by this try the latest mainline kernel and see if that works correctly? This may help us figure out what is different in the Jaunty kernels. I would like to get a comparison with both the latest mainline kernel, v2.6.29-rc7:

    http://kernel.ubuntu.com/~kernel-ppa/mainline/v2.6.29-rc7/

plus with the unmodified equivalent of the current Intrepid:

    http://kernel.ubuntu.com/~kernel-ppa/mainline/v2.6.27.18/

or Jaunty kernels:

    http://kernel.ubuntu.com/~kernel-ppa/mainline/v2.6.28.7/

More information on those builds can be found at the URL below:
    https://wiki.ubuntu.com/KernelMainlineBuilds

Revision history for this message
Richard Kleeman (kleeman) wrote :

I tested this on the following kernels

2.6.28-9
2.6.29-rc7

The problem remains the same for both.

Revision history for this message
nomoa (dcausse) wrote :

We are also affected by this bug. We have had to use the rootdelay workaround with a delay of 420. We use a Dell PV-124T LTO-2 autochanger which takes ages to load, and because devices appear after the autochanger init we have to wait for a long time.
I'm a bit confused by the proposed kernel fix. Because this server is our bacula backup system I cannot use it as a scratchpad, so we'll wait for an official fix, sorry. Feel free to ask for additional information.

# uname -a
Linux potiron 2.6.27-11-server #1 SMP Thu Jan 29 20:19:41 UTC 2009 i686 GNU/Linux

# lspci :
00:00.0 Host bridge: Intel Corporation E7230/3000/3010 Memory Controller Hub
00:01.0 PCI bridge: Intel Corporation E7230/3000/3010 PCI Express Root Port
00:1c.0 PCI bridge: Intel Corporation 82801G (ICH7 Family) PCI Express Port 1 (rev 01)
00:1c.4 PCI bridge: Intel Corporation 82801GR/GH/GHM (ICH7 Family) PCI Express Port 5 (rev 01)
00:1c.5 PCI bridge: Intel Corporation 82801GR/GH/GHM (ICH7 Family) PCI Express Port 6 (rev 01)
00:1d.0 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #1 (rev 01)
00:1d.1 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #2 (rev 01)
00:1d.2 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #3 (rev 01)
00:1d.7 USB Controller: Intel Corporation 82801G (ICH7 Family) USB2 EHCI Controller (rev 01)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev e1)
00:1f.0 ISA bridge: Intel Corporation 82801GB/GR (ICH7 Family) LPC Interface Bridge (rev 01)
00:1f.1 IDE interface: Intel Corporation 82801G (ICH7 Family) IDE Controller (rev 01)
00:1f.2 IDE interface: Intel Corporation 82801GB/GR/GH (ICH7 Family) SATA IDE Controller (rev 01)
00:1f.3 SMBus: Intel Corporation 82801G (ICH7 Family) SMBus Controller (rev 01)
02:00.0 PCI bridge: Intel Corporation 6702PXH PCI Express-to-PCI Bridge A (rev 09)
03:02.0 SCSI storage controller: Adaptec AHA-3960D / AIC-7899A U160/m (rev 01)
03:02.1 SCSI storage controller: Adaptec AHA-3960D / AIC-7899A U160/m (rev 01)
04:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5721 Gigabit Ethernet PCI Express (rev 11)
05:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5721 Gigabit Ethernet PCI Express (rev 11)
06:05.0 VGA compatible controller: XGI Technology Inc. (eXtreme Graphics Innovation) Volari Z7/Z9/Z9s

# cat /proc/scsi/scsi
Attached devices:
Host: scsi1 Channel: 00 Id: 00 Lun: 00
  Vendor: HL-DT-ST Model: CD-ROM GCR-8240N Rev: 1.06
  Type: CD-ROM ANSI SCSI revision: 05
Host: scsi3 Channel: 00 Id: 00 Lun: 00
  Vendor: ATA Model: WDC WD800JD-75LS Rev: 09.0
  Type: Direct-Access ANSI SCSI revision: 05
Host: scsi4 Channel: 00 Id: 00 Lun: 00
  Vendor: ATA Model: WDC WD1600JS-75M Rev: 03.0
  Type: Direct-Access ANSI SCSI revision: 05
Host: scsi0 Channel: 00 Id: 06 Lun: 00
  Vendor: CERTANCE Model: ULTRIUM 2 Rev: 1775
  Type: Sequential-Access ANSI SCSI revision: 03
Host: scsi0 Channel: 00 Id: 06 Lun: 01
  Vendor: DELL Model: PV-124T Rev: 0008
  Type: Medium Changer ANSI SCSI revision: 02

Revision history for this message
Richard Kleeman (kleeman) wrote :

Some good news at last!

With the latest Jaunty kernel (2.6.28-11-generic) this problem has disappeared. I have tried a large number of different kernels and this is the only one since intrepid's first kernel that does not have the problem. Let me know if you need any more information from my system.

Revision history for this message
Andy Whitcroft (apw) wrote :

@Richard -- it would be good to confirm the nearest previous version you tried where it does not work. To limit the search for the trigger.

@All -- if those of you who are affected could try the Jaunty 2.6.28-11 kernel and report back if that also fixes the issue for you.

Changed in linux:
status: In Progress → Incomplete
Revision history for this message
Richard Kleeman (kleeman) wrote :

OK I retried this with the 2.6.28-11-generic and unfortunately it is not working so the fix I reported
does not work. It is weird that it should boot OK one time and not the next.

Revision history for this message
TJ (tj) wrote :

Andy, is this closely coupled to libata issues or could it be yet another by-product of the RCU idle issue I proposed a patch for last week (Friday, 14:52 - "rcu: Teach RCU that idle task is not quiscent state at boot") ?

Revision history for this message
FernanAguero (fernan-ciudad) wrote :

> @Richard -- it would be good to confirm the nearest previous version you
> tried where it does not work. To limit the search for the trigger.
>
> @All -- if those of you who are affected could try the Jaunty 2.6.28-11
> kernel and report back if that also fixes the issue for you.

Andy,

can you briefly tell us how to 'try the Jaunty 2.6.28-11
kernel' ... is this something we can do using the update
manager and then go back to our normal intrepid life?

Can we just download/install/boot the 2.6.28-11 kernel in an
intrepid background? Would that be of help to debug the
issue?

Sorry for the questions, but I'd like to be of help.

Cheers,

Fernan

Revision history for this message
Andy Whitcroft (apw) wrote :

@FernanAguero -- it seems that this fix no longer works for the reporter either so I would ignore it for now.

Revision history for this message
Andy Whitcroft (apw) wrote :

Ok, I have built some test kernels with the patch suggested by TJ. Perhaps those of you affected by this issue could try the kernels and report back here. Kernels are at the URL below:

    http://people.ubuntu.com/~apw/lp290153-jaunty/

Revision history for this message
Richard Kleeman (kleeman) wrote :

I tried the kernel provided by Andy and the problem remains. I used the amd64 version of the kernel.

Revision history for this message
Steve Langasek (vorlon) wrote :

As we don't appear to have a handle on this bug's origin, I'm milestoning it for jaunty-updates. It remains a candidate for intrepid/jaunty SRU if a fix becomes available.

Changed in linux (Ubuntu Jaunty):
milestone: none → jaunty-updates
Revision history for this message
Seth (bugs-sehe) wrote :

Have this problem since my Jaunty upgrade. Upgraded from existing intrepid install. Config identical to hostname config. In fact I can still boot into a snapshot of the pre-upgrade intrepid (have backups, at all times). Immediately exiting the busybox shell works all the time.

Did anyone investigate whether all reporting users use LVM? It seemed so to me. Note: LVM, not RAID per se. (I use LVM with no encryption, striping, raid or whatever. Plain LVM+xfs for ease of administration).

Note that I also have the same problem now when booting into that intrepid install (using the same kernel/initramfs 2.6.28-11-generic). I can guarantee (lvm snapshot isolation) that it wasn't touched *EXCEPT* for the shared /boot partition. This narrows it to a kernel/initramfs change in the kernel for Jaunty Beta (09 April 2009) vs. intrepid latest kernel?

I do recall that there had been a kernel upgrade in the 'regular' updates for my intrepid box yesterday. I *have* rebooted the intrepid box without problems *after* that particular kernel update but *before* the dist-upgrade to Jaunty. I have *not* seen a smooth reboot since dist-upgrading to Jaunty.

I did also note that the dist-upgrade seems to use a deviant method of rebooting after install? It seems to bypass the BIOS and instead invoke the boot-loader code directly (no BIOS POST screens appeared). This might not play well with my hdparm settings since I auto-suspend most of my disks after some 15 seconds and this easily leads to timing issues when disks need to be awoken. However, the same issues also arise when doing cold/BIOS boots.

Here are my system details:
MOBO Asus P5b Deluxe Wifi
root@hostname:~# uname -a
Linux hostname.sehe.nl 2.6.28-11-generic #41-Ubuntu SMP Wed Apr 8 04:38:53 UTC 2009 i686 GNU/Linux
root@hostname:~# lvm version
  LVM version: 2.02.39 (2008-06-27)
  Library version: 1.02.27 (2008-06-25)
  Driver version: 4.14.0
root@hostname:~# lspci
00:00.0 Host bridge: Intel Corporation 82P965/G965 Memory Controller Hub (rev 02)
00:01.0 PCI bridge: Intel Corporation 82P965/G965 PCI Express Root Port (rev 02)
00:1a.0 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI Controller #4 (rev 02)
00:1a.1 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI Controller #5 (rev 02)
00:1a.7 USB Controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI Controller #2 (rev 02)
00:1b.0 Audio device: Intel Corporation 82801H (ICH8 Family) HD Audio Controller (rev 02)
00:1c.0 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 1 (rev 02)
00:1c.4 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 5 (rev 02)
00:1c.5 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 6 (rev 02)
00:1d.0 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI Controller #1 (rev 02)
00:1d.1 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI Controller #2 (rev 02)
00:1d.2 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI Controller #3 (rev 02)
00:1d.7 USB Controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI Controller #1 (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev f2)
00:1f.0 ISA bri...

Revision history for this message
Seth (bugs-sehe) wrote :

PS. My exact method of distupgrading was actually 'update-manager -d' in case anyone was wondering about the reboot-method thingie I mentioned

Revision history for this message
Scott Kitterman (kitterman) wrote :

No. This issue is in Intrepid too and no lvm involved.

Revision history for this message
Seth (bugs-sehe) wrote :

Added findings:

(1) I was overly optimistic when I said that exiting the initramfs shell immediately helped. It often requires a wait of 30-60 seconds to succeed.
(2) An added observation that could shed some light: the hard-disk activity LED is lit continuously from the moment I boot the grub selection. This continues *even after* getting the busybox prompt. Watching the activity light for inactivity is a very reliable indicator of 'disk readiness'. In other words, booting succeeds as soon as the hard disk activity has died down. Perhaps there is some disk scanning activity involved that takes a long time, locking the disk? Is there any way to tell what the system is doing during that long period of disk activity?

Ok, so what should happen for this issue to get more attention? All I can see now is that this ticket is 'Invalid' and in various other confusing states across a large number of packages(?). I appreciate that no-one may know exactly what is going on, but it seems pretty clear that it is a pervasive problem that bugs many users.

I'm willing to help out with whatever smart questions that might be asked. At the moment, though I'm at a loss.

Revision history for this message
Scott Kitterman (kitterman) wrote :

This is currently a priority item for both the kernel team and the release
team. Since no solution has been found yet, the odds of this getting fixed
before release are low. More attention won't help.

Revision history for this message
Seth (bugs-sehe) wrote :

Has a cause been identified? I can certainly help with that part! Here's some more I dug up (attached):

Bootcharts from my intrepid install directly before upgrading
And from the jaunty directly after upgrading (luckily, bootchart was always installed on my system)

Lvm seems to be taking an awfully long time *and* displaying the odd disk activity in bootchart that I described. I wouldn't know what lvm is doing in that time, and thus why it might take longer than expected. I guess I will try to get rid of my snapshots - just to see whether that changes the pathological behaviour. I'll also be digging around for an older kernel/initrd that I might have on backup somewhere to see whether I can pry them apart to find relevant differences. Any pointers on how to do that are appreciated.

Revision history for this message
Seth (bugs-sehe) wrote :

As you can see, the boot time is pretty pathetic now. Up from 40s (before upgrade) to 2m14s (resuming as quickly as possible)

Revision history for this message
Scott Kitterman (kitterman) wrote :

Since this bug happened on Intrepid and non-lvm systems, I suspect you are
having a different problem and should file a new bug.

Revision history for this message
Seth (bugs-sehe) wrote :

Thanks for the hint. I found bug #332270... it might be more akin to what I'm describing. Still might be fruitful to cross examine that one: udev's change to inotify is known to cause problems/slowness in booting by constant firings of change events. This might happen without lvm? I don't know about the versions used on intrepid, though.

Revision history for this message
Scott Kitterman (kitterman) wrote :

What is incomplete about this bug?

Revision history for this message
Seth (bugs-sehe) wrote :

Ok, my symptoms resolved: FWIW I found out that 'lvchange -a y' was taking ages on my volume group that contained a snapshot of 18Gb that was 50% full. Ages measuring in minutes; it would prevent a successful boot. I found out by repeatedly running:

/sbin/lvm vgchange -a n
/sbin/lvm vgchange -a y

/sbin/lvm lvremove vg/snapshot

/sbin/lvm vgchange -a n
/sbin/lvm vgchange -a y

Of course I made a backup of the data in the snapshot that I actually wanted to keep :)
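
For anyone wanting to put numbers on the activation time described above, here is a minimal sketch of a timing helper. The helper name time_command is made up for illustration, and the sketch only measures wall-clock time for whatever command it is given; it assumes nothing about what lvm does internally.

```python
import subprocess
import sys
import time


def time_command(argv):
    """Run argv and return (exit_code, elapsed_seconds) using a monotonic clock."""
    start = time.monotonic()
    result = subprocess.run(
        argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return result.returncode, time.monotonic() - start


# Harmless demonstration: time a no-op Python process.
# To time the real thing (as root), one could pass for example
#   ["/sbin/lvm", "vgchange", "-a", "y"]
code, elapsed = time_command([sys.executable, "-c", "pass"])
print(f"exit {code}, {elapsed:.2f}s")
```

Timing the activation once before and once after removing the snapshot would show whether the snapshot accounts for the delay on a given system.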

Anyone seeing prolonged disk activity while failing to boot, check this, or head over to bug #332270... I'm still considering filing this as a bug against lvm2 (unless I find some documentation on lvm2 that I should have read, stating that this behaviour is by design).

Regards,
Seth

Revision history for this message
wirechief (wirechief) wrote :

I have found that adding rootdelay=90 resolves an issue with dropping to an initramfs busybox shell when booting with
a usb-stick made by usb-creator in jaunty 9.04. That bug is https://bugs.launchpad.net/bugs/276822
Linux wirechief-laptop 2.6.28-11-generic #41-Ubuntu SMP Wed Apr 8 04:39:23 UTC 2009 x86_64 GNU/Linux

Revision history for this message
Scott Kitterman (kitterman) wrote :

I think this was meant to go to confirmed.

Changed in linux (Ubuntu Jaunty):
status: Incomplete → Confirmed
Revision history for this message
Scott Kitterman (kitterman) wrote :

I just upgraded the machine that caused me to file this bug to Jaunty and it now boots without rootdelay. I still get several of the ata4: SRST failed (errno=-16) errors in the boot process. They come at ~ 22, 32, 67, and I think 72 and 74 seconds. So it looks like the root cause is still there, but 2.6.28 is more patient.

Revision history for this message
Morgan Jones (maclover201) wrote :

Oddly enough, I had this problem constantly when I was running the 2.6.27 kernel and above (getting the SRST failed messages). I decided to switch from Ubuntu's kernel to the mainline Linux kernel for now (and am running 2.6.29.1 at the moment) and it seems to have partially fixed the problem. The only weird thing from viewing dmesg output in 2.6.29.1 is that the CPU stalls out for ~14 seconds. It doesn't need rootdelay either, but there's obviously still a stall while it probes the IDE interfaces.

This may somehow be related to bug 294123.

Anyway, the strange thing is, what caused my previous errors was this configuration:

00:00.0 Host bridge: Silicon Integrated Systems [SiS] 651 Host (rev 02)
00:01.0 PCI bridge: Silicon Integrated Systems [SiS] Virtual PCI-to-PCI bridge (AGP)
00:02.0 ISA bridge: Silicon Integrated Systems [SiS] SiS962 [MuTIOL Media IO] (rev 04)
00:02.1 SMBus: Silicon Integrated Systems [SiS] SiS961/2 SMBus Controller
00:02.5 IDE interface: Silicon Integrated Systems [SiS] 5513 [IDE]
00:02.6 Modem: Silicon Integrated Systems [SiS] AC'97 Modem Controller (rev a0)
00:02.7 Multimedia audio controller: Silicon Integrated Systems [SiS] AC'97 Sound Controller (rev a0)
00:03.0 USB Controller: Silicon Integrated Systems [SiS] USB 1.1 Controller (rev 0f)
00:03.1 USB Controller: Silicon Integrated Systems [SiS] USB 1.1 Controller (rev 0f)
00:03.2 USB Controller: Silicon Integrated Systems [SiS] USB 1.1 Controller (rev 0f)
00:03.3 USB Controller: Silicon Integrated Systems [SiS] USB 2.0 Controller
00:0e.0 Network controller: RaLink RT2561/RT61 802.11g PCI
00:0f.0 Multimedia video controller: Sony Corporation Device 8087 (rev 01)
00:12.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10)
00:13.0 FireWire (IEEE 1394): NEC Corporation uPD72874 IEEE1394 OHCI 1.1 3-port PHY-Link Ctrlr (rev 01)
01:00.0 VGA compatible controller: nVidia Corporation NV17 [GeForce4 MX 440] (rev a3)

I don't even see anything Intel in there, so there may be a problem with more than Intel cards. Attached is my 2.6.27.10 dmesg output, clearly showing the SRST failed messages.

Revision history for this message
Morgan Jones (maclover201) wrote :

It's apparent that something was fixed between 2.6.27 and 2.6.28 and fixed even more between 2.6.28 and 2.6.29.1.

Revision history for this message
Forest (foresto) wrote :

Just to add to the list of hardware experiencing this problem:

3ware 8006-2LP RAID controller
Intel P35 chipset

I haven't tried the newest kernel; I'm still using the standard Intrepid repositories.

Revision history for this message
Andy Whitcroft (apw) wrote :

Based on Scott's feedback here on Jaunty I am going to close the Jaunty task fixed, as the device is found there without need for modification of the boot configuration. I suspect Karmic is also fixed but is as yet untested.

Changed in linux (Ubuntu Jaunty):
status: Confirmed → Fix Released
Revision history for this message
d2globalinc (shane-2710studios) wrote :

I still have this issue in Jaunty (2.6.28-11) x86_64 when I place my sata dvd drive on the same channels as my sata (not sata2) hard drives. So I don't know why this is being marked as fixed.

Revision history for this message
Scott Kitterman (kitterman) wrote : Re: [Bug 290153] Re: Fails to find boot device in Intel D945Gnt

I think fixed is too strong a term as it still takes a very long time for
the device to turn up. I think the consequences of this bug are reduced
(slow boot instead of failed boot), but the underlying defect is still
present.

Revision history for this message
Richard Kleeman (kleeman) wrote :

Fixed for me in version 2.6.28-12 of the kernel. Relative to the old kernels in 8.04 the boot time is now much faster as well.

Thanks for fixing this!

Revision history for this message
Andy Whitcroft (apw) wrote :

@Scott -- possibly yes. I was basing that on both your feedback and that of @Morgan, who seems to note things improving over time. However, looking at his feedback I am not sure his kernel is configured correctly. @Richard just throws more confusion in, as his system is now working.

@Morgan -- I notice that your later two kernels are reporting the disks as hdX devices but the 2.6.27.10 kernel reported them as scsi devices. I presume you have used some different options to compile those so that it is using different drivers to handle them.

@Richard -- could you indicate which version prior to this did not work, and if possible could you attach dmesg output for boots with both of those two for comparison.

Revision history for this message
Andy Whitcroft (apw) wrote :

@Scott -- actually could you also attach a current dmesg output from your working but very slow boot for me.

Revision history for this message
Morgan Jones (maclover201) wrote :

Hmm, I don't think I configured those kernels differently. Let me check to see if the new 2.6.29.3 works.

Revision history for this message
Morgan Jones (maclover201) wrote :

2.6.29.3 still gives those SRST failed messages.

In summary, here's what each kernel version does for me.

2.6.24-21-generic: Boots right up
2.6.27.10: SRST failed
2.6.28-rc6: SRST failed
2.6.28: SRST failed
2.6.28-11-generic: SRST failed, but boots up after 60 seconds
2.6.28.4: SRST failed
2.6.28.5: SRST failed
2.6.28.6: SRST failed
2.6.29: SRST failed
2.6.29.1: Better, but CPU sometimes stalls and I do get the SRST failed messages.
2.6.29.2: Regressed to problems of 2.6.29, with SRST failed messages
2.6.29.3: SRST failed

This is evidence of an ongoing problem. @Andy, how can I check what kind of device each kernel reports my HD as?

Attached: 2.6.29 dmesg output

Revision history for this message
Andy Whitcroft (apw) wrote :

you will find you have /dev/sd* or /dev/hd* for these kernels. and the same strings should be reported in the dmesg.
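
As an illustration of the check described above, here is a minimal sketch that pulls sdX and hdX disk names out of saved dmesg text. The function name disk_names and the regular expression are assumptions made for this example, not part of any kernel tooling. Names of the form sdX indicate the disk is handled through the SCSI disk layer (as libata does), while hdX indicates the older IDE drivers.

```python
import re

# Matches whole-disk names such as sda or hdc, optionally followed by a
# partition number (sda1). Word boundaries avoid hits inside longer words.
DEVICE_RE = re.compile(r"\b([sh]d[a-z])\d*\b")


def disk_names(dmesg_text):
    """Return the sorted list of distinct whole-disk names found in the log text."""
    return sorted({m.group(1) for m in DEVICE_RE.finditer(dmesg_text)})


sample = (
    "[ 1.088696] sda: sda1 sda2 < sda5 >\n"
    "[ 1.131086] sd 2:0:0:0: [sda] Attached SCSI disk\n"
)
print(disk_names(sample))
```

Running this over the dmesg output of each kernel would show at a glance which of them report the drive as sdX and which as hdX.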

Revision history for this message
Forest (foresto) wrote :

I'm using 2.6.28-11-generic on Jaunty, and the bug is still present. (It first appeared for me with one of the Intrepid kernels; never happened with earlier releases.)

Of note, I do not have any drives on the motherboard's SATA bus. My boot device is a 3ware hardware RAID card whose driver is part of the generic kernel. Please let me know if more information from my system would help.

Changed in linux (Ubuntu Jaunty):
status: Fix Released → Confirmed
Revision history for this message
Jake (palmzealot-gmail) wrote :

I don't know if this helps or not, but I'll throw it in anyway. I had the same issue with 2.6.28-12-generic on the same Intel MB. When I tried to reinstall the kernel from the CLI, APT gave me a message about DPKG needing to be reconfigured. I followed the command it gave me (sudo dpkg --configure -a). It went through process of reconfiguring everything, including the kernel. From that point, I've been able to boot just fine, without a workaround.

Revision history for this message
gunbladeiv (gunbladeiv) wrote :

I still had the same issue after upgrading to the karmic kernel 2.6.30-10-generic. So I think the bug exists in karmic.
It does boot fine when I exit the initramfs, and rootdelay=30 didn't work for me. I will try adding another 30 to the rootdelay number.

Revision history for this message
Andy Whitcroft (apw) wrote :

@Morgan -- did you manage to figure out which configuration you were using for your various kernels? It might be worth doing the same test with the kernels from the Mainline Kernel archive, as we build those with the same options as we build the Ubuntu kernels.

@Jake -- that's rather weird; I would not expect that to make any difference at all. What version of the kernel are you running now (uname -a output)?

Revision history for this message
Morgan Jones (maclover201) wrote :

@Andy- The tests I did were mainly with the mainline kernel. I got the same issues I had with the (-generic) kernels.
Two more updates here:
2.6.29.4 - SRST failed
2.6.30 - SRST failed
Still not fixed in 2.6.30, obviously. I think (in 2.6.29.4) my kernel sees my drive as an sda/sdb device. It doesn't work either way, unfortunately. I think the root of the issue is the kernel not being able to softreset the hard drive, which eventually results in a timeout and the kernel doing _something_ to manage to reset the HD and eventually boot the system.

Morgan

Revision history for this message
Andy Whitcroft (apw) wrote :

@Morgan -- would you be able to test the latest 2.6.31-rc4 mainline kernel (or later if it's appeared) and post me a dmesg from that kernel? Thanks!

    https://wiki.ubuntu.com/KernelTeam/MainlineBuilds

Revision history for this message
Morgan Jones (maclover201) wrote :

Ok, I will do so when I get around to it. I am out of town until Saturday and will not be at the computer involved until Monday. I am running 2.6.30.3 mainline right now and this issue has not been fixed. But I will try.

Revision history for this message
Morgan Jones (maclover201) wrote :

This could be fixed in 2.6.31-rc4.

From the changelog:

libata: fix follow-up SRST failure path

ata_eh_reset() was missing error return handling after follow-up SRST allowing EH to continue the normal probing path after reset failure. This was discovered while testing new WD 2TB drives which take longer than 10 secs to spin up and cause the first follow-up SRST to time out.

I will actually remotely connect and start building 2.6.31-rc4 now. I will remote reboot and post the dmesg by the end of today. Ignore my previous message.

Revision history for this message
Morgan Jones (maclover201) wrote :

@Andy - Do you know of a boot param that increases dmesg buffersize? I can't get the full log from 2.6.31-rc4 as it is truncated.

Thanks! Morgan

Revision history for this message
Morgan Jones (maclover201) wrote :

Full dmesg output for kernel 2.6.31-rc4

Still get the SRST failed message... but it only probes interfaces for ~60 seconds which is better.

Steve Langasek (vorlon)
Changed in linux (Ubuntu Karmic):
milestone: jaunty-updates → none
Revision history for this message
Andy Whitcroft (apw) wrote :

@Morgan Jones -- the dmesg buffer size was upped by default in the latest karmic kernels, you can also dynamically change it using the log_buf_len=nnn (this needs to be a power of 2) kernel parameter. Would you also be able to test with the 2.6.31 based kernels in Karmic and let us know how this bug stands.
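
Since the value passed to log_buf_len has to be a power of 2, here is a minimal sketch for rounding a desired byte count up to the next one. The function name next_power_of_two and the 200000-byte figure are illustrative assumptions only.

```python
def next_power_of_two(n):
    """Return the smallest power of two that is >= n (n must be at least 1)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return 1 << (n - 1).bit_length()


# Example: a log of roughly 200000 bytes needs a 262144-byte buffer.
size = next_power_of_two(200000)
print(f"log_buf_len={size}")
```

The printed string can then be appended to the kernel line in the boot loader configuration for the test boot.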

Revision history for this message
manicmike (mike-better-access) wrote :

I upgraded to Karmic last night (about 16 hours ago) and it wouldn't boot.
Haven't tried the delay, and can't access the machine now (because it wouldn't boot, of course).

Revision history for this message
Andy Whitcroft (apw) wrote :

There is a commit related to link init failures in the final karmic kernel which may be relevant. So it may be worth testing with 2.6.31-14.46:

  commit e827d71dd6d39b3e28a519cb0bace9634d42aa7d
  Author: Tejun Heo <email address hidden>
  Date: Tue Oct 6 17:08:40 2009 +0900

    libata: fix incorrect link online check during probe

    commit 3b761d3d437cffcaf160a5d37eb6b3b186e491d5 upstream.

    While trying to work around spurious detection retries for
    non-existent devices on slave links, commit
    816ab89782ac139a8b65147cca990822bb7e8675 incorrectly added link
    offline check logic before ata_eh_thaw() was called. This means that
    if an occupied link goes down briefly at the time that offline check
    was performed, device class will be cleared to ATA_DEV_NONE and libata
    wouldn't retry thus failing detection of the device.

Changed in linux (Ubuntu Karmic):
milestone: none → karmic-updates
status: Confirmed → Incomplete
Changed in linux (Ubuntu Jaunty):
assignee: Andy Whitcroft (apw) → nobody
Changed in linux (Ubuntu Intrepid):
assignee: Andy Whitcroft (apw) → nobody
status: In Progress → Confirmed
Revision history for this message
Pete Graner (pgraner) wrote :

@Scott K. can you please test per comment #163 and let us know if that patch fixed the issue.

Thanks

Revision history for this message
Scott Kitterman (kitterman) wrote :

I upgraded the affected box today and it seems similar to Jaunty. It does boot without the rootdelay workaround that I had to use in Intrepid, but is still very slow

[ 1.088696] sda: sda1 sda2 < sda5 >
[ 1.131086] sd 2:0:0:0: [sda] Attached SCSI disk
[ 5.944009] ata4: link is slow to respond, please be patient (ready=0)
[ 10.928009] ata4: device not ready (errno=-16), forcing hardreset
[ 16.124011] ata4: link is slow to respond, please be patient (ready=0)
[ 20.940009] ata4: SRST failed (errno=-16)
[ 26.136008] ata4: link is slow to respond, please be patient (ready=0)
[ 30.952009] ata4: SRST failed (errno=-16)
[ 36.148011] ata4: link is slow to respond, please be patient (ready=0)
[ 65.996009] ata4: SRST failed (errno=-16)
[ 71.024009] ata4: SRST failed (errno=-16)
[ 71.024013] ata4: reset failed, giving up

I'm attaching /var/log/dmesg. Please let me know if you need anything else.
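
To compare boots across kernels, the excerpt above can be reduced to a per-port summary. The following is a minimal sketch; the function name reset_summary and the line pattern are assumptions based only on the log format shown in this comment, and other kernels may format these messages differently.

```python
import re

# "[ 20.940009] ata4: SRST failed (errno=-16)" -> timestamp, port, message
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(ata\d+): (.*)$")


def reset_summary(dmesg_text):
    """Map each ataN port to (first_timestamp, last_timestamp, srst_failures)."""
    summary = {}
    for line in dmesg_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # not an ataN message
        ts, port, msg = float(match.group(1)), match.group(2), match.group(3)
        first, _, fails = summary.get(port, (ts, ts, 0))
        if "SRST failed" in msg:
            fails += 1
        summary[port] = (first, ts, fails)
    return summary


sample = """\
[ 1.131086] sd 2:0:0:0: [sda] Attached SCSI disk
[ 5.944009] ata4: link is slow to respond, please be patient (ready=0)
[ 20.940009] ata4: SRST failed (errno=-16)
[ 71.024013] ata4: reset failed, giving up
"""
for port, (first, last, fails) in reset_summary(sample).items():
    print(f"{port}: {fails} SRST failure(s) over {last - first:.1f}s")
```

The time between the first and last message for a port gives a rough figure for how long the boot spent waiting on that port.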

Revision history for this message
Scott Kitterman (kitterman) wrote : Appears the problem still exists in Karmic
  • dmesg (38.8 KiB, application/octet-stream; name="dmesg")

I upgraded the system that caused me to file this bug to Karmic today and
it appears similar to Jaunty. It takes forever to boot, but gets there
eventually without the root delay work around.

[ 1.088696] sda: sda1 sda2 < sda5 >
[ 1.131086] sd 2:0:0:0: [sda] Attached SCSI disk
[ 5.944009] ata4: link is slow to respond, please be patient (ready=0)
[ 10.928009] ata4: device not ready (errno=-16), forcing hardreset
[ 16.124011] ata4: link is slow to respond, please be patient (ready=0)
[ 20.940009] ata4: SRST failed (errno=-16)
[ 26.136008] ata4: link is slow to respond, please be patient (ready=0)
[ 30.952009] ata4: SRST failed (errno=-16)
[ 36.148011] ata4: link is slow to respond, please be patient (ready=0)
[ 65.996009] ata4: SRST failed (errno=-16)
[ 71.024009] ata4: SRST failed (errno=-16)
[ 71.024013] ata4: reset failed, giving up

/var/log/dmesg attached.

Changed in linux (Ubuntu Karmic):
status: Incomplete → Confirmed
Revision history for this message
Magnes (magnesus2) wrote :

In Karmic my problem went (almost) away. My dvd-rom works flawlessly (but I have to do more tests). System boots with every hard disk configuration. The only remaining problem: while booting there is a lot of text about "searching" for sda1, sda2 etc. under the white Ubuntu logo for a few seconds. But system boots fast - in 30 seconds.

Revision history for this message
woolysheep (woolysheep) wrote :

I've got a similar issue since Ubuntu Karmic (tried Daily build/beta). I am using a VIA VT8237R/VT8237R Plus chipset which supports a dual channel native SATA controller up to 1.5Gb/s with RAID 0 or RAID 1. I am running "fake"raid 1 at the moment.

Waiting longer or setting a longer boot delay doesn't work for me, nor does using options like all_ide_general (forgot the order now).

Installed Ubuntu 9.04 and it works perfectly again, will stick with this version for the moment. (Also because I have problems with Grub 2 installing on my dmraid.)

tags: added: iso-testing
Revision history for this message
ginalfa (ginalfa) wrote :

I am also experiencing a similar problem on my Asrock P4i65GV (intel ICH5) chipset.
I know that in intrepid and jaunty it was necessary to blacklist the intel_agp module to make that mobo work.
With karmic the only solution to make the live CD bootable (and installable) is to expand the initramfs and squashfs filesystems, manually modify the /etc/modprobe.d/blacklist.conf files in both images (by adding intel_agp), and recreate the filesystems and the iso.
My lspci output is attached here.

Revision history for this message
Scott Kitterman (kitterman) wrote : Re: [Bug 290153] Re: Fails to find boot device in Intel D945Gnt

>I am also experiencing a similar problem on my Asrock P4i65GV (intel ICH5) chipset.

The problem described in this comment is a different bug. Please file a new bug for this.

tags: added: intrepid jaunty karmic
removed: regression-potential
Revision history for this message
Scott Kitterman (kitterman) wrote :

Since it worked in Hardy, isn't it still regression potential? I have one
box on Hardy due to this bug.

Revision history for this message
Sergio Zanchetta (primes2h) wrote :

Thank you for reporting this bug to Ubuntu. Intrepid Ibex 8.10 reached EOL on 30 March 2010.
Please see this document for currently supported Ubuntu releases:
https://wiki.ubuntu.com/Releases

Please feel free to report any other bugs you may find.
Thank you.

Changed in linux (Ubuntu Intrepid):
status: Confirmed → Won't Fix
Revision history for this message
Sergio Zanchetta (primes2h) wrote :

I've just realized I made a mistake, Intrepid Ibex 8.10 "will reach" EOL on 30 "APRIL" 2010.

Sorry for this.

Anyway, I think that one month doesn't make any difference now.

Revision history for this message
Loïc Minier (lool) wrote :

Closing this oldish release notes task; I understand that the scope of the problem is smaller now, since the systems actually boot, albeit slowly; so I'm closing the task, but please reopen if necessary.

Changed in ubuntu-release-notes:
status: Confirmed → Invalid
tags: added: cherry-pick kernel-uncat
Revision history for this message
Forest (foresto) wrote :

This problem went away for me in the more recent Karmic kernels, but after upgrading to Lucid, it has resurfaced. I am once again intermittently getting dumped to a busybox prompt instead of booting.

3ware 8006-2LP RAID controller
Intel P35 chipset

Revision history for this message
Scott Kitterman (kitterman) wrote :

Different Chipset. Different bug.

Andy Whitcroft (apw)
Changed in linux (Ubuntu):
assignee: Andy Whitcroft (apw) → nobody
Changed in linux (Ubuntu Karmic):
assignee: Andy Whitcroft (apw) → nobody
Revision history for this message
Brad Figg (brad-figg) wrote :

Jaunty is no longer supported.

Changed in linux (Ubuntu Jaunty):
milestone: jaunty-updates → none
status: Confirmed → Won't Fix
Revision history for this message
Seth Forshee (sforshee) wrote :

Karmic is no longer supported. Please test to see whether or not this problem still exists in natty. Thanks!

Changed in linux (Ubuntu Karmic):
milestone: karmic-updates → none
status: Confirmed → Won't Fix
Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Revision history for this message
Brad Figg (brad-figg) wrote : Unsupported series, setting status to "Won't Fix".

This bug was filed against a series that is no longer supported and so is being marked as Won't Fix. If this issue still exists in a supported series, please file a new bug.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: Incomplete → Won't Fix