unable to boot

Bug #38688 reported by Alejandro Cornejo
Affects: linux (Ubuntu)
Status: Invalid
Importance: Medium
Assigned to: Unassigned
Milestone: (none)

Bug Description

It fails to boot. After mounting the filesystems, it displays a message saying something like:

waiting for root filesystem...

It hangs there indefinitely. This did not happen with kernel 2.6.15-19. I don't know if it is relevant, but my HD is SATA and my partition is formatted as ext3.

I have a single hard disk installed, which has four partitions. Here is the information reported by fdisk:

Disk /dev/sda: 160.0 GB, 160000000000 bytes
255 heads, 63 sectors/track, 19452 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1               1           5       40131   de  Dell Utility
/dev/sda2   *           6        1280    10241437+   7  HPFS/NTFS
/dev/sda3            1281        8929    61440592+  83  Linux
/dev/sda4            8930       19452    84525997+   b  W95 FAT32

Dennis Kaarsemaker (dennis) wrote :

May be a udev bug; subscribing Scott.

Scott James Remnant (Canonical) (canonical-scott) wrote :

It shouldn't hang indefinitely; wait three minutes and it should fail with "ALERT! Unable to mount root filesystem"

At that point, run "ls /dev/sd*" and see what you have present.

Changed in linux-source-2.6.15:
status: Unconfirmed → Needs Info
Alejandro Cornejo (acornejo) wrote :

That is exactly what happened; I had never bothered waiting that long.

The alert message is different though:

ALERT! /dev/sda3 does not exist. Dropping to a shell!

If I do ls /dev/sd* I get this:

/dev/sdd /dev/sdc /dev/sdb /dev/sda

So it is right, there is no /dev/sda3. This is weird; it does seem to be udev related.

What should I do now?

Scott James Remnant (Canonical) (canonical-scott) wrote :

Ok, interesting. If you run "modprobe sd_mod" does /dev/sda3 appear?

Could you provide the output of "cat /sys/class/scsi_device/*/type"

Alejandro Cornejo (acornejo) wrote :

OK, after running modprobe sd_mod, nothing happens; /dev/sda3 is still missing.

I tried running the command you suggested literally, but it didn't work, so I typed it manually for each directory in /sys/class/scsi_device.

Here are the type and model entries (I thought model could be useful too).

0:0:0:0/model DVD-ROM TS-H352C
0:0:0:0/type 5
0:0:1:0/model DVD+-RW
0:0:1:0/type 5
1:0:0:0/model 223 U HS-CF
1:0:0:0/type 0
1:0:0:1/model 223 U HS-MS
1:0:0:1/type 0
1:0:0:2/model 223 U HS-SM
1:0:0:2/type 0
1:0:0:3/model 223 U HS-SD/MMC
1:0:0:3/type 0

So in case you are wondering, the first two entries are obviously my DVD and DVD-RW drives, and the last four are from my USB multi-card reader.

I also did a cat /sys/block/sd{a,b,c,d}/device/model

here is the result:

HS-CF
HS-MS
HS-SM
HS-SD/MMC

So it is using my USB multi-card reader as sd{a,b,c,d}.

If I boot with kernel 2.6.15-19 I get the following with cat /sys/block/sd{a,b,c,d,e}/device/model:

ST3160828AS
HS-CF
HS-MS
HS-SM
HS-SD/MMC

So my hard disk is first.

So, what is next? I tried to gather more info about the bug, but the prompt is really limited; even commands like lsmod/rmmod/fdisk are missing.
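For anyone retracing this step from the limited initramfs prompt, the manual walk through each directory in /sys/class/scsi_device can be scripted. This is only a sketch: the function name list_scsi_devices and its optional directory argument are made up here, and it assumes the type and model attributes sit either directly in each entry or one level down under device/ (which may be why the literal glob did not match, though the thread does not say). A type of 0 is a direct-access device such as a disk, and 5 is a CD/DVD drive.

```shell
#!/bin/sh
# Print "<address> type=<type> model=<model>" for every SCSI device entry.
# The optional argument (default /sys/class/scsi_device) is a hypothetical
# convenience so the function can be pointed at a test directory.
list_scsi_devices() {
    base=${1:-/sys/class/scsi_device}
    for dev in "$base"/*; do
        # If the glob matched nothing, $dev is the literal pattern; skip it.
        [ -d "$dev" ] || continue
        # Assumption: attributes are under device/ when that path exists,
        # otherwise directly inside the entry.
        attr_dir=$dev
        if [ -r "$dev/device/type" ]; then
            attr_dir=$dev/device
        fi
        type=$(cat "$attr_dir/type" 2>/dev/null)
        model=$(cat "$attr_dir/model" 2>/dev/null)
        echo "${dev##*/} type=$type model=$model"
    done
}
```

Running list_scsi_devices with no argument reads the live /sys tree. If no entry with type=0 corresponds to the hard disk, the kernel has not registered it at all, which is what the listing above shows.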

Scott James Remnant (Canonical) (canonical-scott) wrote :

Ok, so it hasn't detected your hard drive at all!

That's interesting.

Can you boot with -19, run "udevinfo -a -p /block/sda", and provide the output?

Alejandro Cornejo (acornejo) wrote :

ok, here it is:

device '/sys/block/sda' has major:minor 8:0
  looking at class device '/sys/block/sda':
    KERNEL=="sda"
    SUBSYSTEM=="block"
    SYSFS{dev}=="8:0"
    SYSFS{range}=="16"
    SYSFS{removable}=="0"
    SYSFS{size}=="312500000"
    SYSFS{stat}==" 13502 108744 673213 92592 8283 38084 370430 414383 0 62733 506974"

follow the "device"-link to the physical device:
  looking at the device chain at '/sys/devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0':
    BUS=="scsi"
    ID=="0:0:0:0"
    DRIVER=="sd"
    SYSFS{device_blocked}=="0"
    SYSFS{iocounterbits}=="32"
    SYSFS{iodone_cnt}=="0x552b"
    SYSFS{ioerr_cnt}=="0x0"
    SYSFS{iorequest_cnt}=="0x552b"
    SYSFS{model}=="ST3160828AS "
    SYSFS{queue_depth}=="1"
    SYSFS{queue_type}=="none"
    SYSFS{rev}=="8.04"
    SYSFS{scsi_level}=="6"
    SYSFS{state}=="running"
    SYSFS{timeout}=="30"
    SYSFS{type}=="0"
    SYSFS{vendor}=="ATA "

  looking at the device chain at '/sys/devices/pci0000:00/0000:00:1f.2/host0/target0:0:0':
    BUS==""
    ID=="target0:0:0"
    DRIVER=="unknown"

  looking at the device chain at '/sys/devices/pci0000:00/0000:00:1f.2/host0':
    BUS==""
    ID=="host0"
    DRIVER=="unknown"

  looking at the device chain at '/sys/devices/pci0000:00/0000:00:1f.2':
    BUS=="pci"
    ID=="0000:00:1f.2"
    DRIVER=="ata_piix"
    SYSFS{class}=="0x010180"
    SYSFS{device}=="0x27c0"
    SYSFS{irq}=="225"
    SYSFS{local_cpus}=="03"
    SYSFS{modalias}=="pci:v00008086d000027C0sv00001028sd000001D1bc01sc01i80"
    SYSFS{subsystem_device}=="0x01d1"
    SYSFS{subsystem_vendor}=="0x1028"
    SYSFS{vendor}=="0x8086"

  looking at the device chain at '/sys/devices/pci0000:00':
    BUS==""
    ID=="pci0000:00"
    DRIVER=="unknown"

Hope it helps. I will reboot right now, do a modprobe of ata_piix and libata, and see if that helps.

Alejandro Cornejo (acornejo) wrote :

So, in case you are wondering: after doing modprobe ata_piix, modprobe libata, modprobe sd_mod, etc., nothing changed. I also killed udevd and restarted it, as well as executing udevplug.

I am really just *#!@#$ around, as I am not familiar with the inner workings of the kernel init process or udev.

So.. what is next? (again)

Scott James Remnant (Canonical) (canonical-scott) wrote :

Nope, that was a good guess :)

Could you try rebooting into the broken system again, and do "grep ata_piix /proc/modules" to see whether it's loaded; if not, could you check whether you have anything like /dev/hd* -- your SATA controller may have regressed to the dark ages.

Also, can you do ls /sys/block and give the list of what's in there (ignore the ram and dm stuff)?

Alejandro Cornejo (acornejo) wrote :

OK, sit tight (don't go to sleep); I'll be back in 3+fraction minutes to report on what happened.

Alejandro Cornejo (acornejo) wrote :

The module ata_piix was being loaded (nice trick to use cat instead of lsmod :) )

Here is the info, although I'm pretty sure it's useless:

ata_piix 12548 0 - Live 0xffffffff880ab000

I had no /dev/hd* entries; the closest thing is /dev/hpet, which I'm pretty sure has nothing to do with hard drives.

Also, aside from ram[0-15] and sd{a,b,c,d}, there is nothing in /sys/block, not even the dm-[0-3] you mentioned.

Waiting for instructions... (is there any hack to reduce that 3 minute timeout?)

Scott James Remnant (Canonical) (canonical-scott) wrote :

Easiest hack is to put "break=mount" on the kernel command-line.

I don't suppose you have a digital camera (or one in a phone) handy? If so, could you boot without quiet or splash on the command-line and with break=mount.

(You can edit the command line from the GRUB boot menu if you've never done that before)

You'll get a screen or more full of messages; use Shift+PgUp to scroll back, try to take photos of them all, and attach them to this bug.
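As an illustration of the command-line change described above (drop quiet and splash, add break=mount), here is a small sketch that rewrites a command-line string. The function name debug_cmdline is made up for this example, and the sample input in the usage note is illustrative rather than taken from the reporter's GRUB entry. It only transforms text; the result still has to be typed into the GRUB editor by hand.

```shell
#!/bin/sh
# Rewrite a kernel command line for debugging: remove "quiet" and "splash",
# and append "break=mount" unless it is already present.
# Note: $1 is split on whitespace unquoted, so glob characters in the
# input would be expanded; kernel command lines rarely contain any.
debug_cmdline() {
    out=""
    have_break=no
    for word in $1; do
        case $word in
            quiet|splash) continue ;;        # drop these so messages are shown
            break=mount)  have_break=yes ;;  # already there, keep a single copy
        esac
        out="${out:+$out }$word"
    done
    if [ "$have_break" = no ]; then
        out="${out:+$out }break=mount"
    fi
    echo "$out"
}
```

For example, debug_cmdline "root=/dev/sda3 ro quiet splash" prints root=/dev/sda3 ro break=mount.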

Scott James Remnant (Canonical) (canonical-scott) wrote :

Ben: This is clearly a kernel bug ... ata_piix is loaded (same as old kernel) but their drive just flat-out isn't appearing (not even in /sys/block)

Changed in linux-source-2.6.15:
assignee: nobody → kernel-team
Alejandro Cornejo (acornejo) wrote :

Is there a less archaic way of doing this? I have a USB drive, some memory sticks, a laptop?

I followed your instructions anyway. I don't know how to add attachments here, so I uploaded the photos to the web. Here is the URL:

http://www.mcc.unam.mx/~acornejoc/public/temp/

Scott James Remnant (Canonical) (canonical-scott) wrote :

Unfortunately at the point you are, there's not much of a system available. Can you keep those photos online until this bug is resolved?

Alejandro Cornejo (acornejo) wrote :

OK, so I think the key is in this pic:

http://www.mcc.unam.mx/~acornejoc/public/temp/DSCN1438.JPG

The output differs from the previous kernel. Here is the old output (using dmesg with 15-19):

[ 17.791201] ata1: SATA max UDMA/133 cmd 0x1F0 ctl 0x3F6 bmdma 0xFFA0 irq 14
[ 17.945593] ata1: dev 0 cfg 49:2f00 82:346b 83:7701 84:4003 85:3469 86:3401 87:4003 88:207f
[ 17.945598] ata1: dev 0 ATA-7, max UDMA/133, 312500000 sectors: LBA48
[ 17.945831] ata1: dev 0 configured for UDMA/133
[ 17.945834] scsi0 : ata_piix
[ 17.946085] Vendor: ATA Model: ST3160828AS Rev: 8.04
[ 17.946100] Type: Direct-Access ANSI SCSI revision: 05
[ 17.946347] ata2: SATA max UDMA/133 cmd 0x170 ctl 0x376 bmdma 0xFFA8 irq 15
[ 18.267525] ata2: dev 0 cfg 49:0f00 82:0210 83:4011 84:4000 85:0000 86:0001 87:4000 88:0407
[ 18.267528] ata2: dev 0 ATAPI, max UDMA/33
[ 18.267530] ata2(0): applying bridge limits
[ 18.421481] ata2: dev 1 cfg 49:0b00 82:0000 83:0000 84:0000 85:0000 86:0000 87:0000 88:0407
[ 18.421484] ata2: dev 1 ATAPI, max UDMA/33
[ 18.421486] ata2(0): applying bridge limits
[ 18.421671] ata2: dev 0 configured for UDMA/33
[ 18.421865] ata2: dev 1 configured for UDMA/33
[ 18.421868] scsi1 : ata_piix
[ 18.422492] Vendor: TSSTcorp Model: DVD-ROM TS-H352C Rev: DE02
[ 18.422507] Type: CD-ROM ANSI SCSI revision: 05
[ 18.423974] Vendor: _NEC Model: DVD+-RW ND-3530A Rev: 103C
[ 18.423989] Type: CD-ROM ANSI SCSI revision: 05

So when using 15-19 I get ata1=hard drive, ata2=dvd+dvdrw.

In contrast, in 15-20 I get ata1=0x1f0 IDE port busy, and after that, ata1=dvd+dvdrw, and I get no hard drive.

Scott James Remnant (Canonical) (canonical-scott) wrote :

1438 shows ata_piix loading, and shows the error "0x1f0 IDE port busy"
1441 shows ide-generic loading, and the (expected) errors "I/O resource 0x1F0-0x1F7 not free."

So it looks like ata_piix is refusing to claim the port your hard drive is on.

Changed in linux-source-2.6.15:
status: Needs Info → Confirmed
Alejandro Cornejo (acornejo) wrote :

Yes, the pics will stay online. Good thing you are night owls too. Any further suggestions?

Alejandro Cornejo (acornejo) wrote :

(forum admins): Damn, every time I reload this shit it double posts; you should use phpBB.
(scott+ben): Where can I get the changelog of what happened from 15-19 to 15-20? I tried www.kernel.org, but then I realized the modifications were probably made by the Ubuntu team and have nothing to do with the kernel project.

Scott James Remnant (Canonical) (canonical-scott) wrote :

I don't have any particular suggestions at this point; there's clearly Strange Things Afoot at the Circle K(ernel)

The current changelog is at:

https://launchpad.net/distros/ubuntu/dapper/+source/linux-source-2.6.15/+changelog

Ben Collins (ben-collins) wrote :

Can you send the output of "cat /proc/ioports"?
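Since the failing kernel reports port 0x1f0 as busy, the interesting part of /proc/ioports is whichever entries cover that port. The sketch below filters for them; the function name port_owners and the optional file argument are made up for this example, and it assumes the usual "start-end : name" layout with hexadecimal ranges and leading spaces for nested entries.

```shell
#!/bin/sh
# Print every line of /proc/ioports whose range contains the given port.
# $1: port number (hex such as 0x1f0 is accepted by shell arithmetic)
# $2: file to read (hypothetical convenience; defaults to /proc/ioports)
port_owners() {
    port=$(( $1 ))
    file=${2:-/proc/ioports}
    while IFS= read -r line; do
        range=${line%% :*}                 # text before " : "
        range=${range#"${range%%[! ]*}"}   # strip leading indentation
        start=${range%-*}
        end=${range#*-}
        # Skip anything that is not a pair of plain hex numbers.
        case $start$end in
            ""|*[!0-9a-fA-F]*) continue ;;
        esac
        if [ "$port" -ge $((0x$start)) ] && [ "$port" -le $((0x$end)) ]; then
            echo "$line"
        fi
    done < "$file"
}
```

Running port_owners 0x1f0 on both kernels and comparing the names printed shows which driver holds the range in each case. Nested entries are printed along with their parents, so the indentation shows who claimed the range inside whom.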

Alejandro Cornejo (acornejo) wrote :

Ok, here it is:

http://www.mcc.unam.mx/~acornejoc/public/temp/DSCN1447.JPG
http://www.mcc.unam.mx/~acornejoc/public/temp/DSCN1450.JPG

If you could tell me how to get the vfat module to load (it does not appear in /lib/modules/version/kernel/fs), I will be able to mount my USB stick and stop taking pictures, which is a really slow process because I have to preprocess them in GIMP to make them viewable/usable.

Alejandro Cornejo (acornejo) wrote :

OK, so after some hacking (there should be a manual somewhere that shows how to modify Ubuntu's initrd, which was tricky) I managed to use my thumb drive at the prompt.

I uploaded two files:

http://www.mcc.unam.mx/~acornejoc/public/temp/ioports-new
which is the 15-20 kernel version of /proc/ioports

and

http://www.mcc.unam.mx/~acornejoc/public/temp/ioports-old
which is the 15-19 kernel version of /proc/ioports

There are few differences; the only thing worth mentioning seems to be pnp. It seems that libata is loaded inside the pnp address space in the new version, so maybe that has something to do with this. I will attempt to boot with isapnp=0; I think I read somewhere that it deactivates pnp. Waiting for further instructions...

Alejandro Cornejo (acornejo) wrote :

I don't know if anyone is still reading this, but I'll keep on writing anyhow. I made a small C program that uses klogctl to dump the kernel messages to a file on my thumb drive. I tried a naive cp of /proc/kmsg to the thumb drive, but that did not work. I've uploaded this file to:

http://www.mcc.unam.mx/~acornejoc/public/temp/kmsg

So now all the bug related files are simple text files, no need to go through the images.

Alejandro Cornejo (acornejo) wrote :

Well, still no response from the developers. However, for anyone experiencing this bug, here is a quick & dirty fix:

The problem is that the pnpacpi module is claiming more addresses than it should, or it's not properly sharing this address space with libata/ata_piix.

If you append the option pnpacpi=off to your kernel command line, the problem should disappear. It's not much of a solution if you ask me, but it works, and it seems the kernel guys are too busy with other bugs to fix this right now. So use it as a palliative until this bug is properly fixed.

Good night, and good luck ;)
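After rebooting with the workaround, one way to confirm that pnpacpi=off actually reached the kernel is to look for it in /proc/cmdline. A minimal sketch follows; the function name has_kernel_param and its optional file argument are made up for this example.

```shell
#!/bin/sh
# Succeed (exit status 0) if the exact parameter $1 appears as a
# whitespace-separated word in the kernel command line.
# $2: file to read (hypothetical convenience; defaults to /proc/cmdline)
has_kernel_param() {
    for word in $(cat "${2:-/proc/cmdline}"); do
        if [ "$word" = "$1" ]; then
            return 0
        fi
    done
    return 1
}
```

Typical use would be: has_kernel_param pnpacpi=off && echo "workaround active". The match is on the whole word, so pnpacpi=off is not confused with a longer option that merely contains it.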

Ben Collins (ben-collins) wrote :

The "kernel developers" are just really busy, since it's close to a release. Believe me, I haven't lost sight of this bug.

Dennis Kaarsemaker (dennis) wrote : Re: [Bug 38688] Re: unable to boot

> The "kernel developers" are just really busy

You really are a master in understatements. I'm not really being useful here, just wanted to say: keep up the good work!

Alejandro Cornejo (acornejo) wrote :

Sorry if I offended anyone with my last post; I didn't mean to. I never implied that anyone lost track of this bug. I just implied what my post said, rephrasing: "They are probably too busy to fix this right now, because when people are busy they have to prioritize work, not because they think this bug is not important, but because human parallelism is not possible."

Anyhow, I'm obviously not a kernel guru, but I AM an experienced Linux user/developer who is willing to help in any way possible, so if there is anything I can do to help you out a little bit, please let me know (and it is a very REAL offer, I'm not joking).

Changed in linux-source-2.6.15:
assignee: kernel-team → ubuntu-kernel-team
Launchpad Janitor (janitor) wrote : This bug is now reported against the 'linux' package

Beginning with the Hardy Heron 8.04 development cycle, all open Ubuntu kernel bugs need to be reported against the "linux" kernel package. We are automatically migrating this linux-source-2.6.15 kernel bug to the new "linux" package. We appreciate your patience and understanding as we make this transition. Also, if you would be interested in testing the upcoming Intrepid Ibex 8.10 release, it is available at http://www.ubuntu.com/testing . Please let us know your results. Thanks!

Leann Ogasawara (leannogasawara) wrote :

The Ubuntu Kernel Team is planning to move to the 2.6.27 kernel for the upcoming Intrepid Ibex 8.10 release. As a result, the kernel team would appreciate it if you could please test this newer 2.6.27 Ubuntu kernel. There are two ways you should be able to test:

1) If you are comfortable installing packages on your own, the linux-image-2.6.27-* package is currently available for you to install and test.

--or--

2) The upcoming Alpha5 for Intrepid Ibex 8.10 will contain this newer 2.6.27 Ubuntu kernel. Alpha5 is set to be released Thursday Sept 4. Please watch http://www.ubuntu.com/testing for Alpha5 to be announced. You should then be able to test via a LiveCD.

Please let us know immediately if this newer 2.6.27 kernel resolves the bug reported here or if the issue remains. More importantly, please open a new bug report for each new bug/regression introduced by the 2.6.27 kernel and tag the bug report with 'linux-2.6.27'. Also, please specifically note if the issue does or does not appear in the 2.6.26 kernel. Thanks again, we really appreciate your help and feedback.

Launchpad Janitor (janitor) wrote : Kernel team bugs

Per a decision made by the Ubuntu Kernel Team, bugs will no longer be assigned to the ubuntu-kernel-team in Launchpad as part of the bug triage process. The ubuntu-kernel-team is being unassigned from this bug report. Refer to https://wiki.ubuntu.com/KernelTeamBugPolicies for more information. Thanks.

kernel-janitor (kernel-janitor) wrote :

This bug report was marked as Confirmed a while ago but has not had any updated comments for quite some time. Please let us know if this issue remains in the current Ubuntu release, http://www.ubuntu.com/getubuntu/download . If the issue remains, click on the current status under the Status column and change the status back to "New". Thanks.

[This is an automated message. Apologies if it has reached you inappropriately; please just reply to this message indicating so.]

tags: added: kj-triage
Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Jeremy Foshee (jeremyfoshee) wrote :

This bug report was marked as Incomplete and has not had any updated comments for quite some time. As a result this bug is being closed. Please reopen if this is still an issue in the current Ubuntu release http://www.ubuntu.com/getubuntu/download . Also, please be sure to provide any requested information that may have been missing. To reopen the bug, click on the current status under the Status column and change the status back to "New". Thanks.

[This is an automated message. Apologies if it has reached you inappropriately; please just reply to this message indicating so.]

tags: added: kj-expired
Changed in linux (Ubuntu):
status: Incomplete → Invalid