Unable to boot after kernel update (3.8.0-29).

Bug #1215269 reported by Tyler Johnson on 2013-08-22
This bug affects 2 people
Affects Status Importance Assigned to Milestone
linux (Ubuntu)
High
Unassigned

Bug Description

After installing the updated kernel (3.8.0-29), I was suddenly unable to boot. The screen freezes for several seconds, and then an error appears claiming that my installation partition cannot be found. However, the 3.8.0-27 kernel seems to work just fine.

I attempted an fsck, but no errors were found. I then removed the -29 kernel completely (sudo apt-get purge, followed by sudo update-grub) and reinstalled it, but ended up with the same result.
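For anyone repeating the purge and reinstall step, here is a minimal sketch that only prints the command sequence so it can be reviewed first. The function name is made up for illustration, and the "-generic" package flavour is an assumption; the real package name should be checked with dpkg -l 'linux-image-*' before running anything.

```shell
# Print (do not run) the purge/reinstall sequence for one kernel version.
# Assumption: the "-generic" flavour; verify with: dpkg -l 'linux-image-*'
kernel_reinstall_plan() {
    version="$1"                           # for example 3.8.0-29
    pkg="linux-image-${version}-generic"   # assumed package name
    printf '%s\n' \
        "sudo apt-get purge ${pkg}" \
        "sudo update-grub" \
        "sudo apt-get install ${pkg}" \
        "sudo update-grub"
}

kernel_reinstall_plan 3.8.0-29
```

Printing the plan instead of executing it keeps the sketch harmless on a machine that still boots from the older kernel.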

Now for some system details: I'm using Ubuntu 13.04 x86_64, with all the latest stable updates installed. My machine's hard drive is controlled by an Intel ICH9R/DO/DH SATA AHCI Controller chipset, currently set to AHCI mode (as opposed to IDE or RAID mode). Ubuntu is stored on an extended partition, which is formatted with EXT4 (/dev/sda6). My first three partitions are Windows installs, each formatted with NTFS. Also, I have /dev/sda7 as my swap partition.

If there are any more details I can provide, please don't hesitate to ask.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1215269

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
description: updated
Changed in linux (Ubuntu):
importance: Undecided → High
tags: added: performing-bisect raring
Joseph Salisbury (jsalisbury) wrote :

I'd like to perform a bisect to figure out what commit caused this regression. We need to identify the earliest kernel where the issue started happening as well as the latest kernel that did not have this issue.

Can you test the following kernels and post back? We are looking for the first kernel version that exhibits this bug:

v3.8.13.4: http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.8.13.4-raring/
v3.8.13.5: http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.8.13.5-raring/
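
The bisect described above amounts to a binary search over an ordered list of kernel builds. A minimal sketch, assuming the tester keeps a plain list (oldest first) whose first entry is known good and whose last entry is known bad. The function name and the labels in the usage comment are placeholders, not real kernel versions.

```shell
# Pick the next build to test when bisecting.
# stdin: builds, one per line, oldest first; the first line is known good,
# the last line is known bad. Prints the middle entry, or fails when
# nothing is left between the two ends.
next_bisect_candidate() {
    awk '{ v[NR] = $0 }
         END { if (NR < 3) exit 1; print v[int((NR + 1) / 2)] }'
}

# Usage (placeholder labels):
#   printf '%s\n' k1 k2 k3 k4 k5 | next_bisect_candidate
```

After each test the list is cut down to the half that still has a good build at one end and a bad build at the other, and the function is run again.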

Thanks in advance!

So...i've had a similar problem after updating from 3.8.0-27 to 3.8.0-29.

while booting i got a "general error mounting filesystems" with a prompt to enter the root pw for maintenance, but the prompt was frozen, so i couldn't enter the root pw or do anything else.
in contrast to your problem, the 3.8.0-27 version of the kernel didn't boot either after installing the 29 version.

i read that it could be possible that the update wasn't finished, so i tried booting a live-usb stick (/dev/sdb2 is my partition with linux):
sudo mount /dev/sdb2 /mnt
sudo chroot /mnt
sudo apt-get update
sudo apt-get upgrade
sudo apt-get dist-upgrade

didn't work.
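
A note on the chroot attempt: package scripts run inside a chroot usually expect /dev, /proc and /sys to be present, and the commands above do not bind them. Whether that is why the attempt failed here is not known. A minimal sketch that prints a fuller sequence for review; the function name is made up for illustration, and networking inside the chroot may need additional setup that is not shown.

```shell
# Print (do not run) a chroot preparation sequence for a live session.
# $1: root partition of the installed system, $2: mount point
chroot_plan() {
    dev="$1"
    mnt="$2"
    printf '%s\n' "sudo mount ${dev} ${mnt}"
    # Bind the virtual filesystems that package scripts commonly rely on.
    for fs in dev proc sys; do
        printf '%s\n' "sudo mount --bind /${fs} ${mnt}/${fs}"
    done
    printf '%s\n' "sudo chroot ${mnt}"
}

chroot_plan /dev/sdb2 /mnt
```

Inside the chroot the shell already runs as root, so the apt-get commands do not need sudo.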

so i tried recovery mode,
got into root console,
compared /etc/fstab with "blkid" output, everything was ok, the uuids were the right ones.
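
The fstab/blkid comparison can be done mechanically instead of by eye. A minimal sketch, assuming the blkid output has been saved to a file first; the function name and the file paths in the usage comment are illustrative only.

```shell
# List UUIDs that an fstab file references but blkid did not report.
# $1: path to the fstab file, $2: file holding saved `blkid` output
missing_fstab_uuids() {
    fstab="$1"
    blkid_out="$2"
    # First field of every entry that starts with UUID= (comments are skipped
    # because their first field starts with "#").
    awk '$1 ~ /^UUID=/ { sub(/^UUID=/, "", $1); print $1 }' "$fstab" |
    while read -r uuid; do
        # blkid prints ... UUID="<value>" ... for each block device
        grep -qF " UUID=\"${uuid}\"" "$blkid_out" || echo "$uuid"
    done
}

# Usage (as root, so that blkid lists every device):
#   blkid > /tmp/blkid.out
#   missing_fstab_uuids /etc/fstab /tmp/blkid.out
```

Empty output means every UUID in fstab matches a device that blkid can see, which is the result described above.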

calling dmesg, one of the latest messages said: mountall segfaulted with 4 (after that 2 addresses, ip / sp, both != null)
calling mountall manually also segfaulted,
called mountall with gdb and "run", errors in dl-reloc.c 239 and rtld.c (but i don't have the stacktrace now since the problem is gone, so i can't reproduce it, but more on that later)
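
Since the stack trace was lost, it may help to save the evidence while the problem is still reproducible. A minimal sketch that filters the kernel log for segfault records of one program; it assumes the usual "name[pid]: segfault at ..." message format, and the function name is made up for illustration.

```shell
# Print kernel log lines that record a segfault of one program.
# $1: program name (for example mountall); the log text is read from stdin.
segfault_lines() {
    grep -F 'segfault at' | grep -F "$1["
}

# Usage:
#   dmesg | segfault_lines mountall > /root/mountall-segfault.txt
# A backtrace can be captured non-interactively with gdb's batch mode:
#   gdb -batch -ex run -ex bt mountall > /root/mountall-bt.txt 2>&1
```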

got into root console again,
filesystem was readonly -> "mount -n -o remount,rw /"
(calling mountall still segfaults)
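
Whether the root filesystem is currently read-only can be checked before remounting. A minimal sketch that reads the mount table; it assumes the /proc/mounts layout (device, mount point, type, options) with ro or rw as the first option, and the function name is made up for illustration.

```shell
# Report whether a mount point is mounted "ro" or "rw".
# $1: mount point, $2: mount table to read (defaults to /proc/mounts)
mount_mode() {
    awk -v mp="$1" '
        $2 == mp { split($4, opts, ","); mode = opts[1] }  # last entry wins
        END      { if (mode != "") print mode; else exit 1 }
    ' "${2:-/proc/mounts}"
}

# Usage:
#   [ "$(mount_mode /)" = ro ] && mount -n -o remount,rw /
```

The last matching entry is used because the mount point can be listed more than once, and the most recent mount is the one that is visible.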
CTRL+D to get into recovery-menu
start networking
got into root console again
sudo apt-get update && sudo apt-get upgrade && sudo apt-get dist-upgrade
sudo apt-get clean
sudo apt-get autoclean
sudo apt-get --purge autoremove
sudo dpkg --configure -a
sudo dpkg --clear-avail
sudo apt-get -f install
sudo apt-get --fix-missing install

got no output so i thought nothing happened,

searched /var/log for hints about what could be wrong. found something like "IOMMU disabled, please enable it"
reboot, enable IOMMU,
suddenly the system boots, (logging in and starting firefox, and so on)
reboot, disable IOMMU, system boots...

currently i'm not sure what fixed it, why the system boots now or if the problem is really fixed.

dmesg (see attachment), looks like filesystem is readonly again..
could the problem be a problem with EXT4 ?

not sure if i should file a separate bug.

Joseph Salisbury (jsalisbury) wrote :

Also, the 3.8.0-30.43 kernel is now available in the -proposed repository. Would it be possible for you to test this latest kernel and post back if it resolves this bug?

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed.

Thank you in advance!

after reboot the filesystem was writeable again, not sure what caused the error listed in dmesg, thus it doesn't make much sense trying the other kernel, i think (if you meant me).

So, after a few days of further testing i'm convinced that the 3.8.0-29 version of the kernel still has some problems.

yesterday, i got the black screen with a blinking underscore again, but after rebooting i got a screen with
"/ has to be checked for errors", done, automatic reboot, login worked.

then again, a few minutes ago something went wrong again (as you can see in the attached dmesg2.out) and the filesystem was readonly again (well, fstab is configured to remount as ro when an error occurs, so that's no surprise then).
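
The remount-on-error behaviour mentioned above comes from the errors= mount option in fstab. A minimal sketch that prints the configured policy for one mount point; the function name is made up for illustration, and it only reads the fourth (options) field of the matching entry.

```shell
# Print the errors= policy configured for a mount point in an fstab file.
# $1: mount point, $2: fstab file to read (defaults to /etc/fstab)
fstab_errors_policy() {
    awk -v mp="$1" '
        $1 !~ /^#/ && $2 == mp {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] ~ /^errors=/) {
                    sub(/^errors=/, "", opts[i])
                    print opts[i]
                }
        }
    ' "${2:-/etc/fstab}"
}

# Usage:
#   fstab_errors_policy /
```

A value of remount-ro means the filesystem going read-only is a symptom of an error that the kernel detected, so the original error in dmesg is the thing to look at.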

i will now try using the 3.8.0-30.43 kernel as proposed by Joseph.

Luis Henriques (henrix) wrote :

Stefan, could you please test the kernel in -proposed as per comment #4? Thanks.

Well, i think at least when it comes to my problems, handling this bug is finished. looks like a hardware defect, which wasn't obvious.

i added the -proposed sources as described and when i wanted to install something with apt-get, i got an error while the installation was running. readonly filesystem. looks like the SSD had an error again and was remounted readonly, i tried making it writeable again
mount -n -o remount,rw /
error, bad filesystem, orphaned inodes, wat?
restart, readonly immediately, remount with rw, fails, same error message
try e2fsck /dev/sdb2
finds something, fix, fix, fix...restart
suddenly boot process hangs in ACPI / SMART check, SSD is not recognized anymore. disconnect SSD, works (at least until "no operating system on drives found")
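
When a drive is suspected, its SMART self-assessment is a quick first check while the drive is still recognized. A minimal sketch, assuming the smartmontools package is installed and that smartctl -H prints its usual "SMART overall-health self-assessment test result" line for an ATA device; the function name is made up for illustration.

```shell
# Extract the overall health verdict from `smartctl -H` output on stdin.
smart_verdict() {
    sed -n 's/^SMART overall-health self-assessment test result: *//p'
}

# Usage (as root; /dev/sdb is the SSD in this report):
#   smartctl -H /dev/sdb | smart_verdict
```

A passing verdict does not rule out a failing drive, so the full attribute list from smartctl -a is still worth attaching to a report.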

might have been a coincidence, maybe the kernel update caused it, but i doubt it. sorry guys for wasting your time (and at least 5 days searching myself).

Launchpad Janitor (janitor) wrote :

[Expired for linux (Ubuntu) because there has been no activity for 60 days.]

Changed in linux (Ubuntu):
status: Incomplete → Expired