Oneiric's x64 cloud image [20111124] won't boot (wrong kernel, buggy /etc/fstab, buggy /etc/network/interfaces)

Reported by Bruno França dos Reis on 2011-11-27
This bug affects 1 person
Affects: Ubuntu
Importance: Critical
Assigned to: Ben Howard

Bug Description

This report is against the latest Oneiric x64 cloud image (as of now, 2011-11-27, 02:01 GMT-2), available at http://cloud-images.ubuntu.com/oneiric/current/ (which is Ubuntu 11.10 (Oneiric Ocelot) Daily Build [20111124]).

I tried to launch an instance on my Eucalyptus cloud, but it wouldn't boot. I started digging and found several issues with this image.

#####

(1) Wrong Kernel
(a) Symptom.
The kernel said it couldn't find /dev/sda1: VFS: Cannot open root device "sda1" or unknown-block(0,0).

(b) Analysis.
Somehow I noticed that the kernel that came in the tarball was the "-generic" one, and not the "-virtual" one (as is the case with Natty, for example, which works fine). The file is called "oneiric-server-cloudimg-amd64-vmlinuz-generic". Chrooting into the image, `dpkg -l | grep linux` shows that linux-image-generic is installed, where I'd expect linux-image-virtual.

(c) Solution.
I removed the -generic kernel packages and installed the -virtual ones (`apt-get purge linux-image-generic linux-image-virtual+`), cleaned up /boot by removing the -generic images (`rm /boot/*-generic*`), and edited /boot/grub/grub.cfg by hand (replacing all instances of -generic with -virtual). [Note: I tried update-grub, but it wrote a completely different grub.cfg file, so I reverted it and edited grub.cfg by hand.]
Then I removed the kernel that came in the tarball (oneiric-server-cloudimg-amd64-vmlinuz-generic) and replaced it with the -virtual one from inside the image (`cp work/boot/vmlinuz-3.0.0-13-virtual oneiric-server-cloudimg-amd64-vmlinuz-virtual`).
The boot would now find /dev/sda1, as expected.
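
The package check in the analysis above can be automated. The following is a minimal sketch, not part of the original report: it assumes Python is available on the host and that the output of `dpkg -l` (run inside the chroot) has been captured as text. The function name `installed_kernel_flavours` is hypothetical.

```python
import re

# Matches kernel image package names such as linux-image-virtual or
# linux-image-3.0.0-13-generic and captures the flavour suffix.
KERNEL_PKG = re.compile(r"^linux-image-(?:[\w.]+-)*(generic|virtual)$")


def installed_kernel_flavours(dpkg_output):
    """Return the kernel flavours marked installed ('ii') in `dpkg -l` output."""
    flavours = set()
    for line in dpkg_output.splitlines():
        fields = line.split()
        # dpkg -l rows look like: <status> <package> <version> <description...>
        if len(fields) < 2 or fields[0] != "ii":
            continue
        # Drop an architecture qualifier (e.g. ":amd64") if one is present.
        package = fields[1].split(":", 1)[0]
        match = KERNEL_PKG.match(package)
        if match:
            flavours.add(match.group(1))
    return flavours
```

For a cloud image, one would expect the result to contain "virtual" and not "generic"; on the image described here the sketch would presumably report the opposite.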

(2) Broken /etc/fstab
(a) Symptom.
At the end of the virtual machine's output, I'd see "mount: mount point ext4 does not exist", and it wouldn't finish the boot sequence.

(b) Analysis.
The file /etc/fstab is malformed. The contents of the file are:
LABEL=cloudimg-rootfs ext4 defaults 0 1

That's all. It is obviously missing the mount point (/), so mount takes the second field, ext4, as the mount point, which explains the error message. Comparing it to Natty's cloud image fstab, it seems to be missing the /proc entry as well.

(c) Solution.
Replace the contents of the file with:
proc /proc proc nodev,noexec,nosuid 0 0
LABEL=cloudimg-rootfs / ext4 defaults 0 1
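
A check like the one below could catch this kind of malformed entry before an image is published. It is only a heuristic sketch under stated assumptions: it relies on the fstab(5) field order (device, mount point, type, options, then the optional dump and pass fields) and treats any mount point that is neither an absolute path nor "none"/"swap" as suspicious. The function name `suspicious_fstab_lines` is hypothetical.

```python
def suspicious_fstab_lines(fstab_text):
    """Return (line_number, line) pairs whose mount-point field looks wrong."""
    problems = []
    for number, raw in enumerate(fstab_text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments carry no entry
        fields = line.split()
        # fstab(5) order: device, mount point, type, options, [dump, pass]
        if len(fields) < 4:
            problems.append((number, line))
            continue
        mount_point = fields[1]
        # Mount points are absolute paths; swap entries use "none" or "swap".
        if not (mount_point.startswith("/") or mount_point in ("none", "swap")):
            problems.append((number, line))
    return problems
```

Run against the original one-line file, the sketch flags that line because its second field is ext4; run against the corrected two-line file, it reports nothing.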

(3) Broken /etc/network/interfaces
(b) Analysis.
The file /etc/network/interfaces does not configure the device eth0, only the loopback device. The original content of the file is:
# interfaces(5) file used by ifup(8) and ifdown(8)
auto lo
iface lo inet loopback

(c) Solution.
Add the following configuration for eth0 (copied from Natty), at the end of /etc/network/interfaces:
# The primary network interface
auto eth0
iface eth0 inet dhcp
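
The same idea applies to the network configuration. The sketch below is an illustration rather than anything from the original report: it scans the text of /etc/network/interfaces for `auto` (or its synonym `allow-auto`) and `iface` lines and lists the required interfaces that are not both brought up at boot and configured. The function names and the default `required=("eth0",)` are assumptions chosen to match this bug.

```python
def interface_summary(interfaces_text):
    """Return (auto, configured): names started at boot and names with an iface stanza."""
    auto, configured = set(), set()
    for raw in interfaces_text.splitlines():
        fields = raw.split()
        if not fields or fields[0].startswith("#"):
            continue  # skip blank lines and comment lines
        if fields[0] in ("auto", "allow-auto"):
            auto.update(fields[1:])
        elif fields[0] == "iface" and len(fields) >= 2:
            configured.add(fields[1])
    return auto, configured


def missing_at_boot(interfaces_text, required=("eth0",)):
    """List required interfaces lacking either an auto line or an iface stanza."""
    auto, configured = interface_summary(interfaces_text)
    return [name for name in required if name not in auto or name not in configured]
```

With the original file content, `missing_at_boot` returns a list containing eth0; after appending the stanza above, it returns an empty list.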

#####

After all these fixes, the instance finally managed to boot. Apparently it is working. However, there might be more bugs I couldn't detect (and that would not interfere with the boot sequence).

I urge you to fix this image, and to double check it.

Oh, and ***please make it easier to report bugs***. I lost more than an hour just finding this form (even after asking in #ubuntu-server and #launchpad). It's really frustrating when you want to help and the system makes it extremely difficult to do so.

Clint Byrum (clint-fewbar) wrote :

Confirming per IRC conversation seen in #ubuntu-server

Changed in ubuntu:
assignee: nobody → Canonical Server Team (canonical-server)
importance: Undecided → Medium
status: New → Confirmed
tags: added: regression-release

Bruno França dos Reis (bfreis) wrote :

As asked by SpamapS in #ubuntu-server, I quickly tested the current Precise cloud image (Ubuntu 12.04 (Precise Pangolin) Daily Build [20111126]).

It didn't work.

I didn't analyze it, but compared to what I've seen on Oneiric, the kernel is apparently OK (it manages to find /dev/sda1) and /etc/fstab is buggy (it stops at "mount: mount point ext4 does not exist").

Adam Gandelman (gandelman-a) wrote :

It appears that both the Oneiric and Precise images are not booting on EC2 either, perhaps because of the incorrect kernel packaged (they end up attempting to boot memtest instead).

Attaching console output from two m1.small instances.

Gustavo Niemeyer (niemeyer) wrote :

This feels like a Critical.

Scott Moser (smoser) on 2011-11-28
Changed in ubuntu:
importance: Medium → Critical
milestone: none → precise-alpha-1

Ben Howard (utlemming) wrote :

I deployed a new version of live-build, which had particular trouble with EBS volumes. The code was rolled back last night, 2011-11-27. I'm building new Oneiric images now.

Changed in ubuntu:
assignee: Canonical Server Team (canonical-server) → Ben Howard (utlemming)
status: Confirmed → In Progress

Ben Howard (utlemming) wrote :

New build of precise has been kicked off as well.

Ben Howard (utlemming) wrote :

New daily for Oneiric has been published:
http://cloud-images.ubuntu.com/oneiric/20111128/

Ben Howard (utlemming) wrote :

New Precise images have been published:
http://cloud-images.ubuntu.com/precise/20111128.1

Changed in ubuntu:
status: In Progress → Fix Released