ami-bbf514d2: Sometimes does not start booting (empty console output, no network)

Bug #398568 reported by Eric Hammond on 2009-07-12
This bug affects 1 person
Affects (importance, assigned to):
  Ubuntu on EC2: Medium, John Johansen
  Ubuntu on EC2 (Jaunty): Undecided, Unassigned
  linux (Ubuntu): Medium, John Johansen
  linux (Ubuntu Jaunty): Low, John Johansen

Bug Description

This report relates to
  ami-bbf514d2 soren-us-east-1-jaunty-20090711/ubuntu-ec2-jaunty-i386.img.manifest.xml
which is a temporary development/testing image, but Mark asked for issues with it to be reported here, so for the record...

Most instances I start with this image boot fine and I'm able to connect to them, but sometimes an instance started with this image fails to come up. The symptoms include:

1. The instance reports that it is in the "running" state.

2. No network access.

3. No console output (which might indicate that the kernel did not get very far into the boot process).
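
For anyone trying to spot affected instances automatically, the three symptoms above can be combined into a single check. This is only a minimal sketch: the InstanceObservation type and the looks_like_failed_boot function are hypothetical names made up for illustration, and it assumes you have already collected the reported state, the console output text, and the result of a network probe (for example an SSH connection attempt) by whatever means you normally use.

```python
from dataclasses import dataclass


@dataclass
class InstanceObservation:
    """Hypothetical container for what was observed about one instance."""
    state: str               # state reported by EC2, e.g. "running"
    console_output: str      # console output text, "" if nothing was returned
    network_reachable: bool  # result of a separate connectivity probe


def looks_like_failed_boot(obs: InstanceObservation) -> bool:
    """Return True when all three symptoms from this report are present."""
    reports_running = obs.state == "running"
    console_is_empty = not obs.console_output.strip()
    return reports_running and console_is_empty and not obs.network_reachable
```

Note that console output can lag behind a normal boot by some time, so a check like this is probably only meaningful once the instance has been in the "running" state for a while.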

Since it might be related to the problem, the image uses kernel+ramdisk:
  aki-21f01148 canonical-cloud-us/vmlinuz-2.6.28-12-xen-i386.manifest.xml
  ari-3bf01152 canonical-cloud-us/initrd.img-2.6.28-12-xen-i386.manifest.xml

In one of my instances with the problem, the availability zone is my "us-east-1a" which I know is not the new fourth availability zone recently added by Amazon because I have been running instances continually in it for over a year. There is another availability zone (possibly Amazon's new one) which experiences the problem much more predictably.

Eric Hammond (esh) wrote :

This problem seems to be related to the kernel (since the instance hasn't even started writing to the console output). Also, other AMIs used with this kernel+ramdisk have similar issues:
  http://developer.amazonwebservices.com/connect/thread.jspa?threadID=34050

Scott Moser (smoser) on 2009-08-12
Changed in ubuntu-on-ec2:
status: New → Confirmed
Rick Clark (dendrobates) on 2009-08-26
Changed in ubuntu-on-ec2:
assignee: nobody → Scott Moser (smoser)
importance: Undecided → Critical
Matt Zimmerman (mdz) on 2009-08-31
Changed in ubuntu-on-ec2:
assignee: Scott Moser (smoser) → John Johansen (jjohansen)
John Johansen (jjohansen) wrote :

This problem occurs when using a pv-ops kernel, which is the default kernel used by ami-bbf514d2. When ami-bbf514d2 is booted using a kernel built with the xen patchset, it works in all availability zones.

This has now been tested with the intrepid + xen kernel (used with ami-064aad6f), a vanilla 2.6.28 + xen patchset, and the new Jaunty kernel + xen patchset.

John Johansen (jjohansen) wrote :

Fixing the symptom reported in this bug is blocked on obtaining further information from Amazon.

John Johansen (jjohansen) wrote :

Amazon indicates that XEN 3.02 compatibility is required and that the closest test environment to EC2 is CentOS 5.0.

The Jaunty kernel + xen patch set has been tested with the Karmic test AMI. The Jaunty kernel + xen patch set has also been built under the Karmic dev environment to verify that the Karmic dev environment is not causing errors.

The xen patch set has been forward ported to 2.6.29 as an iterative step to 2.6.31.

John Johansen (jjohansen) wrote :

We now have a vanilla 2.6.31 + xen patch kernel build, but it fails in CentOS 5.0 with the following message from xm create:

Error: (2 'Invalid Kernel', 'elf_xen_addr_calc_check: ERROR: ELF start or entries are out of bounds.\n')
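
As a rough illustration of what this kind of message means: the domain builder refuses a kernel image whose declared start or entry addresses do not fall inside the virtual address range the image claims to occupy. The sketch below is an assumption about the general shape of such a check, not Xen's actual source, and the parameter names are hypothetical.

```python
def elf_addresses_in_bounds(virt_base: int, virt_kstart: int,
                            virt_kend: int, virt_entry: int) -> bool:
    """Sanity-check kernel image addresses (illustrative sketch only).

    virt_base   -- base virtual address the image is linked against
    virt_kstart -- first virtual address occupied by the kernel image
    virt_kend   -- last virtual address occupied by the kernel image
    virt_entry  -- address of the kernel entry point
    """
    if virt_kstart > virt_kend:
        return False  # image range is inverted
    if virt_base > virt_kstart:
        return False  # image starts below its declared base
    # The entry point must lie within the image itself.
    return virt_kstart <= virt_entry <= virt_kend
```

If the real check is similar, the error would point to a mismatch between the addresses the 2.6.31 build declares and what the older hypervisor tools expect, rather than to a corrupt image.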

Scott Moser (smoser) on 2009-09-08
tags: added: ec2-images uec-images
Scott Moser (smoser) wrote :

marking ubuntu-on-ec2 project tasks as invalid.

Changed in linux (Ubuntu):
assignee: nobody → John Johansen (jjohansen)
importance: Undecided → Critical
status: New → Confirmed
Changed in ubuntu-on-ec2:
status: Confirmed → Invalid
Changed in ubuntu-on-ec2:
importance: Critical → Medium
Changed in linux (Ubuntu):
importance: Critical → Medium
John Johansen (jjohansen) wrote :

Importance lowered as there is a functional karmic kernel with the Xen patchset (LP Bug #418130) that can be used. It is still desirable to fix this bug, as using a pv-ops kernel is preferable in the long run.

This bug is still blocked on obtaining further information from Amazon.

Scott Moser (smoser) wrote :

I'm marking the development task of linux (Ubuntu) as 'Fix Released' based on fixes in 418130. The karmic kernel doesn't have this issue as far as we know.

I've also done 'Nominate for Release' for Jaunty to indicate that the bug affects that kernel/ami and nowhere else.

Changed in linux (Ubuntu):
status: Confirmed → Fix Released
Changed in linux (Ubuntu Jaunty):
status: New → Confirmed
Scott Moser (smoser) wrote :

Marking this low priority. The issue is a problem, but fixing it in jaunty is low on the overall list of priorities.

Changed in linux (Ubuntu Jaunty):
assignee: nobody → John Johansen (jjohansen)
importance: Undecided → Low
Scott Moser (smoser) on 2009-09-15
tags: removed: uec-images

This bug was nominated against a series that is no longer supported, i.e. jaunty. The bug task representing the jaunty nomination is being closed as Won't Fix.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu Jaunty):
status: Confirmed → Won't Fix