ami-bbf514d2: Sometimes does not start booting (empty console output, no network)

Bug #398568 reported by Eric Hammond
This bug affects 1 person
Affects            Status         Importance   Assigned to      Milestone
Ubuntu on EC2      Invalid        Medium       John Johansen
  Jaunty           Invalid        Undecided    Unassigned
linux (Ubuntu)     Fix Released   Medium       John Johansen
  Jaunty           Won't Fix      Low          John Johansen

Bug Description

This report relates to
  ami-bbf514d2 soren-us-east-1-jaunty-20090711/ubuntu-ec2-jaunty-i386.img.manifest.xml
which is a temporary development/testing image, but Mark asked for issues with it to be reported here, so for the record...

Most instances I start with this image boot fine and I'm able to connect to them, but sometimes an instance started with this image fails to come up. The symptoms include:

1. Instance states it is in the "running" state

2. No network access.

3. No console output (which might indicate that the kernel did not get very far into the boot process).
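
The three symptoms above can be combined into a single check. The following is only a minimal sketch: the `InstanceObservation` type and its field names are hypothetical, and it assumes the state, console output, and network reachability have already been collected by whatever EC2 tooling is in use.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InstanceObservation:
    """What was observed about one instance (hypothetical, illustrative fields)."""
    state: str                      # state reported by EC2, e.g. "running"
    console_output: Optional[str]   # console output text, or None if nothing was returned
    network_reachable: bool         # True if a connection attempt (e.g. SSH) succeeded


def looks_like_failed_boot(obs: InstanceObservation) -> bool:
    """Return True only when all three symptoms from this report are present."""
    reports_running = obs.state == "running"
    # Treat None, "" and whitespace-only output as "no console output".
    no_console = not (obs.console_output or "").strip()
    return reports_running and no_console and not obs.network_reachable
```

An instance that is still "pending", or that has written anything to the console, is deliberately not flagged, since the report describes instances that claim to be running yet never begin to boot.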

Since it might be related to the problem, the image uses kernel+ramdisk:
  aki-21f01148 canonical-cloud-us/vmlinuz-2.6.28-12-xen-i386.manifest.xml
  ari-3bf01152 canonical-cloud-us/initrd.img-2.6.28-12-xen-i386.manifest.xml

In one of my instances with the problem, the availability zone is my "us-east-1a", which I know is not the new fourth availability zone recently added by Amazon because I have been running instances in it continuously for over a year. There is another availability zone (possibly Amazon's new one) which experiences the problem much more predictably.

Tags: ec2-images
Eric Hammond (esh) wrote :

This problem seems to be related to the kernel (since the instance hasn't even started writing to the console output). Also, other AMIs used with this kernel+ramdisk have similar issues:
  http://developer.amazonwebservices.com/connect/thread.jspa?threadID=34050

Scott Moser (smoser)
Changed in ubuntu-on-ec2:
status: New → Confirmed
Rick Clark (dendrobates)
Changed in ubuntu-on-ec2:
assignee: nobody → Scott Moser (smoser)
importance: Undecided → Critical
Matt Zimmerman (mdz)
Changed in ubuntu-on-ec2:
assignee: Scott Moser (smoser) → John Johansen (jjohansen)
John Johansen (jjohansen) wrote :

This problem occurs when using a pv-ops kernel, which is the default kernel used by ami-bbf514d2. When ami-bbf514d2 is booted using a kernel built with the xen patchset, it works in all availability zones.

This has now been tested with the intrepid + xen kernel (used with ami-064aad6f), a vanilla 2.6.28 + xen patchset, and the new Jaunty kernel + xen patchset.

John Johansen (jjohansen) wrote :

Fixing the symptom reported in this bug is blocked on obtaining further information from Amazon.

John Johansen (jjohansen) wrote :

Amazon indicates that XEN 3.02 compatibility is required and that the closest test environment to EC2 is CentOS 5.0.

The Jaunty kernel + xen patch set has been tested with the Karmic test AMI. The Jaunty kernel + xen patch set has also been built under the Karmic dev environment to verify that the Karmic dev environment is not causing errors.

The xen patch set has been forward ported to 2.6.29 as an intermediate step toward 2.6.31.

John Johansen (jjohansen) wrote :

We now have a vanilla 2.6.31 + xen patch kernel build, but it fails in CentOS 5.0 with the following message from xm create:

Error: (2 'Invalid Kernel', 'elf_xen_addr_calc_check: ERROR: ELF start or entries are out of bounds.\n')

Scott Moser (smoser)
tags: added: ec2-images uec-images
Scott Moser (smoser) wrote :

marking ubuntu-on-ec2 project tasks as invalid.

Changed in linux (Ubuntu):
assignee: nobody → John Johansen (jjohansen)
importance: Undecided → Critical
status: New → Confirmed
Changed in ubuntu-on-ec2:
status: Confirmed → Invalid
Changed in ubuntu-on-ec2:
importance: Critical → Medium
Changed in linux (Ubuntu):
importance: Critical → Medium
John Johansen (jjohansen) wrote :

Importance lowered as there is a functional karmic kernel with the Xen patchset (LP Bug #418130) that can be used. It is still worth fixing this bug, as using a pv-ops kernel is desirable in the long run.

This bug is still blocked on obtaining further information from Amazon.

Scott Moser (smoser) wrote :

I'm marking the development task of linux (Ubuntu) as 'Fix Released' based on fixes in bug 418130. The karmic kernel doesn't have this issue as far as we know.

I've also done 'Nominate for Release' for Jaunty to indicate that the bug affects that kernel/AMI and nowhere else.

Changed in linux (Ubuntu):
status: Confirmed → Fix Released
Changed in linux (Ubuntu Jaunty):
status: New → Confirmed
Scott Moser (smoser) wrote :

Marking this low priority. The issue is a problem, but fixing it in jaunty is low on the overall list of priorities.

Changed in linux (Ubuntu Jaunty):
assignee: nobody → John Johansen (jjohansen)
importance: Undecided → Low
Scott Moser (smoser)
tags: removed: uec-images
Leann Ogasawara (leannogasawara) wrote : Closing unsupported series nomination.

This bug was nominated against a series that is no longer supported, i.e. jaunty. The bug task representing the jaunty nomination is being closed as Won't Fix.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu Jaunty):
status: Confirmed → Won't Fix