Ubuntu 12.10 cloud images do not fully provision on Azure

Bug #1055686 reported by Ben Howard
This bug affects 2 people
Affects | Status | Importance | Assigned to | Milestone
grub2 (Ubuntu) | Fix Released | Critical | Colin Watson |
Quantal | Fix Released | Critical | Colin Watson |

Bug Description

Details are still forthcoming, but Ubuntu Cloud Images for 12.10 do not finish the provisioning state. Per the dmesg output, the images complete boot, but they neither respond to SSH nor report to the Azure fabric that they are role-ready.

Tags: bot-comment
Ubuntu Foundations Team Bug Bot (crichton) wrote :

Thank you for taking the time to report this bug and helping to make Ubuntu better. It seems that your bug report is not filed about a specific source package though, rather it is just filed against Ubuntu in general. It is important that bug reports be filed about source packages so that people interested in the package can find the bugs about it. You can find some hints about determining what package your bug might be about at https://wiki.ubuntu.com/Bugs/FindRightPackage. You might also ask for help in the #ubuntu-bugs irc channel on Freenode.

To change the source package that this bug is filed about visit https://bugs.launchpad.net/ubuntu/+bug/1055686/+editstatus and add the package name in the text box next to the word Package.

[This is an automated message. I apologize if it reached you inappropriately; please just reply to this message indicating so.]

tags: added: bot-comment
Ben Howard (darkmuggle-deactivatedaccount) wrote :

The issue is that Grub2.00 is not recognizing the target. The error message is:
"error: couldn't find suitable memory target."

grub-core/lib/relocator.c
  #ifdef GRUB_MACHINE_EFI
    grub_efi_mmap_iterate (hook, avoid_efi_boot_services);
  #elif defined (__powerpc__)
    (void) avoid_efi_boot_services;
    grub_machine_mmap_iterate (hook);
  #else
    (void) avoid_efi_boot_services;
    grub_mmap_iterate (hook);
  #endif
    if (!found)
      return grub_error (GRUB_ERR_BAD_OS, "couldn't find suitable memory target");
  }
  while (1)

Ben Howard (darkmuggle-deactivatedaccount) wrote :

Increasing the boot timeout did not fix the problem.

Ben Howard (darkmuggle-deactivatedaccount) wrote :

The analysis is that GRUB is attempting to allocate memory but cannot find a suitable memory location. This would explain why the initrd is not being loaded into RAM.
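
To illustrate the kind of failure described here, the following is a minimal sketch of a bounded search over a memory map. It is not GRUB's relocator code: the function name `find_target`, the `(start, end)` region representation, and the "highest fitting address" policy are all assumptions made for illustration only.

```python
from typing import List, Optional, Tuple

# Hypothetical representation: (start, end_exclusive) of a usable RAM range.
Region = Tuple[int, int]


def find_target(regions: List[Region], size: int,
                min_addr: int, max_addr: int) -> Optional[int]:
    """Return the highest start address where `size` bytes fit, or None.

    The block must lie entirely within [min_addr, max_addr] and within
    one usable region. None models the "couldn't find suitable memory
    target" outcome.
    """
    best = None
    for start, end in regions:
        lo = max(start, min_addr)      # clip region to the allowed window
        hi = min(end, max_addr + 1)
        if hi - lo >= size:            # enough room left after clipping?
            candidate = hi - size
            if best is None or candidate > best:
                best = candidate
    return best
```

With a single usable region, a request fails as soon as the allowed window between `min_addr` and `max_addr` becomes smaller than the requested size, even though the region itself may be large enough.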

Colin Watson (cjwatson)
Changed in ubuntu:
milestone: ubuntu-12.10-beta-2 → ubuntu-12.10
Ben Howard (darkmuggle-deactivatedaccount) wrote :

Last part of boot, as an xz-compressed AVI.

Changed in ubuntu:
assignee: Ben Howard (utlemming) → nobody
Ben Howard (darkmuggle-deactivatedaccount) wrote :

Full boot log with grub2 options of "set debug=all; set pager=1" turned on.

Ben Howard (darkmuggle-deactivatedaccount) wrote :

On a wild hunch, I built an image with a minimal initrd -- just the bits that are needed to boot -- and the system booted. This means that the problem is with the size of the initramfs.

On a default install the initramfs is ~4.5MB. However, if you install the linux-image-extras-virtual (requirement for Azure due to the need for the UFS module), that bloats the initramfs to ~19MB on the latest kernel.

This means that the issue is likely the platform.

Here is the configuration for the initramfs that I used:
$ cat /etc/initramfs-tools/modules
sd_mod
scsi_mod
hid
hid_hyperv
hv_utils
hv_storvsc
hv_vmbus
hv_netvsc
crc_itu_t

$ cat /etc/initramfs-tools/initramfs.conf
MODULES=list
BUSYBOX=n
COMPCACHE_SIZE=""
COMPRESS=xz
BOOT=local
DEVICE=
NFSROOT=auto

Colin Watson (cjwatson) wrote :

So, there is certainly a limit on the size of the compressed kernel + its init_size + the compressed initrd, and on this platform as you say it's a little under 64MB. However, I can't make this add up to a problem in this case, and I think there may be some double-counting going on in GRUB that exacerbates the problem.

I looked hard at how GRUB allocates memory for the initrd. The maximum address is capped to 0x37FFFFFF, thereby excluding the 0x40000000-0xABFFFFFF range that's available, so the effective available range per E820 data is 0x100000-0x3FEFFFF, minus what's already been allocated for the kernel. The minimum address is then taken as the target physical address of the kernel (0x1000000) plus its required init space (0x134F000) plus the initrd size. But the initrd is going to be allocated *above* this address. Why does it need to add the initrd size? The effect of this calculation is to limit the initrd size to about 15MB when it should be limited to about 30MB.
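
The arithmetic behind the 15MB and 30MB figures can be checked with a short sketch. It uses only the addresses quoted above and is a simplified model, not GRUB code: it ignores alignment and any memory already in use, and the function name `max_initrd_size` is hypothetical.

```python
MIB = 1024 * 1024

# Values quoted in the comment above.
RAM_END = 0x3FEFFFF            # last usable byte of the low E820 range
KERNEL_TARGET = 0x1000000      # target physical address of the kernel
KERNEL_INIT_SIZE = 0x134F000   # kernel's required init space


def max_initrd_size(double_count: bool) -> int:
    """Largest initrd that fits between the kernel's init area and RAM_END.

    With double_count=True the minimum address also includes the initrd
    size, so the initrd must satisfy base + 2 * size <= RAM_END + 1.
    """
    base = KERNEL_TARGET + KERNEL_INIT_SIZE
    room = RAM_END + 1 - base
    return room // 2 if double_count else room


if __name__ == "__main__":
    for flag in (True, False):
        size = max_initrd_size(flag)
        print(f"double_count={flag}: {size / MIB:.2f} MiB")
```

Under these assumptions the limit is roughly 14.3 MiB with the initrd size counted twice and roughly 28.6 MiB without, which is in line with the "about 15MB" and "about 30MB" figures. An initrd of around 18 to 19MB falls between the two.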

You might like to try this patch:

  http://paste.ubuntu.com/1275848/

Beware that since it is nearly midnight here I've only minimally tested this.

Ben Howard (darkmuggle-deactivatedaccount) wrote :

Patch is confirmed to work on Windows Azure with an 18M initrd.

[ 0.894608] registered taskstats version 1
[ 1.093496] Freeing initrd memory: 18432k freed
[ 1.105264] Key type trusted registered
[ 1.110574] Key type encrypted registered

Ben Howard (darkmuggle-deactivatedaccount) wrote :

Regarding patching Grub2 or a zero-day SRU: given the known affected system, I think that a zero-day SRU is fine, unless there is another Grub2 build that needs committing. We would love to have this patch hit -proposed so that we can build against it.

Andy Whitcroft (apw)
affects: ubuntu → grub2 (Ubuntu)
Colin Watson (cjwatson) wrote :

Thanks. I'm unavailable for most of today, but I've forwarded the patch upstream to start with. I'll prepare an upload for quantal-proposed as soon as I can.

Steve Langasek (vorlon)
Changed in grub2 (Ubuntu Quantal):
status: Confirmed → Triaged
milestone: ubuntu-12.10 → quantal-updates
Colin Watson (cjwatson) wrote :

This is in quantal-proposed now.

Colin Watson (cjwatson)
Changed in grub2 (Ubuntu Quantal):
status: Triaged → In Progress
assignee: nobody → Colin Watson (cjwatson)
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package grub2 - 2.00-7ubuntu11

---------------
grub2 (2.00-7ubuntu11) quantal-proposed; urgency=low

  * Fix incorrect initrd minimum address calculation (LP: #1055686).
  * Add keystatus and loadenv to signed image (LP: #1066399).
 -- Colin Watson <email address hidden> Sun, 14 Oct 2012 09:30:55 +0100

Changed in grub2 (Ubuntu Quantal):
status: In Progress → Fix Released