booting cloud image without initramfs broken

Bug #1377308 reported by Scott Moser
This bug affects 3 people
Affects              Status     Importance  Assigned to  Milestone
cloud-init           Expired    High        Unassigned
cloud-init (Ubuntu)  Expired    High        Unassigned
  Trusty             Won't Fix  Medium      Unassigned

Bug Description

Booting without an initramfs was broken by the cloud-init change for
bug 1353008 (http://pad.lv/1353008).

This affects arm guests that boot without a bootloader, since nothing
then loads the kernel and initramfs.

There are 3 workarounds:
a.) remove the offensive code
  sudo mount-image-callback ubuntu.img -- \
     sh -c 'f="$MOUNTPOINT/etc/init/cloud-init-local.conf";
            sed -e "/^start on/s/ and mounted .*//" -i.dist $f &&
            diff -u $f.dist $f'

b.) register and boot with an initramfs
  This is done by
   i.) getting the initramfs out of the image:
     sudo mount-image-callback ubuntu.img -- \
       sh -c 'cp $MOUNTPOINT/boot/initrd* . && chmod ugo+r initrd*'
   ii.) upload the initramfs to glance
     glance image-create --name=ubuntu-ramdisk --public \
        --container-format ari --disk-format ari < initrd*
     record the ramdisk id
   iii.) register with --property ramdisk_id=$RAMDISK_ID
     normally for "ami" style images on arm, the user had been
     uploading with --property kernel_id=<kernel_id>.
     now, you need to upload with:
       glance image-create --name="$NAME" \
          --public --container-format ami --disk-format ami \
          --property "kernel_id=$KERNEL_ID" \
          --property "ramdisk_id=$RAMDISK_ID" < ubuntu.img

 c.) register 'kernel command line' to include 'rw'.
     glance image-create .... --property kernel_args="root=/dev/vda rw"
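
Workaround (a) can be tried on a local copy of the job file before touching an image. This is only a sketch that reuses the sed expression from above; the function name strip_run_condition is made up for illustration, and it assumes a sed that accepts -i with an attached backup suffix (GNU sed does).

```shell
# strip_run_condition FILE
# Hypothetical helper: on the "start on" line, drop everything from
# " and mounted" onward, keeping a FILE.dist backup (same sed as in a).
strip_run_condition() {
    f="$1"
    [ -f "$f" ] || { echo "no such file: $f" >&2; return 1; }
    sed -e '/^start on/s/ and mounted .*//' -i.dist "$f" || return 1
    # diff exits 1 when the files differ, which is the expected case here.
    diff -u "$f.dist" "$f" || true
}
```

For example, copy /etc/init/cloud-init-local.conf out of the image, run strip_run_condition on the copy, and review the diff it prints.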

Related bugs:
 * bug 1031065:cloud-init-nonet runs 'start networking' explicitly
 * bug 643289: [mountall] idmapd does not starts to work after system reboot
 * bug 1353008:[cloud-init] MAAS Provider: LXC did not get DHCP address, stuck in "pending"

ProblemType: Bug
DistroRelease: Ubuntu 14.04
Package: cloud-init 0.7.5-0ubuntu1.2
ProcVersionSignature: Ubuntu 3.13.0-36.63-generic 3.13.11.6
Uname: Linux 3.13.0-36-generic aarch64
ApportVersion: 2.14.1-0ubuntu3.4
Architecture: arm64
Date: Thu Jan 1 00:02:09 1970
Ec2AMI: ami-00000007
Ec2AMIManifest: FIXME
Ec2AvailabilityZone: nova
Ec2InstanceType: m1.5GB
Ec2Kernel: aki-00000005
Ec2Ramdisk: ari-00000003
PackageArchitecture: all
ProcEnviron:
 TERM=screen
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=en_US.UTF-8
 SHELL=/bin/bash
SourcePackage: cloud-init
UpgradeStatus: No upgrade log present (probably fresh install)
mtime.conffile..etc.init.cloud.init.local.conf: 2014-10-03T19:49:16.813801

Scott Moser (smoser) wrote :
Changed in cloud-init (Ubuntu):
status: New → Triaged
Changed in cloud-init:
status: New → Triaged
importance: Undecided → High
Changed in cloud-init (Ubuntu):
importance: Undecided → High
Scott Moser (smoser) wrote :

The offensive change was this in /etc/init/cloud-init-local.conf
-start on mounted MOUNTPOINT=/ and mounted MOUNTPOINT=/run
+start on mounted MOUNTPOINT=/

Scott Moser (smoser)
description: updated
Scott Moser (smoser) wrote :

I just noticed from reading one of the related bugs that booting the image without 'ro' on the kernel command line might also be a requirement to trigger this. I'd like to see a kernel booted with 'ro' on the command line.

that said, I tried to reproduce this with a daily image of trusty (20141003) and could not.
that attempt looked like this:

$ tgz_url=http://cloud-images.ubuntu.com/trusty/current/trusty-server-cloudimg-amd64.tar.gz
$ tgz=${tgz_url##*/}
$ wget -O "$tgz" "${tgz_url}"

$ mkdir dist
$ tar -C dist -Sxvzf "$tgz"
$ dist_disk=$(echo dist/*.img)
$ kernel=$(echo dist/*vmlinuz*)

$ cat > user-data <<EOF
#cloud-config
password: passw0rd
chpasswd: { expire: False }
ssh_pwauth: True
EOF
$ echo "instance-id: $(uuidgen || echo i-abcdefg)" > meta-data
$ cloud-localds seed.img user-data meta-data

$ qemu-img create -f qcow2 -b "$dist_disk" disk.img
$ qemu-system-x86_64 -enable-kvm -net nic -net user,hostfwd=tcp::2222-:22 \
    -drive file=disk.img,if=virtio -drive file=seed.img,if=virtio \
    -kernel "${kernel}" -append "root=LABEL=cloudimg-rootfs ro" -curses

i could not cause a hang here with or without 'ro'

Scott Moser (smoser) wrote :

so to clarify above, i could not recreate the error on amd64, but it most certainly *does* fail on arm64 (AArch64).

Raghuram Kota (rkota)
tags: added: hs-arm64
tags: added: hs-moonshot
tags: added: hs-moonshot-maas-juju
removed: hs-moonshot
Scott Moser (smoser) wrote :

ok. so i can reproduce this on both arm64 and ppc64el.
On ppc64el, both on trusty and on utopic.
Adding '-initrd <extracted-initramfs>' fixes the problem.

Here's an improved copy/paste to recreate. It does *not* fail on amd64 (arch=amd64).

arch=ppc64el
rel=trusty
tgz_url=http://cloud-images.ubuntu.com/$rel/current/${rel}-server-cloudimg-${arch}.tar.gz
tgz=${tgz_url##*/}
qemu=qemu-system-${arch}
[ "$arch" = "amd64" ] && qemu="qemu-system-x86_64"
[ "$arch" = "ppc64el" ] && qemu="qemu-system-ppc64"

[ -f "$tgz" ] || { wget "${tgz_url}" -O "$tgz.part" && mv "$tgz.part" "$tgz"; }

mkdir -p dist
( cd dist && ls *$rel*$arch*.img 2>/dev/null ) || tar -C dist -Sxvzf "$tgz"
dist_disk=$(echo dist/*$rel*$arch*.img)
kernel=$(echo dist/*$rel*$arch*vmlinu?*)

cat > user-data <<EOF
#cloud-config
password: passw0rd
chpasswd: { expire: False }
ssh_pwauth: True
EOF
echo "instance-id: $(uuidgen || echo i-abcdefg)" > meta-data
cloud-localds seed.img user-data meta-data

qemu-img create -f qcow2 -b "$dist_disk" disk.img

# on intel:
$qemu -enable-kvm \
   -net nic -net user,hostfwd=tcp::2222-:22 \
   -drive file=disk.img,if=virtio -drive file=seed.img,if=virtio \
   -kernel "${kernel}" -append "root=/dev/vda ro" -curses

# on ppc64el
$qemu -m 1G -enable-kvm -machine pseries,usb=off -device spapr-vscsi \
   -device spapr-vlan,netdev=net00 -netdev type=user,id=net00 \
   -drive file=disk.img,if=virtio -drive file=seed.img,if=virtio \
   -kernel "$kernel" -append "root=/dev/vda console=hvc0 ro --verbose" \
   -display none -nographic
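
Since adding '-initrd' is what makes the difference, it may help to make that switchable in the recreate script. The following is a sketch, assuming the initramfs has already been extracted from the image into some directory (as in workaround b.i); initrd_opt is a hypothetical helper name, not part of any tool used here.

```shell
# initrd_opt DIR
# Hypothetical helper: print "-initrd <path>" for the first initrd* file
# found in DIR, or print nothing if there is none.
initrd_opt() {
    for i in "$1"/initrd*; do
        if [ -f "$i" ]; then
            echo "-initrd $i"
            return 0
        fi
    done
    return 0
}
```

It would be used unquoted so the output splits into two words, e.g. `$qemu ... -kernel "$kernel" $(initrd_opt .) -append "root=/dev/vda ro"`, which also means a path containing spaces would break it.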

Scott Moser (smoser) wrote :

a bit more info. on ppc64el at least, you can boot with 'rw' as a kernel parameter and fix this. that's less than ideal, but it adds an interesting piece of information. previously i just assumed we were blocked in a hang where / was mounted rw but /run was not yet mounted. it is the other way around, though: '/run' gets mounted and we're blocked on /.
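
To tell quickly which case a guest is in, the kernel command line can be checked for a bare 'ro' or 'rw' word. This is a sketch only; root_flag is a hypothetical name, and it assumes that when both words appear the last one wins.

```shell
# root_flag CMDLINE
# Hypothetical helper: print "ro", "rw" or "unset" depending on which of
# the bare ro/rw words appears last in the given command line string.
root_flag() {
    flag=unset
    for word in $1; do
        case "$word" in
            ro|rw) flag=$word ;;
        esac
    done
    echo "$flag"
}
```

On a running guest this would be called as `root_flag "$(cat /proc/cmdline)"`.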

description: updated
Scott Moser (smoser) wrote :

comment 3 above is ordered wrong. the change that caused this is:
/etc/init/cloud-init-local.conf
+start on mounted MOUNTPOINT=/
+start on mounted MOUNTPOINT=/ and mounted MOUNTPOINT=/run

the reason was that cloud-init-local needs to write to / and to /run. previously it was using /run without declaring the need for it. also cloud-init generally wants 'mounted' to block things.
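
On a guest that hangs, the state of the two mount points the job waits for can be read from /proc/mounts. The helper below is a sketch; mount_state is a hypothetical name, and it takes the mounts file as an optional argument so it can be tried on a saved copy. Note that it reports what the kernel has mounted, which is not necessarily the same as upstart having seen the corresponding 'mounted' event.

```shell
# mount_state MOUNTPOINT [MOUNTS_FILE]
# Hypothetical helper: print "rw", "ro" or "missing" for MOUNTPOINT, using
# the last matching entry in MOUNTS_FILE (default /proc/mounts).
mount_state() {
    awk -v mp="$1" '
        $2 == mp { n = split($4, o, ","); s = "ro";
                   for (i = 1; i <= n; i++) if (o[i] == "rw") s = "rw" }
        END { print (s == "" ? "missing" : s) }
    ' "${2:-/proc/mounts}"
}
```

For the hang described above, one would expect `mount_state /run` to print rw while `mount_state /` prints ro.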

Scott Moser (smoser) wrote :

bah.
/etc/init/cloud-init-local.conf
-start on mounted MOUNTPOINT=/
+start on mounted MOUNTPOINT=/ and mounted MOUNTPOINT=/run

tags: added: hs-arm64-maas-juju
Scott Moser (smoser)
Changed in cloud-init (Ubuntu Trusty):
status: New → Triaged
importance: Undecided → Medium
Changed in cloud-init:
assignee: nobody → Roufique Hossain (roufique)
Changed in cloud-init (Ubuntu):
assignee: nobody → Roufique Hossain (roufique)
Changed in cloud-init (Ubuntu Trusty):
assignee: nobody → Roufique Hossain (roufique)
Dan Watkins (oddbloke)
Changed in cloud-init (Ubuntu Trusty):
assignee: Roufique Hossain (roufique) → nobody
Changed in cloud-init (Ubuntu):
assignee: Roufique Hossain (roufique) → nobody
Changed in cloud-init:
assignee: Roufique Hossain (roufique) → nobody
Dan Watkins (oddbloke) wrote :

Drive-by mark as Incomplete, as the way initramfses and cloud images interact has changed substantially since 2014.

Changed in cloud-init (Ubuntu Trusty):
status: Triaged → Won't Fix
Changed in cloud-init (Ubuntu):
status: Triaged → Incomplete
Changed in cloud-init:
status: Triaged → Incomplete
Launchpad Janitor (janitor) wrote :

[Expired for cloud-init because there has been no activity for 60 days.]

Changed in cloud-init:
status: Incomplete → Expired
Launchpad Janitor (janitor) wrote :

[Expired for cloud-init (Ubuntu) because there has been no activity for 60 days.]

Changed in cloud-init (Ubuntu):
status: Incomplete → Expired