purging cloud-init from user-data commands does not work under systemd

Bug #1427999 reported by Martin Pitt
Affects               Status        Importance  Assigned to  Milestone
autopkgtest (Ubuntu)  Fix Released  High        Martin Pitt
cloud-init (Ubuntu)   Fix Released  Low         Unassigned

Bug Description

Apparently yesterday the cloud images at http://cloud-images.ubuntu.com/vivid/ were switched to boot with systemd, with init=/lib/systemd/systemd. The 20150303/ image still worked (with upstart); 20150304/ hangs:

[ 15.009125] cloud-init[541]: Cloud-init v. 0.7.7 running 'init-local' at Wed, 04 Mar 2015 07:31:54 +0000. Up 13.71 seconds.
[ OK ] Started Initial cloud-init job (pre-networking).
         Starting Initial cloud-init job (metadata service crawler)...
[ 15.471963] cloud-init[747]: Cloud-init v. 0.7.7 running 'init' cron.service
[...]
[ 15.824259] cloud-init[781]: Cloud-init v. 0.7.7 running 'modules:config' at Wed, 04 Mar 2015 07:31:56 +0000. Up 15.73 seconds.
[ 15.824463] cloud-init[781]: 2015-03-04 07:31:56,991 - cc_emit_upstart.py[WARNING]: Emission of upstart event cloud-config failed due to: Unexpected error while running command.
[ 15.824589] cloud-init[781]: Command: ['initctl', 'emit', 'cloud-config', 'CLOUD_CFG=/var/lib/cloud/instance/cloud-config.txt']
[ 15.824698] cloud-init[781]: Exit code: 1
[ 15.824802] cloud-init[781]: Reason: -
[ 15.824903] cloud-init[781]: Stdout: ''
[ 15.825003] cloud-init[781]: Stderr: 'initctl: Unable to connect to Upstart: Failed to connect to socket /com/ubuntu/upstart: Connection refused\n'
[ 16.003444] cloud-init[781]: Generating locales...
[ 16.036106] cloud-init[781]: en_US.UTF-8... up-to-date
[ 16.040818] cloud-init[781]: Generation complete.
         Stopping OpenBSD Secure Shell server...
[ OK ] Stopped OpenBSD Secure Shell server.
[ OK ] Sta

and there it hangs. During boot, it tries to talk to 169.254.169.254, which times out.

The initctl emit failure looks related; this obviously won't work under systemd. It probably needs to be replaced with some "systemctl start ..." equivalent. Note that you can test for systemd with [ -d /run/systemd/system ]; that's the canonical test.
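A minimal sketch of the systemd check mentioned above, in POSIX shell. The function name is_systemd_booted and its optional root-directory argument are illustrative assumptions (the argument exists only so the check can be tried against a scratch directory); the /run/systemd/system test itself is the one quoted in the report.

```shell
#!/bin/sh
# Return 0 if the system was booted with systemd, 1 otherwise.
# $1 is an optional alternate root (hypothetical, for testing only).
is_systemd_booted() {
    [ -d "${1:-}/run/systemd/system" ]
}

# Only report which init was detected; an init-specific notification
# (initctl emit vs. some systemctl call) would be chosen at this point.
if is_systemd_booted; then
    echo "init: systemd"
else
    echo "init: not systemd (for example upstart or sysvinit)"
fi
```

Testing for the directory rather than for the presence of the systemctl binary matters here: systemd can be installed without being the running init, which is exactly the situation in which initctl would still be the right tool.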

Martin Pitt (pitti)
Changed in cloud-init (Ubuntu):
importance: Undecided → Critical
Martin Pitt (pitti) wrote :

Reproducer:

wget http://cloud-images.ubuntu.com/vivid/current/vivid-server-cloudimg-amd64-disk1.img
echo -e 'instance-id: nocloud\nlocal-hostname: test1\n' > meta-data
cat <<EOF > user-data
#cloud-config
password: ubuntu
ssh_pwauth: True
runcmd:
 - (while [ ! -e /var/lib/cloud/instance/boot-finished ]; do sleep 1; done;
    apt-get -y purge cloud-init; shutdown -P now) &
EOF
genisoimage -output seed.iso -volid cidata -joliet -rock user-data meta-data
kvm -serial stdio -snapshot -drive file=vivid-server-cloudimg-amd64-disk1.img,if=virtio -drive file=seed.iso,if=virtio,readonly

This uses -snapshot so that you can mess around with the downloaded image several times without changing it.

This hangs with today's 20150304 image, but works fine with the 20150303 one. Apparently it's stuck on purging cloud-init, which worked until yesterday. (We do that in CI so that the resulting images boot without the seed.iso, and also to minimize the installed packages so that we can actually run the tests of cloud-init and similar packages themselves.)

Changed in cloud-init (Ubuntu):
importance: Critical → Medium
summary: - cloud images don't boot with systemd
+ cloud-init does not purge with systemd
Martin Pitt (pitti) wrote : Re: cloud-init does not purge with systemd

So the hang is specific to purging; lowering importance. However, this still looks a bit worrying:

[ 7.142392] cloud-init[787]: Command: ['initctl', 'emit', 'cloud-config', 'CLOUD_CFG=/var/lib/cloud/instance/cloud-config.txt']
[ 7.143936] cloud-init[787]: Exit code: 1
[ 7.145140] cloud-init[787]: Reason: -
[ 7.146076] cloud-init[787]: Stdout: ''
[ 7.147401] cloud-init[787]: Stderr: 'initctl: Unable to connect to Upstart: Failed to connect to socket /com/ubuntu/upstart: Connection refused\n'

Does anything depend on this functionality?

Martin Pitt (pitti) wrote :

Ah, I know now. cloud-init's prerm stops cloud-init-local.service, so if you purge cloud-init from user-data, stopping that service kills apt/dpkg/etc. while they are still running. I'll work around that in autopkgtest, but I wonder if we want to support that case in cloud-init itself. I'll have a closer look and prepare a patch.
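One possible way to sidestep the problem described above, sketched in POSIX shell. This is not the workaround that was committed to autopkgtest (the report does not show it); it assumes the purge is killed because it runs inside a unit that the prerm stops, and that launching it through systemd-run as a separate transient unit avoids that. The function name purge_launcher, the unit name purge-cloud-init, and the optional root-directory argument are hypothetical. The sketch only prints the command it would use rather than executing it.

```shell
#!/bin/sh
# Commands to run once cloud-init has finished booting (from the reproducer).
PURGE_SCRIPT='while [ ! -e /var/lib/cloud/instance/boot-finished ]; do sleep 1; done; apt-get -y purge cloud-init; shutdown -P now'

# Print the command line that would launch PURGE_SCRIPT.
# $1 is an optional alternate root (hypothetical, for testing only).
purge_launcher() {
    if [ -d "${1:-}/run/systemd/system" ]; then
        # systemd: start a transient unit (hypothetical name) so the purge
        # runs outside the units that cloud-init's prerm stops.
        printf '%s\n' "systemd-run --unit=purge-cloud-init /bin/sh -c '$PURGE_SCRIPT'"
    else
        # Other init systems: plain background subshell, as in the reproducer.
        printf '%s\n' "($PURGE_SCRIPT) &"
    fi
}

# Dry run: show the command instead of executing it.
purge_launcher
```

The printed line could then be used as the runcmd entry in user-data. Whether this is sufficient on a given image would need to be verified by booting it.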

summary: - cloud-init does not purge with systemd
+ purging cloud-init from user-data commands does not work under systemd
Changed in cloud-init (Ubuntu):
importance: Medium → Low
assignee: nobody → Martin Pitt (pitti)
Changed in autopkgtest (Ubuntu):
status: New → In Progress
importance: Undecided → High
Martin Pitt (pitti) wrote :

Workaround committed to autopkgtest.

Changed in autopkgtest (Ubuntu):
status: In Progress → Fix Committed
Changed in cloud-init (Ubuntu):
assignee: Martin Pitt (pitti) → nobody
Changed in autopkgtest (Ubuntu):
assignee: nobody → Martin Pitt (pitti)
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package autopkgtest - 3.11.1

---------------
autopkgtest (3.11.1) unstable; urgency=medium

  * Fix autopkgtest-reboot to also work when being called through sudo, and
    for forking test scripts.
  * Avoid failure if /var/cache/apparmor/click-ap.rules does not exist any
    more after a reboot, like on snappy.
  * adt-buildvm-ubuntu-cloud: Avoid cloud-init's prerm stopping cloud-init's
    services while we are still running them. This makes it possible to purge
    cloud-init from user-data, and avoids killing apt/dpkg underneath us. This
    needs a cleaner solution, but is a good enough workaround for now.
    (LP: #1427999)
  * adt-buildvm-ubuntu-cloud: Don't wait between serial console reads while
    we have data. Provides faster/smoother output in --verbose mode.
  * ssh-setup/nova: Fix error message on missing keypair, the command is
    "keypair-add", not "keypair-create". Thanks Thomi Richards! (LP: #1428433)
  * adt-buildvm-ubuntu-cloud: Add workaround for recent cloud-init regression
    that disables ssh (LP #1428495)
  * adt-run: Suggest common reason for unsatisfiable test dependencies.
    Suggest using a current image, or run apt-get update/-U (for apt-get) or
    --setup-commands ro-apt-update (for temp dir install mode). (LP: #1425682)
  * Quiesce confusing "failed to create symbolic link
    /sbin/autopkgtest-reboot" warning on read-only testbeds.

 -- Martin Pitt <email address hidden> Thu, 05 Mar 2015 13:33:12 +0100

Changed in autopkgtest (Ubuntu):
status: Fix Committed → Fix Released
Joshua Powers (powersj) wrote :

This appears to be fixed, so I marked it Fix Released.

I followed pitti's steps to reproduce in comment #1 with a xenial image. There were no hangs or unexplained error messages.

Changed in cloud-init (Ubuntu):
status: New → Fix Released