power_state does not take effect when runcmd errors

Bug #1449318 reported by Laurence Rowe on 2015-04-28
This bug affects 1 person
| Affects | Importance | Assigned to |
|---|---|---|
| cloud-init | High | Scott Moser |
| cloud-init (Ubuntu) | High | Unassigned |
| Wily | Low | Unassigned |
| Xenial | High | Unassigned |

Bug Description

When a runcmd command exits with an error, the power_state change does not take effect and the instance is not powered off.

AMI ID: ubuntu-vivid-15.04-amd64-server-20150422 (ami-2d10241d)

Instance launched on EC2 using awscli:

$ aws --region us-west-2 ec2 run-instances --image-id ami-2d10241d --instance-type t2.medium --security-groups ssh-http-https --user-data file://fail.cfg

Minimal fail.cfg cloud config:
```
#cloud-config
power_state:
  mode: poweroff

runcmd:
- set -e
- python3 -c "raise Exception"
```

Longer fail.cfg used for retrieving logs:
```
#cloud-config
output:
  all: '| tee -a /var/log/cloud-init-output.log'

power_state:
  mode: poweroff

bootcmd:
- cloud-init-per once ssh-users-ca echo "TrustedUserCAKeys /etc/ssh/users_ca.pub" >> /etc/ssh/sshd_config

runcmd:
- set -e
- python3 -c "raise Exception"

write_files:
- path: /etc/ssh/users_ca.pub
  content: <my ssh ca key>
```

Laurence Rowe (lrowe) wrote :

Does not affect ubuntu-trusty-14.04-amd64-server-20150325 (ami-5189a661) or ubuntu-utopic-14.10-amd64-server-20150202 (ami-bd471c8d).

Laurence Rowe (lrowe) wrote :

This is still a problem in Wily 15.10 (ami-225ebd11)

On the ec2 console I see:

Cloud-init 0.7.7 received SIGTERM, exiting...
  Filename: /usr/lib/python3/dist-packages/cloudinit/config/cc_power_state_change.py
  Function: run_after_pid_gone
  Line number: 209
    Filename: /usr/lib/python3/dist-packages/cloudinit/util.py
    Function: fork_cb
    Line number: 285
      Filename: /usr/lib/python3/dist-packages/cloudinit/config/cc_power_state_change.py
      Function: handle
      Line number: 109
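
The traceback points at a forked child that waits for the main cloud-init process to exit before running the shutdown command. A minimal Python sketch of that wait-then-run pattern follows. It is not cloud-init's implementation: the function names and the `os.kill(pid, 0)` liveness probe are assumptions chosen for illustration (the log in this report shows cloud-init reading `/proc/<pid>/cmdline` instead).

```python
import os
import time


def pid_is_alive(pid):
    """Return True if a process with this pid currently exists."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence; nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def run_when_pid_gone(pid, callback, poll=0.25, timeout=30.0):
    """Poll until pid has exited, then call callback and return its result."""
    deadline = time.monotonic() + timeout
    while pid_is_alive(pid):
        if time.monotonic() >= deadline:
            raise TimeoutError(f"pid {pid} still running after {timeout}s")
        time.sleep(poll)
    return callback()
```

If the waiting process is itself terminated by SIGTERM while it sleeps, the callback never runs, which is consistent with the symptom reported here.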

And in /var/log/cloud-init.log:

Oct 28 21:16:14 ubuntu [CLOUDINIT] helpers.py[DEBUG]: Running config-power-state-change using lock (<FileLock using file '/var/lib/cloud/instances/i-ea987333/sem/config_power_state_change'>)
Oct 28 21:16:14 ubuntu [CLOUDINIT] util.py[DEBUG]: Reading from /proc/1062/cmdline (quiet=False)
Oct 28 21:16:14 ubuntu [CLOUDINIT] util.py[DEBUG]: Read 58 bytes from /proc/1062/cmdline
Oct 28 21:16:14 ubuntu [CLOUDINIT] cc_power_state_change.py[DEBUG]: After pid 1062 ends, will execute: shutdown -P now
Oct 28 21:16:14 ubuntu [CLOUDINIT] util.py[DEBUG]: Forked child 1080 who will run callback run_after_pid_gone
Oct 28 21:16:14 ubuntu [CLOUDINIT] handlers.py[DEBUG]: finish: modules-final/config-power-state-change: SUCCESS: config-power-state-change ran successfully
Oct 28 21:16:14 ubuntu [CLOUDINIT] cloud-init[DEBUG]: Ran 11 modules with 1 failures
Oct 28 21:16:14 ubuntu [CLOUDINIT] util.py[DEBUG]: Reading from /proc/1062/cmdline (quiet=False)
Oct 28 21:16:14 ubuntu [CLOUDINIT] util.py[DEBUG]: Read 58 bytes from /proc/1062/cmdline
Oct 28 21:16:14 ubuntu [CLOUDINIT] util.py[DEBUG]: Creating symbolic link from '/run/cloud-init/result.json' => '../../var/lib/cloud/data/result.json'
Oct 28 21:16:14 ubuntu [CLOUDINIT] util.py[DEBUG]: Reading from /proc/uptime (quiet=False)
Oct 28 21:16:14 ubuntu [CLOUDINIT] util.py[DEBUG]: Read 12 bytes from /proc/uptime
Oct 28 21:16:14 ubuntu [CLOUDINIT] util.py[DEBUG]: cloud-init mode 'modules' took 0.502 seconds (0.50)
Oct 28 21:16:14 ubuntu [CLOUDINIT] handlers.py[DEBUG]: finish: modules-final: FAIL: running modules for final
Oct 28 21:16:14 ubuntu [CLOUDINIT] util.py[DEBUG]: Cloud-init 0.7.7 received SIGTERM, exiting...#012 Filename: /usr/lib/python3/dist-packages/cloudinit/config/cc_power_state_change.py#012 Function: run_after_pid_gone#012 Line number: 209#012 Filename: /usr/lib/python3/dist-packages/cloudinit/util.py#012 Function: fork_cb#012 Line number: 285#012 Filename: /usr/lib/python3/dist-packages/cloudinit/config/cc_power_state_change.py#012 Function: handle#012 Line number: 109
Oct 28 21:16:14 ubuntu [CLOUDINIT] util.py[WARNING]: Failed forking and calling callback run_after_pid_gone
Oct 28 21:16:14 ubuntu [CLOUDINIT] util.py[DEBUG]: Failed forking and calling callback run_after_pid_gone#012Traceback (most recent call last):#012 File "/usr/lib/python3/dist-packages/cloudinit/util.py", line 285, in fork_cb#012 child_cb(*args, **kwargs)#012 File "/usr/lib/python3/dist-packages/cloudinit/config/cc_power_state_change.py", line 209, in run_after_pid_gone#012 time.sleep(.25)#012 File "/usr/lib/python3/dist-packages/cloudinit/signal_handler.py", line 63, in...
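
The `#012` sequences in the last log lines appear to be the syslog daemon's way of writing embedded newlines (octal 012) on a single line. To read the traceback in its original multi-line form, a small helper like the following can be used. This is a sketch: the helper name is hypothetical, and it assumes only control characters are escaped as `#NNN`.

```python
import re

# Matches '#' followed by exactly three octal digits, e.g. '#012'.
_OCTAL_ESCAPE = re.compile(r"#([0-7]{3})")


def decode_syslog_escapes(line):
    """Replace #NNN octal escapes of control characters with the real character."""
    def _replace(match):
        code = int(match.group(1), 8)
        # Only decode control characters; leave things like 'bug #123' untouched.
        if code < 32 or code == 127:
            return chr(code)
        return match.group(0)

    return _OCTAL_ESCAPE.sub(_replace, line)


# Fragment taken from the log line above.
sample = (
    'Traceback (most recent call last):'
    '#012 File "/usr/lib/python3/dist-packages/cloudinit/util.py", '
    'line 285, in fork_cb#012 child_cb(*args, **kwargs)'
)
print(decode_syslog_escapes(sample))
```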


Scott Moser (smoser) on 2015-11-19
Changed in cloud-init:
status: New → Confirmed
importance: Undecided → High
Scott Moser (smoser) wrote :

It seems this is fixable by setting KillMode=process
See http://www.freedesktop.org/software/systemd/man/systemd.kill.html

$ bzr diff
=== modified file 'systemd/cloud-final.service'
--- systemd/cloud-final.service 2015-04-09 15:54:01 +0000
+++ systemd/cloud-final.service 2015-11-30 20:17:47 +0000
@@ -8,6 +8,7 @@
 ExecStart=/usr/bin/cloud-init modules --mode=final
 RemainAfterExit=yes
 TimeoutSec=0
+KillMode=process

 # Output needs to appear in instance console output
 StandardOutput=journal+console

Scott Moser (smoser) wrote :

Verified that this works with:
a.) boot wily system with no user-data at all
b.) sudo tee /etc/cloud/cloud.cfg.d/99_poweroff.cfg <<EOF
#cloud-config
runcmd:
 - "echo ------------ exiting fail ------------------; exit 1;"
power_state:
  mode: poweroff
EOF

c.) rm -Rf /var/lib/cloud
d.) sudo reboot
e.) system will come back up and not power off, due to this bug
f.) sudo sed -i -e 's,TimeoutSec=0,TimeoutSec=0\nKillMode=process,' /lib/systemd/system/cloud-final.service
g.) sudo rm -Rf /var/lib/cloud
h.) sudo reboot

After the second reboot, the system will boot and then power itself off.
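
The `sed` substitution in step f adds another `KillMode=process` line each time it is run. For anyone scripting this workaround, here is a Python sketch of the same edit that is safe to repeat. The function name and its parameters are hypothetical, and it only transforms the unit file text; reading and writing `/lib/systemd/system/cloud-final.service` is left to the caller.

```python
def add_kill_mode(unit_text, anchor="TimeoutSec=0", setting="KillMode=process"):
    """Insert `setting` on the line after `anchor` unless it is already present."""
    lines = unit_text.splitlines()
    if setting in lines:
        return unit_text  # already patched; nothing to do
    patched = []
    for line in lines:
        patched.append(line)
        if line.strip() == anchor:
            patched.append(setting)
    return "\n".join(patched) + "\n"


unit = (
    "ExecStart=/usr/bin/cloud-init modules --mode=final\n"
    "RemainAfterExit=yes\n"
    "TimeoutSec=0\n"
)
print(add_kill_mode(unit), end="")
```

Applying the function to its own output returns the text unchanged, so a second run does not duplicate the setting.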

Laurence Rowe (lrowe) wrote :

I'm able to work around this on 15.10 by applying smoser's fix as part of my cloud config:

#cloud-config
power_state:
  mode: poweroff
runcmd:
  - set -ex
  - systemctl daemon-reload
  - python3 -c "raise Exception"
write_files:
  - path: /etc/systemd/system/cloud-final.service.d/override.conf
    content: |
      [Service]
      # See https://bugs.launchpad.net/cloud-init/+bug/1449318
      KillMode=process

Scott Moser (smoser) wrote :

This is fixed in trunk at revno 1157.

Changed in cloud-init:
assignee: nobody → Scott Moser (smoser)
status: Confirmed → Fix Committed
Changed in cloud-init (Ubuntu):
status: New → Fix Released
Changed in cloud-init (Ubuntu Wily):
status: New → Fix Released
status: Fix Released → Won't Fix
Changed in cloud-init (Ubuntu Xenial):
status: New → Fix Released
Changed in cloud-init (Ubuntu):
importance: Undecided → High
Changed in cloud-init (Ubuntu Wily):
importance: Undecided → Low
Changed in cloud-init (Ubuntu Xenial):
importance: Undecided → High
Scott Moser (smoser) wrote :

This is fixed in cloud-init 0.7.7

Changed in cloud-init:
status: Fix Committed → Fix Released