power_state does not take effect when runcmd errors

Bug #1449318 reported by Laurence Rowe
This bug affects 1 person
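| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| cloud-init | Fix Released | High | Scott Moser | |
| cloud-init (Ubuntu) | Fix Released | High | Unassigned | |
| cloud-init (Ubuntu Wily) | Won't Fix | Low | Unassigned | |
| cloud-init (Ubuntu Xenial) | Fix Released | High | Unassigned | |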

Bug Description

When a runcmd command fails, the power_state change does not take effect and the instance is not powered off.

AMI ID: ubuntu-vivid-15.04-amd64-server-20150422 (ami-2d10241d)

Instance launched on EC2 using awscli:

$ aws --region us-west-2 ec2 run-instances --image-id ami-2d10241d --instance-type t2.medium --security-groups ssh-http-https --user-data file://fail.cfg

Minimal fail.cfg cloud config:
```
#cloud-config
power_state:
  mode: poweroff

runcmd:
- set -e
- python3 -c "raise Exception"
```

Longer fail.cfg used for retrieving logs:
```
#cloud-config
output:
  all: '| tee -a /var/log/cloud-init-output.log'

power_state:
  mode: poweroff

bootcmd:
- cloud-init-per once ssh-users-ca echo "TrustedUserCAKeys /etc/ssh/users_ca.pub" >> /etc/ssh/sshd_config

runcmd:
- set -e
- python3 -c "raise Exception"

write_files:
- path: /etc/ssh/users_ca.pub
  content: <my ssh ca key>
```

Laurence Rowe (lrowe) wrote :

Does not affect ubuntu-trusty-14.04-amd64-server-20150325 (ami-5189a661) or ubuntu-utopic-14.10-amd64-server-20150202 (ami-bd471c8d).

Laurence Rowe (lrowe) wrote :

This is still a problem in Wily 15.10 (ami-225ebd11)

On the ec2 console I see:

Cloud-init 0.7.7 received SIGTERM, exiting...
  Filename: /usr/lib/python3/dist-packages/cloudinit/config/cc_power_state_change.py
  Function: run_after_pid_gone
  Line number: 209
    Filename: /usr/lib/python3/dist-packages/cloudinit/util.py
    Function: fork_cb
    Line number: 285
      Filename: /usr/lib/python3/dist-packages/cloudinit/config/cc_power_state_change.py
      Function: handle
      Line number: 109

And in /var/log/cloud-init.log:

Oct 28 21:16:14 ubuntu [CLOUDINIT] helpers.py[DEBUG]: Running config-power-state-change using lock (<FileLock using file '/var/lib/cloud/instances/i-ea987333/sem/config_power_state_change'>)
Oct 28 21:16:14 ubuntu [CLOUDINIT] util.py[DEBUG]: Reading from /proc/1062/cmdline (quiet=False)
Oct 28 21:16:14 ubuntu [CLOUDINIT] util.py[DEBUG]: Read 58 bytes from /proc/1062/cmdline
Oct 28 21:16:14 ubuntu [CLOUDINIT] cc_power_state_change.py[DEBUG]: After pid 1062 ends, will execute: shutdown -P now
Oct 28 21:16:14 ubuntu [CLOUDINIT] util.py[DEBUG]: Forked child 1080 who will run callback run_after_pid_gone
Oct 28 21:16:14 ubuntu [CLOUDINIT] handlers.py[DEBUG]: finish: modules-final/config-power-state-change: SUCCESS: config-power-state-change ran successfully
Oct 28 21:16:14 ubuntu [CLOUDINIT] cloud-init[DEBUG]: Ran 11 modules with 1 failures
Oct 28 21:16:14 ubuntu [CLOUDINIT] util.py[DEBUG]: Reading from /proc/1062/cmdline (quiet=False)
Oct 28 21:16:14 ubuntu [CLOUDINIT] util.py[DEBUG]: Read 58 bytes from /proc/1062/cmdline
Oct 28 21:16:14 ubuntu [CLOUDINIT] util.py[DEBUG]: Creating symbolic link from '/run/cloud-init/result.json' => '../../var/lib/cloud/data/result.json'
Oct 28 21:16:14 ubuntu [CLOUDINIT] util.py[DEBUG]: Reading from /proc/uptime (quiet=False)
Oct 28 21:16:14 ubuntu [CLOUDINIT] util.py[DEBUG]: Read 12 bytes from /proc/uptime
Oct 28 21:16:14 ubuntu [CLOUDINIT] util.py[DEBUG]: cloud-init mode 'modules' took 0.502 seconds (0.50)
Oct 28 21:16:14 ubuntu [CLOUDINIT] handlers.py[DEBUG]: finish: modules-final: FAIL: running modules for final
Oct 28 21:16:14 ubuntu [CLOUDINIT] util.py[DEBUG]: Cloud-init 0.7.7 received SIGTERM, exiting...#012 Filename: /usr/lib/python3/dist-packages/cloudinit/config/cc_power_state_change.py#012 Function: run_after_pid_gone#012 Line number: 209#012 Filename: /usr/lib/python3/dist-packages/cloudinit/util.py#012 Function: fork_cb#012 Line number: 285#012 Filename: /usr/lib/python3/dist-packages/cloudinit/config/cc_power_state_change.py#012 Function: handle#012 Line number: 109
Oct 28 21:16:14 ubuntu [CLOUDINIT] util.py[WARNING]: Failed forking and calling callback run_after_pid_gone
Oct 28 21:16:14 ubuntu [CLOUDINIT] util.py[DEBUG]: Failed forking and calling callback run_after_pid_gone#012Traceback (most recent call last):#012 File "/usr/lib/python3/dist-packages/cloudinit/util.py", line 285, in fork_cb#012 child_cb(*args, **kwargs)#012 File "/usr/lib/python3/dist-packages/cloudinit/config/cc_power_state_change.py", line 209, in run_after_pid_gone#012 time.sleep(.25)#012 File "/usr/lib/python3/dist-packages/cloudinit/signal_handler.py", line 63, in...
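The traceback shows the forked child sitting in a polling loop (`time.sleep(.25)` inside `run_after_pid_gone`) when SIGTERM arrives, so the `shutdown -P now` it was waiting to run is never executed. A minimal sketch of that wait-then-act pattern is below. It is not cloud-init's actual code: `run_after_gone`, `is_alive`, and the injectable `sleep` parameter are hypothetical names chosen for illustration.

```python
import time


def run_after_gone(is_alive, action, poll_interval=0.25, timeout=5.0,
                   sleep=time.sleep):
    """Poll is_alive() until it returns False, then call action().

    Returns action()'s result, or None if the timeout expires first.
    Hypothetical helper; cloud-init's run_after_pid_gone differs in detail.
    """
    deadline = time.monotonic() + timeout
    while is_alive():
        if time.monotonic() >= deadline:
            return None  # gave up waiting; the action is never run
        # If the process is terminated by a signal while sleeping here,
        # the action below never runs. That is the failure seen in the log.
        sleep(poll_interval)
    return action()
```

In the bug, the equivalent of `is_alive` checks whether the parent cloud-init process still exists, and the equivalent of `action` runs the shutdown command.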

Scott Moser (smoser)
Changed in cloud-init:
status: New → Confirmed
importance: Undecided → High
Scott Moser (smoser) wrote :

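It seems this is fixable by setting KillMode=process in cloud-final.service. With systemd's default (KillMode=control-group), every process left in the unit's control group is killed when the unit stops, which presumably includes the forked child waiting to run the shutdown. KillMode=process limits the kill to the main process.
See http://www.freedesktop.org/software/systemd/man/systemd.kill.html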

$ bzr diff
=== modified file 'systemd/cloud-final.service'
--- systemd/cloud-final.service 2015-04-09 15:54:01 +0000
+++ systemd/cloud-final.service 2015-11-30 20:17:47 +0000
@@ -8,6 +8,7 @@
 ExecStart=/usr/bin/cloud-init modules --mode=final
 RemainAfterExit=yes
 TimeoutSec=0
+KillMode=process

 # Output needs to appear in instance console output
 StandardOutput=journal+console

Scott Moser (smoser) wrote :

Verified that this works with:
a.) boot wily system with no user-data at all
b.) sudo tee /etc/cloud/cloud.cfg.d/99_poweroff.cfg <<EOF
#cloud-config
runcmd:
 - "echo ------------ exiting fail ------------------; exit 1;"
power_state:
  mode: poweroff
EOF

c.) sudo rm -Rf /var/lib/cloud
d.) sudo reboot
e.) system will come back up and not poweroff due to this bug
f.) sudo sed -i -e 's,TimeoutSec=0,TimeoutSec=0\nKillMode=process,' /lib/systemd/system/cloud-final.service
g.) sudo rm -Rf /var/lib/cloud
h.) sudo reboot

After this reboot the system will boot and then power itself off.
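The `sed` edit in step f can also be expressed as a small text transformation, which makes it easy to test before touching the unit file. The sketch below is only an illustration: `add_kill_mode` is a hypothetical helper, and unlike the `sed` one-liner it matches whole lines and does nothing if the setting is already present.

```python
def add_kill_mode(unit_text, anchor="TimeoutSec=0", setting="KillMode=process"):
    """Return unit_text with `setting` inserted on the line after `anchor`.

    The text is returned unchanged if the setting is already present
    or if no line equals the anchor. Hypothetical helper for illustration.
    """
    lines = unit_text.splitlines()
    if setting in lines or anchor not in lines:
        return unit_text
    patched = []
    for line in lines:
        patched.append(line)
        if line == anchor:
            patched.append(setting)  # same placement as the sed replacement
    trailing = "\n" if unit_text.endswith("\n") else ""
    return "\n".join(patched) + trailing
```

Applying it twice gives the same result as applying it once, so it is safe to re-run on an already patched file.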

Laurence Rowe (lrowe) wrote :

I'm able to work around this on 15.10 by applying smoser's fix as part of my cloud config:

#cloud-config
power_state:
  mode: poweroff
runcmd:
  - set -ex
  - systemctl daemon-reload
  - python3 -c "raise Exception"
write_files:
  - path: /etc/systemd/system/cloud-final.service.d/override.conf
    content: |
      [Service]
      # See https://bugs.launchpad.net/cloud-init/+bug/1449318
      KillMode=process
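To confirm that a drop-in like the one above carries the intended setting, its text can be parsed and inspected. This is a rough sketch under a simplifying assumption: systemd unit files are only INI-like, so Python's `configparser` handles simple files such as this override but is not a full unit-file parser. `kill_mode` is a hypothetical helper name.

```python
import configparser


def kill_mode(unit_text):
    """Return the [Service] KillMode value from unit_text, or None if unset.

    Assumes the text is simple enough for configparser to read.
    """
    # strict=False tolerates repeated keys, which unit files may contain.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.optionxform = str  # keep systemd's case-sensitive key names
    parser.read_string(unit_text)
    return parser.get("Service", "KillMode", fallback=None)
```

For the override content in the workaround, the function returns `process`. It reports only what the file says, not what systemd has actually loaded.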

Scott Moser (smoser) wrote :

This is fixed in trunk at revno 1157.

Changed in cloud-init:
assignee: nobody → Scott Moser (smoser)
status: Confirmed → Fix Committed
Changed in cloud-init (Ubuntu):
status: New → Fix Released
Changed in cloud-init (Ubuntu Wily):
status: New → Fix Released
status: Fix Released → Won't Fix
Changed in cloud-init (Ubuntu Xenial):
status: New → Fix Released
Changed in cloud-init (Ubuntu):
importance: Undecided → High
Changed in cloud-init (Ubuntu Wily):
importance: Undecided → Low
Changed in cloud-init (Ubuntu Xenial):
importance: Undecided → High
Scott Moser (smoser) wrote :

This is fixed in cloud-init 0.7.7

Changed in cloud-init:
status: Fix Committed → Fix Released