package_reboot_if_required breaks power_state: shutdown

Bug #1960360 reported by Linus
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
cloud-init
Invalid
Undecided
Unassigned

Bug Description

I'm using the following User-Data file to provision a Ubuntu Server 20.04 VM(focal-server-cloudimg-amd64):

#cloud-config
hostname: UbuntuServer20.04Template
manage_etc_hosts: true
user: ***
password: ***
ssh_authorized_keys:
  - ssh-ed25519 ***
chpasswd:
  expire: False
#Packages to install
packages:
  - qemu-guest-agent
#Packages and update configuration
package_update: true
package_upgrade: true
package_reboot_if_required: true
#Set locale and timezone
locale: en_US
timezone: Europe/Stockholm
#Setup NTP server
ntp:
  enabled: true
  ntp_client: ntp
  servers:
    - sth1.ntp.se
    - sth2.ntp.se
    - sth3.ntp.se
    - sth4.ntp.se
#Setup apt source
apt:
  primary:
    - arches: [amd64]
      uri: http://se.archive.ubuntu.com/ubuntu/
write_files:
- content: |
    sshd : 10.20.0.1
  path: /etc/hosts.allow
  append: True
- content: |
    #Custom SSH configuration
    Protocol 2
  path: /etc/ssh/sshd_config.d/custom
  permissions: '0600'
  owner: root:root
power_state:
  delay: "+1"
  mode: poweroff
  timeout: 120
  condition: True

The problem is that the VM never shuts down, it does however reboot for package installation. But after that nothing more happens.

However if I also use:
cloud_final_modules:
 - power-state-change
The power_state command works, but then the VM never reboots for packages upgrade.

Revision history for this message
Linus (kagstrom2100) wrote :

Logfile

Revision history for this message
James Falcon (falcojr) wrote :
Download full text (4.9 KiB)

Looking at "cloud-init.log" in the tarball, it looks like the package upgrade reboot never actually happened. A small part of the errors in the log:

2022-02-07 23:45:20,468 - cc_package_update_upgrade_install.py[WARNING]: Rebooting after upgrade or install per /var/run/reboot-required
2022-02-07 23:45:20,470 - subp.py[DEBUG]: Running command ['/sbin/reboot'] with allowed return codes [0] (shell=False, capture=True)
2022-02-07 23:45:20,897 - util.py[WARNING]: Requested reboot did not happen!
2022-02-07 23:45:20,898 - util.py[DEBUG]: Requested reboot did not happen!
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/cloudinit/util.py", line 371, in multi_log
    wfh.flush()
OSError: [Errno 5] Input/output error

Looks like similar errors are spread throughout the log. E.g.:

2022-02-07 23:45:21,143 - util.py[DEBUG]: Running module ssh-authkey-fingerprints (<module 'cloudinit.config.cc_ssh_authkey_fingerprints' from '/usr/lib/python3/dist-packages/cloudinit/config
/cc_ssh_authkey_fingerprints.py'>) failed
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/cloudinit/util.py", line 371, in multi_log
    wfh.flush()
OSError: [Errno 5] Input/output error

During handling of the above exception, another exception occurred:

OSError: [Errno 5] Input/output error

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/cloudinit/stages.py", line 976, in _run_modules
    ran, _r = cc.run(run_name, mod.handle, func_args,
  File "/usr/lib/python3/dist-packages/cloudinit/cloud.py", line 54, in run
    return self._runners.run(name, functor, args, freq, clear_on_fail)
  File "/usr/lib/python3/dist-packages/cloudinit/helpers.py", line 186, in run
    results = functor(*args)
  File "/usr/lib/python3/dist-packages/cloudinit/config/cc_ssh_authkey_fingerprints.py", line 108, in handle
    _pprint_key_entries(user_name, key_fn,
  File "/usr/lib/python3/dist-packages/cloudinit/config/cc_ssh_authkey_fingerprints.py", line 94, in _pprint_key_entries
    util.multi_log(text="%s%s\n" % (prefix, line),
  File "/usr/lib/python3/dist-packages/cloudinit/util.py", line 371, in multi_log
    wfh.flush()
OSError: [Errno 5] Input/output error

Also looks like power module is running as expected:
2022-02-07 23:45:21,327 - helpers.py[DEBUG]: Running config-power-state-change using lock (<FileLock using file '/var/lib/cloud/instances/a98c1df0519498bd2b67aa457761f8b49fbdf4a6/sem/config_p
ower_state_change'>)
2022-02-07 23:45:21,328 - util.py[DEBUG]: Reading from /proc/2180/cmdline (quiet=False)
2022-02-07 23:45:21,328 - util.py[DEBUG]: Read 58 bytes from /proc/2180/cmdline
2022-02-07 23:45:21,328 - cc_power_state_change.py[DEBUG]: After pid 2180 ends, will execute: shutdown -P +1
2022-02-07 23:45:21,329 - util.py[DEBUG]: Forked child 8872 who will run callback run_after_pid_gone
2022-02-07 23:45:21,339 - util.py[DEBUG]: Reading from /proc/2180/cmdline (quiet=False)
2022-02-07 23:45:21,340 - util.py[DEBUG]: Read 58 bytes from /proc/2180/cmdline
2022-02-07 23:45:21,346 - handlers.py[DEBUG]: finish: modules-final/config-power-state-change: SUCCE...

Read more...

Changed in cloud-init:
status: New → Incomplete
Revision history for this message
Chad Smith (chad.smith) wrote :

Thanks for the logs.
Tracebacks in cloud-init.log point to an I/O error running /sbin/reboot on this system and maybe an NTP problem too

Command: ['/sbin/reboot']
11:42 Exit code: -
11:42 Reason: [Errno 5] Input/output error
11:42 that's the symptom/problem on this system
11:43 grep Traceback /var/log/cloud-init.log

maybe unrelated, but seeing ntp errors too in journalctl -b 0
s "kernel reports TIME_ERROR: 0x41: Clock Unsynchronized"

Revision history for this message
Linus (kagstrom2100) wrote :

@James Falcon I've got no idea, maybe it's the cloud-image I'm running? Could you try running my User-Config file and see if it works for you? For reference I'm running the following bash script to deploy the VM:

#!/bin/bash

#Create Template Script

#Path to script folder "/var/cloud-init"

VMID=9000
VMNAME=UbuntuServer2004Template

cd /var/cloud-init

wget -N https://cloud-images.ubuntu.com/focal/current/focal-server-cloudimg-amd64.img -O images/focal-server-cloudimg-amd64.img
qm create $VMID --memory 2048 --net0 virtio,bridge=vmbr0 --net1 virtio,bridge=vmbr1 --cores 1 --name $VMNAME
qm importdisk $VMID images/focal-server-cloudimg-amd64.img local-zfs
qm set $VMID --scsihw virtio-scsi-pci --scsi0 local-zfs:vm-$VMID-disk-0
qm set $VMID --ide2 local-zfs:cloudinit
qm set $VMID --boot c --bootdisk scsi0
#Path to local "/var/lib/vz"
qm set $VMID --cicustom "user=local:snippets/UbuntuServerTemplate-UserConfig.yml,network=local:snippets/UbuntuServerTemplate-NetworkConfig.yml"
qm start $VMID

Changed in cloud-init:
status: Incomplete → Opinion
status: Opinion → Invalid
status: Invalid → Incomplete
Revision history for this message
Linus (kagstrom2100) wrote :

Just fixed this! Needed to add a serial socket to the VM according to https://pve.proxmox.com/wiki/Cloud-Init_Support this, didn't think I needed it as everything else seemed to work fine!

So I just added the following to my bash script:

qm set $VMID --serial0 socket --vga serial0

Revision history for this message
James Falcon (falcojr) wrote :

Glad you got it figured out. I'm going to close this bug then. Feel free to re-open if there's still more to address here.

Changed in cloud-init:
status: Incomplete → Invalid
Revision history for this message
James Falcon (falcojr) wrote :
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.