cloud-init restarts during package upgrade when it shouldn't, regenerating SSH host keys on Azure Focal
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
cloud-init (Ubuntu) |
Fix Released
|
High
|
Unassigned | ||
Focal |
Fix Released
|
High
|
Unassigned |
Bug Description
=== Begin SRU Template ===
[Impact]
In 0f6db8c5 first included in focal in version 22.3-13-
# Automatically added by dh_systemd_
if [ "$1" = "configure" ] || [ "$1" = "abort-upgrade" ] || [ "$1" = "abort-deconfigure" ] || [ "$1" = "abort-remove" ] ; then
if [ -d /run/systemd/system ]; then
systemctl --system daemon-reload >/dev/null || true
if [ -n "$2" ]; then
deb-
fi
fi
fi
This script makes cloud-init restart on upgrade with services running interleaved and out of order. On older Azure instances, this creates a race on upgrade that can lead to ssh host keys being replaced.
[Test Case]
1.
- Launch a focal instance
- Ensure /var/lib/
- Ensure /var/log/
2.
- Upgrade a focal instance from pre-22.3 to the latest version
- Ensure /var/lib/
- Ensure /var/log/
Verification Test script
#!/bin/bash
set -ex
echo ==== Focal Verification of https:/
# Create upgrade to proposed script
cat > setup_proposed.sh <<EOF
#/bin/bash
mirror=http://
echo deb \$mirror \$(lsb_release -sc)-proposed main | tee /etc/apt/
apt-get update -q;
apt-get install -qy cloud-init;
EOF
lxc launch ubuntu-daily:focal sru-f
while ! lxc exec sru-f -- cloud-init status --wait --long; do
sleep 5
done
echo "=== OLD version Before cloud-init upgrade"
lxc exec sru-f -- cloud-init -v
echo "=== Track number of boot events cloud-init saw before upgrade from 22.4.2"
BOOT_RECORDS=$(lxc exec sru-f -- cloud-init analyze show | grep 'boot record')
echo "=== OLD version postinst contains try-restart logic to restart cloud-init services"
lxc exec sru-f grep try-restart /var/lib/
echo "=== Upgrade to -proposed version of cloud-init: ....20.04.2"
lxc file push setup_proposed.sh sru-f/
lxc exec sru-f -- bash /setup_proposed.sh
lxc exec sru-f -- cloud-init -v
echo "=== Number of boot events from cloud-init service restart should not have changed"
NEW_BOOT_
if [ "$BOOT_RECORDS" == "$NEW_BOOT_RECORDS" ]; then
echo SUCCESS same number of boot records across upgrade
else
echo FAILURE: different boot records seen across upgrade $NEW_BOOT_RECORDS
fi
echo "=== NEW version postinst does NOT contain try-restart logic for cloud-init services"
lxc exec sru-f grep try-restart /var/lib/
echo === No new log messages indicating NEW version of cloud-init was restarted
lxc exec sru-f grep 22.4.2-
[Regression Potential]
It is possible some functionality added in version 10 of debhelper is no longer available. Since this change was made in version 22.3 of cloud-init, this would be limited to focal versions of cloud-init 22.3 and 22.4. The control.tar.gz when building with debhelper compat 9 vs 10 has been examined to ensure there are no other functional changes to the output besides the fix for this regression and this analysis is included in https:/
[Other Info]
PR with change: https:/
Potentially related debhelper bug: https:/
=== End SRU Template ===
description: | updated |
description: | updated |
description: | updated |
description: | updated |
Changed in cloud-init (Ubuntu): | |
status: | New → In Progress |
Changed in cloud-init (Ubuntu Focal): | |
status: | New → In Progress |
summary: |
- cloud-init restarts after upgrade when it shouldn't + cloud-init restarts after upgrade when it shouldn't, regenerating SSH + host keys on Azure Focal |
summary: |
- cloud-init restarts after upgrade when it shouldn't, regenerating SSH - host keys on Azure Focal + cloud-init restarts during package upgrade when it shouldn't, + regenerating SSH host keys on Azure Focal |
description: | updated |
description: | updated |
description: | updated |
Changed in cloud-init (Ubuntu Focal): | |
importance: | Undecided → High |
Changed in cloud-init (Ubuntu): | |
importance: | Undecided → High |
description: | updated |
Changed in cloud-init (Ubuntu): | |
status: | In Progress → Fix Released |
In discussion with @vorlon. general sentiment for making debhelper compat version changes in stable releases is a no-no because of this type of bug being introduced by automated debhelper injected content in the binary package. In order to verify that we don't have a potential to regress something else by moving back from debhelper-compat 10 to debhelper-compat 9 we need to also analyze a diff of any debian control files that are automatically generated.
the diff of such control files is here and obtained by the following procedure.
``` 0ubuntu0~ 20.04.1; smith/cloud- init-uploads -y; smith/cloud- init-uploads cloud-init 22.4.2- 0ubuntu0~ 20.04.2
lxc launch ubuntu-dev-tools sru-f;
lxc exec sru-f bash
$ apt install ubuntu-dev-tools -y;
$ pull-lp-debs cloud-init 22.4.2-
$ add-apt-repository ppa:chad.
$ pull-ppa-debs ppa:chad.
$ dpkg-deb -R cloud-init_ 22.4.2- 0ubuntu0~ 20.04.1_ all.deb 20.04.1 22.4.2- 0ubuntu0~ 20.04.2_ all.deb 20.04.2
$ dpkg-deb -R cloud-init_
$ diff -urN 20.04.1 20.04.2 | pastebinit
```
Resulting pastebin of that effort on focal:
https:/ /paste. ubuntu. com/p/pCWx2tCNV q/
Analysis on the potential downgrade exposure paths for debhelper compat 10 back to 9 to ensure no dragons lurk in the debhelper compat downgrade path for any control scripts.
== debian/control 0ubuntu0~ 20.04.1 0ubuntu0~ 20.04.2
version changes and a Installed Size increase
-Version: 22.4.2-
+Version: 22.4.2-
== DEBIAN/md5sums deltas: lib/python3/ dist-packages/ cloudinit/ version. py # changed due to .2 vs .1 in pkg'd version string share/doc/ cloud-init/ changelog. Debian. gz
usr/
usr/
== DEBIAN/postinst
-- postinst.1 dropped the try-restart section which caused the original restart of services across package upgrade here when we moved to debhelper-compat 10
-# Automatically added by dh_systemd_ start/12. 10ubuntu1 config. service' 'cloud- config. target' 'cloud- final.service' 'cloud- init-hotplugd. service' 'cloud- init-hotplugd. socket' 'cloud- init-local. service' 'cloud- init.service' 'cloud-init.target' >/dev/null || true
-if [ "$1" = "configure" ] || [ "$1" = "abort-upgrade" ] || [ "$1" = "abort-deconfigure" ] || [ "$1" = "abort-remove" ] ; then
- if [ -d /run/systemd/system ]; then
- systemctl --system daemon-reload >/dev/null || true
- if [ -n "$2" ]; then
- deb-systemd-invoke try-restart 'cloud-
- fi
- fi
-fi
-- postinst.2: identical dh_python3 section is a little earlier in the file in deb compat level 9
+
+# Automatically added by dh_python3:
+if which py3compile >/dev/null 2>&1; then
+ py3compile -p cloud-init
+fi
+if which pypy3compile >/dev/null 2>&1; then
+ pypy3compile -p cloud-init || true
+fi
+
+# End automatically added section
-- postinst.3: remove escaping of tilde '\~' in all rm_conffile lines in debhelper-compat when moving back to compat level 9
compat lvl 10: dpkg-maintscrip t-helper rm_conffile /etc/init/ cloud-config. conf 0.7.9-243- ge74d775- 0ubuntu2\ ~ -- "$@"
compat lvl 9: dpkg-maintscrip t-helper rm_conffile /etc/init/ cloud-config. conf 0.7.9-243- ge74d775- 0ubuntu2~ -- "$@"
--- postinst.4: Additional bookend comments around each rm_conffile line
+# Autom...