cloud-init might incorrectly consider reboot as new-instance during kernel upgrade or downgrade

Bug #1835584 reported by Anh Vo (MSFT) on 2019-07-05
This bug affects 1 person
Affects: cloud-init
Importance: Undecided
Assigned to: Chad Smith

Bug Description

Between the 4.14 and 4.15 kernels, the commit below changed the product UUID of a VM from uppercase to lowercase. Data sources that use this value as the instance-id (e.g., Azure) will go through the new-instance code path on the reboot following a kernel upgrade or downgrade that crosses this change. This is a problem for customers who provision with a password on Azure: because the password is not saved on disk, new-instance provisioning disables password access to the VM in that case.

Commit:
https://github.com/torvalds/linux/commit/712ff25450bd01366301eef81c33e865d901e7b7#diff-f2bd14bc67b5e2da67116bca971bbd0b

Repro Steps:
Deploy the latest 18.04-LTS VM on Azure (kernel version is 4.18.0-1023-Azure as of July 5th 2019).
Downgrade the kernel to 4.14.119 (using the .deb packages from https://kernel.ubuntu.com/~kernel-ppa/mainline/v4.14.119/).
Configure grub to boot into the 4.14 kernel.
Reboot and observe in the cloud-init log that a new-instance first boot is happening.

In my VM I can see that the casing of the product uuid changed:
4.18 kernel:
$ cat /sys/devices/virtual/dmi/id/product_uuid
1fd1b593-e79e-724c-9b33-d8634642d5f5
$ ls /var/lib/cloud/instances
1fd1b593-e79e-724c-9b33-d8634642d5f5

After downgrade:
$ cat /sys/devices/virtual/dmi/id/product_uuid
1FD1B593-E79E-724C-9B33-D8634642D5F5
$ ls /var/lib/cloud/instances
1FD1B593-E79E-724C-9B33-D8634642D5F5 1fd1b593-e79e-724c-9b33-d8634642d5f5
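
To illustrate why two directories appear, here is a minimal sketch using the two values shown above. It only demonstrates that the strings differ as text while denoting the same UUID; `same_uuid` is a hypothetical helper written for this example, not a cloud-init function.

```python
import uuid

# Values copied from the output above.
uuid_418 = "1fd1b593-e79e-724c-9b33-d8634642d5f5"  # 4.18 kernel
uuid_414 = "1FD1B593-E79E-724C-9B33-D8634642D5F5"  # after downgrade to 4.14


def same_uuid(a: str, b: str) -> bool:
    """Hypothetical helper: compare two UUID strings by value, ignoring case."""
    return uuid.UUID(a) == uuid.UUID(b)


# A plain string comparison treats them as different ids,
# which matches the two entries under /var/lib/cloud/instances.
print(uuid_418 == uuid_414)
# Parsed as UUIDs, they are the same identifier.
print(same_uuid(uuid_418, uuid_414))
```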

DataSourceAzure.py already uses instance_id_matches_system_uuid, which converts the uuid to lowercase, to compare the instance-id:

def check_instance_id(self, sys_cfg):
    # quickly (local check only) if self.instance_id is still valid
    return sources.instance_id_matches_system_uuid(self.get_instance_id())

However, the issue lies in the is_new_instance() method in stages.py, which does not convert the uuid to lowercase before comparing. As a result, is_new_instance() returns True when it should return False. This affects methods such as apply_network_config, setup, and activate.
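
The comparison described above could be made case-insensitive along the following lines. This is only a sketch under stated assumptions: `normalize_instance_id` and `is_new_instance_sketch` are hypothetical names, the code is not the actual stages.py implementation or the fix that shipped, and whether lowercasing is safe for every datasource is not something this report establishes.

```python
from typing import Optional


def normalize_instance_id(instance_id: str) -> str:
    """Hypothetical helper: fold case so that kernel-dependent
    uppercase/lowercase variants of one UUID compare equal."""
    return instance_id.strip().lower()


def is_new_instance_sketch(current_iid: str, previous_iid: Optional[str]) -> bool:
    """Return True only if the instance id really changed.

    previous_iid is None when no earlier instance id was recorded,
    which this sketch treats as a genuine first boot.
    """
    if previous_iid is None:
        return True
    return normalize_instance_id(current_iid) != normalize_instance_id(previous_iid)


# The ids from this report differ only in case, so this is not a new instance.
print(is_new_instance_sketch(
    "1FD1B593-E79E-724C-9B33-D8634642D5F5",
    "1fd1b593-e79e-724c-9b33-d8634642d5f5",
))
```

Normalizing both sides at comparison time, rather than renaming what is already stored on disk, keeps the existing /var/lib/cloud/instances entry valid across the kernel change.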

Paride Legovini (paride) on 2019-07-08
Changed in cloud-init:
status: New → Triaged
Chad Smith (chad.smith) on 2021-01-29
Changed in cloud-init:
assignee: nobody → Chad Smith (chad.smith)
status: Triaged → In Progress

This bug is believed to be fixed in cloud-init version 21.1. If this is still a problem for you, please add a comment and set the state back to New.

Thank you.

Changed in cloud-init:
status: In Progress → Fix Released