Comment 7 for bug 1938299

Chad Smith (chad.smith) wrote :

This is the same symptom as an issue already raised for GCP.

The following commit[1] introduced this change in behavior by correcting a bug in which cloud-init did not apply networking changes in its "init" stage after writing files. This version of cloud-init 2.3.

google-guest-agent and cloud-init now race to set up networking on a newly booting VM: cloud-init 21.2-43-g184c836a-0ubuntu1 introduced a change that calls "netplan apply" after cloud-init writes fallback network configuration files, which request DHCP on the primary NIC.
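
To check whether an image carries a cloud-init build at or after the one named above, a version comparison along these lines may help. This is only a sketch: `version_at_least` is a hypothetical helper, it relies on GNU `sort -V` rather than dpkg's own comparison rules, and treating every later version as affected is an assumption that stops holding once a fix is released.

```shell
# Hypothetical helper, not part of cloud-init or google-guest-agent.
# Succeeds (exit 0) when version $1 sorts at or after version $2.
version_at_least() {
    # sort -V orders version strings; if $2 comes out first, then $1 >= $2.
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# First build named in this comment as calling "netplan apply" in the init stage.
first_affected="21.2-43-g184c836a-0ubuntu1"

# Example use on an Ubuntu instance (left commented out so nothing runs on paste):
# installed="$(dpkg-query -W -f='${Version}' cloud-init)"
# if version_at_least "$installed" "$first_affected"; then
#     echo "cloud-init $installed may be affected"
# fi
```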

We were able to reproduce this issue with the latest images rolled out to GCP, which contain cloud-init 21.3. The `netplan apply` called by cloud-init disrupts the dhclient calls that google-guest-agent uses to create the initial network config and to set up Google project users and keys on the VM: https://github.com/GoogleCloudPlatform/guest-agent#network

The cloud-init team is working on this issue as a top priority.
Workarounds for this issue are:
1. rebooting the instance after launch, which allows google-guest-agent.service to come back up without cloud-init interaction

2. launching a VM with the updated cloud-init 21.3 and the following user-data:

cat > cloudinit_start_google_guest_agent.yaml <<EOF
#cloud-config
runcmd:
 - systemctl start google-guest-agent.service
EOF

root@publishing-f:~# gcloud compute instances create sf2-impish-v20200720 --zone us-west1-a --network "default" --no-restart-on-failure --image-project ubuntu-os-cloud-devel --image daily-ubuntu-2110-impish-v20210720 --machine-type n1-standard-2 --metadata-from-file user-data=cloudinit_start_google_guest_agent.yaml

# see ssh working on that image launch
gcloud compute ssh sf2-impish-v20200720 --zone us-west1-a
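
Once SSH works, the workaround can be double-checked by asking the instance for its cloud-init version and the state of the guest agent unit. The wrapper below is a minimal sketch: `check_guest_agent` is a hypothetical name introduced here, and it assumes `gcloud` is already authenticated against the project that owns the instance.

```shell
# Hypothetical helper, not from the original comment: run two read-only
# checks on the instance over SSH.
#   $1 = instance name, $2 = zone
check_guest_agent() {
    instance="$1"
    zone="$2"
    # Print the installed cloud-init version, then report whether
    # google-guest-agent.service is currently active.
    gcloud compute ssh "$instance" --zone "$zone" \
        --command 'cloud-init --version; systemctl is-active google-guest-agent.service'
}

# Example, using the instance launched above (commented out so nothing runs on paste):
# check_guest_agent sf2-impish-v20200720 us-west1-a
```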