google-startup-scripts runs before cloud-init finished network setup
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
google-guest-agent (Ubuntu) | Confirmed | Undecided | Unassigned |
Xenial | New | Undecided | Unassigned |
Bionic | New | Undecided | Unassigned |
Focal | New | Undecided | Unassigned |
Jammy | New | Undecided | Unassigned |
Mantic | Won't Fix | Undecided | Unassigned |
Noble | Confirmed | Undecided | Unassigned |
Bug Description
[ Impact ]
In certain situations (consistently with ubuntu-pro=31.2 and cloud-init=23.4.4), cloud-config.
Since pro and cloud-init are backported to all older releases, this bug will affect them too.
The change that results in this race condition is the removal of an ordering condition between pro and cloud-init, so adding `After=
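For local experimentation, a systemd drop-in is one way to add an `After=` ordering to a unit without editing its packaged unit file. The sketch below is a minimal illustration under stated assumptions: the function name `write_ordering_dropin`, the file name `10-ordering.conf`, and both unit names are hypothetical placeholders, not values taken from this report, and the demonstration writes to a scratch directory instead of `/etc/systemd/system`.

```shell
#!/bin/bash
# Sketch: write a systemd drop-in that adds an After= ordering to a unit.
# All unit and file names below are illustrative placeholders.

# write_ordering_dropin DROPIN_ROOT UNIT AFTER_UNIT
# Creates DROPIN_ROOT/UNIT.d/10-ordering.conf containing an [Unit] section
# that orders UNIT after AFTER_UNIT.
write_ordering_dropin() {
    local dropin_root="$1"   # on a real system: /etc/systemd/system
    local unit="$2"          # unit whose start should be delayed
    local after_unit="$3"    # unit that must be started first
    local dir="${dropin_root}/${unit}.d"

    mkdir -p "$dir"
    printf '[Unit]\nAfter=%s\n' "$after_unit" > "${dir}/10-ordering.conf"
}

# Demonstration against a scratch directory so the host is not modified.
scratch="$(mktemp -d)"
write_ordering_dropin "$scratch" "example-late.service" "example-early.service"
cat "${scratch}/example-late.service.d/10-ordering.conf"
```

On a real instance the drop-in root would be `/etc/systemd/system`, and `systemctl daemon-reload` is needed before systemd picks up the new ordering.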
[ Test Plan ]
To reproduce:
Using startup_script.sh:
#!/bin/bash
cp /etc/apt/
$ gcloud compute instances create startup-test --image daily-ubuntu-
[...]
$ ssh [INSTANCE IP]
> diff /tmp/startup-
0a1,8
> ## Note, this file is written by cloud-init on first boot of an instance
> ## modifications made here will not survive a re-bundle.
> ## if you wish to make changes you can:
> ## a.) add 'apt_preserve_
> ## or do the same in user-data
> ## b.) add sources in /etc/apt/
> ## c.) make changes to template file /etc/cloud/
>
3,4c11,12
< deb http://
< # deb-src http://
---
> deb http://
> # deb-src http://
8,9c16,17
< deb http://
< # deb-src http://
---
[...]
Since this bug particularly affects first boot (once sources.list is configured with the GCE mirrors on first boot, it remains correctly configured), the best way to verify the fix is to create an image with pro pinned at 31.2, cloud-init pinned at 23.4.4, and google-guest-agent installed from proposed. The test would be:
1. Create an instance with startup script as above
$ gcloud compute instances create startup-test --image [IMAGE_NAME] --image-project [IMAGE PROJECT] --metadata-
2. SSH into the instance and verify pro/cloud-
> pro --version
32.1~[RELEASE]
> cloud-init --version
/usr/bin/cloud-init 23.4.4-
> apt-cache policy google-guest-agent
[ensure from -proposed]
3. Verify startup script ran correctly after cloud-config.
> diff /tmp/startup-
>
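The pass condition in step 3 is that the snapshot taken by the startup script is identical to the file as it exists after boot, so the diff is empty. A minimal sketch of that check follows; the function names `files_match` and `report`, the scratch file names, and the sample mirror line are hypothetical and stand in for the real paths used in the test plan.

```shell
#!/bin/bash
# Sketch of the pass/fail check in step 3. File names are placeholders.

# files_match SNAPSHOT CURRENT
# Succeeds when the two files are byte-for-byte identical.
files_match() {
    cmp -s "$1" "$2"
}

# report SNAPSHOT CURRENT
# Prints a one-line verdict for the comparison.
report() {
    if files_match "$1" "$2"; then
        echo "PASS: snapshot matches current file"
    else
        echo "FAIL: file changed after the startup script ran"
    fi
}

# Demonstration with scratch files instead of real system paths.
workdir="$(mktemp -d)"
printf 'deb http://mirror.example/ubuntu release main\n' > "$workdir/snapshot"
cp "$workdir/snapshot" "$workdir/current"
same_result="$(report "$workdir/snapshot" "$workdir/current")"

# Simulate the file being rewritten after the snapshot was taken.
printf '## header added later\n' >> "$workdir/current"
changed_result="$(report "$workdir/snapshot" "$workdir/current")"

echo "$same_result"
echo "$changed_result"
```

A FAIL verdict corresponds to the non-empty diff shown in the reproduction steps, where the snapshot was taken before cloud-init rewrote the file.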
[ Where problems could occur ]
Since this introduces a new ordering constraint, it will likely have performance impacts (google-
[ Other Info ]
Original bug report retained below.
New GCP dailies are failing startup-script tests because configuration performed by cloud-init (apt sources, for example) has not completed by the time startup scripts run. The failure can be reproduced as follows:
Using startup_script.sh:
#!/bin/bash
cp /etc/apt/
$ gcloud compute instances create startup-test --image daily-ubuntu-
[...]
$ ssh [INSTANCE IP]
> diff /tmp/startup-
0a1,8
> ## Note, this file is written by cloud-init on first boot of an instance
> ## modifications made here will not survive a re-bundle.
> ## if you wish to make changes you can:
> ## a.) add 'apt_preserve_
> ## or do the same in user-data
> ## b.) add sources in /etc/apt/
> ## c.) make changes to template file /etc/cloud/
>
3,4c11,12
< deb http://
< # deb-src http://
---
> deb http://
> # deb-src http://
8,9c16,17
< deb http://
< # deb-src http://
---
[...]
On earlier images (such as ubuntu-
On v20240307 (startup scripts execute correctly):
catred@
The time when unit became active or started is printed after the "@" character.
The time the unit took to start is printed after the "+" character.
google-
└─multi-user.target @28.480s
└─ubuntu-
└─cloud-
└
On v20240314 (startup scripts fail):
catred@
The time when unit became active or started is printed after the "@" character.
The time the unit took to start is printed after the "+" character.
google-
└─multi-user.target @29.237s
└─chrony.service @30.240s +56ms
└─basic.target @13.364s
└
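The two critical chains differ in which units the startup-script unit ends up waiting on. Besides `systemd-analyze critical-chain`, one way to check whether a given ordering is in effect is to read a unit's `After=` property with `systemctl show -p After <unit>`. The sketch below only parses a line of that form; the function name `has_after_dependency`, the unit names, and the sample line are made up for illustration and are not output from the affected images.

```shell
#!/bin/bash
# Sketch: test whether a unit name appears in an "After=..." property line.

# has_after_dependency PROPERTY_LINE UNIT
# PROPERTY_LINE has the form printed by `systemctl show -p After <unit>`,
# i.e. "After=a.service b.target ...". Succeeds if UNIT is listed.
has_after_dependency() {
    local line="$1" wanted="$2" unit
    # Intentionally unquoted: split the space-separated unit list into words.
    for unit in ${line#After=}; do
        [ "$unit" = "$wanted" ] && return 0
    done
    return 1
}

# Made-up sample line. On a real instance it could be obtained with:
#   systemctl show -p After example-late.service
sample="After=basic.target example-early.service network-online.target"

if has_after_dependency "$sample" "example-early.service"; then
    result_present="ordered after example-early.service"
else
    result_present="no ordering on example-early.service"
fi
echo "$result_present"
```

An `After=` entry only expresses ordering; it does not by itself pull the other unit into the boot transaction.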
This can be fixed by adding an explicit `After=
Status changed to 'Confirmed' because the bug affects multiple users.