Unable to SSH Into Instance when deploying Impish 21.10
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
cloud-init (Ubuntu) | Fix Released | Critical | Unassigned | |
Bionic | Fix Released | Undecided | Unassigned | |
Focal | Fix Released | Undecided | Unassigned | |
Hirsute | Fix Released | Undecided | Unassigned | |
Impish | Fix Released | Critical | Unassigned | |
google-guest-agent (Ubuntu) | Fix Released | Medium | Unassigned | |
Bionic | Fix Released | Undecided | Unassigned | |
Focal | Fix Released | Undecided | Unassigned | |
Hirsute | Won't Fix | Undecided | Unassigned | |
Impish | Won't Fix | Medium | Unassigned | |
netplan.io (Ubuntu) | Fix Released | Undecided | Unassigned | |
Bionic | Won't Fix | Undecided | Unassigned | |
Focal | Won't Fix | Undecided | Unassigned | |
Hirsute | Won't Fix | Undecided | Unassigned | |
Impish | Won't Fix | Undecided | Unassigned | |
Bug Description
=== Begin SRU Template ===
[Impact]
In PR #919 (81299de), we refactored some of the code used to bring up networks across distros. Previously, the call to bring up network interfaces during 'init' stage unintentionally resulted in a no-op such that network interfaces were NEVER brought up by cloud-init, even if new network interfaces were found after crawling the metadata.
In #919, the code was altered to bring up these discovered network interfaces. On Ubuntu, this results in a 'netplan apply' call during 'init' stage for any Ubuntu-based distro on a datasource that has a NETWORK dependency. On GCE, this additional 'netplan apply' conflicts with the google-guest-agent service: because of that project's PartOf= systemd relationship, the service is shut down, resulting in an instance that cannot be connected to.
To fix this, we added a new 'disable_
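The PartOf= interaction described in the impact statement can be illustrated with a small model. This is only a sketch of the documented systemd behaviour (a stop or restart of a listed unit is propagated to the unit that declares PartOf=), not code from cloud-init or google-guest-agent. The unit name `network-backend.service` is a hypothetical placeholder, since the report does not say which units the agent is PartOf.

```python
from typing import Dict, List, Set


def units_affected_by(target: str, part_of: Dict[str, List[str]]) -> Set[str]:
    """Return units that receive a propagated stop/restart when `target` does.

    `part_of` maps each unit to the units named in its PartOf= setting.
    Propagation is followed repeatedly, so chains of PartOf= are covered.
    """
    affected = {target}
    changed = True
    while changed:
        changed = False
        for unit, parents in part_of.items():
            if unit not in affected and any(p in affected for p in parents):
                affected.add(unit)
                changed = True
    return affected - {target}


# Hypothetical relationship: the agent declares PartOf= on a network unit
# that gets restarted as a side effect of 'netplan apply'.
PART_OF = {
    "google-guest-agent.service": ["network-backend.service"],
    "unrelated.service": [],
}

print(units_affected_by("network-backend.service", PART_OF))
```

Under this model, restarting the network unit reaches the guest agent even though nothing acted on the agent directly, which matches the symptom reported here.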
[Test Case]
An integration test has been added at `tests/
1. Launch an instance on GCE
2. Install the cloud-init version with the fix
3. Add a file, '/etc/cloud/
disable_
4. Run cloud-init clean --logs
5. Create a new image based on this instance
6. Launch a new instance based on the new image
7. The instance should launch successfully and be reachable over SSH
8. "['netplan', 'apply']" should not be present anywhere in /var/log/
9. "Bringing up newly configured network interfaces" should not exist anywhere in /var/log/
In the failure case, we will fail at step 7.
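Steps 8 and 9 can be checked mechanically. The following is a minimal sketch that assumes the log contents have already been read into a string; the log path is not spelled out here because it is truncated in the steps above. The function name is illustrative only.

```python
from typing import List

# Substrings taken verbatim from steps 8 and 9 of the test case.
FORBIDDEN_MARKERS = (
    "['netplan', 'apply']",
    "Bringing up newly configured network interfaces",
)


def find_activation_markers(log_text: str) -> List[str]:
    """Return the forbidden markers found in the log (empty list means pass)."""
    return [marker for marker in FORBIDDEN_MARKERS if marker in log_text]


# Example with made-up log lines: the first would fail step 8, the second passes.
print(find_activation_markers("Running command ['netplan', 'apply'] ..."))
print(find_activation_markers("Applying network configuration"))
```

An empty result for the real log corresponds to steps 8 and 9 passing; step 7 still has to be verified by actually connecting to the instance.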
[Regression Potential]
The code in question determines whether to bring up interfaces after applying network config. Accidentally not bringing them up should not be a problem, as we previously (unintentionally) did not bring these interfaces up. Accidentally bringing up interfaces when we shouldn't also generally shouldn't cause a large problem outside of GCE, because outside of GCE there aren't (that we're aware of) other processes independently setting up the network. If this setup determination code somehow fails, it happens early enough in boot that it could leave an instance unusable. However, the code is small enough and defensive enough that we don't believe that is a possibility.
Additionally, cloud datasources that are discovered in the `init-local` stage (Azure, Ec2, Hetzner, IBMCloud, OpenStack and Oracle) aren't exposed to this code path, because full network config is emitted before system network is brought up, so there is no need to call `netplan apply` at that time.
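The reasoning above amounts to a small decision rule, sketched here as a toy model. It is not cloud-init's implementation: the function name and the `activation_disabled` parameter are hypothetical stand-ins (the real option name is truncated in this report), and the datasource names are simply the ones listed above.

```python
from typing import FrozenSet

# Datasources the report lists as discovered in the init-local stage.
INIT_LOCAL_DATASOURCES: FrozenSet[str] = frozenset(
    {"Azure", "Ec2", "Hetzner", "IBMCloud", "OpenStack", "Oracle"}
)


def may_activate_in_init(datasource: str, activation_disabled: bool = False) -> bool:
    """Model whether interfaces could be brought up during the 'init' stage.

    Datasources found in init-local emit full network config before the
    system network comes up, so they never reach this path. For the others,
    the (hypothetical) opt-out flag decides.
    """
    if datasource in INIT_LOCAL_DATASOURCES:
        return False
    return not activation_disabled


print(may_activate_in_init("GCE"))                            # True
print(may_activate_in_init("GCE", activation_disabled=True))  # False
print(may_activate_in_init("Azure"))                          # False
```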
[Other Info]
Github PR: https:/
Upstream commit: https:/
=== End SRU Template ===
Original bug report:
Google Instances deployed with the Ubuntu 21.10 Daily images are inaccessible via SSH.
gcloud compute instances create sf-impish-v20200720 --zone us-west1-a --network "default" --no-restart-
This results in a successful deploy, yet the instance is inaccessible via SSH from the end user's configured laptop.
This appears to affect all daily images after 20210719.
daily-ubuntu-
daily-ubuntu-
daily-ubuntu-
daily-ubuntu-
daily-ubuntu-
daily-ubuntu-
daily-ubuntu-
This problem also appears to be reproducible via the gcloud UI, create a new virtual machine using the daily-ubuntu-
The google-
google-
Loaded: loaded (/lib/systemd/
Active: active (running) since Tue 2021-07-27 19:47:48 UTC; 18h ago
Main PID: 711 (google_guest_ag)
Tasks: 9 (limit: 8924)
Memory: 19.7M
CGroup: /system.
└─711 /usr/bin/
Jul 27 19:47:55 sean-imp gpasswd[1469]: user google added by root to group floppy
Jul 27 19:47:55 sean-imp gpasswd[1475]: user google added by root to group audio
Jul 27 19:47:55 sean-imp gpasswd[1481]: user google added by root to group dip
Jul 27 19:47:55 sean-imp gpasswd[1487]: user google added by root to group video
Jul 27 19:47:55 sean-imp gpasswd[1493]: user google added by root to group plugdev
Jul 27 19:47:55 sean-imp gpasswd[1499]: user google added by root to group netdev
Jul 27 19:47:55 sean-imp gpasswd[1505]: user google added by root to group lxd
Jul 27 19:47:55 sean-imp gpasswd[1511]: user google added by root to group google-sudoers
Jul 27 19:47:55 sean-imp GCEGuestAgent[711]: 2021-07-
Jul 27 19:47:55 sean-imp google_
description: | updated |
summary: |
- Unable to SSH Into Instance when deploying Impish 12.10
+ Unable to SSH Into Instance when deploying Impish 21.10 |
tags: | added: rls-ii-incoming |
tags: | added: fr-1631 |
tags: | removed: rls-ii-incoming |
tags: | added: impish |
description: | updated |
Changed in netplan.io (Ubuntu Bionic): | |
status: | New → Won't Fix |
Changed in netplan.io (Ubuntu Focal): | |
status: | New → Won't Fix |
Changed in netplan.io (Ubuntu Hirsute): | |
status: | New → Won't Fix |
Changed in netplan.io (Ubuntu Impish): | |
status: | New → Won't Fix |
description: | updated |
description: | updated |
description: | updated |
[Summary]
I believe this problem is related to a change in behavior in cloud-init version 21.2-43-g184c836a-0ubuntu1 due to comparing two different daily impish images before and after this update.
The problem appears in a GCP account which currently has multiple global SSH keys associated with different users. For example, we have keys for a 'testuser' and 'testuser2' account. When booting the older serial (as well as older releases), we see accounts created for 'testuser', 'testuser2' and 'ubuntu'. For the newer serial, we only see an account for 'ubuntu'. As our test automation uses one of the 'testuser' keys, it can no longer access impish VMs.
I've included the package list below from my two test systems. The google-agent packages are included as well, since these could also be suspect.
[Expected behavior]
If I have a GCP account with global SSH keys associated with non-ubuntu users, I expect those users to be present in the VM after launch and .ssh/authorized_keys updated with those public keys.
[Current behavior]
Only the 'ubuntu' user is being created
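The gap between expected and current behavior can be expressed as a set difference. The sketch below assumes the project-wide key metadata is a block of lines in the form `USERNAME:KEY`; that format, the placeholder key material, and the function names are assumptions for illustration, not details taken from this report.

```python
from typing import Iterable, Set


def expected_users(ssh_keys_metadata: str) -> Set[str]:
    """Collect usernames from lines shaped like 'USERNAME:KEY' (assumed format)."""
    users = set()
    for line in ssh_keys_metadata.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue
        username, _key = line.split(":", 1)
        if username:
            users.add(username)
    return users


def missing_users(ssh_keys_metadata: str, present_accounts: Iterable[str]) -> Set[str]:
    """Return users that have a key in metadata but no account on the VM."""
    return expected_users(ssh_keys_metadata) - set(present_accounts)


# Placeholder metadata mirroring the 'testuser' / 'testuser2' example.
METADATA = """\
testuser:ssh-rsa AAAA-placeholder testuser
testuser2:ssh-rsa AAAA-placeholder testuser2
"""

print(sorted(missing_users(METADATA, ["ubuntu"])))
print(sorted(missing_users(METADATA, ["ubuntu", "testuser", "testuser2"])))
```

On the affected serial the first call describes the situation (both test users missing); on the older serial the second call does (nothing missing).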
[Package list with unexpected behavior - impish 20210728 serial]
$ dpkg -l|grep cloud-init
ii cloud-init 21.2-43-g184c836a-0ubuntu1 all initialization and customization tool for cloud instances
ii cloud-initramfs-copymods 0.47ubuntu1 all copy initramfs modules into root filesystem for later use
ii cloud-initramfs-dyn-netconf 0.47ubuntu1 all write a network interface file in /run for BOOTIF
$ dpkg -l |grep agent
ii google-guest-agent 20210414.00-0ubuntu1 amd64 Google Compute Engine Guest Agent
ii google-osconfig-agent 20210219.00-0ubuntu1 amd64 Google OS Config Agent
ii gpg-agent 2.2.20-1ubuntu4 amd64 GNU privacy guard - cryptographic agent
ii libpolkit-agent-1-0:amd64 0.105-31 amd64 PolicyKit Authentication Agent API
ii lxd-agent-loader 0.4 all LXD - VM agent loader
[Package list with expected behavior - impish 20190719 serial]
$ dpkg -l|grep cloud-init
ii cloud-init 21.2-3-g899bfaa9-0ubuntu2 all initialization and customization tool for cloud instances
ii cloud-initramfs-copymods 0.47ubuntu1 all copy initramfs modules into root filesystem for later use
ii cloud-initramfs-dyn-netconf 0.47ubuntu1 all write a network interface file in /run for BOOTIF
$ dpkg -l |grep agent
ii google-guest-agent 20210414.00-0ubuntu1 amd64 Google Compute Engine Guest Agent
ii google-osconfig-agent 20210219.00-0ubuntu1 amd64 Google OS Config Agent
ii gpg-agent 2.2.20-1ubuntu4 amd64 GNU privacy guard - cryptographic agent
ii libpolkit-agent-1-0:amd64 0.105-31 amd64 PolicyKit Authentication Agent API
ii lxd-agent-loader 0.4 ...
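Comparing the two listings by hand is error-prone, so here is a minimal sketch that diffs package versions between two `dpkg -l` style outputs. It assumes whitespace-separated columns (status, name, version, ...) and uses a trimmed sample with the version strings pieced together from the listings in this comment; the helper names are illustrative.

```python
from typing import Dict, Tuple


def parse_dpkg_lines(text: str) -> Dict[str, str]:
    """Map package name to version for installed ('ii') rows."""
    versions = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] == "ii":
            versions[fields[1]] = fields[2]
    return versions


def changed_versions(before: str, after: str) -> Dict[str, Tuple[str, str]]:
    """Return {package: (old_version, new_version)} for packages that differ."""
    old, new = parse_dpkg_lines(before), parse_dpkg_lines(after)
    return {
        name: (old[name], new[name])
        for name in old.keys() & new.keys()
        if old[name] != new[name]
    }


# Trimmed samples: the serial with expected behavior, then the unexpected one.
BEFORE = """\
ii cloud-init 21.2-3-g899bfaa9-0ubuntu2 all initialization and customization tool
ii google-guest-agent 20210414.00-0ubuntu1 amd64 Google Compute Engine Guest Agent
"""
AFTER = """\
ii cloud-init 21.2-43-g184c836a-0ubuntu1 all initialization and customization tool
ii google-guest-agent 20210414.00-0ubuntu1 amd64 Google Compute Engine Guest Agent
"""

print(changed_versions(BEFORE, AFTER))
```

On these samples only cloud-init differs, which is consistent with the suspicion in the summary that the cloud-init update, rather than the agent packages, changed the behavior.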