2021-07-28 14:15:50 |
Sean Feole |
bug |
|
|
added bug |
2021-07-28 14:37:42 |
Joshua Powers |
description |
Google Instances deployed with the Ubuntu 12.10 Daily images are inaccessible via SSH.
gcloud compute instances create sf-impish-v20200720 --zone us-west1-a --network "default" --no-restart-on-failure --image-project ubuntu-os-cloud-devel --image daily-ubuntu-2110-impish-v20210720 --machine-type n1-standard-2
Will result in a successful deploy yet, inaccessible via ssh from the end users configured laptop.
This appears to affect all daily images after 20210719.
daily-ubuntu-2110-impish-v20210719 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210720 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210721 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210723 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210724 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210725 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210728 ubuntu-os-cloud-devel ubuntu-2110
This problem also appears to be reproducible via the gcloud UI, create a new virtual machine using the daily-ubuntu-2110-impish-v20210720 or greater and instruct the virtual machine to import a ssh_pub_key in the security tab. The Instance will start, yet still be inaccessible via the users private sshkey
The google-guest-agent.service appears to be responsible for adding the google project ssh keys to the instance once its deployed. Please see below when queried on the 20210719 image:
google-guest-agent.service - Google Compute Engine Guest Agent
Loaded: loaded (/lib/systemd/system/google-guest-agent.service; enabled; vendor preset: enabled)
Active: active (running) since Tue 2021-07-27 19:47:48 UTC; 18h ago
Main PID: 711 (google_guest_ag)
Tasks: 9 (limit: 8924)
Memory: 19.7M
CGroup: /system.slice/google-guest-agent.service
└─711 /usr/bin/google_guest_agent
Jul 27 19:47:55 sean-imp gpasswd[1469]: user google added by root to group floppy
Jul 27 19:47:55 sean-imp gpasswd[1475]: user google added by root to group audio
Jul 27 19:47:55 sean-imp gpasswd[1481]: user google added by root to group dip
Jul 27 19:47:55 sean-imp gpasswd[1487]: user google added by root to group video
Jul 27 19:47:55 sean-imp gpasswd[1493]: user google added by root to group plugdev
Jul 27 19:47:55 sean-imp gpasswd[1499]: user google added by root to group netdev
Jul 27 19:47:55 sean-imp gpasswd[1505]: user google added by root to group lxd
Jul 27 19:47:55 sean-imp gpasswd[1511]: user google added by root to group google-sudoers
Jul 27 19:47:55 sean-imp GCEGuestAgent[711]: 2021-07-27T19:47:55.1699Z GCEGuestAgent Info: Updating keys for user google.
Jul 27 19:47:55 sean-imp google_guest_agent[711]: 2021/07/27 19:47:55 logging client: rpc error: code = PermissionDenied desc = Clo>
lines 1-19/19 (END) |
Google Instances deployed with the Ubuntu 21.10 Daily images are inaccessible via SSH.
gcloud compute instances create sf-impish-v20200720 --zone us-west1-a --network "default" --no-restart-on-failure --image-project ubuntu-os-cloud-devel --image daily-ubuntu-2110-impish-v20210720 --machine-type n1-standard-2
Will result in a successful deploy yet, inaccessible via ssh from the end users configured laptop.
This appears to affect all daily images after 20210719.
daily-ubuntu-2110-impish-v20210719 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210720 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210721 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210723 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210724 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210725 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210728 ubuntu-os-cloud-devel ubuntu-2110
This problem also appears to be reproducible via the gcloud UI, create a new virtual machine using the daily-ubuntu-2110-impish-v20210720 or greater and instruct the virtual machine to import a ssh_pub_key in the security tab. The Instance will start, yet still be inaccessible via the users private sshkey
The google-guest-agent.service appears to be responsible for adding the google project ssh keys to the instance once its deployed. Please see below when queried on the 20210719 image:
google-guest-agent.service - Google Compute Engine Guest Agent
Loaded: loaded (/lib/systemd/system/google-guest-agent.service; enabled; vendor preset: enabled)
Active: active (running) since Tue 2021-07-27 19:47:48 UTC; 18h ago
Main PID: 711 (google_guest_ag)
Tasks: 9 (limit: 8924)
Memory: 19.7M
CGroup: /system.slice/google-guest-agent.service
└─711 /usr/bin/google_guest_agent
Jul 27 19:47:55 sean-imp gpasswd[1469]: user google added by root to group floppy
Jul 27 19:47:55 sean-imp gpasswd[1475]: user google added by root to group audio
Jul 27 19:47:55 sean-imp gpasswd[1481]: user google added by root to group dip
Jul 27 19:47:55 sean-imp gpasswd[1487]: user google added by root to group video
Jul 27 19:47:55 sean-imp gpasswd[1493]: user google added by root to group plugdev
Jul 27 19:47:55 sean-imp gpasswd[1499]: user google added by root to group netdev
Jul 27 19:47:55 sean-imp gpasswd[1505]: user google added by root to group lxd
Jul 27 19:47:55 sean-imp gpasswd[1511]: user google added by root to group google-sudoers
Jul 27 19:47:55 sean-imp GCEGuestAgent[711]: 2021-07-27T19:47:55.1699Z GCEGuestAgent Info: Updating keys for user google.
Jul 27 19:47:55 sean-imp google_guest_agent[711]: 2021/07/27 19:47:55 logging client: rpc error: code = PermissionDenied desc = Clo>
lines 1-19/19 (END) |
|
2021-07-28 14:37:43 |
Joshua Powers |
summary |
Unable to SSH Into Instance when deploying Impish 12.10 |
Unable to SSH Into Instance when deploying Impish 21.10 |
|
2021-07-29 14:41:30 |
Francis Ginther |
bug task added |
|
cloud-init (Ubuntu) |
|
2021-08-06 20:57:01 |
Chad Smith |
attachment added |
|
2boot_google_guest_agent_journalctl.log https://bugs.launchpad.net/ubuntu/+source/google-guest-agent/+bug/1938299/+attachment/5516394/+files/2boot_google_guest_agent_journalctl.log |
|
2021-08-10 13:26:12 |
James Falcon |
cloud-init (Ubuntu): status |
New |
Incomplete |
|
2021-08-12 17:24:10 |
Pat Viafore |
tags |
|
rls-ii-incoming |
|
2021-08-19 15:19:41 |
Matthieu Clemenceau |
tags |
rls-ii-incoming |
fr-1631 rls-ii-incoming |
|
2021-08-19 16:45:40 |
Brian Murray |
tags |
fr-1631 rls-ii-incoming |
fr-1631 |
|
2021-08-19 16:45:55 |
Brian Murray |
nominated for series |
|
Ubuntu Impish |
|
2021-08-19 16:45:55 |
Brian Murray |
bug task added |
|
cloud-init (Ubuntu Impish) |
|
2021-08-19 16:45:55 |
Brian Murray |
bug task added |
|
google-guest-agent (Ubuntu Impish) |
|
2021-09-05 10:07:47 |
Norbert |
tags |
fr-1631 |
fr-1631 impish |
|
2021-10-06 17:24:44 |
Chad Smith |
cloud-init (Ubuntu Impish): status |
Incomplete |
Triaged |
|
2021-10-06 17:24:54 |
Chad Smith |
google-guest-agent (Ubuntu Impish): status |
New |
In Progress |
|
2021-10-06 17:25:00 |
Chad Smith |
cloud-init (Ubuntu Impish): importance |
Undecided |
Critical |
|
2021-10-06 17:25:02 |
Chad Smith |
google-guest-agent (Ubuntu Impish): importance |
Undecided |
Critical |
|
2021-10-07 02:12:31 |
Chad Smith |
google-guest-agent (Ubuntu Impish): importance |
Critical |
Medium |
|
2021-10-07 02:13:03 |
Chad Smith |
nominated for series |
|
Ubuntu Bionic |
|
2021-10-07 02:13:03 |
Chad Smith |
bug task added |
|
cloud-init (Ubuntu Bionic) |
|
2021-10-07 02:13:03 |
Chad Smith |
bug task added |
|
google-guest-agent (Ubuntu Bionic) |
|
2021-10-07 02:13:03 |
Chad Smith |
nominated for series |
|
Ubuntu Hirsute |
|
2021-10-07 02:13:03 |
Chad Smith |
bug task added |
|
cloud-init (Ubuntu Hirsute) |
|
2021-10-07 02:13:03 |
Chad Smith |
bug task added |
|
google-guest-agent (Ubuntu Hirsute) |
|
2021-10-07 02:13:03 |
Chad Smith |
nominated for series |
|
Ubuntu Focal |
|
2021-10-07 02:13:03 |
Chad Smith |
bug task added |
|
cloud-init (Ubuntu Focal) |
|
2021-10-07 02:13:03 |
Chad Smith |
bug task added |
|
google-guest-agent (Ubuntu Focal) |
|
2021-10-07 02:13:48 |
Chad Smith |
cloud-init (Ubuntu Impish): status |
Triaged |
In Progress |
|
2021-10-07 02:14:03 |
Chad Smith |
google-guest-agent (Ubuntu Impish): status |
In Progress |
New |
|
2021-10-07 13:54:04 |
James Falcon |
description |
Google Instances deployed with the Ubuntu 21.10 Daily images are inaccessible via SSH.
gcloud compute instances create sf-impish-v20200720 --zone us-west1-a --network "default" --no-restart-on-failure --image-project ubuntu-os-cloud-devel --image daily-ubuntu-2110-impish-v20210720 --machine-type n1-standard-2
Will result in a successful deploy yet, inaccessible via ssh from the end users configured laptop.
This appears to affect all daily images after 20210719.
daily-ubuntu-2110-impish-v20210719 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210720 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210721 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210723 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210724 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210725 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210728 ubuntu-os-cloud-devel ubuntu-2110
This problem also appears to be reproducible via the gcloud UI, create a new virtual machine using the daily-ubuntu-2110-impish-v20210720 or greater and instruct the virtual machine to import a ssh_pub_key in the security tab. The Instance will start, yet still be inaccessible via the users private sshkey
The google-guest-agent.service appears to be responsible for adding the google project ssh keys to the instance once its deployed. Please see below when queried on the 20210719 image:
google-guest-agent.service - Google Compute Engine Guest Agent
Loaded: loaded (/lib/systemd/system/google-guest-agent.service; enabled; vendor preset: enabled)
Active: active (running) since Tue 2021-07-27 19:47:48 UTC; 18h ago
Main PID: 711 (google_guest_ag)
Tasks: 9 (limit: 8924)
Memory: 19.7M
CGroup: /system.slice/google-guest-agent.service
└─711 /usr/bin/google_guest_agent
Jul 27 19:47:55 sean-imp gpasswd[1469]: user google added by root to group floppy
Jul 27 19:47:55 sean-imp gpasswd[1475]: user google added by root to group audio
Jul 27 19:47:55 sean-imp gpasswd[1481]: user google added by root to group dip
Jul 27 19:47:55 sean-imp gpasswd[1487]: user google added by root to group video
Jul 27 19:47:55 sean-imp gpasswd[1493]: user google added by root to group plugdev
Jul 27 19:47:55 sean-imp gpasswd[1499]: user google added by root to group netdev
Jul 27 19:47:55 sean-imp gpasswd[1505]: user google added by root to group lxd
Jul 27 19:47:55 sean-imp gpasswd[1511]: user google added by root to group google-sudoers
Jul 27 19:47:55 sean-imp GCEGuestAgent[711]: 2021-07-27T19:47:55.1699Z GCEGuestAgent Info: Updating keys for user google.
Jul 27 19:47:55 sean-imp google_guest_agent[711]: 2021/07/27 19:47:55 logging client: rpc error: code = PermissionDenied desc = Clo>
lines 1-19/19 (END) |
=== Begin SRU Template ===
[Impact]
In PR #919 (81299de), we refactored some of the code used to bring up networks across distros. Previously, the call to bring up network interfaces during 'init' stage unintentionally resulted in a no-op such that network interfaces were NEVER brought up by cloud-init, even if new network interfaces were found after crawling the metadata.
In #919, the code was altered to bring up these discovered network interfaces. On Ubuntu, this results in a 'netplan apply' call during 'init' stage for any ubuntu-based distro on a datasource that has a NETWORK dependency. On GCE, this additional 'netplan apply' conflicts with the google-guest-agent service, resulting in an instance that can not be connected to.
To fix this, we added a new 'disable_network_activation' option that can be enabled in /etc/cloud.cfg to disable the activation of network interfaces in 'init' stage.
[Test Case]
An integration test has been added at `tests/integration_tests/datasources/test_network_dependency.py` to test this functionality. To test manually:
1. Launch an instance on GCE
2. Install the cloud-init version with the fix
3. Add a file, '/etc/cloud/cloud.cfg.d/99-disable-network-activation.cfg' with the contents:
disable_network_activation: true
4. Run cloud-init clean --logs
5. Create a new image based on this instance
6. Launch a new instance based on the new image
7. The instance should launch successfully and be reachable over SSH.
8. "['netplan', 'apply']" should not be present anywhere in /var/log/cloud-init.log.
9. "Bringing up newly configured network interfaces" should not exist anywhere in /var/log/cloud-init.log
In the failure case, we will fail at step 7.
[Regression Potential]
The code in question determines whether to bring up interfaces after applying network config. Accidentally not doing this should not be a problem, as we previously (unintentionally) did not bring these interfaces up. Accidentally bringing up interfaces when we shouldn't also generally shouldn't cause a large problem outside of GCE, because outside of GCE there aren't (that we're aware of) other processes independently setting up the network. If this setup determination code somehow fails, it happens early enough in boot that it could leave an instance unusable; however, the code is small enough and defensive enough that we don't believe that is a possibility.
[Other Info]
Github PR: https://github.com/canonical/cloud-init/pull/1048
Upstream commit: <TODO>
=== End SRU Template ===
Original bug report:
Google Instances deployed with the Ubuntu 21.10 Daily images are inaccessible via SSH.
gcloud compute instances create sf-impish-v20200720 --zone us-west1-a --network "default" --no-restart-on-failure --image-project ubuntu-os-cloud-devel --image daily-ubuntu-2110-impish-v20210720 --machine-type n1-standard-2
Will result in a successful deploy yet, inaccessible via ssh from the end users configured laptop.
This appears to affect all daily images after 20210719.
daily-ubuntu-2110-impish-v20210719 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210720 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210721 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210723 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210724 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210725 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210728 ubuntu-os-cloud-devel ubuntu-2110
This problem also appears to be reproducible via the gcloud UI, create a new virtual machine using the daily-ubuntu-2110-impish-v20210720 or greater and instruct the virtual machine to import a ssh_pub_key in the security tab. The Instance will start, yet still be inaccessible via the users private sshkey
The google-guest-agent.service appears to be responsible for adding the google project ssh keys to the instance once its deployed. Please see below when queried on the 20210719 image:
google-guest-agent.service - Google Compute Engine Guest Agent
Loaded: loaded (/lib/systemd/system/google-guest-agent.service; enabled; vendor preset: enabled)
Active: active (running) since Tue 2021-07-27 19:47:48 UTC; 18h ago
Main PID: 711 (google_guest_ag)
Tasks: 9 (limit: 8924)
Memory: 19.7M
CGroup: /system.slice/google-guest-agent.service
└─711 /usr/bin/google_guest_agent
Jul 27 19:47:55 sean-imp gpasswd[1469]: user google added by root to group floppy
Jul 27 19:47:55 sean-imp gpasswd[1475]: user google added by root to group audio
Jul 27 19:47:55 sean-imp gpasswd[1481]: user google added by root to group dip
Jul 27 19:47:55 sean-imp gpasswd[1487]: user google added by root to group video
Jul 27 19:47:55 sean-imp gpasswd[1493]: user google added by root to group plugdev
Jul 27 19:47:55 sean-imp gpasswd[1499]: user google added by root to group netdev
Jul 27 19:47:55 sean-imp gpasswd[1505]: user google added by root to group lxd
Jul 27 19:47:55 sean-imp gpasswd[1511]: user google added by root to group google-sudoers
Jul 27 19:47:55 sean-imp GCEGuestAgent[711]: 2021-07-27T19:47:55.1699Z GCEGuestAgent Info: Updating keys for user google.
Jul 27 19:47:55 sean-imp google_guest_agent[711]: 2021/07/27 19:47:55 logging client: rpc error: code = PermissionDenied desc = Clo>
lines 1-19/19 (END) |
|
2021-10-08 02:08:17 |
Launchpad Janitor |
cloud-init (Ubuntu Impish): status |
In Progress |
Fix Released |
|
2021-10-08 03:28:33 |
Launchpad Janitor |
google-guest-agent (Ubuntu): status |
New |
Confirmed |
|
2021-10-08 03:28:33 |
Launchpad Janitor |
cloud-init (Ubuntu Bionic): status |
New |
Confirmed |
|
2021-10-08 03:28:33 |
Launchpad Janitor |
google-guest-agent (Ubuntu Bionic): status |
New |
Confirmed |
|
2021-10-08 03:28:33 |
Launchpad Janitor |
cloud-init (Ubuntu Focal): status |
New |
Confirmed |
|
2021-10-08 03:28:33 |
Launchpad Janitor |
google-guest-agent (Ubuntu Focal): status |
New |
Confirmed |
|
2021-10-08 03:28:33 |
Launchpad Janitor |
cloud-init (Ubuntu Hirsute): status |
New |
Confirmed |
|
2021-10-08 03:28:33 |
Launchpad Janitor |
google-guest-agent (Ubuntu Hirsute): status |
New |
Confirmed |
|
2021-10-08 09:44:47 |
Timo Aaltonen |
cloud-init (Ubuntu Hirsute): status |
Confirmed |
Fix Committed |
|
2021-10-08 09:44:50 |
Timo Aaltonen |
bug |
|
|
added subscriber Ubuntu Stable Release Updates Team |
2021-10-08 09:44:52 |
Timo Aaltonen |
bug |
|
|
added subscriber SRU Verification |
2021-10-08 09:44:55 |
Timo Aaltonen |
tags |
fr-1631 impish |
fr-1631 impish verification-needed verification-needed-hirsute |
|
2021-10-08 09:48:17 |
Timo Aaltonen |
cloud-init (Ubuntu Focal): status |
Confirmed |
Fix Committed |
|
2021-10-08 09:48:23 |
Timo Aaltonen |
tags |
fr-1631 impish verification-needed verification-needed-hirsute |
fr-1631 impish verification-needed verification-needed-focal verification-needed-hirsute |
|
2021-10-08 09:50:16 |
Timo Aaltonen |
cloud-init (Ubuntu Bionic): status |
Confirmed |
Fix Committed |
|
2021-10-08 09:50:24 |
Timo Aaltonen |
tags |
fr-1631 impish verification-needed verification-needed-focal verification-needed-hirsute |
fr-1631 impish verification-needed verification-needed-bionic verification-needed-focal verification-needed-hirsute |
|
2021-10-08 20:04:12 |
James Falcon |
attachment added |
|
integration_test_results.tar.gz https://bugs.launchpad.net/bugs/1938299/+attachment/5531668/+files/integration_test_results.tar.gz |
|
2021-10-11 02:33:56 |
Utkarsh Gupta |
tags |
fr-1631 impish verification-needed verification-needed-bionic verification-needed-focal verification-needed-hirsute |
fr-1631 impish verification-done-bionic verification-done-focal verification-done-hirsute verification-needed |
|
2021-10-11 02:34:04 |
Utkarsh Gupta |
bug |
|
|
added subscriber Utkarsh Gupta |
2021-10-12 02:38:28 |
Chad Smith |
bug task added |
|
netplan.io (Ubuntu) |
|
2021-10-13 15:19:30 |
Brian Murray |
netplan.io (Ubuntu Bionic): status |
New |
Won't Fix |
|
2021-10-13 15:19:52 |
Brian Murray |
netplan.io (Ubuntu Focal): status |
New |
Won't Fix |
|
2021-10-13 15:20:04 |
Brian Murray |
netplan.io (Ubuntu Hirsute): status |
New |
Won't Fix |
|
2021-10-13 15:20:48 |
Brian Murray |
netplan.io (Ubuntu Impish): status |
New |
Won't Fix |
|
2021-10-14 13:12:32 |
Chad Smith |
description |
=== Begin SRU Template ===
[Impact]
In PR #919 (81299de), we refactored some of the code used to bring up networks across distros. Previously, the call to bring up network interfaces during 'init' stage unintentionally resulted in a no-op such that network interfaces were NEVER brought up by cloud-init, even if new network interfaces were found after crawling the metadata.
In #919, the code was altered to bring up these discovered network interfaces. On Ubuntu, this results in a 'netplan apply' call during 'init' stage for any ubuntu-based distro on a datasource that has a NETWORK dependency. On GCE, this additional 'netplan apply' conflicts with the google-guest-agent service, resulting in an instance that can not be connected to.
To fix this, we added a new 'disable_network_activation' option that can be enabled in /etc/cloud.cfg to disable the activation of network interfaces in 'init' stage.
[Test Case]
An integration test has been added at `tests/integration_tests/datasources/test_network_dependency.py` to test this functionality. To test manually:
1. Launch an instance on GCE
2. Install the cloud-init version with the fix
3. Add a file, '/etc/cloud/cloud.cfg.d/99-disable-network-activation.cfg' with the contents:
disable_network_activation: true
4. Run cloud-init clean --logs
5. Create a new image based on this instance
6. Launch a new instance based on the new image
7. The instance should launch successfully and be reachable over SSH.
8. "['netplan', 'apply']" should not be present anywhere in /var/log/cloud-init.log.
9. "Bringing up newly configured network interfaces" should not exist anywhere in /var/log/cloud-init.log
In the failure case, we will fail at step 7.
[Regression Potential]
The code in question determines whether to bring up interfaces after applying network config. Accidentally not doing this should not be a problem, as we previously (unintentionally) did not bring these interfaces up. Accidentally bringing up interfaces when we shouldn't also generally shouldn't cause a large problem outside of GCE, because outside of GCE there aren't (that we're aware of) other processes independently setting up the network. If this setup determination code somehow fails, it happens early enough in boot that it could leave an instance unusable; however, the code is small enough and defensive enough that we don't believe that is a possibility.
[Other Info]
Github PR: https://github.com/canonical/cloud-init/pull/1048
Upstream commit: <TODO>
=== End SRU Template ===
Original bug report:
Google Instances deployed with the Ubuntu 21.10 Daily images are inaccessible via SSH.
gcloud compute instances create sf-impish-v20200720 --zone us-west1-a --network "default" --no-restart-on-failure --image-project ubuntu-os-cloud-devel --image daily-ubuntu-2110-impish-v20210720 --machine-type n1-standard-2
Will result in a successful deploy yet, inaccessible via ssh from the end users configured laptop.
This appears to affect all daily images after 20210719.
daily-ubuntu-2110-impish-v20210719 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210720 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210721 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210723 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210724 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210725 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210728 ubuntu-os-cloud-devel ubuntu-2110
This problem also appears to be reproducible via the gcloud UI, create a new virtual machine using the daily-ubuntu-2110-impish-v20210720 or greater and instruct the virtual machine to import a ssh_pub_key in the security tab. The Instance will start, yet still be inaccessible via the users private sshkey
The google-guest-agent.service appears to be responsible for adding the google project ssh keys to the instance once its deployed. Please see below when queried on the 20210719 image:
google-guest-agent.service - Google Compute Engine Guest Agent
Loaded: loaded (/lib/systemd/system/google-guest-agent.service; enabled; vendor preset: enabled)
Active: active (running) since Tue 2021-07-27 19:47:48 UTC; 18h ago
Main PID: 711 (google_guest_ag)
Tasks: 9 (limit: 8924)
Memory: 19.7M
CGroup: /system.slice/google-guest-agent.service
└─711 /usr/bin/google_guest_agent
Jul 27 19:47:55 sean-imp gpasswd[1469]: user google added by root to group floppy
Jul 27 19:47:55 sean-imp gpasswd[1475]: user google added by root to group audio
Jul 27 19:47:55 sean-imp gpasswd[1481]: user google added by root to group dip
Jul 27 19:47:55 sean-imp gpasswd[1487]: user google added by root to group video
Jul 27 19:47:55 sean-imp gpasswd[1493]: user google added by root to group plugdev
Jul 27 19:47:55 sean-imp gpasswd[1499]: user google added by root to group netdev
Jul 27 19:47:55 sean-imp gpasswd[1505]: user google added by root to group lxd
Jul 27 19:47:55 sean-imp gpasswd[1511]: user google added by root to group google-sudoers
Jul 27 19:47:55 sean-imp GCEGuestAgent[711]: 2021-07-27T19:47:55.1699Z GCEGuestAgent Info: Updating keys for user google.
Jul 27 19:47:55 sean-imp google_guest_agent[711]: 2021/07/27 19:47:55 logging client: rpc error: code = PermissionDenied desc = Clo>
lines 1-19/19 (END) |
=== Begin SRU Template ===
[Impact]
In PR #919 (81299de), we refactored some of the code used to bring up networks across distros. Previously, the call to bring up network interfaces during 'init' stage unintentionally resulted in a no-op such that network interfaces were NEVER brought up by cloud-init, even if new network interfaces were found after crawling the metadata.
In #919, the code was altered to bring up these discovered network interfaces. On Ubuntu, this results in a 'netplan apply' call during the 'init' stage for any Ubuntu-based distro on a datasource that has a NETWORK dependency. On GCE, this additional 'netplan apply' conflicts with the google-guest-agent service: it shuts that service down because of that project's PartOf= systemd relationship, resulting in an instance that cannot be connected to.
To fix this, we added a new 'disable_network_activation' option that can be set to true in /etc/cloud.cfg.d/*.cfg by GCP image creators to avoid calling `netplan apply` when network configuration is written.
[Test Case]
An integration test has been added at `tests/integration_tests/datasources/test_network_dependency.py` to test this functionality. To test manually:
1. Launch an instance on GCE
2. Install the cloud-init version with the fix
3. Add a file, '/etc/cloud/cloud.cfg.d/99-disable-network-activation.cfg' with the contents:
disable_network_activation: true
4. Run cloud-init clean --logs
5. Create a new image based on this instance
6. Launch a new instance based on the new image
7. The instance should launch successfully and be reachable over SSH.
8. "['netplan', 'apply']" should not be present anywhere in /var/log/cloud-init.log.
9. "Bringing up newly configured network interfaces" should not exist anywhere in /var/log/cloud-init.log
In the failure case, we will fail at step 7.
[Regression Potential]
The code in question determines whether to bring up interfaces after applying network config. Accidentally not doing this should not be a problem, as we previously (unintentionally) did not bring these interfaces up. Accidentally bringing up interfaces when we shouldn't also generally shouldn't cause a large problem outside of GCE, because outside of GCE there aren't (that we're aware of) other processes independently setting up the network. If this setup determination code somehow fails, it happens early enough in boot that it could leave an instance unusable; however, the code is small enough and defensive enough that we don't believe that is a possibility.
Additionally, cloud datasources that are discovered in the `init-local` stage (Azure, Ec2, Hetzner, IBMCloud, OpenStack and Oracle) aren't exposed to this code path because full network config is emitted before the system network is brought up, so there is no need to call `netplan apply` at that time.
[Other Info]
Github PR: https://github.com/canonical/cloud-init/pull/1048
Upstream commit: <TODO>
=== End SRU Template ===
Original bug report:
Google Instances deployed with the Ubuntu 21.10 Daily images are inaccessible via SSH.
gcloud compute instances create sf-impish-v20200720 --zone us-west1-a --network "default" --no-restart-on-failure --image-project ubuntu-os-cloud-devel --image daily-ubuntu-2110-impish-v20210720 --machine-type n1-standard-2
Will result in a successful deploy yet, inaccessible via ssh from the end users configured laptop.
This appears to affect all daily images after 20210719.
daily-ubuntu-2110-impish-v20210719 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210720 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210721 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210723 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210724 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210725 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210728 ubuntu-os-cloud-devel ubuntu-2110
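The "after 20210719" cutoff can be applied mechanically to a list of image names. The following is only a rough sketch: the helper name `affected_images` and the constant `LAST_GOOD` are made up for illustration, and the name pattern is assumed from the listing above rather than from any gcloud naming guarantee.

```python
import re

# Date stamp of the last daily image reported as working in this bug report.
LAST_GOOD = "20210719"

# Name pattern assumed from the image listing quoted in the report.
IMAGE_RE = re.compile(r"^daily-ubuntu-2110-impish-v(\d{8})$")


def affected_images(names):
    """Return the image names whose date stamp is later than LAST_GOOD."""
    affected = []
    for name in names:
        match = IMAGE_RE.match(name)
        # Eight-digit YYYYMMDD stamps compare correctly as plain strings.
        if match and match.group(1) > LAST_GOOD:
            affected.append(name)
    return affected
```

Names that do not match the assumed pattern are skipped silently, so the result only covers images that follow the daily naming scheme shown above.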
This problem also appears to be reproducible via the gcloud UI, create a new virtual machine using the daily-ubuntu-2110-impish-v20210720 or greater and instruct the virtual machine to import a ssh_pub_key in the security tab. The Instance will start, yet still be inaccessible via the users private sshkey
The google-guest-agent.service appears to be responsible for adding the google project ssh keys to the instance once its deployed. Please see below when queried on the 20210719 image:
google-guest-agent.service - Google Compute Engine Guest Agent
Loaded: loaded (/lib/systemd/system/google-guest-agent.service; enabled; vendor preset: enabled)
Active: active (running) since Tue 2021-07-27 19:47:48 UTC; 18h ago
Main PID: 711 (google_guest_ag)
Tasks: 9 (limit: 8924)
Memory: 19.7M
CGroup: /system.slice/google-guest-agent.service
└─711 /usr/bin/google_guest_agent
Jul 27 19:47:55 sean-imp gpasswd[1469]: user google added by root to group floppy
Jul 27 19:47:55 sean-imp gpasswd[1475]: user google added by root to group audio
Jul 27 19:47:55 sean-imp gpasswd[1481]: user google added by root to group dip
Jul 27 19:47:55 sean-imp gpasswd[1487]: user google added by root to group video
Jul 27 19:47:55 sean-imp gpasswd[1493]: user google added by root to group plugdev
Jul 27 19:47:55 sean-imp gpasswd[1499]: user google added by root to group netdev
Jul 27 19:47:55 sean-imp gpasswd[1505]: user google added by root to group lxd
Jul 27 19:47:55 sean-imp gpasswd[1511]: user google added by root to group google-sudoers
Jul 27 19:47:55 sean-imp GCEGuestAgent[711]: 2021-07-27T19:47:55.1699Z GCEGuestAgent Info: Updating keys for user google.
Jul 27 19:47:55 sean-imp google_guest_agent[711]: 2021/07/27 19:47:55 logging client: rpc error: code = PermissionDenied desc = Clo>
lines 1-19/19 (END) |
|
2021-10-14 14:03:02 |
James Falcon |
description |
=== Begin SRU Template ===
[Impact]
In PR #919 (81299de), we refactored some of the code used to bring up networks across distros. Previously, the call to bring up network interfaces during 'init' stage unintentionally resulted in a no-op such that network interfaces were NEVER brought up by cloud-init, even if new network interfaces were found after crawling the metadata.
In #919, the code was altered to bring up these discovered network interfaces. On Ubuntu, this results in a 'netplan apply' call during the 'init' stage for any Ubuntu-based distro on a datasource that has a NETWORK dependency. On GCE, this additional 'netplan apply' conflicts with the google-guest-agent service: it shuts that service down because of that project's PartOf= systemd relationship, resulting in an instance that cannot be connected to.
To fix this, we added a new 'disable_network_activation' option that can be set to true in /etc/cloud.cfg.d/*.cfg by GCP image creators to avoid calling `netplan apply` when network configuration is written.
[Test Case]
An integration test has been added at `tests/integration_tests/datasources/test_network_dependency.py` to test this functionality. To test manually:
1. Launch an instance on GCE
2. Install the cloud-init version with the fix
3. Add a file, '/etc/cloud/cloud.cfg.d/99-disable-network-activation.cfg' with the contents:
disable_network_activation: true
4. Run cloud-init clean --logs
5. Create a new image based on this instance
6. Launch a new instance based on the new image
7. Instance should launch successfully and be accessible via SSH
8. "['netplan', 'apply']" should not be present anywhere in /var/log/cloud-init.log.
9. "Bringing up newly configured network interfaces" should not exist anywhere in /var/log/cloud-init.log
In the failure case, we will fail at step 7.
[Regression Potential]
The code in question determines whether to bring up interfaces after applying network config. Accidentally not doing this should not be a problem, as we previously (unintentionally) did not bring these interfaces up. Accidentally bringing up interfaces when we shouldn't also generally shouldn't cause a large problem outside of GCE, because outside of GCE there aren't (that we're aware of) other processes independently setting up the network. If this setup determination code somehow fails, it happens early enough in boot that it could leave an instance unusable; however, the code is small enough and defensive enough that we don't believe that is a possibility.
Additionally, any cloud datasource that is discovered in `init-local` stage (Azure, Ec2, Hetzner, IBMCloud, OpenStack and Oracle) isn't exposed to this code path, because full network config is emitted before system network is brought up, so there is no need to call `netplan apply` at that time.
[Other Info]
Github PR: https://github.com/canonical/cloud-init/pull/1048
Upstream commit: <TODO>
=== End SRU Template ===
Original bug report:
Google Instances deployed with the Ubuntu 21.10 Daily images are inaccessible via SSH.
gcloud compute instances create sf-impish-v20200720 --zone us-west1-a --network "default" --no-restart-on-failure --image-project ubuntu-os-cloud-devel --image daily-ubuntu-2110-impish-v20210720 --machine-type n1-standard-2
Will result in a successful deploy yet, inaccessible via ssh from the end users configured laptop.
This appears to affect all daily images after 20210719.
daily-ubuntu-2110-impish-v20210719 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210720 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210721 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210723 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210724 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210725 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210728 ubuntu-os-cloud-devel ubuntu-2110
This problem also appears to be reproducible via the gcloud UI, create a new virtual machine using the daily-ubuntu-2110-impish-v20210720 or greater and instruct the virtual machine to import a ssh_pub_key in the security tab. The Instance will start, yet still be inaccessible via the users private sshkey
The google-guest-agent.service appears to be responsible for adding the google project ssh keys to the instance once its deployed. Please see below when queried on the 20210719 image:
google-guest-agent.service - Google Compute Engine Guest Agent
Loaded: loaded (/lib/systemd/system/google-guest-agent.service; enabled; vendor preset: enabled)
Active: active (running) since Tue 2021-07-27 19:47:48 UTC; 18h ago
Main PID: 711 (google_guest_ag)
Tasks: 9 (limit: 8924)
Memory: 19.7M
CGroup: /system.slice/google-guest-agent.service
└─711 /usr/bin/google_guest_agent
Jul 27 19:47:55 sean-imp gpasswd[1469]: user google added by root to group floppy
Jul 27 19:47:55 sean-imp gpasswd[1475]: user google added by root to group audio
Jul 27 19:47:55 sean-imp gpasswd[1481]: user google added by root to group dip
Jul 27 19:47:55 sean-imp gpasswd[1487]: user google added by root to group video
Jul 27 19:47:55 sean-imp gpasswd[1493]: user google added by root to group plugdev
Jul 27 19:47:55 sean-imp gpasswd[1499]: user google added by root to group netdev
Jul 27 19:47:55 sean-imp gpasswd[1505]: user google added by root to group lxd
Jul 27 19:47:55 sean-imp gpasswd[1511]: user google added by root to group google-sudoers
Jul 27 19:47:55 sean-imp GCEGuestAgent[711]: 2021-07-27T19:47:55.1699Z GCEGuestAgent Info: Updating keys for user google.
Jul 27 19:47:55 sean-imp google_guest_agent[711]: 2021/07/27 19:47:55 logging client: rpc error: code = PermissionDenied desc = Clo>
lines 1-19/19 (END) |
=== Begin SRU Template ===
[Impact]
In PR #919 (81299de), we refactored some of the code used to bring up networks across distros. Previously, the call to bring up network interfaces during 'init' stage unintentionally resulted in a no-op such that network interfaces were NEVER brought up by cloud-init, even if new network interfaces were found after crawling the metadata.
In #919, the code was altered to bring up these discovered network interfaces. On Ubuntu, this results in a 'netplan apply' call during 'init' stage for any ubuntu-based distro on a datasource that has a NETWORK dependency. On GCE, this additional 'netplan apply' conflicts with the google-guest-agent service, shutting that service down due to that project's PartOf= systemd relationship resulting in an instance that can not be connected to.
To fix this, we added a new 'disable_network_activation' option that can be set to true in /etc/cloud/cloud.cfg.d/*.cfg by image creators to disable the activation of network interfaces in 'init' stage. This will avoid the 'netplan apply' call on GCE instances.
[Test Case]
An integration test has been added at `tests/integration_tests/datasources/test_network_dependency.py` to test this functionality. To test manually:
1. Launch an instance on GCE
2. Install the cloud-init version with the fix
3. Add a file, '/etc/cloud/cloud.cfg.d/99-disable-network-activation.cfg' with the contents:
disable_network_activation: true
4. Run cloud-init clean --logs
5. Create a new image based on this instance
6. Launch a new instance based on the new image
7. Instance should launch successfully and be accessible via SSH
8. "['netplan', 'apply']" should not be present anywhere in /var/log/cloud-init.log.
9. "Bringing up newly configured network interfaces" should not exist anywhere in /var/log/cloud-init.log
In the failure case, we will fail at step 7.
[Regression Potential]
The code in question determines whether to bring up interfaces after applying network config. Accidentally not doing this should not be a problem, as we previously (unintentionally) did not bring these interfaces up. Accidentally bringing up interfaces when we shouldn't also generally shouldn't cause a large problem outside of GCE, because outside of GCE there aren't (that we're aware of) other processes independently setting up the network. If this setup determination code somehow fails, it happens early enough in boot that it could leave an instance unusable; however, the code is small enough and defensive enough that we don't believe that is a possibility.
Additionally, any cloud datasource that is discovered in `init-local` stage (Azure, Ec2, Hetzner, IBMCloud, OpenStack and Oracle) isn't exposed to this code path, because full network config is emitted before system network is brought up, so there is no need to call `netplan apply` at that time.
[Other Info]
Github PR: https://github.com/canonical/cloud-init/pull/1048
Upstream commit: <TODO>
=== End SRU Template ===
Original bug report:
Google Instances deployed with the Ubuntu 21.10 Daily images are inaccessible via SSH.
gcloud compute instances create sf-impish-v20200720 --zone us-west1-a --network "default" --no-restart-on-failure --image-project ubuntu-os-cloud-devel --image daily-ubuntu-2110-impish-v20210720 --machine-type n1-standard-2
Will result in a successful deploy yet, inaccessible via ssh from the end users configured laptop.
This appears to affect all daily images after 20210719.
daily-ubuntu-2110-impish-v20210719 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210720 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210721 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210723 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210724 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210725 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210728 ubuntu-os-cloud-devel ubuntu-2110
This problem also appears to be reproducible via the gcloud UI, create a new virtual machine using the daily-ubuntu-2110-impish-v20210720 or greater and instruct the virtual machine to import a ssh_pub_key in the security tab. The Instance will start, yet still be inaccessible via the users private sshkey
The google-guest-agent.service appears to be responsible for adding the google project ssh keys to the instance once its deployed. Please see below when queried on the 20210719 image:
google-guest-agent.service - Google Compute Engine Guest Agent
Loaded: loaded (/lib/systemd/system/google-guest-agent.service; enabled; vendor preset: enabled)
Active: active (running) since Tue 2021-07-27 19:47:48 UTC; 18h ago
Main PID: 711 (google_guest_ag)
Tasks: 9 (limit: 8924)
Memory: 19.7M
CGroup: /system.slice/google-guest-agent.service
└─711 /usr/bin/google_guest_agent
Jul 27 19:47:55 sean-imp gpasswd[1469]: user google added by root to group floppy
Jul 27 19:47:55 sean-imp gpasswd[1475]: user google added by root to group audio
Jul 27 19:47:55 sean-imp gpasswd[1481]: user google added by root to group dip
Jul 27 19:47:55 sean-imp gpasswd[1487]: user google added by root to group video
Jul 27 19:47:55 sean-imp gpasswd[1493]: user google added by root to group plugdev
Jul 27 19:47:55 sean-imp gpasswd[1499]: user google added by root to group netdev
Jul 27 19:47:55 sean-imp gpasswd[1505]: user google added by root to group lxd
Jul 27 19:47:55 sean-imp gpasswd[1511]: user google added by root to group google-sudoers
Jul 27 19:47:55 sean-imp GCEGuestAgent[711]: 2021-07-27T19:47:55.1699Z GCEGuestAgent Info: Updating keys for user google.
Jul 27 19:47:55 sean-imp google_guest_agent[711]: 2021/07/27 19:47:55 logging client: rpc error: code = PermissionDenied desc = Clo>
lines 1-19/19 (END) |
|
2021-10-14 14:14:06 |
James Falcon |
description |
=== Begin SRU Template ===
[Impact]
In PR #919 (81299de), we refactored some of the code used to bring up networks across distros. Previously, the call to bring up network interfaces during 'init' stage unintentionally resulted in a no-op such that network interfaces were NEVER brought up by cloud-init, even if new network interfaces were found after crawling the metadata.
In #919, the code was altered to bring up these discovered network interfaces. On Ubuntu, this results in a 'netplan apply' call during 'init' stage for any ubuntu-based distro on a datasource that has a NETWORK dependency. On GCE, this additional 'netplan apply' conflicts with the google-guest-agent service, shutting that service down due to that project's PartOf= systemd relationship resulting in an instance that can not be connected to.
To fix this, we added a new 'disable_network_activation' option that can be set to true in /etc/cloud/cloud.cfg.d/*.cfg by image creators to disable the activation of network interfaces in 'init' stage. This will avoid the 'netplan apply' call on GCE instances.
[Test Case]
An integration test has been added at `tests/integration_tests/datasources/test_network_dependency.py` to test this functionality. To test manually:
1. Launch an instance on GCE
2. Install the cloud-init version with the fix
3. Add a file, '/etc/cloud/cloud.cfg.d/99-disable-network-activation.cfg' with the contents:
disable_network_activation: true
4. Run cloud-init clean --logs
5. Create a new image based on this instance
6. Launch a new instance based on the new image
7. Instance should launch successfully and be accessible via SSH
8. "['netplan', 'apply']" should not be present anywhere in /var/log/cloud-init.log.
9. "Bringing up newly configured network interfaces" should not exist anywhere in /var/log/cloud-init.log
In the failure case, we will fail at step 7.
[Regression Potential]
The code in question determines whether to bring up interfaces after applying network config. Accidentally not doing this should not be a problem, as we previously (unintentionally) did not bring these interfaces up. Accidentally bringing up interfaces when we shouldn't also generally shouldn't cause a large problem outside of GCE, because outside of GCE there aren't (that we're aware of) other processes independently setting up the network. If this setup determination code somehow fails, it happens early enough in boot that it could leave an instance unusable; however, the code is small enough and defensive enough that we don't believe that is a possibility.
Additionally, any cloud datasource that is discovered in `init-local` stage (Azure, Ec2, Hetzner, IBMCloud, OpenStack and Oracle) isn't exposed to this code path, because full network config is emitted before system network is brought up, so there is no need to call `netplan apply` at that time.
[Other Info]
Github PR: https://github.com/canonical/cloud-init/pull/1048
Upstream commit: <TODO>
=== End SRU Template ===
Original bug report:
Google Instances deployed with the Ubuntu 21.10 Daily images are inaccessible via SSH.
gcloud compute instances create sf-impish-v20200720 --zone us-west1-a --network "default" --no-restart-on-failure --image-project ubuntu-os-cloud-devel --image daily-ubuntu-2110-impish-v20210720 --machine-type n1-standard-2
Will result in a successful deploy yet, inaccessible via ssh from the end users configured laptop.
This appears to affect all daily images after 20210719.
daily-ubuntu-2110-impish-v20210719 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210720 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210721 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210723 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210724 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210725 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210728 ubuntu-os-cloud-devel ubuntu-2110
This problem also appears to be reproducible via the gcloud UI, create a new virtual machine using the daily-ubuntu-2110-impish-v20210720 or greater and instruct the virtual machine to import a ssh_pub_key in the security tab. The Instance will start, yet still be inaccessible via the users private sshkey
The google-guest-agent.service appears to be responsible for adding the google project ssh keys to the instance once its deployed. Please see below when queried on the 20210719 image:
google-guest-agent.service - Google Compute Engine Guest Agent
Loaded: loaded (/lib/systemd/system/google-guest-agent.service; enabled; vendor preset: enabled)
Active: active (running) since Tue 2021-07-27 19:47:48 UTC; 18h ago
Main PID: 711 (google_guest_ag)
Tasks: 9 (limit: 8924)
Memory: 19.7M
CGroup: /system.slice/google-guest-agent.service
└─711 /usr/bin/google_guest_agent
Jul 27 19:47:55 sean-imp gpasswd[1469]: user google added by root to group floppy
Jul 27 19:47:55 sean-imp gpasswd[1475]: user google added by root to group audio
Jul 27 19:47:55 sean-imp gpasswd[1481]: user google added by root to group dip
Jul 27 19:47:55 sean-imp gpasswd[1487]: user google added by root to group video
Jul 27 19:47:55 sean-imp gpasswd[1493]: user google added by root to group plugdev
Jul 27 19:47:55 sean-imp gpasswd[1499]: user google added by root to group netdev
Jul 27 19:47:55 sean-imp gpasswd[1505]: user google added by root to group lxd
Jul 27 19:47:55 sean-imp gpasswd[1511]: user google added by root to group google-sudoers
Jul 27 19:47:55 sean-imp GCEGuestAgent[711]: 2021-07-27T19:47:55.1699Z GCEGuestAgent Info: Updating keys for user google.
Jul 27 19:47:55 sean-imp google_guest_agent[711]: 2021/07/27 19:47:55 logging client: rpc error: code = PermissionDenied desc = Clo>
lines 1-19/19 (END) |
=== Begin SRU Template ===
[Impact]
In PR #919 (81299de), we refactored some of the code used to bring up networks across distros. Previously, the call to bring up network interfaces during 'init' stage unintentionally resulted in a no-op such that network interfaces were NEVER brought up by cloud-init, even if new network interfaces were found after crawling the metadata.
In #919, the code was altered to bring up these discovered network interfaces. On Ubuntu, this results in a 'netplan apply' call during 'init' stage for any ubuntu-based distro on a datasource that has a NETWORK dependency. On GCE, this additional 'netplan apply' conflicts with the google-guest-agent service, shutting that service down due to that project's PartOf= systemd relationship resulting in an instance that can not be connected to.
To fix this, we added a new 'disable_network_activation' option that can be set to true in /etc/cloud/cloud.cfg.d/*.cfg by image creators to disable the activation of network interfaces in 'init' stage. This will avoid the 'netplan apply' call on GCE instances.
[Test Case]
An integration test has been added at `tests/integration_tests/datasources/test_network_dependency.py` to test this functionality. To test manually:
1. Launch an instance on GCE
2. Install the cloud-init version with the fix
3. Add a file, '/etc/cloud/cloud.cfg.d/99-disable-network-activation.cfg' with the contents:
disable_network_activation: true
4. Run cloud-init clean --logs
5. Create a new image based on this instance
6. Launch a new instance based on the new image
7. Instance should launch successfully and be accessible via SSH
8. "['netplan', 'apply']" should not be present anywhere in /var/log/cloud-init.log.
9. "Bringing up newly configured network interfaces" should not exist anywhere in /var/log/cloud-init.log
In the failure case, we will fail at step 7.
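Steps 8 and 9 of the manual test could be scripted roughly as follows. This is only a sketch, not part of the cloud-init test suite: the helper name `find_forbidden` and the constant `FORBIDDEN_MARKERS` are hypothetical, and it assumes the two strings appear in /var/log/cloud-init.log exactly as quoted in the steps above.

```python
from pathlib import Path

# The two strings that steps 8 and 9 say must NOT appear in the log.
# Assumption: they are logged verbatim, exactly as quoted in the test case.
FORBIDDEN_MARKERS = (
    "['netplan', 'apply']",
    "Bringing up newly configured network interfaces",
)


def find_forbidden(log_text):
    """Return the forbidden markers that occur in the given log text."""
    return [marker for marker in FORBIDDEN_MARKERS if marker in log_text]


if __name__ == "__main__":
    log_path = Path("/var/log/cloud-init.log")
    if log_path.exists():
        hits = find_forbidden(log_path.read_text(errors="replace"))
        if hits:
            print("FAIL: found " + ", ".join(hits))
        else:
            print("OK: no network activation messages found")
    else:
        print("cloud-init log not found; run this on the test instance")
```

An empty result from `find_forbidden` corresponds to passing steps 8 and 9; it says nothing about step 7, which still has to be checked by connecting over SSH.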
[Regression Potential]
The code in question determines whether to bring up interfaces after applying network config. Accidentally not doing this should not be a problem, as we previously (unintentionally) did not bring these interfaces up. Accidentally bringing up interfaces when we shouldn't also generally shouldn't cause a large problem outside of GCE, because outside of GCE there aren't (that we're aware of) other processes independently setting up the network. If this setup determination code somehow fails, it happens early enough in boot that it could leave an instance unusable; however, the code is small enough and defensive enough that we don't believe that is a possibility.
Additionally, any cloud datasource that is discovered in `init-local` stage (Azure, Ec2, Hetzner, IBMCloud, OpenStack and Oracle) isn't exposed to this code path, because full network config is emitted before system network is brought up, so there is no need to call `netplan apply` at that time.
[Other Info]
Github PR: https://github.com/canonical/cloud-init/pull/1048
Upstream commit: https://github.com/canonical/cloud-init/commit/9c147e8341e287366790e60658f646cdcc59bef2
=== End SRU Template ===
Original bug report:
Google Instances deployed with the Ubuntu 21.10 Daily images are inaccessible via SSH.
gcloud compute instances create sf-impish-v20200720 --zone us-west1-a --network "default" --no-restart-on-failure --image-project ubuntu-os-cloud-devel --image daily-ubuntu-2110-impish-v20210720 --machine-type n1-standard-2
Will result in a successful deploy yet, inaccessible via ssh from the end users configured laptop.
This appears to affect all daily images after 20210719.
daily-ubuntu-2110-impish-v20210719 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210720 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210721 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210723 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210724 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210725 ubuntu-os-cloud-devel ubuntu-2110 READY
daily-ubuntu-2110-impish-v20210728 ubuntu-os-cloud-devel ubuntu-2110
This problem also appears to be reproducible via the gcloud UI, create a new virtual machine using the daily-ubuntu-2110-impish-v20210720 or greater and instruct the virtual machine to import a ssh_pub_key in the security tab. The Instance will start, yet still be inaccessible via the users private sshkey
The google-guest-agent.service appears to be responsible for adding the google project ssh keys to the instance once its deployed. Please see below when queried on the 20210719 image:
google-guest-agent.service - Google Compute Engine Guest Agent
Loaded: loaded (/lib/systemd/system/google-guest-agent.service; enabled; vendor preset: enabled)
Active: active (running) since Tue 2021-07-27 19:47:48 UTC; 18h ago
Main PID: 711 (google_guest_ag)
Tasks: 9 (limit: 8924)
Memory: 19.7M
CGroup: /system.slice/google-guest-agent.service
└─711 /usr/bin/google_guest_agent
Jul 27 19:47:55 sean-imp gpasswd[1469]: user google added by root to group floppy
Jul 27 19:47:55 sean-imp gpasswd[1475]: user google added by root to group audio
Jul 27 19:47:55 sean-imp gpasswd[1481]: user google added by root to group dip
Jul 27 19:47:55 sean-imp gpasswd[1487]: user google added by root to group video
Jul 27 19:47:55 sean-imp gpasswd[1493]: user google added by root to group plugdev
Jul 27 19:47:55 sean-imp gpasswd[1499]: user google added by root to group netdev
Jul 27 19:47:55 sean-imp gpasswd[1505]: user google added by root to group lxd
Jul 27 19:47:55 sean-imp gpasswd[1511]: user google added by root to group google-sudoers
Jul 27 19:47:55 sean-imp GCEGuestAgent[711]: 2021-07-27T19:47:55.1699Z GCEGuestAgent Info: Updating keys for user google.
Jul 27 19:47:55 sean-imp google_guest_agent[711]: 2021/07/27 19:47:55 logging client: rpc error: code = PermissionDenied desc = Clo>
lines 1-19/19 (END) |
|
2021-10-14 14:26:20 |
Launchpad Janitor |
cloud-init (Ubuntu Hirsute): status |
Fix Committed |
Fix Released |
|
2021-10-14 14:26:24 |
Brian Murray |
removed subscriber Ubuntu Stable Release Updates Team |
|
|
|
2021-10-14 14:26:47 |
Launchpad Janitor |
cloud-init (Ubuntu Focal): status |
Fix Committed |
Fix Released |
|
2021-10-14 14:27:17 |
Launchpad Janitor |
cloud-init (Ubuntu Bionic): status |
Fix Committed |
Fix Released |
|
2021-10-19 16:23:05 |
Launchpad Janitor |
netplan.io (Ubuntu): status |
New |
Confirmed |
|
2022-01-10 17:58:28 |
Brian Murray |
netplan.io (Ubuntu): status |
Confirmed |
Fix Released |
|
2022-01-26 22:03:55 |
Brian Murray |
google-guest-agent (Ubuntu Hirsute): status |
Confirmed |
Won't Fix |
|
2022-01-28 15:05:31 |
Launchpad Janitor |
google-guest-agent (Ubuntu): status |
Confirmed |
Fix Released |
|
2022-07-18 22:58:28 |
Brian Murray |
google-guest-agent (Ubuntu Impish): status |
Confirmed |
Won't Fix |
|
2022-08-16 14:58:42 |
Launchpad Janitor |
google-guest-agent (Ubuntu Focal): status |
Confirmed |
Fix Released |
|
2022-08-16 15:03:52 |
Launchpad Janitor |
google-guest-agent (Ubuntu Bionic): status |
Confirmed |
Fix Released |
|