Oracle DataSource Fails When Used With a Bionic Image

Bug #1939603 reported by John Chittum
This bug affects 2 people
Affects              Status        Importance  Assigned to  Milestone
cloud-init (Ubuntu)  Fix Released  Medium      Unassigned
Bionic               Fix Released  Undecided   Unassigned
Focal                Fix Released  Undecided   Unassigned
Hirsute              Fix Released  Undecided   Unassigned

Bug Description

=== Begin SRU Template ===
[Impact]
When attempting to launch a Bionic instance on Oracle Cloud Infrastructure with the datasource explicitly set to [ Oracle ], the instance fails to run the OracleDataSource, and cloud-init eventually falls back to NoDataSource. The root cause is that cloud-init tries to add routes while setting up an ephemeral DHCP network, even though the interface is already configured (the `ip route add` call fails with "RTNETLINK answers: File exists"). We can instead check for a response from the hardcoded metadata URL and skip adding the unnecessary routes.
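
The idea above can be sketched roughly as follows. This is a standalone illustration, not the code from the upstream commit: the names `has_connectivity` and `ephemeral_network` are hypothetical, and the metadata URL and header shown in the comments are assumptions about Oracle's metadata service rather than values taken from this report.

```python
import urllib.request
from contextlib import contextmanager

# Assumed values, for illustration only (not quoted from this bug report):
#   METADATA_URL = "http://169.254.169.254/opc/v2/instance/"
#   HEADERS = {"Authorization": "Bearer Oracle"}


def has_connectivity(url, headers=None, timeout=5):
    """Return True if an HTTP request to `url` succeeds (hypothetical helper)."""
    request = urllib.request.Request(url, headers=headers or {})
    try:
        with urllib.request.urlopen(request, timeout=timeout):
            return True
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses,
        # so any failed or error response counts as "no connectivity".
        return False


@contextmanager
def ephemeral_network(url, setup, teardown, probe=has_connectivity):
    """Only bring up temporary networking when the metadata URL is unreachable.

    `setup` and `teardown` are caller-supplied callables standing in for the
    address and route configuration. Yields True if setup was performed.
    """
    if probe(url):
        # The instance can already reach the metadata service: adding
        # addresses or routes again is unnecessary and may fail.
        yield False
        return
    setup()
    try:
        yield True
    finally:
        teardown()
```

With a reachable URL the `setup` callable is never invoked, which is the behavior that avoids the failing route command; if the probe fails, the sketch falls back to the old path, matching the regression note in this template.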

[Test Case]
1. Launch Oracle Bionic instance
2. Install cloud-init proposed version
3. mv /etc/cloud/cloud.cfg.d/99-oracle-compute-infra-datasource.cfg /etc/cloud/cloud.cfg.d/99-oracle-compute-infra-datasource.cfg.bak # File included as part of image build process
4. Enable Oracle in `dpkg-reconfigure cloud-init` # Only required for existing instances
5. Verify the datasource listed via `cloud-init status -l` shows DataSourceOracle and not DataSourceNoCloud or DataSourceOpenStack
6. Verify /var/log/cloud-init.log has no errors due to setting up routes.
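
Steps 5 and 6 could be checked with a small script along these lines. This is only a sketch: the function names are hypothetical, and the set of strings treated as route-related errors is an assumption based on the traceback in this report, not an exhaustive list.

```python
import re

# Assumed markers, taken from the failure shown in this bug report.
ROUTE_ERROR_MARKERS = ("RTNETLINK answers", "ProcessExecutionError", "Traceback")


def detected_datasources(status_output):
    """Return every DataSource* name that appears in `cloud-init status -l` output."""
    return set(re.findall(r"DataSource\w+", status_output))


def datasource_is_oracle(status_output):
    """True only if DataSourceOracle is the sole datasource mentioned."""
    return detected_datasources(status_output) == {"DataSourceOracle"}


def suspicious_log_lines(log_text):
    """Return lines from cloud-init.log that contain any of the error markers."""
    return [
        line
        for line in log_text.splitlines()
        if any(marker in line for marker in ROUTE_ERROR_MARKERS)
    ]
```

On an instance, the inputs would be the captured output of `cloud-init status -l` and the contents of /var/log/cloud-init.log; an empty list from `suspicious_log_lines` corresponds to step 6 passing.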

[Regression Potential]
If the metadata service is down, we'll fall back to the erroneous behavior. However, cloud-init will fail in other ways if the metadata service is inaccessible.

[Other Info]
Github PR: https://github.com/canonical/cloud-init/pull/988
Upstream commit:
https://github.com/canonical/cloud-init/commit/612e39087aee3b1242765e7c4f463f54a6ebd723

=== End SRU Template ===

Initial bug:

When attempting to launch a Bionic instance on Oracle Cloud Infrastructure, with an explicitly set datasource: [ Oracle ], the instance fails to run the OracleDataSource. This leads to the instance not having SSH keys imported from the metadata service. The failure is related to the command:

Running command ['ip', '-4', 'route', 'add', '0.0.0.0/0', 'via', '10.0.0.1', 'dev', 'ens3'] with allowed return codes [0] (shell=False, capture=True)

which showed up in the logs:

2021-08-11 13:56:13,289 - util.py[DEBUG]: Reading from /var/tmp/cloud-init/cloud-init-dhcp-p8n35ztd/dhcp.leases (quiet=False)
2021-08-11 13:56:13,289 - util.py[DEBUG]: Read 519 bytes from /var/tmp/cloud-init/cloud-init-dhcp-p8n35ztd/dhcp.leases
2021-08-11 13:56:13,289 - dhcp.py[DEBUG]: Received dhcp lease on ens3 for 10.0.0.66/255.255.255.0
2021-08-11 13:56:13,289 - __init__.py[DEBUG]: Attempting setup of ephemeral network on ens3 with 10.0.0.66/24 brd 10.0.0.255
2021-08-11 13:56:13,289 - subp.py[DEBUG]: Running command ['ip', '-family', 'inet', 'addr', 'add', '10.0.0.66/24', 'broadcast', '10.0.0.255', 'dev', 'ens3'] with allowed return codes [0] (shell=False, capture=True)
2021-08-11 13:56:13,291 - __init__.py[DEBUG]: Skip ephemeral network setup, ens3 already has address 10.0.0.66
2021-08-11 13:56:13,291 - subp.py[DEBUG]: Running command ['ip', '-4', 'route', 'add', '0.0.0.0/0', 'via', '10.0.0.1', 'dev', 'ens3'] with allowed return codes [0] (shell=False, capture=True)
2021-08-11 13:56:13,293 - handlers.py[DEBUG]: finish: init-local/search-Oracle: FAIL: no local data found from DataSourceOracle
2021-08-11 13:56:13,293 - util.py[WARNING]: Getting data from <class 'cloudinit.sources.DataSourceOracle.DataSourceOracle'> failed
2021-08-11 13:56:13,293 - util.py[DEBUG]: Getting data from <class 'cloudinit.sources.DataSourceOracle.DataSourceOracle'> failed
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/cloudinit/sources/__init__.py", line 792, in find_source
    if s.update_metadata([EventType.BOOT_NEW_INSTANCE]):
  File "/usr/lib/python3/dist-packages/cloudinit/sources/__init__.py", line 681, in update_metadata
    result = self.get_data()
  File "/usr/lib/python3/dist-packages/cloudinit/sources/__init__.py", line 292, in get_data
    return_value = self._get_data()
  File "/usr/lib/python3/dist-packages/cloudinit/sources/DataSourceOracle.py", line 138, in _get_data
    with network_context:
  File "/usr/lib/python3/dist-packages/cloudinit/net/dhcp.py", line 57, in __enter__
    return self.obtain_lease()
  File "/usr/lib/python3/dist-packages/cloudinit/net/dhcp.py", line 110, in obtain_lease
    ephipv4.__enter__()
  File "/usr/lib/python3/dist-packages/cloudinit/net/__init__.py", line 1088, in __enter__
    self._bringup_static_routes()
  File "/usr/lib/python3/dist-packages/cloudinit/net/__init__.py", line 1142, in _bringup_static_routes
    ['dev', self.interface], capture=True)
  File "/usr/lib/python3/dist-packages/cloudinit/subp.py", line 295, in subp
    cmd=args)
cloudinit.subp.ProcessExecutionError: Unexpected error while running command.
Command: ['ip', '-4', 'route', 'add', '0.0.0.0/0', 'via', '10.0.0.1', 'dev', 'ens3']
Exit code: 2
Reason: -
Stdout:
Stderr: RTNETLINK answers: File exists

This eventually leads to cloud-init falling back to NoDataSource.
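
For context, "RTNETLINK answers: File exists" means the route being added is already present. A minimal sketch of checking for that before issuing the command is shown below. It is not how the bug was fixed; the helper names are hypothetical, and the input is assumed to be the text printed by `ip -4 route show`.

```python
def has_default_route(route_show_output):
    """Return True if the `ip -4 route show` output lists a default route."""
    for line in route_show_output.splitlines():
        fields = line.split()
        if fields and fields[0] in ("default", "0.0.0.0/0"):
            return True
    return False


def plan_default_route(route_show_output, gateway, interface):
    """Return the `ip route add` command to run, or None if it would be redundant."""
    if has_default_route(route_show_output):
        return None
    # Same command shape as the one that failed in the log above.
    return ["ip", "-4", "route", "add", "0.0.0.0/0", "via", gateway, "dev", interface]
```

Given routing-table text that already contains a `default via ...` entry, `plan_default_route` returns None instead of a command that would exit with code 2.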

To create this image, I:

* Updated CPC's livecd-rootfs code for Oracle to include:

# etc/cloud/cloud.cfg.d/99-oracle-compute-infra-datasource.cfg
# Configuration for Oracle Cloud Infrastructure
datasource_list: [ Oracle ]

* Created an image from CPC's livecd-rootfs using ubuntu-bartender
* Registered a custom image in OCI
* Attempted to create an instance using the custom image

I was unable to connect via SSH, getting "Permission denied (publickey)".
I attempted to create a serial connection; however, I was never able to successfully SSH in. It just hung forever.

In a second attempt, I tried to pass a username:password to cloud-init. However, because the datasource failed and cloud-init fell back to NoDataSource, my custom data was not loaded either.

I was able to collect logs by terminating the instance but keeping the boot volume. I then created a Bionic instance using the platform image, and verified that it worked with the OpenStack datasource currently in use. I attached the boot volume from the now-terminated instance as a block volume, ran the required iscsi commands (found via the web console after attaching the block volume), and mounted the drive at /mnt/nods. I was then able to collect the logs in /mnt/nods/var/log/cloud-init*. Because of how I had to collect logs, I was unable to run `cloud-init collect-logs` on the affected instance. I did try running cloud-init in a chroot setup, like `sudo chroot /mnt/nods cloud-init collect-logs`, but this failed because the command `cloud-init` could not be found. Honestly not sure if that's the correct approach in the circumstance.
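
The manual collection described above could be scripted roughly like this. It is a sketch under assumptions: the function name is hypothetical, it only archives the files matching var/log/cloud-init* under the mount point, and it does not reproduce whatever else `cloud-init collect-logs` would gather on a running instance.

```python
import tarfile
from pathlib import Path


def collect_logs_from_mount(mount_root, archive_path):
    """Archive var/log/cloud-init* files found under a mounted root filesystem.

    Returns the list of file names that were added to the archive.
    """
    log_dir = Path(mount_root) / "var" / "log"
    logs = sorted(p for p in log_dir.glob("cloud-init*") if p.is_file())
    with tarfile.open(str(archive_path), "w:gz") as archive:
        for log_file in logs:
            # Store only the file name so the archive does not embed the mount path.
            archive.add(str(log_file), arcname=log_file.name)
    return [log_file.name for log_file in logs]
```

For the setup in this report, the call would look like `collect_logs_from_mount("/mnt/nods", "cloud-init-logs.tar.gz")`, run with enough privileges to read the mounted volume.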

To reproduce, an image would need to be made with the datasource explicitly set to Oracle.

John Chittum (jchittum) wrote :
James Falcon (falcojr)
Changed in cloud-init (Ubuntu):
status: New → Triaged
importance: Undecided → Medium
James Falcon (falcojr) wrote :
James Falcon (falcojr)
Changed in cloud-init (Ubuntu):
status: Triaged → Fix Committed
James Falcon (falcojr)
description: updated
Brian Murray (brian-murray) wrote : Please test proposed package

Hello John, or anyone else affected,

Accepted cloud-init into hirsute-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/cloud-init/21.3-1-g6803368d-0ubuntu1~21.04.3 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-hirsute to verification-done-hirsute. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-hirsute. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in cloud-init (Ubuntu Hirsute):
status: New → Fix Committed
tags: added: verification-needed verification-needed-hirsute
Changed in cloud-init (Ubuntu Focal):
status: New → Fix Committed
tags: added: verification-needed-focal
Brian Murray (brian-murray) wrote :

Hello John, or anyone else affected,

Accepted cloud-init into focal-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/cloud-init/21.3-1-g6803368d-0ubuntu1~20.04.3 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-focal to verification-done-focal. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-focal. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Brian Murray (brian-murray) wrote :

Hello John, or anyone else affected,

Accepted cloud-init into bionic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/cloud-init/21.3-1-g6803368d-0ubuntu1~18.04.3 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-bionic to verification-done-bionic. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-bionic. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in cloud-init (Ubuntu Bionic):
status: New → Fix Committed
tags: added: verification-needed-bionic
James Falcon (falcojr)
description: updated
James Falcon (falcojr) wrote :

My manual testing passed.

I followed the test case procedures manually. No Tracebacks or WARNings exist in /var/log/cloud-init.log . Some queries:

$ cloud-init query v1.cloud_name
oracle

$ cloud-init query v1.platform
oracle

$ cloud-init status -l
status: done
time: Thu, 23 Sep 2021 13:23:49 +0000
detail:
DataSourceOracle

I'm leaving this test manual as I'm not entirely sure this will be default Bionic behavior (as opposed to keeping OpenStack datasource).

James Falcon (falcojr) wrote :

Attached file 21.3-1_series3.tar.gz.

These are the latest Azure and Oracle runs represented by this issue.

James Falcon (falcojr)
tags: added: verification-done verification-done-bionic verification-done-focal verification-done-hirsute
removed: verification-needed verification-needed-bionic verification-needed-focal verification-needed-hirsute
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package cloud-init - 21.3-1-g6803368d-0ubuntu1~21.04.3

---------------
cloud-init (21.3-1-g6803368d-0ubuntu1~21.04.3) hirsute; urgency=medium

  * cherry-pick 612e3908: Add connectivity_url to Oracle's
    EphemeralDHCPv4 (#988) (LP: #1939603)
  * cherry-pick dc227869: Set Azure to apply networking config every BOOT
    (#1023)

cloud-init (21.3-1-g6803368d-0ubuntu1~21.04.2) hirsute; urgency=medium

  * cherry-pick 28e56d99: Azure: Retry dhcp on timeouts when polling
    reprovisiondata
  * cherry-pick e69a8874: Set Azure to only update metadata on
    BOOT_NEW_INSTANCE

cloud-init (21.3-1-g6803368d-0ubuntu1~21.04.1) hirsute; urgency=medium

  * d/cloud-init.templates: Add VMware datasource support
  * d/control: Add dependencies on python3-netifaces for VMware ds
  * New upstream snapshot. (LP: #1940871)
    - testing: Fix ssh keys integration test (#992)
    - Release 21.3 (#993)
    - Azure: During primary nic detection, check interface status continuously
      before rebinding again (#990) [aswinrajamannar]
    - Fix home permissions modified by ssh module (SC-338) (#984)
    - Add integration test for sensitive jinja substitution (#986)
    - Ignore hotplug socket when collecting logs (#985)
    - testing: Add missing mocks to test_vmware.py (#982)
    - add Zadara Edge Cloud Platform to the supported clouds list (#963)
      [sarahwzadara]
    - testing: skip upgrade tests on LXD VMs (#980)
    - Only invoke hotplug socket when functionality is enabled (#952)
    - Revert unnecessary lcase in ds-identify (#978) [Andrew Kutz]
    - cc_resolv_conf: fix typos (#969) [Shreenidhi Shedi]
    - Replace broken httpretty tests with mock (SC-324) (#973)
    - Azure: Check if interface is up after sleep when trying to bring it up
      (#972) [aswinrajamannar]
    - Update dscheck_VMware's rpctool check (#970) [Shreenidhi Shedi]
    - Azure: Logging the detected interfaces (#968) [Moustafa Moustafa]
    - Change netifaces dependency to 0.10.4 (#965) [Andrew Kutz]
    - Azure: Limit polling network metadata on connection errors (#961)
      [aswinrajamannar]
    - Update inconsistent indentation (#962) [Andrew Kutz]
    - cc_puppet: support AIO installations and more (#960) [Gabriel Nagy]
    - Add Puppet contributors to CLA signers (#964) [Noah Fontes]
    - Datasource for VMware (#953) [Andrew Kutz]
    - photon: refactor hostname handling and add networkd activator (#958)
      [sshedi]
    - Stop copying ssh system keys and check folder permissions (#956)
      [Emanuele Giuseppe Esposito]
    - testing: port remaining cloud tests to integration testing framework
      (SC-191) (#955)
    - generate contents for ovf-env.xml when provisioning via IMDS (#959)
      [Anh Vo]
    - Add support for EuroLinux 7 && EuroLinux 8 (#957) [Aleksander Baranowski]
    - Implementing device_aliases as described in docs (#945) [Mal Graty]
    - testing: fix test_ssh_import_id.py (#954)
    - Add ability to manage fallback network config on PhotonOS (#941) [sshedi]
    - Add VZLinux support (#951) [eb3095]
    - VMware: add network-config support in ovf-env.xml (#947) [PengpengSun]
    - Update pylint to v2.9.3 and fix the new issues...


Changed in cloud-init (Ubuntu Hirsute):
status: Fix Committed → Fix Released
Brian Murray (brian-murray) wrote : Update Released

The verification of the Stable Release Update for cloud-init has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package cloud-init - 21.3-1-g6803368d-0ubuntu1~20.04.3

---------------
cloud-init (21.3-1-g6803368d-0ubuntu1~20.04.3) focal; urgency=medium

  * cherry-pick 612e3908: Add connectivity_url to Oracle's
    EphemeralDHCPv4 (#988) (LP: #1939603)
  * cherry-pick dc227869: Set Azure to apply networking config every BOOT
    (#1023)

cloud-init (21.3-1-g6803368d-0ubuntu1~20.04.2) focal; urgency=medium

  * cherry-pick 28e56d99: Azure: Retry dhcp on timeouts when polling
    reprovisiondata
  * cherry-pick e69a8874: Set Azure to only update metadata on
    BOOT_NEW_INSTANCE

cloud-init (21.3-1-g6803368d-0ubuntu1~20.04.1) focal; urgency=medium

  * d/cloud-init.templates: Add VMware datasource support
  * d/control: Add dependencies on python3-netifaces for VMware ds
  * New upstream snapshot. (LP: #1940871)
    - testing: Fix ssh keys integration test (#992)
    - Release 21.3 (#993)
    - Azure: During primary nic detection, check interface status continuously
      before rebinding again (#990) [aswinrajamannar]
    - Fix home permissions modified by ssh module (SC-338) (#984)
    - Add integration test for sensitive jinja substitution (#986)
    - Ignore hotplug socket when collecting logs (#985)
    - testing: Add missing mocks to test_vmware.py (#982)
    - add Zadara Edge Cloud Platform to the supported clouds list (#963)
      [sarahwzadara]
    - testing: skip upgrade tests on LXD VMs (#980)
    - Only invoke hotplug socket when functionality is enabled (#952)
    - Revert unnecessary lcase in ds-identify (#978) [Andrew Kutz]
    - cc_resolv_conf: fix typos (#969) [Shreenidhi Shedi]
    - Replace broken httpretty tests with mock (SC-324) (#973)
    - Azure: Check if interface is up after sleep when trying to bring it up
      (#972) [aswinrajamannar]
    - Update dscheck_VMware's rpctool check (#970) [Shreenidhi Shedi]
    - Azure: Logging the detected interfaces (#968) [Moustafa Moustafa]
    - Change netifaces dependency to 0.10.4 (#965) [Andrew Kutz]
    - Azure: Limit polling network metadata on connection errors (#961)
      [aswinrajamannar]
    - Update inconsistent indentation (#962) [Andrew Kutz]
    - cc_puppet: support AIO installations and more (#960) [Gabriel Nagy]
    - Add Puppet contributors to CLA signers (#964) [Noah Fontes]
    - Datasource for VMware (#953) [Andrew Kutz]
    - photon: refactor hostname handling and add networkd activator (#958)
      [sshedi]
    - Stop copying ssh system keys and check folder permissions (#956)
      [Emanuele Giuseppe Esposito]
    - testing: port remaining cloud tests to integration testing framework
      (SC-191) (#955)
    - generate contents for ovf-env.xml when provisioning via IMDS (#959)
      [Anh Vo]
    - Add support for EuroLinux 7 && EuroLinux 8 (#957) [Aleksander Baranowski]
    - Implementing device_aliases as described in docs (#945) [Mal Graty]
    - testing: fix test_ssh_import_id.py (#954)
    - Add ability to manage fallback network config on PhotonOS (#941) [sshedi]
    - Add VZLinux support (#951) [eb3095]
    - VMware: add network-config support in ovf-env.xml (#947) [PengpengSun]
    - Update pylint to v2.9.3 and fix the new issues it sp...


Changed in cloud-init (Ubuntu Focal):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package cloud-init - 21.3-1-g6803368d-0ubuntu1~18.04.3

---------------
cloud-init (21.3-1-g6803368d-0ubuntu1~18.04.3) bionic; urgency=medium

  * d/cloud-init.templates: Add Oracle datasource support
  * cherry-pick 612e3908: Add connectivity_url to Oracle's
    EphemeralDHCPv4 (#988) (LP: #1939603)
  * cherry-pick dc227869: Set Azure to apply networking config every BOOT
    (#1023)

cloud-init (21.3-1-g6803368d-0ubuntu1~18.04.2) bionic; urgency=medium

  * cherry-pick 28e56d99: Azure: Retry dhcp on timeouts when polling
    reprovisiondata
  * cherry-pick e69a8874: Set Azure to only update metadata on
    BOOT_NEW_INSTANCE

cloud-init (21.3-1-g6803368d-0ubuntu1~18.04.1) bionic; urgency=medium

  * d/cloud-init.templates: Add VMware datasource support
  * d/control: Add dependencies on python3-netifaces for VMware ds
  * d/patches/ubuntu-advantage-revert-tip.patch: drop revert patch
    + ubuntu-advantage-tools completed SRU to bionic. Bionic now
      compatible with upstream ua python-client CLI behavior.
  * refresh patches:
   + debian/patches/ec2-dont-apply-full-imds-network-config.patch
   + debian/patches/openstack-no-network-config.patch
   + debian/patches/renderer-do-not-prefer-netplan.patch
  * New upstream snapshot. (LP: #1940871)
    - testing: Fix ssh keys integration test (#992)
    - Release 21.3 (#993)
    - Azure: During primary nic detection, check interface status continuously
      before rebinding again (#990) [aswinrajamannar]
    - Fix home permissions modified by ssh module (SC-338) (#984)
    - Add integration test for sensitive jinja substitution (#986)
    - Ignore hotplug socket when collecting logs (#985)
    - testing: Add missing mocks to test_vmware.py (#982)
    - add Zadara Edge Cloud Platform to the supported clouds list (#963)
      [sarahwzadara]
    - testing: skip upgrade tests on LXD VMs (#980)
    - Only invoke hotplug socket when functionality is enabled (#952)
    - Revert unnecessary lcase in ds-identify (#978) [Andrew Kutz]
    - cc_resolv_conf: fix typos (#969) [Shreenidhi Shedi]
    - Replace broken httpretty tests with mock (SC-324) (#973)
    - Azure: Check if interface is up after sleep when trying to bring it up
      (#972) [aswinrajamannar]
    - Update dscheck_VMware's rpctool check (#970) [Shreenidhi Shedi]
    - Azure: Logging the detected interfaces (#968) [Moustafa Moustafa]
    - Change netifaces dependency to 0.10.4 (#965) [Andrew Kutz]
    - Azure: Limit polling network metadata on connection errors (#961)
      [aswinrajamannar]
    - Update inconsistent indentation (#962) [Andrew Kutz]
    - cc_puppet: support AIO installations and more (#960) [Gabriel Nagy]
    - Add Puppet contributors to CLA signers (#964) [Noah Fontes]
    - Datasource for VMware (#953) [Andrew Kutz]
    - photon: refactor hostname handling and add networkd activator (#958)
      [sshedi]
    - Stop copying ssh system keys and check folder permissions (#956)
      [Emanuele Giuseppe Esposito]
    - testing: port remaining cloud tests to integration testing framework
      (SC-191) (#955)
    - generate contents for ovf-env.xml when provisioning via IMDS (#959)
      [Anh Vo]
    - ...


Changed in cloud-init (Ubuntu Bionic):
status: Fix Committed → Fix Released
James Falcon (falcojr)
Changed in cloud-init (Ubuntu):
status: Fix Committed → Fix Released
zhouzhong (zhouzhong) wrote :

Hi, this affected me too. What is the root cause of the Bionic route conflict? Does Bionic also preset a default route at startup?
