full config file wiped after apt-upgrade issued

Bug #1766401 reported by James Tigert on 2018-04-24
This bug affects 1 person
Affects               Importance  Assigned to
cloud-images          Undecided   Philip Roche
cloud-init            Medium      Unassigned
cloud-init (Ubuntu)   Low         Unassigned
  Xenial              High        Unassigned
  Artful              High        Unassigned
  Bionic              Low         Unassigned
  Cosmic              Low         Unassigned

Bug Description

=== Begin SRU Template ===
[Impact]
Official Ubuntu 16.04 and 17.10 images running on IBM public cloud
become unreachable after cloud-init is upgraded and the instance is rebooted.

This happens because cloud-init incorrectly identifies the newly added
'IBMCloud' as the datasource to use and skips the previously used
ConfigDrive or NoCloud datasource. Two factors combine:
a.) Official 16.04 and 17.10 images of Ubuntu have seeded data
 in their /var/lib/cloud/seed/ directory that was used on first
 boot and should continue to be used.
b.) In some scenarios IBMCloud's datasource looks similar
 to that of Config Drive. The version of cloud-init in xenial and
 artful disables config-drive in order to identify the
 IBMCloud datasource. That should not happen unless the IBMCloud
 datasource is enabled, and it is configured off in those images.

The fix is to continue using the existing datasource on those images.
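
The intent of the fix can be illustrated with a small shell sketch. This is not the actual ds-identify code: the function names (is_listed, pick_datasource), the "looks like IBM" flag, and the "first enabled datasource wins" fallback are all assumptions made for illustration. It only shows the rule described above: IBMCloud may take over from ConfigDrive or NoCloud only when IBMCloud is itself in the enabled list.

```shell
#!/bin/sh
# Hypothetical sketch of the decision rule; NOT the real ds-identify logic.

# is_listed NAME "LIST": succeed if NAME is in the space-separated LIST.
is_listed() {
    name="$1"
    for ds in $2; do
        [ "$ds" = "$name" ] && return 0
    done
    return 1
}

# pick_datasource "ENABLED_LIST" LOOKS_LIKE_IBM
#   ENABLED_LIST   space-separated datasource names (assumed input)
#   LOOKS_LIKE_IBM "yes" if the platform resembles IBM Cloud (assumed input)
pick_datasource() {
    enabled="$1"
    looks_like_ibm="$2"
    # Only let IBMCloud win when it is explicitly enabled.
    if [ "$looks_like_ibm" = "yes" ] && is_listed IBMCloud "$enabled"; then
        echo IBMCloud
        return 0
    fi
    # Otherwise keep using an already-enabled datasource
    # (simplification: the first non-IBMCloud entry in the list).
    for ds in $enabled; do
        [ "$ds" = "IBMCloud" ] && continue
        echo "$ds"
        return 0
    done
    echo None
    return 1
}
```

With the list shipped in the affected images, `pick_datasource "ConfigDrive NoCloud" yes` keeps ConfigDrive even though the platform resembles IBM Cloud, which is the behaviour the fix aims to restore.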

[Test Case]
 * Launch an image on IBM public cloud.
 * add -proposed
 * apt-get update && apt-get install cloud-init
 * reboot
 * ssh back in

If you are able to ssh back in, then this bug has been fixed.
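
For the "add -proposed" step, one possible approach is sketched below. The archive URL and the proposed.list file name are the ones used in the artful verification later in this report; the helper name proposed_line is an assumption for illustration only.

```shell
#!/bin/sh
# proposed_line RELEASE: print an apt source line for RELEASE-proposed.
# (Helper name is hypothetical; the URL matches the one used in this report.)
proposed_line() {
    echo "deb http://archive.ubuntu.com/ubuntu $1-proposed main"
}

# Example use on a xenial instance (run as root), left commented out
# because it changes system state:
#   proposed_line xenial > /etc/apt/sources.list.d/proposed.list
#   apt-get update && apt-get install cloud-init
#   reboot
```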

[Regression Potential]
This specific fix is tightly contained to the code path involved in
IBM cloud. The most likely failure scenario would be a failure
in 18.04 images, which do not have the additional datasources built
into the images.

[Other Info]
Upstream commit at
  https://git.launchpad.net/cloud-init/commit/?id=11172924

=== End SRU Template ===

file: 50-cloud-init.cfg wiped of its initial (working) configuration after a standard apt-upgrade was issued.

Cloud provider is SoftLayer.


James Tigert (jimmy-tigert) wrote :

Version: Ubuntu 16.04.4

Kernel: 4.4.0-121-generic

David Britton (davidpbritton) wrote :

Some questions...

1) Do you have the previous contents of 50-cloud-init.cfg?

2) Did you make modifications to the file, and they were "reverted"?

3) Did you reboot, or was it just an `apt-get dist-upgrade` that you ran?

4) Please also attach what the cloud-init version was before and after your apt-upgrade attempt.
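
One way to recover the before and after versions for question 4, even after the upgrade has happened, is to read dpkg's log. The sketch below assumes the standard dpkg.log line layout (date, time, action, package:arch, old version, new version) and the default log path; the function name cloud_init_upgrades is hypothetical.

```shell
#!/bin/sh
# cloud_init_upgrades [LOGFILE]: list recorded cloud-init upgrades as
# "DATE TIME OLD -> NEW". Defaults to /var/log/dpkg.log (assumed location).
cloud_init_upgrades() {
    log="${1:-/var/log/dpkg.log}"
    awk '$3 == "upgrade" {
             split($4, p, ":")          # strip the ":arch" suffix
             if (p[1] == "cloud-init")
                 print $1, $2, $5, "->", $6
         }' "$log"
}
```

Older entries may live in rotated files such as /var/log/dpkg.log.1, which can be passed as the argument. The currently installed version is available from `dpkg-query -W cloud-init`.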

Changed in cloud-init:
status: New → Incomplete
tags: removed: dsid
Chad Smith (chad.smith) wrote :

A run of 'sudo cloud-init collect-logs' should create a cloud-init.tar.gz in your current working directory that could be attached to this bug. It'd give us the logs and config files that would aid triage in this case so we can see why cloud-init thought it should overwrite this file.

James Tigert (jimmy-tigert) wrote :

Thanks for getting back to me.

Hope this can help in some way:

1) Do you have the previous contents of 50-cloud-init.cfg?

Once the apt-upgrade was run, the contents were wiped. The only contents
that were left were

auto lo
iface lo inet loopback

2) Did you make modifications to the file, and they were "reverted"?

No mods were made. The VSI was running as installed from SoftLayer's
provisioning ISO. Whereas SEVERAL apt-upgrades (and reboots) were performed
previous to this issue, I can only assume that this is related to a newer
patch. The VSI has been up for a couple of months. I need to be clear, this
happened to 4 other VSI's I'm running, so this isn't a single user issue.
The other VSI's are configured the same way. I have 6 others that I haven't
updated as I am awaiting a bug fix. (Yes, I have already backed up their
network configs).

3) Did you reboot, or was it just an `apt-get dist-upgrade` that you
ran?

Yes. It wasn't until I rebooted that I discovered that my host (and 4
others) was unreachable. I had to KVM into the VSI to "restore" what was
lost.

4) Please also attach what the cloud-init version was before and after
your apt-upgrade attempt.

I don't have the original, I've only restored what I "think" was lost based
on another host that hasn't been updated yet. They're working again, so I
assume that what I restored was correct.

Here's what's there now (I've obviously removed the actual IP's):

# This file is generated from information provided by
# the datasource. Changes to it will not persist across an instance.
# To disable cloud-init's network configuration capabilities, write a file
# /etc/cloud/cloud.cfg.d/99-disable-network-config.cfg with the following:
# network: {config: disabled}
auto lo
iface lo inet loopback
   dns-nameservers 10.0.x.x 10.0.x.x

auto eth0
iface eth0 inet static
   address 10.x.x.x/24
   post-up route add -net 10.0.0.0 netmask 255.0.0.0 gw 10.x.x.1 || true
   pre-down route del -net 10.0.0.0 netmask 255.0.0.0 gw 10.x.x.1 || true
   post-up route add -net 161.x.x.x netmask 255.255.0.0 gw 10.x.x.1 || true
   pre-down route del -net 161.x.x.x netmask 255.255.0.0 gw 10.x.x.1 || true

auto eth1
iface eth1 inet static
  address 169.x.x.x/27
  post-up route add default gw 169.x.x.x || true
  pre-down route del default gw 169.x.x.x || true

** Changed in: cloud-init
       Status: New => Incomplete

Let me know if you need more.

*James Tigert*

*"I am not bound to win, but I am bound to be true. I am not bound to
succeed,but I am bound to live by the light that I have. I must stand with
anybody that stands right, and stand with him while he is right, and part
with him when he goes wrong." -Abraham Lincoln*


David Britton (davidpbritton) wrote :

Hi James

Thank you. If you could also please append logs from the collect-logs command that Chad requested.

David Britton (davidpbritton) wrote :

Partial Analysis from Chad...

The failed upgrade path in Xenial is due to a file in the image,

/etc/cloud/cloud.cfg.d/99_networklayer_common.cfg

which limits datasource_list: [ ConfigDrive, NoCloud ]. If we change that line to

  datasource_list: [ ConfigDrive, NoCloud, IBMCloud ]

ds-identify can discover IBMCloud in xenial. Also, if we remove 99_networklayer_common.cfg altogether, ds-identify can discover IBMCloud.
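
To check which datasources a given config file restricts cloud-init to, something like the following sketch may help. It is a rough helper, not part of cloud-init: the function name datasource_list_of is an assumption, and it only handles the single-line bracketed form shown above, not multi-line YAML lists.

```shell
#!/bin/sh
# datasource_list_of FILE: print the entries of a one-line
# "datasource_list: [ A, B ]" setting as space-separated names.
# Prints nothing if FILE has no such line.
datasource_list_of() {
    sed -n 's/^[[:space:]]*datasource_list:[[:space:]]*\[\(.*\)\].*$/\1/p' "$1" |
        tr -d ' ' | tr ',' ' '
}

# Example (path taken from the analysis above):
#   datasource_list_of /etc/cloud/cloud.cfg.d/99_networklayer_common.cfg
```

On an affected image the output would be expected to list ConfigDrive and NoCloud without IBMCloud, matching the analysis above.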

Dan Watkins (daniel-thewatkins) wrote :

OK, so I think the next step here is to create a test xenial image without that datasource_list and confirm that it boots as expected.

This, however, will not help the situation for already-deployed instances upgrading, so I guess we'll also need to look at having a cloud-init postinst hook go and make this modification on the fly?

Changed in cloud-images:
assignee: nobody → Philip Roche (philroche)
James Tigert (jimmy-tigert) wrote :

Sorry for the delay. Work + Life = Crazy.

Please see attached files.

*James Tigert*


Chad Smith (chad.smith) wrote :

Validated Scott's attached fix on an upgraded Xenial exhibiting the same incorrect NoCloudDatasource identification. Once this related branch is 'Fix released' into bionic, an apt upgrade to the new cloud-init deb will correct the dhcp misconfiguration after running 'sudo cloud-init clean --reboot'

Scott Moser (smoser) on 2018-04-30
Changed in cloud-init:
status: Incomplete → Confirmed
importance: Undecided → Medium
Chad Smith (chad.smith) wrote :

An upstream commit landed for this bug.

To view that commit see the following URL:
https://git.launchpad.net/cloud-init/commit/?id=11172924

Changed in cloud-init:
status: Confirmed → Fix Committed
Scott Moser (smoser) on 2018-05-01
Changed in cloud-init (Ubuntu Xenial):
status: New → In Progress
importance: Undecided → High
Changed in cloud-init (Ubuntu Artful):
status: New → Confirmed
importance: Undecided → High
Changed in cloud-init (Ubuntu Bionic):
status: New → Confirmed
importance: Undecided → Low
Scott Moser (smoser) on 2018-05-01
description: updated

Hello James, or anyone else affected,

Accepted cloud-init into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/cloud-init/18.2-4-g05926e48-0ubuntu1~16.04.2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-xenial to verification-done-xenial. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-xenial. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in cloud-init (Ubuntu Xenial):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-xenial
Changed in cloud-init (Ubuntu Artful):
status: Confirmed → Fix Committed
tags: added: verification-needed-artful
Chris Halse Rogers (raof) wrote :

Hello James, or anyone else affected,

Accepted cloud-init into artful-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/cloud-init/18.2-4-g05926e48-0ubuntu1~17.10.2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-artful to verification-done-artful. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-artful. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

James Tigert (jimmy-tigert) wrote :

Greetings,

As requested:

verification-done-artful

cloud-init version (18.2-4-g05926e48-0ubuntu1~16.04.1).

The patch seems to work! Thanks all!

-Jimmy Tigert

P.S. Let me know if you need further assistance.

Scott Moser (smoser) wrote :

marked verification-done-xenial based on comment 13 above.

Scott Moser (smoser) wrote :

Marking verification-done for artful.

IBM Cloud only provides Ubuntu at LTS versions, so there is no
artful image that we can launch. Thus, this test is a bit contrived
and the bug is quite unlikely to affect a user there.

To test anyway, we launch an Ubuntu 16.04, and manually upgrade it to
artful. Then, enable artful-proposed, install cloud-init and reboot.

From a fresh instance of 16.04 on IBM Cloud (launched with 'launch-softlayer')
from cloud-init's qa scripts and without any user-data.

Basically that looks like:
  apt-get update -qy && apt-get -qy dist-upgrade
  sed -i 's,xenial,artful,g' /etc/apt/sources.list
  apt-get -qy update
  apt-get dist-upgrade -qy
  apt-get -qy autoremove
  echo "deb http://archive.ubuntu.com/ubuntu artful-proposed main" |
     tee /etc/apt/sources.list.d/proposed.list
  apt-get update -q
  apt-get install cloud-init
  reboot
  # ssh back in and look around.

See attached file where I did all that.

tags: added: verification-done-artful
removed: verification-needed-artful
tags: added: verification-done verification-done-xenial
removed: verification-needed verification-needed-xenial
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package cloud-init - 18.2-4-g05926e48-0ubuntu1~17.10.2

---------------
cloud-init (18.2-4-g05926e48-0ubuntu1~17.10.2) artful-proposed; urgency=medium

  * cherry-pick 11172924: IBMCloud: Disable config-drive and nocloud
    only if IBMCloud (LP: #1766401)
  * cherry-pick 6ef92c98: IBMCloud: recognize provisioning environment
    during debug (LP: #1767166)

 -- Chad Smith <email address hidden> Tue, 01 May 2018 10:36:14 -0600

Changed in cloud-init (Ubuntu Artful):
status: Fix Committed → Fix Released

The verification of the Stable Release Update for cloud-init has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package cloud-init - 18.2-4-g05926e48-0ubuntu1~16.04.2

---------------
cloud-init (18.2-4-g05926e48-0ubuntu1~16.04.2) xenial-proposed; urgency=medium

  * cherry-pick 6ef92c98: IBMCloud: recognize provisioning environment
    during debug (LP: #1767166)
  * cherry-pick 11172924: IBMCloud: Disable config-drive and nocloud
    only if IBMCloud (LP: #1766401)

 -- Chad Smith <email address hidden> Mon, 30 Apr 2018 15:52:05 -0600

Changed in cloud-init (Ubuntu Xenial):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package cloud-init - 18.2-41-g3b712fce-0ubuntu1

---------------
cloud-init (18.2-41-g3b712fce-0ubuntu1) cosmic; urgency=medium

  * debian/new-upstream-snapshot: Remove script, now maintained elsewhere.
  * New upstream snapshot.
    - tests: do not rely on host /proc/cmdline in test_net.py
      [Lars Kellogg-Stedman] (LP: #1769952)
    - ds-identify: Remove dupe call to is_ds_enabled, improve debug message.
    - SmartOS: fix get_interfaces for nics that do not have addr_assign_type.
    - tests: fix package and ca_cert cloud_tests on bionic (LP: #1769985)
    - ds-identify: make shellcheck 0.4.6 happy with ds-identify.
    - pycodestyle: Fix deprecated string literals, move away from flake8.
    - azure: Add reported ready marker file. [Joshua Chan] (LP: #1765214)
    - tools: Support adding a release suffix through packages/bddeb.
    - FreeBSD: Invoke growfs on ufs filesystems such that it does not prompt.
      [Harm Weites] (LP: #1404745)
    - tools: Re-use the orig tarball in packages/bddeb if it is around.
    - netinfo: fix netdev_pformat when a nic does not have an address
      assigned. (LP: #1766302)
    - collect-logs: add -v flag, write to stderr, limit journal to single
      boot. (LP: #1766335)
    - IBMCloud: Disable config-drive and nocloud only if IBMCloud is enabled.
      (LP: #1766401)
    - Add reporting events and log_time around early source of blocking time

 -- Scott Moser <email address hidden> Fri, 11 May 2018 16:37:02 -0400

Changed in cloud-init (Ubuntu Cosmic):
status: Confirmed → Fix Released

This bug is believed to be fixed in cloud-init in version 18.3. If this is still a problem for you, please make a comment and set the state back to New

Thank you.

Changed in cloud-init:
status: Fix Committed → Fix Released
Scott Moser (smoser) on 2018-09-14
Changed in cloud-init (Ubuntu Bionic):
status: Confirmed → Fix Released