[Cloud-init 18.5][CentOS 7 on vSphere] Crash when configuring static dual-stack (IPv4 + IPv6) networking

Bug #1850988 reported by Peter on 2019-11-01
This bug affects 1 person
Affects: cloud-init
Importance: Medium
Assigned to: Ryan Harper

Bug Description

Environment:
  - Stock CentOS 7 image template (ships with open-vm-tools) with cloud-init 18.5 installed
  - Single-NIC VM
  - vSphere 6.5 hypervisor

Repro steps:
  - Customize the VM with a vSphere customization spec whose NIC settings include static IPv4 and IPv6 information
  - open-vm-tools running inside the guest delegates guest customization to cloud-init
  - Cloud-init crashes with ValueError: Unknown subnet type 'static6' found for interface 'ens192'. See the following relevant excerpts and stack trace (found in /var/log/cloud-init.log):

[...snip...]
2019-11-01 02:23:41,899 - DataSourceOVF.py[DEBUG]: Found VMware Customization Config File at /var/run/vmware-imc/cust.cfg
2019-11-01 02:23:41,899 - config_file.py[INFO]: Parsing the config file /var/run/vmware-imc/cust.cfg.
2019-11-01 02:23:41,900 - config_file.py[DEBUG]: FOUND CATEGORY = 'NETWORK'
2019-11-01 02:23:41,900 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'NETWORK|NETWORKING' = 'yes'
2019-11-01 02:23:41,900 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'NETWORK|BOOTPROTO' = 'dhcp'
2019-11-01 02:23:41,900 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'NETWORK|HOSTNAME' = 'pr-centos-ci'
2019-11-01 02:23:41,900 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'NETWORK|DOMAINNAME' = 'gsslabs.local'
2019-11-01 02:23:41,900 - config_file.py[DEBUG]: FOUND CATEGORY = 'NIC-CONFIG'
2019-11-01 02:23:41,900 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'NIC-CONFIG|NICS' = 'NIC1'
2019-11-01 02:23:41,900 - config_file.py[DEBUG]: FOUND CATEGORY = 'NIC1'
2019-11-01 02:23:41,902 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'NIC1|MACADDR' = '00:50:56:89:b7:48'
2019-11-01 02:23:41,902 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'NIC1|ONBOOT' = 'yes'
2019-11-01 02:23:41,902 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'NIC1|IPv4_MODE' = 'BACKWARDS_COMPATIBLE'
2019-11-01 02:23:41,902 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'NIC1|BOOTPROTO' = 'static'
2019-11-01 02:23:41,902 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'NIC1|IPADDR' = '1.1.1.4'
2019-11-01 02:23:41,902 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'NIC1|NETMASK' = '255.255.255.0'
2019-11-01 02:23:41,902 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'NIC1|IPv6ADDR|1' = '2600::10'
2019-11-01 02:23:41,902 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'NIC1|IPv6NETMASK|1' = '64'
2019-11-01 02:23:41,903 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'NIC1|IPv6GATEWAY|1' = '2600::1'
2019-11-01 02:23:41,903 - config_file.py[DEBUG]: FOUND CATEGORY = 'DNS'
2019-11-01 02:23:41,903 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'DNS|DNSFROMDHCP' = 'no'
2019-11-01 02:23:41,904 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'DNS|SUFFIX|1' = 'sqa.local'
2019-11-01 02:23:41,904 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'DNS|NAMESERVER|1' = '192.168.0.10'
2019-11-01 02:23:41,904 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'DNS|NAMESERVER|2' = 'fc00:10:118:192:250:56ff:fe89:64a8'
2019-11-01 02:23:41,904 - config_file.py[DEBUG]: FOUND CATEGORY = 'DATETIME'
2019-11-01 02:23:41,904 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'DATETIME|TIMEZONE' = 'Asia/Kolkata'
2019-11-01 02:23:41,904 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'DATETIME|UTC' = 'no'
2019-11-01 02:23:41,904 - DataSourceOVF.py[DEBUG]: Preparing the Network configuration
2019-11-01 02:23:41,907 - util.py[DEBUG]: Running command ['ip', 'addr', 'show'] with allowed return codes [0] (shell=False, capture=True)
2019-11-01 02:23:41,926 - config_nic.py[INFO]: Configuring the interfaces file
2019-11-01 02:23:41,927 - config_nic.py[INFO]: Debian OS not detected. Skipping the configure step
2019-11-01 02:23:41,927 - util.py[DEBUG]: Recursively deleting /var/run/vmware-imc

[...snip...]

2019-11-01 02:23:43,225 - stages.py[INFO]: Applying network configuration from ds bringup=False: {'version': 1, 'config': [{'subnets': [{'control': 'auto', 'netmask': '255.255.255.0', 'type': 'static', 'address': '1.1.1.4'}, {'netmask': '64', 'type': 'static6', 'address': '2600::10'}], 'type': 'physical', 'name': u'ens192', 'mac_address': '00:50:56:89:b7:48'}, {'search': ['sqa.local'], 'type': 'nameserver', 'address': ['192.168.0.10', 'fc00:10:118:192:250:56ff:fe89:64a8']}]}
2019-11-01 02:23:43,226 - __init__.py[DEBUG]: Selected renderer 'sysconfig' from priority list: None
2019-11-01 02:23:43,244 - util.py[WARNING]: failed stage init-local
2019-11-01 02:23:43,249 - util.py[DEBUG]: failed stage init-local
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/cloudinit/cmd/main.py", line 652, in status_wrapper
    ret = functor(name, args)
  File "/usr/lib/python2.7/site-packages/cloudinit/cmd/main.py", line 362, in main_init
    init.apply_network_config(bring_up=bool(mode != sources.DSMODE_LOCAL))
  File "/usr/lib/python2.7/site-packages/cloudinit/stages.py", line 672, in apply_network_config
    return self.distro.apply_network_config(netcfg, bring_up=bring_up)
  File "/usr/lib/python2.7/site-packages/cloudinit/distros/__init__.py", line 178, in apply_network_config
    dev_names = self._write_network_config(netconfig)
  File "/usr/lib/python2.7/site-packages/cloudinit/distros/rhel.py", line 65, in _write_network_config
    return self._supported_write_network_config(netconfig)
  File "/usr/lib/python2.7/site-packages/cloudinit/distros/__init__.py", line 93, in _supported_write_network_config
    renderer.render_network_config(network_config)
  File "/usr/lib/python2.7/site-packages/cloudinit/net/renderer.py", line 56, in render_network_config
    templates=templates, target=target)
  File "/usr/lib/python2.7/site-packages/cloudinit/net/sysconfig.py", line 641, in render_network_state
    templates=templates).items():
  File "/usr/lib/python2.7/site-packages/cloudinit/net/sysconfig.py", line 614, in _render_sysconfig
    cls._render_physical_interfaces(network_state, iface_contents)
  File "/usr/lib/python2.7/site-packages/cloudinit/net/sysconfig.py", line 472, in _render_physical_interfaces
    cls._render_subnets(iface_cfg, iface_subnets)
  File "/usr/lib/python2.7/site-packages/cloudinit/net/sysconfig.py", line 345, in _render_subnets
    iface_cfg.name))
ValueError: Unknown subnet type 'static6' found for interface 'ens192'
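
The failure mode in the traceback above can be modelled in a few lines. The following is a simplified sketch, not cloud-init's actual sysconfig renderer: `render_subnets` and its `static_types` parameter are hypothetical names, and the ifcfg-style keys it emits are only illustrative. It assumes the renderer dispatches on the subnet `type` string and raises for any value it does not recognize, which is what the error message suggests. The subnets are the ones from the "Applying network configuration" log line.

```python
# Simplified model of a sysconfig-style subnet renderer. This is NOT
# cloud-init's code; render_subnets and static_types are hypothetical.

def render_subnets(iface_name, subnets, static_types=("static",)):
    """Return ifcfg-style keys for the given v1 network-config subnets."""
    cfg = {}
    for subnet in subnets:
        subnet_type = subnet.get("type")
        if subnet_type in ("dhcp", "dhcp4"):
            cfg["BOOTPROTO"] = "dhcp"
        elif subnet_type in static_types:
            cfg["BOOTPROTO"] = "none"
            if ":" in subnet["address"]:
                # Crude IPv6 detection, good enough for this illustration.
                cfg["IPV6INIT"] = "yes"
                cfg["IPV6ADDR"] = "%s/%s" % (subnet["address"], subnet["netmask"])
            else:
                cfg["IPADDR"] = subnet["address"]
                cfg["NETMASK"] = subnet["netmask"]
        else:
            # Any type missing from the branches above ends up here.
            raise ValueError(
                "Unknown subnet type '%s' found for interface '%s'"
                % (subnet_type, iface_name))
    return cfg


# Subnets taken from the network configuration logged by stages.py.
SUBNETS = [
    {"control": "auto", "netmask": "255.255.255.0",
     "type": "static", "address": "1.1.1.4"},
    {"netmask": "64", "type": "static6", "address": "2600::10"},
]

try:
    render_subnets("ens192", SUBNETS)
except ValueError as exc:
    print(exc)

# Treating 'static6' like 'static' lets the same input render.
print(render_subnets("ens192", SUBNETS, static_types=("static", "static6")))
```

In this model the IPv4 subnet renders fine and only the second subnet trips the `else` branch, which matches the observation that the crash appears only with a dual-stack static configuration.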

Related branches

Ryan Harper (raharper) on 2019-11-01
Changed in cloud-init:
importance: Undecided → Medium
status: New → Triaged
Peter (devkits) wrote :

Hi,

Thank you for triaging this bug.

It looks like the fix is pending, which is great.

Until the fix makes it in, are there any known, reasonable, and scalable workarounds for this? For example, can cloud-init be parameterized to run a script on boot (before networking is configured) and patch some files on the target system? Any other suggestions are welcome.

Thanks,
P.

Ryan Harper (raharper) wrote :

Hi Peter,

Unfortunately, bringing up networking is one of the first things cloud-init does, even before we run things like 'bootcmd' scripts.

I'm not familiar enough with VMware to know if there's a way to modify the image before it is booted or to upload a different template image. If you can do either, you could apply the patch in the linked branch.

Peter (devkits) wrote :

Thanks for the workaround suggestions. We ended up patching the cloud-init source in our templates.

Any thoughts on when the fix may end up in a future GA/stable version of cloud-init?

Thanks,
P.

On Tue, Nov 5, 2019 at 7:50 AM Peter <email address hidden> wrote:

> Thanks for the workaround suggestions. We ended up patching the cloud-
> init source in our templates.
>
> Any thoughts on when the fix may end up in a future GA/stable version of
> cloud-init?
>

In the next few weeks; this branch, or something like it, will land in master
and we'll start the Ubuntu Stable Release Update process.

I suggest, if you've not already done so, filing a downstream issue with
RHEL/CentOS against cloud-init there; the maintainers may cherry-pick
the fix and push an update into downstream releases.


admin (hardcore-01) wrote :

I get a different error, still on CentOS 7:

cloud-init: 2019-11-27 23:10:27,170 - __init__.py[WARNING]: Error persisting instance-data.json: 'utf8' codec can't decode byte 0xbb in position 1: invalid start byte
 cloud-init: 2019-11-27 23:10:27,281 - util.py[WARNING]: failed stage init
 cloud-init: failed run of stage init
cloud-init: ------------------------------------------------------------
cloud-init: Traceback (most recent call last):
cloud-init: File "/usr/lib/python2.7/site-packages/cloudinit/cmd/main.py", line 652, in status_wrapper
cloud-init: ret = functor(name, args)
cloud-init: File "/usr/lib/python2.7/site-packages/cloudinit/cmd/main.py", line 377, in main_init
cloud-init: init.update()
cloud-init: File "/usr/lib/python2.7/site-packages/cloudinit/stages.py", line 366, in update
cloud-init: self._store_vendordata()
cloud-init: File "/usr/lib/python2.7/site-packages/cloudinit/stages.py", line 401, in _store_vendordata
cloud-init: util.write_file(self._get_ipath('vendordata_raw'), raw_vd, 0o600)
cloud-init: File "/usr/lib/python2.7/site-packages/cloudinit/util.py", line 1863, in write_file
cloud-init: content = encode_text(content)
cloud-init: File "/usr/lib/python2.7/site-packages/cloudinit/util.py", line 157, in encode_text
cloud-init: return text.encode(encoding)
cloud-init: AttributeError: 'dict' object has no attribute 'encode'
cloud-init: ------------------------------------------------------------
systemd: cloud-init.service: main process exited, code=exited, status=1/FAILURE
systemd: Failed to start Initial cloud-init job (metadata service crawler).
systemd: Unit cloud-init.service entered failed state.
systemd: cloud-init.service failed.

admin (hardcore-01) wrote :

 cloud-init: Cloud-init v. 18.5 running 'init' at Wed, 27 Nov 2019 23:10:27 +0000. Up 15.73 seconds.

Ryan Harper (raharper) wrote :

I've pulled together the previous fix, applied it to the eni, sysconfig and netplan renderers, and added unit tests.

https://github.com/canonical/cloud-init/pull/77

Changed in cloud-init:
status: Triaged → In Progress
Ryan Harper (raharper) wrote :

@admin

The different error looks related to this one:

https://bugs.launchpad.net/cloud-init/+bug/1722992
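
For reference, the second traceback ends in `text.encode(encoding)` being called on a dict, so the vendor data handed to `write_file` was not a string. The sketch below reproduces just that step. It is an illustration under assumptions, not the upstream fix: `encode_text` only copies the shape of the call shown in the traceback, while `to_bytes` and the sample `raw_vd` value are hypothetical.

```python
import json


def encode_text(text, encoding="utf-8"):
    # Same shape as the call in the traceback: only works for str input.
    return text.encode(encoding)


def to_bytes(content, encoding="utf-8"):
    # Hypothetical guard (not the upstream fix): pass bytes through,
    # encode strings, and serialize anything else as JSON first.
    if isinstance(content, bytes):
        return content
    if isinstance(content, str):
        return content.encode(encoding)
    return json.dumps(content, sort_keys=True).encode(encoding)


raw_vd = {"key": "value"}  # hypothetical stand-in for dict-shaped vendor data

try:
    encode_text(raw_vd)
except AttributeError as exc:
    print(exc)

print(to_bytes(raw_vd))
```

Whether serializing a dict is the right behaviour for vendor data is a separate question; the point here is only that the write path assumed text and received a mapping.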

Chad Smith (chad.smith) wrote :

A fix for this bug was committed upstream in cloud-init at
https://github.com/canonical/cloud-init/commit/dacdd30080bd8183d1f1c1dc9dbcbc8448301529

Changed in cloud-init:
assignee: nobody → Ryan Harper (raharper)
status: In Progress → Fix Committed

This bug is believed to be fixed in cloud-init in version 20.1. If this is still a problem for you, please make a comment and set the state back to New.

Thank you.

Changed in cloud-init:
status: Fix Committed → Fix Released