Network config is incorrectly parsed when nameservers are specified

Bug #1843502 reported by Moustafa Moustafa
Affects            Status   Importance  Assigned to  Milestone
cloud-init         Invalid  Undecided   Unassigned
cloud-init (Suse)  Invalid  Undecided   Unassigned

Bug Description

The issue was reproduced on Azure with cloud-init 19.1 on a SLES12 SP4 machine. Judging from the code, the same behavior should be reproducible on any configuration where the cloud provider specifies nameservers in the network configuration.
The nameservers specified in the network configuration are ignored and cloud-init raises an error.
In network_state.py, the function _v2_common builds a name_cmd dictionary and passes it to handle_nameserver. handle_nameserver has a decorator that requires the passed-in dictionary to contain the key "address", but _v2_common builds a dictionary with the key "addresses" instead, so an error is raised.
Here is a snippet from cloud-init.log:

2019-09-09 16:21:29,479 - network_state.py[DEBUG]: v2(nameserver) -> v1(nameserver):
{'search': 'xkf00b0rtzgejkug4xc2pcinre.xx.internal.cloudapp.net', 'type': 'nameserver', 'addresses': '168.63.129.16'}
2019-09-09 16:21:29,479 - network_state.py[WARNING]: Skipping invalid command: {'nameservers': {'search': 'xkf00b0rtzgejkug4xc2pcinre.xx.internal.cloudapp.net', 'addresses': '168.63.129.16'}, 'eth0': {'set-name': 'eth0', 'match': {'macaddress': u'00:0d:3a:6d:ca:25'}, 'dhcp4': True}}
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/cloudinit/net/network_state.py", line 321, in parse_config_v2
    self._v2_common(command)
  File "/usr/lib/python2.7/site-packages/cloudinit/net/network_state.py", line 697, in _v2_common
    self.handle_nameserver(name_cmd)
  File "/usr/lib/python2.7/site-packages/cloudinit/net/network_state.py", line 118, in decorator
    required_keys))
InvalidCommand: Command missing set(['address']) of required keys ['address']
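
To make the failure mode in the traceback above concrete, here is a minimal, self-contained sketch of a required-keys decorator rejecting a dictionary that carries "addresses" instead of "address". It is not cloud-init's code: the names require_keys and the simplified handle_nameserver are hypothetical stand-ins, and only the dictionary keys and the IP address are taken from the log.

```python
import functools


class InvalidCommand(Exception):
    """Raised when a command dictionary lacks a required key."""


def require_keys(required_keys):
    """Hypothetical decorator: reject commands missing any required key."""
    def wrapper(func):
        @functools.wraps(func)
        def decorator(command, *args, **kwargs):
            missing = sorted(set(required_keys) - set(command))
            if missing:
                raise InvalidCommand(
                    "Command missing %s of required keys %s"
                    % (missing, required_keys))
            return func(command, *args, **kwargs)
        return decorator
    return wrapper


@require_keys(["address"])
def handle_nameserver(command):
    # Only reached when the singular "address" key is present.
    return command["address"]


# Same shape as the dictionary in the log: plural "addresses".
name_cmd = {"type": "nameserver", "addresses": "168.63.129.16"}

try:
    handle_nameserver(name_cmd)
    error = None
except InvalidCommand as exc:
    error = str(exc)

print(error)
```

With the singular key, for example {"type": "nameserver", "address": "168.63.129.16"}, the same call passes the check and returns the address.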

Moustafa Moustafa (momousta) wrote :
Ryan Harper (raharper) wrote :

Thanks for the bug and the logs. Looking at the network-config that was generated:

>>> print(yaml.dump(nc, default_flow_style=False, indent=4))
ethernets:
    eth0:
        dhcp4: true
        match:
            macaddress: 00:0d:3a:6d:ca:25
        set-name: eth0
    nameservers:
        addresses: 168.63.129.16
        search: xkf00b0rtzgejk

The bug is that nameservers needs to be indented *under* eth0.
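
As an illustration of the placement described above, the following sketch models the dumped config as a plain Python dictionary and moves the stray nameservers entry under eth0. The helper nest_nameservers is hypothetical and exists only for this example; the values are copied from the dump as shown.

```python
import json

# Layout from the dump above: "nameservers" sits next to eth0 under
# "ethernets", so it looks like an interface definition to a parser.
misplaced = {
    "ethernets": {
        "eth0": {
            "dhcp4": True,
            "match": {"macaddress": "00:0d:3a:6d:ca:25"},
            "set-name": "eth0",
        },
        "nameservers": {
            "addresses": "168.63.129.16",
            "search": "xkf00b0rtzgejk",
        },
    }
}


def nest_nameservers(config, iface):
    """Return a copy with a stray ethernets.nameservers moved under iface."""
    ethernets = {name: dict(cfg) for name, cfg in config["ethernets"].items()}
    stray = ethernets.pop("nameservers", None)
    if stray is not None:
        # Raises KeyError if iface is not defined under "ethernets".
        ethernets[iface]["nameservers"] = stray
    return {**config, "ethernets": ethernets}


fixed = nest_nameservers(misplaced, "eth0")
print(json.dumps(fixed, indent=4, sort_keys=True))
```

The helper works on a copy, so the original dictionary is left untouched and the two layouts can be compared side by side.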

However, cloud-init upstream does not parse or process nameservers[1] from Azure metadata, so I can't understand why you have this bug unless the cloud-init 19.1 on SLES has some downstream patches.

1. https://git.launchpad.net/cloud-init/tree/cloudinit/sources/DataSourceAzure.py#n1305

Changed in cloud-init:
status: New → Incomplete
Robert Schweikert (rjschwei) wrote :

The SUSE cloud-init-19.1 version does not have specific patches for the Azure source.

In a prior investigation I could not figure out where the nameserver information is supposed to come from. Not all data sources appear to provide it; the EC2 data source, for example, does not, yet on EC2 resolv.conf is populated correctly.

Ryan Harper (raharper) wrote :

Thanks Robert,

Moustafa, have you made any changes to the cloud-init package installed in your image?

The network_state parser works fine if you put nameservers in the correct location, under the interface name. Note that because the interface uses DHCP, we ignore any dns_nameservers provided in addition, since they could conflict with the DNS values supplied by the DHCP server.

https://paste.ubuntu.com/p/X8qGqMM8CY/
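
The DHCP rule described in this comment can be sketched as follows. This is only a model of the stated behavior, not cloud-init's implementation: the function name effective_static_dns is hypothetical, and checking only the dhcp4 key is an assumption made for the example.

```python
def effective_static_dns(iface_cfg):
    """Return the static DNS addresses to apply, or [] when DHCP is enabled.

    Hypothetical helper: models "static nameservers are ignored on a DHCP
    interface" using only the dhcp4 flag (an assumption for this sketch).
    """
    if iface_cfg.get("dhcp4"):
        return []
    addresses = iface_cfg.get("nameservers", {}).get("addresses", [])
    if isinstance(addresses, str):
        # Accept a single address given as a plain string.
        addresses = [addresses]
    return list(addresses)


dhcp_iface = {"dhcp4": True, "nameservers": {"addresses": "168.63.129.16"}}
static_iface = {"dhcp4": False, "nameservers": {"addresses": "168.63.129.16"}}

print(effective_static_dns(dhcp_iface))
print(effective_static_dns(static_iface))
```

Under this model, the eth0 entry from the report (dhcp4: true) yields no static nameservers even when they are nested correctly.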

Moustafa Moustafa (momousta) wrote :

I hit this issue while trying to put together a hackish fix for another issue, reported at bugs.launchpad.net/cloud-init/+bug/1843634.
I have made some changes to the installed package, as indicated on the other bug:
- I brought up eth0 by executing "ifup eth0" from the DataSourceAzure code, since the interface was down and the VM was unreachable via ssh (another hackish solution). I'm not sure why it is not brought up without my change.
- I backported some changes from a later version that were not included in 19.1. The changes are in Azure.py, in the find_endpoint method.

I tried to reverse engineer where the nameserver and the search domain could be defined, and based on my investigation I added them at the same level as eth0, which seems to be wrong in this case.

When I defined the nameserver and the address under "eth0", the error went away, but "/etc/resolv.conf" was still not populated with any nameserver or search domain. I attached the log files for that.

Based on your comment, I should not try to populate the nameserver or the search domain, since they will be overridden anyway. But when I do so, DNS is unreachable and "/etc/resolv.conf" has no nameservers defined.

On another note, the part I still don't understand is in network_state.py, where _v2_common builds a dictionary with the key "addresses" rather than "address" in [name_cmd.update({'addresses': dns})]. This dictionary is then passed to handle_nameserver, which requires the key "address"!

Ryan Harper (raharper) wrote :

OK, for now I'm going to mark this bug as invalid, and we can sort through the other bug you've filed.

Ryan Harper (raharper) wrote :

Issue is related to local changes, marking invalid.

Changed in cloud-init:
status: Incomplete → Invalid
Changed in cloud-init (Suse):
status: New → Invalid
James Falcon (falcojr) wrote :