Comment 6 for bug 1843634

Revision history for this message
Ryan Harper (raharper) wrote :

Thank you for the logs.

> Who is responsible for adding the name server and the search domain to the network_state ?

For Azure, there will not be any nameserver/search values added to network state as Azure's network-config is to DHCP only on eth0 and The DHCP _client_ from the OS runs will update /etc/resolv.conf

Looking at the log, it appears that Wickedd does run DHCP, but cloud-init.service started too early.

# here wicked starts
Sep 11 17:59:55.140087 localhost systemd[1]: Started System Logging Service.
Sep 11 17:59:55.796842 localhost systemd[1]: Started wicked DHCPv4 supplicant service.
Sep 11 17:59:55.805347 localhost systemd[1]: Started wicked DHCPv6 supplicant service.
Sep 11 17:59:55.812068 localhost systemd[1]: Started wicked AutoIPv4 supplicant service.
Sep 11 17:59:55.819849 localhost systemd[1]: Starting wicked network management service daemon...
Sep 11 17:59:55.933440 localhost systemd[1]: Started wicked network management service daemon.
Sep 11 17:59:55.940875 localhost systemd[1]: Starting wicked network nanny service...
Sep 11 17:59:56.022347 localhost systemd[1]: Started wicked network nanny service.

# then cloud-init-local.service
Sep 11 18:00:07.462643 localhost cloud-init[769]: Cloud-init v. 19.1 running 'init-local' at Wed, 11 Sep 2019 18:00:07 +0000. Up 37.63 seconds.

# then cloud-init-local runs dhclient to bring the network up to crawl Azure IMDS and write out
# network config file (/etc/sysconfig/network/ifcfg-eth0
Sep 11 18:00:09.304202 localhost dhclient[834]: Internet Systems Consortium DHCP Client 4.3.3
Sep 11 18:00:09.304631 localhost dhclient[834]: Copyright 2004-2015 Internet Systems Consortium.
Sep 11 18:00:09.304827 localhost dhclient[834]: All rights reserved.
Sep 11 18:00:09.305018 localhost dhclient[834]: For info, please visit https://www.isc.org/software/dhcp/
Sep 11 18:00:09.305262 localhost dhclient[834]:
Sep 11 18:00:09.415443 localhost dhclient[834]: Listening on LPF/eth0/00:0d:3a:6e:6f:8f
Sep 11 18:00:09.415458 localhost dhclient[834]: Sending on LPF/eth0/00:0d:3a:6e:6f:8f
Sep 11 18:00:09.415938 localhost kernel: NET: Registered protocol family 17
Sep 11 18:00:09.415496 localhost dhclient[834]: Sending on Socket/fallback
Sep 11 18:00:09.415533 localhost dhclient[834]: DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 4 (xid=0x1d3935b9)
Sep 11 18:00:09.431456 localhost dhclient[834]: DHCPOFFER from 168.63.129.16
Sep 11 18:00:14.432053 localhost dhclient[834]: DHCPREQUEST on eth0 to 255.255.255.255 port 67 (xid=0x1d3935b9)
Sep 11 18:00:14.449380 localhost dhclient[834]: DHCPACK from 168.63.129.16 (xid=0x1d3935b9)
Sep 11 18:00:14.460401 localhost dhclient[834]: bound to 10.0.0.17 -- renewal in 4294967295 seconds.

# Now cloud-init.service runs and *expects* networking to be up, but it isn't
Sep 11 18:00:15.621036 momousta-vm-945207600 cloud-init[861]: Cloud-init v. 19.1 running 'init' at Wed, 11 Sep 2019 18:00:15 +0000. Up 45.81 seconds.
Sep 11 18:00:15.636134 momousta-vm-945207600 cloud-init[861]: ci-info: +++++++++++++++++++++++++++Net device info++++++++++++++++++++++++++++
Sep 11 18:00:15.648761 momousta-vm-945207600 cloud-init[861]: ci-info: +--------+-------+-----------+-----------+-------+-------------------+
Sep 11 18:00:15.668090 momousta-vm-945207600 cloud-init[861]: ci-info: | Device | Up | Address | Mask | Scope | Hw-Address |
Sep 11 18:00:15.680368 momousta-vm-945207600 cloud-init[861]: ci-info: +--------+-------+-----------+-----------+-------+-------------------+
Sep 11 18:00:15.704126 momousta-vm-945207600 cloud-init[861]: ci-info: | eth0 | False | . | . | . | 00:0d:3a:6e:6f:8f |
Sep 11 18:00:15.720137 momousta-vm-945207600 cloud-init[861]: ci-info: | lo | True | 127.0.0.1 | 255.0.0.0 | host | . |
Sep 11 18:00:15.734575 momousta-vm-945207600 cloud-init[861]: ci-info: | lo | True | ::1/128 | . | host | . |
Sep 11 18:00:15.745009 momousta-vm-945207600 cloud-init[861]: ci-info: +--------+-------+-----------+-----------+-------+-------------------+
Sep 11 18:00:15.754527 momousta-vm-945207600 cloud-init[861]: ci-info: +++++++++++++++++++Route IPv6 info+++++++++++++++++++
Sep 11 18:00:15.765326 momousta-vm-945207600 cloud-init[861]: ci-info: +-------+-------------+---------+-----------+-------+
Sep 11 18:00:15.772241 momousta-vm-945207600 cloud-init[861]: ci-info: | Route | Destination | Gateway | Interface | Flags |
Sep 11 18:00:15.780449 momousta-vm-945207600 cloud-init[861]: ci-info: +-------+-------------+---------+-----------+-------+
Sep 11 18:00:15.787789 momousta-vm-945207600 cloud-init[861]: ci-info: +-------+-------------+---------+-----------+-------+
Sep 11 18:00:15.962107 momousta-vm-945207600 cloud-init[861]: 2019-09-11 18:00:15,947 - azure.py[ERROR]: Failed to read /var/lib/dhcp/dhclient.eth0.leases: [Errno 2] No such file or directory: '/var/lib/dhcp/dhclient.eth0.leases'

# and now we see that wickedd completes DHCP after cloud-init.service ran
ep 11 18:00:16.651842 momousta-vm-945207600 wickedd[811]: ni_process_reap: process 886 has not exited yet; now doing a blocking waitpid()
Sep 11 18:00:16.706739 momousta-vm-945207600 wickedd-dhcp4[770]: eth0: Request to acquire DHCPv4 lease with UUID 3036795d-66f7-0900-2b03-000002000000
Sep 11 18:00:16.707689 momousta-vm-945207600 wickedd-dhcp6[766]: eth0: Request to acquire DHCPv6 lease with UUID 3036795d-66f7-0900-2b03-000003000000 in mode auto
Sep 11 18:00:17.813712 momousta-vm-945207600 wickedd-dhcp4[770]: eth0: Committed DHCPv4 lease with address 10.0.0.17 (lease time 4294967295 sec, renew in 2147483647 sec, rebind in 536870911 sec)
Sep 11 18:00:18.759766 momousta-vm-945207600 dns-resolver[953]: ATTENTION: You have modified /etc/resolv.conf. Leaving it untouched...
Sep 11 18:00:18.819333 momousta-vm-945207600 dns-resolver[955]: You can find my version in /etc/resolv.conf.netconfig

So two things here:

1) cloud-init.service already has an After=wicked.service ... we need to confirm that is the correct service name that cloud-init should wait for

2) wicked doesn't like the modifications to resolv.conf that cloud-init makes
Sep 11 18:00:18.759766 momousta-vm-945207600 dns-resolver[953]: ATTENTION: You have modified /etc/resolv.conf. Leaving it untouched...
Sep 11 18:00:18.819333 momousta-vm-945207600 dns-resolver[955]: You can find my version in /etc/resolv.conf.netconfig

For (2) it appears the commit we have b74ebca563a21332b29482c8029e7908f60225a4 doesn't take into account a existing and non-empty template resolv.conf, as we load the existing content which then means we always write out the file (even with no nameservers or search paths to write).

def _render_dns(network_state, existing_dns_path=None):
        content = resolv_conf.ResolvConf("")
        if existing_dns_path and os.path.isfile(existing_dns_path):
            content = resolv_conf.ResolvConf(util.load_file(existing_dns_path))
        for nameserver in network_state.dns_nameservers:
            content.add_nameserver(nameserver)
        for searchdomain in network_state.dns_searchdomains:
            content.add_search_domain(searchdomain)
        if not str(content):
            return None
        header = _make_header(';')
        content_str = str(content)
        if not content_str.startswith(header):
            content_str = header + '\n' + content_str
        return content_str

I think we need to instead skip writing resolv.conf unless we have content to add

% git diff cloudinit/net/sysconfig.py
diff --git a/cloudinit/net/sysconfig.py b/cloudinit/net/sysconfig.py
index be5dede..12c6e36 100644
--- a/cloudinit/net/sysconfig.py
+++ b/cloudinit/net/sysconfig.py
@@ -585,7 +585,11 @@ class Renderer(renderer.Renderer):
             content.add_nameserver(nameserver)
         for searchdomain in network_state.dns_searchdomains:
             content.add_search_domain(searchdomain)
- if not str(content):
+ # content may be non-empty if an existing were present so also
+ # skip writing a resolv conf if we don't have any nameservers or
+ # search paths
+ if not any([len(network_state.dns_nameservers),
+ len(network_state.dns_searchdomains)]):
             return None
         header = _make_header(';')
         content_str = str(content)