bionic: static maas missing search domain in systemd-resolve configuration

Bug #1771885 reported by Andrew McLeod
This bug affects 3 people
Affects           Status         Importance   Assigned to          Milestone
Canonical Juju    Fix Released   High         Eric Claude Jones
  2.3             Fix Released   High         Eric Claude Jones
MAAS              Fix Released   Medium       Mike Pontillo
cloud-init        Won't Fix      Undecided    Unassigned

Bug Description

juju: 2.4-beta2
MAAS: 2.3.0

Testing deployment of LXD containers on bionic (specifically for an openstack deployment) led to this problem:

https://bugs.launchpad.net/charm-nova-cloud-controller/+bug/1765405

Summary:

Previously, the DNS config in the LXD containers was the same as on the host machines.

Now the DNS config lives in systemd-resolved. The DNS server is set correctly, but the search domain is missing, so short hostnames won't resolve.

Working resolv.conf on xenial lxd container:

nameserver 10.245.168.6
search maas

Non-working "systemd-resolve --status":

...
Link 21 (eth0)
      Current Scopes: DNS
       LLMNR setting: yes
MulticastDNS setting: no
      DNSSEC setting: no
    DNSSEC supported: no
         DNS Servers: 10.245.168.6

Working (now able to resolve hostnames after modifying netplan and adding search domain):

Link 21 (eth0)
      Current Scopes: DNS
       LLMNR setting: yes
MulticastDNS setting: no
      DNSSEC setting: no
    DNSSEC supported: no
         DNS Servers: 10.245.168.6
          DNS Domain: maas

ubuntu@juju-6406ff-2-lxd-2:/etc$ host node-name
node-name.maas has address 10.245.168.0
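
The difference between the two status listings above can be checked mechanically. The following is a minimal sketch, assuming the `systemd-resolve --status` layout quoted in this report (a `Link N (ifname)` header followed by `key: value` lines). The function name `links_missing_search_domain` is hypothetical and not part of any tool.

```python
import re


def links_missing_search_domain(status_text):
    """Return interface names that list DNS servers but no DNS Domain.

    Assumes the `systemd-resolve --status` layout quoted in this report.
    """
    links = {}      # interface name -> set of keys seen for that link
    current = None  # interface whose section is being read
    for line in status_text.splitlines():
        header = re.match(r"\s*Link \d+ \((\S+)\)", line)
        if header:
            current = header.group(1)
            links[current] = set()
            continue
        if current is not None and ":" in line:
            links[current].add(line.split(":", 1)[0].strip())
    return [name for name, keys in links.items()
            if "DNS Servers" in keys and "DNS Domain" not in keys]


# Trimmed copies of the two listings from the bug description.
BROKEN = """\
Link 21 (eth0)
      Current Scopes: DNS
         DNS Servers: 10.245.168.6
"""
WORKING = BROKEN + "          DNS Domain: maas\n"

print(links_missing_search_domain(BROKEN))   # ['eth0']
print(links_missing_search_domain(WORKING))  # []
```

In practice the text would come from running `systemd-resolve --status` inside the container; here it is passed in as a string so the check can be tried without a bionic system.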

Tags: network bionic

Revision history for this message
Frode Nordahl (fnordahl) wrote :

An interesting twist on this is that juju seems to do the right thing when the host system is xenial and the container is bionic (see below).

It may be that this is a generic issue at some level on Ubuntu after the move to systemd-resolve. Other interesting bugs I have found on the subject:
https://bugs.launchpad.net/ubuntu/+source/network-manager/+bug/1684854
https://github.com/systemd/systemd/issues/6572

Excerpt of a test showing this working for a juju-deployed bionic container on a xenial host system. All hosts are in the .maas domain, and pinging by hostname alone works. Repeating this test with bionic as the host system fails:

$ juju status
Model Controller Cloud/Region Version SLA
default maas maas 2.4-rc1 unsupported

App Version Status Scale Charm Store Rev OS Notes

Unit Workload Agent Machine Public address Ports Message

Machine State DNS Inst id Series AZ Message
0 started 172.16.122.251 qkm377 xenial default Deployed
0/lxd/0 started 172.16.122.253 juju-4d3dd7-0-lxd-0 xenial default Container started
0/lxd/1 started 172.16.122.252 juju-4d3dd7-0-lxd-1 bionic default Container started

Controller Timestamp
15 May 2018 15:23:46+02:00

$ juju ssh 0 'lsb_release -a &&ping -c 1 awake-yak'
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 16.04.4 LTS
Release: 16.04
Codename: xenial
PING awake-yak.maas (172.16.122.250) 56(84) bytes of data.
64 bytes from awake-yak.maas (172.16.122.250): icmp_seq=1 ttl=64 time=0.319 ms

--- awake-yak.maas ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.319/0.319/0.319/0.000 ms
Connection to 172.16.122.251 closed.

$ juju ssh 0/lxd/0 'lsb_release -a &&ping -c 1 awake-yak'
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 16.04.4 LTS
Release: 16.04
Codename: xenial
PING awake-yak.maas (172.16.122.250) 56(84) bytes of data.
64 bytes from awake-yak.maas (172.16.122.250): icmp_seq=1 ttl=64 time=0.205 ms

--- awake-yak.maas ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.205/0.205/0.205/0.000 ms
Connection to 172.16.122.253 closed.

$ juju ssh 0/lxd/1 'lsb_release -a &&ping -c 1 awake-yak'
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 18.04 LTS
Release: 18.04
Codename: bionic
PING awake-yak.maas (172.16.122.250) 56(84) bytes of data.
64 bytes from awake-yak.maas (172.16.122.250): icmp_seq=1 ttl=64 time=0.116 ms

--- awake-yak.maas ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.116/0.116/0.116/0.000 ms
Connection to 172.16.122.252 closed.

Revision history for this message
John A Meinel (jameinel) wrote :

I know Bionic changed how we find the nameserver. Is the issue that we aren't looking in the new place for the search domain, or is it that the information isn't there?

Changed in juju:
assignee: nobody → Eric Claude Jones (ecjones)
importance: Undecided → High
milestone: none → 2.4-beta3
status: New → Triaged
tags: added: bionic network
summary: - 2.4-beta2 - lxd containers missing search domain in systemd-resolve
+ bionic: lxd containers missing search domain in systemd-resolve
configuration
Revision history for this message
Eric Claude Jones (ecjones) wrote : Re: bionic: lxd containers missing search domain in systemd-resolve configuration

"When MAAS deploys a node it will configure its resolver accordingly; in the case where the above settings are made and you are deploying Linux:

    The servers' resolv.conf file will have the IP address for the MAAS Region controller
    The BIND installation on the MAAS Region controller will have a forwarders entry set up for the addresses you provide.

The effect is that queries on the node will be sent to MAAS, which will resolve directly for queries in domains which it manages (by default, *.maas), and forward requests for everything else to the forwarder." - (https://askubuntu.com/questions/820925/how-do-i-set-a-dns-server-in-maas-that-will-be-passed-on-to-the-nodes)

When a container is created, our MAAS provider code does not populate the container's DNS search domain by directly asking MAAS for that information. AFAIK, the provisioner code has a fallback heuristic that says "If I didn't get my DNS information from the provider then find it in the host's configuration."

On Xenial hosts, scraping the host for this information worked fine since MAAS populates resolv.conf with the needed information. On Bionic hosts, something else is happening.

In short, maybe we should be getting this information directly from MAAS instead of getting it from what MAAS told the host machine.

In our case the netplan.yaml gets populated with information from a MAAS device (among other things). It might be sufficient to populate the search domain with the domain of the MAAS device's FQDN, which by default should be "maas". Another solution could be to ask MAAS in a more direct fashion (e.g. GET /api/2.0/domains/).
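
The first suggestion, taking the search domain from the device's FQDN, could look roughly like the sketch below. This is an illustration only, not juju's code: `search_domain_from_fqdn` is a hypothetical helper, and the FQDN in the example is assumed from the container hostname and the default "maas" domain mentioned in this report.

```python
def search_domain_from_fqdn(fqdn):
    """Return everything after the first label, or None for a bare hostname."""
    _host, _sep, domain = fqdn.rstrip(".").partition(".")
    return domain or None


# Assumed FQDN: container hostname from this report plus the default domain.
print(search_domain_from_fqdn("juju-6406ff-2-lxd-2.maas"))  # maas
print(search_domain_from_fqdn("node-name"))                  # None
```

A bare hostname yields None, so a caller would still need another source (such as the MAAS domains API mentioned above) when the device name carries no domain.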

Revision history for this message
Eric Claude Jones (ecjones) wrote :
Revision history for this message
Eric Claude Jones (ecjones) wrote :
Revision history for this message
Richard Harding (rharding) wrote :
Changed in juju:
status: Triaged → Fix Committed
Revision history for this message
Alex Kavanagh (ajkavanagh) wrote :

Sadly, I've still got this bug; just tested with 2.4-beta3+develop-e33ec12 (which may not yet include this fix?).

The generated resolv.conf in the unit/container still has no search domain:

# This file is managed by man:systemd-resolved(8). Do not edit.
#
# This is a dynamic resolv.conf file for connecting local clients to the
# internal DNS stub resolver of systemd-resolved. This file lists all
# configured search domains.
#
# Run "systemd-resolve --status" to see details about the uplink DNS servers
# currently in use.
#
# Third party programs must not access this file directly, but only through the
# symlink at /etc/resolv.conf. To manage man:resolv.conf(5) in a different way,
# replace this symlink by a static file or a different symlink.
#
# See man:systemd-resolved.service(8) for details about the supported modes of
# operation for /etc/resolv.conf.

nameserver 127.0.0.53

Revision history for this message
John A Meinel (jameinel) wrote :

Can you include the contents of /etc/netplan/*.yaml?

A possibility is that we're still running into some sort of missing information and we did try to set the value, but we're overriding it:
        if !haveNameservers || !haveSearchDomains {
                logger.Warningf("incomplete DNS config found, discovering host's DNS config")
                dnsConfig, err := findDNSServerConfig()
                if err != nil {
                        return nil, errors.Trace(err)
                }

                // Since the result is sorted, the first entry is the primary NIC. Also,
                // results always contains at least one element.
                results[0].DNSServers = dnsConfig.Nameservers
                results[0].DNSSearchDomains = dnsConfig.SearchDomains
                logger.Debugf(
                        "setting DNS servers %+v and domains %+v on container interface %q",
                        results[0].DNSServers, results[0].DNSSearchDomains, results[0].InterfaceName,
                )
        }

If you try:
"juju debug-log --replay --include-module juju.provisioner"
"juju debug-log -m controller --replay --include-module juju.provisioner"

Do you see the line about 'incomplete DNS config found' ?

Can you also confirm what is in the host machine's /etc/resolv.conf? I would expect nameserver 127.0.0.53, but I'm not sure whether systemd-resolved includes the search path there or only in its own hidden resolver information.
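
To illustrate why scraping the host's resolv.conf can come up empty on bionic, here is a minimal sketch of a resolv.conf reader. It is not juju's `findDNSServerConfig`; `parse_resolv_conf` is a hypothetical stand-in, and the two inputs are trimmed from the files quoted in this report.

```python
def parse_resolv_conf(text):
    """Return (nameservers, search_domains) from resolv.conf-style text."""
    nameservers, search = [], []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;":
            continue  # blank line or comment
        fields = line.split()
        if fields[0] == "nameserver" and len(fields) > 1:
            nameservers.append(fields[1])
        elif fields[0] in ("search", "domain"):
            search = fields[1:]  # a later search/domain line replaces earlier ones
    return nameservers, search


XENIAL = "nameserver 10.245.168.6\nsearch maas\n"
BIONIC_STUB = (
    "# This file is managed by man:systemd-resolved(8). Do not edit.\n"
    "nameserver 127.0.0.53\n"
)

print(parse_resolv_conf(XENIAL))       # (['10.245.168.6'], ['maas'])
print(parse_resolv_conf(BIONIC_STUB))  # (['127.0.0.53'], [])
```

With the stub file, a reader of this kind finds only the local stub resolver address and no search domain, which matches the symptom described in this bug.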

Revision history for this message
John A Meinel (jameinel) wrote :

@Alex, can you check that "host foo" still doesn't resolve? It may be that we don't need the search path in /etc/resolv.conf because of the new systemd-resolved changes. Our testing showed that 'host nuc-1' from inside the container did indeed resolve.

Revision history for this message
Alex Kavanagh (ajkavanagh) wrote :

Sure, I'll run up the system and leave it running; it'll be on ruxton dells; I'll report back here when it is up.

Revision history for this message
Alex Kavanagh (ajkavanagh) wrote :

So here is the information that hopefully will be useful.

1. /etc/netplan/99-juju.yaml on the container:

network:
  version: 2
  ethernets:
    eth0:
      match:
        macaddress: 00:16:3e:d0:b4:4e
      addresses:
      - 10.245.168.48/21
      gateway4: 10.245.168.1
      nameservers:
        addresses: [10.245.168.6]

2. host machine's /etc/resolv.conf:

# This file is managed by man:systemd-resolved(8). Do not edit.
#
...
#
# See man:systemd-resolved.service(8) for details about the supported modes of
# operation for /etc/resolv.conf.

3. nameserver 127.0.0.53

4. juju debug-log --replay --include-module juju.provisioner
machine-1: 15:34:02 WARNING juju.provisioner incomplete DNS config found, discovering host's DNS config
machine-2: 15:34:30 WARNING juju.provisioner incomplete DNS config found, discovering host's DNS config
machine-2: 15:34:39 WARNING juju.provisioner incomplete DNS config found, discovering host's DNS config
machine-2: 15:34:44 WARNING juju.provisioner failed to start machine 2/lxd/2 (failed to ensure LXD image: Failed remote image download: UNIQUE constraint failed: images_aliases.name), retrying in 10s (10 more attempts)
machine-2: 15:34:57 WARNING juju.provisioner incomplete DNS config found, discovering host's DNS config
machine-0: 15:35:08 WARNING juju.provisioner incomplete DNS config found, discovering host's DNS config
machine-0: 15:35:16 WARNING juju.provisioner incomplete DNS config found, discovering host's DNS config
machine-0: 15:35:22 WARNING juju.provisioner failed to start machine 0/lxd/2 (failed to ensure LXD image: Failed remote image download: UNIQUE constraint failed: images_aliases.name), retrying in 10s (10 more attempts)
machine-0: 15:35:37 WARNING juju.provisioner incomplete DNS config found, discovering host's DNS config
machine-3: 15:35:52 WARNING juju.provisioner incomplete DNS config found, discovering host's DNS config
machine-0: 15:36:41 WARNING juju.provisioner incomplete DNS config found, discovering host's DNS config
machine-1: 15:39:26 WARNING juju.provisioner incomplete DNS config found, discovering host's DNS config
machine-3: 15:40:57 WARNING juju.provisioner incomplete DNS config found, discovering host's DNS config
machine-3: 15:41:22 WARNING juju.provisioner incomplete DNS config found, discovering host's DNS config
machine-2: 15:42:53 WARNING juju.provisioner incomplete DNS config found, discovering host's DNS config

So, yes, lots of "incomplete DNS config found" lines (I think one per container).

5. juju debug-log -m controller --replay --include-module juju.provisioner
machine-0: 15:12:49 INFO juju.provisioner provisioner-harvest-mode is set to destroyed; unknown instances not stopped []

6. juju --version
2.4-beta3-xenial-amd64 (actually from snap: 2.4-beta3+develop-c17354d)

7. The container and host do resolve; it's just that there is no search domain, so nova-cloud-controller fails. See https://bugs.launchpad.net/charm-nova-cloud-controller/+bug/1765405

Adding a search domain (maas) to the netplan config and applying it then allows nova-cloud-controller to work.

--

I've left the system up; if there are any more details you would like then please let me know.

Revision history for this message
John A Meinel (jameinel) wrote : Re: [Bug 1771885] Re: bionic: lxd containers missing search domain in systemd-resolve configuration

It sounds like we are failing to find the search domain and add it to the
configuration.

John
=:->


Revision history for this message
Eric Claude Jones (ecjones) wrote : Re: bionic: lxd containers missing search domain in systemd-resolve configuration

If at all helpful, it should be known that the patches above only apply to newly deployed containers. The patches above do not repair or correct existing containers.

David Ames (thedac)
summary: - bionic: lxd containers missing search domain in systemd-resolve
+ bionic: manual maas missing search domain in systemd-resolve
configuration
summary: - bionic: manual maas missing search domain in systemd-resolve
+ bionic: static maas missing search domain in systemd-resolve
configuration
Revision history for this message
David Ames (thedac) wrote :

After discussing this with roaksoax, it seems likely this is a cloud-init / netplan problem. I have added cloud-init and maas just to be thorough.

When bionic is deployed by MAAS 2.3.0 with a static network config, the DNS search domain is missing from the netplan configuration and/or systemd-resolve.

I am attaching three sets of data, bionic-maas, bionic-dhcp and xenial-maas, to show the differences.

cloud-init.log shows that cloud-init has the information. See 'search' below:

config= {
'config':
    [{'id': 'eno1', 'mac_address': 'd4:be:d9:a8:44:ff', 'mtu': 1500, 'name': 'eno1', 'subnets': [{'address': '10.245.168.26/21', 'dns_nameservers': ['10.245.168.6'], 'gateway': '10.245.168.1', 'type': 'static'}], 'type': 'physical'},
     {'id': 'eno2', 'mac_address': 'd4:be:d9:a8:45:01', 'mtu': 1500, 'name': 'eno2', 'subnets': [{'type': 'manual'}], 'type': 'physical'},
     {'id': 'eno3', 'mac_address': 'd4:be:d9:a8:45:03', 'mtu': 1500, 'name': 'eno3', 'subnets': [{'type': 'manual'}], 'type': 'physical'},
     {'id': 'eno4', 'mac_address': 'd4:be:d9:a8:45:05', 'mtu': 1500, 'name': 'eno4', 'subnets': [{'type': 'manual'}], 'type': 'physical'},
     {'address': ['10.245.168.6'], 'search': ['maas'], 'type': 'nameserver'}],
'version': 1}

But the /etc/netplan/50-cloud-init.yaml configuration is missing this information leading to resolution failures.

By contrast, the xenial configuration (/etc/network/interfaces.d/50-cloud-init.conf) has the correct information, even though the logging shows the same input from MAAS.

See also the bionic DHCP example, which gets the search domain information from DHCP.

Revision history for this message
David Ames (thedac) wrote :

Xenial-maas info

Revision history for this message
David Ames (thedac) wrote :

bionic DHCP

Revision history for this message
Andres Rodriguez (andreserl) wrote :

Hi David,

It seems that the issue would be a duplicate of this [1], although it seems it may have been fixed as part of the 18.2 release, as per [2].

David, can you please confirm your images are up to date and using the latest version of cloud-init. If they are, please re-open the bug report below.

[1]: https://bugs.launchpad.net/cloud-init/+bug/1750884
[2]: https://bugs.launchpad.net/cloud-init/+bug/1750884/comments/27

Changed in maas:
status: New → Incomplete
Revision history for this message
David Ames (thedac) wrote :

@Andres,

In the cloud-init logs I provided, it is version 18.2. I also refreshed the images just to be sure. Is there a way I can prove the images are up to date? Willing to do so.

Unlike the description in [1] we are not getting the search domain in either /etc/resolv.conf or in systemd-resolve --status. Please see the attached logs. Could this be a regression?

[1] https://bugs.launchpad.net/cloud-init/+bug/1750884

Revision history for this message
David Ames (thedac) wrote :

Just to be 100% clear: we do get the nameserver setting (see the attached logs), and resolution for FQDNs works. We do not get the search domain, and therefore hostname-only resolution does not work.

Revision history for this message
Andres Rodriguez (andreserl) wrote :

@David, could you please attach the output of 'maas <user> machine get-curtin-config <system-id>'?

Revision history for this message
David Ames (thedac) wrote :

@Andres

$ maas ruxton machine get-curtin-config acq33q

apt:
  preserve_sources_list: false
  primary:
  - arches:
    - default
    uri: http://archive.ubuntu.com/ubuntu
  proxy: http://10.245.168.6:8000/
  security:
  - arches:
    - default
    uri: http://archive.ubuntu.com/ubuntu
cloudconfig:
  maas-cloud-config:
    content: "#cloud-config\ndatasource:\n MAAS: {consumer_key: <REDACTED>,\
      \ metadata_url: 'http://10.245.168.6:5240/MAAS/metadata/',\n token_key: <REDACTED>,\
      \ token_secret: <REDACTED>}\n"
    path: /etc/cloud/cloud.cfg.d/90_maas_cloud_config.cfg
  maas-datasource:
    content: 'datasource_list: [ MAAS ]'
    path: /etc/cloud/cloud.cfg.d/90_maas_datasource.cfg
  maas-reporting:
    content: "#cloud-config\nreporting:\n maas: {consumer_key: <REDACTED>,\
      \ endpoint: 'http://10.245.168.6:5240/MAAS/metadata/status/acq33q',\n token_key:\
      \ <REDACTED>, token_secret: <REDACTED>,\n type:\
      \ webhook}\n"
    path: /etc/cloud/cloud.cfg.d/90_maas_cloud_init_reporting.cfg
  maas-ubuntu-sso:
    content: '#cloud-config

      snappy: {email: <email address hidden>}

      '
    path: /etc/cloud/cloud.cfg.d/90_maas_ubuntu_sso.cfg
debconf_selections:
  grub2: grub2 grub2/update_nvram boolean false
  maas: 'cloud-init cloud-init/datasources multiselect MAAS

    cloud-init cloud-init/maas-metadata-url string http://10.245.168.6:5240/MAAS/metadata/

    cloud-init cloud-init/maas-metadata-credentials string oauth_consumer_key=<REDACTED>&oauth_token_key=<REDACTED>&oauth_token_secret=<REDACTED>

    cloud-init cloud-init/local-cloud-config string apt:\n preserve_sources_list:
    false\n primary:\n - arches: [default]\n uri: http://archive.ubuntu.com/ubuntu\n proxy:
    http://10.245.168.6:8000/\n security:\n - arches: [default]\n uri: http://archive.ubuntu.com/ubuntu\napt_preserve_sources_list:
    true\napt_proxy: http://10.245.168.6:8000/\nmanage_etc_hosts: false\nmanual_cache_clean:
    true\nreporting:\n maas: {consumer_key: <REDACTED>, endpoint: ''http://10.245.168.6:5240/MAAS/metadata/status/acq33q'',\n token_key:
    <REDACTED>, token_secret: <REDACTED>,\n type:
    webhook}\nsystem_info:\n package_mirrors:\n - arches: [i386, amd64]\n failsafe:
    {primary: ''http://archive.ubuntu.com/ubuntu'', security: ''http://security.ubuntu.com/ubuntu''}\n search:\n primary:
    [''http://archive.ubuntu.com/ubuntu'']\n security: [''http://archive.ubuntu.com/ubuntu'']\n -
    arches: [default]\n failsafe: {primary: ''http://ports.ubuntu.com/ubuntu-ports'',
    security: ''http://ports.ubuntu.com/ubuntu-ports''}\n search:\n primary:
    [''http://ports.ubuntu.com/ubuntu-ports'']\n security: [''http://ports.ubuntu.com/ubuntu-ports'']\n

    '
early_commands:
  driver_00:
  - sh
  - -c
  - echo third party drivers not installed or necessary.
install:
  log_file: /tmp/install.log
  post_files:
  - /tmp/install.log
kernel:
  mapping: {}
  package: linux-signed-generic
late_commands:
  maas:
  - wget
  - --no-proxy
  - http://10.245.168.6:5240/MAAS/metadata/latest/by-id/acq33q/
  - --post-data
  - op=netboot_off
  - -O
  - /dev/null
network:
  conf...


Revision history for this message
David Ames (thedac) wrote :

Removing the DNS settings from the subnet in MAAS resolves the problem. Bionic then receives the correct search domain.

I could make the argument that having the DNS setting on the subnet should either allow you to set search domain or it should not exist. But from our point of view the bug is resolved.

Thanks for everyone's work on this.

Revision history for this message
David Britton (dpb) wrote :

Given the workaround available for maas & cloud-init this is working as expected. Thanks for the debugging everyone.

Changed in cloud-init:
status: New → Won't Fix
Changed in maas:
status: Incomplete → Invalid
Revision history for this message
Neiloy Mukerjee (neiloy) wrote :

I can confirm both that this bug exists and that the referenced workaround deals with the issue.

Context: nova-cloud-controller deployment was producing
hook failed: "cloud-compute-relation-changed" for nova-compute:cloud-compute

On the unit machine, the /etc/netplan/99-juju.yaml started as below:
network:
  version: 2
  ethernets:
    eth0:
      match:
        macaddress: 00:16:3e:9c:a1:5c
      addresses:
      - 10.246.114.24/21
      gateway4: 10.246.112.1
      nameservers:
        addresses: [10.246.112.3]

Adding a search domain under nameservers, as below:
network:
  version: 2
  ethernets:
    eth0:
      match:
        macaddress: 00:16:3e:9c:a1:5c
      addresses:
      - 10.246.114.24/21
      gateway4: 10.246.112.1
      nameservers:
        search: [maas]
        addresses: [10.246.112.3]

and then running a netplan apply allowed the deployment to continue as expected.

Changed in juju:
status: Fix Committed → Fix Released
Revision history for this message
Andres Rodriguez (andreserl) wrote :

I'm re-opening this task for MAAS, as a user has been able to reproduce this issue in a different context. While there's a workaround in comment #22, the situation is that even when the network configuration sent for xenial and bionic deployments is the same, the resulting configuration differs, due to how cloud-init handles netplan configuration.

More specifically, in Xenial, when MAAS sends DNS config for both "global" and per-interface/subnet configuration, the resulting machine config is an aggregation of both DNS configurations.

However, when deploying Bionic with exactly the same configuration, cloud-init interprets it differently: *only* the configuration of an interface is taken into consideration, while the global configuration is ignored.

As such, this *only* becomes an issue when the user overrides the DNS on a specific subnet, which results in the search domain from the global config not being considered.

As such, we will make an improvement in MAAS to ensure that the search domain is always included regardless.
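
The difference described above can be expressed as a toy model. This is only a sketch of the behaviour as described in this comment, not cloud-init's rendering code; `effective_dns` and its `aggregate` flag are hypothetical names, with `aggregate=True` standing in for the xenial result and `aggregate=False` for the bionic one.

```python
def merge_unique(*lists):
    """Concatenate lists, keeping the first occurrence of each item."""
    seen, out = set(), []
    for items in lists:
        for item in items:
            if item not in seen:
                seen.add(item)
                out.append(item)
    return out


def effective_dns(global_dns, subnet_dns, aggregate):
    """Model the DNS settings a machine ends up with.

    Both arguments are dicts with 'addresses' and 'search' lists.
    """
    if aggregate or not subnet_dns["addresses"]:
        # Global and per-subnet settings are combined (or only global exists).
        return {
            "addresses": merge_unique(subnet_dns["addresses"], global_dns["addresses"]),
            "search": merge_unique(subnet_dns["search"], global_dns["search"]),
        }
    # Per-subnet DNS is set and the global section is ignored.
    return {"addresses": list(subnet_dns["addresses"]),
            "search": list(subnet_dns["search"])}


GLOBAL = {"addresses": ["10.245.168.6"], "search": ["maas"]}
SUBNET = {"addresses": ["10.245.168.6"], "search": []}

print(effective_dns(GLOBAL, SUBNET, aggregate=True)["search"])   # ['maas']
print(effective_dns(GLOBAL, SUBNET, aggregate=False)["search"])  # []
```

In this model, clearing the subnet's DNS servers makes the non-aggregating case fall back to the global section, which is consistent with the workaround reported earlier in this thread.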

Changed in maas:
status: Invalid → New
milestone: none → 2.5.0
Changed in maas:
assignee: nobody → Mike Pontillo (mpontillo)
importance: Undecided → Medium
status: New → Triaged
Revision history for this message
Mike Pontillo (mpontillo) wrote :

I've proposed a change in MAAS that will replicate the DNS search path configuration on a per-interface basis in the v1 network preseed YAML passed to cloud-init (if there is a DNS server on the interface; i.e. defined on the subnet in MAAS). Hopefully that will smooth things over for those who encounter this in the future.
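
As a rough illustration of that idea (not the actual MAAS change), the sketch below copies the global search list onto every subnet that defines its own DNS servers in a v1 network config. `replicate_search_domains` is a hypothetical helper, the input is trimmed from the cloud-init log quoted earlier in this report, and the use of the `dns_search` subnet key is an assumption about how the replicated value would be expressed.

```python
import copy


def replicate_search_domains(v1_config):
    """Return a copy of a v1 network config with the global search path
    added to each subnet that defines dns_nameservers."""
    result = copy.deepcopy(v1_config)
    search = []
    for entry in result["config"]:
        if entry.get("type") == "nameserver":
            search.extend(entry.get("search", []))
    for entry in result["config"]:
        for subnet in entry.get("subnets", []):
            if subnet.get("dns_nameservers") and search:
                # Keep an existing per-subnet search list if one is present.
                subnet.setdefault("dns_search", list(search))
    return result


CONFIG = {
    "version": 1,
    "config": [
        {"id": "eno1", "type": "physical",
         "subnets": [{"type": "static", "address": "10.245.168.26/21",
                      "dns_nameservers": ["10.245.168.6"],
                      "gateway": "10.245.168.1"}]},
        {"id": "eno2", "type": "physical", "subnets": [{"type": "manual"}]},
        {"type": "nameserver", "address": ["10.245.168.6"], "search": ["maas"]},
    ],
}

fixed = replicate_search_domains(CONFIG)
print(fixed["config"][0]["subnets"][0]["dns_search"])  # ['maas']
```

Subnets without their own DNS servers, such as eno2 here, are left untouched, and the input config is not modified.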

Changed in maas:
status: Triaged → Fix Committed
Changed in maas:
milestone: 2.5.0 → 2.5.0alpha1
Changed in maas:
status: Fix Committed → Fix Released
no longer affects: maas/2.3
no longer affects: maas/2.4
Revision history for this message
James Falcon (falcojr) wrote :