bionic LXD containers on bionic hosts get incorrect /etc/resolve.conf files

Bug #1764317 reported by John A Meinel on 2018-04-16
This bug affects 5 people
Affects     Importance   Assigned to
juju        High         Eric Claude Jones
juju 2.3    High         Eric Claude Jones

Bug Description

I just tried:
  juju bootstrap --bootstrap-series=bionic maas
  juju deploy cs:~jameinel/ubuntu-lite --series=bionic --to lxd:0 -m controller

After doing so, the container fails to start up because it "cannot resolve archive.ubuntu.com" (note I think we've seen this in CI runs as well).

However, the reason is that the host machine has this /etc/resolv.conf:

 nameserver 127.0.0.53
 search maas

Presumably this means we're running a local DNS stub that is itself configured to forward to MAAS for anything it cannot answer.

However, when launching a container, we read the host machine's DNS information if we don't have anything better. But 127.0.0.53 is clearly not going to be reachable from inside the container, since it only listens on the host's loopback interface.

It appears that 127.0.0.53 is the local stub listener created by systemd-resolved (judging from other bugs such as: https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1624320).
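That distinction can be checked mechanically; a minimal sketch (the helper name is mine, the addresses and paths are the bionic defaults seen in this report):

```shell
# Sketch (the function name is mine; 127.0.0.53 is the bionic stub
# address): decide whether a resolv.conf is usable from inside a
# container. The stub only answers on the host's own loopback interface,
# so copying it into a container leaves the container with no resolver.
is_stub_resolver() {
    grep -q '^nameserver[[:space:]]*127\.0\.0\.53' "$1"
}

# Usage on a host:
#   if is_stub_resolver /etc/resolv.conf; then
#       echo "hand containers /run/systemd/resolve/resolv.conf instead"
#   fi
```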

I checked the other "resolved.conf" configuration locations and found nothing useful:
/etc/systemd/resolved.conf (exists but doesn't contain anything interesting)
/etc/systemd/resolved.conf.d (doesn't exist)
/run/systemd/resolved.conf.d (doesn't exist)
/usr/lib/systemd/resolved.conf.d (doesn't exist)

I did find on the host system:
/run/systemd/resolve/resolv.conf which contains only:
 nameserver 10.0.0.1
 search maas

I injected that inside the container, and then ran
 systemctl restart systemd-resolved

And then I was able to get:
# systemctl status systemd-resolved
...
Apr 16 07:53:20 juju-930c9c-0-lxd-0 systemd-resolved[687]: Negative trust anchors: 10.in-addr.arpa 16.172.in-addr.arpa 17.172.in-addr.arpa 18.172.in-addr.arpa 19.172.in-addr.arpa 20.172.in-addr.arpa 21.172.in-add
Apr 16 07:53:20 juju-930c9c-0-lxd-0 systemd-resolved[687]: Using system hostname 'juju-930c9c-0-lxd-0'.
Apr 16 07:53:20 juju-930c9c-0-lxd-0 systemd[1]: Started Network Name Resolution.

However,
root@juju-930c9c-0-lxd-0:~# host archive.ubuntu.com
;; connection timed out; no servers could be reached

While this does work:
# host archive.ubuntu.com 10.0.0.1
Using domain server:
Name: 10.0.0.1
Address: 10.0.0.1#53
Aliases:

archive.ubuntu.com has address 91.189.88.152
archive.ubuntu.com has address 91.189.88.149
archive.ubuntu.com has address 91.189.88.161
archive.ubuntu.com has address 91.189.88.162
archive.ubuntu.com has IPv6 address 2001:67c:1560:8001::14
archive.ubuntu.com has IPv6 address 2001:67c:1360:8001::17
archive.ubuntu.com has IPv6 address 2001:67c:1360:8001::21
archive.ubuntu.com has IPv6 address 2001:67c:1560:8001::11

From what I can tell, systemd-resolved may read /etc/resolv.conf on startup if it is not already a symlink to /run/systemd/resolve/stub-resolv.conf, configure itself from it, and then replace it with the symlink.
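A quick way to see which of those two states /etc/resolv.conf is in, sketched as a small helper (the helper name is an assumption; the stub path is the bionic default):

```shell
# Sketch of a diagnostic helper (the name is mine): report whether a
# resolv.conf is the systemd-resolved managed symlink, which typically
# points at /run/systemd/resolve/stub-resolv.conf, or a plain file that
# resolved may import on startup and then replace.
check_resolv_conf() {
    f=$1
    if [ -L "$f" ]; then
        echo "symlink -> $(readlink "$f")"
    else
        echo "plain file (resolved may read it on startup)"
    fi
}

# Usage on a host:
#   check_resolv_conf /etc/resolv.conf
```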

Running on the host machine I see:
# systemd-resolve --status
Global
          DNSSEC NTA: 10.in-addr.arpa
                      16.172.in-addr.arpa
                      168.192.in-addr.arpa
                      17.172.in-addr.arpa
                      18.172.in-addr.arpa
                      19.172.in-addr.arpa
                      20.172.in-addr.arpa
                      21.172.in-addr.arpa
                      22.172.in-addr.arpa
                      23.172.in-addr.arpa
                      24.172.in-addr.arpa
                      25.172.in-addr.arpa
                      26.172.in-addr.arpa
                      27.172.in-addr.arpa
                      28.172.in-addr.arpa
                      29.172.in-addr.arpa
                      30.172.in-addr.arpa
                      31.172.in-addr.arpa
                      corp
                      d.f.ip6.arpa
                      home
                      internal
                      intranet
                      lan
                      local
                      private
                      test

Link 6 (vethXG5VHE)
      Current Scopes: none
       LLMNR setting: yes
MulticastDNS setting: no
      DNSSEC setting: no
    DNSSEC supported: no

Link 4 (br-enp0s25)
      Current Scopes: DNS
       LLMNR setting: yes
MulticastDNS setting: no
      DNSSEC setting: no
    DNSSEC supported: no
         DNS Servers: 10.0.0.1
          DNS Domain: maas

Link 3 (lxdbr0)
      Current Scopes: none
       LLMNR setting: yes
MulticastDNS setting: no
      DNSSEC setting: no
    DNSSEC supported: no

Link 2 (enp0s25)
      Current Scopes: none
       LLMNR setting: yes
MulticastDNS setting: no
      DNSSEC setting: no
    DNSSEC supported: no

Those seem to be set up by juju in the 'netplan' configuration at:
/etc/netplan/99-juju.yaml:
network:
  version: 2
  ethernets:
    enp0s25:
      match:
        macaddress: b8:ae:ed:79:c7:92
      set-name: enp0s25
      mtu: 1500
  bridges:
    br-enp0s25:
      interfaces: [enp0s25]
      addresses:
      - 10.0.0.156/24
      gateway4: 10.0.0.1
      nameservers:
        search: [maas]
        addresses: [10.0.0.1]
      mtu: 1500

However, inside the container we end up with:
network:
  version: 2
  ethernets:
    eth0:
      match:
        macaddress: 00:16:3e:4a:39:18
      addresses:
      - 10.0.0.26/24
      gateway4: 10.0.0.1
      nameservers:
        search: [maas]
        addresses: [127.0.0.53]

Editing /etc/netplan/99-juju.yaml inside the container and then running
 netplan generate
 netplan apply
does get the container pointing at the right DNS server.
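That manual edit can be sketched as a helper that swaps the first real upstream nameserver in for the stub address (the helper name is mine; the paths are the ones from this report, and it assumes a copy of the host's /run/systemd/resolve/resolv.conf is available in the container):

```shell
# Sketch of the substitution (function name is an assumption): take the
# first real upstream nameserver from a resolv.conf-style file and
# substitute it for the 127.0.0.53 stub address in a netplan config.
replace_stub_nameserver() {
    resolv=$1    # e.g. /run/systemd/resolve/resolv.conf
    netplan=$2   # e.g. /etc/netplan/99-juju.yaml
    real_ns=$(awk '/^nameserver/ { print $2; exit }' "$resolv")
    [ -n "$real_ns" ] && sed -i "s/127\.0\.0\.53/$real_ns/" "$netplan"
}

# Inside an affected container, one would then run:
#   replace_stub_nameserver /run/systemd/resolve/resolv.conf /etc/netplan/99-juju.yaml
#   netplan generate && netplan apply
```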

So we need to figure out how to obtain the correct DNS server for the host machine, so that we can put it into the container's 99-juju.yaml; at the moment we override that value with the contents of the host's /etc/resolv.conf, which is no longer the right value.

I'm guessing we'll also be doing the wrong thing for a KVM container.

Ryan Beisner (1chb1n) on 2018-04-19
tags: added: uosci
Ryan Beisner (1chb1n) wrote :

To confirm: this is impacting Bionic with OpenStack Queens. Some workloads are sensitive to DNS failures, as they expect basic A/PTR record resolution to work (nova-cloud-controller, rabbitmq, and I think even non-OpenStack workloads such as Hadoop and Kubernetes).

Frode Nordahl (fnordahl) wrote :

Here is a way to "work around" the issue in case you need to test Juju deployed containers on bionic: https://pastebin.ubuntu.com/p/c86jGhtwGp/

You would of course need to replace the search domain and resolver address according to your environment.

Changed in juju:
assignee: nobody → Eric Claude Jones (ecjones)
Jason Hobbs (jason-hobbs) wrote :

This impacts basically anything that deploys inside of bionic containers on bionic, because we can't resolve package archive hostnames, which almost everything uses.

tags: added: cdo-qa cdo-qa-blocker foundations-engine
Chris Gregan (cgregan) wrote :

As this is not isolated to ARM, I have escalated to Field High

Eric Claude Jones (ecjones) wrote :
Changed in juju:
status: Triaged → Fix Committed
KingJ (kj-kingj) wrote :

Although a fix has now been merged into the development branch, is there any way to apply the workaround on the current stable release other than SSHing into every container and modifying/applying the netplan configuration? I'm currently using Juju with MAAS to roll out an OpenStack cluster, and with the number of services that use LXD there are quite a few things to change.

Right now the least resource-intensive way I've found to apply the change is to run this for each app:

juju run "sed -i 's/127.0.0.53/10.1.10.1/' /etc/netplan/99-juju.yaml; netplan apply" --application=openstack-dashboard
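For a model with many applications, that per-app command can be looped; a sketch, assuming juju status JSON output with an "applications" map (the fix_all_apps name is mine, and 10.1.10.1 is the resolver address from KingJ's environment, so substitute your own):

```shell
# Hedged sketch: apply the same sed/netplan workaround to every
# application in the model in one pass. The function name is an
# assumption; replace 10.1.10.1 with your environment's resolver.
fix_all_apps() {
    apps=$(juju status --format=json |
        python3 -c 'import json, sys; print("\n".join(json.load(sys.stdin)["applications"]))')
    for app in $apps; do
        juju run --application="$app" \
            "sed -i 's/127.0.0.53/10.1.10.1/' /etc/netplan/99-juju.yaml; netplan apply"
    done
}
```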

There are several bionic-related issues that should be fixed and released in 2.3.8.

John
=:->


John A Meinel (jameinel) on 2018-05-08
Changed in juju:
milestone: none → 2.4-beta2
Changed in juju:
status: Fix Committed → Fix Released