OVN DNS not working as documented

Bug #2059405 reported by Martin Ananda Boeker
This bug affects 2 people
Affects: neutron
Status: Confirmed
Importance: Medium
Assigned to: Unassigned

Bug Description

Env: 2023.1

As far as I can tell, I have configured OVN and DNS as documented.

In kolla.yml
kolla_enable_ovn: true

In kolla/globals.yml:
neutron_plugin_agent: ovn
neutron_enable_ovn_agent: true

It seems not to matter what I put in dns.yml, and the documentation agrees, because OVN is supposed to answer DNS itself by intercepting queries to port 53. The behavior, however, is very strange. I have only two instances, vm1 (172.30.89.175) and vm2 (172.30.89.177).

Here is the output of `ovn-sbctl list dns`:

_uuid : cdc31ab2-a363-4585-a835-c8019d4b265d
datapaths : [ca41c1b4-f4b1-4606-99e5-dc47a383accf]
external_ids : {dns_id="4c6895d8-fad3-4591-acc4-6a4ed8710d2b"}
records : {"175.89.30.172.in-addr.arpa"=vm1.aio.local, "177.89.30.172.in-addr.arpa"=vm2.aio.local, vm1="172.30.89.175", vm1.aio.local="172.30.89.175", vm2="172.30.89.177", vm2.aio.local="172.30.89.177"}

Here's the output of trying to communicate between VMs:

admin@vm1:~$ resolvectl
Global
       Protocols: -LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported
resolv.conf mode: stub

Link 2 (ens3)
    Current Scopes: DNS
         Protocols: +DefaultRoute +LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported
Current DNS Server: 172.30.89.76
       DNS Servers: 172.30.89.46 172.30.89.61 172.30.89.76
        DNS Domain: aio.local

admin@vm1:~$ ping vm2
ping: vm2: Temporary failure in name resolution

admin@vm1:~$ host vm2
Host vm2.aio.local not found: 5(REFUSED)

admin@vm1:~$ host vm2.aio.local
Host vm2.aio.local not found: 5(REFUSED)

admin@vm1:~$ host vm2 172.30.89.46
Using domain server:
Name: 172.30.89.46
Address: 172.30.89.46#53
Aliases:

vm2.aio.local has address 172.30.89.177
Host vm2.aio.local not found: 5(REFUSED)
Host vm2.aio.local not found: 5(REFUSED)

172.30.89.46, 172.30.89.61 and 172.30.89.76 are the controllers. During testing we went as far as disabling Designate, so they cannot answer. Yet when we manually specify a DNS server to query, even one that does not know the answer, OVN responds with the correct address (and then we get two additional REFUSED errors).

This is very strange behavior. Are we missing something here?

Tags: dns ovn
Martin Ananda Boeker (mboeker) wrote :

Because the controllers are not doing DNS, I removed them from the OVN config and dns.yml. In the test below, I'm querying the gateway, which of course also does not resolve DNS, but you can see OVN is providing the correct address. I rebuilt the VMs, so now vm2 has IP 172.30.89.175.

admin@vm1:~$ resolvectl
Global
       Protocols: -LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported
resolv.conf mode: stub

Link 2 (ens3)
Current Scopes: none
     Protocols: -DefaultRoute +LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported
    DNS Domain: aio.local

admin@vm1:~$ ping vm2
ping: vm1: Temporary failure in name resolution

admin@vm1:~$ host vm2
Host vm1.aio.local not found: 2(SERVFAIL)

admin@vm1:~$ host vm1 172.30.89.46
Using domain server:
Name: 172.30.89.46
Address: 172.30.89.46#53
Aliases:

vm1.aio.local has address 172.30.89.175
Host vm1.aio.local not found: 5(REFUSED)
Host vm1.aio.local not found: 5(REFUSED)

So once again, OVN has the answer, but it's not providing it until I try to query something outside, and even then I get the correct answer in addition to two failures.

Will Szumski (willjs) wrote :

OVN will snaffle the DNS queries before forwarding them on to the DNS servers configured in the VM. If OVN has an entry, it will respond without forwarding it. Are you sure this is not what is happening here? Have you configured broken DNS servers in the VM?
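
The behaviour described in this comment can be sketched as a simple lookup. This is only a toy model, not OVN's actual code: the record values are copied from the first `ovn-sbctl list dns` output in the bug description, while the function names and the name normalisation (lower-casing, dropping the trailing dot) are assumptions made for illustration.

```python
from typing import Dict, Optional

# Records copied from the first `ovn-sbctl list dns` output in this report.
RECORDS: Dict[str, str] = {
    "175.89.30.172.in-addr.arpa": "vm1.aio.local",
    "177.89.30.172.in-addr.arpa": "vm2.aio.local",
    "vm1": "172.30.89.175",
    "vm1.aio.local": "172.30.89.175",
    "vm2": "172.30.89.177",
    "vm2.aio.local": "172.30.89.177",
}


def normalize(qname: str) -> str:
    # Assumption: names are compared lower-cased and without a trailing dot.
    return qname.rstrip(".").lower()


def intercept(qname: str, records: Dict[str, str] = RECORDS) -> Optional[str]:
    """Return the locally known answer, or None if the query would be forwarded."""
    return records.get(normalize(qname))


if __name__ == "__main__":
    for name in ("vm2.aio.local.", "vm2", "example.org"):
        answer = intercept(name)
        if answer is None:
            print(name, "-> forwarded to the DNS server configured in the VM")
        else:
            print(name, "-> answered locally with", answer)
```

Under this model, a query for vm2 or vm2.aio.local should never reach the configured DNS servers, which is what makes the REFUSED replies reported above surprising.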

Martin Ananda Boeker (mboeker) wrote :

Hi Will,

So OVN is responding, but only when I specify an external server, regardless of what's in resolvectl. And even then, we get the correct response from OVN followed by error messages.

Here is the resolvectl output, with the controllers set as DNS servers. As noted above, I've also tried this without any DNS servers specified. Currently the controllers are running Designate, but of course no entries have been created there for vm1 or vm2:

Global
         Protocols: LLMNR=resolve -mDNS -DNSOverTLS DNSSEC=no/unsupported
  resolv.conf mode: stub

Link 2 (eth0)
    Current Scopes: DNS LLMNR/IPv4 LLMNR/IPv6
         Protocols: +DefaultRoute LLMNR=resolve -mDNS -DNSOverTLS DNSSEC=no/unsupported
Current DNS Server: 172.30.89.76
       DNS Servers: 172.30.89.46 172.30.89.61 172.30.89.76
        DNS Domain: aio.local

[admin@vm2 ~]$ host vm1
Host vm1 not found: 2(SERVFAIL)

[admin@vm2 ~]$ host vm1.aio.local
Host vm1.aio.local not found: 3(NXDOMAIN)

[admin@vm2 ~]$ host vm1 172.30.89.46
Using domain server:
Name: 172.30.89.46
Address: 172.30.89.46#53
Aliases:

vm1.aio.local has address 172.30.89.177
Host vm1.aio.local not found: 3(NXDOMAIN)
Host vm1.aio.local not found: 3(NXDOMAIN)

[admin@vm2 ~]$ host vm1 1.2.3.4
Using domain server:
Name: 1.2.3.4
Address: 1.2.3.4#53
Aliases:

vm1.aio.local has address 172.30.89.177
;; communications error to 1.2.3.4#53: timed out
;; communications error to 1.2.3.4#53: timed out
;; no servers could be reached

;; communications error to 1.2.3.4#53: timed out
;; communications error to 1.2.3.4#53: timed out
;; no servers could be reached

As you can see, if I specify no DNS server, the lookup of the short hostname simply fails. If I specify anything as the DNS server, even a junk address, OVN responds, but I also get errors.
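
One possible reading of "one answer followed by two errors": when no query type is given, `host` looks up A, AAAA and MX records, so each invocation with an explicit server sends three separate queries. A minimal sketch for sending those three types one at a time, using only the Python standard library, might look like the following. The function names are hypothetical, and the server address passed to `probe` is whatever you want to test against.

```python
import socket
import struct

QTYPES = {"A": 1, "AAAA": 28, "MX": 15}
RCODES = {0: "NOERROR", 2: "SERVFAIL", 3: "NXDOMAIN", 5: "REFUSED"}


def build_query(name: str, qtype: int, txid: int = 0x1234) -> bytes:
    """Build a minimal DNS query: 12-byte header plus one question."""
    # Flags 0x0100 = standard query with "recursion desired" set.
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)  # class IN


def parse_reply(reply: bytes) -> tuple:
    """Return (rcode name, answer count) taken from the DNS header."""
    _txid, flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", reply[:12])
    rcode = flags & 0x000F
    return RCODES.get(rcode, str(rcode)), ancount


def probe(server: str, name: str, timeout: float = 2.0) -> dict:
    """Send one UDP query per record type and collect the outcome of each."""
    results = {}
    for label, qtype in QTYPES.items():
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.settimeout(timeout)
            try:
                sock.sendto(build_query(name, qtype), (server, 53))
                reply, _ = sock.recvfrom(512)
                results[label] = parse_reply(reply)
            except socket.timeout:
                results[label] = ("TIMEOUT", 0)
            except OSError as exc:
                results[label] = ("ERROR: %s" % exc, 0)
    return results
```

Running `probe("172.30.89.46", "vm1.aio.local")` from inside a VM would show which of the three query types gets an answer and which come back with an error code or time out. It does not by itself explain why lookups through the stub resolver fail.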

Martin Ananda Boeker (mboeker) wrote :

Here is evidence that OVN is NOT actually catching the DNS traffic, even though it is reaching the DNS server (controller):

ON VM:
admin@vm1:~$ host vm2
Host vm2.aio.local not found: 5(REFUSED)

ON CONTROLLER, tcpdump -n port 53:

12:30:08.086208 IP 172.30.89.176.38733 > 172.30.89.61.53: 8954+ [1au] A? vm2.aio.local. (44)
12:30:08.086396 IP 172.30.89.61.53 > 172.30.89.176.38733: 8954 Refused- 0/0/1 (44)

The REFUSED response from the controller is expected, because there is no DNS entry in Designate for vm2. The question is why OVN did not reply, since the request clearly left the VM. Here again is the OVN config:

ubuntu@AIOTEST02:~$ ovn-sbctl list dns
_uuid : f18eeb3b-3319-4546-ad58-1549f8ed7f70
datapaths : [c36f655d-0364-45bf-a750-663ad676d607]
external_ids : {dns_id="db82ba60-c867-49eb-bb65-0de79745aafb"}
records : {"174.89.30.172.in-addr.arpa"=vm2.aio.local, "176.89.30.172.in-addr.arpa"=vm1.aio.local, vm1="172.30.89.176", vm1.aio.local="172.30.89.176", vm2="172.30.89.174", vm2.aio.local="172.30.89.174"}
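
In the tcpdump line above, `8954` is the query ID, `+` means recursion was requested, and `[1au]` means the query carried one additional record, which for a stub resolver is typically an EDNS0 OPT record. Whether that has anything to do with OVN not answering is not established in this thread. The sketch below only builds the same A query with and without an OPT record so that the two variants can be compared. All function names are hypothetical, and the advertised UDP payload size of 1232 is an arbitrary choice.

```python
import struct


def encode_name(name: str) -> bytes:
    """Encode a hostname as length-prefixed DNS labels."""
    labels = name.rstrip(".").split(".")
    return b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"


def build_a_query(name: str, txid: int, edns: bool = False) -> bytes:
    """Build an A query, optionally with an EDNS0 OPT record appended."""
    arcount = 1 if edns else 0
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, arcount)
    question = encode_name(name) + struct.pack("!HH", 1, 1)  # type A, class IN
    if not edns:
        return header + question
    # OPT pseudo-record (RFC 6891): root name, type 41, class = UDP payload
    # size, TTL = extended rcode/version/flags (all zero), empty RDATA.
    opt = b"\x00" + struct.pack("!HHIH", 41, 1232, 0, 0)
    return header + question + opt


def summarize(packet: bytes) -> dict:
    """Extract the header fields that tcpdump prints for a query."""
    txid, flags, qd, _an, _ns, ar = struct.unpack("!HHHHHH", packet[:12])
    return {"id": txid, "rd": bool(flags & 0x0100), "questions": qd, "additional": ar}


if __name__ == "__main__":
    for edns in (False, True):
        packet = build_a_query("vm2.aio.local", 8954, edns=edns)
        print("edns" if edns else "plain", len(packet), summarize(packet))
```

Sending both packets over UDP from the VM to one of the configured DNS servers, while watching tcpdump on the controller, would show whether the two variants are handled differently. Where `dig` is installed, its `+noedns` option allows a similar comparison.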

Will Szumski (willjs) wrote :

Unsure; it looks like you have the relevant configuration (DNS extension and DNS domain). I would suggest marking this as affecting neutron, as they will likely know more about the intricate details. I'd also include your OVN version, as I know this is older in the Ubuntu images than in Rocky.

Martin Ananda Boeker (mboeker) wrote :

Kayobe config seems correct, marking as Neutron.
OVN internal version is: [23.03.1-20.27.0-70.6]

affects: kayobe → neutron
Martin Ananda Boeker (mboeker) wrote :

I saw there is also a project for "networking-ovn", but I feel like OVN itself is behaving correctly. If the feedback is that it should be there instead, I will move it again.

Brian Haley (brian-haley) wrote :

I do see something similar running Neutron from the master branch with OVN 23.03.3. I'm not sure why it fails; someone will need to triage further.

tags: added: dns ovn
Changed in neutron:
status: New → Confirmed
Bernard Cafarelli (bcafarel) wrote :

As for the question in #7, moving to neutron is correct; networking-ovn was used when the OVN mechanism driver was still a separate project.

Changed in neutron:
importance: Undecided → Medium