systemd-resolved and libvirt dnsmasq instance get into a busy loop when a query is issued for a URI or SRV record
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
dnsmasq (Ubuntu) | New | Undecided | Unassigned |
libvirt (Ubuntu) | New | Undecided | Unassigned |
systemd (Ubuntu) | Won't Fix | Undecided | Unassigned |
Bug Description
I have a zesty system that uses systemd-resolved, as per the default, and that also has dnsmasq configured on interface virbr0 for my libvirt bridge.
This system is also part of a Kerberos realm. Recent versions of Kerberos do a lookup of a URI RR, à la:
$ nslookup -q=URI _kerberos.dodds.net
Server: 127.0.0.53
Address: 127.0.0.53#53
Non-authoritative answer:
*** Can't find _kerberos.
Authoritative answers can be found from:
$
There is no URI DNS record published for this domain, so the lack of response is correct. However, systemd-resolved and dnsmasq then get into a busy loop, passing the same query back and forth between each other. (Confirmed with Wireshark.)
If I query SRV records under the same domain (which is also part of what Kerberos does), the positive results are correctly returned to the client, but systemd-resolved and dnsmasq again get into a busy loop.
If I query a URI record for a domain /other than/ what I have configured as my DNS search domain, there is no busy loop.
If I query other kinds of records (whether they return results or not), such as A, CNAME, and MX records, there is no busy loop.
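To compare record types side by side without relying on nslookup, one could build the raw queries directly. The following is a minimal sketch, not a tool used in this report: `build_query` and `send_query` are hypothetical helper names, the type codes are the IANA-assigned values (URI is 256, SRV is 33), and the default server address is the stub listener mentioned above.

```python
import socket
import struct

# Numeric RR type codes from the IANA DNS parameters registry.
QTYPES = {"A": 1, "CNAME": 5, "MX": 15, "SRV": 33, "URI": 256}


def encode_name(name):
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(name, qtype, query_id=0x1234):
    """Build a single-question DNS query with the RD (recursion desired) flag set."""
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    question = encode_name(name) + struct.pack("!HH", QTYPES[qtype], 1)  # class IN
    return header + question


def send_query(packet, server="127.0.0.53", port=53, timeout=2.0):
    """Send a query over UDP; return the raw reply, or None on timeout or error.

    Assumption: the server address defaults to the systemd-resolved stub.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(packet, (server, port))
            reply, _ = sock.recvfrom(4096)
            return reply
        except OSError:
            return None
```

Calling `send_query(build_query("_kerberos.dodds.net", t))` for each `t` in `QTYPES` while a packet capture runs on the loopback and virbr0 interfaces should make it easier to see which types trigger the back-and-forth traffic.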
/etc/resolv.conf looks like:
# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
# DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
# 127.0.0.53 is the systemd-resolved stub resolver.
# run "systemd-resolve --status" to see details about the actual nameservers.
domain dodds.net
search dodds.net
nameserver 127.0.0.53
nameserver 192.168.122.1
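As a quick check on a file like the one above, the nameserver entries can be listed and any entry other than the stub address flagged. This is only a sketch: the function names are made up for illustration, and the sample text is the uncommented part of the listing above.

```python
STUB = "127.0.0.53"  # systemd-resolved stub listener, per the file's own comment


def nameservers(resolv_conf_text):
    """Return the addresses on 'nameserver' lines, ignoring '#' comments."""
    servers = []
    for line in resolv_conf_text.splitlines():
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def bypassing_stub(servers):
    """Return the entries that are not the systemd-resolved stub address."""
    return [s for s in servers if s != STUB]


SAMPLE = """\
domain dodds.net
search dodds.net
nameserver 127.0.0.53
nameserver 192.168.122.1
"""

print(bypassing_stub(nameservers(SAMPLE)))  # prints ['192.168.122.1']
```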
systemd-resolve says:
$ systemd-resolve --status
Global
DNS Servers: 192.168.122.1
DNS Domain: dodds.net
DNSSEC NTA: 10.in-addr.arpa
[...]
Link 2 (wlan2)
Current Scopes: DNS LLMNR/IPv4 LLMNR/IPv6
LLMNR setting: yes
MulticastDNS setting: no
DNSSEC setting: no
DNSSEC supported: no
DNS Servers: 192.168.0.1
DNS Domain: dodds.net
$
I'm not sure where this bug lies. Both DNS servers are by design configured not to cache results, in order to avoid cache poisoning and information leaks, so neither server can detect that it has already asked for the record and does not need to recurse. I think it's probably a bug for dnsmasq to be configured as a server for resolving 'dodds.net': I have nowhere specified that this is appropriate, and it potentially conflicts with legitimate records in this domain. But it must also be a bug that SRV/URI records result in recursion while A/CNAME/MX records do not.
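The forwarding cycle described here can be illustrated with a toy model. This is not how systemd-resolved or dnsmasq are implemented; it only assumes two forwarders that keep no cache, each configured with the other as its upstream. The `Forwarder` class and the `max_hops` cutoff are invented for the illustration (the cutoff exists only so the example terminates), and the model does not explain why A/CNAME/MX queries avoid the loop.

```python
class Forwarder:
    """Toy non-caching forwarder: answers from local records, else asks upstream."""

    def __init__(self, name, records=None):
        self.name = name
        self.records = records or {}
        self.upstream = None

    def resolve(self, question, trace, max_hops):
        trace.append(self.name)  # record every server the query passes through
        if question in self.records:
            return self.records[question]
        if self.upstream is None or len(trace) >= max_hops:
            return None  # give up; a real busy loop has no such cutoff
        return self.upstream.resolve(question, trace, max_hops)


def trace_query(entry, question, max_hops=8):
    """Resolve a question starting at 'entry' and return (answer, hop trace)."""
    trace = []
    answer = entry.resolve(question, trace, max_hops)
    return answer, trace


resolved = Forwarder("systemd-resolved")
dnsmasq = Forwarder("dnsmasq")
resolved.upstream = dnsmasq
dnsmasq.upstream = resolved

answer, trace = trace_query(resolved, ("_kerberos.dodds.net", "URI"))
print(answer, " -> ".join(trace))
```

With neither side holding the record or remembering the question, the query alternates between the two servers until the artificial hop limit of 8 is reached and the answer is `None`.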
summary:
- systemd-resolved and libvirt dnsmasq instance get into a busy loop when a query is issued for a URI record
+ systemd-resolved and libvirt dnsmasq instance get into a busy loop when a query is issued for a URI or SRV record
Trying to understand why the dnsmasq is registered in /etc/resolv.conf at all, I find that it's listed in /etc/resolvconf/resolv.conf.d/tail. So 192.168.122.1 being listed as a global DNS server is a result of local configuration, which means this problem is at least partly self-inflicted.
If I remove this from /etc/resolvconf/resolv.conf.d/tail and restart systemd-resolved, I no longer see 192.168.122.1 listed at all in systemd-resolve --status. So there is no longer any DNS loop; OTOH, I also no longer get DNS resolution of the names of my VMs. While this works around the original symptom (which is still a bug somewhere, due to the correct handling of A/CNAME/MX but wrong handling of SRV/URI), there also needs to be a proper way to register libvirt's dnsmasq as an auxiliary DNS server for the VMs.