systemd-resolved-dnssec breaks name resolution on lxd domain

Bug #2119652 reported by Nick Rosbrook
This bug affects 1 person
Affects                 Status        Importance  Assigned to    Milestone
lxd                     Fix Released  Unknown
bind9 (Ubuntu)          Invalid       Undecided   Unassigned
dnsmasq (Ubuntu)        Won't Fix     Undecided   Lukas Märdian
libvirt (Ubuntu)        Invalid       Undecided   Unassigned
livecd-rootfs (Ubuntu)  Invalid       Undecided   Unassigned
lxd (Ubuntu)            Won't Fix     Undecided   Unassigned
strongswan (Ubuntu)     Fix Released  Undecided   Lukas Märdian
systemd (Ubuntu)        Won't Fix     High        Nick Rosbrook

Bug Description

By default, LXD containers will be configured with DNS pointing to the server listening on lxdbr0 on the host. The DHCP leases additionally configure the 'lxd' domain. LXD starts a dnsmasq server which is DNSSEC compatible, but by default is not actually configured for DNSSEC. This leads to DNSSEC validation errors as seen below:

root@q1:~# apt policy systemd-resolved-dnssec
systemd-resolved-dnssec:
  Installed: 257.7-1ubuntu3
  Candidate: 257.7-1ubuntu3
  Version table:
 *** 257.7-1ubuntu3 100
        100 http://archive.ubuntu.com/ubuntu questing-proposed/main amd64 Packages
        100 /var/lib/dpkg/status
root@q1:~# resolvectl
Global
         Protocols: -LLMNR -mDNS -DNSOverTLS DNSSEC=allow-downgrade/supported
  resolv.conf mode: stub

Link 47 (eth0)
    Current Scopes: DNS
         Protocols: +DefaultRoute -LLMNR -mDNS -DNSOverTLS DNSSEC=allow-downgrade/supported
Current DNS Server: 10.148.181.1
       DNS Servers: 10.148.181.1 fd42:f983:5882:c87f::1 fe80::216:3eff:fed9:e3c1
        DNS Domain: lxd
     Default Route: yes
root@q1:~# ping q2.lxd
ping: q2.lxd: Temporary failure in name resolution
root@q1:~# nslookup q2
;; Got SERVFAIL reply from 127.0.0.53
Server: 127.0.0.53
Address: 127.0.0.53#53

** server can't find q2.lxd: SERVFAIL

root@q1:~# resolvectl dnssec eth0 no
root@q1:~# nslookup q2
Server: 127.0.0.53
Address: 127.0.0.53#53

Non-authoritative answer:
Name: q2.lxd
Address: 10.148.181.44
Name: q2.lxd
Address: fd42:f983:5882:c87f:216:3eff:fec5:c96c

root@q1:~# ping -c 1 q2.lxd
PING q2.lxd (fd42:f983:5882:c87f:216:3eff:fec5:c96c) 56 data bytes
64 bytes from q2.lxd (fd42:f983:5882:c87f:216:3eff:fec5:c96c): icmp_seq=1 ttl=64 time=0.205 ms

--- q2.lxd ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.205/0.205/0.205/0.000 ms

root@q1:~# journalctl -b -u systemd-resolved.service --grep "DNSSEC validation failed"
Aug 06 14:15:33 q1 systemd-resolved[1526]: [🡕] DNSSEC validation failed for question lxd IN DS: no-signature
Aug 06 14:15:33 q1 systemd-resolved[1526]: [🡕] DNSSEC validation failed for question q1.lxd IN DS: no-signature
Aug 06 14:15:33 q1 systemd-resolved[1526]: [🡕] DNSSEC validation failed for question q1.lxd IN A: no-signature
Aug 06 14:15:33 q1 systemd-resolved[1526]: [🡕] DNSSEC validation failed for question q1.lxd IN AAAA: no-signature
Aug 06 14:15:33 q1 systemd-resolved[1526]: [🡕] DNSSEC validation failed for question lxd IN DS: no-signature
Aug 06 14:15:33 q1 systemd-resolved[1526]: [🡕] DNSSEC validation failed for question q1.lxd IN DS: no-signature
Aug 06 14:15:33 q1 systemd-resolved[1526]: [🡕] DNSSEC validation failed for question q1.lxd IN AAAA: no-signature
Aug 06 14:15:33 q1 systemd-resolved[1526]: [🡕] DNSSEC validation failed for question q1.lxd IN A: no-signature
Aug 06 14:16:21 q1 systemd-resolved[1526]: [🡕] DNSSEC validation failed for question lxd IN DS: no-signature
Aug 06 14:16:21 q1 systemd-resolved[1526]: [🡕] DNSSEC validation failed for question q2.lxd IN DS: no-signature
Aug 06 14:16:21 q1 systemd-resolved[1526]: [🡕] DNSSEC validation failed for question q2.lxd IN A: no-signature
Aug 06 14:16:21 q1 systemd-resolved[1526]: [🡕] DNSSEC validation failed for question q2.lxd IN AAAA: no-signature
Aug 06 14:16:21 q1 systemd-resolved[1526]: [🡕] DNSSEC validation failed for question lxd IN DS: no-signature
Aug 06 14:16:21 q1 systemd-resolved[1526]: [🡕] DNSSEC validation failed for question q2.lxd IN DS: no-signature
Aug 06 14:16:21 q1 systemd-resolved[1526]: [🡕] DNSSEC validation failed for question q2.lxd IN A: no-signature
Aug 06 14:16:21 q1 systemd-resolved[1526]: [🡕] DNSSEC validation failed for question q2.lxd IN AAAA: no-signature
Aug 06 14:16:21 q1 systemd-resolved[1526]: [🡕] DNSSEC validation failed for question lxd IN DS: no-signature
Aug 06 14:16:21 q1 systemd-resolved[1526]: [🡕] DNSSEC validation failed for question lxd.lxd IN DS: no-signature
Aug 06 14:16:21 q1 systemd-resolved[1526]: [🡕] DNSSEC validation failed for question q2.lxd.lxd IN DS: no-signature
Aug 06 14:16:21 q1 systemd-resolved[1526]: [🡕] DNSSEC validation failed for question q2.lxd.lxd IN A: no-signature
Aug 06 14:16:21 q1 systemd-resolved[1526]: [🡕] DNSSEC validation failed for question q2.lxd.lxd IN AAAA: no-signature
Aug 06 14:16:21 q1 systemd-resolved[1526]: [🡕] DNSSEC validation failed for question lxd IN DS: no-signature
Aug 06 14:16:21 q1 systemd-resolved[1526]: [🡕] DNSSEC validation failed for question lxd.lxd IN DS: no-signature
Aug 06 14:16:21 q1 systemd-resolved[1526]: [🡕] DNSSEC validation failed for question q2.lxd.lxd IN DS: no-signature
Aug 06 14:16:21 q1 systemd-resolved[1526]: [🡕] DNSSEC validation failed for question q2.lxd.lxd IN A: no-signature
Aug 06 14:16:21 q1 systemd-resolved[1526]: [🡕] DNSSEC validation failed for question q2.lxd.lxd IN AAAA: no-signature
Aug 06 14:16:25 q1 systemd-resolved[1526]: [🡕] DNSSEC validation failed for question lxd IN DS: no-signature
Aug 06 14:16:25 q1 systemd-resolved[1526]: [🡕] DNSSEC validation failed for question q2.lxd IN DS: no-signature
Aug 06 14:16:25 q1 systemd-resolved[1526]: [🡕] DNSSEC validation failed for question q2.lxd IN A: no-signature

Again, since the dnsmasq server listening on lxdbr0 is DNSSEC *compatible*, the downgrade logic implied by DNSSEC=allow-downgrade does not kick in.
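
To see at a glance which names are affected, the journal output above can be condensed. The following is only a sketch and not part of the original report: the function name `summarize_dnssec_failures` is made up for illustration, and it assumes nothing beyond the log line format shown above.

```shell
# Hypothetical helper: condense "DNSSEC validation failed" journal lines
# (read from stdin) into one line per distinct question, with a count.
summarize_dnssec_failures() {
    sed -n 's/.*DNSSEC validation failed for question \(.*\): \([a-z-]*\)$/\1 (\2)/p' \
        | sort | uniq -c | sort -rn
}

# Usage (inside the affected container):
#   journalctl -b -u systemd-resolved.service --grep "DNSSEC validation failed" \
#       | summarize_dnssec_failures
```

Each output line is a count followed by the question and the failure reason, so repeated retries of the same lookup collapse into a single entry.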

Revision history for this message
Nick Rosbrook (enr0n) wrote :

This is also causing strongswan vs systemd/257.7-1ubuntu3 autopkgtest failures in the host-to-host test [1]:

[ ... ]
Loading creds in container sun
871s loaded certificate from '/etc/swanctl/x509/sunCert.pem'
871s loaded certificate from '/etc/swanctl/x509ca/strongswanCert.pem'
871s loaded ED25519 key from '/etc/swanctl/private/sunKey.pem'
871s Loading connections in container sun
871s loaded connection 'sun-moon'
871s successfully loaded 1 connections, 0 unloaded
871s Generating traffic from moon to sun
871s ping: sun.lxd: Temporary failure in name resolution
871s Something failed, gathering debug info
[ ... ]

[1] https://autopkgtest.ubuntu.com/results/autopkgtest-questing/questing/amd64/s/strongswan/20250802_043325_5ea6b@/log.gz

Lukas Märdian (slyon)
tags: added: server-todo
Changed in strongswan (Ubuntu):
assignee: nobody → Lukas Märdian (slyon)
Revision history for this message
Lukas Märdian (slyon) wrote :

I guess we should consider adding a custom drop-in config inside the strongswan host-to-host containers, disabling DNSSEC, e.g.:
"""
[Resolve]
DNSSEC=no
"""

This would unblock the systemd migration.

Longer term, we should consider improvements to the LXD dnsmasq configuration: either have it properly sign its authoritative domains using DNSSEC, or disable DNSSEC completely, so that the "allow-downgrade" fallback kicks in.
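
The drop-in suggested above could be installed along these lines. This is a sketch only: the file name `10-no-dnssec.conf` and the helper `install_no_dnssec_dropin` are illustrative assumptions, not what was eventually uploaded. `/etc/systemd/resolved.conf.d/` is the usual drop-in directory for resolved.conf.

```shell
# Hypothetical helper: write a resolved.conf drop-in that turns DNSSEC
# validation off. $1 is an optional root prefix (useful for image builds
# or for trying this out in a scratch directory).
install_no_dnssec_dropin() {
    root="${1:-}"
    dir="$root/etc/systemd/resolved.conf.d"
    mkdir -p "$dir" || return 1
    printf '[Resolve]\nDNSSEC=no\n' > "$dir/10-no-dnssec.conf"
}

# Usage inside a test container (requires root):
#   install_no_dnssec_dropin && systemctl restart systemd-resolved.service
```

Note that this disables validation for every domain in the container, not just "lxd", which is acceptable for a throwaway test container but probably too broad elsewhere.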

Nick Rosbrook (enr0n)
Changed in systemd (Ubuntu):
assignee: nobody → Nick Rosbrook (enr0n)
Revision history for this message
Lukas Märdian (slyon) wrote :
Changed in lxd:
status: Unknown → New
Revision history for this message
Nick Rosbrook (enr0n) wrote :

I have uploaded a workaround for strongswan's host-to-host test. This should allow systemd to migrate.

Changed in strongswan (Ubuntu):
status: New → Fix Committed
Revision history for this message
Nick Rosbrook (enr0n) wrote :

> Forwarded to upstream LXD: https://github.com/canonical/lxd/issues/16252

The conclusion here is that we want to make the 'lxd' domain a negative trust anchor on LXD images. We can do this by shipping /usr/lib/dnssec-trust-anchors.d/lxd.negative in LXD images, by adding a customization in livecd-rootfs.
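
A rough sketch of what such a customization might do is shown below. The helper name `add_negative_trust_anchor` and its root-prefix argument are assumptions made for illustration; this is not the actual livecd-rootfs hook.

```shell
# Hypothetical helper: mark a private domain as a negative trust anchor so
# that systemd-resolved skips DNSSEC validation for it.
# $1 = domain (e.g. lxd), $2 = optional root prefix of the image being built.
add_negative_trust_anchor() {
    domain="$1"
    root="${2:-}"
    dir="$root/usr/lib/dnssec-trust-anchors.d"
    mkdir -p "$dir" || return 1
    printf '%s\n' "$domain" > "$dir/$domain.negative"
}

# Usage during an image build, with $rootfs pointing at the unpacked image:
#   add_negative_trust_anchor lxd "$rootfs"
```

On an already running system, systemd-resolved would need to be restarted to pick up the new file. For a purely local change, dnssec-trust-anchors.d(5) also lists /etc/dnssec-trust-anchors.d/ as a search location.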

Revision history for this message
Lukas Märdian (slyon) wrote :

We need to check the exact DNS requests & response being exchanged between sd-resolved & dnsmasq, to better understand how systemd-resolved is validating that dnsmasq supports dnssec (even though the --dnssec flag isn't passed by LXD to it) and why that is causing it to expect .lxd to support DNSSEC.

The "allow-downgrade" mechanism should detect such instances and accept them without DNSSEC validation.
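
One way to look at the exchange without a packet capture is to query the server directly with `dig +dnssec` and summarise the reply. The sketch below only condenses what dig prints (response status, whether an EDNS OPT record came back, whether the DO bit is set in it, and how many RRSIG records were returned); it does not reproduce systemd-resolved's own detection logic, and the helper name `describe_dig_reply` is made up.

```shell
# Hypothetical helper: summarise a `dig +dnssec` reply read from stdin.
describe_dig_reply() {
    awk '
        # Header line carries the response code, e.g. "status: NOERROR".
        /^;; ->>HEADER<<-/ {
            if (match($0, /status: [A-Z]+/))
                status = substr($0, RSTART + 8, RLENGTH - 8)
        }
        # OPT pseudosection: present only if the server answered with EDNS.
        /^; EDNS:/ { edns = 1; if ($0 ~ /flags:[^;]* do/) do_bit = 1 }
        # Count RRSIG records in any section (record lines do not start with ";").
        /^[^;]/ && $4 == "RRSIG" { rrsig++ }
        END { printf "status=%s edns=%d do=%d rrsig=%d\n", status, edns, do_bit, rrsig }
    '
}

# Usage, with the server address from the bug description:
#   dig +dnssec q2.lxd @10.148.181.1 | describe_dig_reply
```

Comparing this summary for a .lxd name against one for a signed public name should show how the two replies differ from the client's point of view.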

Changed in dnsmasq (Ubuntu):
assignee: nobody → Lukas Märdian (slyon)
status: New → Triaged
Revision history for this message
Nick Rosbrook (enr0n) wrote :

> The "allow-downgrade" mechanism should detect such instances and accept them without DNSSEC validation.

I don't think that's the intent of DNSSEC=allow-downgrade.

IIUC, dnsmasq, regardless of the presence of --dnssec, understands how to respond to DNSSEC queries. When systemd-resolved asks for DNSSEC validation of foo.lxd, dnsmasq says "I can't validate that" by sending an empty response for the validation.

In particular, because the response from dnsmasq contains the DO flag, and an empty RRSIG, systemd-resolved concludes "this server understands DNSSEC, and the record is unsigned, therefore validation failed". At least, that's my basic understanding of the systemd-resolved logic [1].

If, on the other hand, dnsmasq responded with some garbage that indicated it doesn't even _understand_ DNSSEC, systemd-resolved would invoke the allow-downgrade fallback, and accept the response without validation.

[1] https://github.com/systemd/systemd/blob/v257.8/src/resolve/resolved-dns-server.c#L699

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package strongswan - 6.0.1-6ubuntu4

---------------
strongswan (6.0.1-6ubuntu4) questing; urgency=medium

  * d/t/host-to-host: configure negative trust anchor for lxd domain
    Do this instead of disabling DNSSEC per-interface (LP: #2119652)

 -- Nick Rosbrook <email address hidden> Thu, 21 Aug 2025 12:46:41 -0400

Changed in strongswan (Ubuntu):
status: Fix Committed → Fix Released
Revision history for this message
Lukas Märdian (slyon) wrote :

IIUC, specific local/private domains (zones) can be excluded from DNSSEC validation for the different tools.

So if your environment defines a private zone that is not available via the DNS root servers, it needs to be excluded locally:

On the client side (systemd-resolved), through a negative trust-anchor:

# cat /usr/lib/dnssec-trust-anchors.d/lxd.negative
lxd

On the server (resolver) side:

- dnsmasq:
server=/lxd/LXD_GATEWAY_IP # this disables DNSSEC for the "lxd" zone, unless a corresponding trust-anchor is specified

- bind9:
"""
options
{
   [...]
   validate-except
   {
       "lxd";
   };
};
"""

Changed in bind9 (Ubuntu):
status: New → Invalid
Revision history for this message
Lukas Märdian (slyon) wrote :

We could consider shipping a list of negative trust anchors as part of systemd-resolved-dnssec, but longer term it would be better to fix individual projects/environments (like LXD) directly to do the right thing for private, unsigned zones. That keeps DNSSEC exceptions close to their origin.

Revision history for this message
Nick Rosbrook (enr0n) wrote :

DNSSEC exceptions must be made by the client/resolver requesting the validation. We cannot configure LXD's DNS server such that DNSSEC-enabled clients like systemd-resolved will start accepting its unsigned records as "validated".

The dnsmasq and bind9 options you specify above, IIUC, refer to queries made *by* dnsmasq and bind9 themselves when requesting DNSSEC validation from upstream servers.

Revision history for this message
Lukas Märdian (slyon) wrote :

I've had a debugging session with @cpaelzer earlier today, where we ran into some "Temporary failure in name resolution" issues inside a KVM guest (for all - even global - domains) and "resolvectl dnssec enp1s0 false" made it work, which was scary. – But we fiddled quite a bit with the host networking, too.

Trying to reproduce this libvirt/KVM issue on a clean Noble host and Questing guest, everything worked as expected.

In libvirt/Qemu we don't have a private zone defined:
$ cat /etc/resolv.conf
[...]
nameserver 127.0.0.53
options edns0 trust-ad
search .

In Multipass, we have the "multipass" domain defined, causing issues:
$ cat /etc/resolv.conf
[...]
nameserver 127.0.0.53
options edns0 trust-ad
search multipass

=> This can be worked around in a similar way as the LXD workaround, using a negative trust-anchor (see comment #9):

$ cat /usr/lib/dnssec-trust-anchors.d/mp.negative
multipass

=> But I did not find a way to assign the Multipass project to this bug report on Launchpad. Will reach out to them individually.

In LXD we have the "lxd" domain defined, causing issues:
$ cat /etc/resolv.conf
[...]
nameserver 127.0.0.53
options edns0 trust-ad
search lxd

=> Interestingly, after installing systemd 257.7-1ubuntu3 (incl. systemd-resolved-dnssec) and rebooting the LXD container/VM (I tried both), name resolution was working initially, even on the .lxd local name.

In the journal log I found some interesting messages:

"""
Aug 26 10:23:32 q1 systemd-resolved[112]: Using degraded feature set UDP instead of UDP+EDNS0+DO for DNS server 10.238.94.1.
Aug 26 10:23:32 q1 systemd-resolved[112]: [🡕] Server 10.238.94.1 does not support DNSSEC, downgrading to non-DNSSEC mode.
Aug 26 10:50:59 q1 systemd-resolved[112]: Grace period over, resuming full feature set (UDP+EDNS0+DO) for DNS server 10.238.94.1.
Aug 26 10:50:59 q1 systemd-resolved[112]: Using degraded feature set UDP instead of UDP+EDNS0+DO for DNS server 10.238.94.1.
"""

systemd-resolved seems to correctly detect that the upstream dnsmasq server is not supporting DNSSEC and to enable the fallback (non-DNSSEC) mode. After a grace period (roughly 27 minutes in the log above), it resets to the full feature set and then downgrades again.

This whole dance goes on for a while but eventually stops working, especially after a "systemctl restart systemd-resolved". So something seems to be wonky in systemd-resolved's detection of the upstream DNS server feature set and logic for activation of the fallback mode.
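
To follow this downgrade/upgrade cycle over a longer period, the relevant journal lines can be filtered out. This is a small sketch that matches only the message texts quoted above; the helper name `feature_level_events` is an assumption.

```shell
# Hypothetical helper: keep only the systemd-resolved journal lines (from
# stdin) that record a feature-level downgrade or the end of a grace period.
feature_level_events() {
    grep -E 'Using degraded feature set|does not support DNSSEC|Grace period over'
}

# Usage:
#   journalctl -b -u systemd-resolved.service -o short-iso | feature_level_events
```

The timestamps of consecutive "degraded" and "Grace period over" lines then show how long each downgrade lasted.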

Similar issues of feature detection, especially in "local" DNS servers (e.g. home routers or virtualization environments), were described by upstream systemd some 5 years ago:
* https://<email address hidden>/message/AFHNUEHKC5KJVGBGSJBH2BMESUAGDF4H/
* https://<email address hidden>/message/P63RI3VBQ7NGL3AKMTR7PCVHVSCPYCLF/

It seems like this might still not be as reliable as we'd want it to be, and I'm pondering if we should downgrade that "Recommends: systemd-resolved-dnssec" to a "Suggests" after all...

Revision history for this message
Nick Rosbrook (enr0n) wrote :

> systemd-resolved seems to correctly detect that the upstream dnsmasq server is not supporting DNSSEC...

What makes you say it's "correct" in this case? Are you testing with a dnsmasq server that doesn't know about DNSSEC? As we have already discussed, "unsigned records" != "lacks DNSSEC support".

If you are able to trigger the downgrade reliably, please capture the debug-level logs from a single query in systemd-resolved. For example,

$ resolvectl log-level debug
$ resolvectl query <name>.lxd
$ journalctl -u systemd-resolved --since "5s ago"

or something.

> It seems like this might still not be as reliable as we'd want it to be, and I'm pondering if we should downgrade that "Recommends: systemd-resolved-dnssec" to a "Suggests" after all...

If we go this route, let's please just revert the change altogether. I don't think carrying the extra binary package is worth it for a "Suggests:".

Revision history for this message
Lukas Märdian (slyon) wrote :

> What makes you say it's "correct" in this case? Are you testing with a dnsmasq server that doesn't know about DNSSEC? As we have already discussed, "unsigned records" != "lacks DNSSEC support".

Right, sd-resolved seems to query for the DO flag (= DNSSEC OK) from the EDNS0 protocol extension and then falls back to legacy UDP, without DNSSEC support. I still need to dig deeper into how this detection works in detail.

Nick Rosbrook (enr0n)
tags: added: dcr-incoming
Changed in lxd:
status: New → Fix Released
Revision history for this message
Lukas Märdian (slyon) wrote :

FTR: here is a sd-resolved debug log of:

$ resolvectl log-level debug
$ resolvectl flush-caches
$ resolvectl query nn-abi.lxd # this is another LXD container, running on my host.

=> as we can see, it does not get a DS record (as expected, as the .lxd domain has no chain of trust):
Aug 28 14:23:58 tender-fowl systemd-resolved[123]: Found verdict for lookup lxd IN DS: bogus
Aug 28 14:23:58 tender-fowl systemd-resolved[123]: [🡕] DNSSEC validation failed for question lxd IN DS: no-signature

"""
Aug 28 14:23:54 tender-fowl systemd-resolved[123]: Flushed all caches.
Aug 28 14:23:54 tender-fowl systemd-resolved[123]: Sent message type=method_return sender=n/a destination=:1.17 path=n/a interface=n/a member=n/a cookie=24 reply_cookie=2 signature=n/a error-name=n/a error-message=n/a
Aug 28 14:23:58 tender-fowl systemd-resolved[123]: Got message type=method_call sender=:1.18 destination=org.freedesktop.resolve1 path=/org/freedesktop/resolve1 interface=org.freedesktop.resolve1.Manager member=ResolveHostname cookie=2 reply_cookie=0 signature=isit error-name=n/a error-message=n/a
Aug 28 14:23:58 tender-fowl systemd-resolved[123]: idn2_lookup_u8: nn-abi.lxd → nn-abi.lxd
Aug 28 14:23:58 tender-fowl systemd-resolved[123]: Sent message type=method_call sender=n/a destination=org.freedesktop.DBus path=/org/freedesktop/DBus interface=org.freedesktop.DBus member=GetConnectionCredentials cookie=25 reply_cookie=0 signature=s error-name=n/a error-message=n/a
Aug 28 14:23:58 tender-fowl systemd-resolved[123]: Got message type=method_return sender=org.freedesktop.DBus destination=:1.0 path=n/a interface=n/a member=n/a cookie=15 reply_cookie=25 signature=a{sv} error-name=n/a error-message=n/a
Aug 28 14:23:58 tender-fowl systemd-resolved[123]: D-Bus hostname resolution request from client PID 633 (resolvectl) with UID 0
Aug 28 14:23:58 tender-fowl systemd-resolved[123]: Looking up RR for nn-abi.lxd IN A.
Aug 28 14:23:58 tender-fowl systemd-resolved[123]: Looking up RR for nn-abi.lxd IN AAAA.
Aug 28 14:23:58 tender-fowl systemd-resolved[123]: Sent message type=method_call sender=n/a destination=org.freedesktop.DBus path=/org/freedesktop/DBus interface=org.freedesktop.DBus member=AddMatch cookie=26 reply_cookie=0 signature=s error-name=n/a error-message=n/a
Aug 28 14:23:58 tender-fowl systemd-resolved[123]: Sent message type=method_call sender=n/a destination=org.freedesktop.DBus path=/org/freedesktop/DBus interface=org.freedesktop.DBus member=GetNameOwner cookie=27 reply_cookie=0 signature=s error-name=n/a error-message=n/a
Aug 28 14:23:58 tender-fowl systemd-resolved[123]: Got message type=method_return sender=org.freedesktop.DBus destination=:1.0 path=n/a interface=n/a member=n/a cookie=17 reply_cookie=27 signature=s error-name=n/a error-message=n/a
Aug 28 14:23:58 tender-fowl systemd-resolved[123]: Cache miss for nn-abi.lxd IN A
Aug 28 14:23:58 tender-fowl systemd-resolved[123]: Firing regular transaction 64076 for <nn-abi.lxd IN A> scope dns on eth0/* (validate=yes).
Aug 28 14:23:58 tender-fowl systemd-resolved[123]: Using feature level UDP+EDNS0+DO for transaction 64076.
Aug 28 14:23:58 tender-fowl systemd-resolved[123]: Using DNS server 10.238.94.1 for tran...

Revision history for this message
Nick Rosbrook (enr0n) wrote :

Marking "won't fix" for systemd, since we are just reverting to DNSSEC=no by default.

Changed in systemd (Ubuntu):
status: Confirmed → Won't Fix
Changed in livecd-rootfs (Ubuntu):
status: New → Invalid
Lukas Märdian (slyon)
Changed in libvirt (Ubuntu):
status: New → Invalid
Changed in lxd (Ubuntu):
status: New → Won't Fix
Changed in dnsmasq (Ubuntu):
status: Triaged → Won't Fix