systemd-resolved stub gives SERVFAIL for DNSSEC negative response

Bug #2062542 reported by Marco van Zwetselaar
This bug affects 2 people
Affects           Status  Importance  Assigned to  Milestone
systemd (Ubuntu)  New     Low         Unassigned

Bug Description

This issue surfaced while I was investigating why Postfix on my system (with DANE enabled) was deferring mail deliveries and logging hundreds of warnings like this one:

    Warning: DANE TLSA lookup problem: Host or domain name not found. Name service error for name=_25._tcp.cluster5.us.messagelabs.com type=TLSA: Host not found, try again

The DNS resolver on my machine was pointing at the systemd-resolved stub:

    $ cat /etc/resolv.conf | grep nameserver
    nameserver 127.0.0.53

    $ resolvectl status
    Global
        Protocols: -LLMNR -mDNS -DNSOverTLS DNSSEC=allow-downgrade/supported
        resolv.conf mode: stub

Note that DNSSEC is enabled (otherwise Postfix could not be doing DANE). If I now query the TLSA record for the messagelabs server, I get a SERVFAIL from the stub resolver:

    $ delv +dnssec _25._tcp.cluster5.us.messagelabs.com TLSA
    ;; resolution failed: SERVFAIL

If I query my upstream DNS or Google DNS instead, I get a DNSSEC-validated (negative) response:

    $ delv @8.8.8.8 +dnssec _25._tcp.cluster5.us.messagelabs.com TLSA
    ;; resolution failed: ncache nxrrset
    ; negative response, fully validated
    ; _25._tcp.cluster5.us.messagelabs.com. 299 IN \-TLSA ;-$NXRRSET
    ; _25._tcp.cluster5.us.messagelabs.com. RRSIG NSEC ...
    ; _25._tcp.cluster5.us.messagelabs.com. NSEC \000._25._tcp.cluster5.us.messagelabs.com. A PTR HINFO MX TXT RP AAAA SRV NAPTR SSHFP RRSIG NSEC SVCB HTTPS SPF IXFR AXFR CAA
    ; messagelabs.com. SOA ns-1714.awsdns-22.co.uk. awsdns-hostmaster.amazon.com. 1 7200 900 1209600 86400
    ; messagelabs.com. RRSIG SOA ...
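
The stub/upstream comparison above could be scripted roughly as follows. This is only a sketch: the function names `classify_delv` and `compare_resolvers` are made up for illustration, and the matched strings are simply the ones visible in the delv outputs quoted in this report, so other delv versions may word them differently.

```shell
#!/bin/sh
# classify_delv: read delv output on stdin and print a one-word verdict.
# Assumption: the phrases matched here are the ones seen in this report.
classify_delv() {
    out=$(cat)
    case $out in
        *"resolution failed: SERVFAIL"*)        echo servfail ;;
        *"negative response, fully validated"*) echo validated-negative ;;
        *"fully validated"*)                    echo validated-positive ;;
        *)                                      echo unknown ;;
    esac
}

# compare_resolvers [NAME] [UPSTREAM]: query the default resolver (the stub,
# if /etc/resolv.conf points at it) and an upstream server, and print both
# verdicts. Needs delv and network access. The defaults are the name and
# server used in this report.
compare_resolvers() {
    name=${1:-_25._tcp.cluster5.us.messagelabs.com}
    upstream=${2:-8.8.8.8}
    printf 'stub:     %s\n' "$(delv +dnssec "$name" TLSA 2>&1 | classify_delv)"
    printf 'upstream: %s\n' "$(delv "@$upstream" +dnssec "$name" TLSA 2>&1 | classify_delv)"
}
```

On an affected system, running `compare_resolvers` would be expected to show `servfail` for the stub and `validated-negative` for the upstream server.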

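I assume Postfix (with smtp_tls_security_level = dane, i.e. "Opportunistic DANE") deals with the validated negative response by falling back to a weaker TLS level (its documentation says "may" when no TLSA records are found, and "encrypt" when records exist but none are usable), whereas the SERVFAIL response makes it refuse to connect altogether.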

My workaround was to switch from the systemd-resolved stub resolver to the upstream servers. In /etc/systemd/resolved.conf set:

    DNS=... your upstream servers if not already given through DHCP ...
    DNSStubListener=no

Then restart the service and restart Postfix if it is chrooted (so the new /etc/resolv.conf gets copied into the chroot):

    systemctl restart systemd-resolved
    systemctl restart postfix
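
As an alternative to editing /etc/systemd/resolved.conf directly, the same two settings could go into a drop-in file under /etc/systemd/resolved.conf.d/. The sketch below is untested against this bug; the function names and the file name `no-stub.conf` are hypothetical, and the upstream servers must be supplied by the caller since they are site-specific.

```shell
#!/bin/sh
# uses_stub [FILE]: succeed if FILE (default /etc/resolv.conf) lists the
# systemd-resolved stub address 127.0.0.53 as a nameserver.
uses_stub() {
    grep -Eq '^nameserver[[:space:]]+127\.0\.0\.53([[:space:]]|$)' "${1:-/etc/resolv.conf}"
}

# write_resolved_dropin DIR SERVER...: write DIR/no-stub.conf (hypothetical
# file name) with the workaround settings from this report. DIR would
# normally be /etc/systemd/resolved.conf.d; SERVER... are your upstream
# DNS servers.
write_resolved_dropin() {
    dir=$1
    [ $# -ge 2 ] || { echo "usage: write_resolved_dropin DIR SERVER..." >&2; return 2; }
    shift
    mkdir -p "$dir" || return 1
    # "$*" joins the remaining arguments with spaces, as DNS= expects.
    printf '[Resolve]\nDNS=%s\nDNSStubListener=no\n' "$*" > "$dir/no-stub.conf"
}
```

After writing the drop-in as root and restarting systemd-resolved and Postfix as described above, `uses_stub && echo "still on the stub"` gives a quick check of whether /etc/resolv.conf still points at 127.0.0.53.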

I am not sure whether this could be considered a Postfix bug as well (it could treat a SERVFAIL on a TLSA lookup the same as a negative response), but it seems to me that the systemd-resolved stub resolver should not return SERVFAIL here.

For more background on this bug report, please see https://serverfault.com/a/1158198/299950

Tags: mantic
Nick Rosbrook (enr0n) wrote :

What version of Ubuntu is this?

Changed in systemd (Ubuntu):
status: New → Incomplete
importance: Undecided → Low
Marco van Zwetselaar (zwets) wrote :

This is on mantic, systemd-resolved 253.5-1ubuntu6.1

tags: added: mantic
description: updated
Databay (rs-databay) wrote :

I can confirm this problem also on Ubuntu Jammy, systemd-resolved from systemd 249.11-0ubuntu3.12.

I had mails queued to cluster5.eu.messagelabs.com:25 in my queues for hours.

Local stub-resolver failed with SERVFAIL:

    prod-mail-01:~$ delv +dnssec _25._tcp.cluster5.us.messagelabs.com TLSA
    ;; resolution failed: SERVFAIL

An internal unbound resolver or Google DNS worked:

    $ delv @10.1.1.4 +dnssec _25._tcp.cluster5.us.messagelabs.com TLSA
    ;; resolution failed: ncache nxrrset
    ; negative response, fully validated
    ; _25._tcp.cluster5.us.messagelabs.com. 900 IN \-TLSA ;-$NXRRSET
    ; _25._tcp.cluster5.us.messagelabs.com. RRSIG NSEC ...
    ; _25._tcp.cluster5.us.messagelabs.com. NSEC \000._25._tcp.cluster5.us.messagelabs.com. A PTR HINFO MX TXT RP AAAA SRV NAPTR DNAME SSHFP RRSIG NSEC SVCB HTTPS SPF IXFR AXFR CAA
    ; messagelabs.com. SOA ns-1714.awsdns-22.co.uk. awsdns-hostmaster.amazon.com. 1 7200 900 1209600 86400
    ; messagelabs.com. RRSIG SOA ...

Mails queued with error:

    May 30 09:22:17 vm-ewkf-prod-mail-01 postfix/smtp[3087917]: 7DE0041E79: to=<email address hidden>, relay=none, delay=63515, delays=63515/0.03/0.08/0, dsn=4.7.5, status=deferred (TLSA lookup error for cluster5.eu.messagelabs.com:25)
    May 30 10:07:17 vm-ewkf-prod-mail-01 postfix/smtp[3089367]: 8EE4C41DC6: to=<email address hidden>, relay=none, delay=67515, delays=67515/0.03/0.09/0, dsn=4.7.5, status=deferred (TLSA lookup error for cluster5.eu.messagelabs.com:25)
    May 30 10:12:18 vm-ewkf-prod-mail-01 postfix/smtp[3089603]: 4E46041E69: to=<email address hidden>, relay=none, delay=67632, delays=67632/0.04/0.09/0, dsn=4.7.5, status=deferred (TLSA lookup error for cluster5.eu.messagelabs.com:25)

After disabling stub-resolver everything went out:

    May 30 11:11:42 prod-mail-01 postfix/smtp[3092649]: 7DE0041E79: to=<email address hidden>, relay=cluster5.eu.messagelabs.com[195.245.231.72]:25, delay=70080, delays=70079/0.56/0.23/0.31, dsn=2.0.0, status=sent (250 ok 1717060302 qp 31363 server-5.tower-565.messagelabs.com!1717060301!18002!1)
    May 30 11:11:42 prod-mail-01 postfix/qmgr[3092578]: 7DE0041E79: removed
    May 30 11:11:42 prod-mail-01 postfix/smtp[3092651]: 4E46041E69: to=<email address hidden>, relay=cluster5.eu.messagelabs.com[85.158.142.214]:25, delay=71196, delays=71195/0.58/0.31/0.45, dsn=2.0.0, status=sent (250 ok 1717060302 qp 12390 server-3.tower-732.messagelabs.com!1717060301!14409!1)
    May 30 11:11:42 prod-mail-01 postfix/smtp[3092650]: 318D441E07: to=<email address hidden>, relay=cluster5.eu.messagelabs.com[85.158.142.210]:25, delay=70351, delays=70350/0.57/0.33/0.44, dsn=2.0.0, status=sent (250 ok 1717060302 qp 7378 server-5.tower-728.messagelabs.com!1717060301!22678!1)
    May 30 11:11:42 prod-mail-01 postfix/qmgr[3092578]: 4E46041E69: removed
    May 30 11:11:42 prod-mail-01 postfix/qmgr[3092578]: 318D441E07: removed

Databay (rs-databay) wrote :

This problem shows up when Postfix uses DANE, i.e. looks up TLSA records and relies on DNSSEC to validate them.
It will also affect any other service that verifies TLS endpoints by querying DNSSEC-validated TLSA records.
So this is not a Postfix-specific use case.
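
To check whether some other TLS service is affected, the TLSA owner name can be built from host, port and protocol using the `_port._proto.host` scheme from RFC 6698 (the same scheme seen in the `_25._tcp...` queries in this report). A small sketch, with the helper names `tlsa_name` and `check_tlsa` being made up for illustration:

```shell
#!/bin/sh
# tlsa_name HOST [PORT] [PROTO]: print the TLSA owner name for a service,
# i.e. _PORT._PROTO.HOST. PORT defaults to 25 and PROTO to tcp.
tlsa_name() {
    host=${1%.}     # drop a trailing dot, if present
    port=${2:-25}
    proto=${3:-tcp}
    printf '_%s._%s.%s\n' "$port" "$proto" "$host"
}

# check_tlsa HOST [PORT]: query the TLSA record through the default
# resolver, using the same delv invocation as in this report.
# Needs delv and network access.
check_tlsa() {
    delv +dnssec "$(tlsa_name "$1" "${2:-25}")" TLSA
}
```

For example, `check_tlsa cluster5.us.messagelabs.com` repeats the query from the bug description, and `check_tlsa HOST 443` would do the same for an HTTPS endpoint.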

Nick Rosbrook (enr0n)
Changed in systemd (Ubuntu):
status: Incomplete → New