walinuxagent blocks DNS fallback to TCP

Bug #1886128 reported by Darii Nurgaleev
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
systemd (Ubuntu)
Invalid
Undecided
Unassigned
walinuxagent (Ubuntu)
New
Undecided
Unassigned

Bug Description

[impact]

on azure instances, walinuxagent blocks all (new) TCP connections to the azure nameserver, which prevents fallback to TCP DNS for truncated dns queries

[test case]

on an azure instance:

ddstreet@lp1886128:~$ systemd-resolve --status | grep Servers
         DNS Servers: 168.63.129.16
ddstreet@lp1886128:~$ dig +retries=0 +timeout=1 +short +tcp @168.63.129.16 toomany100.ddstreet.org
;; connection timed out; no servers could be reached
;; Connection to 168.63.129.16#53(168.63.129.16) for toomany100.ddstreet.org failed: timed out.

change the actual nameserver ip in the 'dig' command to match what resolved is configured with (which comes from dhcp)

[regression potential]

TBD

[scope]

TBD

[original description]

Description: Ubuntu 18.04.4 LTS
Release: 18.04

systemd-resolve --version

systemd 237
+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP
+GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD -IDN2 +IDN
-PCRE2 default-hierarchy=hybrid

We met an error: on an attempt to resolve address, the following issue appears:

; <<>> DiG 9.11.3-1ubuntu1.11-Ubuntu <<>> mharder-formrec.cognitiveservices.azure.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 44096
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 65494
;; QUESTION SECTION:
;mharder-formrec.cognitiveservices.azure.com. IN A

;; Query time: 231 msec
;; SERVER: 127.0.0.53#53(127.0.0.53)
;; WHEN: Tue Apr 28 20:47:14 UTC 2020
;; MSG SIZE rcvd: 72

Let me provide you important notes about the issue:
1) It's not reproducing on Ubuntu 16;
2) Bypassing systemd-resolve - everything works fine;
3) Only the difference between systemd-resolve and END is UDP_PAYLOAD_SIZE

Successful query:

1135 16:27:25.964386 10.1.0.4 168.63.129.16 DNS 128 Standard query 0xc2d4 A mharder-formrec.cognitiveservices.azure.com OPT

Domain Name System (query)
    Transaction ID: 0xc2d4
    Flags: 0x0120 Standard query
        0... .... .... .... = Response: Message is a query
        .000 0... .... .... = Opcode: Standard query (0)
        .... ..0. .... .... = Truncated: Message is not truncated
        .... ...1 .... .... = Recursion desired: Do query recursively
        .... .... .0.. .... = Z: reserved (0)
        .... .... ..1. .... = AD bit: Set
        .... .... ...0 .... = Non-authenticated data: Unacceptable
    Questions: 1
    Answer RRs: 0
    Authority RRs: 0
    Additional RRs: 1
    Queries
        mharder-formrec.cognitiveservices.azure.com: type A, class IN
    Additional records
        <Root>: type OPT
            Name: <Root>
            Type: OPT (41)
            UDP payload size: 4096
            Higher bits in extended RCODE: 0x00
            EDNS0 version: 0
            Z: 0x0000
                0... .... .... .... = DO bit: Cannot handle DNSSEC security RRs
                .000 0000 0000 0000 = Reserved: 0x0000
            Data length: 12
            Option: COOKIE
Unsuccessful query:

1128 16:27:25.713886 10.1.0.4 168.63.129.16 DNS 116 Standard query 0x198d A mharder-formrec.cognitiveservices.azure.com OPT

Domain Name System (query)
    Transaction ID: 0x198d
    Flags: 0x0100 Standard query
        0... .... .... .... = Response: Message is a query
        .000 0... .... .... = Opcode: Standard query (0)
        .... ..0. .... .... = Truncated: Message is not truncated
        .... ...1 .... .... = Recursion desired: Do query recursively
        .... .... .0.. .... = Z: reserved (0)
        .... .... ...0 .... = Non-authenticated data: Unacceptable
    Questions: 1
    Answer RRs: 0
    Authority RRs: 0
    Additional RRs: 1
    Queries
        mharder-formrec.cognitiveservices.azure.com: type A, class IN
    Additional records
        <Root>: type OPT
            Name: <Root>
            Type: OPT (41)
            UDP payload size: 512
            Higher bits in extended RCODE: 0x00
            EDNS0 version: 0
            Z: 0x0000
                0... .... .... .... = DO bit: Cannot handle DNSSEC security RRs
                .000 0000 0000 0000 = Reserved: 0x0000
            Data length: 0
Notable difference:

Success:
            UDP payload size: 4096

Failure:
            UDP payload size: 512
And notable differences in the responses:

Success:
    Flags: 0x8180 Standard query response, No error
        .... ..0. .... .... = Truncated: Message is not truncated

Failure:
    Flags: 0x8380 Standard query response, No error
        .... ..1. .... .... = Truncated: Message is truncated

Interestingly, systemd-resolved is setting the maximum payload size to 512 regardless of whether EDNS0 is configured and regardless of what is sent to it for the payload size.
I tried to found a way to change UDP_PAYLOAD_SIZE,but it seems it is only possible to change it only with direct code modifications.

Revision history for this message
Dan Streetman (ddstreet) wrote :

Are you referring to edns0 from glibc to the local stub resolver, or edns0 from systemd-resolved to the upstream nameserver?

I don't see any problem when i resolve the name on bionic:

$ lsb_release -c
Codename: bionic

$ dpkg -l systemd|grep systemd
ii systemd 237-3ubuntu10.41 amd64 system and service manager

$ dig +short mharder-formrec.cognitiveservices.azure.com
westus2.api.cognitive.microsoft.com.
cognitiveusw2prod.trafficmanager.net.
cognitiveusw2prod.azure-api.net.
apimgmttmmtjxmdjuddplpewicwu8gnxxj7ehaj3ubplfwharv.trafficmanager.net.
cognitiveusw2prod-westus2-01.regional.azure-api.net.
apimgmthsn6metwepz5stnvukztxi3dks7nna13rgbo90ytolj.cloudapp.net.
52.229.16.14

Changed in systemd (Ubuntu):
status: New → Incomplete
Revision history for this message
Darii Nurgaleev (darii-nurgaleev) wrote :

Thank you for the answer. We are reffering to the local stub resolver( 127.0.0.53 ). As a workaround we have created symbolic link:

sudo ln -sf /run/systemd/resolve/resolv.conf /etc/resolv.conf

However, with the local stub resolver is still not working.

Revision history for this message
Dan Streetman (ddstreet) wrote :

> We are reffering to the local stub resolver

yes, i understand that, but i'm asking if you are talking about local traffic TO the stub resolver, or traffic FROM the stub resolver to your upstream nameserver.

If you have pcap showing the problem, please attach it.

If you're not sure what I'm talking about, then you should do:

$ sudo systemctl edit systemd-resolved

and in the file editor it opens, add this content:

[Service]
Environment=SYSTEMD_LOG_LEVEL=debug

then save the file and reboot.

then try looking up your host again (using the local stub resolver of course).

Gather the resolved logs and attach or paste them here.

$ journalctl -b -u systemd-resolved --no-pager

Revision history for this message
Darii Nurgaleev (darii-nurgaleev) wrote :

Thank you,

I have gathered required log as you mentioned:

Output of journalctl -b -u systemd-resolved --no-pager( please note: after the first read, link will disappear )
https://file.io/2LcfbtNf

Output of dig:

dig mharder-formrec.cognitiveservices.azure.com

; <<>> DiG 9.11.3-1ubuntu1.12-Ubuntu <<>> mharder-formrec.cognitiveservices.azure.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 16016
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 65494
;; QUESTION SECTION:
;mharder-formrec.cognitiveservices.azure.com. INA

;; Query time: 231 msec
;; SERVER: 127.0.0.53#53(127.0.0.53)
;; WHEN: Wed Jul 08 07:40:24 UTC 2020
;; MSG SIZE rcvd: 72

Revision history for this message
Dan Streetman (ddstreet) wrote :

> please note: after the first read, link will disappear

it's already gone (this is a public bug)

Revision history for this message
Darii Nurgaleev (darii-nurgaleev) wrote :

Added log as attachment

Revision history for this message
Dan Streetman (ddstreet) wrote :
Download full text (4.3 KiB)

well that's not a pcap, a pcap is a packet capture, e.g. from tcpdump.

Your log shows your response is truncated:
Jul 08 07:27:22 ubuntu18oras systemd-resolved[963]: Got DNS stub UDP query packet for id 2283
Jul 08 07:27:22 ubuntu18oras systemd-resolved[963]: Looking up RR for mharder-formrec.cognitiveservices.azure.com IN A.
Jul 08 07:27:22 ubuntu18oras systemd-resolved[963]: Cache miss for mharder-formrec.cognitiveservices.azure.com IN A
Jul 08 07:27:22 ubuntu18oras systemd-resolved[963]: Transaction 26533 for <mharder-formrec.cognitiveservices.azure.com IN A> scope dns on eth0/*.
Jul 08 07:27:22 ubuntu18oras systemd-resolved[963]: Using feature level UDP+EDNS0 for transaction 26533.
Jul 08 07:27:22 ubuntu18oras systemd-resolved[963]: Using DNS server 168.63.129.16 for transaction 26533.
Jul 08 07:27:22 ubuntu18oras systemd-resolved[963]: Sending query packet with id 26533.
Jul 08 07:27:22 ubuntu18oras systemd-resolved[963]: Processing query...
Jul 08 07:27:22 ubuntu18oras systemd-resolved[963]: Processing incoming packet on transaction 26533. (rcode=SUCCESS)
Jul 08 07:27:22 ubuntu18oras systemd-resolved[963]: Reply truncated, retrying via TCP.

resolved then retries using tcp, but your upstream nameserver doesn't respond:

Jul 08 07:27:23 ubuntu18oras systemd-resolved[963]: Timeout reached on transaction 26533.

you should make sure your upstream nameserver supports tcp and/or check your firewalling to make sure tcp can reach your upstream nameserver, and/or make sure your upstream nameserver supports larger udp packet sizes with edns0.

An actual packet capture would show exactly what is going on.

for reference, on my system (Ubuntu Bionic 18.04 container) edns0 works fine for that hostname without any truncation:

Jul 08 17:35:18 lp1886128-b systemd-resolved[1114]: Got DNS stub UDP query packet for id 18607
Jul 08 17:35:18 lp1886128-b systemd-resolved[1114]: Looking up RR for mharder-formrec.cognitiveservices.azure.com IN A.
Jul 08 17:35:18 lp1886128-b systemd-resolved[1114]: Switching to DNS server 10.202.51.1 for interface eth0.
Jul 08 17:35:18 lp1886128-b systemd-resolved[1114]: Cache miss for mharder-formrec.cognitiveservices.azure.com IN A
Jul 08 17:35:18 lp1886128-b systemd-resolved[1114]: Transaction 3905 for <mharder-formrec.cognitiveservices.azure.com IN A> scope dns on eth0/*.
Jul 08 17:35:18 lp1886128-b systemd-resolved[1114]: Using feature level UDP+EDNS0 for transaction 3905.
Jul 08 17:35:18 lp1886128-b systemd-resolved[1114]: Using DNS server 10.202.51.1 for transaction 3905.
Jul 08 17:35:18 lp1886128-b systemd-resolved[1114]: Sending query packet with id 3905.
Jul 08 17:35:18 lp1886128-b systemd-resolved[1114]: Processing query...
Jul 08 17:35:18 lp1886128-b systemd-resolved[1114]: Processing incoming packet on transaction 3905. (rcode=SUCCESS)
Jul 08 17:35:18 lp1886128-b systemd-resolved[1114]: Verified we get a response at feature level UDP+EDNS0 from DNS server 10.202.51.1.
Jul 08 17:35:18 lp1886128-b systemd-resolved[1114]: Added positive unauthenticated cache entry for mharder-formrec.cognitiveservices.azure.com IN CNAME 899s on */INET/10.202.51.1
Jul 08 17:35:18 lp1886128-b systemd-resolved[1114]: Added positive u...

Read more...

Revision history for this message
Darii Nurgaleev (darii-nurgaleev) wrote :

Thank you for the detailed explanation.

Let me clarify some things here:

1) In the initial reply, I provided two types of reponses:

- A successful one, that goes right through EDNS0 with UDP payload size 4096
- An unsuccessful one, that goes through the local stub resolver, but with udp payload size 512. I believe that successful example confirms that EDNS supports larger UDP payload size. Is it correct?

Could you please advise how to increase UDP payload size for the local stub resolver?

2) I have gathered data using tcpdump, I hope it sheds some light on this.

Revision history for this message
Dan Streetman (ddstreet) wrote :

> dns.pcap Edit (14.5 KiB, application/vnd.tcpdump.pcap)

that's tcpdump output, not actually a pcap; to get an actual pcap use the tcpdump -w parameter to write the packets to a file.

> Could you please advise how to increase UDP payload size for the local stub resolver?

resolved adjusts levels automatically. looking at your dns packet capture will be the easiest way to tell what the problem is.

Revision history for this message
Darii Nurgaleev (darii-nurgaleev) wrote :

Thank you for the explanation, I have gathered dns.pcap file with the required option.

Dan Streetman (ddstreet)
Changed in systemd (Ubuntu):
status: Incomplete → Confirmed
Revision history for this message
Dan Streetman (ddstreet) wrote :

Unfortunately, this is a bug in upstream systemd.

For some reason, resolved's current upstream code clamps the 'best' server protocol level at 512-byte-sized EDNS0 if DNSSEC is disabled. Since the default is for DNSSEC to be disabled, this means by default, resolved will restrict its udp edns0 packets sent to upstream nameservers at 512 bytes, as you indicated in the bug description.

normally, this is ok because most nameservers support TCP, and resolved always falls back to TCP when it finds a truncated response. However, your upstream nameserver seems to either not support TCP DNS, or its port 53 is firewalled for TCP.

This will take some time to open an upstream bug and investigate patching. If you open an upstream bug please add a comment with the link.

Changed in systemd (Ubuntu):
assignee: nobody → Dan Streetman (ddstreet)
importance: Undecided → Medium
status: Confirmed → In Progress
Revision history for this message
Chad Kimes (chkimes) wrote :

Azure DNS does support DNS over TCP. Looking through a separate packet capture I had taken during investigating this issue, I don't see any attempts by resolved to open a TCP connection to port 53 of the DNS server (in fact, I don't see any use of TCP port 53 at all).

Wireshark filters used:

udp.port == 53 or tcp.port == 53 (shows all DNS traffic)
tcp.port == 53 (shows nothing)

Revision history for this message
Dan Streetman (ddstreet) wrote :

I spun up an azure instance and tested, and indeed tcp port 53 appears completely missing from any tcpdump, but only for packets sent to the upstream nameserver. TCP sent to port 53 on *any* other ip address does make it out, but tcp port 53 to the nameserver does not. There are no routing rules or ip routes or iptables I could find that might be blocking the packets, so I'm not sure why the packets aren't being sent out, but that's the cause of the failure here to fallback to TCP DNS.

So 1) we do need to fix upstream systemd to use >512 byte edns0 udp with upstream nameservers, and also 2) something is wrong with the azure instances that is blocking TCP DNS to the upstream nameserver.

Revision history for this message
Dan Streetman (ddstreet) wrote :

To repro on azure instance:

ddstreet@lp1886128:~$ systemd-resolve --status | grep Servers
         DNS Servers: 168.63.129.16
ddstreet@lp1886128:~$ dig +retries=0 +timeout=1 +short +tcp @168.63.129.16 toomany100.ddstreet.org
;; connection timed out; no servers could be reached
;; Connection to 168.63.129.16#53(168.63.129.16) for toomany100.ddstreet.org failed: timed out.

tcp does work with external dns servers, e.g.:

ddstreet@lp1886128:~$ dig +retries=0 +timeout=1 +short +tcp @8.8.8.8 toomany100.ddstreet.org | head -5
10.254.201.0
10.254.201.1
10.254.201.18
10.254.201.19
10.254.201.2

note that the TCP packets to 168.63.129.16 do not make it out of the instance; they don't show up in the tcpdump at all, although TCP packets to 168.63.129.15 or 168.63.129.17 *do* show up. So there is something specific to that ip address that's causing TCP packets to get lost inside the instance.

Revision history for this message
Dan Streetman (ddstreet) wrote :

aha:

ddstreet@lp1886128:~$ sudo iptables -n -t security -L OUTPUT
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
ACCEPT tcp -- 0.0.0.0/0 168.63.129.16 owner UID match 0
DROP tcp -- 0.0.0.0/0 168.63.129.16 ctstate INVALID,NEW

it seems like this is being added by walinuxagent:

Jul 13 16:21:15 lp1886128 python3[1298]: 2020/07/13 16:21:15.672132 INFO ExtHandler Successfully added Azure fabric firewall rules
Jul 13 16:21:15 lp1886128 python3[1298]: 2020/07/13 16:21:15.683188 INFO ExtHandler Firewall rules:
Jul 13 16:21:15 lp1886128 python3[1298]: Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
Jul 13 16:21:15 lp1886128 python3[1298]: pkts bytes target prot opt in out source destination
Jul 13 16:21:15 lp1886128 python3[1298]: Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
Jul 13 16:21:15 lp1886128 python3[1298]: pkts bytes target prot opt in out source destination
Jul 13 16:21:15 lp1886128 python3[1298]: Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
Jul 13 16:21:15 lp1886128 python3[1298]: pkts bytes target prot opt in out source destination
Jul 13 16:21:15 lp1886128 python3[1298]: 0 0 ACCEPT tcp -- * * 0.0.0.0/0 168.63.129.16 owner UID match 0
Jul 13 16:21:15 lp1886128 python3[1298]: 0 0 DROP tcp -- * * 0.0.0.0/0 168.63.129.16 ctstate INVALID,NEW

summary: - systemd-resolved does not resolve address due to udp payload size.
+ walinuxagent blocks DNS fallback to TCP
Changed in systemd (Ubuntu):
status: In Progress → Invalid
Revision history for this message
Dan Streetman (ddstreet) wrote :

@darii-nurgaleev I marked this invalid for systemd as the specific problem in this case is blocking fallback to TCP DNS by walinuxagent, but if you want to open a separate bug to track updating systemd to use large edns0 packets even without DNSSEC, please feel free to.

description: updated
Revision history for this message
Dan Streetman (ddstreet) wrote :

also note that the iptable rules that walinuxagent adds does allow TCP traffic to the nameserver if the process is running as root, e.g.:

ddstreet@lp1886128:~$ dig +retries=0 +timeout=1 +short +tcp @168.63.129.16 toomany100.ddstreet.org
;; connection timed out; no servers could be reached
;; Connection to 168.63.129.16#53(168.63.129.16) for toomany100.ddstreet.org failed: timed out.
ddstreet@lp1886128:~$ sudo dig +retries=0 +timeout=1 +short +tcp @168.63.129.16 toomany100.ddstreet.org | head -5
10.254.201.0
10.254.201.1
10.254.201.18
10.254.201.19
10.254.201.2

however systemd-resolved doesn't run as root, it runs as the 'systemd-resolve' user.

Revision history for this message
Chris Newcomer (cnewcomer) wrote :

Issue filed against walinuxagent on GitHub:
https://github.com/Azure/WALinuxAgent/issues/1673

Dan Streetman (ddstreet)
Changed in systemd (Ubuntu):
assignee: Dan Streetman (ddstreet) → nobody
importance: Medium → Undecided
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.