libc resolver stops searching domain search list after getting back NSEC record

Bug #1717015 reported by Jonathan Kamens
This bug affects 1 person
Affects: systemd (Ubuntu)
Status: Expired
Importance: High
Assigned to: Unassigned
Milestone: (none)

Bug Description

Suppose that:

1. you have a "search" line in your /etc/resolv.conf file;
2. it has two domains in it; and
3. the first of the two domains does DNSSEC, including returning NSEC records for nonexisting hosts.

In this situation, when you try to look up a host name in the second domain without specifying the domain part of the host name, the libc resolver will stop after it gets back the NSEC record and report that the host name doesn't exist, rather than moving on to the second domain in the search list and searching for the host in that domain.
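
The expected behavior can be sketched as follows. This is a minimal illustration and not glibc's actual code: `candidate_names`, `resolve_with_search`, and the `lookup` callback are hypothetical names, and `ndots=1` is assumed because it is the resolv.conf(5) default. The point is that a negative answer for one search domain should move the lookup on to the next domain rather than end it.

```python
from typing import Callable, List, Optional


def candidate_names(name: str, search: List[str], ndots: int = 1) -> List[str]:
    """Return the fully qualified names to try, in order."""
    if name.endswith("."):
        # A trailing dot marks the name as absolute; the search list is skipped.
        return [name]
    absolute = name + "."
    searched = [f"{name}.{domain.rstrip('.')}." for domain in search]
    if name.count(".") >= ndots:
        # Enough dots: try the name as given first, then the search list.
        return [absolute] + searched
    return searched + [absolute]


def resolve_with_search(
    name: str,
    search: List[str],
    lookup: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Try each candidate; `lookup` returns an address, or None if not found."""
    for fqdn in candidate_names(name, search):
        address = lookup(fqdn)
        if address is not None:
            return address
    return None


# Stand-in for DNS (an assumption for the example): only one record exists.
records = {"jik5.kamens.us.": "146.115.42.232"}
print(resolve_with_search("jik5", ["quantopian.com", "kamens.us"], records.get))
```

With the search list from this report, the first candidate (`jik5.quantopian.com.`) is not found, so the loop continues to `jik5.kamens.us.` and succeeds there.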

See also https://bugs.launchpad.net/ubuntu/+source/bind9/+bug/1717014 .

ProblemType: Bug
DistroRelease: Ubuntu 17.04
Package: libc6 2.24-9ubuntu2.2
ProcVersionSignature: Ubuntu 4.10.0-33.37-generic 4.10.17
Uname: Linux 4.10.0-33-generic x86_64
ApportVersion: 2.20.4-0ubuntu4.5
Architecture: amd64
CurrentDesktop: Unity:Unity7
Date: Wed Sep 13 16:00:45 2017
Dependencies:
 gcc-6-base 6.3.0-12ubuntu2
 libc6 2.24-9ubuntu2.2
 libgcc1 1:6.3.0-12ubuntu2
InstallationDate: Installed on 2016-08-09 (400 days ago)
InstallationMedia: Ubuntu 16.04.1 LTS "Xenial Xerus" - Release amd64 (20160719)
SourcePackage: glibc
UpgradeStatus: Upgraded to zesty on 2017-04-19 (147 days ago)

Jonathan Kamens (jik) wrote :
Changed in glibc (Ubuntu):
importance: Undecided → Medium
importance: Medium → High
Florian Weimer (fweimer) wrote :

Do you have packet captures which demonstrate the problem? We will need the data exchanged between the host which performs the DNS lookup and the recursive resolvers configured in resolv.conf.

We are pretty sure that the stub resolver is tolerant to unknown DNS record types, so the behavior you report is rather odd.

Jonathan Kamens (jik) wrote :

The host name jik5.kamens.us exists:

>$ host jik5.kamens.us
>jik5.kamens.us has address 146.115.42.232

Just to be clear, that's a DNS record, not a local entry in /etc/hosts. To prove that, here are the last few lines of the output of `dig jik5.kamens.us +trace`:

>jik5.kamens.us. 1800 IN A 146.115.42.232
>kamens.us. 1800 IN NS dns2.registrar-servers.com.
>kamens.us. 1800 IN NS dns1.registrar-servers.com.
>;; Received 118 bytes from 216.87.152.33#53(dns2.registrar-servers.com) in 47 ms

I run a local named. When this is in my /etc/resolv.conf:

>nameserver 127.0.0.1
>search quantopian.com kamens.us

...here's what I get when I run "ping jik5":

>$ ping jik5
>ping: jik5: Name or service not known

And here's what happens when I reverse the order of the domains on the "search" line in /etc/resolv.conf and list kamens.us first:

>$ ping jik5
>PING jik5.kamens.us (146.115.42.232) 56(84) bytes of data.
>64 bytes from 146-115-42-232.s5094.c3-0.abr-ubr1.sbo-abr.ma.cable.rcncustomer.com (146.115.42.232): icmp_seq=1 ttl=64 time=0.371 ms
>...

I've attached the packet capture resulting from the first lookup above, i.e., the one when quantopian.com is listed first in /etc/resolv.conf.

Florian Weimer (fweimer) wrote :

Thanks. I have difficulties reproducing this.

Have you set options in /etc/resolv.conf? What are the contents of your /etc/nsswitch.conf file, particularly the “hosts” line?

It's odd that the DO bit is set on the query. Ordinarily, the glibc stub resolver would not send such queries without application patches.

Jonathan Kamens (jik) wrote :

This is my entire /etc/resolv.conf (excluding comments) when the problem manifests:

nameserver 127.0.0.1
search quantopian.com kamens.us

This is the hosts line in /etc/nsswitch.conf:

hosts: files mdns4_minimal [NOTFOUND=return] resolve [!UNAVAIL=return] dns myhostname

Florian Weimer (fweimer) wrote :

“resolve” before “dns” in /etc/nsswitch.conf means you are using systemd-resolved, not the glibc stub resolver, so this does not look like a glibc bug anymore.
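
To make the lookup order on that “hosts” line explicit, it can be split into services and their bracketed actions. This is a minimal sketch, not the parser glibc uses; `parse_nsswitch_line` is a hypothetical helper written only for illustration.

```python
import re
from typing import Dict, List, Tuple


def parse_nsswitch_line(line: str) -> Tuple[str, List[Tuple[str, Dict[str, str]]]]:
    """Split an nsswitch.conf line into (database, [(service, actions), ...])."""
    database, _, rest = line.partition(":")
    services: List[Tuple[str, Dict[str, str]]] = []
    # Tokens are either a service name or a bracketed action list.
    for token in re.findall(r"\[[^\]]*\]|\S+", rest):
        if token.startswith("["):
            if not services:
                raise ValueError("action list before any service")
            # Each action is STATUS=action; a leading "!" negates the status.
            for item in token[1:-1].split():
                status, _, action = item.partition("=")
                services[-1][1][status.upper()] = action.lower()
        else:
            services.append((token, {}))
    return database.strip(), services


line = "hosts: files mdns4_minimal [NOTFOUND=return] resolve [!UNAVAIL=return] dns myhostname"
database, services = parse_nsswitch_line(line)
for service, actions in services:
    print(service, actions)
```

The `[!UNAVAIL=return]` action after `resolve` means that any result other than "service unavailable" ends the lookup there, so `dns` is consulted only when systemd-resolved cannot be reached.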

Jonathan Kamens (jik)
affects: glibc (Ubuntu) → systemd (Ubuntu)
Steve Langasek (vorlon) wrote :

As of Ubuntu 17.10, libnss-resolve is not installed by default. Is this problem reproducible when libnss-resolve is removed, using the resolved stub resolver instead of the NSS module?

I don't appear to be able to confirm the original behavior against the quantopian.com domain (I don't get any NSEC responses), and don't have another DNSSEC-enabled domain to hand that I can test with.

Changed in systemd (Ubuntu):
status: New → Incomplete
Jonathan Kamens (jik) wrote :

I uninstalled libnss-resolve and the problem persists:

$ sudo apt-get remove libnss-resolve
...
$ sudo systemd-resolve --flush-caches
$ host jik5
Host jik5.quantopian.com not found: 2(SERVFAIL)
$ cat /etc/resolv.conf
# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
# DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
# 127.0.0.53 is the systemd-resolved stub resolver.
# run "systemd-resolve --status" to see details about the actual nameservers.

nameserver 127.0.0.53
search quantopian.com kamens.us
$

Note that "Just don't use libnss-resolve" wouldn't be a very good answer to this problem even if it worked, because things like openvpn-systemd-resolved, which I use, depend on it.

Steve Langasek (vorlon) wrote :

On Sat, Jan 27, 2018 at 01:55:07PM -0000, Jonathan Kamens wrote:
> I uninstalled libnss-resolve and the problem persists:
>
> $ sudo apt-get remove libnss-resolve
> ...
> $ sudo systemd-resolve --flush-caches
> $ host jik5
> Host jik5.quantopian.com not found: 2(SERVFAIL)
> $ cat /etc/resolv.conf
> # Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
> # DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
> # 127.0.0.53 is the systemd-resolved stub resolver.
> # run "systemd-resolve --status" to see details about the actual nameservers.

> nameserver 127.0.0.53
> search quantopian.com kamens.us
> $

Ok, then I will need some help understanding how to reproduce this problem,
since simply inserting quantopian.com in the search list in /etc/resolv.conf
on an Ubuntu 17.10 system with default settings is insufficient to reproduce
the problem you describe.

Have you also changed the default DNSSEC settings for systemd-resolved in
/etc/systemd/resolved.conf ? What is the complete output of
'systemd-resolve --status'?

> Note that "Just don't use libnss-resolve" wouldn't be a very good answer
> to this problem even if it worked, because things like openvpn-systemd-
> resolved, which I use, depend on it.

Well, that's a bug in the openvpn-systemd-resolved package, it should not
depend on libnss-resolve for what it does.


Florian Weimer (fw) wrote :

I believe systemd-resolved is still active on the system. It's just not queried over whatever interface nss-resolve uses, but over DNS, via the stub resolver at 127.0.0.53. If systemd-resolved has bad data, it will probably return bad data on the DNS interface as well.
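
One way to check which path a lookup takes is to inspect resolv.conf itself. The following is a minimal sketch under stated assumptions: `parse_resolv_conf` and `RESOLVED_STUB` are hypothetical names, the sample text is abbreviated from the file quoted earlier in this report, and only the `nameserver` and `search` keywords are handled.

```python
from typing import Dict, List

# Stub listener address, as named in the generated file's own comments.
RESOLVED_STUB = "127.0.0.53"


def parse_resolv_conf(text: str) -> Dict[str, List[str]]:
    """Collect nameserver addresses and the effective search list."""
    config: Dict[str, List[str]] = {"nameserver": [], "search": []}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;":
            continue  # skip blank lines and comments
        keyword, *values = line.split()
        if keyword == "nameserver" and values:
            config["nameserver"].append(values[0])
        elif keyword == "search":
            # A later "search" line replaces an earlier one.
            config["search"] = values
    return config


sample = """\
# 127.0.0.53 is the systemd-resolved stub resolver.
nameserver 127.0.0.53
search quantopian.com kamens.us
"""
config = parse_resolv_conf(sample)
uses_stub = RESOLVED_STUB in config["nameserver"]
print(config["search"], uses_stub)
```

If `uses_stub` is true, plain DNS queries from the libc resolver still reach systemd-resolved, whether or not libnss-resolve is installed.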

Steve Langasek (vorlon) wrote :

Yes, systemd-resolved will still be in use as the local stub resolver. But there have certainly been behavior differences between the stub resolver and the dbus service in the past, so this was still useful to rule out.

The problem is still not reproducible for me locally, however.

Jonathan Kamens (jik) wrote :

I haven't changed /etc/systemd/resolved.conf.

Here's the output of `systemd-resolve --status`:

Global
          DNS Domain: cnn.com
          DNSSEC NTA: 10.in-addr.arpa
                      16.172.in-addr.arpa
                      168.192.in-addr.arpa
                      17.172.in-addr.arpa
                      18.172.in-addr.arpa
                      19.172.in-addr.arpa
                      20.172.in-addr.arpa
                      21.172.in-addr.arpa
                      22.172.in-addr.arpa
                      23.172.in-addr.arpa
                      24.172.in-addr.arpa
                      25.172.in-addr.arpa
                      26.172.in-addr.arpa
                      27.172.in-addr.arpa
                      28.172.in-addr.arpa
                      29.172.in-addr.arpa
                      30.172.in-addr.arpa
                      31.172.in-addr.arpa
                      corp
                      d.f.ip6.arpa
                      home
                      internal
                      intranet
                      lan
                      local
                      private
                      test

Link 5 (virbr0-nic)
      Current Scopes: none
       LLMNR setting: yes
MulticastDNS setting: no
      DNSSEC setting: no
    DNSSEC supported: no

Link 4 (virbr0)
      Current Scopes: none
       LLMNR setting: yes
MulticastDNS setting: no
      DNSSEC setting: no
    DNSSEC supported: no

Link 3 (wlp3s0)
      Current Scopes: DNS LLMNR/IPv4 LLMNR/IPv6
       LLMNR setting: yes
MulticastDNS setting: no
      DNSSEC setting: no
    DNSSEC supported: no
         DNS Servers: 192.168.43.1

Link 2 (enp0s25)
      Current Scopes: none
       LLMNR setting: yes
MulticastDNS setting: no
      DNSSEC setting: no
    DNSSEC supported: no

Jonathan Kamens (jik) wrote :

Note: the problem is now even worse than what I reported above. If I put "search kamens.us" in /etc/resolv.conf and then try to resolve "jik5", "jik5.kamens.us", or "jik5.kamens.us.", all of which should resolve successfully, they all fail with SERVFAIL.

Launchpad Janitor (janitor) wrote :

[Expired for systemd (Ubuntu) because there has been no activity for 60 days.]

Changed in systemd (Ubuntu):
status: Incomplete → Expired
Steve Langasek (vorlon)
Changed in systemd (Ubuntu):
status: Expired → Won't Fix
status: Won't Fix → New
Dimitri John Ledkov (xnox) wrote :

1) one should not manually adjust search domains in /etc/resolv.conf
2) systemd-resolved should learn about search domains

 - for example, set search domains in /etc/systemd/resolved.conf if nothing sets them on a per-link basis via the resolved D-Bus API or networkd .network files.

3) /etc/resolv.conf should be a symlink to ../run/systemd/resolve/stub-resolv.conf
4) ../run/systemd/resolve/stub-resolv.conf should be dynamically updated by resolved to contain the correct search domains
5) resolved does not send DNSSEC info to clients that neither support DNSSEC nor requested a DNSSEC response
6) if you expect DNSSEC validation of the responses resolved provides, please manually enable DNSSEC in /etc/systemd/resolved.conf and on all the relevant links via the systemd-resolve command-line tool (if they are not managed via networkd .network units)

Changed in systemd (Ubuntu):
status: New → Incomplete
Steve Langasek (vorlon) wrote :

On Tue, Apr 03, 2018 at 08:59:19AM -0000, Dimitri John Ledkov wrote:
> 1) one should not manually adjust search domains in /etc/resolv.conf
> 2) systemd-resolved should learn about search domains
> - for example, set search domains in /etc/systemd/resolved.conf if
> nothing sets them on per link basis via resolved dbus api or
> networkd.network files.
>
> 3) /etc/resolv.conf should be a symlink to
> ../run/systemd/resolve/stub-resolv.conf

> 4) ../run/systemd/resolve/stub-resolv.conf should be dynamically updated
> by resolved to contain the correct search domains

If systemd-resolved is going to publish search domain instructions to
../run/systemd/resolve/stub-resolv.conf anyway for use by the libc client, I
don't see any reason to say "one should not manually adjust search domains".

Launchpad Janitor (janitor) wrote :

[Expired for systemd (Ubuntu) because there has been no activity for 60 days.]

Changed in systemd (Ubuntu):
status: Incomplete → Expired